I’ve been looking at multispeaker VITS TTS models lately, so thought I’d share the Google Colab notebook. Its similar to the others posted, but this is using precomputed vectors; the configuration is similar to the YourTTS model, however this seems a little easier to fine tune. As always, this stuff is experimental, but this should […]
Month: February 2023
Updated | Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab
This is about as close to automated as I can make things. I’ve put together a Colab notebook that uses a bunch of spaghetti code, rnnoise, OpenAI’s Whisper Speech to Text, and Coqui Text to Speech to train a VITS model. Upload audio files, split and process clips, denoise clips, transcribe clips with Whisper, then […]