Even more Voice Cloning | Train a Multi-Speaker VITS model using Google Colab and a Custom Dataset

I’ve been looking at multi-speaker VITS TTS models lately, so I thought I’d share the Google Colab notebook. It’s similar to the others I’ve posted, but this one uses precomputed vectors; the configuration is similar to the YourTTS model, though this seems a little easier to fine-tune. As always, this stuff is experimental, but this should […]

Updated | Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab

This is about as close to automated as I can make things. I’ve put together a Colab notebook that uses a bunch of spaghetti code, rnnoise, OpenAI’s Whisper Speech to Text, and Coqui Text to Speech to train a VITS model. Upload audio files, split and process clips, denoise clips, transcribe clips with Whisper, then […]
