XTTSv2 Hindi Finetuned Checkpoints
For use in most implementations of XTTSv2, rename these to model.pth and use them to replace the original XTTSv2 checkpoint.
https://huggingface.co/AOLCDROM/XTTSv2-Hi_ft/tree/main
Indic TTS Hindi Dataset
https://www.iitm.ac.in/donlab/indictts/database
Common Voice Dataset
https://commonvoice.mozilla.org/en/datasets
Convert Mozilla Common Voice .TSV to VCTK format dataset metadata
conv_cv_vctk.py
Download and install ffmpeg, and add it to your Windows system […]
Tag: XTTS
Fine Tuning XTTS V2 with (forked) Coqui
Fine tuning XTTS v2 with the forked Coqui project. Coqui AI shut down earlier this year, so what does that mean for us? Here I go over adjusting the Coqui XTTS v2 training recipe, creating a dataset using Audacity and faster-whisper, and training both a single-speaker and a multispeaker XTTS v2 English model. Convert wavs to […]
A look at XTTS v1 and Tools for Comparing Audio Embeddings
In this video I look at Coqui’s new XTTS v1 text-to-speech model, and complain about licensing. Then I look at a couple of tools, pyannote and SpeechBrain, and use a model to generate and compare audio embeddings. This can be used to identify mismatching audio clips in your datasets. Remove poor quality clips, and […]
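As a rough illustration of the embedding comparison idea, here is a minimal sketch that flags clips whose embedding sits far from the rest of the dataset. It is not the method from the video: the three-dimensional vectors are toy stand-ins for real speaker embeddings (which would come from a pyannote or SpeechBrain model), and the function names and the 0.75 threshold are assumptions chosen only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def flag_mismatches(embeddings, threshold=0.75):
    """Return clip names whose similarity to the dataset centroid is below
    the threshold. The threshold is a hypothetical value; tune it on real data."""
    center = centroid(list(embeddings.values()))
    return sorted(
        name
        for name, vec in embeddings.items()
        if cosine_similarity(vec, center) < threshold
    )


# Toy embeddings (assumed values, not real model output).
clips = {
    "clip_001.wav": [0.9, 0.1, 0.0],
    "clip_002.wav": [0.8, 0.2, 0.1],
    "clip_003.wav": [0.0, 0.1, 0.9],
}

suspects = flag_mismatches(clips)
```

With real data, each clip's vector would be replaced by the embedding a speaker model produces for that file, and the flagged clips would be candidates for manual review rather than automatic deletion.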