A look at XTTS v1 and Tools for Comparing Audio Embeddings

In this video I look at Coqui’s new XTTS v1 text-to-speech model and complain about its licensing. Then I look at a couple of tools, pyannote and SpeechBrain, and use a model to generate and compare audio embeddings. This can be used to identify mismatching audio clips in your datasets. Remove poor quality clips, and […]
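As a rough illustration of the embedding comparison idea mentioned above, here is a minimal sketch that flags clips whose embedding sits far from the rest of the dataset. It is not the code from the video: the helper names, the 0.75 threshold, and the toy 3-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from a speaker embedding model loaded through pyannote or SpeechBrain and would have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def find_mismatched_clips(embeddings, threshold=0.75):
    """Return clip names whose similarity to the dataset centroid is below threshold.

    `embeddings` maps a clip name to its embedding vector. The threshold is a
    hypothetical starting point and would need tuning per model and dataset.
    """
    center = centroid(list(embeddings.values()))
    scores = {name: cosine_similarity(vec, center) for name, vec in embeddings.items()}
    return sorted(name for name, score in scores.items() if score < threshold)


# Toy stand-ins for real embeddings: three similar clips and one outlier.
toy_embeddings = {
    "clip_001.wav": [0.9, 0.1, 0.0],
    "clip_002.wav": [0.8, 0.2, 0.1],
    "clip_003.wav": [0.85, 0.15, 0.05],
    "clip_004.wav": [0.0, 0.1, 0.95],
}

flagged = find_mismatched_clips(toy_embeddings, threshold=0.75)
print(flagged)
```

Comparing against the centroid keeps the sketch short. Comparing every clip against a trusted reference clip of the target speaker is another reasonable option, and may work better when a large share of the dataset is mislabeled.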


.:Demo:. Tortoise TTS Expressive Speech narrating Norman Arkawy’s 1955 Sci-Fi short “Selling Point”

Narration of the short story ‘Selling Point’ by Norman Arkawy using a Tortoise TTS model generating a familiar-sounding, expressive, British voice. Originally published in ‘Imagination Stories of Science and Fantasy’, December 1955. One of my favorite short stories. Full story text: https://www.gutenberg.org/cache/epub/66713/pg66713.txt Training: 10 epochs total. Epochs 1-4 LR 1e-5, 5-6 LR 1e-6, 7-10 LR 1e-7 Mel/Text: […]
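The stepped learning-rate schedule listed in the training notes above can be written down as a small lookup. This is only a sketch of the schedule as stated (epochs 1-4 at 1e-5, 5-6 at 1e-6, 7-10 at 1e-7). The function name is hypothetical, and how the value is passed to the trainer depends on the fine-tuning tool being used.

```python
def learning_rate_for_epoch(epoch):
    """Return the learning rate for a 1-based epoch number in the 10-epoch run."""
    if not 1 <= epoch <= 10:
        raise ValueError("epoch must be between 1 and 10")
    if epoch <= 4:
        return 1e-5  # epochs 1-4
    if epoch <= 6:
        return 1e-6  # epochs 5-6
    return 1e-7      # epochs 7-10


# Print the full schedule, one line per epoch.
schedule = [learning_rate_for_epoch(epoch) for epoch in range(1, 11)]
for epoch, lr in enumerate(schedule, start=1):
    print(epoch, lr)
```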


.::Demo::. 4 Voice Multispeaker Tortoise TTS English Fine-Tuned Model Test :: Great Dictator Speech

First test of the new Tortoise model. 4 voices, which can also be found in the YourTTS model I posted recently. LJS, John, and Tom rendered without any stammers or repeats; Lah has some stutters, if I recall. No cherry-picked examples. Gen settings same as in my Tortoise fine-tuning video, except denoise set to […]
