A look at XTTS v1 and Tools for Comparing Audio Embeddings

In this video I look at Coqui’s new XTTS v1 text-to-speech model, and complain about licensing. Then I look at a couple of tools, pyannote and SpeechBrain, and use a model to generate and compare audio embeddings. This can be used to identify mismatched audio clips in your datasets. Remove poor-quality clips, and […]
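
To give a rough idea of what comparing embeddings means, here is a minimal sketch that is not taken from the video. It assumes you already have one embedding vector per clip (in practice these would come from a speaker embedding model loaded through pyannote or SpeechBrain, and would have far more dimensions). The toy vectors, the function names, and the 0.5 threshold are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def flag_mismatched(embeddings, threshold=0.5):
    """Return clip names whose embedding sits far from the other clips.

    Each clip is compared against the mean embedding of all the *other*
    clips, so one bad clip does not vouch for itself.
    The threshold is an assumed value; tune it on your own data.
    """
    flagged = []
    for name, emb in embeddings.items():
        others = [e for other, e in embeddings.items() if other != name]
        centroid = mean_vector(others)
        if cosine_similarity(emb, centroid) < threshold:
            flagged.append(name)
    return flagged


# Hypothetical 3-dimensional embeddings standing in for real model output.
embeddings = {
    "clip_001.wav": [0.9, 0.1, 0.0],
    "clip_002.wav": [0.8, 0.2, 0.1],
    "clip_003.wav": [0.85, 0.15, 0.05],
    "clip_004.wav": [0.0, 0.1, 0.9],  # deliberately different "speaker"
}

print(flag_mismatched(embeddings))
```

With these toy values only `clip_004.wav` falls below the threshold. On a real dataset the flagged clips are candidates to listen to and possibly remove, not automatic deletions.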
