AI Audio Tools – nanonomad

Training LoRAs and GLoRAs for Stable Diffusion 1.5 and XL Using the New Prodigy Optimizer

Posted on January 18, 2024 (April 2, 2024) by nanonomad

Training LoRA and GLoRA on SD 1.5 & XL with the Prodigy Optimizer using the Kohya_SS scripts. In today’s video I look at training LoRA and GLoRA adapters for Stable Diffusion 1.5 and XL using the Prodigy optimizer on a large and varied dataset made up of 16 characters. Then I show an example of how […]

A look at XTTS v1 and Tools for Comparing Audio Embeddings

Posted on September 21, 2023 (April 2, 2024) by nanonomad

In this video I look at Coqui’s new XTTS v1 text-to-speech model, and complain about licensing. Then I look at a couple of tools, pyannote and SpeechBrain, and use a model to generate and compare audio embeddings. This can be used to identify mismatched audio clips in your datasets. Remove poor quality clips, and […]
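As a rough illustration of the embedding comparison idea, here is a minimal sketch that assumes you already have one embedding vector per clip (from pyannote, SpeechBrain, or any other model; that step is not shown). The function names, the toy vectors, and the 0.5 threshold are all made up for the example and are not taken from the video. It flags any clip whose embedding points away from the average of the set.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def flag_mismatched_clips(embeddings, threshold=0.5):
    """Return clip names whose embedding is far from the dataset centroid.

    `embeddings` maps clip name -> list of floats. The threshold is a
    hypothetical starting point; tune it against your own data.
    """
    names = list(embeddings)
    dim = len(embeddings[names[0]])
    # Average embedding of the whole set, used as the reference voice.
    centroid = [
        sum(embeddings[name][i] for name in names) / len(names)
        for i in range(dim)
    ]
    return [
        name
        for name in names
        if cosine_similarity(embeddings[name], centroid) < threshold
    ]


# Toy 3-dimensional vectors standing in for real speaker embeddings.
toy_embeddings = {
    "clip_01.wav": [0.9, 0.1, 0.0],
    "clip_02.wav": [0.8, 0.2, 0.1],
    "clip_03.wav": [0.85, 0.15, 0.05],
    "clip_04.wav": [-0.1, 0.2, 0.95],  # deliberately different
}
flagged = flag_mismatched_clips(toy_embeddings)
print(flagged)
```

Comparing against the centroid is just one simple choice. If a dataset contains many bad clips, the centroid itself gets skewed, and comparing against a few known-good reference clips may work better.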

Remove Background Music and Enhance Speech with Free AI Tools | Avoid ContentID on YouTube

Posted on August 8, 2023 (August 28, 2023) by nanonomad

A look at using Ultimate Vocal Remover, a free frontend for AI audio source separation models, to remove background music from TV clips and radio broadcasts. Then, using FFmpeg to separate audio tracks and to recombine single or multiple audio tracks back into a video. Ultimate Vocal Remover GUI GitHub: https://github.com/Anjok07/ultimatevocalremovergui FFmpeg Windows […]
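For the FFmpeg side, a hedged sketch of what the separate and recombine steps could look like when scripted from Python. The helper names and file names are hypothetical, and the exact commands used in the video may differ. It assumes `ffmpeg` is on your PATH and only uses standard options (`-i`, `-vn`, `-map`, `-c:a`, `-c:v`).

```python
import subprocess


def extract_audio_cmd(video_in, audio_out):
    """Build a command that copies the audio stream out of a video.

    -vn drops the video, -c:a copy avoids re-encoding. The output
    container has to suit the source codec (e.g. .m4a for AAC audio).
    """
    return ["ffmpeg", "-i", video_in, "-vn", "-c:a", "copy", audio_out]


def remux_cmd(video_in, audio_tracks, video_out):
    """Build a command that pairs the original video with new audio tracks.

    The video stream is copied as-is; each file in `audio_tracks`
    becomes one audio track in the output, in the order given.
    """
    cmd = ["ffmpeg", "-i", video_in]
    for track in audio_tracks:
        cmd += ["-i", track]
    cmd += ["-map", "0:v:0"]  # video from the first input only
    for index in range(1, len(audio_tracks) + 1):
        cmd += ["-map", f"{index}:a:0"]  # first audio stream of each extra input
    cmd += ["-c:v", "copy", video_out]
    return cmd


def run(cmd):
    """Run a built command, raising an error if ffmpeg fails."""
    subprocess.run(cmd, check=True)


# Example: build (but do not run) the commands for a hypothetical clip.
print(extract_audio_cmd("clip.mp4", "clip_audio.m4a"))
print(remux_cmd("clip.mp4", ["vocals.wav"], "clip_clean.mkv"))
```

Pass either list to `run(...)` to actually execute it. The audio codec is left unspecified in the remux step so FFmpeg picks a default that fits the output container.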