A look at the TTS Generation Web UI for Bark text-to-speech, music generation, translation, and more. Just a broad overview of what seems to work well, and what doesn't, in this feature-packed project. Video Link: https://www.youtube.com/watch?v=Y8J717tr9t0 Sources for RVC Models: https://rvc-models.com/ https://voice-models.com/ https://huggingface.co/spaces/zomehwh… TTS Generation WebUI: https://github.com/rsxdalv/tts-generation-webui […]
Site will be updated/worked on soon
I’ve been dealing with some health issues, so the site remains unfinished. Videos are on pause for now, but I’m still working on a few things. In addition, my PC finally died from the load of training ML models nearly 24/7 for the past year. Either the CPU or the motherboard is completely dead. No visual signs […]
Automate Image Captioning using Multimodal LLMs
Using multimodal large language models for automated image captioning. Rich captions can be used for training Stable Diffusion DreamBooth models or LoRAs. […]