DiffRhythm – Fast, Full-Length Song Generation

A look at DiffRhythm, a diffusion model for generative music. This one is FAST. I’m looking over the demos, sharing some installation notes, trying some demo generations, seeing what works and what doesn’t, and trying to make a decent-sounding tune. This is not a detailed tutorial.
DiffRhythm Hugging Face demo – https://huggingface.co/spaces/ASLP-lab/DiffRhythm
DiffRhythm demo page […]

YuE can’t have Suno.AI We have Generative Music at Home!

I recently discovered M-A-P YuE, an open-source AI music generator that creates complete songs with lyrics. While the original model demands a hefty 80GB of VRAM, I came across a great GitHub project by Mozer called “YuE-extend”. It adds music extension support (with the -icl models) and exllamav2 quantized model loading. That makes it possible to […]

Standalone Whisper XXL: The Hassle-Free Transcription Tool

I recently discovered a GitHub project, covered in this video, that’s become my go-to for transcription work – Standalone Whisper XXL by Purfview. If you’ve tried implementing OpenAI’s Whisper speech-to-text model before, you know it can get messy with dependencies, especially when using enhanced forks like faster-whisper. This project solves all those headaches […]

XTTSv2 Hindi Finetuning

XTTSv2 Hindi Finetuned Checkpoints
For use in most implementations of XTTSv2, these must be renamed to model.pth and replace the original XTTSv2 checkpoint.
https://huggingface.co/AOLCDROM/XTTSv2-Hi_ft/tree/main
Indic TTS Hindi Dataset: https://www.iitm.ac.in/donlab/indictts/database
Common Voice Dataset: https://commonvoice.mozilla.org/en/datasets
Convert Mozilla Common Voice .TSV to VCTK format dataset metadata: conv_cv_vctk.py
Download and install ffmpeg, and add it to your Windows system […]
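The checkpoint swap mentioned above (rename the fine-tuned file to model.pth and put it in place of the original XTTSv2 checkpoint) could be scripted roughly as follows. This is only a minimal sketch: the function name, the backup filename model.pth.orig, and the paths in the usage note are hypothetical, and where the XTTSv2 model folder lives depends on the implementation you use.

```python
import shutil
from pathlib import Path


def install_finetuned_checkpoint(finetuned: Path, model_dir: Path) -> Path:
    """Copy a fine-tuned checkpoint over model_dir/model.pth, keeping a backup.

    Both arguments are assumptions for illustration: `finetuned` is the
    downloaded checkpoint file, `model_dir` is the folder that holds the
    original XTTSv2 model.pth.
    """
    target = model_dir / "model.pth"
    backup = model_dir / "model.pth.orig"  # hypothetical backup name

    # Keep the original checkpoint once so the swap can be undone later.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)

    # Copy the fine-tuned file into place under the expected name.
    shutil.copy2(finetuned, target)
    return target
```

Usage would look something like install_finetuned_checkpoint(Path("downloads/hindi_checkpoint.pth"), Path("models/xtts_v2")), with both paths replaced by your own.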

Fine Tuning XTTS V2 with (forked) Coqui

Fine tuning XTTS v2 with the forked Coqui project. Coqui AI shut down earlier this year, so what does that mean for us? Here I go over adjusting the Coqui XTTS v2 training recipe, creating a dataset using Audacity and faster-whisper, and training single speaker and multispeaker XTTS v2 English models. Convert wavs to […]

LLaMA-Factory with Flash Attention 2 and Unsloth

It was tough to get this working, but I think I’ve figured it out enough to share. Here’s a quick guide on how to set up LLaMA-Factory with support for Flash Attention 2 and Unsloth training on Windows. This is using an RTX 3060 12GB GPU, Windows 10, and CUDA 12.1. Unsloth is an optimization library […]

Stable Audio Open 1.0 | Open Source Generative Audio with Fine Tuning

A look at Stability AI’s new Stable Audio Open 1.0 open source (kinda, sorta, mostly, technically) model and codebase with fine tuning support (kinda, sorta, technically). I’ve managed to get the trainer running, but I only have a 12GB GPU, which isn’t enough for training right now.
Resources:
https://huggingface.co/stabilityai/stable-audio-open-1.0
https://github.com/Saganaki22/StableAudioWebUI
https://github.com/Stability-AI/stable-audio-tools/issues/34
Example training launch command: python […]

Troubleshooting Sega Saturn Emulation with Retroarch for iOS/Apple

The video talks about common issues that people run into when using the Beetle Saturn core on Apple devices and how to fix them. One issue is that the ROMs available online are probably in .bin and .cue format, which can sometimes cause problems if the .cue file contains path names. The .cue file is a […]
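To illustrate the path-name issue, here is a minimal sketch of cleaning a .cue sheet. It assumes the fix is to reduce each FILE entry to a bare filename so that it points at a .bin sitting in the same folder as the .cue; the function name strip_cue_paths is hypothetical and this is not necessarily the exact procedure shown in the video.

```python
import re
from pathlib import PureWindowsPath

# Matches a cue sheet FILE line with a quoted filename, for example:
#   FILE "C:\Roms\Saturn\Game.bin" BINARY
FILE_LINE = re.compile(r'^(\s*FILE\s+")([^"]+)("\s+\S+\s*)$', re.IGNORECASE)


def strip_cue_paths(cue_text: str) -> str:
    """Return cue sheet text with directory parts removed from FILE entries."""
    fixed_lines = []
    for line in cue_text.splitlines():
        match = FILE_LINE.match(line)
        if match:
            # PureWindowsPath treats both "\" and "/" as separators,
            # so Windows-style and Unix-style paths are both handled.
            name = PureWindowsPath(match.group(2)).name
            line = f"{match.group(1)}{name}{match.group(3)}"
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"
```

To apply it, read the .cue file as text, pass the contents through strip_cue_paths, and write the result back, keeping a copy of the original in case the sheet uses an unusual layout (unquoted filenames, for instance, are left untouched by this sketch).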

Play Windows 98 and MS-DOS Games on iPad/iOS/iPhone with DOSBox-Pure and Retroarch for FREE

Summary: Learn how to play Windows 98 and MS-DOS games on iPad, iOS, and iPhone for free using RetroArch and DOSBox-Pure. This video guide covers topics such as adding games, keyboard and mouse setup, performance tweaks, installing Windows 98, and running games within the Windows 98 environment. Highlights Key Insights […]

The Lost Art of Optical Disc Repair | Fixing and Testing a PlayStation Disc

Summary: The video discusses the decline of garage sales and the difficulty in finding bargain items in good condition. It then provides information on how to repair damaged PlayStation discs using software and DIY techniques, such as polishing and using nail polish. The video concludes with a demonstration of how the repaired disc can be […]
