LLaMA-Factory with Flash Attention 2 and Unsloth

It was tough to get this working, but I think I’ve figured it out enough to share. Here’s a quick guide on setting up LLaMA-Factory with support for Flash Attention 2 and Unsloth training on Windows. This setup uses an RTX 3060 12GB GPU, Windows 10, and CUDA 12.1. Unsloth is an optimization library […]

Read More… from LLaMA-Factory with Flash Attention 2 and Unsloth

Automate Image Captioning using Multimodal LLMs

Using multimodal large language models for automated image captioning. Rich captions can be used for training Stable Diffusion Dreambooth models or LoRAs. […]

Read More… from Automate Image Captioning using Multimodal LLMs

Fine Tuning Mistral 7B

Can you train new or forbidden knowledge into an LLM? Let’s find out as I throw 1 gigabyte of scraped, cleaned, plaintext KiwiFarms posts at Mistral 7B. I go over my experience fine-tuning Mistral 7B on a few large datasets of scraped text, including English-language song lyrics and a huge KiwiFarms post dataset. […]

Read More… from Fine Tuning Mistral 7B