Can you train new or forbidden knowledge into an LLM? Let’s find out as I throw 1 gigabyte of scraped, cleaned, plaintext KiwiFarms posts at Mistral 7B. I go over my experience fine-tuning Mistral 7B on a few large datasets of scraped text, including English-language song lyrics and a huge KiwiFarms post dataset. […]