nanonomad

Tag: BLIP

Automate Image Captioning using Multimodal LLMs

Posted on November 19, 2023 (updated April 2, 2024) by nanonomad

Using multimodal large language models to automate image captioning. Rich captions can be used for training Stable Diffusion DreamBooth models or LoRAs. […]

Posted in Uncategorized. Tagged BLIP, caption generation, captioning, image captioning, KOSMOS-2, LLM, recognize anything.
