recognize anything – nanonomad

Automate Image Captioning using Multimodal LLMs

Posted on November 19, 2023 (April 2, 2024) by nanonomad

Using multi-modal large language models for automated image captioning. Rich captions can be used for training Stable Diffusion Dreambooth or LoRAs. […]

Read More… from Automate Image Captioning using Multimodal LLMs