This project uses Salesforce's BLIP (Bootstrapping Language-Image Pre-training) model to automatically generate text descriptions for uploaded images. The interface is built using the Gradio library.
- Generate natural language captions for any image.
- Simple web interface (drag and drop files or browse).
- Optimized generation parameters to avoid repetitions (repetition_penalty, no_repeat_ngram_size).
- Automatic conversion of images to RGB format (support for PNG with transparency).
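
The features above could fit together roughly as in the following minimal sketch. It is an illustration, not the project's actual code. The checkpoint name `Salesforce/blip-image-captioning-base`, the numeric generation values, and the helper names `load_blip`, `to_rgb`, `caption_image`, and `build_demo` are all assumptions. Only `repetition_penalty` and `no_repeat_ngram_size` are named by this README. The sketch assumes `transformers`, `torch`, `gradio`, and `Pillow` are installed.

```python
from functools import lru_cache

# Assumed checkpoint; the project may use a different BLIP variant.
MODEL_NAME = "Salesforce/blip-image-captioning-base"

# Illustrative values only, not the project's actual settings.
GENERATION_KWARGS = {
    "max_new_tokens": 50,
    "num_beams": 5,
    "repetition_penalty": 1.5,   # values above 1.0 discourage reusing tokens
    "no_repeat_ngram_size": 2,   # no 2-gram may occur twice in one caption
}


@lru_cache(maxsize=1)
def load_blip():
    """Load processor and model once; the heavy import is deferred until needed."""
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(MODEL_NAME)
    model = BlipForConditionalGeneration.from_pretrained(MODEL_NAME)
    model.eval()
    return processor, model


def to_rgb(image):
    """Return the image in RGB mode, discarding any alpha channel (e.g. RGBA PNGs)."""
    return image if image.mode == "RGB" else image.convert("RGB")


def caption_image(image):
    """Generate a caption for a PIL image; returns an empty string if none was given."""
    if image is None:
        return ""
    import torch

    processor, model = load_blip()
    inputs = processor(images=to_rgb(image), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def build_demo():
    """Build a Gradio interface with an image input (drag and drop or browse)."""
    import gradio as gr

    return gr.Interface(
        fn=caption_image,
        inputs=gr.Image(type="pil"),
        outputs="text",
        title="BLIP Image Captioning",
    )
```

Calling `build_demo().launch()` would start a local Gradio server. Loading the model lazily and caching it keeps startup fast and avoids reloading the weights on every request.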
- Clone the repository:
git clone https://github.com/MNJMARIA/blip-image-captioning.git
cd blip-image-captioning