🧠 Generative Image Evaluation using CLIP

This project uses OpenAI’s CLIP model to evaluate how well an image matches a text prompt (semantic alignment) and estimate its aesthetic quality by comparing it to reference high-quality images.

It provides both:

•	A Jupyter notebook for step-by-step experimentation
•	A Gradio app for an interactive demo (locally or via Hugging Face Spaces)

📂 Project Structure

├── app.py                  # Main entrypoint for the Gradio app (used in Hugging Face Space)
├── requirements.txt        # Dependencies for local setup or deployment
├── src/
│   └── evaluate.py         # Core helper functions (embeddings, scoring, loading references)
├── demo/
│   └── gradio_app.py       # Earlier local Gradio prototype (for reference)
├── data/
│   └── example_images/     # Reference high-quality images for aesthetic scoring
├── notebooks/
│   └── GenerativeImageEvaluation_CLIP.ipynb   # Detailed step-by-step notebook version
└── README.md               # You’re here

⚙️ How It Works

1. Alignment Score

Uses CLIP cosine similarity to measure how closely the uploaded image matches the text prompt.

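Typical range: 0.0 → ~0.45 (rarely above 0.5). Cosine similarity is bounded by -1 and 1 in theory, but CLIP image-text pairs fall in a much narrower band.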

Interpretation:

| Score Range | Meaning |
| --- | --- |
| 0.00 – 0.15 | Weak alignment (image unrelated to prompt) |
| 0.15 – 0.30 | Moderate alignment |
| 0.30 – 0.45 | Strong alignment |
| 0.45+ | Very strong match (almost literal description) |

(Note: CLIP image and text embeddings occupy different regions of the shared embedding space, so image-text cosine values rarely approach 1.0 even for well-matched pairs.)
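As a minimal sketch of the alignment computation, the snippet below embeds one image and one prompt with the Hugging Face Transformers CLIP classes and takes their cosine similarity. The checkpoint name `openai/clip-vit-base-patch32` and all function names here are assumptions for illustration; the project's own helpers in src/evaluate.py may be organized differently. The band labels mirror the table above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def alignment_label(score):
    """Map a score to the interpretation bands listed in the table above."""
    if score < 0.15:
        return "Weak alignment"
    if score < 0.30:
        return "Moderate alignment"
    if score < 0.45:
        return "Strong alignment"
    return "Very strong match"


def clip_alignment(image_path, prompt, model_name="openai/clip-vit-base-patch32"):
    """Embed an image and a prompt with CLIP and return their cosine similarity.

    Assumption: model_name is an illustrative checkpoint, not necessarily the
    one this project uses. Heavy imports are kept local so the helpers above
    can be used without PyTorch installed.
    """
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained(model_name)
    processor = CLIPProcessor.from_pretrained(model_name)
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # image_embeds and text_embeds are the projected embeddings, shape (1, dim).
    return cosine_similarity(outputs.image_embeds[0].tolist(),
                             outputs.text_embeds[0].tolist())
```

Calling `alignment_label(clip_alignment("photo.jpg", "a red sports car on a road"))` would then return one of the four band names; `photo.jpg` is a placeholder path.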

2. Aesthetic Score

Compares the uploaded image to reference high-quality photos stored in data/example_images/ and computes the average cosine similarity.

Higher scores → closer to the reference set, which serves here as a proxy for visually pleasing, balanced, and professional-looking images.

Typical ranges:

| Score Range | Meaning |
| --- | --- |
| 0.00 – 0.20 | Low aesthetic similarity (noisy or cluttered) |
| 0.20 – 0.40 | Moderate appeal |
| 0.40 – 0.60 | High aesthetic similarity |
| 0.60+ | Very high: visually striking / professional look |
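The averaging step can be sketched as follows, assuming the uploaded image and every reference image have already been embedded with CLIP. The function name `aesthetic_score` and the toy 2-D vectors are hypothetical stand-ins; real CLIP embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def aesthetic_score(image_embedding, reference_embeddings):
    """Mean cosine similarity between one image and a set of reference images."""
    if not reference_embeddings:
        raise ValueError("at least one reference embedding is required")
    sims = [cosine_similarity(image_embedding, ref) for ref in reference_embeddings]
    return sum(sims) / len(sims)


# Toy vectors standing in for CLIP image embeddings (not real model output).
references = [[1.0, 0.0], [0.0, 1.0]]
print(aesthetic_score([1.0, 0.0], references))  # 0.5
```

Because the score is an average over the reference set, swapping the images in data/example_images/ shifts every score, so values are only comparable against the same references.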

🚀 Running the Demo Locally

Prerequisites

•	Python 3.10+
•	PyTorch
•	Internet connection (for downloading CLIP weights)

Steps

Create and activate a virtual environment (optional but recommended):
```
python -m venv .venv
source .venv/bin/activate
```
Install dependencies:
```
pip install -r requirements.txt
```
Run the demo:
```
python app.py
```
To use the earlier prototype in the demo/ folder instead, run python -m demo.gradio_app.
Open the local URL displayed (usually http://127.0.0.1:7860).

Upload an image and a short text prompt — you’ll see Alignment Score and Aesthetic Score displayed in real time.
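For readers curious how such an interface can be wired, here is a rough sketch of a Gradio app that takes an image and a prompt and returns two numbers. It is not the contents of app.py: `make_handler`, `build_demo`, and the stand-in scorers are hypothetical names, and the real app may lay out its inputs and outputs differently.

```python
def make_handler(score_alignment, score_aesthetic):
    """Wrap two scoring callables into one function Gradio can call."""
    def handler(image, prompt):
        # Round for display; both scorers are expected to return floats.
        return (round(score_alignment(image, prompt), 3),
                round(score_aesthetic(image), 3))
    return handler


def build_demo(handler):
    """Build a Gradio interface: image and prompt in, two scores out."""
    import gradio as gr  # imported lazily so make_handler works without Gradio

    return gr.Interface(
        fn=handler,
        inputs=[gr.Image(type="pil", label="Image"), gr.Textbox(label="Prompt")],
        outputs=[gr.Number(label="Alignment Score"),
                 gr.Number(label="Aesthetic Score")],
    )


# Example wiring with stand-in scorers (replace with real CLIP-based functions):
# demo = build_demo(make_handler(lambda img, p: 0.31, lambda img: 0.52))
# demo.launch()  # serves on http://127.0.0.1:7860 by default
```

Keeping the scoring functions separate from the interface makes it possible to reuse them from the notebook without starting a server.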

🌐 Public Demo (Hugging Face Space)

If you’d like to try this app directly in your browser, without setting up anything locally, visit the public Hugging Face Space.

This hosted demo runs the same Gradio app on Hugging Face’s infrastructure, allowing anyone to test the evaluator interactively.

🧩 Technologies Used

•	Python
•	PyTorch
•	Hugging Face Transformers
•	Gradio
•	OpenAI CLIP model

💡 Notes

•	Alignment scores depend on the semantic similarity between the prompt and image content.
•	Aesthetic scores rely on chosen reference images — replacing them with your own dataset changes the scoring context.
•	If you face low alignment scores, try rephrasing prompts to be more literal (e.g., “a red sports car on a road” instead of “fast luxury vehicle”).
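One way to act on the last note is to score several phrasings of the same idea against one image and keep the best. The sketch below assumes the image and each prompt have already been embedded with CLIP; `rank_prompts` is a hypothetical helper, and the 2-D vectors are made-up toy values chosen only to show the mechanics, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_prompts(image_embedding, prompt_embeddings):
    """Return (prompt, score) pairs sorted from best to worst alignment.

    prompt_embeddings maps each prompt string to its embedding vector.
    """
    scored = [(prompt, cosine_similarity(image_embedding, emb))
              for prompt, emb in prompt_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (illustrative only).
image = [0.9, 0.1]
prompts = {
    "a red sports car on a road": [1.0, 0.0],
    "fast luxury vehicle": [0.5, 0.5],
}
for prompt, score in rank_prompts(image, prompts):
    print(f"{score:.3f}  {prompt}")
```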

✨ Credits

Developed by Rithika Baskaran as part of a creative exploration in evaluating generative AI outputs with CLIP-based models.
