Project: “Green AI Automated RAG Testing with Gemma”

Overview

We built an interactive retrieval-augmented generation (RAG) pipeline around Google's open-weight Gemma 2 2B instruction-tuned model (gemma-2-2b-it) and applied it to a standardised source text and prompt set about the Apollo 11 lunar landing.

The goal is to evaluate summarisation, reasoning, retrieval, paraphrasing, and creative tasks in a controlled, reproducible way, recording both answer quality and local sustainability metrics (energy use and carbon emissions, measured with CodeCarbon).

Model

Model ID: google/gemma-2-2b-it (Hugging Face)

Key attributes:

Open-weight decoder-only model trained by Google
Supports text-generation and conversational usage
Suitable for research, summarisation, reasoning, and retrieval tasks
Lightweight enough for deployment on modest compute resources

Model link: https://huggingface.co/google/gemma-2-2b-it

What We Did

Created a source document (source.txt) using ~1,400 words of selected Wikipedia excerpts on Apollo 11.
Defined a set of 21 standardised prompts spanning five categories: summarisation, reasoning, RAG (fact retrieval), paraphrasing, and creative generation.
Built a document retrieval component that splits the document into chunks, embeds them with sentence-transformers, and selects the top-k most relevant chunks per query.
Developed an interactive notebook workflow that:
- Accepts a question at runtime
- Runs RAG → Draft → Critic → Refiner cycles using Gemma
- Tracks local CPU/GPU energy usage and CO₂ emissions with CodeCarbon
- Logs each question, answer, timestamp, and emissions to a single append-only log file
Logged runtime latency and emissions per query for performance and sustainability insights.
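
The retrieval step listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: `chunk_words`, `toy_embed`, `cosine`, and `top_k_chunks` are hypothetical names, and the chunk size and overlap are arbitrary. `toy_embed` is a bag-of-words stand-in so the sketch runs without downloading a model. In the project the embeddings come from sentence-transformers, which would replace `toy_embed` and `cosine` together.

```python
import math
import re
from collections import Counter


def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a sentence-transformers model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_chunks(query, chunks, k=3, embed=toy_embed):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), i) for i, c in enumerate(chunks)]
    # Sort by descending score; ties keep document order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [chunks[i] for _, i in scored[:k]]


if __name__ == "__main__":
    doc = (
        "Apollo 11 launched from Kennedy Space Center on July 16, 1969. "
        "Neil Armstrong and Buzz Aldrin landed the lunar module Eagle on July 20. "
        "Michael Collins remained in lunar orbit aboard Columbia."
    )
    chunks = chunk_words(doc, chunk_size=12, overlap=2)
    print(top_k_chunks("Who remained in lunar orbit?", chunks, k=1))
```

Overlapping chunks reduce the chance that a fact split across a chunk boundary is missed. The selected chunks are then passed to the model as context.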

How to Use

Clone the repository:

git clone <YOUR_REPO_URL>
cd your_repo_folder

Place your source.txt into ./data/.
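Add your Hugging Face access token (API key) in the config cell. google/gemma-2-2b-it is a gated model, so the account behind the token must have accepted the Gemma licence on the model page.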
Run the notebook setup cells, then use the interactive prompt cell to ask questions.
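
What the interactive prompt cell does for each question can be sketched roughly as below. This is an illustration under assumptions, not the notebook's code: `rag_cycle` and `answer_and_log` are hypothetical names, the prompt wording is invented, and the JSON Lines log format with its field names is an assumption. `generate` is any callable that maps a prompt string to a completion, for example a wrapper around Gemma. `tracker` is any object with `start()` and `stop()` methods. CodeCarbon's `EmissionsTracker` has that shape, and its `stop()` returns emissions in kg CO₂-eq.

```python
import json
import time
from datetime import datetime, timezone


def rag_cycle(question, context_chunks, generate):
    """Run Draft -> Critic -> Refiner using any generate(prompt) -> str callable."""
    context = "\n".join(context_chunks)
    base = f"Context:\n{context}\n\nQuestion: {question}\n"
    # 1. Draft: answer from the retrieved context only.
    draft = generate(base + "Answer using only the context above.")
    # 2. Critic: review the draft against the same context.
    critique = generate(
        base + f"Draft answer: {draft}\n"
        "List any errors or omissions in the draft, judged against the context."
    )
    # 3. Refiner: rewrite the draft in light of the critique.
    final = generate(
        base + f"Draft answer: {draft}\nCritique: {critique}\n"
        "Rewrite the draft so that it addresses the critique."
    )
    return {"draft": draft, "critique": critique, "answer": final}


def answer_and_log(question, context_chunks, generate, log_path, tracker=None):
    """Answer one question and append a record to a JSON Lines log file."""
    if tracker is not None:
        tracker.start()
    start = time.perf_counter()
    result = rag_cycle(question, context_chunks, generate)
    latency_s = time.perf_counter() - start
    emissions_kg = tracker.stop() if tracker is not None else None

    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": result["answer"],
        "latency_s": latency_s,
        "emissions_kg": emissions_kg,  # None when no tracker is supplied
    }
    # Mode "a" only ever appends, so earlier records are never rewritten.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Passing `generate` and `tracker` in as arguments keeps the cycle testable without loading the model. The three calls per question also mean each answer costs roughly three generations' worth of latency and energy.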

Why This Matters

Reproducibility — fixed source text and prompt set allow consistent evaluation across models.
Efficiency vs. Accuracy — emissions are logged alongside outputs to explore trade-offs between model performance and energy cost.
Accessibility — uses an open model and standard Python tools, making research on small language models feasible even on laptops.
