Commit b85ec8c (1 parent 7d74113): adding README to the open_source_models folder

1 file changed: `2_open_source_models/README.md` (133 additions & 0 deletions)

# **Open-Source Model Experiments**

This directory contains four standalone experiments exploring
**local, open-source language models** for Retrieval-Augmented Generation
(RAG), model evaluation, recursive editing, and sustainability tracking
(energy & CO₂ emissions).
Each subfolder includes its own notebook, documentation, outputs, and
model-specific setup.

---

## Directory Structure

```text
2_open_source_models/
├── distilled_models/
│   └── rag_and_distilled_model/
├── quantized_models/
│   └── mistral7b/
└── slm/
    ├── google_gemm/
    └── qwen/
```

Each subfolder contains a self-contained model with its own README,
notebook(s), generated outputs, and energy/emissions logs where applicable.

---

## Project Summaries

Below is a concise description of each model project, so the purpose
of the folder can be understood at a glance.

---

### **1. Distilled Models – RAG + Instruction-Tuned Distilled LMs**

**Folder:** `distilled_models/rag_and_distilled_model/`
**Notebook:** `Apollo11_rag&distilled.ipynb`

This project uses a lightweight **LaMini-Flan-T5-248M** distilled model
combined with a **MiniLM** embedding model to run a fully local
Retrieval-Augmented Generation pipeline on the Apollo 11 dataset.
It demonstrates:

* Local embeddings and ChromaDB vector storage
* RAG-based question answering
* Evaluation across several prompt types
* Emissions tracking and generated output logs

Ideal for showing how **compact distilled models** can handle
RAG efficiently on CPU or modest GPU hardware.

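The core retrieval step of such a pipeline can be sketched without downloading any models. The minimal version below uses a toy bag-of-words embedder as a stand-in for MiniLM and a plain cosine-similarity search in place of ChromaDB; all names are illustrative, not the notebook's actual code:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' -- a stand-in for MiniLM sentence vectors."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=2):
    """Rank chunks by similarity to the query and keep the top k (the RAG step)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Apollo 11 landed on the Moon on July 20, 1969.",
    "Neil Armstrong was the mission commander.",
    "The Saturn V rocket launched from Kennedy Space Center.",
]
question = "Who was the mission commander of Apollo 11?"
context = retrieve(question, chunks, k=1)
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
```

The real pipeline has the same shape: embed the chunks, store them in ChromaDB, retrieve top-k per question, and feed the assembled prompt to the distilled model.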
---

### **2. Quantized Models – Mistral 7B RAG Pipeline**

**Folder:** `quantized_models/mistral7b/`

This project evaluates a **quantized Mistral-7B (GGUF)** model running
fully locally via `llama-cpp-python`.
It focuses on:

* Retrieval-Augmented Generation using LlamaIndex
* Local inference using a 4-bit quantized LLM
* Document processing, embedding (BGE-small), and top-k retrieval
* Practical observations on feasibility and performance on a laptop

A strong example of how quantization enables
**large-model capability at small-device cost**.

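To make the 4-bit idea concrete, here is a minimal sketch of symmetric round-to-nearest quantization of one weight block into the signed 4-bit range. This shows the principle only; GGUF's actual block formats (block sizes, scales, zero-points) are more elaborate:

```python
def quantize_4bit(weights):
    """Map floats to signed 4-bit integers (-8..7) with one shared scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid div-by-zero on all-zero blocks
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07, -0.21, 0.44, -0.02, 0.5]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))  # bounded by scale / 2
```

Each weight now costs 4 bits plus a share of one scale instead of 32 bits, which is roughly the memory saving that lets a 7B-parameter model fit on a laptop.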
---

### **3. Small Language Model (SLM): Google Gemma 2-2B**

**Folder:** `slm/google_gemm/`

This experiment implements a structured RAG workflow with Google’s lightweight
**Gemma 2-2B** model and a fixed Apollo 11 source text.
Key features include:

* Standardized 21-prompt evaluation set
* RAG pipeline with chunked retrieval
* Draft → Critic → Refiner multi-step generation
* Real-time emissions logging with CodeCarbon
* Fully reproducible testing and reporting

This project demonstrates how even very small open-weight models can
perform multi-step reasoning when paired with thoughtful prompting and
revision cycles.

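The Draft → Critic → Refiner cycle can be sketched with a stub in place of the model call; `generate` below is a hypothetical stand-in for the local Gemma invocation, not the notebook's actual function:

```python
def generate(prompt):
    """Stub standing in for a local model call (e.g. Gemma 2-2B via transformers)."""
    return f"[model output for: {prompt.splitlines()[0][:40]}]"

def draft_critic_refine(question, context, rounds=1):
    """One draft followed by `rounds` critique-and-refine passes."""
    answer = generate(f"Context: {context}\nQuestion: {question}\nWrite a draft answer.")
    history = [("draft", answer)]
    for _ in range(rounds):
        critique = generate(f"Critique this answer for accuracy and completeness:\n{answer}")
        history.append(("critic", critique))
        answer = generate(f"Rewrite the answer to address the critique.\n"
                          f"Answer: {answer}\nCritique: {critique}")
        history.append(("refine", answer))
    return answer, history

answer, history = draft_critic_refine(
    "Who flew on Apollo 11?", "Crew: Armstrong, Aldrin, Collins.", rounds=2)
```

Keeping the full `history` is what makes the revision cycle auditable in the saved outputs.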
---

### **4. Small Language Model (SLM): Qwen 2.5B + Recursive Editing**

**Folder:** `slm/qwen/`

This notebook experiments with **Qwen 2.5B**, integrating:

* RAG retrieval
* A recursive editing loop (Draft → Critic → Refine)
* Context retrieval through Hugging Face embeddings
* Energy + CO₂ logging for each query

Outputs are saved in Markdown with all iterations and emissions data.

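A minimal sketch of such a Markdown log writer; the section layout and field names here are illustrative, not the notebook's actual schema:

```python
def to_markdown(query, iterations, energy_kwh, co2_kg):
    """Render one query's editing iterations plus its emissions as Markdown."""
    lines = [f"## Query: {query}", ""]
    for i, text in enumerate(iterations, start=1):
        lines += [f"### Iteration {i}", "", text, ""]
    lines.append(f"*Energy: {energy_kwh:.6f} kWh, CO₂: {co2_kg:.6f} kg*")
    return "\n".join(lines)

report = to_markdown(
    "Who commanded Apollo 11?",
    ["Draft answer.", "Refined answer."],
    energy_kwh=0.000042,
    co2_kg=0.000015,
)
```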
---
## Purpose of This Collection

This folder exists to:

* Compare how different **model sizes**, **architectures**, and
  **inference strategies** behave on the **same tasks**.
* Demonstrate **fully local RAG pipelines** using only open-source components.
* Document **energy and carbon trade-offs** in local LLM usage.
* Provide reproducible examples that can be extended or rerun with other models.

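That same-task comparison can be sketched as a tiny harness; the model callables below are stubs standing in for the real pipelines in each subfolder:

```python
def run_comparison(models, prompts):
    """Run every prompt through every model and collect one row per pair."""
    return [
        {"model": name, "prompt": p, "answer": model(p)}
        for name, model in models.items()
        for p in prompts
    ]

models = {
    "distilled": lambda p: f"distilled answer to: {p}",   # stand-in for LaMini-Flan-T5
    "quantized": lambda p: f"quantized answer to: {p}",   # stand-in for Mistral-7B GGUF
}
prompts = ["Who commanded Apollo 11?", "When did Apollo 11 land?"]
table = run_comparison(models, prompts)
```

Because every model sees the identical prompt list, the rows are directly comparable across sizes and inference strategies.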
Each subfolder is designed as a standalone experiment, but together they
form a cohesive study of open-source LLM efficiency and performance.

---

## Notes

* All code is intended to run locally.
* Each folder includes its own notebook and README with instructions.
* Energy/emissions reporting is included where relevant (via CodeCarbon).
* Datasets and prompts are standardized across projects for fairness and comparability.
