This repository presents the **ELO2 – GREEN AI Project**, developed
within the **MIT Emerging Talent – AI & ML Program (2025)**. The work
investigates the technical performance, sustainability traits, and
human-perceived quality of **open-source language models**
compared to commercial systems.

---
**To what extent can open-source LLMs provide competitive output quality
while operating at significantly lower environmental cost?**

![image](readme_images/trade-off.png)

### Motivation

Large commercial LLMs deliver strong performance but demand substantial
energy and computational resources. This motivates exploring efficient,
open-source alternatives for everyday tasks.

## 🧪 Methods

![image](readme_images/project-timeline.png)

### 1. Model Families

The study evaluates several open-source model groups:
These configurations serve as the optimized open-source setups used in
the comparison against commercial models.

### 2. Tasks & Dataset

Evaluation tasks include:

- summarization
- factual reasoning
- paraphrasing
- short creative writing
- instruction following
- question answering
All prompts were drawn directly from shared source material. Using a single,
consistent source ensured that every model was tested under identical
informational conditions, allowing clear and fair comparison of output
quality and relevance.

### 3. RAG Pipeline

Retrieval-Augmented Generation (RAG) was applied to multiple model
families. RAG improved factual grounding in nearly all models.

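The repository documents the exact pipeline components used; as a rough illustration only, the retrieval step of a RAG setup can be sketched with a toy bag-of-words similarity search. The `retrieve` and `build_prompt` helpers and the sample documents below are hypothetical, not the project's actual code:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved passages so the model answers from them."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "CodeCarbon estimates the CO2 emissions of computing workloads.",
    "RAG grounds model answers in retrieved documents.",
    "PUE measures data-center energy overhead.",
]
print(build_prompt("What does RAG do?", docs))
```

Real pipelines replace the bag-of-words scoring with dense embeddings and a vector index, but the grounding idea is the same: the model is asked to answer from retrieved text rather than from memory alone.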
### 4. Recursive Editing Framework

A lightweight iterative refinement procedure was implemented. This
approach allowed weaker SLMs to yield higher-quality results without
relying on large models.

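The concrete refinement steps are documented in the repository; the general shape of such a loop can be sketched as follows, where `toy_edit` is a hypothetical stand-in for a real model-driven editing pass:

```python
FILLERS = {"basically", "actually", "very", "really"}

def toy_edit(text: str) -> str:
    """Stand-in for a model call: drop one round of filler words."""
    return " ".join(w for w in text.split() if w.lower() not in FILLERS)

def recursive_edit(text: str, edit_fn=toy_edit, max_rounds: int = 5) -> str:
    """Re-apply the editor until the draft stops changing or the budget runs out."""
    for _ in range(max_rounds):
        revised = edit_fn(text)
        if revised == text:  # fixed point: no further improvement
            break
        text = revised
    return text

print(recursive_edit("This is basically a very really simple draft"))
# "This is a simple draft"
```

The fixed-point check plus a round budget is what keeps such a loop "lightweight": it stops as soon as another pass would change nothing, so cheap models are only invoked while they still improve the draft.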
### 5. Environmental Measurement

Environmental footprint data was captured with **CodeCarbon**, recording:

- CPU/GPU energy usage
- Carbon emissions
- PUE-adjusted overhead

These measurements enabled comparison with published metrics for
commercial LLMs.

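As a simplified illustration of this kind of accounting (not CodeCarbon's actual implementation), measured energy can be converted to CO2e by applying a facility-overhead multiplier (PUE) and a grid carbon-intensity factor. The default values below are illustrative placeholders, not the study's measurements:

```python
def emissions_kg(energy_kwh: float, pue: float = 1.2,
                 intensity: float = 0.475) -> float:
    """Estimate CO2e (kg) for a measured workload.

    energy_kwh -- device energy measured during inference
    pue        -- Power Usage Effectiveness (facility overhead multiplier)
    intensity  -- grid carbon intensity in kgCO2e per kWh (illustrative)
    """
    return energy_kwh * pue * intensity

# 0.05 kWh of inference at PUE 1.2 on an average grid:
print(round(emissions_kg(0.05), 4))  # 0.0285
```

Because the intensity factor varies by region and the PUE by facility, the same inference run can differ several-fold in reported CO2e, which is why tools like CodeCarbon resolve these factors per location.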
### 6. Human Evaluation (Single-Blind)

A structured Google Form experiment was conducted. Outputs were randomized
and anonymized to avoid bias. This provided a perception-based counterpart
to technical evaluation.

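A single-blind preparation step of this kind can be sketched as follows; the model names and the `blind` helper are hypothetical, not the project's actual evaluation code:

```python
import random
import string

def blind(outputs: dict[str, str], seed: int = 0):
    """Shuffle model outputs and replace model names with neutral labels.

    Returns the labeled outputs shown to raters and a key for un-blinding
    the responses during analysis.
    """
    rng = random.Random(seed)
    models = list(outputs)
    rng.shuffle(models)  # randomize presentation order
    labels = [f"Output {c}" for c in string.ascii_uppercase[: len(models)]]
    key = dict(zip(labels, models))
    return [(label, outputs[key[label]]) for label in labels], key

outputs = {"gpt-x": "text 1", "mistral-7b": "text 2", "phi-3": "text 3"}
blinded, key = blind(outputs)
# raters see only "Output A/B/C"; `key` maps labels back to models
```

Keeping the label-to-model key separate from the form is what makes the design single-blind: raters cannot infer which system produced a text, while analysts can still attribute scores afterwards.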
### 7. Analysing the Results

....

### 8. Publishing an Article

....

---

## 📊 Key Findings
- FINDING3.....
- FINDING4.....

---

## 🔮 Future Work

- Evaluate additional open-source model families across diverse tasks
- Test optimized pipelines in specialized domains (medical, legal, technical writing)
- Track carbon footprint across full lifecycle (training to deployment)
- Conduct ablation studies isolating RAG vs. recursive editing contributions

---

## 📢 Communication Strategy

The research findings will be shared through formats designed for different
audiences and purposes:

### For Researchers

A comprehensive research article will document the complete experimental design,
statistical analysis, and implications.

🔗 **[View Article](link1)**

### For Practitioners & Educators

An executive presentation provides a visual overview of the research question,
methodology, and key findings without requiring deep technical background.

🔗 **[View Presentation](link2)**

### For the Community

A public evaluation study invites participation in assessing AI-generated texts.
This crowdsourced data forms a critical component of the research.

🔗 **[Participate in Study](link3)**

### For Reproducibility

All materials (dataset, prompts, model outputs, evaluation scripts, and carbon
tracking logs) are publicly available in this repository.

🔗 **[Browse Repository](https://github.com/banuozyilmaz2-jpg/ELO2-GREEN-AI)**

---


## 🙏 Acknowledgments

Special thanks to the **MIT Emerging Talent Program** for their guidance and
feedback throughout the project.