Commit d2ae928

Merge pull request #15 from MIT-Emerging-Talent/meeting_minutes
Meeting minutes: Adding milestone 1&2 meeting notes + Meeting notes ReadMe
2 parents e6df924 + 1dc6416 commit d2ae928

3 files changed

Lines changed: 360 additions & 0 deletions

meeting_minutes/README.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
<!-- markdownlint-disable MD013 -->
2+
3+
# 🗓️ Meeting Minutes – Environmental Impact of AI Models
4+
5+
This directory documents the weekly progress and decision-making process for the research project on **the environmental and performance trade-offs between large proprietary and small open-source AI models**.
6+
7+
Each meeting entry outlines team discussions, feedback, experimental progress, and assigned tasks across project milestones.
8+
9+
## 🧭 Milestone 1 – Scoping & Research Question Refinement
10+
11+
**Timeline:** September 27 – October 14, 2025
12+
13+
The first milestone focused on refining the research direction and defining a clear, measurable problem within **Green AI**. After exploring various AI-related topics, the team finalized the project title — **“Green AI Benchmarking of Foundation Models”** — and the research question:
14+
15+
> Can open-source LLMs match the accuracy of commercial models while reducing environmental impact?
16+
>
17+
18+
Key progress included reviewing literature on energy, carbon, and water use in AI systems, selecting benchmark tasks (**reasoning** and **summarization**), and identifying evaluation metrics for **accuracy** and **environmental footprint**. The team also chose comparison models (**GPT-4** and **Mistral-7B**), created shared documentation, and distributed responsibilities among members.
19+
20+
By the end of Milestone 1, the project established its scope, research framework, and collaborative infrastructure, setting the stage for **Milestone 2**, focused on tool setup and metric calibration.
21+
22+
## ⚙️ Milestone 2 – Tool Setup & Experiment Planning
23+
24+
**Timeline:** October 15 – Ongoing
25+
26+
With the research framework and scope finalized in Milestone 1, **Milestone 2** focuses on preparing the experimental environment and defining how sustainability metrics will be measured. This phase involves setting up tools such as **CodeCarbon**, **CarbonTracker**, and **Eco2AI** to monitor energy and carbon usage, and exploring **Water Usage Effectiveness (WUE)** datasets from major cloud providers like AWS, Microsoft, and Google.
27+
28+
The team also plans to configure testing environments for small open-source models (e.g., **Mistral**, **LLaMA-2**) using **Hugging Face Transformers**, **PyTorch**, and GPU-enabled platforms such as **Colab**. Another core deliverable is the **experimental design document**, which will outline the metrics (energy, carbon, water, and accuracy), workflows, and methodology diagrams guiding the model evaluation process.
29+
30+
This milestone sets the foundation for **Milestone 3**, where real model experiments and energy tracking will begin.
Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
1+
<!-- markdownlint-disable MD024 -->
2+
<!-- Disabled MD024 (Multiple headings with the same content) rule
3+
because repeated headings (Summary, Action Items) are
4+
intentionally used across multiple sections for structural clarity. -->
5+
# Milestone 1 Meeting Minutes
6+
7+
## Meeting 1
8+
9+
**Date:** September 27, 2025 (Saturday, 10:00 AM EST)
10+
**Attendees:** Amro, Aseel, Reem, Caesar, Banu
11+
12+
### Summary
13+
14+
- Group members met and introduced themselves.
15+
- Project topic suggestions were presented:
16+
- *AI Jobs vs Real Jobs* (continuation of CDSP)
17+
- *Reddit Mental Health Text Analysis*
18+
- *Machine Learning for Climate–Environmental Data*
19+
20+
### Action Items
21+
22+
- Conduct a **domain search** on the proposed topics.
23+
- Bring **alternative project ideas** to the next meeting.
24+
- Create a [**Google Doc**](https://docs.google.com/document/d/1dk0j0GUoDWqBHmLArcS2xoW5ct5nOjdlCeX3P-yhhOw/edit?tab=t.0)
25+
to facilitate asynchronous collaboration.
26+
27+
---
28+
29+
## Meeting 2
30+
31+
**Date:** September 29, 2025 (Monday, 12:00 PM EST)
32+
**Attendees:** Amro, Aseel, Reem, Caesar, Banu
33+
34+
### Summary
35+
36+
- Members presented new project ideas and ELO2 process plans:
37+
- *Mental Health of University Students in Sudan*
38+
- *Probabilistic Dental Triage System with Synthetic Data Generation for
39+
Resource-Limited Settings*
40+
- *Project: Green AI Benchmarking of Foundation Models*
41+
- *Green AI — Energy & Water Efficiency in Machine Learning*
42+
- Previously proposed topics were dropped due to various constraints.
43+
- The new ideas were discussed, but no final consensus was reached.
44+
45+
### Action Items
46+
47+
- All members will research the newly proposed topics before the next meeting.
48+
- The group will reach a **final decision** on the project topic at the next
49+
session.
50+
51+
---
52+
53+
## Meeting 3
54+
55+
**Date:** September 30, 2025 (Tuesday, 1:30 PM EST)
56+
**Attendees:** Amro, Aseel, Reem, Caesar, Banu
57+
58+
### Summary
59+
60+
- The topics discussed in the previous meeting were revisited.
61+
- After evaluating the group’s collective knowledge, experience, and skills,
62+
the team decided that **“Project: Green AI Benchmarking of Foundation
63+
Models”** was the most suitable topic for the ELO2 project.
64+
65+
### Action Items
66+
67+
- Conduct **domain research** on the selected project topic.
68+
69+
---
70+
71+
## Meeting 4
72+
73+
**Date:** October 5, 2025 (Sunday, 12:00 PM EST)
74+
**Attendees:** Amro, Aseel, Reem, Caesar, Banu, Safia
75+
76+
### Summary
77+
78+
- Safia officially joined the project team.
79+
- Amro presented a [**two-month (ELO2 deadline) project plan**](https://docs.google.com/document/d/19OCqflqeRLHzdPs9URrRWPzIdh3g1uw9TgX7-d_SXp8/edit?tab=t.0#heading=h.qd58vuomlp42).
80+
- The team discussed **how to kick off the project**, including **milestones,
81+
constraints, and deliverables**.
82+
- During domain research, Reem found a [**recently published study**](https://mitemergingtalent.slack.com/files/U082U854W8Y/F09JUBJQ9C2/2505.09598v4.pdf)
83+
with striking methodological similarities to the group’s topic and shared it
84+
with the team.
85+
86+
### Action Items
87+
88+
- Seek **Evan’s feedback** on how to proceed with the project in light of the
89+
new findings.
90+
91+
---
92+
93+
## Meeting 5
94+
95+
**Date:** October 7, 2025 (Tuesday, 11:00 AM EST)
96+
**Attendees:** Amro, Aseel, Reem, Caesar, Banu, Safia
97+
98+
### Summary
99+
100+
- Based on Evan’s feedback, the group decided to **extend the topic-finalization
101+
phase** by approximately two weeks and to focus on adjusting the project topic.
102+
- Members proposed ways to **refine the project and make it more original**,
103+
such as:
104+
- Comparing *Big AI vs Small AI* models
105+
- Evaluating *Accuracy vs Eco-Friendliness*
106+
107+
### Action Items
108+
109+
- Conduct **in-depth research** to refine and strengthen the project’s
110+
originality.
111+
- Review the **sources cited in the research paper** previously shared by Reem.
112+
113+
---
114+
115+
## Meeting 6
116+
117+
**Date:** October 9, 2025 (Thursday, 10:30 AM EST)
118+
**Attendees:** Amro, Aseel, Reem, Caesar, Banu, Safia
119+
120+
### Summary
121+
122+
- The group held a **brainstorming session** to further develop and differentiate
123+
the project topic.
124+
- Amro drafted a [**preliminary project plan**](https://mitemergingtalent.slack.com/files/U082U854W8Y/F09KJCKUEUB/approach_.pdf)
125+
based on the discussion.
126+
127+
### Action Items
128+
129+
- Agreed to hold another meeting the following day to finalize the details.
130+
- A GitHub repository will be created for the project.
131+
132+
---
133+
134+
## Meeting 7
135+
136+
**Date:** October 10, 2025 (Friday, 12:00 PM EST)
137+
**Attendees:** Amro, Reem, Caesar, Banu
138+
139+
### Summary
140+
141+
- A **new and original research question** was finalized:
142+
*“To what extent can open-source LLMs achieve comparable accuracy to
143+
corporate (commercial) models while significantly reducing environmental
144+
footprint?”*
145+
- A new [**Google Doc**](https://docs.google.com/document/d/1BAoWHe8D3c_QAEFugS1CNEUqU5jugBwg1dFJE6-LVQo/edit?tab=t.0)
146+
was created to share useful resources and references for the project.
147+
148+
### Action Items
149+
150+
- All members to gain **basic knowledge about RAG and distilled models**.
151+
- **Banu and Aseel:** Select which models to use.
152+
- **Caesar and Safia:** Define how to measure **accuracy metrics**.
153+
- **Amro and Reem:** Define how to measure **environmental cost metrics**.
154+
155+
---
156+
157+
## Meeting 8
158+
159+
**Date:** October 14, 2025 (Tuesday, 1:30 PM EST)
160+
**Attendees:** Amro, Aseel, Caesar, Banu, Safia
161+
162+
### Summary
163+
164+
- Members presented progress on their assigned tasks from the previous meeting.
165+
- **Aseel & Banu:** Selected *GPT-4* (commercial) and *Mistral-7B* (open-source)
166+
models. Evaluation will focus on *reasoning* and *summarization* using *MMLU*
167+
and *Math* datasets. Detailed documentation can be found in the
168+
[Model Evaluation Report](https://docs.google.com/document/d/1oOYIdLDumoZyYqgsQuBXDlXr1yZfo1sJNNanmIEGD8I/edit?tab=t.0).
169+
- **Caesar & Safia:** Suggested using the *LightEval* library with a customized
170+
dataset. Caesar demonstrated how to split the *GSM8K* dataset into a
171+
500-example subset. Detailed documentation can be found in the
172+
[Accuracy Notes](https://docs.google.com/document/d/19L4vX-67O-fNNSmY9S8QaHUULZwgzwmKVoJGdzSsUWo/edit?tab=t.0).
173+
- **Amro & Reem:** Presented environmental metrics and detailed evaluation
174+
methods for environmental factors.
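
The minutes do not record how the 500-example GSM8K subset was drawn. One reproducible way to do it is a seeded random sample without replacement, sketched below with the standard library only. The function name `sample_subset`, the seed value, and the stand-in records are assumptions for illustration; the `question` and `answer` keys mirror the field names of GSM8K rows.

```python
import random


def sample_subset(examples, k=500, seed=42):
    """Return k examples drawn uniformly without replacement.

    A fixed seed makes the subset reproducible across runs, so every
    model is scored on exactly the same questions.
    """
    if k > len(examples):
        raise ValueError(f"cannot draw {k} examples from {len(examples)}")
    rng = random.Random(seed)  # private RNG; leaves global random state alone
    return rng.sample(examples, k)


# Stand-in records (hypothetical data, not real GSM8K content).
records = [{"id": i, "question": f"q{i}", "answer": f"a{i}"} for i in range(1000)]
subset = sample_subset(records)
print(len(subset))
```

Fixing the seed in one place keeps the accuracy comparison fair: a different subset per model would mix sampling noise into the scores.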
175+
176+
### Action Items
177+
178+
- Review all presented work by **October 16th**.
179+
- Meet again on **October 16th** to **discuss task allocation** for the second
180+
milestone.
Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
1+
<!-- markdownlint-disable MD024 MD013 -->
2+
<!-- Disabled MD024 (Multiple headings with the same content) rule
3+
because repeated headings (Summary, Action Items) are
4+
intentionally used across multiple sections for structural clarity.
5+
Disabled MD013 (Line length) rule because mathematical formulas
6+
and technical content require longer lines for readability. -->
7+
8+
# Milestone 2 Meeting Minutes
9+
10+
## **Meeting 9**
11+
12+
**Date:** October 16, 2025 (Thursday, 2:00 PM EST)
13+
14+
**Attendees:** Amro, Aseel, Caesar, Safia
15+
16+
### **Summary**
17+
18+
- The team decided to change the project approach due to limited access to environmental data (energy, carbon, and water consumption) for commercial AI models such as GPT, Claude, and Gemini.
19+
- Since large-scale testing requires computational resources beyond the team’s capacity, the new plan focuses on evaluating open-source models using laptop hardware.
20+
- Results will be compared with published environmental and performance data of commercial models to highlight how open-source AI can provide sustainable and accessible alternatives.
21+
22+
### **Action Items**
23+
24+
1. **Research and calculate environmental cost metrics:**
25+
- **Energy Consumption:**
26+
27+
$$E_{\text{total}} = \left(P_{\text{GPU}} \times U_{\text{GPU}} + P_{\text{CPU}} \times U_{\text{CPU}} + P_{\text{others}}\right) \times t$$
28+
29+
- **Facility Overhead:**
30+
31+
$$E_{\text{facility}} = E_{\text{total}} \times \text{PUE}$$
32+
33+
- **Carbon Footprint:**
34+
35+
$$C_{\text{emissions}} = E_{\text{facility}} \times \text{CI}$$
36+
37+
- **Water Footprint:**
38+
39+
$$W_{\text{consumed}} = E_{\text{facility}} \times \text{WUE}$$
40+
41+
2. Determine which model sizes laptop hardware can handle (small, medium, and large, up to 3B parameters).
42+
3. Apply FLOPs-based linear scaling and empirical interpolation to improve result accuracy.
43+
4. Add all presented work from the previous meeting (model selection, evaluation methodology, environmental metrics) to the **domain study section** of the repository.
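
As a rough sketch of how the four environmental cost formulas in item 1 chain together, the Python below computes them in order. The units (watts, hours, kg CO2e per kWh, litres per kWh), the names `Footprint`, `estimate_footprint`, and `scale_energy_by_flops`, and all example numbers are assumptions made for illustration, not values from the project. The scaling helper shows one possible reading of the FLOPs-based linear scaling in item 3: energy is assumed proportional to FLOPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    energy_kwh: float    # E_total: energy drawn by the machine
    facility_kwh: float  # E_facility: energy including facility overhead
    carbon_kg: float     # C_emissions
    water_l: float       # W_consumed


def estimate_footprint(p_gpu_w, u_gpu, p_cpu_w, u_cpu, p_others_w,
                       hours, pue, ci_kg_per_kwh, wue_l_per_kwh):
    """Apply the four formulas in order (units are assumptions)."""
    # E_total = (P_GPU*U_GPU + P_CPU*U_CPU + P_others) * t, with W*h -> kWh
    avg_power_w = p_gpu_w * u_gpu + p_cpu_w * u_cpu + p_others_w
    energy_kwh = avg_power_w * hours / 1000.0
    facility_kwh = energy_kwh * pue               # E_facility = E_total * PUE
    carbon_kg = facility_kwh * ci_kg_per_kwh      # C = E_facility * CI
    water_l = facility_kwh * wue_l_per_kwh        # W = E_facility * WUE
    return Footprint(energy_kwh, facility_kwh, carbon_kg, water_l)


def scale_energy_by_flops(measured_kwh, measured_flops, target_flops):
    """Linear scaling: assume energy grows in proportion to FLOPs."""
    if measured_flops <= 0:
        raise ValueError("measured_flops must be positive")
    return measured_kwh * target_flops / measured_flops


# Illustrative numbers only. PUE = 1.0 assumes no data-center overhead,
# which is a plausible choice for a laptop run.
run = estimate_footprint(p_gpu_w=30.0, u_gpu=0.5, p_cpu_w=20.0, u_cpu=0.5,
                         p_others_w=5.0, hours=2.0, pue=1.0,
                         ci_kg_per_kwh=0.4, wue_l_per_kwh=1.8)
print(run)
```

Keeping each formula as its own line makes it easy to swap in a measured energy value (for example, from a tracking tool) while reusing the PUE, CI, and WUE steps unchanged.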
44+
45+
---
46+
47+
## **Meeting 10**
48+
49+
**Date:** October 19, 2025 (Saturday, 12:00 PM EST)
50+
51+
**Attendees:** Amro, Aseel, Caesar, Banu, Reem
52+
53+
### Summary
54+
55+
- The group discussed options for testing and running AI models.
56+
- Ideas included running quantized models locally (with some accuracy loss) and using Google Colab for limited runs.
57+
- Another idea was to use the Hugging Face API for accuracy and RAG testing, though this approach does not allow measuring environmental costs.
58+
- The team also explored Recursive Reasoning Models as efficient and environmentally friendly alternatives, though task variety for testing remains limited.
59+
60+
### Action Items
61+
62+
1. Watch the video about recursive models and explore whether a small-scale recursive model can be built.
63+
2. If possible, compare its accuracy and environmental impact with a distilled model (e.g., **DistilGPT**).
64+
3. If not feasible, return to comparing **basic**, **RAG**, **distilled**, and **commercial models**.
65+
66+
---
67+
68+
## **Meeting 11**
69+
70+
**Date:** October 22, 2025 (Wednesday, 12:00 PM EST)
71+
72+
**Attendees:** Amro, Aseel, Caesar, Reem, Safia, Banu
73+
74+
### Summary
75+
76+
- Following office hour feedback from Evan, the team decided to focus on **small language models (SLMs)** due to their efficiency.
77+
- The group agreed to compare open-source SLMs with distilled commercial models.
78+
- It was decided to apply **RAG techniques** (via the **Ragas Python library**) to quantized, SLM, and recursive models to narrow the gap with commercial systems.
79+
- Because of the project’s evolving direction, the final deliverable will shift from a **dashboard** to a **research paper or article**.
80+
- The team also plans to create a **Google Form** later to assess public and expert awareness of the topic.
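
For readers new to RAG, the idea is to retrieve the passages most relevant to a question and place them in the prompt before the model answers. The sketch below is a toy illustration using only the standard library and bag-of-words cosine similarity; it is not the team's implementation and does not use the Ragas library. The function names and the sample passages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(passages,
                    key=lambda p: cosine(q, Counter(tokenize(p))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble a prompt that grounds the answer in retrieved context."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")


# Hypothetical knowledge base for illustration.
passages = [
    "CodeCarbon estimates the carbon emissions of running code.",
    "Mistral-7B is an open-source language model.",
    "Water usage effectiveness relates water use to energy use.",
]
query = "What does CodeCarbon estimate?"
print(build_prompt(query, passages, k=1))
```

A real pipeline would typically replace the word-count similarity with dense embeddings and pass the prompt to a language model; the retrieve-then-prompt structure stays the same.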
81+
82+
### Action Items
83+
84+
- **Reem:** Test DistilBERT on Hugging Face
85+
- **Aseel:** Research commercial models
86+
- **Amro:** Test the RAG method
87+
- **Caesar:** Combine Distilled + RAG models
88+
- **Safia:** Combine SLM + RAG models
89+
- **Banu:** Develop a unified test prompt (e.g., a poem or short text)
90+
- **All:** Prepare the GitHub repository
91+
92+
### **Future Tasks**
93+
94+
- Create and distribute an awareness form
95+
- Develop a communication strategy
96+
- Publish the research article
97+
98+
---
99+
100+
## **Meeting 12**
101+
102+
**Date:** October 27, 2025 (Monday, 1:00 PM EST)
103+
104+
**Attendees:** Amro, Aseel, Caesar, Reem, Safia, Banu
105+
106+
### Summary
107+
108+
- Team members presented updates on their assigned tasks from the previous meeting.
109+
- **Reem** shared findings on **DistilBERT**, concluding that the model performed poorly for the project’s needs.
110+
- **Caesar** presented a **DistilBERT + RAG demo**, confirming similar inefficiencies; both suggested that RAG could still be valuable if paired with a more capable distilled model.
111+
- **Amro** demonstrated his **RAG implementation**, discussed constraints, and noted ongoing refinements.
112+
- **Safia** showcased her **SLM + RAG demo** and shared documentation.
113+
- **Aseel** and **Banu** gave updates on **commercial model research** and **test prompt development**, respectively.
114+
- The team discussed next research directions:
115+
- Experiment with **recursive models**
116+
- Search for a more efficient **distilled model**
117+
- Possibly abandon commercial model comparisons in favor of evaluating specific approaches or model-task pairings
118+
119+
### Action Items
120+
121+
1. All members continue their respective research and experiments.
122+
2. Push all updates and outputs to the **GitHub repository** before the **ELO2 Midpoint Breakout Room Session** on **Wednesday, October 29**.
123+
3. Identify a better distilled model for testing.
124+
4. Evaluate test prompts on **SLM + RAG models**.
125+
5. Hold a follow-up meeting on **Thursday** to review progress and next steps.
126+
127+
---
128+
129+
## **Meeting 13**
130+
131+
**Date:** October 31, 2025 (Friday, 12:00 PM EST)
132+
133+
**Attendees:** Amro, Aseel, Banu, Caesar
134+
135+
### Summary
136+
137+
- The originally planned follow-up meeting was postponed due to scheduling conflicts.
138+
- **Amro** presented his **RAG demo** using **Banu’s test prompts** — the model answered most questions correctly but added unnecessary details and struggled with harder ones. Some hallucinations were observed.
139+
- **Caesar** discovered a new, improved distilled model (**MBZUAI/LaMini-Flan-T5-248M**), applied **RAG**, and shared a demo. It performed well on most test prompts except the hard ones.
140+
- The team outlined a **two-week roadmap** focused on **coding and technical tasks**, followed by **repository organization**.
141+
142+
### Action Items
143+
144+
- Prioritize coding tasks now; clean and organize the repository later.
145+
- **Amro:** Continue refining RAG implementation.
146+
- **Caesar:** Test the **CodeCarbon** library on the new model.
147+
- **Banu:** Add a **generative paragraph task** to test prompts and create **three new prompts** for it (for use in the upcoming Google Form).
148+
- **Aseel:** Prepare a draft for the **main README**.
149+
- Team to explore **recursive models** in the coming days.
150+
- Use **Slack** actively for communication and finalize the next meeting date later.
