Commit 1ba740e

committed
feat(meetings): add Milestone 3 and Milestone 4 sections to meeting-minutes folder README
1 parent c1538c5 commit 1ba740e

1 file changed

Lines changed: 23 additions & 2 deletions

File tree

meeting_minutes/README.md

Lines changed: 23 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,5 @@
11
<!-- markdownlint-disable MD013 -->
2+
<!-- Disabled MD013 (Line length) for better readability -->
23

34
# 🗓️ Meeting Minutes – Environmental Impact of AI Models
45

@@ -21,10 +22,30 @@ By the end of Milestone 1, the project established its scope, research framework
2122

2223
## ⚙️ Milestone 2 – Tool Setup & Experiment Planning
2324

24-
**Timeline:** October 15 – Ongoing
25+
**Timeline:** October 15 – November 6, 2025
2526

2627
With the research framework and scope finalized in Milestone 1, **Milestone 2** focuses on preparing the experimental environment and defining how sustainability metrics will be measured. This phase involves setting up tools such as **CodeCarbon**, **CarbonTracker**, and **Eco2AI** to monitor energy and carbon usage, and exploring **Water Usage Effectiveness (WUE)** datasets from major cloud providers like AWS, Microsoft, and Google.
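
As a rough illustration of how the metrics above relate to one another, the sketch below derives carbon and water estimates from a measured energy figure. It assumes the usual conversions (energy times grid carbon intensity, and energy times WUE in liters per kWh). The function name, the dataclass, and every number in it are hypothetical and are not taken from the project's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class FootprintEstimate:
    energy_kwh: float
    carbon_kg: float
    water_liters: float


def estimate_footprint(
    energy_kwh: float,
    grid_intensity_kg_per_kwh: float,
    wue_liters_per_kwh: float,
) -> FootprintEstimate:
    """Derive carbon and water estimates from measured energy use.

    All three inputs are assumptions supplied by the caller: the energy
    reading would come from a tracking tool, while the intensity and WUE
    values would come from provider or regional datasets.
    """
    if min(energy_kwh, grid_intensity_kg_per_kwh, wue_liters_per_kwh) < 0:
        raise ValueError("inputs must be non-negative")
    return FootprintEstimate(
        energy_kwh=energy_kwh,
        carbon_kg=energy_kwh * grid_intensity_kg_per_kwh,
        water_liters=energy_kwh * wue_liters_per_kwh,
    )


# Hypothetical example values, for illustration only.
estimate = estimate_footprint(
    energy_kwh=2.0,
    grid_intensity_kg_per_kwh=0.4,
    wue_liters_per_kwh=1.8,
)
print(estimate)
```

WUE-based figures of this kind only cover water used on site at the data center, so they are best read as a lower bound rather than a full water footprint.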
2728

2829
The team also plans to configure testing environments for small open-source models (e.g., **Mistral**, **LLaMA-2**) using **Hugging Face Transformers**, **PyTorch**, and GPU-enabled platforms such as **Colab**. Another core deliverable is the **experimental design document**, which will outline the metrics (energy, carbon, water, and accuracy), workflows, and methodology diagrams guiding the model evaluation process.
2930

30-
This milestone sets the foundation for **Milestone 3**, where real model experiments and energy tracking will begin.
31+
By the end of Milestone 2, the team completed the technical setup, finalized the measurement pipeline, and validated that all tracking tools operate consistently across model types. This ensures a smooth transition into Milestone 3, where the full experiments will be executed.
32+
33+
## 📊 Milestone 3 – Model Benchmarking & Data Collection
34+
35+
**Timeline:** November 7 – November 18, 2025
36+
37+
Milestone 3 marks the beginning of the full experimental phase. Using the measurement pipeline and tooling established in Milestone 2, the team runs benchmark tasks on both proprietary and open-source models to collect data on **accuracy** and **environmental impact**. This includes tracking **energy consumption and carbon emissions** for each model tested, under consistent test conditions.
38+
39+
During this phase, the team also validates accuracy results on selected reasoning and summarization tasks, investigates irregular outputs, and updates evaluation scripts when needed. Additional observations such as **inference time, token throughput**, and **hardware utilization** are recorded to support later analysis.
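
One possible way to record inference time and token throughput is sketched below. It assumes a `generate` callable that takes a prompt and returns a sequence of tokens. That callable, the function name, and the whitespace "tokenizer" in the usage lines are hypothetical stand-ins, not the team's evaluation scripts.

```python
import time
from typing import Callable, Sequence


def measure_throughput(generate: Callable[[str], Sequence], prompt: str) -> dict:
    """Time one generation call and report tokens per second.

    `generate` is a hypothetical stand-in for whatever produces tokens
    (a local model, an API wrapper, ...); only wall-clock time is measured.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    n_tokens = len(tokens)
    return {
        "inference_time_s": elapsed,
        "tokens": n_tokens,
        "tokens_per_s": n_tokens / elapsed if elapsed > 0 else float("inf"),
    }


# Usage with a dummy generator that just splits the prompt on whitespace.
stats = measure_throughput(lambda p: p.split(), "summarize this short text")
print(stats["tokens"], "tokens generated")
```

For real models, repeating the call several times and discarding a warm-up run usually gives steadier numbers than a single measurement.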
40+
41+
By the end of Milestone 3, the project has produced a complete experimental dataset covering sustainability metrics and accuracy scores for all evaluated models, providing a strong foundation for **Milestone 4**, which focuses on human evaluation and qualitative assessment.
42+
43+
## 🧪 Milestone 4 – Human Evaluation & Survey Analysis
44+
45+
**Timeline:** November 19 – ongoing
46+
47+
Milestone 4 centers on incorporating **human judgment** into the benchmarking process. The team prepares a Google Form survey designed to compare model outputs side-by-side. Participants evaluate **clarity, coherence, informativeness, factuality,** and **overall preference**.
48+
49+
Once responses are collected, the team analyzes the results by aggregating scores, assessing agreement among reviewers, and comparing human preferences with automated accuracy metrics from earlier milestones. This helps identify where quantitative and qualitative assessments align or diverge.
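
The aggregation and agreement steps described above could look roughly like the following. This is a minimal sketch under stated assumptions: the row layout, reviewer and model names, and scores are all invented for illustration, and raw pairwise agreement is used as one simple choice of agreement measure, not necessarily the one the team applies.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical survey rows: (reviewer, item, preferred_model, clarity_score 1-5)
responses = [
    ("r1", "q1", "model_a", 4),
    ("r2", "q1", "model_a", 5),
    ("r3", "q1", "model_b", 3),
    ("r1", "q2", "model_b", 2),
    ("r2", "q2", "model_b", 3),
    ("r3", "q2", "model_b", 4),
]


def mean_score_by_item(rows):
    """Average the numeric score across reviewers for each survey item."""
    scores = defaultdict(list)
    for _reviewer, item, _preferred, score in rows:
        scores[item].append(score)
    return {item: mean(values) for item, values in scores.items()}


def pairwise_agreement(rows):
    """Fraction of reviewer pairs that preferred the same model, pooled over items."""
    by_item = defaultdict(list)
    for _reviewer, item, preferred, _score in rows:
        by_item[item].append(preferred)
    agree = total = 0
    for prefs in by_item.values():
        for a, b in combinations(prefs, 2):
            agree += a == b
            total += 1
    return agree / total if total else float("nan")


print(mean_score_by_item(responses))
print(round(pairwise_agreement(responses), 3))
```

Raw agreement does not correct for agreement expected by chance. A chance-corrected statistic such as Cohen's or Fleiss' kappa would be a natural refinement if reviewer counts allow it.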
50+
51+
By the end of Milestone 4, the project integrates the human evaluation results into the broader dataset, enabling a more nuanced understanding of model performance and laying the groundwork for **Milestone 5**.
