Meeting 9
Date: October 16, 2025 (Thursday, 2:00 PM EST)
Attendees: Amro, Aseel, Caesar, Safia
Summary
- The team decided to change the project approach due to limited access to environmental data (energy, carbon, and water consumption) for commercial AI models such as GPT, Claude, and Gemini.
- Since large-scale testing requires computational resources beyond the team’s capacity, the new plan focuses on evaluating open-source models using laptop hardware.
- Results will be compared with published environmental and performance data of commercial models to highlight how open-source AI can provide sustainable and accessible alternatives.
Action Items
- Research and calculate environmental cost metrics:
  - Energy consumption: E_total = (P_GPU × U_GPU + P_CPU × U_CPU + P_others) × t
  - Facility overhead: E_facility = E_total × PUE (power usage effectiveness)
  - Carbon footprint: C_emissions = E_facility × CI (carbon intensity)
  - Water footprint: W_consumed = E_facility × WUE (water usage effectiveness)
- Determine how large a model the laptop hardware can handle (small, medium, and large models up to 3B parameters).
- Apply FLOPs-based linear scaling and empirical interpolation to improve result accuracy.
- Add all presented work from previous meeting (model selection, evaluation methodology, environmental metrics) to the domain study section of the repository.
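The environmental-cost formulas above can be sketched in a short script. All hardware figures below (power draws, utilization, PUE, carbon intensity, WUE) are illustrative assumptions for a laptop setup, not measurements:

```python
# Worked example of the meeting's environmental-cost formulas.
# Every number here is an assumed placeholder, not a measured value.

P_GPU = 45.0      # laptop GPU power draw, watts (assumed)
U_GPU = 0.80      # GPU utilization during inference (assumed)
P_CPU = 15.0      # CPU power draw, watts (assumed)
U_CPU = 0.50      # CPU utilization (assumed)
P_OTHERS = 10.0   # memory, storage, display, etc., watts (assumed)
t = 0.5           # runtime, hours (assumed)

PUE = 1.2         # power usage effectiveness (assumed facility overhead)
CI = 0.4          # carbon intensity, kg CO2e per kWh (assumed grid mix)
WUE = 1.8         # water usage effectiveness, liters per kWh (assumed)

# E_total = (P_GPU*U_GPU + P_CPU*U_CPU + P_others) * t, converted to kWh
e_total_kwh = (P_GPU * U_GPU + P_CPU * U_CPU + P_OTHERS) * t / 1000.0

e_facility = e_total_kwh * PUE   # E_facility = E_total * PUE
c_emissions = e_facility * CI    # C_emissions = E_facility * CI (kg CO2e)
w_consumed = e_facility * WUE    # W_consumed = E_facility * WUE (liters)

# FLOPs-based linear scaling: assuming energy scales roughly linearly with
# inference FLOPs, a run on a larger model can be estimated from a small one.
flops_small, flops_large = 1e11, 6e11   # hypothetical per-query FLOPs
e_large_est = e_total_kwh * (flops_large / flops_small)

print(f"E_total     = {e_total_kwh:.5f} kWh")
print(f"E_facility  = {e_facility:.5f} kWh")
print(f"C_emissions = {c_emissions:.5f} kg CO2e")
print(f"W_consumed  = {w_consumed:.5f} L")
print(f"E_large est = {e_large_est:.5f} kWh")
```

In practice the power and utilization terms would come from tooling such as CodeCarbon rather than constants, and the FLOPs ratio would be checked against empirical interpolation as noted above.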
Meeting 10
Date: October 19, 2025 (Saturday, 12:00 PM EST)
Attendees: Amro, Aseel, Caesar, Banu, Reem
Summary
- The group discussed options for testing and running AI models.
- Ideas included running quantized models locally (with some accuracy loss) and using Google Colab for limited runs.
- Another idea was to use the Hugging Face API for accuracy and RAG testing, though this approach does not allow measuring environmental costs.
- The team also explored Recursive Reasoning Models as efficient and environmentally friendly alternatives, though task variety for testing remains limited.
Action Items
- Watch the video about recursive models and explore whether a small-scale recursive model can be built.
- If possible, compare its accuracy and environmental impact with a distilled model (e.g., DistilGPT).
- If not feasible, return to comparing basic, RAG, distilled, and commercial models.
Meeting 11
Date: October 22, 2025 (Wednesday, 12:00 PM EST)
Attendees: Amro, Aseel, Caesar, Reem, Safia, Banu
Summary
- Following office hour feedback from Evan, the team decided to focus on small language models (SLMs) due to their efficiency.
- The group agreed to compare open-source SLMs with distilled commercial models.
- The team decided to apply RAG techniques (with evaluation via the Ragas Python library) to quantized, SLM, and recursive models to narrow the gap with commercial systems.
- Because the project’s direction has gradually evolved, the final deliverable will change from a dashboard to a research paper or article.
- The team also plans to create a Google Form later to assess public and expert awareness of the topic.
Action Items
- Reem: Test DistilBERT on Hugging Face
- Aseel: Research commercial models
- Amro: Test the RAG method
- Caesar: Combine Distilled + RAG models
- Safia: Combine SLM + RAG models
- Banu: Develop a unified test prompt (e.g., a poem or short text)
- All: Prepare the GitHub repository
Future Tasks
- Create and distribute an awareness form
- Develop a communication strategy
- Publish the research article
Meeting 12
Date: October 27, 2025 (Monday, 1:00 PM EST)
Attendees: Amro, Aseel, Caesar, Reem, Safia, Banu
Summary
- Team members presented updates on their assigned tasks from the previous meeting.
- Reem shared a document summarizing her findings on DistilBERT, concluding that the model performed poorly for the project’s needs.
- Caesar presented a DistilBERT + RAG demo, which showed similar inefficiencies. Both results suggested that RAG could still be valuable if paired with a more capable distilled model.
- Amro demonstrated his current RAG implementation, discussed the technical constraints he encountered, and noted that he is continuing to refine the setup.
- Safia showcased her SLM + RAG demo and shared the related documentation.
- Aseel and Banu gave status updates on their respective tasks—commercial model research and unified test prompt development—and shared their documents.
- The team discussed the next research directions, including:
- experimenting with recursive models,
- searching for a more efficient distilled model, and
- possibly abandoning commercial model comparisons in favor of evaluating specific approaches or model-task pairings.
Further research and experiments will help determine the best path forward.
Action Items
- All members will continue with their respective research and experiments.
- All updates and outputs must be pushed to the GitHub repository before the ELO2 Midpoint Breakout Room Session on Wednesday, October 29.
- The team will work on identifying a better distilled model for subsequent testing.
- Test prompts will be evaluated on SLM + RAG models.
- A follow-up meeting will be held on Thursday to review progress and next steps.
Meeting 13
Date: October 31, 2025 (Friday, 12:00 PM EST)
Attendees: Amro, Aseel, Banu, Caesar
Summary
- The originally planned follow-up meeting was postponed to today due to team members’ scheduling conflicts.
- Amro presented his current work on RAG with a demo, testing Banu’s test prompts. The model was able to answer most questions correctly, but provided extra unnecessary details and struggled particularly with medium and hard questions. Some hallucinations were observed.
- Caesar discovered a new, improved distilled model, MBZUAI/LaMini-Flan-T5-248M, applied RAG to it, and shared a demo. This model successfully answered nearly all test prompt questions, except the hard ones.
- A roadmap for project progress was outlined. Considering team members’ upcoming schedules, the plan for the next two weeks is to focus primarily on coding and technical tasks, followed by cleaning and organizing the repository.
Action Items
- Prioritize coding tasks now; repository cleaning and organization will follow later.
- Amro will continue working on his RAG task and implement improvements.
- Caesar will test the CodeCarbon library on the new model he prepared.
- Banu will add a generative paragraph task to the test prompts and create three additional prompts for it, for inclusion in the Google Form planned in previous meetings.
- Aseel will prepare a draft for the main README.
- Team members will explore the recursive model in the coming days.
- Slack will be used more actively for communication, and the date for the next meeting will be decided later.