diff --git a/google_gemma/README.md b/google_gemma/README.md new file mode 100644 index 0000000..fab2946 --- /dev/null +++ b/google_gemma/README.md @@ -0,0 +1,70 @@ +# Project: “Green AI Automated RAG Testing with Gemma” + +## Overview + +We built an interactive retrieval-augmented generation (RAG) pipeline using the +open-weight model *Gemma 2-2b* by Google and applied it to a standardized text +and prompt set derived from the Apollo 11 lunar landing. + +**The goal:** + +* evaluate summarisation, reasoning, retrieval, paraphrasing, and creative tasks +in a controlled, reproducible way — logging both answer quality and local +sustainability metrics (energy/carbon emissions) via CodeCarbon. + +## Model + +**Model ID:** `google/gemma-2-2b-it` (Hugging Face) + +**Key attributes:** + +* Open-weight decoder-only model trained by Google +* Supports text-generation and conversational usage +* Suitable for research, summarisation, reasoning, and retrieval tasks +* Lightweight enough for deployment on modest compute resources + +Model link: [https://huggingface.co/google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) + +## What We Did + +1. Created a source document (`source.txt`) using ~1,400 words of selected + Wikipedia excerpts on Apollo 11. +2. Defined a set of 21 standardised prompts spanning five categories: + summarisation, reasoning, RAG (fact retrieval), paraphrasing, and creative + generation. +3. Built a document retrieval component using sentence-transformers to chunk + the document and select top-k relevant chunks per query. +4. Developed an interactive notebook workflow that: + + * Accepts a question at runtime + * Runs RAG → Draft → Critic → Refiner cycles using Gemma + * Tracks local CPU/GPU energy usage and CO₂ emissions with CodeCarbon + * Logs each question, answer, timestamp, and emissions to a single + append-only log file +5. Logged runtime latency and emissions per query for performance and + sustainability insights. 
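The retrieval step in item 3 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. To stay dependency-free it swaps the sentence-transformers embeddings for plain word-count vectors, so it shows only the chunk, score, and select-top-k flow. The names `chunk_text`, `embed`, `cosine`, and `top_k_chunks`, and the `max_words` and `k` defaults, are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words.

    max_words is an assumed value; the project's real chunk size is not stated.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Stand-in embedding: lowercase word counts.

    This is NOT a sentence-transformers vector; it only keeps the sketch
    runnable with the standard library.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first.

    Ties are broken by original chunk order.
    """
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(c)), i) for i, c in enumerate(chunks)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [chunks[i] for _, i in scored[:k]]


if __name__ == "__main__":
    chunks = [
        "Eagle landed at 20:17:40 UTC on Sunday July 20 with 216 pounds (98 kg) of usable fuel remaining.",
        "Aldrin joined Armstrong on the surface and described the view as magnificent desolation.",
        "Two types of rocks were found in the geological samples: basalt and breccia.",
    ]
    for chunk in top_k_chunks("How much usable fuel remained when Eagle landed?", chunks, k=1):
        print(chunk)
```

In the real pipeline the selected chunks would be placed in the prompt sent to Gemma. Replacing `embed` with a dense sentence-embedding model should leave the ranking logic unchanged, though the similarity scores would differ.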
+ +## How to Use + +1. Clone the repository: + + ```bash + git clone + cd your_repo_folder + ``` + +2. Place your `source.txt` into `./data/`. + +3. Add your Hugging Face API key in the config cell. + +4. Run the notebook setup cells, then use the interactive prompt cell to ask + questions. + +## Why This Matters + +* **Reproducibility** — fixed source text and prompt set allow consistent + evaluation across models. +* **Efficiency vs. Accuracy** — emissions are logged alongside outputs to + explore trade-offs between model performance and energy cost. +* **Accessibility** — uses an open model and standard Python tools, making + research on small language models feasible even on laptops. diff --git a/google_gemma/data/source.txt b/google_gemma/data/source.txt new file mode 100644 index 0000000..51a6874 --- /dev/null +++ b/google_gemma/data/source.txt @@ -0,0 +1,127 @@ +Apollo 11 – Lunar Descent and Moonwalk + +As the descent began, Armstrong and Aldrin found themselves passing landmarks on the +surface two or three seconds early, and reported that they were “long”; they would land +miles west of their target point. Eagle was traveling too fast. The problem could have been +mascons—concentrations of high mass in a region or regions of the Moon’s crust that +contains a gravitational anomaly, potentially altering Eagle’s trajectory. + +Five minutes into the descent burn, and 6,000 feet (1,800 m) above the surface of the +Moon, the LM guidance computer (LGC) distracted the crew with the first of several +unexpected 1201 and 1202 program alarms. Inside Mission Control Center, computer +engineer Jack Garman told Guidance Officer Steve Bales it was safe to continue the +descent, and this was relayed to the crew. The program alarms indicated “executive +overflows”, meaning the guidance computer could not complete all its tasks in real-time and +had to postpone some of them. 
Margaret Hamilton, the Director of Apollo Flight Computer +Programming at the MIT Charles Stark Draper Laboratory later recalled: “To blame the +computer for the Apollo 11 problems is like blaming the person who spots a fire and calls +the fire department. Actually, the computer was programmed to do more than recognize +error conditions. A complete set of recovery programs was incorporated into the software. +The software’s action, in this case, was to eliminate lower priority tasks and re-establish +the more important ones. The computer, rather than almost forcing an abort, prevented an +abort. If the computer hadn’t recognized this problem and taken recovery action, I doubt if +Apollo 11 would have been the successful Moon landing it was.” + +When Armstrong again looked outside, he saw that the computer’s landing target was in a +boulder-strewn area just north and east of a 300-foot-diameter (91 m) crater, so he took +semi-automatic control. Throughout the descent, Aldrin called out navigation data to +Armstrong, who was busy piloting Eagle. Now 107 feet (33 m) above the surface, +Armstrong knew their propellant supply was dwindling and was determined to land at the +first possible landing site. + +Armstrong found a clear patch of ground and maneuvered the spacecraft towards it. They +were now 100 feet (30 m) from the surface, with only 90 seconds of propellant remaining. +Lunar dust kicked up by the LM’s engine began to impair his ability to determine the +spacecraft’s motion. + +A light informed Aldrin that at least one of the 67-inch (170 cm) probes hanging from +Eagle’s footpads had touched the surface and he said: “Contact light!” Three seconds later, +Eagle landed and Armstrong shut the engine down. Aldrin immediately said “Okay, engine +stop.” + +Eagle landed at 20:17:40 UTC on Sunday July 20 with 216 pounds (98 kg) of usable fuel +remaining. 
Information available to the crew and mission controllers during the landing +showed the LM had enough fuel for another 25 seconds of powered flight before an abort +without touchdown would have become unsafe, but post-mission analysis showed that the +real figure was probably closer to 50 seconds. + +Armstrong acknowledged Aldrin’s completion of the post-landing checklist with “Engine +arm is off”, before responding to the CAPCOM, Charles Duke, with the words, “Houston, +Tranquility Base here. The Eagle has landed.” Duke expressed the relief at Mission Control: +“Roger, Twan—Tranquility, we copy you on the ground. You got a bunch of guys about to +turn blue. We’re breathing again. Thanks a lot.” + +Preparations for Neil Armstrong and Buzz Aldrin to walk on the Moon began at 23:43 UTC. +These took longer than expected; three and a half hours instead of two. Six hours and +thirty-nine minutes after landing, Armstrong and Aldrin were ready to go outside, and +Eagle was depressurized. + +Eagle’s hatch was opened at 02:39:33. Armstrong initially had some difficulties squeezing +through the hatch with his portable life support system (PLSS). At 02:51 Armstrong began +his descent to the lunar surface. Climbing down the nine-rung ladder, Armstrong pulled a +D-ring to deploy the modular equipment stowage assembly (MESA) folded against Eagle’s +side and activate the TV camera. + +Despite some technical and weather difficulties, black and white images of the first lunar +EVA were received and broadcast to at least 600 million people on Earth. 
+ +After describing the surface dust as “very fine-grained” and “almost like a powder”, at +02:56:15, six and a half hours after landing, Armstrong stepped off Eagle’s landing pad and +declared: “That’s one small step for [a] man, one giant leap for mankind.” + +Armstrong intended to say “That’s one small step for a man”, but the word “a” is not +audible in the transmission, and thus was not initially reported by most observers of the live +broadcast. When later asked about his quote, Armstrong said he believed he said “for a +man”, and subsequent printed versions of the quote included the “a” in square brackets. + +About seven minutes after stepping onto the Moon’s surface, Armstrong collected a +contingency soil sample using a sample bag on a stick. Twelve minutes after the sample +was collected, he removed the TV camera from the MESA and made a panoramic sweep, +then mounted it on a tripod. Aldrin joined Armstrong on the surface. He described the view +with the simple phrase: “Magnificent desolation.” + +Armstrong said moving in the lunar gravity, one-sixth of Earth’s, was “even perhaps easier +than the simulations … It’s absolutely no trouble to walk around.” Aldrin joined him on the +surface and tested methods for moving around, including two-footed kangaroo hops. The +PLSS backpack created a tendency to tip backward, but neither astronaut had serious +problems maintaining balance. The fine soil was quite slippery. + +The astronauts planted the Lunar Flag Assembly containing a flag of the United States on +the lunar surface, in clear view of the TV camera. Aldrin remembered, “Of all the jobs I had +to do on the Moon the one I wanted to go the smoothest was the flag raising.” But the +astronauts struggled with the telescoping rod and could only insert the pole about 2 inches +(5 cm) into the hard lunar surface. 
Before Aldrin could take a photo of Armstrong with the +flag, President Richard Nixon spoke to them through a telephone-radio transmission, which +Nixon called “the most historic phone call ever made from the White House.” + +They deployed the EASEP, which included a Passive Seismic Experiment Package used to +measure moonquakes and a retroreflector array used for the lunar laser ranging +experiment. Then Armstrong walked 196 feet (60 m) from the LM to take photographs at +the rim of Little West Crater while Aldrin collected two core samples. He used the +geologist’s hammer to pound in the tubes—the only time the hammer was used on Apollo +11—but was unable to penetrate more than 6 inches (15 cm) deep. + +The astronauts then collected rock samples using scoops and tongs on extension handles. +Many of the surface activities took longer than expected, so they had to stop documenting +sample collection halfway through the allotted 34 minutes. Aldrin shoveled 6 kilograms +(13 lb) of soil into the box of rocks to pack them in tightly. Two types of rocks were found in +the geological samples: basalt and breccia. + +While on the surface, Armstrong uncovered a plaque mounted on the LM ladder, bearing +two drawings of Earth, an inscription, and signatures of the astronauts and President Nixon. +The inscription read: “Here men from the planet Earth first set foot upon the Moon July +1969, A. D. We came in peace for all mankind.” + +Mission Control used a coded phrase to warn Armstrong his metabolic rates were high, and +that he should slow down. As metabolic rates remained generally lower than expected for +both astronauts throughout the walk, Mission Control granted the astronauts a 15-minute +extension. + +Aldrin entered Eagle first. With some difficulty the astronauts lifted film and two sample +boxes containing 21.55 kilograms (47.5 lb) of lunar surface material to the LM hatch using a +flat cable pulley device called the Lunar Equipment Conveyor (LEC). 
Armstrong then +jumped onto the ladder’s third rung, and climbed into the LM. After transferring to LM life +support, the explorers lightened the ascent stage for the return to lunar orbit by tossing out +their PLSS backpacks, lunar overshoes, an empty Hasselblad camera, and other equipment. +The hatch was closed again at 05:11:13. They then pressurized the LM and settled down to +sleep. diff --git a/google_gemma/emissions.csv b/google_gemma/emissions.csv new file mode 100644 index 0000000..4454146 --- /dev/null +++ b/google_gemma/emissions.csv @@ -0,0 +1,5 @@ +timestamp,project_name,run_id,experiment_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,water_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue,wue +2025-11-15T15:47:16,codecarbon,27c11b81-b7a7-47c9-ab82-9ae317559fe9,5b0fa12a-3dd7-45bb-9766-cc326314d9f1,45.95574010000564,0.0005071162574599,1.103488391997087e-05,42.5,0.0,10.0,0.0007198499136804,0.0,0.0001693503708332,0.0008892002845137,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N,1.0,0.0 +2025-11-15T15:51:00,codecarbon,7005ea2e-d9a3-40aa-900e-f95b5e5623a2,5b0fa12a-3dd7-45bb-9766-cc326314d9f1,158.46234359999653,0.0013179222183238,8.316942614774643e-06,42.5,0.0,10.0,0.0018707337542361,0.0,0.0004401698980555,0.0023109036522917,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N,1.0,0.0 
+2025-11-15T15:52:55,codecarbon,f2cce301-9b4f-4e6a-b97d-5179efd64332,5b0fa12a-3dd7-45bb-9766-cc326314d9f1,47.64462380000623,0.0003962538208029,8.316863251272021e-06,42.5,0.0,10.0,0.0005624669925697,0.0,0.0001323421463889,0.0006948091389586,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N,1.0,0.0 +2025-11-15T16:05:41,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,5b0fa12a-3dd7-45bb-9766-cc326314d9f1,23.700261899997713,0.0014245466564409932,6.010678964020776e-05,42.5,0.0,10.0,0.002022092715903293,0.0,0.0004757709861110521,0.002497863702014345,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N,1.0,0.0 diff --git a/google_gemma/emissions_base_27c11b81-b7a7-47c9-ab82-9ae317559fe9.csv b/google_gemma/emissions_base_27c11b81-b7a7-47c9-ab82-9ae317559fe9.csv new file mode 100644 index 0000000..231216d --- /dev/null +++ b/google_gemma/emissions_base_27c11b81-b7a7-47c9-ab82-9ae317559fe9.csv @@ -0,0 +1,2 @@ +task_name,timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,water_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud +RAG Query,2025-11-15T15:47:13,codecarbon,27c11b81-b7a7-47c9-ab82-9ae317559fe9,42.43775189999724,0.00035304605540907173,8.316975773721854e-06,42.5,0.0,10.0,0.0005011351056248814,0.0,0.00011791160861111065,0.0006190467142359921,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N diff --git a/google_gemma/emissions_base_685c527f-27ef-47b7-9955-a38ea34424e8.csv b/google_gemma/emissions_base_685c527f-27ef-47b7-9955-a38ea34424e8.csv new file mode 100644 index 0000000..ee1a15c --- /dev/null +++ 
b/google_gemma/emissions_base_685c527f-27ef-47b7-9955-a38ea34424e8.csv @@ -0,0 +1,17 @@ +task_name,timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,water_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud +RAG Query,2025-11-15T16:00:34,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,5.143026200006716,4.276275102764759e-05,8.31687815598402e-06,42.5,0.0,10.0,6.0700336944399985e-05,0.0,1.4281779722174785e-05,7.498211666657477e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_5da8a26a-c4d2-4bcf-add1-905e45e76295,2025-11-15T16:01:03,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,4.355888399994001,3.621624132552722e-05,8.316909939855327e-06,42.5,0.0,10.0,5.1407813472259386e-05,0.0,1.2095360833336599e-05,6.350317430559598e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_82cfcf38-ae6d-4284-8275-eae3dc613a3b,2025-11-15T16:01:21,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,3.6073870999971405,2.99936220247417e-05,8.316813191449018e-06,42.5,0.0,10.0,4.257522500010964e-05,0.0,1.001693083332308e-05,5.2592155833432733e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_e51b9fbe-d856-4af9-a60a-4a7c67e8b8a1,2025-11-15T16:01:46,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,9.219469099974958,7.661944272522639e-05,8.316912257452313e-06,42.5,0.0,10.0,0.00010875838187506212,0.0,2.5589569444451216e-05,0.00013434795131951334,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG 
Query_92692222-ca85-4dc5-80dc-105a50e7a8ca,2025-11-15T16:02:02,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,6.141885800025193,5.1067201964595185e-05,8.316873934293634e-06,42.5,0.0,10.0,7.248822194453372e-05,0.0,1.7055289722177095e-05,8.954351166671082e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_ae3b9135-290a-41e0-b857-610d59826c68,2025-11-15T16:02:20,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,7.603243899997324,6.320639832673667e-05,8.316767053409205e-06,42.5,0.0,10.0,8.972000159721775e-05,0.0,2.1108918888866258e-05,0.00011082892048608405,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_20e216ad-7983-45fa-b17a-0b7533829968,2025-11-15T16:02:32,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,8.463724199973512,7.036219887974961e-05,8.3158945177321e-06,42.5,0.0,10.0,9.987748506937351e-05,0.0,2.3498735555606086e-05,0.00012337622062497957,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_38cd6e78-939a-4960-af2f-637b6b44d07c,2025-11-15T16:03:16,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,42.18541020000703,0.0003507050408846856,8.313335372644498e-06,42.5,0.0,10.0,0.0004978106387502015,0.0,0.00011713123611116317,0.0006149418748613649,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_6f05ecd8-86db-4856-88ef-c74ef0e67f0b,2025-11-15T16:03:27,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,5.265247900009854,4.379344308081052e-05,8.316873507222905e-06,42.5,0.0,10.0,6.216346201365222e-05,0.0,1.4625916111092195e-05,7.678937812474447e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG 
Query_3fd23cb5-9a28-4ad2-bc6d-f4a34b86c021,2025-11-15T16:03:37,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,1.0712502999813296,8.912184456301496e-06,8.316454375127905e-06,42.5,0.0,10.0,1.2651076527860463e-05,0.0,2.975945555550041e-06,1.562702208341045e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_e4bcd89d-c749-4d36-ae0d-ba00e40673a9,2025-11-15T16:04:08,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,14.916377299989108,0.0001240567429616939,8.316933593438408e-06,42.5,0.0,10.0,0.00017609339736120297,0.0,4.143323388885214e-05,0.00021752663125005505,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_82cbfc8d-358d-49f0-8953-c69d381f3f26,2025-11-15T16:04:16,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,2.063675499986857,1.716327376177708e-05,8.31683347718052e-06,42.5,0.0,10.0,2.4362905416698208e-05,0.0,5.731945000055343e-06,3.009485041675355e-05,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_2821dd0b-9b0e-4a8b-83c0-d6227bba46c8,2025-11-15T16:04:37,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,9.451002499990864,7.859097282431884e-05,8.31680596193543e-06,42.5,0.0,10.0,0.0001115576963194244,0.0,2.6247223888882824e-05,0.00013780492020830717,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_2b078761-f6ac-4d1c-8b7c-55ae00f04424,2025-11-15T16:04:52,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,11.863896100025158,9.864592993043373e-05,8.316879166602493e-06,42.5,0.0,10.0,0.00014002445840269502,0.0,3.294571888885306e-05,0.00017297017729154813,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG 
Query_0ead3c08-48e2-42cf-9495-5744a003289a,2025-11-15T16:05:16,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,16.271289299998898,0.0001353386514635099,8.31693587654185e-06,42.5,0.0,10.0,0.00019210761104186528,0.0,4.5201222222224e-05,0.00023730883326408955,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N +RAG Query_038aa8b4-70d0-4b0f-8416-d5b2f375f320,2025-11-15T16:05:30,codecarbon,685c527f-27ef-47b7-9955-a38ea34424e8,12.32425589999184,0.00010248386803964776,8.315196431279565e-06,42.5,0.0,10.0,0.00014547186402774572,0.0,3.422792527772901e-05,0.00017969978930547446,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N diff --git a/google_gemma/emissions_base_7005ea2e-d9a3-40aa-900e-f95b5e5623a2.csv b/google_gemma/emissions_base_7005ea2e-d9a3-40aa-900e-f95b5e5623a2.csv new file mode 100644 index 0000000..33e330f --- /dev/null +++ b/google_gemma/emissions_base_7005ea2e-d9a3-40aa-900e-f95b5e5623a2.csv @@ -0,0 +1,2 @@ +task_name,timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,water_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud +RAG Query,2025-11-15T15:49:03,codecarbon,7005ea2e-d9a3-40aa-900e-f95b5e5623a2,42.1122289999912,0.00035017224684081086,8.316906185488746e-06,42.5,0.0,10.0,0.0004970553783332233,0.0,0.00011695227166662033,0.0006140076499998436,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N diff --git a/google_gemma/emissions_base_f2cce301-9b4f-4e6a-b97d-5179efd64332.csv b/google_gemma/emissions_base_f2cce301-9b4f-4e6a-b97d-5179efd64332.csv new file mode 100644 index 0000000..93e83f1 --- 
/dev/null +++ b/google_gemma/emissions_base_f2cce301-9b4f-4e6a-b97d-5179efd64332.csv @@ -0,0 +1,2 @@ +task_name,timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,water_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud +RAG Query,2025-11-15T15:52:50,codecarbon,f2cce301-9b4f-4e6a-b97d-5179efd64332,42.26910139998654,0.0003517129052918133,8.316909249224448e-06,42.5,0.0,10.0,0.0004992425298613954,0.0,0.00011746657944450918,0.0006167091093059046,0.0,Egypt,EGY,,,,Windows-11-10.0.26100-SP0,3.13.5,3.0.8,12,12th Gen Intel(R) Core(TM) i5-1235U,,,,,7.692127227783203,machine,N diff --git a/google_gemma/gemma.ipynb b/google_gemma/gemma.ipynb new file mode 100644 index 0000000..001850d --- /dev/null +++ b/google_gemma/gemma.ipynb @@ -0,0 +1,2090 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "82882a24", + "metadata": {}, + "source": [ + "# Cell 1 — Install required packages\n", + "Installs Hugging Face client, sentence-transformers, CodeCarbon, and other required libraries.\n", + "These are needed for embeddings, API calls, and energy tracking.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "6c6a33dc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Requirement already satisfied: huggingface_hub in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (0.36.0)\n", + "Collecting huggingface_hub\n", + " Using cached huggingface_hub-1.1.4-py3-none-any.whl.metadata (13 kB)\n", + "Requirement already satisfied: sentence-transformers in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (5.1.2)\n", + "Requirement already satisfied: codecarbon in 
c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (3.0.8)\n", + "Requirement already satisfied: pypdf in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (6.2.0)\n", + "Requirement already satisfied: torch in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (2.9.1)\n", + "Requirement already satisfied: scikit-learn in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (1.7.2)\n", + "Requirement already satisfied: pandas in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (2.3.1)\n", + "Collecting pandas\n", + " Downloading pandas-2.3.3-cp313-cp313-win_amd64.whl.metadata (19 kB)\n", + "Requirement already satisfied: filelock in c:\\users\\dell\\appdata\\roaming\\python\\python313\\site-packages (from huggingface_hub) (3.18.0)\n", + "Requirement already satisfied: fsspec>=2023.5.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (2025.10.0)\n", + "Collecting hf-xet<2.0.0,>=1.2.0 (from huggingface_hub)\n", + " Using cached hf_xet-1.2.0-cp37-abi3-win_amd64.whl.metadata (5.0 kB)\n", + "Requirement already satisfied: httpx<1,>=0.23.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (0.27.2)\n", + "Requirement already satisfied: packaging>=20.9 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (25.0)\n", + "Requirement already satisfied: pyyaml>=5.1 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (6.0.3)\n", + "Requirement already satisfied: shellingham in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (1.5.4)\n", + "Requirement already satisfied: tqdm>=4.42.1 in 
c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (4.67.1)\n", + "Collecting typer-slim (from huggingface_hub)\n", + " Using cached typer_slim-0.20.0-py3-none-any.whl.metadata (16 kB)\n", + "Requirement already satisfied: typing-extensions>=3.7.4.3 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from huggingface_hub) (4.15.0)\n", + "Requirement already satisfied: anyio in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from httpx<1,>=0.23.0->huggingface_hub) (4.11.0)\n", + "Requirement already satisfied: certifi in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from httpx<1,>=0.23.0->huggingface_hub) (2025.11.12)\n", + "Requirement already satisfied: httpcore==1.* in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from httpx<1,>=0.23.0->huggingface_hub) (1.0.9)\n", + "Requirement already satisfied: idna in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from httpx<1,>=0.23.0->huggingface_hub) (3.11)\n", + "Requirement already satisfied: sniffio in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from httpx<1,>=0.23.0->huggingface_hub) (1.3.1)\n", + "Requirement already satisfied: h11>=0.16 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from httpcore==1.*->httpx<1,>=0.23.0->huggingface_hub) (0.16.0)\n", + "Requirement already satisfied: transformers<5.0.0,>=4.41.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from sentence-transformers) (4.57.1)\n", + "Requirement already satisfied: scipy in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from sentence-transformers) (1.16.0)\n", + "Requirement already satisfied: Pillow in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages 
(from sentence-transformers) (12.0.0)\n", + "Requirement already satisfied: numpy>=1.17 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from transformers<5.0.0,>=4.41.0->sentence-transformers) (2.3.4)\n", + "Requirement already satisfied: regex!=2019.12.17 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from transformers<5.0.0,>=4.41.0->sentence-transformers) (2025.11.3)\n", + "Requirement already satisfied: requests in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from transformers<5.0.0,>=4.41.0->sentence-transformers) (2.32.5)\n", + "Requirement already satisfied: tokenizers<=0.23.0,>=0.22.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from transformers<5.0.0,>=4.41.0->sentence-transformers) (0.22.1)\n", + "Requirement already satisfied: safetensors>=0.4.3 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from transformers<5.0.0,>=4.41.0->sentence-transformers) (0.6.2)\n", + "Requirement already satisfied: arrow in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (1.3.0)\n", + "Requirement already satisfied: click in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (8.3.0)\n", + "Requirement already satisfied: fief-client[cli] in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (0.20.0)\n", + "Requirement already satisfied: prometheus_client in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (0.22.1)\n", + "Requirement already satisfied: psutil>=6.0.0 in c:\\users\\dell\\appdata\\roaming\\python\\python313\\site-packages (from codecarbon) (7.0.0)\n", + "Requirement already satisfied: py-cpuinfo in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from 
codecarbon) (9.0.0)\n", + "Requirement already satisfied: pydantic in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (2.12.4)\n", + "Requirement already satisfied: nvidia-ml-py in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (13.580.82)\n", + "Requirement already satisfied: rapidfuzz in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (3.14.3)\n", + "Requirement already satisfied: questionary in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (2.1.1)\n", + "Requirement already satisfied: rich in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (14.0.0)\n", + "Requirement already satisfied: typer in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from codecarbon) (0.16.0)\n", + "Requirement already satisfied: sympy>=1.13.3 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from torch) (1.14.0)\n", + "Requirement already satisfied: networkx>=2.5.1 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from torch) (3.5)\n", + "Requirement already satisfied: jinja2 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from torch) (3.1.6)\n", + "Requirement already satisfied: setuptools in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from torch) (80.9.0)\n", + "Requirement already satisfied: joblib>=1.2.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from scikit-learn) (1.5.2)\n", + "Requirement already satisfied: threadpoolctl>=3.1.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from scikit-learn) (3.6.0)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in 
c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from pandas) (2.9.0.post0)\n", + "Requirement already satisfied: pytz>=2020.1 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from pandas) (2025.2)\n", + "Requirement already satisfied: tzdata>=2022.7 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from pandas) (2025.2)\n", + "Requirement already satisfied: six>=1.5 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\n", + "Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from sympy>=1.13.3->torch) (1.3.0)\n", + "Requirement already satisfied: colorama in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from tqdm>=4.42.1->huggingface_hub) (0.4.6)\n", + "Requirement already satisfied: types-python-dateutil>=2.8.10 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from arrow->codecarbon) (2.9.0.20250708)\n", + "Requirement already satisfied: jwcrypto<2.0.0,>=1.4 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from fief-client[cli]->codecarbon) (1.5.6)\n", + "Requirement already satisfied: yaspin in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from fief-client[cli]->codecarbon) (3.3.0)\n", + "Requirement already satisfied: cryptography>=3.4 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from jwcrypto<2.0.0,>=1.4->fief-client[cli]->codecarbon) (46.0.3)\n", + "Requirement already satisfied: cffi>=2.0.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from cryptography>=3.4->jwcrypto<2.0.0,>=1.4->fief-client[cli]->codecarbon) (2.0.0)\n", + "Requirement already satisfied: pycparser in 
c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from cffi>=2.0.0->cryptography>=3.4->jwcrypto<2.0.0,>=1.4->fief-client[cli]->codecarbon) (2.22)\n", + "Requirement already satisfied: MarkupSafe>=2.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from jinja2->torch) (3.0.3)\n", + "Requirement already satisfied: annotated-types>=0.6.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from pydantic->codecarbon) (0.7.0)\n", + "Requirement already satisfied: pydantic-core==2.41.5 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from pydantic->codecarbon) (2.41.5)\n", + "Requirement already satisfied: typing-inspection>=0.4.2 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from pydantic->codecarbon) (0.4.2)\n", + "Requirement already satisfied: prompt_toolkit<4.0,>=2.0 in c:\\users\\dell\\appdata\\roaming\\python\\python313\\site-packages (from questionary->codecarbon) (3.0.51)\n", + "Requirement already satisfied: wcwidth in c:\\users\\dell\\appdata\\roaming\\python\\python313\\site-packages (from prompt_toolkit<4.0,>=2.0->questionary->codecarbon) (0.2.13)\n", + "Requirement already satisfied: charset_normalizer<4,>=2 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from requests->transformers<5.0.0,>=4.41.0->sentence-transformers) (3.4.4)\n", + "Requirement already satisfied: urllib3<3,>=1.21.1 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from requests->transformers<5.0.0,>=4.41.0->sentence-transformers) (2.5.0)\n", + "Requirement already satisfied: markdown-it-py>=2.2.0 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from rich->codecarbon) (3.0.0)\n", + "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in c:\\users\\dell\\appdata\\roaming\\python\\python313\\site-packages 
(from rich->codecarbon) (2.19.2)\n", + "Requirement already satisfied: mdurl~=0.1 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from markdown-it-py>=2.2.0->rich->codecarbon) (0.1.2)\n", + "Requirement already satisfied: termcolor<4.0,>=3.1 in c:\\users\\dell\\appdata\\local\\programs\\python\\python313\\lib\\site-packages (from yaspin->fief-client[cli]->codecarbon) (3.2.0)\n", + "Downloading pandas-2.3.3-cp313-cp313-win_amd64.whl (11.0 MB)\n", + " ---------------------------------------- 11.0/11.0 MB 358.1 kB/s 0:00:30\n", + "Installing collected packages: pandas\n", + " Attempting uninstall: pandas\n", + " Found existing installation: pandas 2.3.1\n", + " Uninstalling pandas-2.3.1:\n", + " Successfully uninstalled pandas-2.3.1\n", + "Successfully installed pandas-2.3.3\n" + ] + } + ], + "source": [ + "# Cell 1\n", + "!pip install --upgrade huggingface_hub sentence-transformers codecarbon pypdf torch scikit-learn pandas" + ] + }, + { + "cell_type": "markdown", + "id": "7a833bed", + "metadata": {}, + "source": [ + "# Cell 2 — Configuration and imports\n", + "Set HF API key, model ID (Gemma), data folder path, and import core libraries.\n", + "Also define file paths for answer and emission logging.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1cf3a79b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "HF model: google/gemma-2-2b-it\n", + "Data folder: c:\\Users\\DELL\\ELO2_GREEN_AI\\google-gemma\\data\n", + "Answer log file: c:\\Users\\DELL\\ELO2_GREEN_AI\\google-gemma\\answers_log.csv\n", + "Emissions log file: c:\\Users\\DELL\\ELO2_GREEN_AI\\google-gemma\\emissions_log.csv\n", + "API 
key loaded? hf_FiHWyoy...\n" + ] + } + ], + "source": [ + "# Cell 2\n", + "import os\n", + "import time\n", + "import textwrap\n", + "import glob\n", + "import numpy as np\n", + "from huggingface_hub import InferenceClient\n", + "from sentence_transformers import SentenceTransformer\n", + "from codecarbon import OfflineEmissionsTracker\n", + "\n", + "# === EDIT THIS: paste your HF API key here (read permission) ===\n", + "HF_API_KEY = \"api_org_your_hf_api_key_here\"\n", + "\n", + "# Model to use\n", + "HF_MODEL_ID = \"google/gemma-2-2b-it\"\n", + "\n", + "# Data folder must be ./data containing source.txt\n", + "DATA_PATH = os.path.join(os.getcwd(), \"data\")\n", + "\n", + "# Three-letter ISO country code used by CodeCarbon's offline tracker\n", + "YOUR_COUNTRY_ISO_CODE = \"EGY\"\n", + "\n", + "# Files for logging answers & emissions\n", + "LOG_FILE = os.path.join(os.getcwd(), \"answers_log.csv\")\n", + "EMISSIONS_FILE = os.path.join(os.getcwd(), \"emissions_log.csv\")\n", + "\n", + "# Quick checks (do not print any part of the API key: cell output is saved with the notebook)\n", + "print(\"HF model:\", HF_MODEL_ID)\n", + "print(\"Data folder:\", DATA_PATH)\n", + "print(\"Answer log file:\", LOG_FILE)\n", + "print(\"Emissions log file:\", EMISSIONS_FILE)\n", + "print(\"API key loaded?\", \"yes\" if HF_API_KEY else \"MISSING\")" + ] + }, + { + "cell_type": "markdown", + "id": "40c103f7", + "metadata": {}, + "source": [ + "# Cell 3 — Define prompts\n", + "Defines the system/user templates for the draft, critic, and refiner steps.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "87dc6a66", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Prompts ready to use.\n" + ] + } + ], + "source": [ + "# Cell 3\n", + "DRAFT_SYSTEM_PROMPT = (\n", + " \"You are an expert assistant. Answer the user's question based *only* on the provided context. \"\n", + " \"Do not make up facts beyond the context. 
If the answer is not present in the context, say so clearly.\"\n", + ")\n", + "DRAFT_USER_TEMPLATE = \"Context:\\n{context_str}\\n\\nQuestion:\\n{query_str}\"\n", + "\n", + "CRITIC_SYSTEM_PROMPT = (\n", + " \"You are a 'Critic' AI. Evaluate the 'Draft Answer' using ONLY the Source Context. \"\n", + " \"Check Faithfulness (is each claim supported by the context?) and Relevance (does it answer the question?). \"\n", + " \"Provide short bullet points describing any problems, or write 'The draft is perfect.' if there are none.\"\n", + ")\n", + "CRITIC_USER_TEMPLATE = \"Source Context:\\n{context}\\n\\nOriginal Question:\\n{question}\\n\\nDraft Answer:\\n{draft}\"\n", + "\n", + "REFINER_SYSTEM_PROMPT = (\n", + " \"You are a 'Refiner' AI. Rewrite the Draft Answer to incorporate the Critic's Feedback. \"\n", + " \"Do not add new factual information beyond what is present in the draft or context. Output only the improved answer.\"\n", + ")\n", + "REFINER_USER_TEMPLATE = \"Original Draft:\\n{draft}\\n\\nCritic's Feedback:\\n{feedback}\"\n", + "\n", + "print(\"Prompts ready to use.\")" + ] + }, + { + "cell_type": "markdown", + "id": "5cd943ba", + "metadata": {}, + "source": [ + "# Cell 4 — Load & index documents\n", + "Loads all text files from ./data, chunks text into smaller segments,\n", + "computes embeddings with SentenceTransformer, and builds a simple\n", + "cosine-similarity retriever. 
Each chunk is stored as a Node object\n", + "for easy retrieval during the RAG loop.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "dec3b6db", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "All files found in folder: ['c:\\\\Users\\\\DELL\\\\ELO2_GREEN_AI\\\\google-gemma\\\\data\\\\source.txt']\n", + "Computing embeddings for 5 chunks...\n" + ] + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "99a8c3dcfa4f4ae5afce160e814a37a3", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Batches: 0%| | 0/1 [00:00