IDKHowToCodeFR
diff --git a/README.md b/README.md
Lines changed: 47 additions & 43 deletions
diff --git a/backend/database.py b/backend/database.py
Lines changed: 38 additions & 6 deletions
diff --git a/backend/main.py b/backend/main.py
Lines changed: 15 additions & 0 deletions
diff --git a/backend/patient_history.db b/backend/patient_history.db
0 Bytes
diff --git a/backend/requirements.txt b/backend/requirements.txt
Lines changed: 1 addition & 0 deletions
diff --git a/frontend/Home.py b/frontend/Home.py
Lines changed: 36 additions & 27 deletions
@@ -1,17 +1,21 @@
-# TinyML Heart Health Monitoring Dashboard
+# Heart Health Monitoring Dashboard
 
-![Python Version](https://img.shields.io/badge/python-3.9%2B-blue)
-![FastAPI](https://img.shields.io/badge/FastAPI-0.100%2B-009688?logo=fastapi)
-![Streamlit](https://img.shields.io/badge/Streamlit-1.25%2B-FF4B4B?logo=streamlit)
-![License](https://img.shields.io/badge/license-MIT-green)
+A user-friendly web application for monitoring heart health metrics and predicting patient risks using Machine Learning. This project features a responsive Streamlit frontend and a robust FastAPI backend.
 
-A scalable, end-to-end Machine Learning ecosystem designed to provide heart health analytics, ensemble modeling predictions, and Edge AI deployment capabilities for resource-constrained microcontrollers.
+## 🌟 Key Engineering Achievements (Project Highlights)
+
+From a systems engineering and machine learning perspective, this platform demonstrates several advanced concepts:
+
+- **TinyML & Edge Computing Deployment**: Designed with hardware constraints in mind, this project transpiles Python ML models into dependency-free **C-code headers**. **INT8 Quantization** converts 64-bit floating-point weights to 8-bit integers, so each weight needs one eighth of its original storage (an 87.5% reduction in weight storage; a ~75% figure would correspond to 32-bit floats). This allows predictive models to be deployed directly onto severely resource-constrained microcontrollers (like the ESP32 or ARM Cortex-M), enabling offline, low-latency, and privacy-preserving inference at the edge.
+- **Dynamic Hardware Profiling**: The deployment engine actively calculates predictive heuristics, such as expected inference latency (in microseconds) and flash memory payload size, mapped against specific embedded hardware profiles (e.g., Arduino Nano 33 BLE, Raspberry Pi Pico) before the code is even exported.
+- **Interpretable AI (SHAP)**: To solve the "black box" problem common in healthcare tech, the backend generates real-time SHAP (SHapley Additive exPlanations) values. This provides explicit, feature-level transparency into *why* the AI made a specific clinical decision, building essential trust with end-users.
+- **Robust Model Ensembling**: Rather than relying on a single algorithm, the system features a Soft-Voting Ensemble architecture that aggregates predictions across five distinct model architectures (KNN, SVM, Logistic Regression, Random Forest, and a Neural Network) to maximize predictive accuracy and minimize bias.
 
 ---
 
 ## 🏗️ System Architecture
 
-The system follows a separated frontend/backend architecture, enabling scaling flexibility and clean separation of concerns.
+We designed the system with strict decoupling between the client UI and the heavy-lifting ML pipeline. Doing this lets us scale instances efficiently while enabling simple API integration for other health services later on.
 
 ```mermaid
 flowchart TD
@@ -32,87 +36,87 @@ flowchart TD
 
     %% ML Subgraph
     subgraph ML [🧠 Machine Learning & Inference Pipeline]
-        ENS{Ensemble Engine<br/>Soft-Voting Aggregator}:::ml_node
+        ENS{Ensemble Engine<br/>Soft-Voting}:::ml_node
         Models[Classifiers:<br/>RF, SVM, LogReg, NN, KNN]:::ml_node
-        SHAP[🔍 SHAP Explainer<br/>Feature Impact Analysis]:::ml_node
+        SHAP[🔍 SHAP Explainer]:::ml_node
     end
 
     %% Edge Quantization Subgraph
-    subgraph Quantization [⚡ Hardware & Edge AI Export]
+    subgraph Quantization [⚡ Hardware Export]
         TRANS[C-Code Transpiler<br/>INT8 Quantization]:::edge_tech
-        HEADER((tinyml_model.h<br/>Optimized Header File)):::edge_tech
-        MCU>📟 ESP32 / ARM Cortex-M<br/>Microcontroller Node]:::edge_tech
+        HEADER((tinyml_model.h)):::edge_tech
+        MCU>ESP32 / Cortex-M Node]:::edge_tech
     end
 
-    %% Routing Connections
     API -->|Predict Request| ENS
     ENS --> Models
-    
     API -->|Explain Request| SHAP
-    SHAP -.->|Analyzes Decision Trees| Models
-    
+    SHAP -.-> Models
     API -->|Export Request| TRANS
-    TRANS -->|Scales FP32 to INT8| HEADER
+    TRANS --> HEADER
     HEADER -->|Flash Firmware| MCU
 ```
 
 ---
 
-## ✨ Key Technical Features
-
-### 1. TinyML Edge Deployment & INT8 Quantization 📉
-The core value proposition is executing model inference offline on devices like the ESP32 and ARM Cortex-M. The backend transpiler natively parses trained models (Logistic Regression, Support Vector Machines, Neural Networks, K-Nearest Neighbors) into portable `stdint.h` C-code binaries.
-
-Crucially, an automated **INT8 Quantization pipeline** is available. This procedure calculates appropriate linear scaling factors globally across Neural Network layers or SVM hyperplanes to convert 64-bit Floating Point (`double`) weights into constrained 8-bit Integer (`int8_t`) representations. This technique drastically reduces firmware flash size requirements (by approximately 75%) and limits execution to low-power integer arithmetic operations.
-
-### 2. Clinical Model Interpretability (Explainable AI) 🧠
-Predictive medical systems cannot function as black boxes. By employing **SHAP (SHapley Additive exPlanations)**, the FastAPI backend interprets Random Forest decision trees, resolving the algebraic impact weights of individual clinical variables (e.g. SpO2 vs Heart Rate) over the final prediction. These explanations are visually charted on the frontend to explicitly map risk-increasing or protective physiological factors.
+## 🚀 Getting Started
 
-### 3. Ensemble Prediction Algorithm 🤝
-The framework combines the predictive strengths of various fundamental Machine Learning techniques. Instead of relying on a singular hypothesis, an ensemble pipeline coordinates inferences from KNN, SVM, Logistic Regression, Random Forest, and a Multi-layer Perceptron. A soft-voting probability aggregator dictates the final classification, balancing variance and bias.
+Follow these instructions to get a copy of the project up and running on your local machine.
 
-### 4. MLOps CI/CD and Drift Adjustment 🔄
-System performance over time heavily correlates with dataset drift. The integrated MLOps framework facilitates straightforward uploading of new Patient CSV cohorts payload schemas. The system will asynchronously restart the Scikit-Learn training pipelines across all active supervised learning models, store updated `.pkl` binaries, and seamlessly refresh caching layers to serve the updated ecosystem in realtime.
+### Prerequisites
+- Python 3.9 or higher
 
-## Installation & Setup
+### Installation
 
 1. **Clone the repository:**
    ```bash
    git clone https://github.com/IDKHowToCodeFR/TinyML-Heart-Health-Monitoring-Dashboard.git
    cd TinyML-Heart-Health-Monitoring-Dashboard
    ```
 
-2. **Initialize Python environments and dependencies:**
-   Ensure Python 3.9+ is installed.
+2. **Install Dependencies:**
+   Ensure you have all required libraries installed.
    ```bash
-   python -m pip install -r requirements.txt
+   pip install -r requirements.txt
    pip install -r backend/requirements.txt
    pip install -r frontend/requirements.txt
    ```
 
-3. **Train initial `.pkl` Models:**
-   The dashboard requires foundational model states to establish the FastAPI ensemble arrays. 
+3. **Train Initial Models:**
+   Before starting the system, generate the initial machine learning models.
    ```bash
    cd backend
    python models.py
    cd ..
    ```
 
-## Running the Application Locally
+## 💻 Usage
 
-1. **Start the FastAPI Backend Service:**
+### Quick Start (Windows)
+If you are on Windows, simply run the included batch file to start both the frontend and backend automatically:
+```bash
+run.bat
+```
+
+### Manual Start
+1. **Start the Backend:**
    ```bash
    cd backend
    uvicorn main:app --host 0.0.0.0 --port 8000
    ```
-
-2. **Launch the Streamlit Frontend Client:** (In a separate terminal)
+2. **Start the Frontend:**
+   Open a new terminal window and run:
    ```bash
    cd frontend
    streamlit run Home.py
    ```
 
-The Streamlit UI will bind to `localhost:8501`. Navigate through the sidebar implementations to access prediction simulation, SHAP interpretation, or Edge specific transpilation outputs.
+### Docker
+If you prefer Docker, you can spin up the entire project with one command:
+```bash
+docker-compose up --build
+```
+Once running, open your browser and go to `http://localhost:8501`.
 
-## License
-MIT License. See `LICENSE` for details.
+## 📜 License
+This project is licensed under the MIT License.
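
The INT8 quantization step that the README describes can be illustrated with a minimal sketch of symmetric per-tensor quantization. This shows the general technique only and is not the project's transpiler; `quantize_int8`, `dequantize`, and `size_reduction` are hypothetical names. It also makes the storage arithmetic explicit: 64-bit to 8-bit weights save 87.5% of weight storage, while 32-bit to 8-bit saves 75%.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric linear quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 if every weight is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


def size_reduction(bits_before: int, bits_after: int = 8) -> float:
    """Fraction of weight storage saved when narrowing each weight."""
    return 1.0 - bits_after / bits_before


if __name__ == "__main__":
    q, scale = quantize_int8([-1.0, 0.0, 0.25, 1.0])
    print(q, scale)
    print(size_reduction(64), size_reduction(32))
```

The round-trip error of each weight is bounded by half the scale, which is why a single outlier weight (a large `max_abs`) degrades precision for the whole tensor.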
@@ -1,10 +1,44 @@
 import sqlite3
 import os
 from datetime import datetime
+from huggingface_hub import HfApi, hf_hub_download
 
-DB_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'patient_history.db')
+REPO_ID = "IDKHowToCodeFr/tinyml-logs"
+DB_NAME = 'patient_history.db'
+DB_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), DB_NAME)
+HF_TOKEN = os.getenv("HF_TOKEN")
+
+api = HfApi()
+
+def sync_from_hub():
+    if not HF_TOKEN:
+        print("No HF_TOKEN found. Skipping sync.")
+        return
+    try:
+        print(f"Downloading {DB_NAME} from Hub...")
+        path = hf_hub_download(repo_id=REPO_ID, filename=DB_NAME, repo_type="dataset", token=HF_TOKEN)
+        import shutil
+        shutil.copy(path, DB_PATH)
+        print("Sync from Hub complete.")
+    except Exception as e:
+        print(f"Sync from Hub failed (maybe first run?): {e}")
+
+def sync_to_hub():
+    if not HF_TOKEN:
+        return
+    try:
+        api.upload_file(
+            path_or_fileobj=DB_PATH,
+            path_in_repo=DB_NAME,
+            repo_id=REPO_ID,
+            repo_type="dataset",
+            token=HF_TOKEN
+        )
+    except Exception as e:
+        print(f"Sync to Hub failed: {e}")
 
 def init_db():
+    sync_from_hub()
     conn = sqlite3.connect(DB_PATH)
     cursor = conn.cursor()
     cursor.execute('''
@@ -43,23 +77,21 @@ def log_prediction(data, prediction_label, confidence):
     ))
     conn.commit()
     conn.close()
+    sync_to_hub()
 
 def get_history():
+    if not os.path.exists(DB_PATH):
+        return []
     conn = sqlite3.connect(DB_PATH)
     cursor = conn.cursor()
     cursor.execute('SELECT * FROM predictions ORDER BY timestamp DESC LIMIT 100')
     rows = cursor.fetchall()
-    
-    # Get column names
     col_names = [description[0] for description in cursor.description]
-    
     conn.close()
 
-    # Format as list of dicts
     history = []
     for row in rows:
         history.append(dict(zip(col_names, row)))
-        
     return history
 
 # Initialize db when this module is loaded
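
In the change above, `log_prediction` calls `sync_to_hub()` after every insert, which uploads the whole SQLite file once per prediction. One possible mitigation, sketched here under assumptions, is to rate-limit the upload. `ThrottledSync` and its parameters are hypothetical and not part of the PR; the `upload` callable stands in for `sync_to_hub`.

```python
import time
from typing import Callable, Optional


class ThrottledSync:
    """Run `upload` at most once per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, upload: Callable[[], None], min_interval: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upload = upload
        self._min_interval = min_interval
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._last_run: Optional[float] = None

    def maybe_sync(self) -> bool:
        """Call `upload` if enough time has passed; return True when it ran."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._min_interval:
            return False
        self._upload()
        self._last_run = now
        return True
```

With this shape, `log_prediction` would call `maybe_sync()` instead of uploading directly. The trade-off is that writes made inside the quiet interval reach the Hub only on the next successful call, so a final flush at shutdown would still be needed.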
 
@@ -99,6 +99,21 @@ async def predict(data: PatientData):
 def history():
     return get_history()
 
+@app.get("/dataset")
+async def get_dataset():
+    data_path = '/app/data/patient_dataset.csv' if os.path.exists('/app/data') else '../data/patient_dataset.csv' if os.path.exists('../data') else 'data/patient_dataset.csv'
+    if os.path.exists(data_path):
+        df = pd.read_csv(data_path)
+        # NOTE: this returns every row; summarize or paginate here if payloads get large
+        return df.to_dict(orient="records")
+    return {"error": "Dataset not found"}
+
+@app.get("/sync")
+def force_sync():
+    from database import sync_from_hub
+    sync_from_hub()
+    return {"status": "Sync attempted"}
+
 @app.post("/explain")
 async def explain(data: PatientData):
     eng = get_ensemble()
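
The chained conditional that picks `data_path` in the `/dataset` handler above is hard to scan. A small helper that returns the first existing candidate is one way to express the same intent. This is a sketch, `first_existing` is a hypothetical name, and it differs slightly from the PR: it checks that the file itself exists rather than only its parent directory.

```python
from pathlib import Path
from typing import Iterable, Optional


def first_existing(candidates: Iterable[str]) -> Optional[Path]:
    """Return the first candidate path that exists, or None if none do."""
    for candidate in candidates:
        path = Path(candidate)
        if path.exists():
            return path
    return None
```

Using the three locations from the diff, the handler could call `first_existing(["/app/data/patient_dataset.csv", "../data/patient_dataset.csv", "data/patient_dataset.csv"])` and return the "Dataset not found" error when the result is `None`.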
 
@@ -8,3 +8,4 @@ shap
 m2cgen
 matplotlib
 python-multipart
+huggingface_hub
@@ -79,11 +79,15 @@
 
 # ──────── System Status ────────
 try:
-    resp = requests.get(f"{API_URL}/health", timeout=2)
-    status = resp.json().get("status", "Unknown")
-    is_online = "Healthy" in status
-except Exception:
-    status = "Cloud Mode (Monolith Fallback)"
+    resp = requests.get(f"{API_URL}/health", timeout=5)
+    if resp.status_code == 200:
+        status = resp.json().get("status", "Unknown")
+        is_online = "Healthy" in status
+    else:
+        status = f"Backend Error ({resp.status_code})"
+        is_online = False
+except Exception as e:
+    status = f"Offline / Monolith Fallback ({str(e)[:50]})"
     is_online = False
 
 badge_class = "status-online" if is_online else "status-fallback"
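
The status logic in this hunk has three outcomes: a healthy or unknown 200 response, a non-200 backend error, and a request failure. Pulling that decision into a pure function makes it testable without a running backend. The sketch below mirrors the strings used in the diff; `classify_health` is a hypothetical name and is not part of the PR.

```python
from typing import Any, Mapping, Optional, Tuple


def classify_health(status_code: Optional[int],
                    payload: Optional[Mapping[str, Any]] = None,
                    error: Optional[BaseException] = None) -> Tuple[str, bool]:
    """Map a health-check result to (status text, is_online)."""
    if error is not None:
        # Request failed entirely (timeout, refused connection, bad JSON, ...).
        return f"Offline / Monolith Fallback ({str(error)[:50]})", False
    if status_code != 200:
        return f"Backend Error ({status_code})", False
    status = str((payload or {}).get("status", "Unknown"))
    return status, "Healthy" in status
```

The page code would then call it once inside the `try` block with `resp.status_code` and `resp.json()`, and once in the `except` branch with `error=e`.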
@@ -176,7 +180,7 @@ def live_metrics():
     ad_c = "#00cc96" if ad_good else "#ef553b"
     ad_d = "delta-good" if ad_good else "delta-bad"
     ad_i = "▼" if ad_good else "▲"
-    ad_txt = f"{abs(alert_delta)} fewer" if ad_good else f"+{alert_delta} more"
+    ad_txt = f"{abs(alert_delta)} fewer" if ad_good else f"{alert_delta} more"
 
     hr_c = "#00cc96" if hr_ok else "#ef553b"
     hr_d = "delta-good" if hr_ok else "delta-bad"
@@ -212,45 +216,50 @@ def live_metrics():
     </div>
     """, unsafe_allow_html=True)
 
-    # ──── Candlestick chart: tracks alert level changes ────
-    st.markdown("##### 📡 Live Alert Activity (Candlestick + Moving Average)")
+    # ──── Smoothed Activity Area Chart: tracks average heart-rate & alerts ────
+    st.markdown("##### 📡 Live Patient Emulation Trace")
 
     cd = st.session_state.chart_candles
     ts = [x["time"] for x in cd]
-    op = [x["open"] for x in cd]
-    hi = [x["high"] for x in cd]
-    lo = [x["low"] for x in cd]
+    # Represent the primary metric as the close/current state
     cl = [x["close"] for x in cd]
 
     fig = go.Figure()
-    fig.add_trace(go.Candlestick(
-        x=ts, open=op, high=hi, low=lo, close=cl,
-        increasing=dict(line=dict(color="#ef553b"), fillcolor="rgba(239,85,59,0.4)"),
-        decreasing=dict(line=dict(color="#00cc96"), fillcolor="rgba(0,204,150,0.4)"),
-        name="Alerts"
+
+    # Main Area Waveform
+    fig.add_trace(go.Scatter(
+        x=ts, y=cl,
+        mode='lines',
+        line=dict(color='#00f2fe', width=3, shape='spline', smoothing=1.3),
+        fill='tozeroy',
+        fillcolor='rgba(0, 242, 254, 0.15)',
+        name="Activity Index"
     ))
 
-    w = min(7, len(cl))
+    # Add a thin secondary baseline trend
+    w = max(1, min(10, len(cl)))  # keep window >= 1 so rolling() works on an empty series
     ma = pd.Series(cl).rolling(window=w, min_periods=1).mean().tolist()
     fig.add_trace(go.Scatter(
         x=ts, y=ma, mode='lines',
-        line=dict(color='rgba(79,172,254,0.9)', width=2.5, shape='spline'),
-        name="MA-7"
+        line=dict(color='rgba(255, 255, 255, 0.4)', width=1.5, dash='dot'),
+        name="Moving Average"
     ))
 
-    vol = [abs(cv - ov) * 2 + 1 for ov, cv in zip(op, cl)]
-    vc = ["rgba(239,85,59,0.2)" if cv >= ov else "rgba(0,204,150,0.2)" for ov, cv in zip(op, cl)]
-    fig.add_trace(go.Bar(x=ts, y=vol, marker=dict(color=vc), yaxis="y2", showlegend=False, hoverinfo='skip'))
-
     fig.update_layout(
         template="plotly_dark", height=320,
         margin=dict(l=0, r=50, t=10, b=0),
         paper_bgcolor='rgba(0,0,0,0)', plot_bgcolor='rgba(0,0,0,0)',
-        xaxis=dict(gridcolor='rgba(255,255,255,0.03)', showgrid=False, rangeslider=dict(visible=False)),
-        yaxis=dict(title="Alert Level", gridcolor='rgba(255,255,255,0.05)', title_font=dict(size=11), side="right", range=[0, ALERT_MAX + 10]),
-        yaxis2=dict(overlaying="y", side="left", showgrid=False, showticklabels=False, range=[0, max(vol)*5] if vol else [0, 20]),
+        xaxis=dict(gridcolor='rgba(255,255,255,0.03)', showgrid=True, rangeslider=dict(visible=False)),
+        yaxis=dict(
+            title="Relative Activity",
+            gridcolor='rgba(255,255,255,0.05)',
+            title_font=dict(size=11),
+            side="right",
+            range=[0, ALERT_MAX + 15]
+        ),
         legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1, font=dict(size=10)),
-        showlegend=True
+        showlegend=True,
+        hovermode="x unified"
     )
     st.plotly_chart(fig, use_container_width=True)
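
The "Moving Average" trace relies on pandas `rolling(window=w, min_periods=1).mean()`, which averages a trailing window and falls back to a shorter window for the first few points. A pure-Python sketch of that computation is shown below for illustration; `rolling_mean` is a hypothetical helper and is not used by the PR.

```python
from typing import List, Sequence


def rolling_mean(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early points use however many values exist."""
    if window < 1:
        raise ValueError("window must be at least 1")
    result: List[float] = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


if __name__ == "__main__":
    print(rolling_mean([1, 2, 3, 4], window=2))
```

Because the first points average fewer samples, the dotted baseline tracks the raw signal closely at the left edge of the chart and only smooths out once `window` points are available.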