Document explicit Linux-to-Windows baseline/deep handoff workflow

LMBooth · LMBooth · commit e3cfe73fc736 · 2026-02-19T21:27:11.000Z
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,104 @@
+# AGENTS.md
+
+Scope: applies to this repository (`analysis_pipeline`) and all subdirectories.
+
+## Cross-Machine Workflow (Linux Source of Truth)
+
+If the Linux machine has already run classic and deep neural model training, treat the Linux outputs as the source of truth.
+
+When switching to Windows for documentation work:
+
+- Do not rerun Stage 6 model training unless the user explicitly asks.
+- Copy result artifacts from Linux first.
+- Regenerate plots/reports from copied `ml_results*.json` files.
+
+This saves time and avoids unnecessary retraining.
+
+## Baseline/Deep Handoff Rule (Linux -> Windows)
+
+When asked to rebuild baseline confusion reports on Windows, do **not** assume `git pull` is enough.
+
+Reason: key ML/report artifacts are gitignored because they are large or reproducible, including:
+
+- `analysis_pipeline/reports/ml_results*.json`
+- `analysis_pipeline/reports/confusion_highlights*.json`
+- `analysis_pipeline/reports/confusion_pngs/`
+- `analysis_pipeline/features/`
+
+## Required Linux Artifacts (Minimum for Plot/Report Regeneration)
+
+Copy these from Linux into the same paths on Windows:
+
+- Classic baseline results:
+  - `analysis_pipeline/reports/ml_results_baseline_all_bins_baseline.json`
+  - `analysis_pipeline/reports/ml_results_baseline_omit_hardest_baseline.json`
+  - `analysis_pipeline/reports/ml_results_baseline_low_high_omit_hardest_baseline.json`
+  - `analysis_pipeline/reports/ml_results_baseline_grouped_4class_omit_hardest_baseline.json`
+  - `analysis_pipeline/reports/ml_results_baseline_omit_easiest_baseline.json`
+- Advanced NN baseline results:
+  - `analysis_pipeline/reports/ml_results_baseline_all_bins_baseline_advanced_nn.json`
+  - `analysis_pipeline/reports/ml_results_baseline_omit_hardest_baseline_advanced_nn.json`
+  - `analysis_pipeline/reports/ml_results_baseline_low_high_omit_hardest_baseline_advanced_nn.json`
+  - `analysis_pipeline/reports/ml_results_baseline_grouped_4class_omit_hardest_baseline_advanced_nn.json`
+  - `analysis_pipeline/reports/ml_results_baseline_omit_easiest_baseline_advanced_nn.json`
+- Baseline split manifest used by Stage 6 configs:
+  - `analysis_pipeline/features/split_manifest_tutorial_baseline.json`
+
+If a Stage 6 rerun is explicitly requested, also copy the full baseline feature artifacts from Linux, at minimum:
+
+- `analysis_pipeline/features/split_manifest_tutorial_baseline.json`
+- `analysis_pipeline/features/features_fused_tutorial_baseline.tsv`
+
+## Required Verification Before Rebuild
+
+Run:
+
+```powershell
+Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline.json | Measure-Object
+Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline_advanced_nn.json | Measure-Object
+```
+
+Each command should report a `Count` of 5.
+
+If either count is below 5:
+
+- Stop.
+- Ask the user to sync the missing Linux artifacts.
+- Do not claim full baseline/deep coverage.
+
+## Rebuild Command (All Baseline + Advanced JSONs Present)
+
+```powershell
+Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline*.json | ForEach-Object {
+  $scenario = $_.BaseName -replace '^ml_results_', ''
+  python .\analysis_pipeline\stage6_highlight_confusions.py `
+    --results-json $_.FullName `
+    --out-json ("analysis_pipeline/reports/confusion_highlights_{0}.json" -f $scenario) `
+    --out-md ("analysis_pipeline/reports/confusion_highlights_{0}.md" -f $scenario) `
+    --metric balanced_accuracy_mean `
+    --top-k-per-protocol 1 `
+    --include-all `
+    --out-png-dir analysis_pipeline/reports/confusion_pngs
+}
+```
+
+Then regenerate documentation outputs (docx/md):
+
+```powershell
+python .\scripts\build_coherent_docx.py
+```
+
+## Output Naming Convention
+
+Baseline confusion Markdown filenames should follow:
+
+- Classic: `confusion_highlights_baseline_{scenario}_baseline.md`
+- Advanced NN: `confusion_highlights_baseline_{scenario}_baseline_advanced_nn.md`
+
+Where `{scenario}` is one of:
+
+- `all_bins`
+- `omit_hardest`
+- `low_high_omit_hardest`
+- `grouped_4class_omit_hardest`
+- `omit_easiest`
diff --git a/README.md b/README.md
@@ -81,6 +81,60 @@ python .\analysis_pipeline\run_pipeline.py `
   --only stage6 stage6_confusions
 ```
 
+## Linux -> Windows Handoff (Baseline Reports)
+
+`git pull` alone is not enough to reproduce baseline report assets on Windows. Large ML/report artifacts are deliberately excluded via `.gitignore`:
+
+- `analysis_pipeline/reports/ml_results*.json`
+- `analysis_pipeline/reports/confusion_highlights*.json`
+- `analysis_pipeline/reports/confusion_pngs/`
+- `analysis_pipeline/features/` (entire folder)
+
+If the Linux machine already produced the baseline runs, copy these files from Linux into the same paths on Windows.
+
+Required files to rebuild all baseline confusion reports on Windows:
+
+- `analysis_pipeline/reports/ml_results_baseline_*_baseline.json` (classic baseline track, 5 files)
+- `analysis_pipeline/reports/ml_results_baseline_*_baseline_advanced_nn.json` (advanced NN baseline track, 5 files)
+
+Expected scenario names in those filenames (the part that follows `ml_results_`):
+
+- `baseline_all_bins`
+- `baseline_omit_hardest`
+- `baseline_low_high_omit_hardest`
+- `baseline_grouped_4class_omit_hardest`
+- `baseline_omit_easiest`
+
+If you need to rerun baseline ML on Windows (not only regenerate confusion reports), also copy:
+
+- `analysis_pipeline/features/` (all feature tables/manifests referenced by Stage 6)
+- especially `analysis_pipeline/features/split_manifest_tutorial_baseline.json`
+
+Rebuild confusion reports for all available `ml_results*.json` files (includes baseline + advanced if present):
+
+```powershell
+Get-ChildItem .\analysis_pipeline\reports\ml_results*.json | ForEach-Object {
+  $scenario = $_.BaseName -replace '^ml_results_', ''
+  python .\analysis_pipeline\stage6_highlight_confusions.py `
+    --results-json $_.FullName `
+    --out-json ("analysis_pipeline/reports/confusion_highlights_{0}.json" -f $scenario) `
+    --out-md ("analysis_pipeline/reports/confusion_highlights_{0}.md" -f $scenario) `
+    --metric balanced_accuracy_mean `
+    --top-k-per-protocol 1 `
+    --include-all `
+    --out-png-dir analysis_pipeline/reports/confusion_pngs
+}
+```
+
+Quick check that baseline inputs arrived:
+
+```powershell
+Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline.json | Measure-Object
+Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline_advanced_nn.json | Measure-Object
+```
+
+Each command should report a `Count` of 5.
+
 ## Stage Summary
 
 - Stage 0: canonical trial table from BIDS events.
diff --git a/docs/README.md b/docs/README.md
@@ -19,3 +19,20 @@ Supporting tables are in `docs/tables/`, including:
 - `docs/tables/ml_numeric_feature_columns_fused.csv`
 - `docs/tables/ml_feature_prefix_counts.csv`
 - `docs/tables/ml_numeric_feature_prefix_counts.csv`
+
+## Cross-Machine Note (Linux -> Windows)
+
+Baseline ML result JSONs are large and gitignored, so they do not come through `git pull`.
+
+To regenerate full baseline confusion reports on Windows, copy from Linux:
+
+- `analysis_pipeline/reports/ml_results_baseline_*_baseline.json`
+- `analysis_pipeline/reports/ml_results_baseline_*_baseline_advanced_nn.json`
+
+Then run `analysis_pipeline/stage6_highlight_confusions.py` on those files to rebuild:
+
+- `analysis_pipeline/reports/confusion_highlights_baseline_*.md`
+- `analysis_pipeline/reports/confusion_highlights_baseline_*.json`
+- `analysis_pipeline/reports/confusion_pngs/*`
+
+See the handoff checklist and commands in `README.md`.
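
The `Measure-Object` check in `AGENTS.md` confirms how many result files arrived, but not which ones are missing. A minimal cross-platform sketch in Python that names the missing files, assuming the ten filenames listed under "Required Linux Artifacts"; `expected_result_names` and `missing_baseline_results` are hypothetical helper names and are not part of the repository:

```python
from pathlib import Path

# Scenario names and the filename pattern are copied from AGENTS.md in this commit.
SCENARIOS = [
    "all_bins",
    "omit_hardest",
    "low_high_omit_hardest",
    "grouped_4class_omit_hardest",
    "omit_easiest",
]
# Classic baseline track first, then the advanced NN baseline track.
SUFFIXES = ["baseline", "baseline_advanced_nn"]


def expected_result_names():
    """Return the 10 expected result filenames (5 classic, then 5 advanced NN)."""
    return [
        f"ml_results_baseline_{scenario}_{suffix}.json"
        for suffix in SUFFIXES
        for scenario in SCENARIOS
    ]


def missing_baseline_results(reports_dir):
    """Return the expected filenames that are not present in reports_dir."""
    reports = Path(reports_dir)
    return [name for name in expected_result_names() if not (reports / name).is_file()]


if __name__ == "__main__":
    # Path is relative to the repository root, as in the PowerShell commands.
    missing = missing_baseline_results("analysis_pipeline/reports")
    if missing:
        print("Stop: sync these files from Linux before rebuilding:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All 10 baseline result files are present.")
```

Run from the repository root, this prints either the list of files still to copy from Linux or a confirmation that both tracks are complete. It checks only that the files exist, not that their contents are valid.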