Commit e3cfe73

Document explicit Linux-to-Windows baseline/deep handoff workflow
1 parent 30ad922 commit e3cfe73

3 files changed

Lines changed: 175 additions & 0 deletions

AGENTS.md

Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
# AGENTS.md

Scope: applies to this repository (`analysis_pipeline`) and all subdirectories.

## Cross-Machine Workflow (Linux Source of Truth)

If the Linux machine has already run classic and deep neural model training, treat the Linux outputs as the source of truth.

When switching to Windows for documentation work:

- Do not rerun Stage 6 model training unless the user explicitly asks.
- Copy result artifacts from Linux first.
- Regenerate plots/reports from the copied `ml_results*.json` files.

This avoids retraining models whose results already exist on Linux.

## Baseline/Deep Handoff Rule (Linux -> Windows)

When asked to rebuild baseline confusion reports on Windows, do **not** assume `git pull` is enough.

Reason: key ML/report artifacts are gitignored (large/reproducible), including:

- `analysis_pipeline/reports/ml_results*.json`
- `analysis_pipeline/reports/confusion_highlights*.json`
- `analysis_pipeline/reports/confusion_pngs/`
- `analysis_pipeline/features/`

## Required Linux Artifacts (Minimum for Plot/Report Regeneration)

Copy these from Linux into the same paths on Windows:

- Classic baseline results:
  - `analysis_pipeline/reports/ml_results_baseline_all_bins_baseline.json`
  - `analysis_pipeline/reports/ml_results_baseline_omit_hardest_baseline.json`
  - `analysis_pipeline/reports/ml_results_baseline_low_high_omit_hardest_baseline.json`
  - `analysis_pipeline/reports/ml_results_baseline_grouped_4class_omit_hardest_baseline.json`
  - `analysis_pipeline/reports/ml_results_baseline_omit_easiest_baseline.json`
- Advanced NN baseline results:
  - `analysis_pipeline/reports/ml_results_baseline_all_bins_baseline_advanced_nn.json`
  - `analysis_pipeline/reports/ml_results_baseline_omit_hardest_baseline_advanced_nn.json`
  - `analysis_pipeline/reports/ml_results_baseline_low_high_omit_hardest_baseline_advanced_nn.json`
  - `analysis_pipeline/reports/ml_results_baseline_grouped_4class_omit_hardest_baseline_advanced_nn.json`
  - `analysis_pipeline/reports/ml_results_baseline_omit_easiest_baseline_advanced_nn.json`
- Baseline split manifest used by Stage 6 configs:
  - `analysis_pipeline/features/split_manifest_tutorial_baseline.json`

If a Stage 6 rerun is explicitly requested, also copy the full baseline feature artifacts from Linux, at minimum:

- `analysis_pipeline/features/split_manifest_tutorial_baseline.json`
- `analysis_pipeline/features/features_fused_tutorial_baseline.tsv`
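
The required set can also be checked by exact filename rather than by count. The following Python sketch is illustrative only and not part of the pipeline: it assumes the five scenario names and the path layout listed in this section, and the helper names `required_artifacts` and `missing_artifacts` are hypothetical.

```python
from pathlib import Path

# Scenario names and directory layout are taken from the artifact list;
# the function names are illustrative and do not exist in the pipeline.
SCENARIOS = [
    "all_bins",
    "omit_hardest",
    "low_high_omit_hardest",
    "grouped_4class_omit_hardest",
    "omit_easiest",
]
REPORTS = Path("analysis_pipeline/reports")
FEATURES = Path("analysis_pipeline/features")


def required_artifacts(include_stage6_rerun: bool = False) -> list[Path]:
    """Return repo-relative paths that must be copied from Linux."""
    paths = []
    for suffix in ("", "_advanced_nn"):  # classic track, then advanced NN track
        for scenario in SCENARIOS:
            paths.append(REPORTS / f"ml_results_baseline_{scenario}_baseline{suffix}.json")
    paths.append(FEATURES / "split_manifest_tutorial_baseline.json")
    if include_stage6_rerun:
        # Only needed when Stage 6 training itself is rerun.
        paths.append(FEATURES / "features_fused_tutorial_baseline.tsv")
    return paths


def missing_artifacts(repo_root: Path, include_stage6_rerun: bool = False) -> list[Path]:
    """List the required paths that do not exist under repo_root."""
    return [
        p for p in required_artifacts(include_stage6_rerun)
        if not (repo_root / p).is_file()
    ]
```

Calling `missing_artifacts(Path("."))` from the repository root returns an empty list when the minimum set is in place; any returned path names a file that still has to be synced.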

## Required Verification Before Rebuild

Run:

```powershell
# Count the classic baseline result files (expected: 5).
Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline.json | Measure-Object
# Count the advanced NN baseline result files (expected: 5).
Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline_advanced_nn.json | Measure-Object
```

Each should report `Count = 5`.

If either count is below 5:

- Stop.
- Ask the user to sync the missing Linux artifacts.
- Do not claim full baseline/deep coverage.
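
The two counts are independent because the classic wildcard requires the filename to end in `_baseline.json`, so it cannot match the `_advanced_nn` files. A minimal Python sketch of that gate follows, assuming `fnmatch`-style matching as a stand-in for the PowerShell wildcards (`fnmatchcase` is case-sensitive, whereas PowerShell matching on Windows is not). The names `track_counts` and `ready_to_rebuild` are hypothetical.

```python
from fnmatch import fnmatchcase

# Same wildcards as the two Get-ChildItem commands.
CLASSIC_PATTERN = "ml_results_baseline_*_baseline.json"
ADVANCED_PATTERN = "ml_results_baseline_*_baseline_advanced_nn.json"
EXPECTED_PER_TRACK = 5


def track_counts(filenames):
    """Return (classic_count, advanced_count) for a list of file names."""
    classic = sum(fnmatchcase(name, CLASSIC_PATTERN) for name in filenames)
    advanced = sum(fnmatchcase(name, ADVANCED_PATTERN) for name in filenames)
    return classic, advanced


def ready_to_rebuild(filenames):
    """Apply the stop rule: both tracks need at least five result files."""
    classic, advanced = track_counts(filenames)
    return classic >= EXPECTED_PER_TRACK and advanced >= EXPECTED_PER_TRACK
```

Passing the file names from `analysis_pipeline/reports` to `ready_to_rebuild` gives `False` as soon as either track is short, which corresponds to the "Stop" branch.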

## Rebuild Command (All Baseline + Advanced JSONs Present)

```powershell
# Rebuild confusion highlights for every baseline results file (classic and advanced NN).
Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline*.json | ForEach-Object {
    # Strip the "ml_results_" prefix, e.g. "baseline_all_bins_baseline".
    $scenario = $_.BaseName -replace '^ml_results_', ''
    python .\analysis_pipeline\stage6_highlight_confusions.py `
        --results-json $_.FullName `
        --out-json ("analysis_pipeline/reports/confusion_highlights_{0}.json" -f $scenario) `
        --out-md ("analysis_pipeline/reports/confusion_highlights_{0}.md" -f $scenario) `
        --metric balanced_accuracy_mean `
        --top-k-per-protocol 1 `
        --include-all `
        --out-png-dir analysis_pipeline/reports/confusion_pngs
}
```
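
For anyone who prefers to drive the rebuild from Python, the loop above could be sketched as follows. The flags are copied from the PowerShell command; `confusion_command` and `rebuild_all` are hypothetical helper names, `sys.executable` stands in for the `python` on the PATH, and the sketch assumes it is run from the repository root with Python 3.9 or newer (for `str.removeprefix`).

```python
import subprocess
import sys
from pathlib import Path

REPORTS = Path("analysis_pipeline/reports")
SCRIPT = "analysis_pipeline/stage6_highlight_confusions.py"


def confusion_command(results_json: Path) -> list[str]:
    """Build the stage6_highlight_confusions.py argv for one results file."""
    # Mirrors `$_.BaseName -replace '^ml_results_', ''` in the PowerShell loop.
    scenario = results_json.stem.removeprefix("ml_results_")
    out_base = (REPORTS / f"confusion_highlights_{scenario}").as_posix()
    return [
        sys.executable, SCRIPT,
        "--results-json", str(results_json),
        "--out-json", f"{out_base}.json",
        "--out-md", f"{out_base}.md",
        "--metric", "balanced_accuracy_mean",
        "--top-k-per-protocol", "1",
        "--include-all",
        "--out-png-dir", (REPORTS / "confusion_pngs").as_posix(),
    ]


def rebuild_all(dry_run: bool = True) -> list[list[str]]:
    """Collect one command per baseline results file; run them unless dry_run."""
    commands = [
        confusion_command(path)
        for path in sorted(REPORTS.glob("ml_results_baseline*.json"))
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failing report
    return commands
```

With the default `dry_run=True`, `rebuild_all()` only returns the commands, which makes it easy to confirm the output names before anything is overwritten.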

Then regenerate documentation outputs (docx/md):

```powershell
python .\scripts\build_coherent_docx.py
```

## Output Naming Convention

Baseline confusion markdown names should follow:

- Classic: `confusion_highlights_baseline_{scenario}_baseline.md`
- Advanced NN: `confusion_highlights_baseline_{scenario}_baseline_advanced_nn.md`

Where `{scenario}` is one of:

- `all_bins`
- `omit_hardest`
- `low_high_omit_hardest`
- `grouped_4class_omit_hardest`
- `omit_easiest`
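
As a rough illustration of the convention, the following Python sketch splits a results or report file stem into its `{scenario}` and track. The regular expression and the name `parse_baseline_stem` are assumptions made for this example, not something the pipeline provides.

```python
import re

# The five scenario names allowed by the naming convention.
SCENARIOS = {
    "all_bins",
    "omit_hardest",
    "low_high_omit_hardest",
    "grouped_4class_omit_hardest",
    "omit_easiest",
}

# Accepts both input stems (ml_results_...) and output stems (confusion_highlights_...).
_STEM_RE = re.compile(
    r"^(?:ml_results|confusion_highlights)_baseline_"
    r"(?P<scenario>.+?)_baseline(?P<advanced>_advanced_nn)?$"
)


def parse_baseline_stem(stem):
    """Return (scenario, track) for a conforming stem, otherwise None."""
    match = _STEM_RE.match(stem)
    if match is None or match.group("scenario") not in SCENARIOS:
        return None
    track = "advanced_nn" if match.group("advanced") else "classic"
    return match.group("scenario"), track
```

A stem that returns `None` either uses an unknown scenario or does not follow the `baseline_{scenario}_baseline` pattern, which is a quick way to spot misnamed outputs.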

README.md

Lines changed: 54 additions & 0 deletions
@@ -81,6 +81,60 @@ python .\analysis_pipeline\run_pipeline.py `
```powershell
  --only stage6 stage6_confusions
```

## Linux -> Windows Handoff (Baseline Reports)

`git pull` alone is not enough to reproduce baseline report assets on Windows. Large ML/report artifacts are intentionally ignored in `.gitignore`:

- `analysis_pipeline/reports/ml_results*.json`
- `analysis_pipeline/reports/confusion_highlights*.json`
- `analysis_pipeline/reports/confusion_pngs/`
- `analysis_pipeline/features/` (entire folder)

If the Linux machine already produced the baseline runs, copy these files from Linux into the same paths on Windows.

Required files to rebuild all baseline confusion reports on Windows:

- `analysis_pipeline/reports/ml_results_baseline_*_baseline.json` (classic baseline track, 5 files)
- `analysis_pipeline/reports/ml_results_baseline_*_baseline_advanced_nn.json` (advanced NN baseline track, 5 files)

Expected scenarios in those filenames:

- `baseline_all_bins`
- `baseline_omit_hardest`
- `baseline_low_high_omit_hardest`
- `baseline_grouped_4class_omit_hardest`
- `baseline_omit_easiest`

If you need to rerun baseline ML on Windows (not only regenerate confusion reports), also copy:

- `analysis_pipeline/features/` (all feature tables/manifests referenced by Stage 6)
  - especially `analysis_pipeline/features/split_manifest_tutorial_baseline.json`
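
How the files travel from Linux is not specified here. As one possible approach, assuming the Linux checkout is reachable as a local directory (a mounted share, an unpacked archive, or similar), a Python sketch like the following copies the matching artifacts into the same relative paths. `copy_artifacts`, `source_root`, and `dest_root` are hypothetical names.

```python
import shutil
from pathlib import Path

# Report wildcards from the "Required files" list.
REPORT_PATTERNS = [
    "analysis_pipeline/reports/ml_results_baseline_*_baseline.json",
    "analysis_pipeline/reports/ml_results_baseline_*_baseline_advanced_nn.json",
]
# Extra pattern for a full Stage 6 rerun: everything under features/.
FEATURE_PATTERN = "analysis_pipeline/features/**/*"


def copy_artifacts(source_root: Path, dest_root: Path, patterns=REPORT_PATTERNS) -> list[Path]:
    """Copy files matching patterns, preserving their repo-relative paths."""
    copied = []
    for pattern in patterns:
        for src in sorted(source_root.glob(pattern)):
            if not src.is_file():
                continue  # recursive patterns also yield directories
            rel = src.relative_to(source_root)
            dst = dest_root / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 keeps timestamps where possible
            copied.append(rel)
    return copied
```

For a rerun of baseline ML, the same function could be called with `patterns=REPORT_PATTERNS + [FEATURE_PATTERN]` so that the feature tables and manifests come along as well.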

Rebuild confusion reports for all available `ml_results*.json` files (includes baseline + advanced if present):

```powershell
# Rebuild confusion highlights for every available results file.
Get-ChildItem .\analysis_pipeline\reports\ml_results*.json | ForEach-Object {
    # Strip the "ml_results_" prefix to get the scenario name used in output files.
    $scenario = $_.BaseName -replace '^ml_results_', ''
    python .\analysis_pipeline\stage6_highlight_confusions.py `
        --results-json $_.FullName `
        --out-json ("analysis_pipeline/reports/confusion_highlights_{0}.json" -f $scenario) `
        --out-md ("analysis_pipeline/reports/confusion_highlights_{0}.md" -f $scenario) `
        --metric balanced_accuracy_mean `
        --top-k-per-protocol 1 `
        --include-all `
        --out-png-dir analysis_pipeline/reports/confusion_pngs
}
```

Quick check that baseline inputs arrived:

```powershell
# Classic baseline track (expected: 5 files).
Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline.json | Measure-Object
# Advanced NN baseline track (expected: 5 files).
Get-ChildItem .\analysis_pipeline\reports\ml_results_baseline_*_baseline_advanced_nn.json | Measure-Object
```

Each command should report `Count = 5`.

## Stage Summary

- Stage 0: canonical trial table from BIDS events.

docs/README.md

Lines changed: 17 additions & 0 deletions
@@ -19,3 +19,20 @@ Supporting tables are in `docs/tables/`, including:
- `docs/tables/ml_numeric_feature_columns_fused.csv`
- `docs/tables/ml_feature_prefix_counts.csv`
- `docs/tables/ml_numeric_feature_prefix_counts.csv`

## Cross-Machine Note (Linux -> Windows)

Baseline ML result JSONs are large and gitignored, so they do not come through `git pull`.

To regenerate full baseline confusion reports on Windows, copy from Linux:

- `analysis_pipeline/reports/ml_results_baseline_*_baseline.json`
- `analysis_pipeline/reports/ml_results_baseline_*_baseline_advanced_nn.json`

Then run `analysis_pipeline/stage6_highlight_confusions.py` on those files to rebuild:

- `analysis_pipeline/reports/confusion_highlights_baseline_*.md`
- `analysis_pipeline/reports/confusion_highlights_baseline_*.json`
- `analysis_pipeline/reports/confusion_pngs/*`

See the handoff checklist and commands in `README.md`.
