Commit c3e1d06

Stage 7: peer-review-grade significance and bootstrap statistics
Adds analysis_pipeline/stage7_significance.py and wires it into the run_pipeline orchestrator and both YAML profiles.

Stage 7 derives every inferential statistic referenced in the paper's Statistical Significance Procedures section directly from the ml_results_<scenario>_classic_nn.json files emitted by Stage 6:

* Paired Wilcoxon signed-rank (3 s overlap vs 6 s baseline) over the 60 matched scenario x dataset x protocol best-model cells.
* 2000-sample percentile-method nonparametric bootstrap on outer-fold balanced accuracies for per-cell 95 percent CIs.
* One-sided Wilcoxon signed-rank vs the per-scenario chance level (1/n_classes) for each cell.
* Vectorised 10000-iteration label-shuffle permutation test on held-out predicted-label vectors reconstructed from per-fold confusion matrices, with Bonferroni correction over the full 120-cell grid.
* Optional paired Wilcoxon on baseline-vs-task heart rate from the Stage 1 QC summary (used in the manuscript's QC sanity check).

Outputs a machine-readable significance_summary.json plus a Markdown report.

Reproduces the manuscript exactly: W=1636, p=5.55e-8, paired n=60, mean balanced accuracy 0.324 -> 0.380, 46/14/0 cells improved/worsened/unchanged, 27 winner-identity changes, 111/120 cells with one-sided Wilcoxon vs chance p<0.05; with N=10000 perms the 10 scenario winners survive Bonferroni (threshold 4.17e-4).
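The paired Wilcoxon figures in the commit message can be sanity-checked by hand. The sketch below is not the code in stage7_significance.py (that file is not shown on this page); it is a standard-library illustration with hypothetical function names. It assumes the reported W is the sum of positive ranks and that the p-value comes from a one-sided normal approximation without continuity correction. With 60 non-zero pairs the two rank sums total 60 * 61 / 2 = 1830, so W = 1636 would leave 194 for the negative ranks.

```python
import math


def signed_rank_sums(baseline, overlap):
    """Return (w_plus, w_minus, n, tie_term) for paired samples.

    Zero differences are dropped; tied absolute differences share
    their average rank. tie_term is sum(t**3 - t) over tie groups.
    """
    diffs = [b - a for a, b in zip(baseline, overlap) if b != a]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, n, tie_term


def normal_approx_p_greater(w_plus, n, tie_term=0.0):
    """One-sided p-value (overlap > baseline) via the normal approximation."""
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Plugging in the reported statistic under the stated assumptions.
    print(normal_approx_p_greater(1636, 60))
```

Under those assumptions the printed value should land close to the reported p=5.55e-8; a two-sided test or an exact null distribution would give a different number.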
1 parent b41a126 commit c3e1d06

4 files changed

Lines changed: 1109 additions & 0 deletions

analysis_pipeline/config/pipeline_unified_classic_nn_baseline_overlap3s_50pct_preproc.yaml

Lines changed: 17 additions & 0 deletions
@@ -21,6 +21,7 @@ stages:
   stage6: true
   stage6_confusions: true
   stage6_publication_report: true
+  stage7_significance: true
 
 stage_args:
   stage0:
@@ -163,3 +164,19 @@ stage6_publication_report:
     run_manifest_json: "{reports_dir}/run_manifest.json"
     out_md: "{reports_dir}/publication_full_report.md"
     out_json: "{reports_dir}/publication_full_report.json"
+
+stage7_significance:
+  # Identical Stage 7 stanza to the 6 s baseline profile but pointed at
+  # this profile's own reports as the "overlap" anchor. The overlap
+  # reports are paired against the 6 s baseline output written by the
+  # other profile, so this profile reproduces the same significance
+  # numbers regardless of which configuration triggered the run.
+  args:
+    baseline_reports: "analysis_pipeline/runs/pipeline_unified_classic_nn_baseline_preproc/reports"
+    overlap_reports: "{reports_dir}"
+    baseline_qc_summary: "analysis_pipeline/runs/pipeline_unified_classic_nn_baseline_preproc/reports/qc_dataset_summary.json"
+    out_json: "{reports_dir}/significance_summary.json"
+    out_md: "{reports_dir}/significance_summary.md"
+    n_bootstrap: 2000
+    n_permutations: 10000
+    random_seed: 42
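
The n_permutations and random_seed settings above feed the label-shuffle permutation test described in the commit message. The following is a minimal, non-vectorised sketch of that idea using only the standard library; the function names are hypothetical, and shuffling the true labels (rather than the predictions) is an assumption, since stage7_significance.py itself is not shown here. With Bonferroni correction over 120 cells the threshold is 0.05 / 120 ≈ 4.17e-4, and with 10000 permutations the smallest attainable p-value is 1 / 10001 ≈ 1.0e-4, so a cell can only pass if essentially no shuffle matches the observed score.

```python
import random


def labels_from_confusion(cm):
    """Expand a confusion matrix (rows = true, cols = predicted) into label lists."""
    y_true, y_pred = [], []
    for t, row in enumerate(cm):
        for p, count in enumerate(row):
            y_true.extend([t] * count)
            y_pred.extend([p] * count)
    return y_true, y_pred


def balanced_accuracy(y_true, y_pred, n_classes):
    """Mean per-class recall over the classes that actually occur."""
    hits = [0] * n_classes
    totals = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    recalls = [h / n for h, n in zip(hits, totals) if n > 0]
    return sum(recalls) / len(recalls)


def permutation_p_value(cm, n_permutations=10000, seed=42):
    """Return (observed balanced accuracy, permutation p-value) for one cell."""
    rng = random.Random(seed)
    y_true, y_pred = labels_from_confusion(cm)
    n_classes = len(cm)
    observed = balanced_accuracy(y_true, y_pred, n_classes)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between labels and predictions
        if balanced_accuracy(shuffled, y_pred, n_classes) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


if __name__ == "__main__":
    bonferroni_threshold = 0.05 / 120
    observed, p = permutation_p_value([[20, 0], [0, 20]])
    print(observed, p, p < bonferroni_threshold)
```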

analysis_pipeline/config/pipeline_unified_classic_nn_baseline_preproc.yaml

Lines changed: 20 additions & 0 deletions
@@ -21,6 +21,7 @@ stages:
   stage6: true
   stage6_confusions: true
   stage6_publication_report: true
+  stage7_significance: true
 
 stage_args:
   stage0:
@@ -159,3 +160,22 @@ stage6_publication_report:
     run_manifest_json: "{reports_dir}/run_manifest.json"
     out_md: "{reports_dir}/publication_full_report.md"
     out_json: "{reports_dir}/publication_full_report.json"
+
+stage7_significance:
+  # Stage 7 derives every inferential statistic referenced in the
+  # manuscript's Statistical Significance Procedures section:
+  # paired Wilcoxon (3 s overlap vs 6 s baseline), 2000-sample
+  # bootstrap CIs over outer-fold balanced accuracies, one-sided
+  # Wilcoxon vs chance, and a 10 000-iteration label-shuffle
+  # permutation test with Bonferroni correction over the full
+  # cell grid. Runs after Stage 6 and consumes its
+  # ml_results_<scenario>_classic_nn.json files.
+  args:
+    baseline_reports: "{reports_dir}"
+    overlap_reports: "analysis_pipeline/runs/pipeline_unified_classic_nn_baseline_overlap3s_50pct_preproc/reports"
+    baseline_qc_summary: "{reports_dir}/qc_dataset_summary.json"
+    out_json: "{reports_dir}/significance_summary.json"
+    out_md: "{reports_dir}/significance_summary.md"
+    n_bootstrap: 2000
+    n_permutations: 10000
+    random_seed: 42
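
To make the n_bootstrap setting above concrete, here is a small sketch of a percentile-method bootstrap confidence interval over one cell's outer-fold balanced accuracies. It uses only the standard library; the function name, the example fold scores, and the choice of the mean as the resampled statistic are assumptions for illustration and are not taken from stage7_significance.py.

```python
import random
import statistics


def bootstrap_ci(values, n_bootstrap=2000, alpha=0.05, seed=42):
    """Return (mean, lower, upper) using the percentile bootstrap.

    Each resample draws len(values) scores with replacement and records
    their mean; the CI bounds are the alpha/2 and 1 - alpha/2 order
    statistics of the sorted resample means.
    """
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_bootstrap)
    )
    lo_idx = round((alpha / 2) * n_bootstrap)
    hi_idx = round((1 - alpha / 2) * n_bootstrap) - 1
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    # Hypothetical outer-fold balanced accuracies for a single cell.
    fold_scores = [0.30, 0.35, 0.40, 0.38, 0.33]
    mean, lower, upper = bootstrap_ci(fold_scores)
    print(f"{mean:.3f} [{lower:.3f}, {upper:.3f}]")
```

Fixing the seed, as the profile does with random_seed: 42, makes the interval reproducible across runs. With only a handful of outer folds the interval is coarse, because each resample mean can take relatively few distinct values.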

analysis_pipeline/run_pipeline.py

Lines changed: 35 additions & 0 deletions
@@ -48,6 +48,7 @@ class OutputLayout:
         "stage6",
         "stage6_confusions",
         "stage6_publication_report",
+        "stage7_significance",
     ]
 
 
@@ -869,6 +870,40 @@ def _plan_pipeline(
         outputs=[str(stage6_pub_out_md), str(stage6_pub_out_json)],
     )
 
+    # ------------------------------------------------------------------
+    # Stage 7 — significance and bootstrap statistics
+    # ------------------------------------------------------------------
+    stage7_cfg = config.get("stage7_significance", {})
+    stage7_args = dict(stage7_cfg.get("args", {}))
+    if "baseline_reports" not in stage7_args:
+        stage7_args["baseline_reports"] = str(output_layout.reports_dir)
+    stage7_out_json = _resolve_output_path_or_default(
+        stage7_args.get("out_json"),
+        output_layout.reports_dir / "significance_summary.json",
+        repo_root=repo_root,
+    )
+    stage7_out_md = _resolve_output_path_or_default(
+        stage7_args.get("out_md"),
+        output_layout.reports_dir / "significance_summary.md",
+        repo_root=repo_root,
+    )
+    _setdefault_arg(stage7_args, "out_json", stage7_out_json)
+    _setdefault_arg(stage7_args, "out_md", stage7_out_md)
+    cmd_stage7 = [
+        python_exe,
+        "-m",
+        "analysis_pipeline.stage7_significance",
+    ] + _to_cli_args(stage7_args)
+    _append_stage_if_enabled(
+        planned=planned,
+        config=config,
+        stage="stage7_significance",
+        only=only,
+        name="stage7_significance",
+        command=cmd_stage7,
+        outputs=[str(stage7_out_json), str(stage7_out_md)],
+    )
+
     return planned, manifest_out
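
The hunk above builds the Stage 7 command by applying defaults to the args mapping and then flattening it into CLI tokens. The body of _to_cli_args is not part of this diff, so the sketch below uses a hypothetical stand-in, to_cli_args_sketch, to show one plausible shape of that step. The flag spelling (--key with underscores kept) and the handling of booleans and None are assumptions, not the repository's actual behaviour, and the paths are placeholders.

```python
def to_cli_args_sketch(args):
    """Flatten a mapping of stage arguments into a list of CLI tokens.

    Hypothetical stand-in for the pipeline's _to_cli_args helper:
    None and False are skipped, True becomes a bare flag, and every
    other value is emitted as "--key value".
    """
    tokens = []
    for key, value in args.items():
        if value is None or value is False:
            continue
        flag = f"--{key}"
        if value is True:
            tokens.append(flag)
        else:
            tokens.extend([flag, str(value)])
    return tokens


# Example mirroring the defaulting logic in the diff (paths are placeholders).
stage7_args = {"n_bootstrap": 2000, "n_permutations": 10000, "random_seed": 42}
stage7_args.setdefault("baseline_reports", "runs/example/reports")
stage7_args.setdefault("out_json", "runs/example/reports/significance_summary.json")

cmd_stage7 = [
    "python",
    "-m",
    "analysis_pipeline.stage7_significance",
] + to_cli_args_sketch(stage7_args)

if __name__ == "__main__":
    print(" ".join(cmd_stage7))
```

Because the profile's explicit args are copied first and defaults are applied only for missing keys, a value set in the YAML always wins over the orchestrator's fallback.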