DeliciousBuding
diff --git a/AGENTS.md
Lines changed: 12 additions & 11 deletions b/AGENTS.md
diff --git a/‎ROADMAP.md‎
Lines changed: 32 additions & 4 deletions b/‎ROADMAP.md‎
diff --git a/‎docs/evidence/rediffuse-stl10-bounded-scout-20260525.md‎
Lines changed: 100 additions & 0 deletions b/‎docs/evidence/rediffuse-stl10-bounded-scout-20260525.md‎
@@ -28,8 +28,8 @@ Do not start from memory or old chat context. Re-anchor on repository files.
 
 ## Current Operating State
 
-- Active work: `2026-05-25 ReDiffuse DDPM/STL-10 split/statistics/resource preflight is the latest roadmap operating-system update. The official STL-10 split is exact and public (50k / 50k, SHA256 14a06133f36c74e7d3cb97dbe74385fb42c22335a7cb955fd9944ca503baca52), binds to the local STL-10 unlabeled payload, and does not show obvious low-level image-statistics leakage (linear-probe holdout AUC = 0.4994776215625). The CUDA-capable surface is conda env diffaudit-research, not the default PATH Python. Official ReDiffuse DDPM UNet + GaussianDiffusionTrainer calibration succeeded at batch 4 / 20 steps and batch 64 / 10 steps, with batch 64 peak allocated VRAM 4.419 GB. This is not a membership metric, checkpoint, score packet, or admitted row. active_gpu_question = ReDiffuse DDPM/STL-10 bounded scout; next_gpu_candidate = one bounded STL-10 DDPM pipeline scout only; CPU sidecar = none; split/statistics/resource preflight complete.`
-- Next GPU candidate: one bounded ReDiffuse DDPM/STL-10 pipeline scout only
+- Active work: `2026-05-25 ReDiffuse DDPM/STL-10 bounded scout is the latest roadmap operating-system update. The official STL-10 split is exact and public, and the local pipeline produced a short-target checkpoint plus 256 / 256 score packet, but fixed-timestep denoising-loss is random-level: AUC = 0.4996337890625, ASR = 0.509765625, TPR@1%FPR = 0.01171875, TPR@0.1%FPR = 0.0. This is scoreable negative evidence, not a second asset, not a full-paper reproduction, and not an admitted row. active_gpu_question = none; next_gpu_candidate = none; CPU sidecar = none selected after ReDiffuse STL-10 bounded scout weak result.`
+- Next GPU candidate: none selected
 - Long-horizon control: follow `ROADMAP.md` section
   `Long-Horizon Research Task Board（2026-05-13 起）` before reopening any
   Research lane. The selected forward path is Lane A external asset acquisition
@@ -553,15 +553,16 @@ Do not start from memory or old chat context. Re-anchor on repository files.
   package it as a black-box/conditional response-contract candidate unless a
   separate reproducibility-maintenance task explicitly reopens admitted GSA
   provenance.
-- ReDiffuse is open only for one bounded DDPM/STL-10 scout after the 2026-05-25
-  split/statistics/resource preflight. The official OpenReview supplement gives
-  exact DDPM split manifests, the STL-10 `50k / 50k` split binds to local data,
-  low-level statistics do not separate labels, and the official UNet/trainer path
-  fits local CUDA at batch 64 calibration scale. This still has no third-party
-  trained checkpoint, generated response/feature cache, score packet, ROC CSV,
-  or metric artifact. Do not run full DDPM/DiT/Stable Diffusion training,
-  `800k`-step jobs, Tiny-ImageNet downloads, Stable Diffusion downloads, or
-  same-family attack-script sweeps by default.
+- ReDiffuse STL-10 is closed after the one bounded DDPM/STL-10 scout. The
+  official OpenReview split binds cleanly to STL-10, and the local pipeline is
+  executable, but the `300`-step short target with fixed-timestep denoising-loss
+  produced random-level membership metrics (`AUC = 0.4996337890625`). This still
+  has no third-party trained checkpoint, generated response/feature cache,
+  strong score packet, ROC CSV, or admitted metric artifact. Do not expand into
+  step-count, seed, timestep, batch-size, subset-size, EMA, scheduler, or
+  denoising-loss matrices, and do not run full DDPM/DiT/Stable Diffusion
+  training, `800k`-step jobs, Tiny-ImageNet downloads, Stable Diffusion
+  downloads, or same-family attack-script sweeps by default.
 - `YuxinWenRick/diffusion_memorization` is closed as memorization semantic-shift
   watch. It has a real `500`-row `sdv1_500_memorized.jsonl` prompt manifest, but
   the ground-truth image package is `2.60G`, `CompVis/stable-diffusion-v1-4` is
 
@@ -0,0 +1,100 @@
+# ReDiffuse STL-10 Bounded Scout
+
+> Date: 2026-05-25
+> Status: bounded scout completed / score packet produced / weak denoising-loss signal / no GPU expansion
+
+## Question
+
+After the STL-10 split and resource preflight, can a short ReDiffuse DDPM/STL-10
+target produce a scoreable membership packet, and does fixed-timestep
+denoising-loss show any immediate membership signal?
+
+This is a bounded scout, not a paper-level ReDiffuse reproduction. It does not
+claim a trained STL-10 DDPM benchmark, does not use full-paper training length,
+and does not promote Platform/Runtime evidence.
+
+## Frozen Contract
+
+| Field | Value |
+| --- | --- |
+| Target family | Official ReDiffuse DDPM `UNet` + `GaussianDiffusionTrainer` |
+| Dataset | STL-10 unlabeled |
+| Split source | `STL10_train_ratio0.5.npz` |
+| Train member subset | `1024` samples from official member split |
+| Score packet | `256` trained members + `256` official nonmembers |
+| Training budget | `300` steps, batch `32` |
+| Hard guards | `900s` wall-clock or `7.4 GB` allocated CUDA memory |
+| Score definition | `score = -mean_fixed_timestep_denoising_mse` |
+| Score timesteps | `50`, `200`, `500`, `800` |
+| Seed | `20260525` |
+| CUDA env | `conda run -n diffaudit-research python` |
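
The score definition in the contract table can be illustrated with a minimal pure-Python sketch. It is not the run script. Several pieces are assumptions rather than facts from the scout: the 1000-step linear beta schedule, the hypothetical `predict_noise(x_t, t)` callable standing in for the UNet, the use of a flat list of floats as the image, and the way the seed drives the noise draw.

```python
import math
import random

# Assumed DDPM schedule (not stated in the scout notes): 1000 linear betas.
NUM_STEPS = 1000
BETAS = [1e-4 + (0.02 - 1e-4) * i / (NUM_STEPS - 1) for i in range(NUM_STEPS)]


def alpha_bar(t):
    """Cumulative product of (1 - beta) up to and including step t."""
    prod = 1.0
    for beta in BETAS[: t + 1]:
        prod *= 1.0 - beta
    return prod


def denoising_score(x0, predict_noise, timesteps=(50, 200, 500, 800), seed=20260525):
    """Return -mean over timesteps of the MSE between true and predicted noise.

    `predict_noise` is a hypothetical stand-in for the trained noise predictor.
    A higher (less negative) score means the sample looks more member-like.
    """
    rng = random.Random(seed)
    losses = []
    for t in timesteps:
        a_bar = alpha_bar(t)
        eps = [rng.gauss(0.0, 1.0) for _ in x0]
        # Forward diffusion: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps
        x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
               for x, e in zip(x0, eps)]
        eps_hat = predict_noise(x_t, t)
        losses.append(sum((p - e) ** 2 for p, e in zip(eps_hat, eps)) / len(x0))
    return -sum(losses) / len(losses)
```

Negating the loss keeps the usual convention that a larger score means "more likely a member", so the same thresholding code works for any scorer.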
+
+Batch `32` was selected instead of `64` for the scout because live free VRAM
+before the run was only about `4.6 GB`, while the earlier batch-64 calibration
+had peaked at `4.419 GB`. This was a safety change, not a change in hypothesis.
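
The hard guards from the contract (step budget, `900s` wall-clock, `7.4 GB` allocated memory) could be checked once per training step along these lines. This is a sketch under stated assumptions: only the `step_budget` label appears in the scout results, the other two labels are hypothetical, and treating `1 GB` as `1e9` bytes is a guess about the script's convention. In a PyTorch loop the allocated byte count would plausibly come from `torch.cuda.max_memory_allocated()`.

```python
import time


def stop_reason(step, started_at, allocated_bytes, *, max_steps=300,
                max_seconds=900.0, max_alloc_gb=7.4, now=None):
    """Return the reason training should stop, or None to keep going.

    `started_at` and `now` are time.monotonic() readings; `now` can be
    injected so the guard is easy to exercise without waiting.
    """
    now = time.monotonic() if now is None else now
    if allocated_bytes / 1e9 >= max_alloc_gb:   # assumed: 1 GB = 1e9 bytes
        return "vram_guard"                     # hypothetical label
    if now - started_at >= max_seconds:
        return "wall_clock_guard"               # hypothetical label
    if step >= max_steps:
        return "step_budget"                    # label reported by the scout
    return None
```

Checking the resource guards before the step budget means a run that hits a limit on its final step is still recorded as guard-stopped rather than as a clean completion.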
+
+## Run Artifacts
+
+Artifacts are stored outside Git under:
+`<DOWNLOAD_ROOT>/shared/runs/rediffuse-stl10-bounded-scout-20260525/`.
+
+| Artifact | Size | SHA256 |
+| --- | ---: | --- |
+| `summary.json` | `2,243` bytes | `02cfe2af7346e7b380608ef748d164031ec89ba67d10822ef9d0badb8c3b209e` |
+| `scores.csv` | `28,732` bytes | `c0f396502114986c8c3549f626ce5083fcaaec2fcf8a319aabe509588d8abd0a` |
+| `checkpoint-step-final.pt` | `573,473,892` bytes | `006f5247ef2f91a331a097d16bdf1c153f94f3c2f112e1e9b8d3efdd5bb2ec5e` |
+| `run_rediffuse_stl10_bounded_scout.py` | `10,945` bytes | `4ee76c37594cb7834459b3059f8c0af58c1eabb5c34323275f3990c8c9933d1f` |
+
+The script was intentionally kept as a run artifact rather than promoted into a
+repo CLI. The run answered the decision question without needing a new reusable
+tool surface.
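
Because the artifacts live outside Git, the sizes and digests in the table are the only binding to this note. A minimal way to re-check them, using only the Python standard library, is sketched below. The helper names `sha256_file` and `verify_artifact` are illustrative and are not part of the repository.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large checkpoints need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_size, expected_sha256):
    """True only if both the byte size and the SHA-256 digest match."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return sha256_file(path) == expected_sha256.lower()
```

For `summary.json`, for example, the call would pass the file's location under the run directory together with `2243` and the digest starting `02cfe2af`.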
+
+## Result
+
+| Metric | Value |
+| --- | ---: |
+| Completed steps | `300` |
+| Stop reason | `step_budget` |
+| Elapsed time | `92.750s` |
+| Peak allocated VRAM | `2.430 GB` |
+| First training loss | `1.0019083023` |
+| Last training loss | `0.0491645448` |
+| Mean last-25 loss | `0.0447090799` |
+| Score packet size | `256` members + `256` nonmembers |
+| AUC | `0.4996337890625` |
+| ASR | `0.509765625` |
+| TPR@1%FPR | `0.01171875` |
+| TPR@0.1%FPR | `0.0` |
+| Nonmember denominator | `256` |
+| Minimum nonzero FPR | `0.00390625` |
+
+The target did train in the narrow engineering sense: loss fell quickly over
+the `300` steps, and a checkpoint plus row-level score packet were produced.
+The membership signal is effectively random under this fixed-timestep
+denoising-loss scorer.
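
The metrics in the table can be recomputed from a row-level score packet with a few lines of standard-library Python. The sketch below is an illustration, not the run script: the exact metric definitions used by the scout are not shown in this note, and ASR is assumed here to mean the best balanced accuracy over all thresholds. It also makes the resolution limit visible. With `256` nonmembers, FPR moves in steps of `1/256 = 0.00390625`, so TPR@0.1%FPR can only be read at zero false positives, and TPR@1%FPR allows at most two.

```python
def auc(member_scores, nonmember_scores):
    """Mann-Whitney AUC: P(member score > nonmember score), ties count 0.5."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            wins += 1.0 if m > n else 0.5 if m == n else 0.0
    return wins / (len(member_scores) * len(nonmember_scores))


def roc_points(member_scores, nonmember_scores):
    """(FPR, TPR) pairs for every rule 'predict member if score >= threshold'."""
    thresholds = sorted(set(member_scores) | set(nonmember_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for thr in thresholds:
        tpr = sum(s >= thr for s in member_scores) / len(member_scores)
        fpr = sum(s >= thr for s in nonmember_scores) / len(nonmember_scores)
        points.append((fpr, tpr))
    return points


def tpr_at_fpr(points, max_fpr):
    """Best TPR among operating points whose FPR does not exceed max_fpr."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


def asr(points):
    """Assumed definition: best balanced accuracy over all thresholds."""
    return max((tpr + 1.0 - fpr) / 2.0 for fpr, tpr in points)
```

Under these definitions the reported `TPR@1%FPR = 0.01171875` corresponds to `3` of `256` members, which is about what chance alone would produce at that operating point.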
+
+## Decision
+
+`bounded scout completed / score packet produced / weak denoising-loss signal /
+no GPU expansion`.
+
+This scout answers the only question released by the preflight: the ReDiffuse
+STL-10 path is executable and scoreable locally, but the short target plus
+fixed-timestep denoising-loss does not produce useful membership evidence. This
+is a negative-but-useful result, not a reason to launch full-paper training.
+
+Do not expand this into step-count, seed, timestep, batch-size, subset-size,
+EMA, scheduler, or denoising-loss matrices by default. Reopen ReDiffuse
+STL-10 only if one of these appears:
+
+- a public third-party STL-10 checkpoint or score packet for the official split;
+- a clearly different membership observable with a one-run falsifiable
+  hypothesis; or
+- an explicit decision to spend a reviewed long-training budget for scientific
+  portability, with checkpoint publication and score-packet contract defined in
+  advance.
+
+## Platform and Runtime Impact
+
+None. The admitted Platform/Runtime bundle remains the existing five rows:
+`recon`, `PIA baseline`, `PIA defended`, `GSA`, and `DPDM W-1`.