Commit c4994c6

docs: record ReDiffuse STL-10 preflight
Record the ReDiffuse DDPM/STL-10 split/statistics/resource preflight and update Research current-state docs to release exactly one bounded scout while keeping Platform/Runtime rows unchanged.
1 parent 35d52ce commit c4994c6

6 files changed

Lines changed: 205 additions & 22 deletions

AGENTS.md

Lines changed: 11 additions & 8 deletions
@@ -28,8 +28,8 @@ Do not start from memory or old chat context. Re-anchor on repository files.
2828

2929
## Current Operating State
3030

31-
- Active work: `2026-05-25 feature-packet consumer-boundary correction is the latest roadmap operating-system update. The current Platform/Runtime admitted bundle remains admitted-only with exactly five rows: recon, PIA baseline, PIA defended, GSA, and DPDM W-1. Tracing the Roots remains positive Research-side gray-box feature-packet evidence (AUC = 0.815826, TPR@1%FPR = 0.134000) and a candidate consumer-boundary design item, but it is not admitted because the current bundle/export/tests/schema still exclude it and the public packet lacks raw target checkpoint identity, raw sample IDs, and image query/response artifacts. ReDiffuse OpenReview split manifests remain the clearest second-asset path, but no checkpoint/score packet is public and no GPU job is selected from this correction. active_gpu_question = none; next_gpu_candidate = none; CPU sidecar = none selected after feature-packet boundary correction.`
32-
- Next GPU candidate: none selected
31+
- Active work: `2026-05-25 ReDiffuse DDPM/STL-10 split/statistics/resource preflight is the latest roadmap operating-system update. The official STL-10 split is exact and public (50k / 50k, SHA256 14a06133f36c74e7d3cb97dbe74385fb42c22335a7cb955fd9944ca503baca52), binds to the local STL-10 unlabeled payload, and does not show obvious low-level image-statistics leakage (linear-probe holdout AUC = 0.4994776215625). The CUDA-capable surface is conda env diffaudit-research, not the default PATH Python. Official ReDiffuse DDPM UNet + GaussianDiffusionTrainer calibration succeeded at batch 4 / 20 steps and batch 64 / 10 steps, with batch 64 peak allocated VRAM 4.419 GB. This is not a membership metric, checkpoint, score packet, or admitted row. active_gpu_question = ReDiffuse DDPM/STL-10 bounded scout; next_gpu_candidate = one bounded STL-10 DDPM pipeline scout only; CPU sidecar = none; split/statistics/resource preflight complete.`
32+
- Next GPU candidate: one bounded ReDiffuse DDPM/STL-10 pipeline scout only
3333
- Long-horizon control: follow `ROADMAP.md` section
3434
`Long-Horizon Research Task Board(2026-05-13 起)` before reopening any
3535
Research lane. The selected forward path is Lane A external asset acquisition
@@ -553,12 +553,15 @@ Do not start from memory or old chat context. Re-anchor on repository files.
553553
package it as a black-box/conditional response-contract candidate unless a
554554
separate reproducibility-maintenance task explicitly reopens admitted GSA
555555
provenance.
556-
- ReDiffuse is closed as hold / split-manifest-only. The official OpenReview
557-
supplement now gives DDPM CIFAR10/CIFAR100/STL10/Tiny-IN train/eval index
558-
manifests, but no target checkpoint, generated response/feature cache, score
559-
packet, ROC CSV, or metric artifact. Do not train DDPM/DiT/Stable Diffusion
560-
targets or rerun same-family attack scripts unless exact checkpoints or score
561-
packets appear for those manifests.
556+
- ReDiffuse is open only for one bounded DDPM/STL-10 scout after the 2026-05-25
557+
split/statistics/resource preflight. The official OpenReview supplement gives
558+
exact DDPM split manifests, the STL-10 `50k / 50k` split binds to local data,
559+
low-level statistics do not separate labels, and the official UNet/trainer path
560+
fits local CUDA at batch 64 calibration scale. This still has no third-party
561+
trained checkpoint, generated response/feature cache, score packet, ROC CSV,
562+
or metric artifact. Do not run full DDPM/DiT/Stable Diffusion training,
563+
`800k`-step jobs, Tiny-ImageNet downloads, Stable Diffusion downloads, or
564+
same-family attack-script sweeps by default.
562565
- `YuxinWenRick/diffusion_memorization` is closed as memorization semantic-shift
563566
watch. It has a real `500`-row `sdv1_500_memorized.jsonl` prompt manifest, but
564567
the ground-truth image package is `2.60G`, `CompVis/stable-diffusion-v1-4` is

ROADMAP.md

Lines changed: 37 additions & 4 deletions
Large diffs are not rendered by default.
Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
1+
# ReDiffuse STL-10 Split and Microtrain Preflight
2+
3+
> Date: 2026-05-25
4+
> Status: split preflight passed / resource-feasible scout candidate / no membership metric yet / no admitted row
5+
6+
## Question
7+
8+
Does the official ReDiffuse OpenReview STL-10 split justify moving beyond
9+
metadata-only asset review into a bounded DDPM/STL-10 model-pipeline scout?
10+
11+
This is a gate-setting preflight, not a benchmark result. It checks the split
12+
semantics, low-level label leakage, and local CUDA resource envelope for the
13+
official ReDiffuse DDPM code path. It does not train a checkpoint to convergence,
14+
save a checkpoint, sample images, run a membership attack, or promote
15+
Platform/Runtime evidence.
16+
17+
## Assets
18+
19+
| Field | Value |
20+
| --- | --- |
21+
| Paper line | `Towards Black-Box Membership Inference Attack for Diffusion Models` / ReDiffuse |
22+
| Split file | `Rediffuse/DDPM/STL10_train_ratio0.5.npz` |
23+
| Split SHA256 | `14a06133f36c74e7d3cb97dbe74385fb42c22335a7cb955fd9944ca503baca52` |
24+
| Local STL-10 payload | `<DOWNLOAD_ROOT>/shared/datasets/stl10_binary/unlabeled_X.bin` |
25+
| Official code surface | `Rediffuse/DDPM/` |
26+
| CUDA environment | `conda run -n diffaudit-research python` |
27+
| GPU observed | NVIDIA GeForce RTX 4070 Laptop GPU, `7.996 GB` total VRAM |
28+
29+
The default PATH Python is CPU-only for this workspace. The CUDA-capable
30+
surface for this preflight is the `diffaudit-research` conda environment.
31+
32+
## Split Semantics
33+
34+
| Check | Result |
35+
| --- | ---: |
36+
| Member count | `50000` |
37+
| Nonmember count | `50000` |
38+
| Member/nonmember overlap | `0` |
39+
| Union coverage | `0..99999` |
40+
| STL-10 payload rows | `100000` unlabeled images |
41+
42+
The STL-10 split is mechanically valid for a `50k / 50k` member/nonmember
43+
experiment over the STL-10 unlabeled payload. This is materially stronger than
44+
paper-only watch evidence because the exact index arrays are public and
45+
hashable.
46+
47+
One provenance caveat remains: `STL10_train_ratio0.5.npz` and
48+
`TINY-IN_train_ratio0.5.npz` are byte-identical and contain identical indices.
49+
That weakens independent split-provenance interpretation for the supplement,
50+
but it does not by itself invalidate the STL-10 split because the indices still
51+
bind cleanly to the `100000` STL-10 unlabeled rows.
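
The split checks in the table above can be sketched as follows. This is a minimal illustration, not the script used for the preflight: the note does not state the array key names inside the `.npz`, so the helper takes plain index sequences, and `check_split` and `sha256_of_bytes` are hypothetical names.

```python
import hashlib


def sha256_of_bytes(payload: bytes) -> str:
    """Hex SHA256 of raw file bytes; equal digests imply byte-identical files."""
    return hashlib.sha256(payload).hexdigest()


def check_split(member_idx, nonmember_idx, payload_rows):
    """Summarize a member/nonmember index split against a payload of known size.

    Assumption: indices are 0-based row numbers into the payload.
    """
    members = {int(i) for i in member_idx}
    nonmembers = {int(i) for i in nonmember_idx}
    return {
        "member_count": len(members),
        "nonmember_count": len(nonmembers),
        # Indices that appear on both sides of the split.
        "overlap": len(members & nonmembers),
        # Repeated indices within either side.
        "duplicates": (len(member_idx) - len(members))
        + (len(nonmember_idx) - len(nonmembers)),
        # True only if the union is exactly 0..payload_rows-1.
        "covers_payload": (members | nonmembers) == set(range(payload_rows)),
    }


# Toy usage with a 10-row payload (illustrative values only).
report = check_split([0, 2, 4, 6, 8], [1, 3, 5, 7, 9], payload_rows=10)
print(report)
```

Comparing `sha256_of_bytes` digests of two split files is one way to detect the byte-identical case described in the provenance caveat.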
52+
53+
## Low-Level Leakage Preflight
54+
55+
A CPU-only image-statistics probe used all `100000` STL-10 unlabeled images and
56+
`208` low-level features. The goal was to catch trivial source/statistics
57+
confounding before releasing any model-pipeline work.
58+
59+
| Check | Result |
60+
| --- | ---: |
61+
| Feature matrix | `100000 x 208` |
62+
| Top univariate absolute AUC | about `0.502` |
63+
| Linear probe train AUC | `0.556014935` |
64+
| Linear probe test AUC | `0.4994776215625` on an `80000` holdout |
65+
66+
Decision implication: the split does not show obvious low-level source or
67+
image-statistics leakage. Unlike the collaborator Stable Diffusion ReDiffuse
68+
packet, the member label is not trivially explained by a source column or
69+
low-level image-statistics split.
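
For readers unfamiliar with the AUC figures above, here is a minimal pure-Python sketch of a rank-based AUC with tie handling. It is not the probe used in the preflight. The reading of "absolute AUC" as the AUC folded around `0.5` is an assumption, and `roc_auc`, `folded_auc`, and `top_univariate_abs_auc` are hypothetical helper names.

```python
def roc_auc(scores, labels):
    """Mann-Whitney AUC: P(score of a member > score of a nonmember), ties count 0.5."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one member and one nonmember")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores pairs[i:j] and give it the average rank.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def folded_auc(auc):
    """Direction-free separation: 0.5 means no signal (assumed meaning of 'absolute AUC')."""
    return 0.5 + abs(auc - 0.5)


def top_univariate_abs_auc(feature_columns, labels):
    """Return (column index, folded AUC) of the single most separating feature."""
    folded = [folded_auc(roc_auc(col, labels)) for col in feature_columns]
    best = max(range(len(folded)), key=folded.__getitem__)
    return best, folded[best]


# Toy usage: one constant feature and one perfectly (inversely) separating feature.
labels = [0, 0, 1, 1]
columns = [[1, 1, 1, 1], [0.4, 0.3, 0.2, 0.1]]
print(top_univariate_abs_auc(columns, labels))
```

A value near `0.5` for every feature, as reported in the table, means no single statistic ranks members above nonmembers better than chance.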
70+
71+
## CUDA Pipeline Calibration
72+
73+
The official DDPM dependencies were checked before running model code. The
74+
`diffaudit-research` environment has CUDA Torch, `torchvision`, `sklearn`, and
75+
`absl`, but does not currently have `tensorboardX` or `pynvml`. Therefore the
76+
official `main.py` was not run unmodified. Temporary calibration scripts
77+
imported the official `UNet` and `GaussianDiffusionTrainer` directly and
78+
bypassed logging/GPU-monitoring dependencies.
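
A dependency check of this kind can be sketched with the standard library alone. The helper name `missing_modules` is hypothetical and this is not the check actually run; it only reports whether top-level import names resolve in the active interpreter, without importing them.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names taken from the paragraph above; results depend on the environment.
required = ["torch", "torchvision", "sklearn", "absl", "tensorboardX", "pynvml"]
print("missing:", missing_modules(required))
```

Running this under `conda run -n diffaudit-research python` rather than the default PATH Python matters here, since the two interpreters see different site-packages.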
79+
80+
| Calibration | Batch | Steps | Status | Elapsed | Peak allocated VRAM | Notes |
81+
| --- | ---: | ---: | --- | ---: | ---: | --- |
82+
| Microtrain pipeline smoke | `4` | `20` | `ready` | `2.823s` | `0.833 GB` | official model/trainer path executes |
83+
| Batch envelope check | `64` | `10` | `ready` | `10.247s` | `4.419 GB` | no checkpoint, no sampling, no MIA metric |
84+
85+
Batch `64` is resource-feasible on the local RTX 4070 Laptop GPU for a bounded
86+
scout. The batch-envelope check observed `0.616 GB` free after completion, so a
87+
longer scout should still use an explicit memory guard, checkpoint cadence, and
88+
stop condition rather than assuming the full training recipe is safe.
89+
90+
## Gate Result
91+
92+
| Gate | Result |
93+
| --- | --- |
94+
| Target identity | Still missing for public replay. No trained third-party STL-10 checkpoint or score packet is public. |
95+
| Split contract | Pass for STL-10. Exact public `50k / 50k` split binds to the local STL-10 unlabeled payload. |
96+
| Low-level leakage | Pass for preflight. Simple image statistics do not separate member/nonmember labels. |
97+
| Official code path | Partial pass. Official `UNet` and `GaussianDiffusionTrainer` run under the CUDA conda environment; official `main.py` still has missing logging/monitoring deps. |
98+
| Resource envelope | Pass for bounded scout. Batch `64` fits within local VRAM for calibration. |
99+
| Metric contract | Not run. No MIA score, ROC, AUC, ASR, or low-FPR metric exists from this preflight. |
100+
101+
## Decision
102+
103+
`split preflight passed / resource-feasible scout candidate / no membership
104+
metric yet / no admitted row`.
105+
106+
The ReDiffuse DDPM/STL-10 route is now eligible for exactly one bounded
107+
model-pipeline scout because it is the clearest available second-dataset route:
108+
the split is exact and public, the local STL-10 payload is present, low-level
109+
leakage was not detected, and the official DDPM model/trainer path is
110+
resource-feasible on the local GPU.
111+
112+
This does not release long training by default. The next run must be a bounded
113+
scout with a frozen hypothesis, command, maximum wall-clock/step budget, memory
114+
guard, checkpoint/output target, and stop condition. It must report whether a
115+
short STL-10 DDPM target can produce a scoreable attack packet; it must not be
116+
an `800k`-step or full-paper reproduction attempt.
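
One way to make the bounded-scout requirements concrete is a frozen budget object checked once per training step. This is a sketch under stated assumptions: `ScoutBudget`, `stop_reason`, and all numeric limits below are hypothetical placeholders, not values chosen by this note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoutBudget:
    """Limits fixed before the run starts; frozen so they cannot drift mid-run."""
    max_steps: int
    max_wall_clock_s: float
    min_free_vram_gb: float


def stop_reason(budget: ScoutBudget, step: int, elapsed_s: float,
                free_vram_gb: float) -> Optional[str]:
    """Return why the scout must stop now, or None if it may continue."""
    if step >= budget.max_steps:
        return "step budget reached"
    if elapsed_s >= budget.max_wall_clock_s:
        return "wall-clock budget reached"
    if free_vram_gb < budget.min_free_vram_gb:
        return "memory guard tripped"
    return None


# Placeholder limits for illustration only.
budget = ScoutBudget(max_steps=2000, max_wall_clock_s=3600.0, min_free_vram_gb=0.5)
print(stop_reason(budget, step=10, elapsed_s=12.0, free_vram_gb=0.616))
print(stop_reason(budget, step=10, elapsed_s=12.0, free_vram_gb=0.3))
```

In a real loop the free-VRAM reading could come from `torch.cuda.mem_get_info()`, which returns free and total bytes for the current device. The check is kept as a pure function so it can be tested without a GPU.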
117+
118+
## Stop Condition
119+
120+
- Do not claim any membership-inference result from the `20`-step or `10`-step
121+
calibration runs.
122+
- Do not run full DDPM training or broad hyperparameter sweeps from this note.
123+
- Do not download Tiny-ImageNet or Stable Diffusion assets because the current
124+
gate is specifically STL-10.
125+
- Do not add new CLI, validators, or long scaffolding before a bounded scout
126+
actually produces a decision-changing artifact.
127+
128+
## Platform and Runtime Impact
129+
130+
None. This is Research-only preflight evidence. The admitted Platform/Runtime
131+
bundle remains the existing five rows.
