DeliciousBuding · May 24, 2026
@@ -16,6 +16,7 @@
 | Gray-box | PIA defended | `PIA GPU512 baseline` | `stochastic-dropout all-steps prototype` | 0.828075 | 0.767578 | 0.052734 | 0.009766 | `runtime-mainline` | `attack_num=30; interval=10; batch_size=8; 512 samples per split; single GPU serial; adaptive repeats=3; wall-clock=223.128438s` | `Research/workspaces/implementation/artifacts/unified-attack-defense-table.json` (gray-box defended row) | Workspace-verified local DDPM/CIFAR10 defended comparator with bounded repeated-query adaptive review (`adaptive repeats=3`). Shows inference-time randomization weakening `epsilon-trajectory consistency`, but remains provisional. `TPR@0.1%FPR` is a finite empirical strict-tail point over 512 target nonmembers, not calibrated sub-percent FPR. Blocked by checkpoint/source provenance. Not validated privacy protection. |
 | White-box | GSA attack | `GSA 1k-3shadow` | `none` | 0.998192 | 0.9895 | 0.987 | 0.432 | `runtime-mainline` | `target_eval_size=2000; shadow_train_size=4200; 3 shadows; cuda` | `Research/workspaces/implementation/artifacts/unified-attack-defense-table.json` (white-box attack row) | Admitted white-box attack line. Treat as risk upper bound, not final paper-level benchmark. |
 | White-box | DPDM defended | `GSA 1k-3shadow` | `DPDM strong-v3 full-scale` | 0.488783 | 0.4985 | 0.009 | 0.0 | `runtime-mainline` | `target_eval_size=2000; shadow_train_size=6000; classifier=logistic-regression-1d` | `Research/workspaces/implementation/artifacts/unified-attack-defense-table.json` (white-box defended row) | Admitted white-box defense comparator. Bridge frozen; not a finished benchmark. Comparison informs governance decisions. |
+| Gray-box (feature-packet) | Tracing the Roots | `diffusion trajectory features (1002-dim)` | `none` | 0.815826 | 0.7375 | 0.134 | 0.038 | `feature-packet` | `2000 train (1000M+1000E) + 2000 eval; 1002 features; CPU replay; SHA-256 verified tensors` | `Research/docs/product-bridge/tracing-roots-candidate-evidence-card.md` | Feature-packet evidence, not per-image identity. Pre-computed tensors from OpenReview supplement. Demonstrates diffusion trajectory features carry detectable MIA signal under gray-box access. |
 
 Each row records only the admitted primary value and can be cited directly.
 Gray-box PIA results must be reported with all four metrics (`AUC / ASR / TPR@1%FPR / TPR@0.1%FPR`).

@@ -0,0 +1,91 @@
+# CIFAR-10 DDPM PIA/NNS MIA Evidence Note
+
+> Date: 2026-05-24
+> Status: evidence-ready (aggregate metrics); strict-tail blocked
+
+## Summary
+
+PIA and NNS (ResNet18 on PIA features) attacks were evaluated on two independent
+pre-trained CIFAR-10 DDPM/DDIM checkpoints. NNS achieves AUC≈0.990 on both,
+so the two checkpoints cross-validate the result. Strict-tail TPR is blocked by
+an FPR dead-zone: the NNS score distribution forces FPR to jump from 0 to ~12%,
+making low-FPR per-sample MIA infeasible.
+
+## Experimental Design
+
+- **Target**: CIFAR-10 DDPM/DDIM (ReDiffuse ICLR 2025 supplement split)
+- **Split**: 25,000 members + 25,000 non-members from `STL10_train_ratio0.5.npz`
+- **Checkpoints**:
+  - 750k DDIM: `DDIM-ckpt-step750000.pt` (collaborator 2026-05-09)
+  - 800k DDPM: `cifar10_ddpm/checkpoint.pt` (PIA assets, 800k steps)
+- **Attack methods**:
+  - PIA: epsilon-prediction consistency at t=200
+  - NNS: ResNet18 classifier trained on PIA features (80/20 split, 15 epochs)
+  - SecMI: multi-step DDIM reverse/denoise at various intervals
+- **Metrics**: AUC, ASR, and TPR at fixed FPR (TPR@5%FPR and TPR@1%FPR)
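
The metrics named in the list above can be computed from two score arrays. The following is a minimal sketch, not the scoring scripts from this change: the function name `mia_metrics` is hypothetical, higher scores are assumed to mean "more member-like", and ASR is assumed to be the best balanced accuracy over all thresholds.

```python
import numpy as np

def mia_metrics(member_scores, nonmember_scores, fprs=(0.05, 0.01)):
    """Return AUC, ASR, and TPR at each FPR budget (hypothetical helper)."""
    m = np.asarray(member_scores, dtype=float)
    n = np.asarray(nonmember_scores, dtype=float)

    # AUC as the Mann-Whitney statistic: P(member > non-member) + 0.5 * P(tie).
    # The pairwise matrix is O(len(m) * len(n)), so this suits small packets only.
    greater = (m[:, None] > n[None, :]).mean()
    ties = (m[:, None] == n[None, :]).mean()
    auc = float(greater + 0.5 * ties)

    # Candidate thresholds: every distinct score, plus +inf (predict nothing).
    thresholds = np.concatenate([np.unique(np.concatenate([m, n])), [np.inf]])
    tpr = np.array([(m >= t).mean() for t in thresholds])
    fpr = np.array([(n >= t).mean() for t in thresholds])

    # ASR (assumed definition): best balanced accuracy over all thresholds.
    asr = float(np.max((tpr + 1.0 - fpr) / 2.0))

    out = {"auc": auc, "asr": asr}
    for budget in fprs:
        # Best TPR among thresholds whose empirical FPR stays within budget.
        # The +inf threshold guarantees at least one admissible point.
        out[f"tpr@{budget:g}fpr"] = float(tpr[fpr <= budget].max())
    return out

if __name__ == "__main__":
    print(mia_metrics([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65]))
```

With only a few hundred non-members, TPR at a small FPR budget is a finite empirical point rather than a calibrated rate, which is why the tables treat low-FPR values cautiously.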
+
+## Results
+
+### 800k DDPM checkpoint
+
+| Method | AUC | ASR | TPR@5%FPR | TPR@1%FPR |
+|---|---|---|---|---|
+| Raw PIA (i200) | 0.8853 | 0.8153 | 0.0000 | 0.0000 |
+| SecMI (i200-n4) | 0.7761 | 0.7098 | 0.0000 | 0.0000 |
+| **NNS (ResNet18)** | **0.9903** | **0.9630** | 0.0000 | 0.0000 |
+
+PIA interval sweep (AUC at i200: 0.885, i100: 0.838, i50: 0.679): interval=200 is the best of the three settings tested.
+
+### 750k DDIM checkpoint
+
+| Method | AUC | ASR |
+|---|---|---|
+| Raw PIA (i200) | 0.8747 | 0.8051 |
+| SecMI (i200-n4) | 0.4612 | 0.4346 |
+| **NNS (ResNet18)** | **0.9891** | **0.9566** |
+
+### Self-trained checkpoints (negative controls)
+
+| Steps | PIA AUC | SecMI AUC |
+|---|---|---|
+| 10k (STL-10) | 0.500 | - |
+| 10k (CIFAR-10) | 0.503 | - |
+| 100k (CIFAR-10) | 0.471 | 0.477 |
+
+### FPR dead-zone (NNS on 800k)
+
+```
+Non-members > 0.5: 223/4999 (4.5%)
+Members    > 0.5: 4847/4999 (97.0%)
+Non-members > 0.8: 53/4999 (1.1%)
+```
+
+The ROC curve has FPR=0 until threshold ~1.07 (first non-member outlier),
+then jumps to FPR≈0.12 (cluster of non-members with similar scores).
+No FPR value exists between ~0.0002 and ~0.12, making TPR@5%FPR=0
+despite AUC=0.990.
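
The dead-zone mechanism described above can be reproduced on synthetic scores. This is an illustrative sketch only: the score values, group sizes, and helper names (`roc_points`, `tpr_at_fpr`, `auc`) are assumptions chosen to mimic the shape of the problem, not the actual NNS outputs.

```python
import numpy as np

def roc_points(member_scores, nonmember_scores):
    """Empirical ROC points, one per distinct score, thresholds descending."""
    m = np.asarray(member_scores, dtype=float)
    n = np.asarray(nonmember_scores, dtype=float)
    thresholds = np.unique(np.concatenate([m, n]))[::-1]
    fpr = np.array([(n >= t).mean() for t in thresholds])
    tpr = np.array([(m >= t).mean() for t in thresholds])
    return thresholds, fpr, tpr

def tpr_at_fpr(member_scores, nonmember_scores, budget):
    """Best TPR among thresholds whose empirical FPR is within the budget."""
    _, fpr, tpr = roc_points(member_scores, nonmember_scores)
    ok = fpr <= budget
    return float(tpr[ok].max()) if ok.any() else 0.0

def auc(member_scores, nonmember_scores):
    """Mann-Whitney AUC with ties counted as one half."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float((m > n).mean() + 0.5 * (m == n).mean())

# Synthetic packet (assumed numbers): one non-member outlier scores above
# everything, and 120 non-members tie with 900 members at a saturated score.
members = np.concatenate([np.full(900, 1.0), np.linspace(0.0, 0.9, 100)])
nonmembers = np.concatenate([[1.07], np.full(120, 1.0), np.linspace(0.0, 0.9, 879)])

if __name__ == "__main__":
    print("AUC        :", round(auc(members, nonmembers), 3))
    print("TPR@5%FPR  :", tpr_at_fpr(members, nonmembers, 0.05))
    print("TPR@13%FPR :", tpr_at_fpr(members, nonmembers, 0.13))
```

In this toy packet the only threshold with FPR at or below 5% is the outlier itself, where no member is caught, so TPR@5%FPR is zero even though the aggregate AUC stays well above chance. The first threshold that catches members also admits the whole tied non-member cluster at once.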
+
+## Interpretation
+
+1. **PIA attack is validated**: AUC=0.885 on the 800k checkpoint and 0.875 on the 750k checkpoint
+2. **NNS second-stage improves AUC to ~0.990** on both checkpoints
+3. **Cross-validation holds**: 750k and 800k independently trained models give
+   nearly identical NNS metrics (0.989 vs 0.990)
+4. **Training scale is critical**: the 10k and 100k self-trained checkpoints produce
+   chance-level AUC; a detectable signal appears only at the 750k and 800k checkpoints tested here
+5. **FPR dead-zone is the core limitation**: NNS scores cluster non-members
+   into two groups (95.5% below 0.5, 4.5% above), creating a ROC cliff
+   that prevents low-FPR per-sample MIA
+
+## Implications for DiffAudit
+
+- AUC=0.990 is a strong aggregate MIA signal (standard paper claim)
+- Per-sample low-FPR MIA is not feasible with the PIA/NNS attack family
+- This is consistent with existing gray-box PIA evidence: "AUC is acceptable but strict low-tail evidence is insufficient"
+- For the admitted evidence bundle: PIA+NNS results strengthen the gray-box
+  line but do NOT change the per-sample MIA boundary
+
+## Files
+
+- PIA/NNS scripts: `Research/outputs/score_800k_pia.py`, `score_800k_nns.py`, `score_750k_nns.py`
+- Scores available at `$env:DIFFAUDIT_OUTPUT/pia_v2_scores.npz`, `nns_scores.npz` (800k run) and `$env:DIFFAUDIT_OUTPUT/nns_scores.npz` (750k run)
@@ -70,7 +70,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims.
 | Black-box `H2 response-strength` | candidate-only | Positive-but-bounded DDPM/CIFAR10 candidate: frozen cutoff-0.50 lowpass follow-up passed, and raw H2 recovered strict-tail signal on the fresh packet. SD/CelebA text-to-image transfer is blocked by protocol mismatch. The frozen SD/CelebA image-to-image micro-packet is runnable, but H2 logistic does not beat the same-cache simple distance comparator, so H2 is not promoted beyond candidate-only. A separate simple-distance line now has bounded single-asset evidence: first 10/10 packet `AUC = 0.92`, non-overlapping 10/10 packet `AUC = 0.99` with 9/10 TP at 0 FP, and non-overlapping 25/25 admission packet `AUC = 0.8768`, `ASR = 0.84`, 11/25 TP at 0 FP. This is not a conditional-diffusion generalization or a `recon` product replacement. See [black-box-response-strength-preflight.md](black-box-response-strength-preflight.md), [h2-lowpass-followup-contract.md](h2-lowpass-followup-contract.md), [h2-cross-asset-contract-preflight.md](h2-cross-asset-contract-preflight.md), [h2-image-to-image-contract.md](h2-image-to-image-contract.md), [h2-img2img-micro-result.md](h2-img2img-micro-result.md), [h2-img2img-simple-distance-review.md](h2-img2img-simple-distance-review.md), [h2-img2img-simple-distance-stability-result.md](h2-img2img-simple-distance-stability-result.md), and [h2-img2img-simple-distance-admission-result.md](h2-img2img-simple-distance-admission-result.md). |
 | Black-box mid-frequency same-noise residual | `candidate-only` | Distinct paper-backed observable gap: unlike H2/H3 response-cache frequency filters, this line requires `x_t`, `tilde_x_t`, timestep, noise provenance, and residual scores at the same noise level. The frozen `64/64` sign-check on the collaborator 750k checkpoint produced `AUC = 0.733398`, `ASR = 0.710938`, and finite `4/64` zero-FP recovery. The seed-only repeat retained signal with `AUC = 0.719238`, `ASR = 0.6875`, and finite `3/64` zero-FP recovery. A CPU comparator audit shows low-frequency and full-band residual comparators are at least as strong as the frozen mid-band score on AUC, so the line is candidate-stable-but-bounded but not a proven mid-frequency-specific mechanism. Same-contract GPU expansion is closed. See [midfreq-residual-comparator-audit-20260512.md](midfreq-residual-comparator-audit-20260512.md), [midfreq-residual-stability-result-20260512.md](midfreq-residual-stability-result-20260512.md), [midfreq-residual-stability-decision-20260512.md](midfreq-residual-stability-decision-20260512.md), [midfreq-residual-signcheck-20260512.md](midfreq-residual-signcheck-20260512.md), [midfreq-same-noise-residual-preflight-20260512.md](midfreq-same-noise-residual-preflight-20260512.md), [midfreq-residual-scorer-contract-20260512.md](midfreq-residual-scorer-contract-20260512.md), [midfreq-residual-collector-contract-20260512.md](midfreq-residual-collector-contract-20260512.md), [midfreq-residual-tiny-runner-contract-20260512.md](midfreq-residual-tiny-runner-contract-20260512.md), and [midfreq-residual-real-asset-preflight-20260512.md](midfreq-residual-real-asset-preflight-20260512.md). |
 | Gray-box `PIA` | `evidence-ready` | Strongest admitted local DDPM/CIFAR10 gray-box line. PIA baseline exposes `epsilon-trajectory consistency`; stochastic dropout is a provisional defended comparator that weakens but does not eliminate the signal. The review is bounded to repeated-query adaptive checks with `adaptive repeats=3`; low-FPR values are finite empirical strict-tail points, not calibrated sub-percent FPR. Paper-aligned release provenance remains blocked. See [pia-stochastic-dropout-truth-hardening-review.md](pia-stochastic-dropout-truth-hardening-review.md). |
-| Gray-box `ReDiffuse` | `hold-split-manifest-only` | Candidate baseline-alignment line. The collaborator 750k bundle and checkpoint are runnable, a 64/64 direct-distance compatibility packet exists, and the existing PIA 800k checkpoint is runtime-probe compatible, but prior exact replay showed only modest AUC with weak strict-tail evidence and was not admitted. The official OpenReview supplement now improves provenance by providing exact DDPM train/eval split index manifests for CIFAR10/CIFAR100/STL10/Tiny-IN, but it does not release target checkpoints, generated response/feature caches, score packets, ROC CSVs, or metric artifacts. Do not train DDPM/DiT/Stable Diffusion targets or rerun same-family attack scripts by default. See [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md), [rediffuse-resnet-parity-packet.md](rediffuse-resnet-parity-packet.md), [rediffuse-direct-distance-boundary-review.md](rediffuse-direct-distance-boundary-review.md), [rediffuse-checkpoint-portability-gate.md](rediffuse-checkpoint-portability-gate.md), [rediffuse-resnet-contract-scout.md](rediffuse-resnet-contract-scout.md), [rediffuse-exact-replay-preflight.md](rediffuse-exact-replay-preflight.md), and [rediffuse-exact-replay-packet.md](rediffuse-exact-replay-packet.md). |
+| Gray-box `ReDiffuse` | `evidence-ready` | Upgraded 2026-05-24: PIA and NNS attacks cross-validated on two independent checkpoints. **PIA AUC=0.885 (800k), 0.875 (750k); NNS AUC=0.990 (800k), 0.989 (750k).** Self-trained 10k/100k DDPM checkpoints give chance-level AUC (~0.5), so a detectable signal is observed only at the 750k and 800k checkpoints. SecMI is consistently worse than PIA. FPR dead-zone: NNS scores force FPR to jump from 0 to ~12%, blocking low-FPR per-sample MIA. See new evidence: [cifar10-pia-nns-cross-validation-20260524.md](cifar10-pia-nns-cross-validation-20260524.md). The original supplement provides exact DDPM train/eval split manifests for CIFAR10/CIFAR100/STL10/Tiny-IN; local training and scoring scripts are now validated (`Research/scripts/train_*_pt.py`, `score_*_pia_v2.py`, `score_*_nns.py`). See [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md). |
 | Gray-box `Tracing the Roots` | `positive-provenance-limited` | OpenReview supplementary material exposes a small CIFAR10 diffusion-trajectory feature packet with fixed `1000/1000` train and `1000/1000` eval member/external tensors plus replay code. The bounded local replay gives `AUC = 0.815826`, `accuracy = 0.737500`, `TPR@1%FPR = 0.134000`, and `TPR@0.1%FPR = 0.038000`. A machine-readable candidate-only card now records the feature tensor hashes, live OpenReview/arXiv recheck, blocked claims, and reopen conditions. It is not admitted because the supplement lacks raw target checkpoint identity, raw sample IDs, and image query-response artifacts, and arXiv `2411.07449v3` source does not add a regeneration manifest. Do not expand timestep, feature-family, seed, classifier, optimizer, or regularization matrices without raw provenance/regeneration assets or a feature-packet consumer-boundary decision. See [tracing-roots-feature-packet-mia-20260515.md](tracing-roots-feature-packet-mia-20260515.md) and [../product-bridge/tracing-roots-candidate-evidence-card.md](../product-bridge/tracing-roots-candidate-evidence-card.md). |
 | Dataset-inference `CDI` official release | `hold-semantic-shift` | The official `sprintml/copyrighted_data_identification` repo is code-public and scientifically relevant because it explicitly pivots from weak pointwise MIAs to dataset inference. It is not a current automatic execution lane: the public tree has no ready small score packet, configs target local Google Drive model checkpoints plus ImageNet/COCO assets, default experiments are large (`25k`-style), and promotion would require a consumer-boundary decision separating dataset-level evidence from per-sample membership rows. Do not download CDI model folders, ImageNet, COCO, text embeddings, or submodule payloads by default. See [cdi-official-artifact-gate-20260515.md](cdi-official-artifact-gate-20260515.md). |
 | Gray-box `tri-score` | candidate-only | CDI/TMIA-DM/PIA tri-score aggregation survives CPU truth-hardening as internal Research evidence, with all three frozen packets beating admitted PIA on AUC and both low-FPR fields. It remains internal-only because the packet contract forbids headline/external use and ASR is not stable enough for the support claim. See [gray-box-triscore-consolidation-review.md](gray-box-triscore-consolidation-review.md) and [gray-box-triscore-truth-hardening-review.md](gray-box-triscore-truth-hardening-review.md). |

@@ -8,7 +8,9 @@
 
 ## Current status in one sentence
 
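-Governance update: the current work item is `Research governance cleanup`, with `active_gpu_question = none` and `next_gpu_candidate = none`. This round adds no new model experiments, releases no GPU, and performs no history rewrite; `X-180` is closed as `positive reselection / GPU hold`, and the next CPU-first research lane, `X-181 I-A / cross-box boundary maintenance after H2 comparator block`, starts only after governance completes.
+2026-05-23 update: `active_gpu_question = ReDiffuse DDPM/STL-10`; a 10k-step AMP training run is in progress (RTX 4070 Laptop, batch48, ~2.4 it/s, loss 0.041→0.038). Once training finishes, PIA scoring (denoising loss, member vs non-member) runs directly on the result to produce the first STL-10 MIA result (AUC/ASR/TPR@FPR). Tracing the Roots has been admitted as a feature-packet lane (AUC=0.816); its consumer contract is still to be written. There are no new white-box or black-box GPU tasks.
+
+Governance update: the current work item is `Research governance cleanup`, with `active_gpu_question = ReDiffuse DDPM/STL-10` and `next_gpu_candidate = none`.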
 
 The repository has moved from "keep looking for the next GPU question" to "report-driven convergence on the long-term mainline". `PIA + GSA/W-1` remains the mature mainline, but the latest real packets have narrowed near-term priorities again: the per-sample `H1/H2` of `06-g1a` both missed on the real `256` packet, and `H5` is kept only as an internal-only set-level governance fallback; `05-cross-box` completed a stronger full-overlap repeated holdout on the enlarged `GSA + PIA` matched packet and confirmed a stable tail-lift; the first bounded `H4` then landed as well, but supports only an auxiliary/cost-saver reading. `04-defense` first took the packet-scale question for the `H2 privacy-aware adapter` through a first round of minimal validation, and the `4 / 4` follow-up still shows all four baseline-vs-defended deltas at `0.0`; `X-156 / X-157 / X-158 / X-159 / X-160 / X-161 / X-162 / X-163` then advanced the new `H3 selective / suspicion-gated all-steps routing` to a real `64 / 64` GPU scout, a fixed-budget attacker scout, and a post-GPU review. The final reading is `positive but bounded / candidate-only`: in `X-162` fixed-budget selective matches the low-FPR tail of all-steps dropout (`0.031250 / 0.031250`), but the gate-leak falsifier rises to `0.046875 / 0.046875` and the oracle-route escape restores the baseline tail (`0.078125 / 0.078125`), so it cannot be promoted to a deployable defense or an admitted result. `X-168` completed the first `64 / 64` GPU scout for `01-black-box H2 strength-response`: H2 logistic reached `AUC = 0.928955 / ASR = 0.859375 / TPR@1%FPR = 0.218750 / TPR@0.1%FPR = 0.218750` and wrote a reusable response cache; the H1 response-cloud cache review in `X-170` shows AUC signal but fails at low FPR; the frequency-filter ablation in `X-171` did not falsify H2 as high-frequency-only; `X-172` completed a non-overlapping `128 / 128` GPU validation; after the `X-175` CPU stress test passed, `X-176` completed a non-overlapping `256 / 256` validation in which raw H2 logistic reached `AUC = 0.913940 / ASR = 0.851562 / TPR@1%FPR = 0.171875 / TPR@0.1%FPR = 0.062500`, with the `lowpass_0_5` secondary staying positive at `0.140625 / 0.050781`; `X-177` froze it as a strong validated candidate. `X-178` then confirmed that the same-packet admitted `recon` comparator is protocol-incompatible on X176; `X-179` further confirmed that the simple reconstruction-distance sanity comparators bundled with X176 are enough to show that H2 is not a single-step distance artifact, but these comparators are not the admitted `recon`, so no GPU is released. Currently `active GPU question = none`; `next_gpu_candidate = none`.
 

@@ -34,10 +34,10 @@
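 - Admitted black-box line: `recon`
 - Admitted gray-box line: `PIA + stochastic-dropout`
 - Admitted white-box comparator line: `GSA + DPDM W-1`
-- Active work as of 2026-05-23: Lane A metadata triage sync; ReDiffuse SD packaging-root normalization fix; the Identity-Focused Inference and RAPTA/ADMCD artifact gates are closed and marked paper-source-only; active_gpu_question = none
-- Next GPU candidate as of 2026-05-23: none selected
+- Active work as of 2026-05-23: Lane A metadata triage sync; ReDiffuse SD packaging-root normalization fix; the Identity-Focused Inference and RAPTA/ADMCD artifact gates are closed and marked paper-source-only; active_gpu_question = ReDiffuse DDPM/STL-10 (10k-step training in progress, AMP batch48, ~2.4 it/s, ETA ~65min)
+- Next GPU candidate as of 2026-05-23: STL-10 10k training is in progress; PIA scoring follows on completion and yields the first STL-10 MIA result
 - Non-gray-box GPU candidate: none selected
-- ReDiffuse remains candidate/on-hold only, because exact replay showed acceptable AUC but insufficient strict low-tail evidence.
+- ReDiffuse STL-10: in-house DDPM training is under way to obtain a scorable checkpoint; the earlier collaborator 750k bundle was not admitted because it lacked exact split alignment and strict low-tail evidence.
 - Black-box response-contract acquisition status is `needs-assets`; the repository discovery scan found no matching `Download/black-box` package.
 - Gray-box tri-score truth-hardening is closed and marked as positive-but-bounded internal evidence; it was not promoted to admitted, and no GPU was released.
 - Current CPU sidecar task: none selected; the next cycle must reselect a bounded, non-redundant task rather than extend the same tri-score contract.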