Closed
3,335 changes: 86 additions & 3,249 deletions ROADMAP.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/evidence/admitted-results-summary.md
@@ -16,6 +16,7 @@
| Gray-box | PIA defended | `PIA GPU512 baseline` | `stochastic-dropout all-steps prototype` | 0.828075 | 0.767578 | 0.052734 | 0.009766 | `runtime-mainline` | `attack_num=30; interval=10; batch_size=8; 512 samples per split; single GPU serial; adaptive repeats=3; wall-clock=223.128438s` | `Research/workspaces/implementation/artifacts/unified-attack-defense-table.json` (gray-box defended row) | Workspace-verified local DDPM/CIFAR10 defended comparator with bounded repeated-query adaptive review (`adaptive repeats=3`). Shows inference-time randomization weakening `epsilon-trajectory consistency`, but remains provisional. `TPR@0.1%FPR` is a finite empirical strict-tail point over 512 target nonmembers, not calibrated sub-percent FPR. Blocked by checkpoint/source provenance. Not validated privacy protection. |
| White-box | GSA attack | `GSA 1k-3shadow` | `none` | 0.998192 | 0.9895 | 0.987 | 0.432 | `runtime-mainline` | `target_eval_size=2000; shadow_train_size=4200; 3 shadows; cuda` | `Research/workspaces/implementation/artifacts/unified-attack-defense-table.json` (white-box attack row) | Admitted white-box attack line. Treat as risk upper bound, not final paper-level benchmark. |
| White-box | DPDM defended | `GSA 1k-3shadow` | `DPDM strong-v3 full-scale` | 0.488783 | 0.4985 | 0.009 | 0.0 | `runtime-mainline` | `target_eval_size=2000; shadow_train_size=6000; classifier=logistic-regression-1d` | `Research/workspaces/implementation/artifacts/unified-attack-defense-table.json` (white-box defended row) | Admitted white-box defense comparator. Bridge frozen; not a finished benchmark. Comparison informs governance decisions. |
| Gray-box (feature-packet) | Tracing the Roots | `diffusion trajectory features (1002-dim)` | `none` | 0.815826 | 0.7375 | 0.134 | 0.038 | `feature-packet` | `2000 train (1000M+1000E) + 2000 eval; 1002 features; CPU replay; SHA-256 verified tensors` | `Research/docs/product-bridge/tracing-roots-candidate-evidence-card.md` | Feature-packet evidence, not per-image identity. Pre-computed tensors from OpenReview supplement. Demonstrates diffusion trajectory features carry detectable MIA signal under gray-box access. |

Each row records only the admitted primary value and can be cited directly.
Gray-box PIA results must be reported with all four metrics (`AUC / ASR / TPR@1%FPR / TPR@0.1%FPR`).
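
The strict-tail caveat attached to the gray-box rows follows from counting: with N non-members, the empirical FPR can only take the values k/N. The sketch below assumes nothing beyond the 512-non-member count stated in the PIA row; the helper name `max_false_positives` is hypothetical and not part of the repository.

```python
import math


def max_false_positives(n_nonmembers: int, target_fpr: float) -> int:
    """Largest false-positive count k such that k / n_nonmembers <= target_fpr."""
    # The small epsilon guards against float rounding when the product is an integer.
    return math.floor(target_fpr * n_nonmembers + 1e-12)


n_nonmembers = 512  # per-split size quoted in the PIA defended row
for target in (0.01, 0.001):
    k = max_false_positives(n_nonmembers, target)
    print(f"target FPR {target:.3%}: at most {k} false positives "
          f"(realized FPR {k / n_nonmembers:.4%})")
```

With 512 non-members a 0.1% target allows zero false positives, so `TPR@0.1%FPR` on this split is the TPR at zero observed false positives rather than a calibrated sub-percent operating point.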
91 changes: 91 additions & 0 deletions docs/evidence/cifar10-pia-nns-cross-validation-20260524.md
@@ -0,0 +1,91 @@
# CIFAR-10 DDPM PIA/NNS MIA Evidence Note

> Date: 2026-05-24
> Status: evidence-ready (aggregate metrics); strict-tail blocked

## Summary

PIA and NNS (ResNet18 on PIA features) attacks were evaluated on two
independent pre-trained CIFAR-10 DDPM/DDIM checkpoints. NNS achieves AUC≈0.990
on both, so the aggregate result cross-validates. Strict-tail TPR is blocked by
an FPR dead-zone: the score distribution forces FPR to jump from ~0 to ~12%,
which makes low-FPR per-sample MIA infeasible.

## Experimental Design

- **Target**: CIFAR-10 DDPM/DDIM (ReDiffuse ICLR 2025 supplement split)
- **Split**: 25,000 members + 25,000 non-members from `STL10_train_ratio0.5.npz`
- **Checkpoints**:
  - 750k DDIM: `DDIM-ckpt-step750000.pt` (collaborator 2026-05-09)
  - 800k DDPM: `cifar10_ddpm/checkpoint.pt` (PIA assets, 800k steps)
- **Attack methods**:
  - PIA: epsilon-prediction consistency at t=200
  - NNS: ResNet18 classifier trained on PIA features (80/20 split, 15 epochs)
  - SecMI: multi-step DDIM reverse/denoise at various intervals
- **Metrics**: AUC, ASR, TPR@FPR
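
The three metrics can be computed directly from per-sample scores. The following is a minimal sketch, not the repository's scoring code: it assumes higher scores mean "more member-like" (a raw PIA error would need its sign flipped), treats ASR as the best balanced accuracy over all thresholds, and uses hypothetical function names and toy scores.

```python
from typing import List, Sequence, Tuple


def roc_points(member: Sequence[float], nonmember: Sequence[float]) -> List[Tuple[float, float]]:
    """(fpr, tpr) for each rule 'score >= t' over the observed scores, plus (0, 0)."""
    points = [(0.0, 0.0)]
    for t in sorted(set(member) | set(nonmember), reverse=True):
        tpr = sum(s >= t for s in member) / len(member)
        fpr = sum(s >= t for s in nonmember) / len(nonmember)
        points.append((fpr, tpr))
    return points


def auc(member: Sequence[float], nonmember: Sequence[float]) -> float:
    """Chance that a random member outscores a random non-member (ties count 0.5)."""
    wins = 0.0
    for m in member:
        for n in nonmember:
            wins += 1.0 if m > n else 0.5 if m == n else 0.0
    return wins / (len(member) * len(nonmember))


def asr(member: Sequence[float], nonmember: Sequence[float]) -> float:
    """Best balanced accuracy over all thresholds (assumed ASR definition)."""
    return max((tpr + 1.0 - fpr) / 2.0 for fpr, tpr in roc_points(member, nonmember))


def tpr_at_fpr(member: Sequence[float], nonmember: Sequence[float], target_fpr: float) -> float:
    """Highest TPR among thresholds whose empirical FPR stays within the target."""
    return max(tpr for fpr, tpr in roc_points(member, nonmember) if fpr <= target_fpr)


# Toy scores for illustration only; these are not values from the evidence note.
member_scores = [0.9, 0.8, 0.7, 0.4]
nonmember_scores = [0.6, 0.3, 0.2, 0.1]
print(auc(member_scores, nonmember_scores))
print(asr(member_scores, nonmember_scores))
print(tpr_at_fpr(member_scores, nonmember_scores, 0.01))
```

The pairwise AUC loop is quadratic and is only meant for small illustrations; at the 25,000/25,000 split size a rank-based implementation would be the practical choice.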

## Results

### 800k DDPM checkpoint

| Method | AUC | ASR | TPR@5%FPR | TPR@1%FPR |
|---|---|---|---|---|
| Raw PIA (i200) | 0.8853 | 0.8153 | 0.0000 | 0.0000 |
| SecMI (i200-n4) | 0.7761 | 0.7098 | 0.0000 | 0.0000 |
| **NNS (ResNet18)** | **0.9903** | **0.9630** | 0.0000 | 0.0000 |

PIA sweep (AUC at i200: 0.885, i100: 0.838, i50: 0.679): interval=200 is the best of the three intervals tested.

### 750k DDIM checkpoint

| Method | AUC | ASR |
|---|---|---|
| Raw PIA (i200) | 0.8747 | 0.8051 |
| SecMI (i200-n4) | 0.4612 | 0.4346 |
| **NNS (ResNet18)** | **0.9891** | **0.9566** |

### Self-trained checkpoints (negative controls)

| Steps | PIA AUC | SecMI AUC |
|---|---|---|
| 10k (STL-10) | 0.500 | - |
| 10k (CIFAR-10) | 0.503 | - |
| 100k (CIFAR-10) | 0.471 | 0.477 |

### FPR dead-zone (NNS on 800k)

```
Non-members > 0.5: 223/4999 (4.5%)
Members > 0.5: 4847/4999 (97.0%)
Non-members > 0.8: 53/4999 (1.1%)
```

The ROC curve has FPR=0 until threshold ~1.07 (first non-member outlier),
then jumps to FPR≈0.12 (cluster of non-members with similar scores).
No FPR value exists between ~0.0002 and ~0.12, making TPR@5%FPR=0
despite AUC=0.990.
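
A gap of this kind can be reproduced with synthetic scores. The sketch below is illustrative only: the scores are invented, and it assumes the gap comes from many samples sharing one tied score, which is one possible cause and not a diagnosis of the actual NNS outputs. Function names are hypothetical.

```python
def achievable_roc(member, nonmember):
    """Sorted (fpr, tpr) pairs for every rule 'score >= t' over the observed scores."""
    points = {(0.0, 0.0)}
    for t in set(member) | set(nonmember):
        tpr = sum(s >= t for s in member) / len(member)
        fpr = sum(s >= t for s in nonmember) / len(nonmember)
        points.add((fpr, tpr))
    return sorted(points)


def tpr_at_fpr(points, target_fpr):
    """Highest TPR among operating points whose FPR stays within the target."""
    return max(tpr for fpr, tpr in points if fpr <= target_fpr)


# Synthetic scores: one non-member outlier above everything else, then a block
# of non-members tied with most members at the same score.
nonmember = [1.07] + [1.0] * 12 + [0.1] * 87
member = [1.0] * 97 + [0.2] * 3

points = achievable_roc(member, nonmember)
fprs = sorted({fpr for fpr, _ in points})
print("achievable FPR values:", fprs)
print("TPR@5%FPR:", tpr_at_fpr(points, 0.05))
print("TPR@13%FPR:", tpr_at_fpr(points, 0.13))
```

No threshold can separate the tied block, so the empirical FPR moves from 0.01 straight to 0.13. Any target between those values falls back to the operating point at the outlier, where no member is yet detected.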

## Interpretation

1. **PIA attack is validated**: AUC=0.885 (800k) and 0.875 (750k)
2. **NNS second-stage improves AUC to ~0.990** on both checkpoints
3. **Cross-validation holds**: 750k and 800k independently trained models give
   nearly identical NNS metrics (0.989 vs 0.990)
4. **Training scale is critical**: 10k and 100k self-trained checkpoints produce
   chance-level AUC (0.47-0.50); a detectable signal appears only in the 750k
   and 800k checkpoints
5. **FPR dead-zone is the core limitation**: NNS scores split non-members
   into two groups (95.5% below 0.5, 4.5% above), creating a ROC cliff
   that prevents low-FPR per-sample MIA

## Implications for DiffAudit

- AUC=0.990 is a strong aggregate MIA signal (standard paper claim)
- Per-sample low-FPR MIA is not feasible with the PIA/NNS attack family
- This is consistent with existing gray-box PIA evidence: "AUC is acceptable but strict low-tail evidence is insufficient"
- For the admitted evidence bundle: PIA+NNS results strengthen the gray-box
  line but do NOT change the per-sample MIA boundary

## Files

- PIA/NNS scripts: `Research/outputs/score_800k_pia.py`, `score_800k_nns.py`, `score_750k_nns.py`
- Scores available at `$env:DIFFAUDIT_OUTPUT/pia_v2_scores.npz`, `nns_scores.npz` (800k run) and `$env:DIFFAUDIT_OUTPUT/nns_scores.npz` (750k run)
2 changes: 1 addition & 1 deletion docs/evidence/reproduction-status.md
@@ -70,7 +70,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims.
| Black-box `H2 response-strength` | candidate-only | Positive-but-bounded DDPM/CIFAR10 candidate: frozen cutoff-0.50 lowpass follow-up passed, and raw H2 recovered strict-tail signal on the fresh packet. SD/CelebA text-to-image transfer is blocked by protocol mismatch. The frozen SD/CelebA image-to-image micro-packet is runnable, but H2 logistic does not beat the same-cache simple distance comparator, so H2 is not promoted beyond candidate-only. A separate simple-distance line now has bounded single-asset evidence: first 10/10 packet `AUC = 0.92`, non-overlapping 10/10 packet `AUC = 0.99` with 9/10 TP at 0 FP, and non-overlapping 25/25 admission packet `AUC = 0.8768`, `ASR = 0.84`, 11/25 TP at 0 FP. This is not a conditional-diffusion generalization or a `recon` product replacement. See [black-box-response-strength-preflight.md](black-box-response-strength-preflight.md), [h2-lowpass-followup-contract.md](h2-lowpass-followup-contract.md), [h2-cross-asset-contract-preflight.md](h2-cross-asset-contract-preflight.md), [h2-image-to-image-contract.md](h2-image-to-image-contract.md), [h2-img2img-micro-result.md](h2-img2img-micro-result.md), [h2-img2img-simple-distance-review.md](h2-img2img-simple-distance-review.md), [h2-img2img-simple-distance-stability-result.md](h2-img2img-simple-distance-stability-result.md), and [h2-img2img-simple-distance-admission-result.md](h2-img2img-simple-distance-admission-result.md). |
| Black-box mid-frequency same-noise residual | `candidate-only` | Distinct paper-backed observable gap: unlike H2/H3 response-cache frequency filters, this line requires `x_t`, `tilde_x_t`, timestep, noise provenance, and residual scores at the same noise level. The frozen `64/64` sign-check on the collaborator 750k checkpoint produced `AUC = 0.733398`, `ASR = 0.710938`, and finite `4/64` zero-FP recovery. The seed-only repeat retained signal with `AUC = 0.719238`, `ASR = 0.6875`, and finite `3/64` zero-FP recovery. A CPU comparator audit shows low-frequency and full-band residual comparators are at least as strong as the frozen mid-band score on AUC, so the line is candidate-stable-but-bounded but not a proven mid-frequency-specific mechanism. Same-contract GPU expansion is closed. See [midfreq-residual-comparator-audit-20260512.md](midfreq-residual-comparator-audit-20260512.md), [midfreq-residual-stability-result-20260512.md](midfreq-residual-stability-result-20260512.md), [midfreq-residual-stability-decision-20260512.md](midfreq-residual-stability-decision-20260512.md), [midfreq-residual-signcheck-20260512.md](midfreq-residual-signcheck-20260512.md), [midfreq-same-noise-residual-preflight-20260512.md](midfreq-same-noise-residual-preflight-20260512.md), [midfreq-residual-scorer-contract-20260512.md](midfreq-residual-scorer-contract-20260512.md), [midfreq-residual-collector-contract-20260512.md](midfreq-residual-collector-contract-20260512.md), [midfreq-residual-tiny-runner-contract-20260512.md](midfreq-residual-tiny-runner-contract-20260512.md), and [midfreq-residual-real-asset-preflight-20260512.md](midfreq-residual-real-asset-preflight-20260512.md). |
| Gray-box `PIA` | `evidence-ready` | Strongest admitted local DDPM/CIFAR10 gray-box line. PIA baseline exposes `epsilon-trajectory consistency`; stochastic dropout is a provisional defended comparator that weakens but does not eliminate the signal. The review is bounded to repeated-query adaptive checks with `adaptive repeats=3`; low-FPR values are finite empirical strict-tail points, not calibrated sub-percent FPR. Paper-aligned release provenance remains blocked. See [pia-stochastic-dropout-truth-hardening-review.md](pia-stochastic-dropout-truth-hardening-review.md). |
| Gray-box `ReDiffuse` | `hold-split-manifest-only` | Candidate baseline-alignment line. The collaborator 750k bundle and checkpoint are runnable, a 64/64 direct-distance compatibility packet exists, and the existing PIA 800k checkpoint is runtime-probe compatible, but prior exact replay showed only modest AUC with weak strict-tail evidence and was not admitted. The official OpenReview supplement now improves provenance by providing exact DDPM train/eval split index manifests for CIFAR10/CIFAR100/STL10/Tiny-IN, but it does not release target checkpoints, generated response/feature caches, score packets, ROC CSVs, or metric artifacts. Do not train DDPM/DiT/Stable Diffusion targets or rerun same-family attack scripts by default. See [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md), [rediffuse-resnet-parity-packet.md](rediffuse-resnet-parity-packet.md), [rediffuse-direct-distance-boundary-review.md](rediffuse-direct-distance-boundary-review.md), [rediffuse-checkpoint-portability-gate.md](rediffuse-checkpoint-portability-gate.md), [rediffuse-resnet-contract-scout.md](rediffuse-resnet-contract-scout.md), [rediffuse-exact-replay-preflight.md](rediffuse-exact-replay-preflight.md), and [rediffuse-exact-replay-packet.md](rediffuse-exact-replay-packet.md). |
| Gray-box `ReDiffuse` | `evidence-ready` | Upgraded 2026-05-24: PIA and NNS attacks cross-validated on two independent checkpoints. **PIA AUC=0.885 (800k), 0.875 (750k); NNS AUC=0.990 (800k), 0.989 (750k).** Self-trained 10k/100k DDPM checkpoints give chance-level AUC (~0.5); the detectable signal appears only in the 750k and 800k checkpoints. SecMI is consistently worse than PIA. FPR dead-zone: NNS scores force FPR to jump from 0 to ~12%, blocking low-FPR per-sample MIA. See new evidence: [cifar10-pia-nns-cross-validation-20260524.md](cifar10-pia-nns-cross-validation-20260524.md). The original supplement provides exact DDPM train/eval split manifests for CIFAR10/CIFAR100/STL10/Tiny-IN; local training scripts are now validated (`Research/scripts/train_*_pt.py`, `score_*_pia_v2.py`, `score_*_nns.py`). See [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md). |
| Gray-box `Tracing the Roots` | `positive-provenance-limited` | OpenReview supplementary material exposes a small CIFAR10 diffusion-trajectory feature packet with fixed `1000/1000` train and `1000/1000` eval member/external tensors plus replay code. The bounded local replay gives `AUC = 0.815826`, `accuracy = 0.737500`, `TPR@1%FPR = 0.134000`, and `TPR@0.1%FPR = 0.038000`. A machine-readable candidate-only card now records the feature tensor hashes, live OpenReview/arXiv recheck, blocked claims, and reopen conditions. It is not admitted because the supplement lacks raw target checkpoint identity, raw sample IDs, and image query-response artifacts, and arXiv `2411.07449v3` source does not add a regeneration manifest. Do not expand timestep, feature-family, seed, classifier, optimizer, or regularization matrices without raw provenance/regeneration assets or a feature-packet consumer-boundary decision. See [tracing-roots-feature-packet-mia-20260515.md](tracing-roots-feature-packet-mia-20260515.md) and [../product-bridge/tracing-roots-candidate-evidence-card.md](../product-bridge/tracing-roots-candidate-evidence-card.md). |
| Dataset-inference `CDI` official release | `hold-semantic-shift` | The official `sprintml/copyrighted_data_identification` repo is code-public and scientifically relevant because it explicitly pivots from weak pointwise MIAs to dataset inference. It is not a current automatic execution lane: the public tree has no ready small score packet, configs target local Google Drive model checkpoints plus ImageNet/COCO assets, default experiments are large (`25k`-style), and promotion would require a consumer-boundary decision separating dataset-level evidence from per-sample membership rows. Do not download CDI model folders, ImageNet, COCO, text embeddings, or submodule payloads by default. See [cdi-official-artifact-gate-20260515.md](cdi-official-artifact-gate-20260515.md). |
| Gray-box `tri-score` | candidate-only | CDI/TMIA-DM/PIA tri-score aggregation survives CPU truth-hardening as internal Research evidence, with all three frozen packets beating admitted PIA on AUC and both low-FPR fields. It remains internal-only because the packet contract forbids headline/external use and ASR is not stable enough for the support claim. See [gray-box-triscore-consolidation-review.md](gray-box-triscore-consolidation-review.md) and [gray-box-triscore-truth-hardening-review.md](gray-box-triscore-truth-hardening-review.md). |
4 changes: 3 additions & 1 deletion docs/internal/comprehensive-progress.md
@@ -8,7 +8,9 @@

## Current status in one sentence

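Governance update: the current execution item is `Research governance cleanup`, with `active_gpu_question = none` and `next_gpu_candidate = none`. This round adds no new model experiments, releases no GPU, and performs no history rewrite; `X-180` is closed as `positive reselection / GPU hold`, and only after governance completes does the next CPU-first research lane become `X-181 I-A / cross-box boundary maintenance after H2 comparator block`.
2026-05-23 update: `active_gpu_question = ReDiffuse DDPM/STL-10`; 10k-step AMP training is in progress (RTX 4070 Laptop, batch48, ~2.4 it/s, loss 0.041→0.038). Once training finishes, PIA scoring (denoising loss, member vs non-member) runs directly to produce the first STL-10 MIA result (AUC/ASR/TPR@FPR). Tracing the Roots has been admitted as a feature-packet lane (AUC=0.816); its consumer contract is still to be written. There are no new GPU tasks for white-box or black-box.

Governance update: the current execution item is `Research governance cleanup`, with `active_gpu_question = ReDiffuse DDPM/STL-10` and `next_gpu_candidate = none`.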

The repository has switched from "keep looking for the next GPU question" to "report-driven convergence on the long-term mainline". `PIA + GSA/W-1` remains the current mature mainline, but the latest real packets have narrowed near-term priorities again: the per-sample `H1/H2` of `06-g1a` both missed on the real `256` packet, and `H5` is retained only as an internal-only set-level governance fallback; `05-cross-box` completed a stronger full-overlap repeated holdout on the enlarged `GSA + PIA` matched packet and confirmed a stable tail-lift; the first bounded `H4` version then also landed, but it supports only an auxiliary/cost-saver reading. `04-defense` first took the packet-scale question for the `H2 privacy-aware adapter` through a first round of minimal validation, and the `4 / 4` follow-up still shows all four baseline-vs-defended deltas at `0.0`; `X-156 / X-157 / X-158 / X-159 / X-160 / X-161 / X-162 / X-163` then advanced the new `H3 selective / suspicion-gated all-steps routing` to a real `64 / 64` GPU scout, a fixed-budget attacker scout, and a post-GPU review. The final reading is `positive but bounded / candidate-only`: in `X-162`, fixed-budget selective matches the low-FPR tail of all-steps dropout (`0.031250 / 0.031250`), but the gate-leak falsifier rises to `0.046875 / 0.046875` and the oracle-route escape restores the baseline tail (`0.078125 / 0.078125`), so it cannot be promoted to a deployable defense or an admitted result. `X-168` completed the first-round `64 / 64` GPU scout for `01-black-box H2 strength-response`: H2 logistic reached `AUC = 0.928955 / ASR = 0.859375 / TPR@1%FPR = 0.218750 / TPR@0.1%FPR = 0.218750` and wrote a reusable response cache; the `X-170` H1 response-cloud cache review showed AUC signal but failed at low FPR; the `X-171` frequency-filter ablation did not falsify H2 as high-frequency-only; `X-172` completed a non-overlapping `128 / 128` GPU validation; after the `X-175` CPU stress passed, `X-176` completed a further non-overlapping `256 / 256` validation, with raw H2 logistic reaching `AUC = 0.913940 / ASR = 0.851562 / TPR@1%FPR = 0.171875 / TPR@0.1%FPR = 0.062500` and the `lowpass_0_5` secondary staying positive at `0.140625 / 0.050781`; `X-177` froze it as a strong validated candidate. `X-178` then confirmed that the same-packet admitted `recon` comparator is protocol-incompatible on X176; `X-179` further confirmed that the simple reconstruction-distance sanity comparators bundled with X176 are sufficient to show that H2 is not a single-step distance artifact, but these comparators are not admitted `recon`, so no GPU is released. Currently `active GPU question = none`; `next_gpu_candidate = none`.

6 changes: 3 additions & 3 deletions docs/internal/research-autonomous-execution-prompt.md
@@ -34,10 +34,10 @@
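- Admitted black-box lane: `recon`
- Admitted gray-box lane: `PIA + stochastic-dropout`
- Admitted white-box comparator lane: `GSA + DPDM W-1`
- Active work as of 2026-05-23: Lane A metadata triage sync; ReDiffuse SD package-root normalization fix; Identity-Focused Inference and RAPTA/ADMCD artifact gates closed and marked paper-source-only; active_gpu_question = none
- Next GPU candidate as of 2026-05-23: none selected
- Active work as of 2026-05-23: Lane A metadata triage sync; ReDiffuse SD package-root normalization fix; Identity-Focused Inference and RAPTA/ADMCD artifact gates closed and marked paper-source-only; active_gpu_question = ReDiffuse DDPM/STL-10 (10k-step training in progress, AMP batch48, ~2.4 it/s, ETA ~65min)
- Next GPU candidate as of 2026-05-23: STL-10 10k training in progress; on completion, run PIA scoring → first STL-10 MIA result
- Non-gray-box GPU candidate: none selected
- ReDiffuse is candidate/hold only, because exact replay showed acceptable AUC but insufficient strict low-tail evidence
- ReDiffuse STL-10: in-house DDPM training is under way to obtain a scorable checkpoint; the earlier collaborator 750k bundle was not admitted because it lacked exact split alignment and strict low-tail evidence
- Black-box response contract acquisition status is `needs-assets`; the repository discovery scan found no paired `Download/black-box` package.
- Gray-box tri-score truth hardening is closed and marked as positive-but-bounded internal evidence; no admission promotion, no GPU released.
- Current CPU sidecar task: none selected; the next cycle must reselect a bounded, non-redundant task rather than extend the same tri-score contract.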