docs/evidence/reproduction-status.md
Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims.
70
70
| Black-box `H2 response-strength` | candidate-only | Positive-but-bounded DDPM/CIFAR10 candidate: frozen cutoff-0.50 lowpass follow-up passed, and raw H2 recovered strict-tail signal on the fresh packet. SD/CelebA text-to-image transfer is blocked by protocol mismatch. The frozen SD/CelebA image-to-image micro-packet is runnable, but H2 logistic does not beat the same-cache simple distance comparator, so H2 is not promoted beyond candidate-only. A separate simple-distance line now has bounded single-asset evidence: first 10/10 packet `AUC = 0.92`, non-overlapping 10/10 packet `AUC = 0.99` with 9/10 TP at 0 FP, and non-overlapping 25/25 admission packet `AUC = 0.8768`, `ASR = 0.84`, 11/25 TP at 0 FP. This is not a conditional-diffusion generalization or a `recon` product replacement. See [black-box-response-strength-preflight.md](black-box-response-strength-preflight.md), [h2-lowpass-followup-contract.md](h2-lowpass-followup-contract.md), [h2-cross-asset-contract-preflight.md](h2-cross-asset-contract-preflight.md), [h2-image-to-image-contract.md](h2-image-to-image-contract.md), [h2-img2img-micro-result.md](h2-img2img-micro-result.md), [h2-img2img-simple-distance-review.md](h2-img2img-simple-distance-review.md), [h2-img2img-simple-distance-stability-result.md](h2-img2img-simple-distance-stability-result.md), and [h2-img2img-simple-distance-admission-result.md](h2-img2img-simple-distance-admission-result.md). |
71
71
| Black-box mid-frequency same-noise residual | `candidate-only` | Distinct paper-backed observable gap: unlike H2/H3 response-cache frequency filters, this line requires `x_t`, `tilde_x_t`, timestep, noise provenance, and residual scores at the same noise level. The frozen `64/64` sign-check on the collaborator 750k checkpoint produced `AUC = 0.733398`, `ASR = 0.710938`, and finite `4/64` zero-FP recovery. The seed-only repeat retained signal with `AUC = 0.719238`, `ASR = 0.6875`, and finite `3/64` zero-FP recovery. A CPU comparator audit shows low-frequency and full-band residual comparators are at least as strong as the frozen mid-band score on AUC, so the line is candidate-stable-but-bounded but not a proven mid-frequency-specific mechanism. Same-contract GPU expansion is closed. See [midfreq-residual-comparator-audit-20260512.md](midfreq-residual-comparator-audit-20260512.md), [midfreq-residual-stability-result-20260512.md](midfreq-residual-stability-result-20260512.md), [midfreq-residual-stability-decision-20260512.md](midfreq-residual-stability-decision-20260512.md), [midfreq-residual-signcheck-20260512.md](midfreq-residual-signcheck-20260512.md), [midfreq-same-noise-residual-preflight-20260512.md](midfreq-same-noise-residual-preflight-20260512.md), [midfreq-residual-scorer-contract-20260512.md](midfreq-residual-scorer-contract-20260512.md), [midfreq-residual-collector-contract-20260512.md](midfreq-residual-collector-contract-20260512.md), [midfreq-residual-tiny-runner-contract-20260512.md](midfreq-residual-tiny-runner-contract-20260512.md), and [midfreq-residual-real-asset-preflight-20260512.md](midfreq-residual-real-asset-preflight-20260512.md). |
72
72
| Gray-box `PIA`|`evidence-ready`| Strongest admitted local DDPM/CIFAR10 gray-box line. PIA baseline exposes `epsilon-trajectory consistency`; stochastic dropout is a provisional defended comparator that weakens but does not eliminate the signal. The review is bounded to repeated-query adaptive checks with `adaptive repeats=3`; low-FPR values are finite empirical strict-tail points, not calibrated sub-percent FPR. Paper-aligned release provenance remains blocked. See [pia-stochastic-dropout-truth-hardening-review.md](pia-stochastic-dropout-truth-hardening-review.md). |
73
-
| Gray-box `ReDiffuse` | `hold-split-manifest-only` | Candidate baseline-alignment line. The collaborator 750k bundle and checkpoint are runnable, a 64/64 direct-distance compatibility packet exists, and the existing PIA 800k checkpoint is runtime-probe compatible, but prior exact replay showed only modest AUC with weak strict-tail evidence and was not admitted. The official OpenReview supplement now improves provenance by providing exact DDPM train/eval split index manifests for CIFAR10/CIFAR100/STL10/Tiny-IN, but it does not release target checkpoints, generated response/feature caches, score packets, ROC CSVs, or metric artifacts. Do not train DDPM/DiT/Stable Diffusion targets or rerun same-family attack scripts by default. See [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md), [rediffuse-resnet-parity-packet.md](rediffuse-resnet-parity-packet.md), [rediffuse-direct-distance-boundary-review.md](rediffuse-direct-distance-boundary-review.md), [rediffuse-checkpoint-portability-gate.md](rediffuse-checkpoint-portability-gate.md), [rediffuse-resnet-contract-scout.md](rediffuse-resnet-contract-scout.md), [rediffuse-exact-replay-preflight.md](rediffuse-exact-replay-preflight.md), and [rediffuse-exact-replay-packet.md](rediffuse-exact-replay-packet.md). |
73
+
| Gray-box `ReDiffuse` | `hold-split-manifest-only` | Candidate baseline-alignment line. The collaborator 750k bundle and checkpoint are runnable, a 64/64 direct-distance compatibility packet exists, and the existing PIA 800k checkpoint is runtime-probe compatible, but prior exact replay showed only modest AUC with weak strict-tail evidence and was not admitted. The official OpenReview supplement now improves provenance by providing exact DDPM train/eval split index manifests for CIFAR10/CIFAR100/STL10/Tiny-IN, but it does not release target checkpoints, generated response/feature caches, score packets, ROC CSVs, or metric artifacts. The collaborator Stable Diffusion ReDiffuse `5000`-row packet remains replayable (`AUC = 0.71031888`), but its member/nonmember labels are perfectly aligned with `LAION-5B member subset` versus `COCO2017-val non-member subset`, so it is a cross-source stress-test candidate rather than a same-distribution second asset. Do not train DDPM/DiT/Stable Diffusion targets, request `coco_data`, download Stable Diffusion weights, or rerun same-family attack scripts by default. See [stable-diffusion-rediffuse-collaborator-artifact-20260517.md](stable-diffusion-rediffuse-collaborator-artifact-20260517.md), [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md), [rediffuse-resnet-parity-packet.md](rediffuse-resnet-parity-packet.md), [rediffuse-direct-distance-boundary-review.md](rediffuse-direct-distance-boundary-review.md), [rediffuse-checkpoint-portability-gate.md](rediffuse-checkpoint-portability-gate.md), [rediffuse-resnet-contract-scout.md](rediffuse-resnet-contract-scout.md), [rediffuse-exact-replay-preflight.md](rediffuse-exact-replay-preflight.md), and [rediffuse-exact-replay-packet.md](rediffuse-exact-replay-packet.md). |
74
74
| Gray-box `Tracing the Roots` | `positive-provenance-limited` | OpenReview supplementary material exposes a small CIFAR10 diffusion-trajectory feature packet with fixed `1000/1000` train and `1000/1000` eval member/external tensors plus replay code. The bounded local replay gives `AUC = 0.815826`, `accuracy = 0.737500`, `TPR@1%FPR = 0.134000`, and `TPR@0.1%FPR = 0.038000`. A machine-readable candidate-only card now records the feature tensor hashes, live OpenReview/arXiv recheck, blocked claims, and reopen conditions. It is not admitted because the supplement lacks raw target checkpoint identity, raw sample IDs, and image query-response artifacts, and arXiv `2411.07449v3` source does not add a regeneration manifest. Do not expand timestep, feature-family, seed, classifier, optimizer, or regularization matrices without raw provenance/regeneration assets or a feature-packet consumer-boundary decision. See [tracing-roots-feature-packet-mia-20260515.md](tracing-roots-feature-packet-mia-20260515.md) and [../product-bridge/tracing-roots-candidate-evidence-card.md](../product-bridge/tracing-roots-candidate-evidence-card.md). |
75
75
| Dataset-inference `CDI` official release |`hold-semantic-shift`| The official `sprintml/copyrighted_data_identification` repo is code-public and scientifically relevant because it explicitly pivots from weak pointwise MIAs to dataset inference. It is not a current automatic execution lane: the public tree has no ready small score packet, configs target local Google Drive model checkpoints plus ImageNet/COCO assets, default experiments are large (`25k`-style), and promotion would require a consumer-boundary decision separating dataset-level evidence from per-sample membership rows. Do not download CDI model folders, ImageNet, COCO, text embeddings, or submodule payloads by default. See [cdi-official-artifact-gate-20260515.md](cdi-official-artifact-gate-20260515.md). |
76
76
| Gray-box `tri-score`| candidate-only | CDI/TMIA-DM/PIA tri-score aggregation survives CPU truth-hardening as internal Research evidence, with all three frozen packets beating admitted PIA on AUC and both low-FPR fields. It remains internal-only because the packet contract forbids headline/external use and ASR is not stable enough for the support claim. See [gray-box-triscore-consolidation-review.md](gray-box-triscore-consolidation-review.md) and [gray-box-triscore-truth-hardening-review.md](gray-box-triscore-truth-hardening-review.md). |
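
The rows above report `AUC`, `TPR@x%FPR`, and counts such as "9/10 TP at 0 FP". As a reading aid, the sketch below shows one plausible way such figures could be computed from two lists of per-sample scores. It is not the repository's scorer: the function names are hypothetical, and it assumes that a higher score means "more member-like" (a distance-based score would need its sign flipped first).

```python
from typing import Sequence


def auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Mann-Whitney AUC: chance that a random member outscores a random nonmember.

    Ties count as half a win. Assumes higher score = more member-like.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def tp_at_zero_fp(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> int:
    """Number of members that score strictly above every nonmember."""
    threshold = max(nonmember_scores)
    return sum(1 for m in member_scores if m > threshold)


def tpr_at_fpr(
    member_scores: Sequence[float],
    nonmember_scores: Sequence[float],
    max_fpr: float,
) -> float:
    """Empirical TPR at the loosest threshold whose FPR does not exceed max_fpr."""
    # Number of false positives we may tolerate (floored, so the bound is conservative).
    allowed_fp = int(max_fpr * len(nonmember_scores))
    ranked = sorted(nonmember_scores, reverse=True)
    if allowed_fp >= len(ranked):
        return 1.0
    # Predict "member" only for scores strictly above this nonmember score,
    # which admits at most `allowed_fp` nonmembers.
    threshold = ranked[allowed_fp]
    hits = sum(1 for m in member_scores if m > threshold)
    return hits / len(member_scores)


if __name__ == "__main__":
    # Toy scores for illustration only; not taken from any packet in the table.
    members = [0.9, 0.8, 0.7, 0.4]
    nonmembers = [0.6, 0.5, 0.3, 0.2]
    print(auc(members, nonmembers))
    print(tp_at_zero_fp(members, nonmembers))
    print(tpr_at_fpr(members, nonmembers, 0.25))
```

The zero-FP count also shows why the table calls low-FPR values "finite empirical strict-tail points". With only 64 nonmembers, observing zero false positives says only that the empirical FPR is below 1/64 (about 1.6%), which is not a calibrated sub-percent FPR.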