diff --git a/.gitignore b/.gitignore index e43efbb7..9215e577 100644 --- a/.gitignore +++ b/.gitignore @@ -142,6 +142,9 @@ workspaces/**/artifacts/** !workspaces/black-box/artifacts/copymark-commoncanvas-response-contract-probe-20260512.json !workspaces/black-box/artifacts/copymark-commoncanvas-multiseed-stability-20260513.json !workspaces/black-box/artifacts/commoncanvas-denoising-loss-20260513.json +!workspaces/black-box/artifacts/h2-output-cloud-geometry-20260525.json +!workspaces/black-box/artifacts/h2-output-cloud-geometry-seed177-20260525.json +!workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json !workspaces/black-box/artifacts/beans-lora-member-denoising-loss-scout-20260513.json !workspaces/black-box/artifacts/clid-image-identity-boundary-20260511.json !workspaces/black-box/artifacts/midfreq-same-noise-residual-cache-audit-20260512.json diff --git a/AGENTS.md b/AGENTS.md index f2cbab0c..76ddda65 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -28,7 +28,7 @@ Do not start from memory or old chat context. Re-anchor on repository files. ## Current Operating State -- Active work: `2026-05-25 feature-packet channel consumer verdict is the latest consumer-boundary update. Tracing the Roots remains positive Research-side feature-packet evidence (AUC = 0.815826, TPR@1%FPR = 0.134000), but the Platform/Runtime feature-packet channel is deferred because the public surface still has only one singleton feature tensor packet, no second non-source-equivalent public feature-packet, and no raw target checkpoint / raw sample manifest / feature-regeneration assets. Do not create feature-packet schema, bundle export, validators, tests, Platform UI types, or Runtime runners from this singleton. active_gpu_question = none; next_gpu_candidate = none; CPU sidecar = none selected after feature-packet channel consumer verdict. LeakyCLIP remains CLIP / multimodal privacy watch-plus, not a second diffusion asset. 
ReDiffuse DDPM/STL-10 remains closed by default after the weak bounded scout (AUC = 0.4996337890625) and weak SimA-style score-norm scorer (AUC = 0.5052947998046875).` +- Active work: `2026-05-25 H2 output-cloud geometry cache review is the latest metric verdict. It is a strong Research-side candidate on the existing H2 response cache (seed 176 logistic AUC = 0.961529, TPR@1%FPR = 0.333984, TPR@0.1%FPR = 0.117188; seed 177 AUC = 0.961048; label-shuffle AUC = 0.507595), but it is not admitted because the source cache used class-ordered sample offsets and needs a reseeded or interleaved order-control cache before any promotion. Do not create Platform/Runtime schema, bundle export, UI type, runner, KDE/shadow-density/repeat-count sweeps, or same-cache feature sweeps from this result. active_gpu_question = none; next_gpu_candidate = none; CPU sidecar = none selected after H2 output-cloud cache review. Feature-packet consumer lane remains deferred. LeakyCLIP remains CLIP / multimodal privacy watch-plus. 
ReDiffuse DDPM/STL-10 remains closed by default after the weak bounded scout (AUC = 0.4996337890625) and weak SimA-style score-norm scorer (AUC = 0.5052947998046875).` - Next GPU candidate: none selected - Long-horizon control: follow `ROADMAP.md` section `Long-Horizon Research Task Board(2026-05-13 起)` before reopening any diff --git a/ROADMAP.md b/ROADMAP.md index 4fd7b8fa..23b6b9ef 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -2,6 +2,33 @@ > Last updated: 2026-05-25 +## 2026-05-25 H2 output-cloud geometry candidate signal + +Latest decision: the existing `512 / 512` response cache for H2 response-strength exposes a strong +output-output geometry candidate signal, but until order-control passes it is not promoted, not released for product consumption, +and not extended with same-cache feature engineering. This review only reads the existing +`workspaces/black-box/runs/h2-response-strength-512-20260501-r1/response-cache.npz`; +it generated no new responses, downloaded no assets, and ran no GPU. + +The scorer deliberately excludes seed-to-output distance and uses only RMSE between repeats at the same timestep, +centroid RMSE across timesteps, and response-cloud Gram/PCA features. The main result is +`AUC = 0.961529`, `ASR = 0.900391`, `TPR@1%FPR = 0.333984`, +`TPR@0.1%FPR = 0.117188`, an improvement over raw H2 logistic of +`AUC +0.055836`, `TPR@1%FPR +0.199218`, `TPR@0.1%FPR +0.117188`. +Seed `177` stability holds at `AUC = 0.961048`, `TPR@1%FPR = 0.353516`, +`TPR@0.1%FPR = 0.130859`; label-shuffle sanity returns to random level, +`AUC = 0.507595`. + +Key caveat: when the source cache was generated, the member side used `sample_offset = 0` and the nonmember side used +`sample_offset = len(member_indices)`. Output-output geometry is sensitive to sampling seeds and response-cloud shape, +so the current strong signal may include a class-ordered sampling effect. The result can only stand as a +strong Research-side candidate; the next re-evaluation can only be one bounded reseeded / interleaved +order-control response-cache scout. The current slots remain: +`active_gpu_question = none`, `next_gpu_candidate = none`, +`CPU sidecar = none selected after H2 output-cloud cache review`. +See +[docs/evidence/h2-output-cloud-geometry-20260525.md](docs/evidence/h2-output-cloud-geometry-20260525.md). + ## 2026-05-25 Feature-Packet channel consumer verdict Latest decision: do not open, on 2026-05-25 for the Tracing the Roots singleton, a Platform/Runtime diff --git a/docs/evidence/h2-output-cloud-geometry-20260525.md b/docs/evidence/h2-output-cloud-geometry-20260525.md new file mode 100644 
index 00000000..3a2b7746 --- /dev/null +++ b/docs/evidence/h2-output-cloud-geometry-20260525.md @@ -0,0 +1,127 @@ +# H2 Output-Cloud Geometry Cache Review + +> Date: 2026-05-25 +> Status: candidate complementary signal / CPU-only cache review / order-control required before promotion / no GPU release / no admitted row + +## Question + +On the existing H2 response-strength cache, does the geometric structure among outputs carry a membership signal distinct from +seed-to-output distance? + +This round reuses only the existing +`workspaces/black-box/runs/h2-response-strength-512-20260501-r1/response-cache.npz`. +It generated no new responses, downloaded no assets, ran no GPU, and did not extend the same route with KDE, shadow +density, repeat-count, or feature sweeps. + +## Contract + +Script: +`scripts/review_h2_output_cloud_geometry.py` + +Input cache: + +| Field | Value | +| --- | ---: | +| Samples | `1024` | +| Members | `512` | +| Nonmembers | `512` | +| Timesteps | `40 / 80 / 120 / 160` | +| Repeats per timestep | `2` | +| Response shape | `[1024, 4, 2, 3, 32, 32]` | + +Features use only output-output geometry: + +| Feature family | Meaning | +| --- | --- | +| within-timestep pair RMSE | Response distance between different repeats at the same timestep | +| timestep centroid RMSE | Distance between response-cloud centroids at different timesteps | +| response-cloud PCA trace/top share | Scale and concentration of the small response cloud's Gram spectrum | + +The script deliberately does not read seed-to-output distance features, so it cannot degenerate into the original H2 simple +distance scorer. + +## Result + +Main result: +`workspaces/black-box/artifacts/h2-output-cloud-geometry-20260525.json` + +| Metric | Output-cloud logistic | Raw H2 logistic | Lowpass H2 logistic | +| --- | ---: | ---: | ---: | +| AUC | `0.961529` | `0.905693` | `0.895679` | +| ASR | `0.900391` | `0.841797` | `0.831055` | +| TPR@1%FPR | `0.333984` | `0.134766` | `0.148438` | +| TPR@0.1%FPR | `0.117188` | `0.0` | `0.025391` | + +Relative to raw H2: `AUC +0.055836`, `TPR@1%FPR +0.199218`, +`TPR@0.1%FPR +0.117188`. + +Relative to lowpass H2: `AUC +0.065850`, `TPR@1%FPR +0.185546`, +`TPR@0.1%FPR +0.091797`. + +No simple single feature explains this result: + +| Best simple view | Feature | Orientation | AUC | TPR@1%FPR | TPR@0.1%FPR | +| --- | --- | --- | ---: | ---: | ---: | +| Best AUC | `centroid_rmse_40_160` | 
negative | `0.801182` | `0.03125` | `0.005859` | +| Best low-FPR | `cloud_pca_top_share` | negative | `0.650913` | `0.078125` | `0.017578` | + +Seed stability check: +`workspaces/black-box/artifacts/h2-output-cloud-geometry-seed177-20260525.json` + +| Metric | Seed 177 | +| --- | ---: | +| AUC | `0.961048` | +| ASR | `0.900391` | +| TPR@1%FPR | `0.353516` | +| TPR@0.1%FPR | `0.130859` | + +Label-shuffle sanity: +`workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json` + +| Metric | Label shuffle | +| --- | ---: | +| AUC | `0.507595` | +| ASR | `0.521484` | +| TPR@1%FPR | `0.011719` | +| TPR@0.1%FPR | `0.003906` | + +This indicates that the scorer/evaluation pipeline has no obvious label pass-through leakage. + +## Critical Caveat + +The result still cannot be promoted. Response generation for the source cache has a class-ordered seed offset: +in `scripts/run_h2_response_strength_validation.py` the member side uses +`sample_offset = 0` and the nonmember side uses `sample_offset = len(member_indices)`. +Output-output geometry is sensitive to sampling seeds and response-cloud shape, so the current strong signal may include a +class-ordered sampling effect. + +This is not a reason to keep adding tables on the same cache; it only defines one very narrow next step: +if the line is to advance, release at most one bounded order-control / reseeded / interleaved +response-cache scout to determine whether the strong signal survives class-order control. + +## Decision + +`candidate complementary signal / order-control required / no admitted row`. + +It is kept as a strong Research-side candidate because it meets three valuable conditions: + +- It is a different observable: output-output cloud geometry rather than seed-to-output distance. +- It is clearly stronger than raw/lowpass H2 logistic on the same H2 cache. +- It passed seed-177 stability and label-shuffle sanity. + +For now, however, none of the following is done: + +- No upgrade to the Platform/Runtime admitted bundle. +- No new product schema, Runtime runner, UI type, or bundle row. +- No KDE, shadow density, repeat-count, feature-family, or fusion sweeps on the same cache. +- No GPU release or large download. + +The next re-evaluation is allowed only on the result of one order-control cache. If the reseeded / +interleaved cache still retains strong AUC and strict-tail recovery, then discuss whether to enter a more formal H2 +output-cloud mechanism line; if it does not, the candidate is closed directly as a class-ordered response-cache +artifact. + +## Platform and Runtime Impact + +None. The admitted Platform/Runtime bundle remains the existing five rows: +`recon`, `PIA baseline`, `PIA defended`, `GSA`, and `DPDM W-1`. 
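The two distance-based feature families named in the Contract table (within-timestep pair RMSE and timestep centroid RMSE) can be illustrated with a minimal sketch. It uses only the standard library instead of the NumPy code in the review script, and the helper names `rmse`, `within_timestep_pair_rmse`, and `centroid`, as well as the toy one-pixel sample, are hypothetical and chosen for illustration only. Note that neither feature looks at the seed or the input image, only at distances between outputs.

```python
from itertools import combinations
from math import sqrt


def rmse(left, right):
    """Root-mean-square difference between two flattened responses."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(left, right)) / len(left))


def within_timestep_pair_rmse(repeats):
    """Mean RMSE over all unordered repeat pairs at one timestep."""
    pairs = list(combinations(repeats, 2))
    return sum(rmse(a, b) for a, b in pairs) / len(pairs)


def centroid(repeats):
    """Element-wise mean of the repeats at one timestep."""
    return [sum(values) / len(values) for values in zip(*repeats)]


# Toy sample (assumed data): two timesteps, three repeats each,
# one "pixel" per flattened response.
sample = {
    10: [[0.0], [2.0], [4.0]],
    20: [[1.0], [1.0], [1.0]],
}

# Within-timestep spread: how far apart repeated outputs are from each other.
pair_rmse = {t: within_timestep_pair_rmse(reps) for t, reps in sample.items()}

# Cross-timestep drift: distance between the per-timestep response centroids.
centroid_rmse_10_20 = rmse(centroid(sample[10]), centroid(sample[20]))
```

On this toy sample the timestep-10 repeats differ by 2, 4, and 2, so their mean pair RMSE is 8/3, while the identical timestep-20 repeats give 0; the centroids are 2.0 and 1.0, so the centroid RMSE is 1.0.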
diff --git a/docs/evidence/reproduction-status.md b/docs/evidence/reproduction-status.md index 7c00b1d7..1a447b74 100644 --- a/docs/evidence/reproduction-status.md +++ b/docs/evidence/reproduction-status.md @@ -73,6 +73,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims. | FERMI multi-relational tabular MIA | `hold-paper-source-only` | arXiv `2605.11527` reports strong multi-relational TabDDPM/TabDiff/TabSyn membership metrics, but the public surface has no code tree, target/split manifests, generated synthetic tables, feature/score rows, ROC arrays, metric JSON, or replay command. It does not reopen MIDST/tabular execution and releases no tabular dataset download, model training, or GPU work. See [fermi-tabular-artifact-gate-20260515.md](fermi-tabular-artifact-gate-20260515.md). | | True known-split mechanisms | `hold-weak` | MNIST/DDPM raw-loss and x0 scouts are weak. Tiny overfit final-layer gradient norm was positive only on the extreme `8 / 64` target, weakened at `16 / 64`, and a more optimistic `64 / 64` oracle gradient-prototype alignment follow-up is effectively random (`AUC = 0.500977`, `ASR = 0.562500`, zero low-FPR recovery). Fashion-MNIST DDPM now has three weak clean-split scouts: fixed-timestep PIA-style loss (`AUC = 0.535889`, `TPR@1%FPR = 0.03125`), SimA single-query score-norm (`AUC = 0.515137`, zero low-FPR recovery), and score-Jacobian sensitivity (`AUC = 0.511719`, zero low-FPR recovery). The Beans member-LoRA denoising-loss scout repaired pseudo-membership semantics by creating an exact `SD1.5 + Beans-member LoRA` target, but the internal conditional denoising-loss score is weak (`AUC = 0.414400`, reverse `0.585600`, `TPR@1%FPR = 0.080000`) and parameter-delta sensitivity is also weak (`AUC = 0.512000`). 
Do not run more final-layer gradient norm/cosine variants, Fashion-MNIST timestep/seed/`p`-norm/perturbation/norm/packet-size sweeps, or Beans LoRA train-step/rank/resolution/prompt/timestep/layer matrices by default. See [fashion-mnist-ddpm-score-jacobian-sensitivity-20260514.md](fashion-mnist-ddpm-score-jacobian-sensitivity-20260514.md), [fashion-mnist-ddpm-sima-score-norm-20260514.md](fashion-mnist-ddpm-sima-score-norm-20260514.md), [beans-lora-delta-sensitivity-20260513.md](beans-lora-delta-sensitivity-20260513.md), [beans-lora-member-denoising-loss-scout-20260513.md](beans-lora-member-denoising-loss-scout-20260513.md), [fashion-mnist-ddpm-pia-loss-scout-20260513.md](fashion-mnist-ddpm-pia-loss-scout-20260513.md), [tiny-known-split-gradient-prototype-alignment-20260513.md](tiny-known-split-gradient-prototype-alignment-20260513.md), [gradient-norm-stability-gate-20260512.md](gradient-norm-stability-gate-20260512.md), and [tiny-overfit-gradient-norm-scout-20260512.md](tiny-overfit-gradient-norm-scout-20260512.md). | | Black-box `H2 response-strength` | candidate-only | Positive-but-bounded DDPM/CIFAR10 candidate: frozen cutoff-0.50 lowpass follow-up passed, and raw H2 recovered strict-tail signal on the fresh packet. SD/CelebA text-to-image transfer is blocked by protocol mismatch. The frozen SD/CelebA image-to-image micro-packet is runnable, but H2 logistic does not beat the same-cache simple distance comparator, so H2 is not promoted beyond candidate-only. A separate simple-distance line now has bounded single-asset evidence: first 10/10 packet `AUC = 0.92`, non-overlapping 10/10 packet `AUC = 0.99` with 9/10 TP at 0 FP, and non-overlapping 25/25 admission packet `AUC = 0.8768`, `ASR = 0.84`, 11/25 TP at 0 FP. This is not a conditional-diffusion generalization or a `recon` product replacement. 
See [black-box-response-strength-preflight.md](black-box-response-strength-preflight.md), [h2-lowpass-followup-contract.md](h2-lowpass-followup-contract.md), [h2-cross-asset-contract-preflight.md](h2-cross-asset-contract-preflight.md), [h2-image-to-image-contract.md](h2-image-to-image-contract.md), [h2-img2img-micro-result.md](h2-img2img-micro-result.md), [h2-img2img-simple-distance-review.md](h2-img2img-simple-distance-review.md), [h2-img2img-simple-distance-stability-result.md](h2-img2img-simple-distance-stability-result.md), and [h2-img2img-simple-distance-admission-result.md](h2-img2img-simple-distance-admission-result.md). | +| Black-box `H2 output-cloud geometry` | candidate-only | CPU-only review on the existing H2 response cache found a strong output-output geometry signal that excludes seed-to-output distance (`AUC = 0.961529`, `ASR = 0.900391`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`). Seed `177` remains stable (`AUC = 0.961048`), and label-shuffle sanity returns random-level (`AUC = 0.507595`). This is not admitted because the source cache used class-ordered sample offsets, so a reseeded or interleaved order-control cache is required before promotion. Do not expand into KDE, shadow-density, repeat-count, same-cache feature sweeps, Platform schema, Runtime runner, or bundle rows. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md). | | Black-box mid-frequency same-noise residual | `candidate-only` | Distinct paper-backed observable gap: unlike H2/H3 response-cache frequency filters, this line requires `x_t`, `tilde_x_t`, timestep, noise provenance, and residual scores at the same noise level. The frozen `64/64` sign-check on the collaborator 750k checkpoint produced `AUC = 0.733398`, `ASR = 0.710938`, and finite `4/64` zero-FP recovery. The seed-only repeat retained signal with `AUC = 0.719238`, `ASR = 0.6875`, and finite `3/64` zero-FP recovery. 
A CPU comparator audit shows low-frequency and full-band residual comparators are at least as strong as the frozen mid-band score on AUC, so the line is candidate-stable-but-bounded but not a proven mid-frequency-specific mechanism. Same-contract GPU expansion is closed. See [midfreq-residual-comparator-audit-20260512.md](midfreq-residual-comparator-audit-20260512.md), [midfreq-residual-stability-result-20260512.md](midfreq-residual-stability-result-20260512.md), [midfreq-residual-stability-decision-20260512.md](midfreq-residual-stability-decision-20260512.md), [midfreq-residual-signcheck-20260512.md](midfreq-residual-signcheck-20260512.md), [midfreq-same-noise-residual-preflight-20260512.md](midfreq-same-noise-residual-preflight-20260512.md), [midfreq-residual-scorer-contract-20260512.md](midfreq-residual-scorer-contract-20260512.md), [midfreq-residual-collector-contract-20260512.md](midfreq-residual-collector-contract-20260512.md), [midfreq-residual-tiny-runner-contract-20260512.md](midfreq-residual-tiny-runner-contract-20260512.md), and [midfreq-residual-real-asset-preflight-20260512.md](midfreq-residual-real-asset-preflight-20260512.md). | | Gray-box `PIA` | `evidence-ready` | Strongest admitted local DDPM/CIFAR10 gray-box line. PIA baseline exposes `epsilon-trajectory consistency`; stochastic dropout is a provisional defended comparator that weakens but does not eliminate the signal. The review is bounded to repeated-query adaptive checks with `adaptive repeats=3`; low-FPR values are finite empirical strict-tail points, not calibrated sub-percent FPR. Paper-aligned release provenance remains blocked. See [pia-stochastic-dropout-truth-hardening-review.md](pia-stochastic-dropout-truth-hardening-review.md). | | Gray-box `ReDiffuse` | `hold-weak` | Candidate baseline-alignment line. 
The collaborator 750k bundle and checkpoint are runnable, a 64/64 direct-distance compatibility packet exists, and the existing PIA 800k checkpoint is runtime-probe compatible, but prior exact replay showed only modest AUC with weak strict-tail evidence and was not admitted. The collaborator Stable Diffusion ReDiffuse `5000`-row packet remains replayable (`AUC = 0.71031888`), but its member/nonmember labels are perfectly aligned with `LAION-5B member subset` versus `COCO2017-val non-member subset`, so it is a cross-source stress-test candidate rather than a same-distribution second asset. The official OpenReview supplement still does not release third-party target checkpoints, generated response/feature caches, score packets, ROC CSVs, or metric artifacts. A local ReDiffuse DDPM/STL-10 bounded scout now proves the split and official model path are executable and scoreable, but the short target fixed-timestep denoising-loss packet is random-level (`AUC = 0.4996337890625`, `ASR = 0.509765625`, `TPR@1%FPR = 0.01171875`, `TPR@0.1%FPR = 0.0`). Reusing the same checkpoint and `256 / 256` split for a genuinely different SimA-style denoiser-output score norm also remained random-level (`AUC = 0.5052947998046875`, `ASR = 0.525390625`, `TPR@1%FPR = 0.03125`, `TPR@0.1%FPR = 0.01953125`). Do not expand into step-count, seed, timestep, batch-size, subset-size, EMA, scheduler, denoising-loss matrices, score-norm matrices, checkpoint-step/fusion sweeps, full DDPM/DiT/Stable Diffusion targets, `800k`-step training, Tiny-ImageNet downloads, request `coco_data`, download Stable Diffusion weights, or rerun same-family attack scripts by default. 
See [rediffuse-stl10-sima-score-norm-20260525.md](rediffuse-stl10-sima-score-norm-20260525.md), [rediffuse-stl10-bounded-scout-20260525.md](rediffuse-stl10-bounded-scout-20260525.md), [rediffuse-stl10-split-and-microtrain-preflight-20260525.md](rediffuse-stl10-split-and-microtrain-preflight-20260525.md), [stable-diffusion-rediffuse-collaborator-artifact-20260517.md](stable-diffusion-rediffuse-collaborator-artifact-20260517.md), [rediffuse-openreview-split-manifest-audit-20260515.md](rediffuse-openreview-split-manifest-audit-20260515.md), [rediffuse-collaborator-integration-report.md](rediffuse-collaborator-integration-report.md), [rediffuse-800k-runtime-probe.md](rediffuse-800k-runtime-probe.md), [rediffuse-resnet-parity-packet.md](rediffuse-resnet-parity-packet.md), [rediffuse-direct-distance-boundary-review.md](rediffuse-direct-distance-boundary-review.md), [rediffuse-checkpoint-portability-gate.md](rediffuse-checkpoint-portability-gate.md), [rediffuse-resnet-contract-scout.md](rediffuse-resnet-contract-scout.md), [rediffuse-exact-replay-preflight.md](rediffuse-exact-replay-preflight.md), and [rediffuse-exact-replay-packet.md](rediffuse-exact-replay-packet.md). | diff --git a/docs/evidence/workspace-evidence-index.md b/docs/evidence/workspace-evidence-index.md index 04399a80..a2368f56 100644 --- a/docs/evidence/workspace-evidence-index.md +++ b/docs/evidence/workspace-evidence-index.md @@ -5,6 +5,18 @@ This index separates current track state from archived research history. ## Current Track State Latest Research update: +[h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md) +records a CPU-only metric verdict on the existing H2 response-strength cache. +The output-output geometry scorer is a strong Research-side candidate +(`AUC = 0.961529`, `TPR@1%FPR = 0.333984`, +`TPR@0.1%FPR = 0.117188`) and is stable under seed `177` +(`AUC = 0.961048`), while label-shuffle sanity returns random-level +(`AUC = 0.507595`). 
It is not admitted because the source cache used +class-ordered sample offsets and needs a reseeded or interleaved order-control +cache before promotion. Decision: `candidate complementary signal / +order-control required / no admitted row / no download / no GPU release`. + +Previous Research update: [feature-packet-channel-consumer-verdict-20260525.md](feature-packet-channel-consumer-verdict-20260525.md) records a consumer-boundary verdict for the gray-box feature-packet lane. Tracing the Roots remains positive Research-side feature-packet evidence diff --git a/scripts/review_h2_output_cloud_geometry.py b/scripts/review_h2_output_cloud_geometry.py new file mode 100644 index 00000000..608f04df --- /dev/null +++ b/scripts/review_h2_output_cloud_geometry.py @@ -0,0 +1,272 @@ +from __future__ import annotations + +import argparse +import json +from pathlib import Path +from typing import Any + +import numpy as np + +from diffaudit.attacks.h2_response_strength import evaluate_logistic_holdout, metric_delta, score_metrics + + +def _sanitize(value: Any) -> Any: + if isinstance(value, dict): + return {str(key): _sanitize(item) for key, item in value.items()} + if isinstance(value, list): + return [_sanitize(item) for item in value] + if isinstance(value, tuple): + return [_sanitize(item) for item in value] + if isinstance(value, np.ndarray): + return _sanitize(value.tolist()) + if isinstance(value, np.generic): + return _sanitize(value.item()) + if isinstance(value, Path): + return str(value) + return value + + +def _slope(values: np.ndarray, axis_values: np.ndarray) -> np.ndarray: + if values.shape[1] <= 1: + return np.zeros(values.shape[0], dtype=np.float32) + return np.polyfit(axis_values.astype(np.float64), values.T.astype(np.float64), deg=1)[0].astype(np.float32) + + +def _cloud_eigen_features(flat_responses: np.ndarray) -> tuple[np.ndarray, np.ndarray]: + centered = flat_responses - flat_responses.mean(axis=1, keepdims=True) + # Eigenvalues of the response Gram 
matrix give the non-zero PCA spectrum of + # each small response cloud without building a huge pixel covariance matrix. + gram = np.einsum("nrd,nsd->nrs", centered, centered, optimize=True) + gram /= max(flat_responses.shape[1] - 1, 1) + eigvals = np.linalg.eigvalsh(gram).astype(np.float64) + eigvals = np.clip(eigvals, 0.0, None) + trace = eigvals.sum(axis=1) + top_share = np.divide( + eigvals[:, -1], + trace, + out=np.zeros_like(trace), + where=trace > 0, + ) + return trace.astype(np.float32), top_share.astype(np.float32) + + +def compute_output_cloud_features(responses: np.ndarray, axis_values: np.ndarray) -> tuple[np.ndarray, list[str]]: + responses_f32 = np.asarray(responses, dtype=np.float32) + if responses_f32.ndim != 6: + raise ValueError("responses must have shape [sample, timestep, repeat, channel, height, width]") + if responses_f32.shape[2] < 2: + raise ValueError("output-cloud geometry needs at least two repeats per timestep") + + sample_count, timestep_count, repeat_count = responses_f32.shape[:3] + flat = responses_f32.reshape(sample_count, timestep_count, repeat_count, -1) + names: list[str] = [] + columns: list[np.ndarray] = [] + + pair_distances = [] + for left in range(repeat_count): + for right in range(left + 1, repeat_count): + pair_distances.append(np.sqrt(np.mean((flat[:, :, left] - flat[:, :, right]) ** 2, axis=2))) + pair_rmse = np.stack(pair_distances, axis=2).mean(axis=2).astype(np.float32) + for idx, axis_value in enumerate(axis_values.tolist()): + names.append(f"within_timestep_pair_rmse_{int(axis_value)}") + columns.append(pair_rmse[:, idx]) + names.extend( + [ + "within_timestep_pair_rmse_mean", + "within_timestep_pair_rmse_std", + "within_timestep_pair_rmse_slope", + ] + ) + columns.extend([pair_rmse.mean(axis=1), pair_rmse.std(axis=1), _slope(pair_rmse, axis_values)]) + + centroids = flat.mean(axis=2) + centroid_pairs: list[np.ndarray] = [] + centroid_pair_names: list[str] = [] + for left in range(timestep_count): + for right in 
range(left + 1, timestep_count): + distance = np.sqrt(np.mean((centroids[:, left] - centroids[:, right]) ** 2, axis=1)).astype(np.float32) + centroid_pairs.append(distance) + centroid_pair_names.append( + f"centroid_rmse_{int(axis_values[left])}_{int(axis_values[right])}" + ) + centroid_pair_matrix = np.stack(centroid_pairs, axis=1) + names.extend(centroid_pair_names) + columns.extend(centroid_pairs) + names.extend(["centroid_rmse_mean", "centroid_rmse_std"]) + columns.extend([centroid_pair_matrix.mean(axis=1), centroid_pair_matrix.std(axis=1)]) + + cloud_flat = flat.reshape(sample_count, timestep_count * repeat_count, -1) + cloud_trace, cloud_top_share = _cloud_eigen_features(cloud_flat) + names.extend(["cloud_pca_trace", "cloud_pca_top_share"]) + columns.extend([cloud_trace, cloud_top_share]) + + features = np.stack(columns, axis=1).astype(np.float32) + return features, names + + +def _simple_candidates(labels: np.ndarray, features: np.ndarray, names: list[str], *, seed: int) -> list[dict[str, Any]]: + candidates: list[dict[str, Any]] = [] + for idx, name in enumerate(names): + raw_scores = features[:, idx] + for orientation, scores in ( + ("negative_higher_is_member", -raw_scores), + ("positive_higher_is_member", raw_scores), + ): + candidates.append( + { + "name": name, + "orientation": orientation, + "metrics": score_metrics(labels, scores), + } + ) + return candidates + + +def _best_by_low_fpr(candidates: list[dict[str, Any]]) -> dict[str, Any]: + return max( + candidates, + key=lambda item: ( + float(item["metrics"]["tpr_at_1pct_fpr"]), + float(item["metrics"]["tpr_at_0_1pct_fpr"]), + float(item["metrics"]["auc"]), + ), + ) + + +def parse_args() -> argparse.Namespace: + parser = argparse.ArgumentParser( + description="Review output-output cloud geometry on an existing H2 response-cache.npz." 
+ ) + parser.add_argument("--response-cache", type=Path, required=True) + parser.add_argument("--output", type=Path, required=True) + parser.add_argument("--seed", type=int, default=176) + parser.add_argument("--holdout-repeats", type=int, default=7) + parser.add_argument("--bootstrap-iters", type=int, default=200) + parser.add_argument( + "--shuffle-labels", + action="store_true", + help="Run the scorer after a seeded label permutation as a leakage sanity check.", + ) + return parser.parse_args() + + +def main() -> int: + args = parse_args() + cache = np.load(args.response_cache) + labels = cache["labels"].astype(np.int64) + label_mode = "original" + if args.shuffle_labels: + labels = np.random.default_rng(args.seed).permutation(labels) + label_mode = f"shuffled_seed_{args.seed}" + if "timesteps" not in cache.files: + raise KeyError("output-cloud geometry review expects a timestep-based H2 cache") + timesteps = cache["timesteps"].astype(np.int64) + responses = cache["responses"].astype(np.float32) + features, feature_names = compute_output_cloud_features(responses, timesteps) + + simple = _simple_candidates(labels, features, feature_names, seed=args.seed) + best_auc = max(simple, key=lambda item: float(item["metrics"]["auc"])) + best_low = _best_by_low_fpr(simple) + logistic = evaluate_logistic_holdout( + labels, + features, + seed=args.seed, + repeats=args.holdout_repeats, + bootstrap_iters=args.bootstrap_iters, + ) + + raw_h2_metrics = None + lowpass_h2_metrics = None + summary_path = args.response_cache.with_name("summary.json") + if summary_path.exists(): + summary = json.loads(summary_path.read_text(encoding="utf-8")) + raw_h2_metrics = summary.get("raw_h2", {}).get("logistic", {}).get("aggregate_metrics") + lowpass_h2_metrics = summary.get("lowpass_h2", {}).get("logistic", {}).get("aggregate_metrics") + + logistic_metrics = logistic["aggregate_metrics"] + verdict = "weak_non_complementary_output_cloud_geometry" + if args.shuffle_labels: + verdict = 
"label_shuffle_sanity_random_level" + elif ( + float(logistic_metrics["tpr_at_0_1pct_fpr"]) > 0 + and raw_h2_metrics is not None + and float(logistic_metrics["auc"]) >= float(raw_h2_metrics["auc"]) - 0.03 + ): + verdict = "candidate_complementary_output_cloud_geometry" + + result: dict[str, Any] = { + "status": "ready", + "track": "black-box", + "method": "H2 output-cloud geometry scorer", + "mode": "cpu-cache-review", + "response_cache": str(args.response_cache), + "inputs": { + "sample_count": int(labels.shape[0]), + "member_count": int((labels == 1).sum()), + "nonmember_count": int((labels == 0).sum()), + "timesteps": [int(value) for value in timesteps.tolist()], + "repeat_count": int(responses.shape[2]), + "feature_count": int(features.shape[1]), + "feature_names": feature_names, + "seed": int(args.seed), + "label_mode": label_mode, + "holdout_repeats": int(args.holdout_repeats), + "bootstrap_iters": int(args.bootstrap_iters), + }, + "simple": { + "best_by_auc": { + "name": best_auc["name"], + "orientation": best_auc["orientation"], + "metrics": best_auc["metrics"], + }, + "best_by_low_fpr": { + "name": best_low["name"], + "orientation": best_low["orientation"], + "metrics": best_low["metrics"], + }, + }, + "logistic": { + "aggregate_metrics": logistic_metrics, + "aggregate_ci95": logistic["aggregate_ci95"], + "mean_coefficients": logistic["mean_coefficients"], + "prediction_count": logistic["prediction_count"], + }, + "comparison": { + "raw_h2_logistic": raw_h2_metrics if not args.shuffle_labels else None, + "lowpass_h2_logistic": lowpass_h2_metrics if not args.shuffle_labels else None, + "output_cloud_minus_raw_h2": metric_delta(logistic_metrics, raw_h2_metrics) + if raw_h2_metrics is not None and not args.shuffle_labels + else None, + "output_cloud_minus_lowpass_h2": metric_delta(logistic_metrics, lowpass_h2_metrics) + if lowpass_h2_metrics is not None and not args.shuffle_labels + else None, + }, + "decision_gate": { + "uses_only_output_output_geometry": 
True, + "does_not_generate_new_responses": True, + "nonzero_strict_tail": bool(float(logistic_metrics["tpr_at_0_1pct_fpr"]) > 0), + "beats_best_simple_low_fpr": bool( + float(logistic_metrics["tpr_at_1pct_fpr"]) + > float(best_low["metrics"]["tpr_at_1pct_fpr"]) + and float(logistic_metrics["tpr_at_0_1pct_fpr"]) + >= float(best_low["metrics"]["tpr_at_0_1pct_fpr"]) + ), + "reopen_allowed": False, + "requires_reseeded_or_interleaved_cache_before_promotion": True, + }, + "verdict": verdict, + "notes": [ + "This is a CPU-only scorer review on an existing H2 response cache.", + "It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.", + "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.", + "Do not expand this cache into KDE, shadow density, repeat-count, or same-cache feature sweeps.", + ], + } + args.output.parent.mkdir(parents=True, exist_ok=True) + args.output.write_text(json.dumps(_sanitize(result), indent=2, ensure_ascii=True), encoding="utf-8") + print(json.dumps(_sanitize(result), indent=2, ensure_ascii=False)) + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/tests/test_review_h2_output_cloud_geometry_script.py b/tests/test_review_h2_output_cloud_geometry_script.py new file mode 100644 index 00000000..e5c387f1 --- /dev/null +++ b/tests/test_review_h2_output_cloud_geometry_script.py @@ -0,0 +1,111 @@ +from __future__ import annotations + +import importlib.util +import json +import subprocess +import sys +import tempfile +import unittest +from pathlib import Path + +import numpy as np + + +def _load_script_module(): + repo_root = Path(__file__).resolve().parents[1] + script_path = repo_root / "scripts" / "review_h2_output_cloud_geometry.py" + spec = importlib.util.spec_from_file_location("review_h2_output_cloud_geometry", script_path) + if spec is None or spec.loader is None: + raise 
RuntimeError("Could not load review_h2_output_cloud_geometry.py") + module = importlib.util.module_from_spec(spec) + spec.loader.exec_module(module) + return module + + +class ReviewH2OutputCloudGeometryScriptTests(unittest.TestCase): + def test_compute_features_uses_all_repeat_pairs(self) -> None: + module = _load_script_module() + responses = np.asarray( + [ + [ + [[[0.0]], [[2.0]], [[4.0]]], + [[[1.0]], [[1.0]], [[1.0]]], + ], + [ + [[[1.0]], [[3.0]], [[9.0]]], + [[[4.0]], [[8.0]], [[12.0]]], + ], + ], + dtype=np.float32, + ) + responses = responses.reshape(2, 2, 3, 1, 1, 1) + + features, names = module.compute_output_cloud_features(responses, np.asarray([10, 20])) + + self.assertEqual(features.shape, (2, 10)) + self.assertEqual(names[0], "within_timestep_pair_rmse_10") + self.assertEqual(names[1], "within_timestep_pair_rmse_20") + self.assertEqual(names[-2:], ["cloud_pca_trace", "cloud_pca_top_share"]) + np.testing.assert_allclose(features[:, 0], np.asarray([8.0 / 3.0, 16.0 / 3.0]), rtol=1e-6) + np.testing.assert_allclose(features[:, 1], np.asarray([0.0, 16.0 / 3.0]), rtol=1e-6) + + def test_compute_features_rejects_single_repeat_cache(self) -> None: + module = _load_script_module() + responses = np.zeros((2, 2, 1, 1, 1, 1), dtype=np.float32) + + with self.assertRaisesRegex(ValueError, "at least two repeats"): + module.compute_output_cloud_features(responses, np.asarray([10, 20])) + + def test_shuffle_label_mode_writes_sanity_payload(self) -> None: + repo_root = Path(__file__).resolve().parents[1] + with tempfile.TemporaryDirectory() as tmpdir: + tmp = Path(tmpdir) + cache = tmp / "response-cache.npz" + summary = tmp / "summary.json" + output = tmp / "shuffle-review.json" + labels = np.asarray([1] * 8 + [0] * 8, dtype=np.int64) + responses = np.zeros((16, 2, 2, 1, 1, 1), dtype=np.float32) + responses[:, 0, 0, 0, 0, 0] = np.linspace(0.0, 1.5, 16) + responses[:, 0, 1, 0, 0, 0] = np.linspace(0.2, 1.7, 16) + responses[:, 1, 0, 0, 0, 0] = np.linspace(0.4, 1.9, 16) + 
responses[:, 1, 1, 0, 0, 0] = np.linspace(0.7, 2.2, 16) + np.savez_compressed(cache, labels=labels, timesteps=np.asarray([40, 80]), responses=responses) + summary.write_text( + json.dumps({"raw_h2": {"logistic": {"aggregate_metrics": {"auc": 0.9}}}}), + encoding="utf-8", + ) + + completed = subprocess.run( + [ + sys.executable, + "-X", + "utf8", + "scripts/review_h2_output_cloud_geometry.py", + "--response-cache", + str(cache), + "--output", + str(output), + "--seed", + "176", + "--holdout-repeats", + "2", + "--bootstrap-iters", + "0", + "--shuffle-labels", + ], + check=False, + capture_output=True, + text=True, + cwd=repo_root, + ) + + self.assertEqual(completed.returncode, 0, completed.stderr) + payload = json.loads(output.read_text(encoding="utf-8")) + self.assertEqual(payload["inputs"]["label_mode"], "shuffled_seed_176") + self.assertEqual(payload["verdict"], "label_shuffle_sanity_random_level") + self.assertIsNone(payload["comparison"]["raw_h2_logistic"]) + self.assertIsNone(payload["comparison"]["output_cloud_minus_raw_h2"]) + + +if __name__ == "__main__": + unittest.main() diff --git a/workspaces/black-box/README.md b/workspaces/black-box/README.md index 6297a817..aafee7dd 100644 --- a/workspaces/black-box/README.md +++ b/workspaces/black-box/README.md @@ -5,6 +5,13 @@ - 方向:黑盒成员推断攻击。 - 主要方法:`recon` 是已准入的黑盒产品行,也是选定用于有限尾部置信度加固的审计线路。 - 支撑方法:`CLiD`、`variation`、`H2 response-strength` 以及语义辅助分类器。 +- H2 output-cloud geometry 状态:复用既有 H2 `512 / 512` response cache 的 + CPU-only review 发现强候选信号,logistic `AUC = 0.961529`、 + `TPR@1%FPR = 0.333984`、`TPR@0.1%FPR = 0.117188`;seed `177` 仍稳定, + label-shuffle sanity 回到随机级。但源 cache 使用 class-ordered sample offset, + 所以该结果必须先过 reseeded / interleaved order-control cache,才能讨论晋升。 + 不要把它扩成 KDE、shadow density、repeat-count 或同 cache feature sweep; + 不要新增 Platform/Runtime schema、runner 或 admitted bundle row。 - 已导入候选工件:协作者移交的 Stable Diffusion ReDiffuse 结果包现通过 `diffaudit probe-rediffuse-sd-artifacts` 进行审计。导入的 `5000` 行 `2500 / 2500` 包重放结果为 
`AUC = 0.710319` 和 `ASR = 0.6846`,因此值得保留作为候选证据。同一导入子集现在也支持 @@ -41,6 +48,9 @@ 当前 H2 候选边界: [../../docs/evidence/black-box-response-strength-preflight.md](../../docs/evidence/black-box-response-strength-preflight.md)。 +当前 H2 output-cloud geometry 候选: +[../../docs/evidence/h2-output-cloud-geometry-20260525.md](../../docs/evidence/h2-output-cloud-geometry-20260525.md)。 + 当前中频同噪声残差预检: [../../docs/evidence/midfreq-same-noise-residual-preflight-20260512.md](../../docs/evidence/midfreq-same-noise-residual-preflight-20260512.md)。 diff --git a/workspaces/black-box/artifacts/h2-output-cloud-geometry-20260525.json b/workspaces/black-box/artifacts/h2-output-cloud-geometry-20260525.json new file mode 100644 index 00000000..30087312 --- /dev/null +++ b/workspaces/black-box/artifacts/h2-output-cloud-geometry-20260525.json @@ -0,0 +1,166 @@ +{ + "status": "ready", + "track": "black-box", + "method": "H2 output-cloud geometry scorer", + "mode": "cpu-cache-review", + "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-512-20260501-r1\\response-cache.npz", + "inputs": { + "sample_count": 1024, + "member_count": 512, + "nonmember_count": 512, + "timesteps": [ + 40, + 80, + 120, + 160 + ], + "repeat_count": 2, + "feature_count": 17, + "feature_names": [ + "within_timestep_pair_rmse_40", + "within_timestep_pair_rmse_80", + "within_timestep_pair_rmse_120", + "within_timestep_pair_rmse_160", + "within_timestep_pair_rmse_mean", + "within_timestep_pair_rmse_std", + "within_timestep_pair_rmse_slope", + "centroid_rmse_40_80", + "centroid_rmse_40_120", + "centroid_rmse_40_160", + "centroid_rmse_80_120", + "centroid_rmse_80_160", + "centroid_rmse_120_160", + "centroid_rmse_mean", + "centroid_rmse_std", + "cloud_pca_trace", + "cloud_pca_top_share" + ], + "seed": 176, + "label_mode": "original", + "holdout_repeats": 7, + "bootstrap_iters": 200 + }, + "simple": { + "best_by_auc": { + "name": "centroid_rmse_40_160", + "orientation": "negative_higher_is_member", + "metrics": { + 
"auc": 0.801182, + "asr": 0.739258, + "tpr_at_1pct_fpr": 0.03125, + "tpr_at_0_1pct_fpr": 0.005859, + "member_score_mean": -0.035516, + "nonmember_score_mean": -0.0456 + } + }, + "best_by_low_fpr": { + "name": "cloud_pca_top_share", + "orientation": "negative_higher_is_member", + "metrics": { + "auc": 0.650913, + "asr": 0.618164, + "tpr_at_1pct_fpr": 0.078125, + "tpr_at_0_1pct_fpr": 0.017578, + "member_score_mean": -0.26242, + "nonmember_score_mean": -0.275406 + } + } + }, + "logistic": { + "aggregate_metrics": { + "auc": 0.961529, + "asr": 0.900391, + "tpr_at_1pct_fpr": 0.333984, + "tpr_at_0_1pct_fpr": 0.117188, + "member_score_mean": 0.829101, + "nonmember_score_mean": 0.171006 + }, + "aggregate_ci95": { + "auc": { + "p025": 0.950939, + "p975": 0.972625 + }, + "asr": { + "p025": 0.887671, + "p975": 0.920922 + }, + "tpr_at_1pct_fpr": { + "p025": 0.202197, + "p975": 0.617334 + }, + "tpr_at_0_1pct_fpr": { + "p025": 0.093701, + "p975": 0.294971 + } + }, + "mean_coefficients": [ + 2.430795, + 0.320931, + 0.262193, + 0.973249, + 0.860887, + -0.599108, + -0.058332, + 0.20929, + -3.265317, + -4.968652, + 0.839256, + 0.448905, + 2.486766, + -0.861475, + -0.441132, + -0.35279, + -0.293748 + ], + "prediction_count": { + "min": 7, + "max": 7, + "mean": 7.0 + } + }, + "comparison": { + "raw_h2_logistic": { + "auc": 0.905693, + "asr": 0.841797, + "tpr_at_1pct_fpr": 0.134766, + "tpr_at_0_1pct_fpr": 0.0, + "member_score_mean": 0.743293, + "nonmember_score_mean": 0.256827 + }, + "lowpass_h2_logistic": { + "auc": 0.895679, + "asr": 0.831055, + "tpr_at_1pct_fpr": 0.148438, + "tpr_at_0_1pct_fpr": 0.025391, + "member_score_mean": 0.735716, + "nonmember_score_mean": 0.264013 + }, + "output_cloud_minus_raw_h2": { + "auc": 0.055836, + "asr": 0.058594, + "tpr_at_1pct_fpr": 0.199218, + "tpr_at_0_1pct_fpr": 0.117188 + }, + "output_cloud_minus_lowpass_h2": { + "auc": 0.06585, + "asr": 0.069336, + "tpr_at_1pct_fpr": 0.185546, + "tpr_at_0_1pct_fpr": 0.091797 + } + }, + "decision_gate": { + 
"uses_only_output_output_geometry": true, + "does_not_generate_new_responses": true, + "nonzero_strict_tail": true, + "beats_best_simple_low_fpr": true, + "reopen_allowed": false, + "requires_reseeded_or_interleaved_cache_before_promotion": true + }, + "verdict": "candidate_complementary_output_cloud_geometry", + "notes": [ + "This is a CPU-only scorer review on an existing H2 response cache.", + "It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.", + "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.", + "Do not expand this cache into KDE, shadow density, repeat-count, or same-cache feature sweeps." + ] +} \ No newline at end of file diff --git a/workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json b/workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json new file mode 100644 index 00000000..d1bfc4b4 --- /dev/null +++ b/workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json @@ -0,0 +1,142 @@ +{ + "status": "ready", + "track": "black-box", + "method": "H2 output-cloud geometry scorer", + "mode": "cpu-cache-review", + "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-512-20260501-r1\\response-cache.npz", + "inputs": { + "sample_count": 1024, + "member_count": 512, + "nonmember_count": 512, + "timesteps": [ + 40, + 80, + 120, + 160 + ], + "repeat_count": 2, + "feature_count": 17, + "feature_names": [ + "within_timestep_pair_rmse_40", + "within_timestep_pair_rmse_80", + "within_timestep_pair_rmse_120", + "within_timestep_pair_rmse_160", + "within_timestep_pair_rmse_mean", + "within_timestep_pair_rmse_std", + "within_timestep_pair_rmse_slope", + "centroid_rmse_40_80", + "centroid_rmse_40_120", + "centroid_rmse_40_160", + "centroid_rmse_80_120", + "centroid_rmse_80_160", + "centroid_rmse_120_160", + "centroid_rmse_mean", + 
"centroid_rmse_std", + "cloud_pca_trace", + "cloud_pca_top_share" + ], + "seed": 176, + "label_mode": "shuffled_seed_176", + "holdout_repeats": 7, + "bootstrap_iters": 100 + }, + "simple": { + "best_by_auc": { + "name": "within_timestep_pair_rmse_std", + "orientation": "negative_higher_is_member", + "metrics": { + "auc": 0.522099, + "asr": 0.53418, + "tpr_at_1pct_fpr": 0.015625, + "tpr_at_0_1pct_fpr": 0.001953, + "member_score_mean": -0.01289, + "nonmember_score_mean": -0.013093 + } + }, + "best_by_low_fpr": { + "name": "within_timestep_pair_rmse_40", + "orientation": "positive_higher_is_member", + "metrics": { + "auc": 0.483044, + "asr": 0.518555, + "tpr_at_1pct_fpr": 0.033203, + "tpr_at_0_1pct_fpr": 0.021484, + "member_score_mean": 0.027483, + "nonmember_score_mean": 0.027669 + } + } + }, + "logistic": { + "aggregate_metrics": { + "auc": 0.507595, + "asr": 0.521484, + "tpr_at_1pct_fpr": 0.011719, + "tpr_at_0_1pct_fpr": 0.003906, + "member_score_mean": 0.500857, + "nonmember_score_mean": 0.49935 + }, + "aggregate_ci95": { + "auc": { + "p025": 0.479549, + "p975": 0.540294 + }, + "asr": { + "p025": 0.514136, + "p975": 0.551343 + }, + "tpr_at_1pct_fpr": { + "p025": 0.004834, + "p975": 0.041406 + }, + "tpr_at_0_1pct_fpr": { + "p025": 0.0, + "p975": 0.011719 + } + }, + "mean_coefficients": [ + -0.014965, + 0.099394, + 0.015372, + 0.004088, + 0.026973, + -0.264673, + -0.009396, + -0.70348, + -0.126196, + 0.492217, + 0.100179, + -0.049061, + -0.155517, + -0.024371, + -0.286773, + 0.775695, + 0.077593 + ], + "prediction_count": { + "min": 7, + "max": 7, + "mean": 7.0 + } + }, + "comparison": { + "raw_h2_logistic": null, + "lowpass_h2_logistic": null, + "output_cloud_minus_raw_h2": null, + "output_cloud_minus_lowpass_h2": null + }, + "decision_gate": { + "uses_only_output_output_geometry": true, + "does_not_generate_new_responses": true, + "nonzero_strict_tail": true, + "beats_best_simple_low_fpr": false, + "reopen_allowed": false, + 
"requires_reseeded_or_interleaved_cache_before_promotion": true + }, + "verdict": "label_shuffle_sanity_random_level", + "notes": [ + "This is a CPU-only scorer review on an existing H2 response cache.", + "It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.", + "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.", + "Do not expand this cache into KDE, shadow density, repeat-count, or same-cache feature sweeps." + ] +} \ No newline at end of file diff --git a/workspaces/black-box/artifacts/h2-output-cloud-geometry-seed177-20260525.json b/workspaces/black-box/artifacts/h2-output-cloud-geometry-seed177-20260525.json new file mode 100644 index 00000000..d46ffff1 --- /dev/null +++ b/workspaces/black-box/artifacts/h2-output-cloud-geometry-seed177-20260525.json @@ -0,0 +1,166 @@ +{ + "status": "ready", + "track": "black-box", + "method": "H2 output-cloud geometry scorer", + "mode": "cpu-cache-review", + "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-512-20260501-r1\\response-cache.npz", + "inputs": { + "sample_count": 1024, + "member_count": 512, + "nonmember_count": 512, + "timesteps": [ + 40, + 80, + 120, + 160 + ], + "repeat_count": 2, + "feature_count": 17, + "feature_names": [ + "within_timestep_pair_rmse_40", + "within_timestep_pair_rmse_80", + "within_timestep_pair_rmse_120", + "within_timestep_pair_rmse_160", + "within_timestep_pair_rmse_mean", + "within_timestep_pair_rmse_std", + "within_timestep_pair_rmse_slope", + "centroid_rmse_40_80", + "centroid_rmse_40_120", + "centroid_rmse_40_160", + "centroid_rmse_80_120", + "centroid_rmse_80_160", + "centroid_rmse_120_160", + "centroid_rmse_mean", + "centroid_rmse_std", + "cloud_pca_trace", + "cloud_pca_top_share" + ], + "seed": 177, + "label_mode": "original", + "holdout_repeats": 7, + "bootstrap_iters": 100 + }, + "simple": { + "best_by_auc": { + "name": 
"centroid_rmse_40_160", + "orientation": "negative_higher_is_member", + "metrics": { + "auc": 0.801182, + "asr": 0.739258, + "tpr_at_1pct_fpr": 0.03125, + "tpr_at_0_1pct_fpr": 0.005859, + "member_score_mean": -0.035516, + "nonmember_score_mean": -0.0456 + } + }, + "best_by_low_fpr": { + "name": "cloud_pca_top_share", + "orientation": "negative_higher_is_member", + "metrics": { + "auc": 0.650913, + "asr": 0.618164, + "tpr_at_1pct_fpr": 0.078125, + "tpr_at_0_1pct_fpr": 0.017578, + "member_score_mean": -0.26242, + "nonmember_score_mean": -0.275406 + } + } + }, + "logistic": { + "aggregate_metrics": { + "auc": 0.961048, + "asr": 0.900391, + "tpr_at_1pct_fpr": 0.353516, + "tpr_at_0_1pct_fpr": 0.130859, + "member_score_mean": 0.829446, + "nonmember_score_mean": 0.170744 + }, + "aggregate_ci95": { + "auc": { + "p025": 0.948315, + "p975": 0.969186 + }, + "asr": { + "p025": 0.887183, + "p975": 0.919458 + }, + "tpr_at_1pct_fpr": { + "p025": 0.208887, + "p975": 0.542041 + }, + "tpr_at_0_1pct_fpr": { + "p025": 0.106397, + "p975": 0.296338 + } + }, + "mean_coefficients": [ + 2.431525, + 0.319316, + 0.265638, + 0.973178, + 0.861645, + -0.602296, + -0.057459, + 0.220761, + -3.269253, + -4.971886, + 0.839791, + 0.442868, + 2.490737, + -0.861653, + -0.443531, + -0.354449, + -0.290938 + ], + "prediction_count": { + "min": 7, + "max": 7, + "mean": 7.0 + } + }, + "comparison": { + "raw_h2_logistic": { + "auc": 0.905693, + "asr": 0.841797, + "tpr_at_1pct_fpr": 0.134766, + "tpr_at_0_1pct_fpr": 0.0, + "member_score_mean": 0.743293, + "nonmember_score_mean": 0.256827 + }, + "lowpass_h2_logistic": { + "auc": 0.895679, + "asr": 0.831055, + "tpr_at_1pct_fpr": 0.148438, + "tpr_at_0_1pct_fpr": 0.025391, + "member_score_mean": 0.735716, + "nonmember_score_mean": 0.264013 + }, + "output_cloud_minus_raw_h2": { + "auc": 0.055355, + "asr": 0.058594, + "tpr_at_1pct_fpr": 0.21875, + "tpr_at_0_1pct_fpr": 0.130859 + }, + "output_cloud_minus_lowpass_h2": { + "auc": 0.065369, + "asr": 0.069336, + 
"tpr_at_1pct_fpr": 0.205078, + "tpr_at_0_1pct_fpr": 0.105468 + } + }, + "decision_gate": { + "uses_only_output_output_geometry": true, + "does_not_generate_new_responses": true, + "nonzero_strict_tail": true, + "beats_best_simple_low_fpr": true, + "reopen_allowed": false, + "requires_reseeded_or_interleaved_cache_before_promotion": true + }, + "verdict": "candidate_complementary_output_cloud_geometry", + "notes": [ + "This is a CPU-only scorer review on an existing H2 response cache.", + "It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.", + "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.", + "Do not expand this cache into KDE, shadow density, repeat-count, or same-cache feature sweeps." + ] +} \ No newline at end of file diff --git a/workspaces/black-box/plan.md b/workspaces/black-box/plan.md index 687225ad..4322b370 100644 --- a/workspaces/black-box/plan.md +++ b/workspaces/black-box/plan.md @@ -36,7 +36,12 @@ not selected for GPU. - `H2 response-strength`: candidate-only with positive non-overlap signal; frozen lowpass follow-up is positive-but-bounded on `DDPM/CIFAR10`; SD/CelebA - text-to-image transfer is protocol-blocked. + text-to-image transfer is protocol-blocked. The 2026-05-25 output-cloud + geometry cache review found a stronger output-output candidate signal + (`AUC = 0.961529`, `TPR@0.1%FPR = 0.117188`) and a random-level label-shuffle + sanity check, but it remains candidate-only until a reseeded or interleaved + order-control cache preserves the signal. Do not promote it into Platform or + Runtime runners from the existing cache. - `simple image-to-image distance`: bounded single-asset evidence on SD1.5/CelebA; not a product row and not portability evidence. 
- `mid-frequency same-noise residual`: distinct paper-backed observable gap; @@ -69,15 +74,18 @@ ## Next Action -No black-box GPU or CPU sidecar is selected. The next action belongs to the -root long-horizon queue: continue Lane A only with a non-duplicate asset that -has exact target identity, member/nonmember split artifacts, and response or -score coverage. The imported Stable Diffusion ReDiffuse collaborator artifact, -CLiD gated ZIP, CopyMark `laion_mi`, and CopyMark `laion_ridar` do not satisfy -that gate by themselves, so preserve them as support/candidate evidence instead -of turning them into rerun tasks. Do not reopen CommonCanvas, Beans, -Fashion-MNIST, MIDST, or same-contract mid-frequency residual variants unless -a genuinely new artifact or observable changes the decision gate. +No black-box GPU or CPU sidecar is selected. The only H2 output-cloud reopen +path is one bounded reseeded or interleaved order-control response-cache scout; +until that exists, preserve the signal as candidate evidence instead of +turning it into same-cache feature work. The broader root long-horizon queue +still continues Lane A only with a non-duplicate asset that has exact target +identity, member/nonmember split artifacts, and response or score coverage. +The imported Stable Diffusion ReDiffuse collaborator artifact, CLiD gated ZIP, +CopyMark `laion_mi`, and CopyMark `laion_ridar` do not satisfy that gate by +themselves, so preserve them as support/candidate evidence instead of turning +them into rerun tasks. Do not reopen CommonCanvas, Beans, Fashion-MNIST, +MIDST, or same-contract mid-frequency residual variants unless a genuinely new +artifact or observable changes the decision gate. ## Current Status