From 7886254380bd340a2d3dc78021756621feb5dc24 Mon Sep 17 00:00:00 2001
From: Delicious233 <delicious233@hnu.edu.cn>
Date: Mon, 25 May 2026 06:22:25 +0800
Subject: [PATCH] docs: record h2 order-control seed stability

---
 .gitignore                                    |   2 +
 AGENTS.md                                     |   2 +-
 ROADMAP.md                                    |  11 +-
 .../h2-output-cloud-geometry-20260525.md      |  52 +++++-
 docs/evidence/reproduction-status.md          |   2 +-
 docs/evidence/workspace-evidence-index.md     |   9 +-
 workspaces/black-box/README.md                |   7 +-
 ...-shared-position-seed177-256-20260525.json | 166 ++++++++++++++++++
 ...on-seed177-256-label-shuffle-20260525.json | 142 +++++++++++++++
 workspaces/black-box/plan.md                  |   9 +-
 10 files changed, 385 insertions(+), 17 deletions(-)
 create mode 100644 workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-20260525.json
 create mode 100644 workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-label-shuffle-20260525.json
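Reviewer note: the `comparison` deltas in the new seed-177 artifact can be re-derived from the metrics recorded next to them. The sketch below is a hypothetical check, not a script from this repository. The helper names `metric_deltas` and `matches` are assumptions, and the numbers are copied by hand from the artifact added in this patch.

```python
# Values copied from the seed-177 shared-position artifact in this patch.
output_cloud = {"auc": 0.956192, "asr": 0.896484,
                "tpr_at_1pct_fpr": 0.285156, "tpr_at_0_1pct_fpr": 0.109375}
raw_h2 = {"auc": 0.911255, "asr": 0.851562,
          "tpr_at_1pct_fpr": 0.113281, "tpr_at_0_1pct_fpr": 0.0}
lowpass_h2 = {"auc": 0.896698, "asr": 0.828125,
              "tpr_at_1pct_fpr": 0.09375, "tpr_at_0_1pct_fpr": 0.0625}

# Deltas as recorded under "comparison" in the same artifact.
recorded_minus_raw = {"auc": 0.044937, "asr": 0.044922,
                      "tpr_at_1pct_fpr": 0.171875, "tpr_at_0_1pct_fpr": 0.109375}
recorded_minus_lowpass = {"auc": 0.059494, "asr": 0.068359,
                          "tpr_at_1pct_fpr": 0.191406, "tpr_at_0_1pct_fpr": 0.046875}


def metric_deltas(candidate, baseline):
    """Per-metric difference candidate - baseline, rounded to six decimals."""
    return {name: round(candidate[name] - baseline[name], 6) for name in candidate}


def matches(computed, recorded, tol=1e-6):
    """True when every recorded delta agrees with the recomputed one within tol."""
    return all(abs(computed[name] - recorded[name]) <= tol for name in recorded)


deltas_raw = metric_deltas(output_cloud, raw_h2)
deltas_lowpass = metric_deltas(output_cloud, lowpass_h2)
print("raw deltas consistent:", matches(deltas_raw, recorded_minus_raw))
print("lowpass deltas consistent:", matches(deltas_lowpass, recorded_minus_lowpass))
```

The check only confirms that the artifact is internally consistent. It says nothing about how the underlying scores were produced.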

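Reviewer note on the strict-tail metrics: this patch does not show how `TPR@1%FPR` and `TPR@0.1%FPR` are computed, so the following is only a minimal sketch of one common convention. It assumes higher scores mean "more member-like" and that ties are not flagged. The function names `allowed_false_positives` and `tpr_at_fpr` are hypothetical. Under this convention, 256 nonmembers allow at most two false positives at 1% FPR and none at 0.1% FPR, which helps explain why the tail estimates carry wide intervals.

```python
import math


def allowed_false_positives(nonmember_count, target_fpr):
    """Largest false-positive count whose rate stays at or below target_fpr."""
    # The small epsilon guards against float products such as 28.999999999999996.
    return math.floor(target_fpr * nonmember_count + 1e-9)


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """TPR at the most permissive threshold whose FPR does not exceed target_fpr."""
    budget = allowed_false_positives(len(nonmember_scores), target_fpr)
    ranked = sorted(nonmember_scores, reverse=True)
    if budget >= len(ranked):
        return 1.0  # every nonmember may be flagged, so every member is too
    threshold = ranked[budget]  # only scores strictly above this are flagged
    flagged = sum(1 for score in member_scores if score > threshold)
    return flagged / len(member_scores)


# Toy scores for illustration only; they are not taken from any cache.
members = [0.9, 0.8, 0.7, 0.4]
nonmembers = [0.85, 0.5, 0.3, 0.2]
print(tpr_at_fpr(members, nonmembers, 0.25))  # one false positive allowed
print(tpr_at_fpr(members, nonmembers, 0.0))   # no false positives allowed
print(allowed_false_positives(256, 0.01), allowed_false_positives(256, 0.001))
```

If the repository interpolates the ROC curve or resolves ties differently, its values will differ from this sketch at small sample sizes.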
diff --git a/.gitignore b/.gitignore
index 63fd119d..f0ad526a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -147,6 +147,8 @@ workspaces/**/artifacts/**
 !workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json
 !workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-256-20260525.json
 !workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-256-label-shuffle-20260525.json
+!workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-20260525.json
+!workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-label-shuffle-20260525.json
 !workspaces/black-box/artifacts/h2-output-cloud-geometry-class-ordered-subset-256-20260525.json
 !workspaces/black-box/artifacts/h2-output-cloud-geometry-class-ordered-subset-256-label-shuffle-20260525.json
 !workspaces/black-box/artifacts/beans-lora-member-denoising-loss-scout-20260513.json
diff --git a/AGENTS.md b/AGENTS.md
index 1d833ab9..e25fa8f1 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -28,7 +28,7 @@ Do not start from memory or old chat context. Re-anchor on repository files.
 
 ## Current Operating State
 
-- Active work: `2026-05-25 H2 output-cloud geometry is the latest metric verdict. It is a strong Research-side candidate on the existing H2 response cache (seed 176 logistic AUC = 0.961529, TPR@1%FPR = 0.333984, TPR@0.1%FPR = 0.117188; seed 177 AUC = 0.961048; label-shuffle AUC = 0.507595). The bounded 256/256 shared-position order-control scout preserved the signal (AUC = 0.967819, TPR@1%FPR = 0.410156, TPR@0.1%FPR = 0.132812; label-shuffle AUC = 0.464066), so class-ordered seed offset is not a sufficient explanation. It is still not admitted because this remains Research-side H2 response-cache geometry, not a second public asset or Platform/Runtime contract. Do not create Platform/Runtime schema, bundle export, UI type, runner, KDE/shadow-density/repeat-count sweeps, same-cache feature sweeps, or a full 512/512 rerun just to complete a table. active_gpu_question = none; next_gpu_candidate = none; CPU sidecar = none selected after H2 output-cloud order-control scout. Feature-packet consumer lane remains deferred. LeakyCLIP remains CLIP / multimodal privacy watch-plus. ReDiffuse DDPM/STL-10 remains closed by default after the weak bounded scout (AUC = 0.4996337890625) and weak SimA-style score-norm scorer (AUC = 0.5052947998046875).`
+- Active work: `2026-05-25 H2 output-cloud geometry is the latest metric verdict. It is a strong Research-side candidate on the existing H2 response cache (seed 176 logistic AUC = 0.961529, TPR@1%FPR = 0.333984, TPR@0.1%FPR = 0.117188; seed 177 AUC = 0.961048; label-shuffle AUC = 0.507595). The bounded 256/256 shared-position order-control scout preserved the signal (AUC = 0.967819, TPR@1%FPR = 0.410156, TPR@0.1%FPR = 0.132812; label-shuffle AUC = 0.464066), so class-ordered seed offset is not a sufficient explanation. The same controlled boundary at seed 177 remains strong (AUC = 0.956192, TPR@1%FPR = 0.285156, TPR@0.1%FPR = 0.109375; label-shuffle AUC = 0.484070), so the controlled signal is not single-seed. It is still not admitted because this remains Research-side H2 response-cache geometry, not a second public asset or Platform/Runtime contract. Do not create Platform/Runtime schema, bundle export, UI type, runner, KDE/shadow-density/repeat-count sweeps, same-cache feature sweeps, or a full 512/512 rerun just to complete a table. active_gpu_question = none; next_gpu_candidate = none; CPU sidecar = none selected after H2 output-cloud order-control seed-stability scout. Feature-packet consumer lane remains deferred. LeakyCLIP remains CLIP / multimodal privacy watch-plus. ReDiffuse DDPM/STL-10 remains closed by default after the weak bounded scout (AUC = 0.4996337890625) and weak SimA-style score-norm scorer (AUC = 0.5052947998046875).`
 - Next GPU candidate: none selected
 - Long-horizon control: follow `ROADMAP.md` section
   `Long-Horizon Research Task Board（2026-05-13 起）` before reopening any
diff --git a/ROADMAP.md b/ROADMAP.md
index 45421444..22359198 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -5,11 +5,13 @@
 ## 2026-05-25 H2 output-cloud geometry 候选信号
 
 最新决策：H2 response-strength 的 output-cloud geometry 是 Research-side 强候选，
-并且已通过一个有界 `256 / 256` shared-position order-control scout；但它仍不晋升、
+并且已通过有界 `256 / 256` shared-position order-control 和 seed-stability scout；
+但它仍不晋升、
 不释放产品消费、不扩展同 cache 特征工程，也不默认补跑完整 `512 / 512`
 shared-position。第一轮复查读取既有
 `workspaces/black-box/runs/h2-response-strength-512-20260501-r1/response-cache.npz`；
-控制轮生成了本地 `256 / 256` shared-position cache，没有下载资产。
+控制轮生成了本地 `256 / 256` shared-position cache，稳定性轮只把 seed 从 `176`
+改成 `177`，没有下载资产。
 
 该 scorer 刻意排除 seed-to-output distance，只使用同 timestep repeat 间 RMSE、
 不同 timestep centroid RMSE 和 response-cloud Gram/PCA 特征。主结果为
@@ -27,12 +29,15 @@ logistic 仍为 `AUC = 0.967819`，`ASR = 0.923828`，
 回到随机级 `AUC = 0.464066`。同尺寸旧 class-ordered subset 为
 `AUC = 0.967438`，`TPR@1%FPR = 0.179688`，
 `TPR@0.1%FPR = 0.105469`。因此 class-ordered seed offset 不再是该强信号的充分解释。
+The same-boundary seed `177` shared-position scout keeps the strong signal: output-cloud logistic
+`AUC = 0.956192`, `ASR = 0.896484`, `TPR@1%FPR = 0.285156`,
+`TPR@0.1%FPR = 0.109375`; label shuffle returns to random-level `AUC = 0.484070`.
 
 该结果只能作为 Research-side 强候选；下一步不是同 cache sweep，也不是为了补表格跑
 完整 `512 / 512` shared-position。重新打开只应基于正式机制晋升、第二公开资产或独立消费合约。
 当前 slots 仍为：
 `active_gpu_question = none`，`next_gpu_candidate = none`，
-`CPU sidecar = none selected after H2 output-cloud order-control scout`。
+`CPU sidecar = none selected after H2 output-cloud order-control seed-stability scout`。
 See
 [docs/evidence/h2-output-cloud-geometry-20260525.md](docs/evidence/h2-output-cloud-geometry-20260525.md)。
 
diff --git a/docs/evidence/h2-output-cloud-geometry-20260525.md b/docs/evidence/h2-output-cloud-geometry-20260525.md
index 9487f8a6..2c8e20c0 100644
--- a/docs/evidence/h2-output-cloud-geometry-20260525.md
+++ b/docs/evidence/h2-output-cloud-geometry-20260525.md
@@ -1,7 +1,7 @@
 # H2 Output-Cloud Geometry Cache Review
 
 > Date: 2026-05-25
-> Status: candidate complementary signal / order-control scout passed / no admitted row / no 512/512 rerun selected
+> Status: candidate complementary signal / order-control scout passed / shared-position seed-stable / no admitted row / no 512/512 rerun selected
 
 ## Question
 
@@ -11,8 +11,9 @@ seed-to-output distance 的 membership 信号？
 第一轮只复用现有
 `workspaces/black-box/runs/h2-response-strength-512-20260501-r1/response-cache.npz`。
 随后只释放一个有界 `256 / 256` shared-position order-control scout，用来回答
-class-ordered seed-offset caveat。没有下载资产，也没有扩展同一路线的 KDE、shadow
-density、repeat-count 或特征 sweep。
+class-ordered seed-offset caveat；再释放一个同边界的 seed `177` 稳定性 scout，
+用来判断 order-control 后的强信号是否只是单 seed 现象。没有下载资产，也没有扩展
+同一路线的 KDE、shadow density、repeat-count 或特征 sweep。
 
 ## Contract
 
@@ -159,9 +160,51 @@ the signal. The result still does not imply product admission: it is one
 controlled scout on H2 DDPM/CIFAR10 response-cache geometry, not a second
 public asset or Platform/Runtime contract.
 
+## Shared-Position Seed-177 Stability Scout
+
+To avoid resting the order-control conclusion on a single random seed, the next step
+only runs the same-boundary `256 / 256` shared-position scout at seed `177`. The run
+boundary is not widened: timesteps `40 / 80 / 120 / 160`, repeats `2`, holdout repeats `7`,
+bootstrap iters `100`. The GPU scout took `185.470864s`.
+
+H2 distance scorer from the runner summary:
+
+| Metric | Raw H2 logistic | Lowpass H2 logistic |
+| --- | ---: | ---: |
+| AUC | `0.911255` | `0.896698` |
+| ASR | `0.851562` | `0.828125` |
+| TPR@1%FPR | `0.113281` | `0.093750` |
+| TPR@0.1%FPR | `0.000000` | `0.062500` |
+
+Output-cloud geometry review:
+`workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-20260525.json`
+
+| Metric | Shared-position seed `177` |
+| --- | ---: |
+| AUC | `0.956192` |
+| ASR | `0.896484` |
+| TPR@1%FPR | `0.285156` |
+| TPR@0.1%FPR | `0.109375` |
+
+Label-shuffle sanity:
+`workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-label-shuffle-20260525.json`
+
+| Metric | Seed `177` label shuffle |
+| --- | ---: |
+| AUC | `0.484070` |
+| ASR | `0.513672` |
+| TPR@1%FPR | `0.023438` |
+| TPR@0.1%FPR | `0.011719` |
+
+Interpretation: the shared-position output-cloud signal remains strong under seed `177`
+(AUC 95% interval `0.941745` to `0.970779`), and label shuffle stays random-level
+(AUC 95% interval `0.442889` to `0.531990`, which contains `0.5`). Together with seed `176`,
+this supports the narrower conclusion that output-cloud geometry is a stable H2 mechanism
+candidate after seed-offset control. It still does not create a second public asset, a product contract, or an admitted row.
+
 ## Decision
 
-`candidate complementary signal / order-control scout passed / no admitted row`。
+`candidate complementary signal / order-control scout passed / seed-stable / no admitted row`。
 
 保留为 Research-side 强候选，因为它满足三个有价值条件：
 
@@ -169,6 +212,7 @@ public asset or Platform/Runtime contract.
 - 它在同一 H2 cache 上明显强于 raw/lowpass H2 logistic。
 - 它通过了 seed-177 稳定性和 label-shuffle sanity。
 - 它在 `256 / 256` shared-position order-control scout 中没有因 seed-offset 控制而坍塌。
+- In the shared-position seed `177` scout it still keeps a strong AUC and nonzero strict-tail recovery.
 
 但当前不做以下事情：
 
diff --git a/docs/evidence/reproduction-status.md b/docs/evidence/reproduction-status.md
index cb9e5edb..8773324a 100644
--- a/docs/evidence/reproduction-status.md
+++ b/docs/evidence/reproduction-status.md
@@ -32,7 +32,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims.
 | Track | Status | Notes |
 | --- | --- | --- |
 | Black-box `recon` | `evidence-ready` | Strongest black-box method and admitted non-CLiD product row. Public data limits strict paper-aligned claims. The bounded public-100 step30 rerun plus unified artifact summary yields the promoted coherent packet: `AUC = 0.837`, `ASR = 0.74`, `TPR@1%FPR = 0.22`, `TPR@0.1%FPR = 0.11`. See [non-clid-black-box-reselection.md](non-clid-black-box-reselection.md), [recon-product-validation-contract.md](recon-product-validation-contract.md), [recon-product-validation-result.md](recon-product-validation-result.md), and [../product-bridge/recon-product-validation-handoff.md](../product-bridge/recon-product-validation-handoff.md). |
-| Black-box `H2 output-cloud geometry` | `hold-candidate` | Strong Research-side output-output geometry signal on H2 response caches, but not an admitted Platform/Runtime row. Existing `512 / 512` cache review gives `AUC = 0.961529`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`; seed `177` is stable and label shuffle is random-level. The `256 / 256` shared-position order-control scout preserves the signal (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`, `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`), so class-ordered seed offset is not a sufficient explanation. Do not promote, add schema/runner/UI/bundle rows, run same-cache feature sweeps, or schedule a full `512 / 512` rerun by default. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md). |
+| Black-box `H2 output-cloud geometry` | `hold-candidate` | Strong Research-side output-output geometry signal on H2 response caches, but not an admitted Platform/Runtime row. Existing `512 / 512` cache review gives `AUC = 0.961529`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`; seed `177` is stable and label shuffle is random-level. The `256 / 256` shared-position order-control scout preserves the signal (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`, `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`), so class-ordered seed offset is not a sufficient explanation. The same controlled boundary at seed `177` remains strong (`AUC = 0.956192`, `TPR@1%FPR = 0.285156`, `TPR@0.1%FPR = 0.109375`) with random-level label shuffle (`AUC = 0.484070`), so the controlled signal is not single-seed. Do not promote, add schema/runner/UI/bundle rows, run same-cache feature sweeps, or schedule a full `512 / 512` rerun by default. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md). |
 | Black-box `CLiD` | `hold-candidate` | Selected as a bounded black-box lane after H2 SD/CelebA text-to-image transfer was protocol-blocked. The official CPU `inter_output/*` replay is strong (`AUC = 0.961277`, `TPR@1%FPR = 0.675470`, `ASR = 0.891957`) and now has a machine-readable candidate-only card, but row identity remains blocked because the public score rows are numeric-only and the 2026-05-15 authenticated HF `mia_COCO.zip` `HEAD`/`Range` recheck still returned `403`. Earlier local prompt-conditioned packets were strong and repeat-stable, but prompt-neutral perturbation collapses the signal, swapped-prompt control is degraded, within-split prompt shuffle is weak and seed-sensitive, prompt-text-only review is moderate AUC but weak strict-tail, and control attribution shows auxiliary-feature instability under prompt controls. Current evidence supports a prompt-conditioned diagnostic claim only, not admitted general black-box evidence. No next CLiD GPU task is selected. See [../product-bridge/clid-candidate-evidence-card.md](../product-bridge/clid-candidate-evidence-card.md), [clid-official-inter-output-replay-20260515.md](clid-official-inter-output-replay-20260515.md), [clid-identity-manifest-gate-20260515.md](clid-identity-manifest-gate-20260515.md), [black-box-next-lane-selection.md](black-box-next-lane-selection.md), [clid-bridge-contract.md](clid-bridge-contract.md), [clid-score-schema-gate.md](clid-score-schema-gate.md), [clid-tiny-score-bridge.md](clid-tiny-score-bridge.md), [clid-100-score-packet.md](clid-100-score-packet.md), [clid-candidate-integrity-review.md](clid-candidate-integrity-review.md), [clid-repeat-stability.md](clid-repeat-stability.md), [clid-prompt-perturbation.md](clid-prompt-perturbation.md), [clid-prompt-conditioning-boundary.md](clid-prompt-conditioning-boundary.md), [clid-swapped-prompt-control.md](clid-swapped-prompt-control.md), [clid-within-split-shuffle-control.md](clid-within-split-shuffle-control.md), 
[clid-prompt-text-only-review.md](clid-prompt-text-only-review.md), and [clid-control-attribution.md](clid-control-attribution.md). |
 | Black-box `variation` | `code-ready` | API-only support method; needs real query data for stronger claims. |
 | Feature-packet consumer lane | `deferred-candidate` | 2026-05-25 consumer verdict keeps the gray-box feature-packet lane out of Platform/Runtime. Tracing the Roots remains positive Research evidence (`AUC = 0.815826`, `TPR@1%FPR = 0.134000`), but live narrow public-surface recheck found no second non-source-equivalent public feature-packet and no raw checkpoint/sample/regeneration assets. Do not add feature-packet schema, bundle export, validators, tests, Platform UI type, Runtime runner, GPU task, or download from this singleton. See [feature-packet-channel-consumer-verdict-20260525.md](feature-packet-channel-consumer-verdict-20260525.md) and [../product-bridge/feature-packet-lane.md](../product-bridge/feature-packet-lane.md). |
diff --git a/docs/evidence/workspace-evidence-index.md b/docs/evidence/workspace-evidence-index.md
index 685fcb71..43d7d160 100644
--- a/docs/evidence/workspace-evidence-index.md
+++ b/docs/evidence/workspace-evidence-index.md
@@ -6,8 +6,8 @@ This index separates current track state from archived research history.
 
 Latest Research update:
 [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md)
-records a metric verdict on the H2 response-strength cache plus a bounded
-`256 / 256` shared-position order-control scout.
+records a metric verdict on the H2 response-strength cache plus bounded
+`256 / 256` shared-position order-control and seed-stability scouts.
 The output-output geometry scorer is a strong Research-side candidate
 (`AUC = 0.961529`, `TPR@1%FPR = 0.333984`,
 `TPR@0.1%FPR = 0.117188`) and is stable under seed `177`
@@ -15,9 +15,12 @@ The output-output geometry scorer is a strong Research-side candidate
 (`AUC = 0.507595`). The shared-position order-control scout also stays strong
 (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`,
 `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`).
+The same controlled boundary at seed `177` remains strong (`AUC = 0.956192`,
+`TPR@1%FPR = 0.285156`, `TPR@0.1%FPR = 0.109375`) with random-level label
+shuffle (`AUC = 0.484070`).
 It is not admitted because this remains a Research-side H2 response-cache
 geometry candidate, not a second public asset or Platform/Runtime contract.
-Decision: `candidate complementary signal / order-control scout passed /
+Decision: `candidate complementary signal / order-control scout passed / seed-stable /
 no admitted row / no download / no 512/512 rerun selected`.
 
 Previous Research update:
diff --git a/workspaces/black-box/README.md b/workspaces/black-box/README.md
index 6f409072..686c50d7 100644
--- a/workspaces/black-box/README.md
+++ b/workspaces/black-box/README.md
@@ -11,8 +11,11 @@
   label-shuffle sanity 回到随机级。后续 `256 / 256` shared-position
   order-control scout 仍为 `AUC = 0.967819`、`TPR@1%FPR = 0.410156`、
   `TPR@0.1%FPR = 0.132812`，label-shuffle `AUC = 0.464066`，因此
-  class-ordered seed offset 不是充分解释。但它仍只是 Research-side H2
-  response-cache geometry 候选，不是第二公开资产或产品合约。
+  class-ordered seed offset 不是充分解释。同边界 seed `177` shared-position
+  scout 仍为 `AUC = 0.956192`、`TPR@1%FPR = 0.285156`、
+  `TPR@0.1%FPR = 0.109375`，label-shuffle `AUC = 0.484070`，说明该候选
+  在 order-control 后不是单 seed 现象。但它仍只是 Research-side H2 response-cache
+  geometry 候选，不是第二公开资产或产品合约。
   不要把它扩成 KDE、shadow density、repeat-count 或同 cache feature sweep；
   不要补跑完整 `512 / 512` 只为表格好看；不要新增 Platform/Runtime schema、
   runner 或 admitted bundle row。
diff --git a/workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-20260525.json b/workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-20260525.json
new file mode 100644
index 00000000..9b51ebfc
--- /dev/null
+++ b/workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-20260525.json
@@ -0,0 +1,166 @@
+{
+  "status": "ready",
+  "track": "black-box",
+  "method": "H2 output-cloud geometry scorer",
+  "mode": "cpu-cache-review",
+  "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-seed177-20260525-r1\\response-cache.npz",
+  "inputs": {
+    "sample_count": 512,
+    "member_count": 256,
+    "nonmember_count": 256,
+    "timesteps": [
+      40,
+      80,
+      120,
+      160
+    ],
+    "repeat_count": 2,
+    "feature_count": 17,
+    "feature_names": [
+      "within_timestep_pair_rmse_40",
+      "within_timestep_pair_rmse_80",
+      "within_timestep_pair_rmse_120",
+      "within_timestep_pair_rmse_160",
+      "within_timestep_pair_rmse_mean",
+      "within_timestep_pair_rmse_std",
+      "within_timestep_pair_rmse_slope",
+      "centroid_rmse_40_80",
+      "centroid_rmse_40_120",
+      "centroid_rmse_40_160",
+      "centroid_rmse_80_120",
+      "centroid_rmse_80_160",
+      "centroid_rmse_120_160",
+      "centroid_rmse_mean",
+      "centroid_rmse_std",
+      "cloud_pca_trace",
+      "cloud_pca_top_share"
+    ],
+    "seed": 177,
+    "label_mode": "original",
+    "holdout_repeats": 7,
+    "bootstrap_iters": 100
+  },
+  "simple": {
+    "best_by_auc": {
+      "name": "centroid_rmse_40_160",
+      "orientation": "negative_higher_is_member",
+      "metrics": {
+        "auc": 0.794266,
+        "asr": 0.755859,
+        "tpr_at_1pct_fpr": 0.015625,
+        "tpr_at_0_1pct_fpr": 0.007812,
+        "member_score_mean": -0.035444,
+        "nonmember_score_mean": -0.045246
+      }
+    },
+    "best_by_low_fpr": {
+      "name": "cloud_pca_top_share",
+      "orientation": "negative_higher_is_member",
+      "metrics": {
+        "auc": 0.626831,
+        "asr": 0.619141,
+        "tpr_at_1pct_fpr": 0.058594,
+        "tpr_at_0_1pct_fpr": 0.042969,
+        "member_score_mean": -0.264214,
+        "nonmember_score_mean": -0.273958
+      }
+    }
+  },
+  "logistic": {
+    "aggregate_metrics": {
+      "auc": 0.956192,
+      "asr": 0.896484,
+      "tpr_at_1pct_fpr": 0.285156,
+      "tpr_at_0_1pct_fpr": 0.109375,
+      "member_score_mean": 0.797283,
+      "nonmember_score_mean": 0.200697
+    },
+    "aggregate_ci95": {
+      "auc": {
+        "p025": 0.941745,
+        "p975": 0.970779
+      },
+      "asr": {
+        "p025": 0.877881,
+        "p975": 0.917041
+      },
+      "tpr_at_1pct_fpr": {
+        "p025": 0.116406,
+        "p975": 0.678613
+      },
+      "tpr_at_0_1pct_fpr": {
+        "p025": 0.07793,
+        "p975": 0.364649
+      }
+    },
+    "mean_coefficients": [
+      2.492539,
+      0.229399,
+      0.14844,
+      0.910024,
+      0.794222,
+      -0.250528,
+      -0.167823,
+      -0.138342,
+      -3.268879,
+      -3.724328,
+      0.825758,
+      0.516866,
+      2.113911,
+      -0.684683,
+      -0.41095,
+      -0.702947,
+      -0.039379
+    ],
+    "prediction_count": {
+      "min": 7,
+      "max": 7,
+      "mean": 7.0
+    }
+  },
+  "comparison": {
+    "raw_h2_logistic": {
+      "auc": 0.911255,
+      "asr": 0.851562,
+      "tpr_at_1pct_fpr": 0.113281,
+      "tpr_at_0_1pct_fpr": 0.0,
+      "member_score_mean": 0.737363,
+      "nonmember_score_mean": 0.263165
+    },
+    "lowpass_h2_logistic": {
+      "auc": 0.896698,
+      "asr": 0.828125,
+      "tpr_at_1pct_fpr": 0.09375,
+      "tpr_at_0_1pct_fpr": 0.0625,
+      "member_score_mean": 0.727362,
+      "nonmember_score_mean": 0.273117
+    },
+    "output_cloud_minus_raw_h2": {
+      "auc": 0.044937,
+      "asr": 0.044922,
+      "tpr_at_1pct_fpr": 0.171875,
+      "tpr_at_0_1pct_fpr": 0.109375
+    },
+    "output_cloud_minus_lowpass_h2": {
+      "auc": 0.059494,
+      "asr": 0.068359,
+      "tpr_at_1pct_fpr": 0.191406,
+      "tpr_at_0_1pct_fpr": 0.046875
+    }
+  },
+  "decision_gate": {
+    "uses_only_output_output_geometry": true,
+    "does_not_generate_new_responses": true,
+    "nonzero_strict_tail": true,
+    "beats_best_simple_low_fpr": true,
+    "reopen_allowed": false,
+    "requires_reseeded_or_interleaved_cache_before_promotion": true
+  },
+  "verdict": "candidate_complementary_output_cloud_geometry",
+  "notes": [
+    "This is a CPU-only scorer review on an existing H2 response cache.",
+    "It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.",
+    "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.",
+    "Do not expand this cache into KDE, shadow density, repeat-count, or same-cache feature sweeps."
+  ]
+}
\ No newline at end of file
diff --git a/workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-label-shuffle-20260525.json b/workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-label-shuffle-20260525.json
new file mode 100644
index 00000000..d391cb21
--- /dev/null
+++ b/workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-seed177-256-label-shuffle-20260525.json
@@ -0,0 +1,142 @@
+{
+  "status": "ready",
+  "track": "black-box",
+  "method": "H2 output-cloud geometry scorer",
+  "mode": "cpu-cache-review",
+  "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-seed177-20260525-r1\\response-cache.npz",
+  "inputs": {
+    "sample_count": 512,
+    "member_count": 256,
+    "nonmember_count": 256,
+    "timesteps": [
+      40,
+      80,
+      120,
+      160
+    ],
+    "repeat_count": 2,
+    "feature_count": 17,
+    "feature_names": [
+      "within_timestep_pair_rmse_40",
+      "within_timestep_pair_rmse_80",
+      "within_timestep_pair_rmse_120",
+      "within_timestep_pair_rmse_160",
+      "within_timestep_pair_rmse_mean",
+      "within_timestep_pair_rmse_std",
+      "within_timestep_pair_rmse_slope",
+      "centroid_rmse_40_80",
+      "centroid_rmse_40_120",
+      "centroid_rmse_40_160",
+      "centroid_rmse_80_120",
+      "centroid_rmse_80_160",
+      "centroid_rmse_120_160",
+      "centroid_rmse_mean",
+      "centroid_rmse_std",
+      "cloud_pca_trace",
+      "cloud_pca_top_share"
+    ],
+    "seed": 177,
+    "label_mode": "shuffled_seed_177",
+    "holdout_repeats": 7,
+    "bootstrap_iters": 100
+  },
+  "simple": {
+    "best_by_auc": {
+      "name": "cloud_pca_top_share",
+      "orientation": "negative_higher_is_member",
+      "metrics": {
+        "auc": 0.518509,
+        "asr": 0.527344,
+        "tpr_at_1pct_fpr": 0.0,
+        "tpr_at_0_1pct_fpr": 0.0,
+        "member_score_mean": -0.268277,
+        "nonmember_score_mean": -0.269895
+      }
+    },
+    "best_by_low_fpr": {
+      "name": "centroid_rmse_40_160",
+      "orientation": "negative_higher_is_member",
+      "metrics": {
+        "auc": 0.515167,
+        "asr": 0.544922,
+        "tpr_at_1pct_fpr": 0.023438,
+        "tpr_at_0_1pct_fpr": 0.003906,
+        "member_score_mean": -0.040165,
+        "nonmember_score_mean": -0.040525
+      }
+    }
+  },
+  "logistic": {
+    "aggregate_metrics": {
+      "auc": 0.48407,
+      "asr": 0.513672,
+      "tpr_at_1pct_fpr": 0.023438,
+      "tpr_at_0_1pct_fpr": 0.011719,
+      "member_score_mean": 0.500029,
+      "nonmember_score_mean": 0.50166
+    },
+    "aggregate_ci95": {
+      "auc": {
+        "p025": 0.442889,
+        "p975": 0.53199
+      },
+      "asr": {
+        "p025": 0.511719,
+        "p975": 0.554785
+      },
+      "tpr_at_1pct_fpr": {
+        "p025": 0.003906,
+        "p975": 0.046875
+      },
+      "tpr_at_0_1pct_fpr": {
+        "p025": 0.003906,
+        "p975": 0.035156
+      }
+    },
+    "mean_coefficients": [
+      -0.17842,
+      -0.255633,
+      0.287435,
+      -0.2456,
+      -0.087512,
+      0.110718,
+      -0.05195,
+      -0.018815,
+      -0.308328,
+      0.260794,
+      -0.079668,
+      -0.443812,
+      0.404135,
+      -0.018786,
+      -0.248044,
+      0.792548,
+      -0.037794
+    ],
+    "prediction_count": {
+      "min": 7,
+      "max": 7,
+      "mean": 7.0
+    }
+  },
+  "comparison": {
+    "raw_h2_logistic": null,
+    "lowpass_h2_logistic": null,
+    "output_cloud_minus_raw_h2": null,
+    "output_cloud_minus_lowpass_h2": null
+  },
+  "decision_gate": {
+    "uses_only_output_output_geometry": true,
+    "does_not_generate_new_responses": true,
+    "nonzero_strict_tail": true,
+    "beats_best_simple_low_fpr": false,
+    "reopen_allowed": false,
+    "requires_reseeded_or_interleaved_cache_before_promotion": true
+  },
+  "verdict": "label_shuffle_sanity_random_level",
+  "notes": [
+    "This is a CPU-only scorer review on an existing H2 response cache.",
+    "It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.",
+    "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.",
+    "Do not expand this cache into KDE, shadow density, repeat-count, or same-cache feature sweeps."
+  ]
+}
\ No newline at end of file
diff --git a/workspaces/black-box/plan.md b/workspaces/black-box/plan.md
index 05d420f4..e42a8243 100644
--- a/workspaces/black-box/plan.md
+++ b/workspaces/black-box/plan.md
@@ -42,7 +42,10 @@
   sanity check. A bounded `256 / 256` shared-position order-control scout
   preserved the signal (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`,
   `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`),
-  so class-ordered seed offset is not a sufficient explanation. It remains
+  so class-ordered seed offset is not a sufficient explanation. The same
+  boundary at seed `177` remains strong (`AUC = 0.956192`,
+  `TPR@1%FPR = 0.285156`, `TPR@0.1%FPR = 0.109375`) with random-level label
+  shuffle (`AUC = 0.484070`), so it is not a single-seed artifact. It remains
   candidate-only: do not promote it into Platform or Runtime runners from this
   cache family.
 - `simple image-to-image distance`: bounded single-asset evidence on
@@ -78,8 +81,8 @@
 ## Next Action
 
 No black-box GPU or CPU sidecar is selected. The H2 output-cloud order-control
-scout has answered the seed-offset caveat at decision value; do not turn the
-candidate into same-cache feature work or a full `512 / 512` rerun just to
+and seed-stability scouts have answered the current caveats at decision value;
+do not turn the candidate into same-cache feature work or a full `512 / 512` rerun just to
 complete a table. The broader root long-horizon queue
 still continues Lane A only with a non-duplicate asset that has exact target
 identity, member/nonmember split artifacts, and response or score coverage.