| 1 | +# H2 Output-Cloud Geometry Cache Review |
| 2 | + |
| 3 | +> Date: 2026-05-25 |
| 4 | +> Status: candidate complementary signal / CPU-only cache review / order-control required before promotion / no GPU release / no admitted row |
| 5 | + |
| 6 | +## Question |
| 7 | + |
| 8 | +On the existing H2 response-strength cache, does the geometry among outputs carry a |
| 9 | +membership signal distinct from seed-to-output distance? |
| 10 | + |
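| 11 | +This round only reuses the existing |
| 12 | +`workspaces/black-box/runs/h2-response-strength-512-20260501-r1/response-cache.npz`. |
| 13 | +No new responses were generated, no assets were downloaded, no GPU was run, and no KDE, shadow |
| 14 | +density, repeat-count, or feature sweeps were added along the same route. |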
| 15 | + |
| 16 | +## Contract |
| 17 | + |
| 18 | +Script: |
| 19 | +`scripts/review_h2_output_cloud_geometry.py` |
| 20 | + |
| 21 | +Input cache: |
| 22 | + |
| 23 | +| Field | Value | |
| 24 | +| --- | ---: | |
| 25 | +| Samples | `1024` | |
| 26 | +| Members | `512` | |
| 27 | +| Nonmembers | `512` | |
| 28 | +| Timesteps | `40 / 80 / 120 / 160` | |
| 29 | +| Repeats per timestep | `2` | |
| 30 | +| Response shape | `[1024, 4, 2, 3, 32, 32]` | |
| 31 | + |
| 32 | +The features use only output-output geometry: |
| 33 | + |
| 34 | +| Feature family | Meaning | |
| 35 | +| --- | --- | |
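| 36 | +| within-timestep pair RMSE | Response distance between different repeats within the same timestep | |
| 37 | +| timestep centroid RMSE | Distance between the response-cloud centroids of different timesteps | |
| 38 | +| response-cloud PCA trace/top share | Scale and concentration of the Gram spectrum of the small response cloud | |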
| 39 | + |
| 40 | +The script deliberately does not read seed-to-output distance features, so it cannot degenerate into the original H2 simple |
| 41 | +distance scorer. |
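
The three feature families in the table above can be sketched as follows. This is a minimal illustration, not the contents of `scripts/review_h2_output_cloud_geometry.py`: the function name `output_cloud_features`, the key names `pair_rmse_*` and `cloud_pca_trace`, and the choice to pool all timesteps and repeats into one 8-point cloud for the PCA step are assumptions. Only the `[N, 4, 2, 3, 32, 32]` layout, the timestep values, and the names `centroid_rmse_40_160` and `cloud_pca_top_share` come from this review.

```python
import numpy as np

TIMESTEPS = (40, 80, 120, 160)  # timestep values from the cache contract


def output_cloud_features(responses: np.ndarray) -> dict[str, np.ndarray]:
    """Per-sample output-output geometry features (hypothetical sketch).

    responses has shape [N, T, R, C, H, W]; no seed-to-output distance is used.
    """
    n, t, r = responses.shape[:3]
    assert t == len(TIMESTEPS) and r == 2, "sketch assumes 4 timesteps, 2 repeats"
    flat = responses.reshape(n, t, r, -1).astype(np.float64)  # [N, T, R, D]
    feats: dict[str, np.ndarray] = {}

    # Within-timestep pair RMSE: with two repeats there is one pair per timestep.
    for ti, step in enumerate(TIMESTEPS):
        diff = flat[:, ti, 0] - flat[:, ti, 1]
        feats[f"pair_rmse_{step}"] = np.sqrt(np.mean(diff**2, axis=1))

    # Timestep centroid RMSE: distance between per-timestep mean responses.
    centroids = flat.mean(axis=2)  # [N, T, D]
    for a in range(t):
        for b in range(a + 1, t):
            diff = centroids[:, a] - centroids[:, b]
            name = f"centroid_rmse_{TIMESTEPS[a]}_{TIMESTEPS[b]}"
            feats[name] = np.sqrt(np.mean(diff**2, axis=1))

    # PCA trace / top share from the Gram matrix of the centered T*R-point cloud.
    cloud = flat.reshape(n, t * r, -1)
    centered = cloud - cloud.mean(axis=1, keepdims=True)
    gram = centered @ centered.transpose(0, 2, 1)        # [N, T*R, T*R]
    eig = np.clip(np.linalg.eigvalsh(gram), 0.0, None)   # ascending eigenvalues
    trace = eig.sum(axis=1)
    feats["cloud_pca_trace"] = trace
    feats["cloud_pca_top_share"] = eig[:, -1] / np.maximum(trace, 1e-12)
    return feats
```

Working on the small Gram matrix instead of the 3072-dimensional covariance gives the same nonzero spectrum at far lower cost, which is why a "small response cloud" can be summarized this cheaply on CPU.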
| 42 | + |
| 43 | +## Result |
| 44 | + |
| 45 | +Main result: |
| 46 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-20260525.json` |
| 47 | + |
| 48 | +| Metric | Output-cloud logistic | Raw H2 logistic | Lowpass H2 logistic | |
| 49 | +| --- | ---: | ---: | ---: | |
| 50 | +| AUC | `0.961529` | `0.905693` | `0.895679` | |
| 51 | +| ASR | `0.900391` | `0.841797` | `0.831055` | |
| 52 | +| TPR@1%FPR | `0.333984` | `0.134766` | `0.148438` | |
| 53 | +| TPR@0.1%FPR | `0.117188` | `0.0` | `0.025391` | |
| 54 | + |
| 55 | +Relative to raw H2: `AUC +0.055836`, `TPR@1%FPR +0.199218`, |
| 56 | +`TPR@0.1%FPR +0.117188`. |
| 57 | + |
| 58 | +Relative to lowpass H2: `AUC +0.065850`, `TPR@1%FPR +0.185546`, |
| 59 | +`TPR@0.1%FPR +0.091797`. |
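
For reading the low-FPR rows, a minimal sketch of AUC and TPR at a fixed FPR budget is given below. The function names and the tie and threshold conventions are assumptions; the review script may compute these metrics differently (for example, by interpolating the ROC curve).

```python
import numpy as np


def auc(scores, labels) -> float:
    """Rank-based AUC: P(member score > nonmember score), ties count half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = np.mean(pos[:, None] > neg[None, :])
    ties = np.mean(pos[:, None] == neg[None, :])
    return float(greater + 0.5 * ties)


def tpr_at_fpr(scores, labels, max_fpr: float) -> float:
    """TPR at the loosest threshold whose FPR does not exceed max_fpr."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    neg = np.sort(scores[~labels])[::-1]       # nonmember scores, descending
    k = int(np.floor(max_fpr * neg.size))      # false positives allowed
    # Predict "member" when score > thr; at most k nonmembers exceed thr.
    thr = neg[k] if k < neg.size else -np.inf
    return float(np.mean(scores[labels] > thr))
```

With 512 nonmembers, one false positive already costs 1/512 ≈ 0.00195 FPR, so under this convention the 0.1% budget allows zero false positives and `TPR@0.1%FPR` is the fraction of members scoring above every nonmember. The reported values are consistent with that granularity (for example, `0.117188 = 60/512`).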
| 60 | + |
| 61 | +No simple single feature explains this result: |
| 62 | + |
| 63 | +| Best simple view | Feature | Orientation | AUC | TPR@1%FPR | TPR@0.1%FPR | |
| 64 | +| --- | --- | --- | ---: | ---: | ---: | |
| 65 | +| Best AUC | `centroid_rmse_40_160` | negative | `0.801182` | `0.03125` | `0.005859` | |
| 66 | +| Best low-FPR | `cloud_pca_top_share` | negative | `0.650913` | `0.078125` | `0.017578` | |
| 67 | + |
| 68 | +Seed stability check: |
| 69 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-seed177-20260525.json` |
| 70 | + |
| 71 | +| Metric | Seed 177 | |
| 72 | +| --- | ---: | |
| 73 | +| AUC | `0.961048` | |
| 74 | +| ASR | `0.900391` | |
| 75 | +| TPR@1%FPR | `0.353516` | |
| 76 | +| TPR@0.1%FPR | `0.130859` | |
| 77 | + |
| 78 | +Label-shuffle sanity: |
| 79 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-label-shuffle-20260525.json` |
| 80 | + |
| 81 | +| Metric | Label shuffle | |
| 82 | +| --- | ---: | |
| 83 | +| AUC | `0.507595` | |
| 84 | +| ASR | `0.521484` | |
| 85 | +| TPR@1%FPR | `0.011719` | |
| 86 | +| TPR@0.1%FPR | `0.003906` | |
| 87 | + |
| 88 | +This indicates that the scorer/evaluation pipeline has no obvious direct label leakage. |
| 89 | + |
| 90 | +## Critical Caveat |
| 91 | + |
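| 92 | +This result still cannot be promoted. Response generation for the source cache has a class-ordered seed offset: |
| 93 | +in `scripts/run_h2_response_strength_validation.py`, the member side uses |
| 94 | +`sample_offset = 0` and the nonmember side uses `sample_offset = len(member_indices)`. |
| 95 | +Output-output geometry is sensitive to sampling seeds and response-cloud shape, so the current strong signal may be |
| 96 | +confounded by a class-ordered sampling effect. |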
| 97 | + |
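| 98 | +This is not a reason to keep adding tables on the same cache; it defines only one very narrow next step: |
| 99 | +if the work is to move forward, release at most one bounded order-control / reseeded / interleaved |
| 100 | +response-cache scout to determine whether the strong signal survives class-order control. |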
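
The confound and one possible control can be illustrated with a small sketch. It assumes, without confirmation from `scripts/run_h2_response_strength_validation.py`, that each sample's sampling-seed index is `sample_offset` plus its position within the class; the function names and the permutation-based interleaving are hypothetical.

```python
import numpy as np


def class_ordered_indices(n_members: int, n_nonmembers: int):
    """Seed indices under the assumed current scheme (offsets 0 and n_members)."""
    member = 0 + np.arange(n_members)
    nonmember = n_members + np.arange(n_nonmembers)
    return member, nonmember


def interleaved_indices(n_members: int, n_nonmembers: int, seed: int):
    """Hypothetical control: seed indices assigned by a class-blind permutation."""
    perm = np.random.default_rng(seed).permutation(n_members + n_nonmembers)
    return perm[:n_members], perm[n_members:]


def index_separability(member: np.ndarray, nonmember: np.ndarray) -> float:
    """P(member index < nonmember index); 0.5 means the index carries no label."""
    return float(np.mean(member[:, None] < nonmember[None, :]))
```

Under the assumed class-ordered scheme the seed index alone separates the classes perfectly (`index_separability` is `1.0`), whereas a class-blind permutation brings it close to `0.5`. A signal that persists in a cache built the second way cannot be attributed to seed ordering.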
| 101 | + |
| 102 | +## Decision |
| 103 | + |
| 104 | +`candidate complementary signal / order-control required / no admitted row`. |
| 105 | + |
| 106 | +It is kept as a strong Research-side candidate because it satisfies three valuable conditions: |
| 107 | + |
| 108 | +- It is a different observable: output-output cloud geometry rather than seed-to-output distance. |
| 109 | +- On the same H2 cache it is clearly stronger than the raw/lowpass H2 logistic. |
| 110 | +- It passed the seed-177 stability check and the label-shuffle sanity check. |
| 111 | + |
| 112 | +For now, however, the following are not done: |
| 113 | + |
| 114 | +- No promotion to the Platform/Runtime admitted bundle. |
| 115 | +- No new product schema, Runtime runner, UI type, or bundle row. |
| 116 | +- No KDE, shadow density, repeat-count, feature-family, or fusion sweeps on the same cache. |
| 117 | +- No GPU release or large downloads. |
| 118 | + |
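| 119 | +The next re-evaluation may only be based on the result of one order-control cache. If the reseeded / |
| 120 | +interleaved cache still shows strong AUC and strict-tail recovery, then discuss whether to open a more formal H2 |
| 121 | +output-cloud mechanism line; if it does not, the candidate is closed outright as a class-ordered response-cache |
| 122 | +artifact. |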
| 123 | + |
| 124 | +## Platform and Runtime Impact |
| 125 | + |
| 126 | +None. The admitted Platform/Runtime bundle remains the existing five rows: |
| 127 | +`recon`, `PIA baseline`, `PIA defended`, `GSA`, and `DPDM W-1`. |