|
1 | 1 | # H2 Output-Cloud Geometry Cache Review |
2 | 2 |
|
3 | 3 | > Date: 2026-05-25 |
4 | | -> Status: candidate complementary signal / CPU-only cache review / order-control required before promotion / no GPU release / no admitted row |
| 4 | +> Status: candidate complementary signal / order-control scout passed / no admitted row / no 512/512 rerun selected |
5 | 5 |
|
6 | 6 | ## Question |
7 | 7 |
|
8 | 8 | On the existing H2 response-strength cache, does the geometric structure among outputs carry a membership signal that differs from |
9 | 9 | seed-to-output distance? |
10 | 10 |
|
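11 | | -This round reuses only the existing |
| 11 | +The first round reuses only the existing |
12 | 12 | `workspaces/black-box/runs/h2-response-strength-512-20260501-r1/response-cache.npz`. |
13 | | -No new responses were generated, no assets were downloaded, no GPU was run, and the same route was not extended with KDE, shadow |
| 13 | +After that, only one bounded `256 / 256` shared-position order-control scout was released, to address the |
| 14 | +class-ordered seed-offset caveat. No assets were downloaded, and the same route was not extended with KDE, shadow |
14 | 15 | density, repeat-count, or feature sweeps. |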
15 | 16 |
|
16 | 17 | ## Contract |
@@ -87,46 +88,99 @@ Label-shuffle sanity: |
87 | 88 |
|
88 | 89 | This indicates that the scorer/evaluation pipeline has no obvious label pass-through leakage. |
89 | 90 |
|
90 | | -## Critical Caveat |
| 91 | +## Shared-Position Order-Control Scout |
91 | 92 |
|
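92 | | -This result still cannot be promoted. Response generation for the source cache has a class-ordered seed offset: |
93 | | -in `scripts/run_h2_response_strength_validation.py`, the member side uses |
94 | | -`sample_offset = 0` and the nonmember side uses `sample_offset = len(member_indices)`. |
95 | | -Output-output geometry is sensitive to sampling seeds and response-cloud shape, so the current strong signal may include a |
| 93 | +Response generation for the source `512 / 512` cache has a class-ordered seed offset: |
| 94 | +the historical default behavior of `scripts/run_h2_response_strength_validation.py` is member side |
| 95 | +`sample_offset = 0`, nonmember side `sample_offset = len(member_indices)`. |
| 96 | +Output-output geometry is sensitive to sampling seeds and response-cloud shape, so it is necessary to check whether the strong signal is merely a |
96 | 97 | class-ordered sampling effect. |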
97 | 98 |
|
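98 | | -This is not a reason to keep adding tables on the same cache; it defines only one very narrow next step: |
99 | | -if the work needs to move forward, release at most one bounded order-control / reseeded / interleaved |
100 | | -response-cache scout to determine whether the strong signal survives class-order control. |
101 | | - |
102 | | -The only script change currently allowed is the minimum needed to generate this control cache: |
| 99 | +This round adds only one narrow seed-policy control: |
103 | 100 | `scripts/run_h2_response_strength_validation.py --seed-offset-policy shared-position`. |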
104 | | -This mode makes member / nonmember use the same per-position seed offset and sets |
105 | | -`order_control_scout = true` in `summary.json`. It is used only to re-evaluate the |
106 | | -class-ordered sampling effect; it does not represent admission and must not directly produce a Platform / |
107 | | -Runtime row. |
| 101 | +This mode makes member / nonmember use the same per-position seed offset and sets |
| 102 | +`order_control_scout = true` in `summary.json`. The run bounds are |
| 103 | +`256 / 256`, timesteps `40 / 80 / 120 / 160`, repeats `2`, seed `176`, |
| 104 | +holdout repeats `7`, bootstrap iters `100`. The GPU scout took `208.866516s`. |
| 105 | + |
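
The difference between the two seed policies can be sketched as follows. This is a minimal illustration, not the code in `scripts/run_h2_response_strength_validation.py`: the function name `seed_offsets`, the policy string `class-ordered`, and the assumption that each position's seed offset equals `sample_offset + position` are all hypothetical.

```python
def seed_offsets(n_members, n_nonmembers, policy="class-ordered"):
    """Return (member_offsets, nonmember_offsets), one offset per position.

    Assumption: the per-position offset is `sample_offset + position`.
    """
    member = list(range(n_members))  # member side: sample_offset = 0
    if policy == "class-ordered":
        # Historical default: nonmember side starts at len(member_indices),
        # so the two classes never share a seed offset.
        nonmember = [n_members + i for i in range(n_nonmembers)]
    elif policy == "shared-position":
        # Order control: position i uses the same offset in both classes.
        nonmember = list(range(n_nonmembers))
    else:
        raise ValueError(f"unknown policy: {policy}")
    return member, nonmember


if __name__ == "__main__":
    print(seed_offsets(3, 3, "class-ordered"))
    print(seed_offsets(3, 3, "shared-position"))
```

Under the class-ordered policy, class membership and seed range are perfectly confounded, which is why a seed-sensitive observable needs the shared-position control before its signal can be attributed to membership.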
| 106 | +In the runner summary, the H2 distance scorer remains positive under shared-position but is weak in the tail: |
| 107 | + |
| 108 | +| Metric | Raw H2 logistic | Lowpass H2 logistic | |
| 109 | +| --- | ---: | ---: | |
| 110 | +| AUC | `0.906967` | `0.898102` | |
| 111 | +| ASR | `0.837891` | `0.828125` | |
| 112 | +| TPR@1%FPR | `0.058594` | `0.066406` | |
| 113 | +| TPR@0.1%FPR | `0.003906` | `0.003906` | |
| 114 | + |
| 115 | +Output-cloud geometry review on the same shared-position cache: |
| 116 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-256-20260525.json` |
| 117 | + |
| 118 | +| Metric | Shared-position `256 / 256` | |
| 119 | +| --- | ---: | |
| 120 | +| AUC | `0.967819` | |
| 121 | +| ASR | `0.923828` | |
| 122 | +| TPR@1%FPR | `0.410156` | |
| 123 | +| TPR@0.1%FPR | `0.132812` | |
| 124 | + |
| 125 | +Label-shuffle sanity for the shared-position cache: |
| 126 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-shared-position-256-label-shuffle-20260525.json` |
| 127 | + |
| 128 | +| Metric | Shared-position label shuffle | |
| 129 | +| --- | ---: | |
| 130 | +| AUC | `0.464066` | |
| 131 | +| ASR | `0.505859` | |
| 132 | +| TPR@1%FPR | `0.003906` | |
| 133 | +| TPR@0.1%FPR | `0.0` | |
| 134 | + |
| 135 | +Same-size historical class-ordered subset from the old cache: |
| 136 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-class-ordered-subset-256-20260525.json` |
| 137 | + |
| 138 | +| Metric | Class-ordered subset `256 / 256` | |
| 139 | +| --- | ---: | |
| 140 | +| AUC | `0.967438` | |
| 141 | +| ASR | `0.916016` | |
| 142 | +| TPR@1%FPR | `0.179688` | |
| 143 | +| TPR@0.1%FPR | `0.105469` | |
| 144 | + |
| 145 | +Class-ordered subset label shuffle: |
| 146 | +`workspaces/black-box/artifacts/h2-output-cloud-geometry-class-ordered-subset-256-label-shuffle-20260525.json` |
| 147 | + |
| 148 | +| Metric | Class-ordered subset label shuffle | |
| 149 | +| --- | ---: | |
| 150 | +| AUC | `0.427902` | |
| 151 | +| ASR | `0.5` | |
| 152 | +| TPR@1%FPR | `0.0` | |
| 153 | +| TPR@0.1%FPR | `0.0` | |
| 154 | + |
| 155 | +Interpretation: shared-position order-control did not collapse the output-cloud |
| 156 | +geometry signal, and its label-shuffle check falls to chance level. This removes |
| 157 | +the previous class-ordered seed-offset caveat as a sufficient explanation for |
| 158 | +the signal. The result still does not imply product admission: it is one |
| 159 | +controlled scout on H2 DDPM/CIFAR10 response-cache geometry, not a second |
| 160 | +public asset or Platform/Runtime contract. |
108 | 161 |
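
For reading the tail metrics in the tables above, the sketch below shows one common way to compute AUC and TPR at a fixed FPR from member and nonmember scores. The function names and the strict-threshold convention are assumptions; the evaluation code behind this review may differ. With 256 members and 256 nonmembers, a 0.1% FPR budget allows zero false positives, and the smallest nonzero TPR step is 1/256 ≈ 0.003906.

```python
import numpy as np


def auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random nonmember (ties count 0.5)."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float(np.mean((m > n) + 0.5 * (m == n)))


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """Fraction of members scoring strictly above a threshold chosen so that
    at most floor(target_fpr * n_nonmembers) nonmembers exceed it."""
    nonmember = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]
    k = int(np.floor(target_fpr * len(nonmember)))  # allowed false positives
    threshold = nonmember[k] if k < len(nonmember) else -np.inf
    return float(np.mean(np.asarray(member_scores, dtype=float) > threshold))


if __name__ == "__main__":
    members = [0.9, 0.8, 0.4]
    nonmembers = [0.5, 0.3, 0.1, 0.2]
    print(auc(members, nonmembers))
    print(tpr_at_fpr(members, nonmembers, 0.25))
```

The same two functions can be applied to a label-shuffled copy of the scores as a sanity check; AUC near 0.5 and near-zero tail TPR are what a pipeline without label leakage should produce.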
|
109 | 162 | ## Decision |
110 | 163 |
|
111 | | -`candidate complementary signal / order-control required / no admitted row`. |
| 164 | +`candidate complementary signal / order-control scout passed / no admitted row`. |
112 | 165 |
|
113 | 166 | Retained as a strong Research-side candidate because it satisfies three valuable conditions: |
114 | 167 |
|
115 | 168 | - It is a different observable: output-output cloud geometry rather than seed-to-output distance. |
116 | 169 | - On the same H2 cache it is clearly stronger than raw/lowpass H2 logistic. |
117 | 170 | - It passed the seed-177 stability check and the label-shuffle sanity check. |
| 171 | +- In the `256 / 256` shared-position order-control scout, it did not collapse under seed-offset control. |
118 | 172 |
|
119 | 173 | However, the following are not being done for now: |
120 | 174 |
|
121 | 175 | - No promotion to a Platform/Runtime admitted bundle. |
122 | 176 | - No new product schema, Runtime runner, UI type, or bundle row. |
123 | 177 | - No KDE, shadow density, repeat-count, feature-family, or fusion sweeps on the same cache. |
124 | | -- No GPU release or large downloads. |
| 178 | +- No full `512 / 512` shared-position GPU rerun or large downloads; the current `256 / 256` |
| 179 | + order-control scout has already answered the caveat that would have changed the route. |
125 | 180 |
|
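126 | | -The next re-evaluation is allowed only on the basis of results from one order-control cache. If the reseeded / |
127 | | -interleaved cache still retains strong AUC and strict-tail recovery, then discuss whether to move into a more formal H2 |
128 | | -output-cloud mechanism line; if it does not, close the candidate outright as a class-ordered response-cache |
129 | | -artifact. |
| 181 | +The next re-evaluation should not be a same-cache feature sweep or a `512 / 512` rerun done just to make the tables look better. |
| 182 | +Only when the mechanism line needs formal promotion, a second public asset is found, or an independent consumption contract is to be established should a |
| 183 | +higher-cost validation task be redefined. |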
130 | 184 |
|
131 | 185 | ## Platform and Runtime Impact |
132 | 186 |
|
|