
Commit 4cc562c

Record H2 img2img output-cloud portability boundary
Record the CPU-only img2img portability boundary for H2 output-cloud geometry. The existing SD/CelebA img2img admission cache is weak or unstable and the stability cache is not distinct from simple distance, so this remains Research-only and does not release Platform/Runtime work.
1 parent 50aaba6 commit 4cc562c

8 files changed

Lines changed: 842 additions & 8 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -152,6 +152,7 @@ workspaces/**/artifacts/**
152152
!workspaces/black-box/artifacts/h2-output-cloud-geometry-class-ordered-subset-256-20260525.json
153153
!workspaces/black-box/artifacts/h2-output-cloud-geometry-class-ordered-subset-256-label-shuffle-20260525.json
154154
!workspaces/black-box/artifacts/h2-output-cloud-transfer-shared-position-256-20260525.json
155+
!workspaces/black-box/artifacts/h2-img2img-output-cloud-portability-20260525.json
155156
!workspaces/black-box/artifacts/beans-lora-member-denoising-loss-scout-20260513.json
156157
!workspaces/black-box/artifacts/clid-image-identity-boundary-20260511.json
157158
!workspaces/black-box/artifacts/midfreq-same-noise-residual-cache-audit-20260512.json

ROADMAP.md

Lines changed: 15 additions & 2 deletions
@@ -43,13 +43,26 @@ same-sample all-train/all-test diagnostic; the decision gate is
4343
`mean_auc = 0.959755`, `min_tpr_at_1pct_fpr = 0.375000`,
4444
`min_tpr_at_0_1pct_fpr = 0.058594`
4545

46+
The img2img portability review completed the same day only reads the existing SD/CelebA img2img response
47+
caches; it generates no new responses, downloads no models, and releases no GPU work. The result does not extend H2: on the admission
48+
`25 / 25` cache, the output-cloud logistic reaches only `AUC = 0.7888`,
49+
`TPR@1%FPR = 0.0`, and `TPR@0.1%FPR = 0.0`, and trails the best simple-distance comparator by
50+
`AUC -0.0880`; the stability `10 / 10` cache reaches `AUC = 0.9600`
51+
but is still below the simple-distance `AUC = 0.9900`. The decision gate is
52+
`img2img_output_cloud_weak_or_unstable`, so this review only confines output-cloud
53+
geometry to an H2 response-strength Research-side diagnostic and does not open an
54+
img2img Runtime runner, a Platform row, strength/seed/repeat/feature sweeps, or
55+
input-distance fusion.
56+
4657
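This result can only serve as a strong Research-side candidate; the next step is neither a same-cache sweep nor running the
4758
full `512 / 512` shared-position just to fill in the table. Reopen only on the basis of formal mechanism promotion, a second public asset, or an independent consumption contract.
4859
The current slots remain:
4960
`active_gpu_question = none`, `next_gpu_candidate = none`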
50-
`CPU sidecar = none selected after H2 output-cloud cross-cache transfer review`
61+
`CPU sidecar = none selected after H2 img2img output-cloud portability review`
5162
See
52-
[docs/evidence/h2-output-cloud-geometry-20260525.md](docs/evidence/h2-output-cloud-geometry-20260525.md)
63+
[docs/evidence/h2-output-cloud-geometry-20260525.md](docs/evidence/h2-output-cloud-geometry-20260525.md)
64+
and
65+
[docs/evidence/h2-img2img-output-cloud-portability-20260525.md](docs/evidence/h2-img2img-output-cloud-portability-20260525.md)
5366

5467
## 2026-05-25 Feature-Packet Channel Consumer Verdict
5568

docs/evidence/h2-img2img-output-cloud-portability-20260525.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
# H2 Img2img Output-Cloud Portability Review
2+
3+
> Date: 2026-05-25
4+
> Status: weak or unstable / not distinct from simple distance / no admitted row / no Runtime runner
5+
6+
## Question
7+
8+
H2 output-cloud geometry is strong on the DDPM/CIFAR10 response-strength
9+
cache. This review asks a narrower portability question: does the same
10+
output-output geometry carry useful signal on the existing SD/CelebA
11+
image-to-image response caches, without using the known stronger
12+
input-to-output simple distance?
13+
14+
This is a CPU-only existing-cache review. It does not generate new responses,
15+
download models, or release GPU work.
16+
17+
## Contract
18+
19+
Script:
20+
`scripts/review_h2_img2img_output_cloud_portability.py`
21+
22+
Output:
23+
`workspaces/black-box/artifacts/h2-img2img-output-cloud-portability-20260525.json`
24+
25+
Inputs:
26+
27+
| Packet | Cache | Samples | Members | Nonmembers | Strength | Repeats |
28+
| --- | --- | ---: | ---: | ---: | ---: | ---: |
29+
| Admission | `workspaces/black-box/runs/h2-img2img-simple-distance-admission-20260501-r1/response-cache.npz` | `50` | `25` | `25` | `0.75` | `2` |
30+
| Stability | `workspaces/black-box/runs/h2-img2img-simple-distance-stability-20260501-r1/response-cache.npz` | `20` | `10` | `10` | `0.75` | `2` |
31+
32+
Features use only output-output geometry:
33+
34+
- within-strength repeat-pair RMSE
35+
- response-cloud PCA trace
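
As a rough illustration of the two retained features, the sketch below computes a repeat-pair RMSE and a PCA trace for one sample's response cloud. It is not taken from `scripts/review_h2_img2img_output_cloud_portability.py`: the function names, the flattened-list input, the averaging over repeat pairs, and the `n - 1` covariance normalization are all assumptions made for this example.

```python
import math
from itertools import combinations


def repeat_pair_rmse(repeats):
    """Mean RMSE over all unordered pairs of repeat outputs.

    `repeats` is a list of equally long flat vectors, one per repeat
    (hypothetical layout; the real cache format is not shown in the source).
    """
    pair_rmses = []
    for a, b in combinations(repeats, 2):
        mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        pair_rmses.append(math.sqrt(mse))
    return sum(pair_rmses) / len(pair_rmses)


def pca_trace(repeats):
    """Trace of the sample covariance of the cloud.

    The trace equals the sum of all PCA eigenvalues, so it can be computed
    as the sum of per-dimension variances without an eigendecomposition.
    """
    n = len(repeats)
    dim = len(repeats[0])
    total = 0.0
    for d in range(dim):
        column = [r[d] for r in repeats]
        mean = sum(column) / n
        total += sum((v - mean) ** 2 for v in column) / (n - 1)  # assumed ddof = 1
    return total


# Toy cloud with two repeats of a two-dimensional output.
cloud = [[0.0, 0.0], [2.0, 0.0]]
rmse_value = repeat_pair_rmse(cloud)
trace_value = pca_trace(cloud)
```

Under these assumptions, a cloud with exactly two repeats of dimension `D` gives `trace = D * RMSE^2 / 2`, so with `Repeats = 2` the two features carry closely related information.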
36+
37+
The raw feature builder also considers duplicate mean/slope/std and PCA
38+
top-share views, but this single-strength packet makes those columns duplicate
39+
or constant. The review script prunes degenerate columns and records the
40+
dropped feature names in the JSON artifact.
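
The pruning step described above could look roughly like the following sketch. The column names, the tolerance, and the rule "drop a column if it is constant or repeats an earlier kept column" are assumptions for illustration; the review script's actual criteria are not shown in this commit.

```python
def prune_degenerate_columns(names, rows, tol=1e-12):
    """Drop constant columns and columns that duplicate an earlier kept one.

    Returns (kept_names, dropped_names, pruned_rows) so the dropped names
    can be recorded alongside the results.
    """
    kept_names, dropped_names, kept_columns = [], [], []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        is_constant = max(column) - min(column) <= tol
        is_duplicate = any(
            all(abs(a - b) <= tol for a, b in zip(column, previous))
            for previous in kept_columns
        )
        if is_constant or is_duplicate:
            dropped_names.append(name)
        else:
            kept_names.append(name)
            kept_columns.append(column)
    # Transpose the kept columns back into per-sample rows.
    pruned_rows = [list(values) for values in zip(*kept_columns)]
    return kept_names, dropped_names, pruned_rows


# Hypothetical feature table: with one strength, the mean view repeats the
# raw RMSE and the slope view is constant.
names = ["pair_rmse", "pair_rmse_mean", "pair_rmse_slope", "pca_trace"]
rows = [
    [0.1, 0.1, 0.0, 0.5],
    [0.3, 0.3, 0.0, 0.2],
    [0.2, 0.2, 0.0, 0.9],
]
kept, dropped, pruned = prune_degenerate_columns(names, rows)
```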
41+
42+
The review intentionally excludes input-to-output distance so it cannot
43+
silently become the already-known img2img simple-distance scorer.
44+
45+
## Result
46+
47+
| Packet | Output-cloud logistic AUC | TPR@1%FPR | TPR@0.1%FPR | Best simple-distance AUC | Delta vs simple distance |
48+
| --- | ---: | ---: | ---: | ---: | ---: |
49+
| Admission `25 / 25` | `0.7888` | `0.0` | `0.0` | `0.8768` | `-0.0880` |
50+
| Stability `10 / 10` | `0.9600` | `0.8` | `0.8` | `0.9900` | `-0.0300` |
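
For readers less familiar with the metrics in this table, here is a minimal sketch of a rank-based AUC and a threshold-based `TPR@FPR`. The function names and toy scores are hypothetical, and the review script may compute these differently (for example with ROC interpolation).

```python
import math


def auc(member_scores, nonmember_scores):
    """Probability that a member outscores a nonmember; ties count half."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def tpr_at_fpr(member_scores, nonmember_scores, max_fpr):
    """Best TPR among thresholds whose FPR does not exceed `max_fpr`."""
    allowed_fp = math.floor(max_fpr * len(nonmember_scores))
    if allowed_fp >= len(nonmember_scores):
        return 1.0
    # Flag a sample only if it scores strictly above this nonmember score,
    # so at most `allowed_fp` nonmembers are flagged.
    cutoff = sorted(nonmember_scores, reverse=True)[allowed_fp]
    hits = sum(1 for s in member_scores if s > cutoff)
    return hits / len(member_scores)


# Toy scores, not taken from the caches.
members = [0.9, 0.8, 0.7, 0.6, 0.2]
nonmembers = [0.65, 0.5, 0.4, 0.3, 0.1]
toy_auc = auc(members, nonmembers)
toy_tpr_1pct = tpr_at_fpr(members, nonmembers, 0.01)
toy_tpr_0_1pct = tpr_at_fpr(members, nonmembers, 0.001)
```

Under this non-interpolated definition, a packet with only `25` or `10` nonmembers allows zero false positives at both the 1% and 0.1% budgets, which is consistent with the two tail columns being equal within each row.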
51+
52+
Decision gate:
53+
54+
| Field | Value |
55+
| --- | ---: |
56+
| `min_auc` | `0.7888` |
57+
| `min_tpr_at_0_1pct_fpr` | `0.0` |
58+
| `max_auc_delta_vs_best_simple_distance` | `-0.0300` |
59+
| `verdict` | `img2img_output_cloud_weak_or_unstable` |
60+
61+
The admission packet is the blocking result: output-cloud AUC stays below
62+
`0.8`, strict-tail recovery is zero, and it is materially weaker than the
63+
existing simple-distance comparator.
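
The gate fields above can be reproduced from the result table with a small aggregation, sketched below. The blocking rule (AUC floor of `0.8`, zero strict-tail recovery, or a negative best delta) is an assumption inferred from the prose, and `hypothetical_not_blocked` is a placeholder label that does not appear in the source.

```python
def decision_gate(packets, auc_floor=0.8):
    """Aggregate per-packet metrics into worst-case gate fields."""
    min_auc = min(p["auc"] for p in packets)
    min_tail = min(p["tpr_at_0_1pct_fpr"] for p in packets)
    # Best case for output-cloud relative to the simple-distance comparator.
    max_delta = max(
        round(p["auc"] - p["best_simple_distance_auc"], 4) for p in packets
    )
    # Assumed rule: any one of these conditions blocks promotion.
    blocked = min_auc < auc_floor or min_tail <= 0.0 or max_delta < 0.0
    return {
        "min_auc": min_auc,
        "min_tpr_at_0_1pct_fpr": min_tail,
        "max_auc_delta_vs_best_simple_distance": max_delta,
        "verdict": (
            "img2img_output_cloud_weak_or_unstable"
            if blocked
            else "hypothetical_not_blocked"
        ),
    }


# Values copied from the result table in this document.
packets = [
    {"name": "admission", "auc": 0.7888, "tpr_at_0_1pct_fpr": 0.0,
     "best_simple_distance_auc": 0.8768},
    {"name": "stability", "auc": 0.9600, "tpr_at_0_1pct_fpr": 0.8,
     "best_simple_distance_auc": 0.9900},
]
gate = decision_gate(packets)
```

Taking the minimum AUC and tail TPR but the maximum delta means the gate reports the weakest packet on absolute strength and the most favorable packet on the comparison, so a negative `max_auc_delta_vs_best_simple_distance` indicates that output-cloud beats simple distance on no packet.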
64+
65+
## Decision
66+
67+
`weak or unstable / not distinct from simple distance / no admitted row`.
68+
69+
This narrows, rather than expands, H2 output-cloud geometry:
70+
71+
- It remains a strong Research-side candidate on the DDPM/CIFAR10
72+
response-strength cache.
73+
- It does not port cleanly to the existing SD/CelebA img2img caches.
74+
- It does not justify a Runtime runner, Platform schema, admitted bundle row,
75+
image-to-image product claim, or same-contract sweep.
76+
77+
Do not expand this into strength, seed, repeat-count, feature-family,
78+
input-distance fusion, or GPU response-generation matrices. Reopen only if a
79+
second public asset, independent consumption contract, or formal mechanism
80+
promotion changes the decision value.
81+
82+
## Platform and Runtime Impact
83+
84+
Expose only a watch-only boundary metadata row. The admitted
85+
Platform/Runtime bundle remains the existing five rows: `recon`,
86+
`PIA baseline`, `PIA defended`, `GSA`, and `DPDM W-1`.

docs/evidence/reproduction-status.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims.
3232
| Track | Status | Notes |
3333
| --- | --- | --- |
3434
| Black-box `recon` | `evidence-ready` | Strongest black-box method and admitted non-CLiD product row. Public data limits strict paper-aligned claims. The bounded public-100 step30 rerun plus unified artifact summary yields the promoted coherent packet: `AUC = 0.837`, `ASR = 0.74`, `TPR@1%FPR = 0.22`, `TPR@0.1%FPR = 0.11`. See [non-clid-black-box-reselection.md](non-clid-black-box-reselection.md), [recon-product-validation-contract.md](recon-product-validation-contract.md), [recon-product-validation-result.md](recon-product-validation-result.md), and [../product-bridge/recon-product-validation-handoff.md](../product-bridge/recon-product-validation-handoff.md). |
35-
| Black-box `H2 output-cloud geometry` | `hold-candidate` | Strong Research-side output-output geometry signal on H2 response caches, but not an admitted Platform/Runtime row. Existing `512 / 512` cache review gives `AUC = 0.961529`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`; seed `177` is stable and label shuffle is random-level. The `256 / 256` shared-position order-control scout preserves the signal (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`, `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`), so class-ordered seed offset is not a sufficient explanation. The same controlled boundary at seed `177` remains strong (`AUC = 0.956192`, `TPR@1%FPR = 0.285156`, `TPR@0.1%FPR = 0.109375`) with random-level label shuffle (`AUC = 0.484070`), so the controlled signal is not single-seed. A CPU-only fold-disjoint transfer review across the two shared-position caches is also strong: seed `176` -> `177` gives `AUC = 0.948990`, `TPR@1%FPR = 0.375000`, `TPR@0.1%FPR = 0.058594`, and seed `177` -> `176` gives `AUC = 0.970520`, `TPR@1%FPR = 0.390625`, `TPR@0.1%FPR = 0.074219`; same-sample all-train/all-test transfer is diagnostic only. Do not promote, add schema/runner/UI/bundle rows, run same-cache feature sweeps, or schedule a full `512 / 512` rerun by default. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md). |
35+
| Black-box `H2 output-cloud geometry` | `hold-candidate` | Strong Research-side output-output geometry signal on H2 response caches, but not an admitted Platform/Runtime row. Existing `512 / 512` cache review gives `AUC = 0.961529`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`; seed `177` is stable and label shuffle is random-level. The `256 / 256` shared-position order-control scout preserves the signal (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`, `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`), so class-ordered seed offset is not a sufficient explanation. The same controlled boundary at seed `177` remains strong (`AUC = 0.956192`, `TPR@1%FPR = 0.285156`, `TPR@0.1%FPR = 0.109375`) with random-level label shuffle (`AUC = 0.484070`), so the controlled signal is not single-seed. A CPU-only fold-disjoint transfer review across the two shared-position caches is also strong: seed `176` -> `177` gives `AUC = 0.948990`, `TPR@1%FPR = 0.375000`, `TPR@0.1%FPR = 0.058594`, and seed `177` -> `176` gives `AUC = 0.970520`, `TPR@1%FPR = 0.390625`, `TPR@0.1%FPR = 0.074219`; same-sample all-train/all-test transfer is diagnostic only. The SD/CelebA img2img portability check is weak or unstable on the admission cache (`AUC = 0.7888`, zero strict-tail recovery) and not distinct from simple distance (`AUC -0.0880` on admission, `-0.0300` on stability), so it narrows output-cloud geometry to a Research-side H2 response-strength diagnostic. Do not promote, add schema/runner/UI/bundle rows, run same-cache or img2img feature sweeps, or schedule a full `512 / 512` rerun by default. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md) and [h2-img2img-output-cloud-portability-20260525.md](h2-img2img-output-cloud-portability-20260525.md). |
3636
| Black-box `CLiD` | `hold-candidate` | Selected as a bounded black-box lane after H2 SD/CelebA text-to-image transfer was protocol-blocked. The official CPU `inter_output/*` replay is strong (`AUC = 0.961277`, `TPR@1%FPR = 0.675470`, `ASR = 0.891957`) and now has a machine-readable candidate-only card, but row identity remains blocked because the public score rows are numeric-only and the 2026-05-15 authenticated HF `mia_COCO.zip` `HEAD`/`Range` recheck still returned `403`. Earlier local prompt-conditioned packets were strong and repeat-stable, but prompt-neutral perturbation collapses the signal, swapped-prompt control is degraded, within-split prompt shuffle is weak and seed-sensitive, prompt-text-only review is moderate AUC but weak strict-tail, and control attribution shows auxiliary-feature instability under prompt controls. Current evidence supports a prompt-conditioned diagnostic claim only, not admitted general black-box evidence. No next CLiD GPU task is selected. See [../product-bridge/clid-candidate-evidence-card.md](../product-bridge/clid-candidate-evidence-card.md), [clid-official-inter-output-replay-20260515.md](clid-official-inter-output-replay-20260515.md), [clid-identity-manifest-gate-20260515.md](clid-identity-manifest-gate-20260515.md), [black-box-next-lane-selection.md](black-box-next-lane-selection.md), [clid-bridge-contract.md](clid-bridge-contract.md), [clid-score-schema-gate.md](clid-score-schema-gate.md), [clid-tiny-score-bridge.md](clid-tiny-score-bridge.md), [clid-100-score-packet.md](clid-100-score-packet.md), [clid-candidate-integrity-review.md](clid-candidate-integrity-review.md), [clid-repeat-stability.md](clid-repeat-stability.md), [clid-prompt-perturbation.md](clid-prompt-perturbation.md), [clid-prompt-conditioning-boundary.md](clid-prompt-conditioning-boundary.md), [clid-swapped-prompt-control.md](clid-swapped-prompt-control.md), [clid-within-split-shuffle-control.md](clid-within-split-shuffle-control.md), 
[clid-prompt-text-only-review.md](clid-prompt-text-only-review.md), and [clid-control-attribution.md](clid-control-attribution.md). |
3737
| Black-box `variation` | `code-ready` | API-only support method; needs real query data for stronger claims. |
3838
| Feature-packet consumer lane | `deferred-candidate` | 2026-05-25 consumer verdict keeps the gray-box feature-packet lane out of Platform/Runtime. Tracing the Roots remains positive Research evidence (`AUC = 0.815826`, `TPR@1%FPR = 0.134000`), but live narrow public-surface recheck found no second non-source-equivalent public feature-packet and no raw checkpoint/sample/regeneration assets. Do not add feature-packet schema, bundle export, validators, tests, Platform UI type, Runtime runner, GPU task, or download from this singleton. See [feature-packet-channel-consumer-verdict-20260525.md](feature-packet-channel-consumer-verdict-20260525.md) and [../product-bridge/feature-packet-lane.md](../product-bridge/feature-packet-lane.md). |

docs/evidence/workspace-evidence-index.md

Lines changed: 10 additions & 0 deletions
@@ -29,6 +29,16 @@ geometry candidate, not a second public asset or Platform/Runtime contract.
2929
Decision: `candidate complementary signal / order-control scout passed / seed-stable /
3030
cross-cache transfer strong / no admitted row / no download / no 512/512 rerun selected`.
3131

32+
Latest portability check:
33+
[h2-img2img-output-cloud-portability-20260525.md](h2-img2img-output-cloud-portability-20260525.md)
34+
records a CPU-only existing-cache review on the SD/CelebA img2img packets.
35+
The admission `25 / 25` cache is weak or unstable for output-cloud geometry
36+
(`AUC = 0.7888`, `TPR@1%FPR = 0.0`, `TPR@0.1%FPR = 0.0`) and underperforms the
37+
existing simple-distance comparator (`AUC -0.0880`). The stability `10 / 10`
38+
cache is positive (`AUC = 0.9600`) but still not distinct from simple distance
39+
(`AUC -0.0300`). Decision: `img2img output-cloud weak-or-unstable /
40+
not distinct from simple distance / no Runtime runner / no Platform row`.
41+
3242
Previous Research update:
3343
[feature-packet-channel-consumer-verdict-20260525.md](feature-packet-channel-consumer-verdict-20260525.md)
3444
records a consumer-boundary verdict for the gray-box feature-packet lane.
