docs: sync h2 output-cloud consumer boundary #314
Code Review
This pull request updates the documentation for the 'H2 output-cloud geometry' research candidate, providing detailed performance metrics and clarifying its status as a research-side component rather than a platform feature. The review feedback identifies a duplication of this entry in the reproduction status table with inconsistent status labels and suggests consolidating these entries for a single source of truth. Additionally, there is a recommendation to align the list of technical exclusions in the product bridge documentation with the evidence documentation to ensure consistency.
| | True known-split mechanisms | `hold-weak` | MNIST/DDPM raw-loss and x0 scouts are weak. Tiny overfit final-layer gradient norm was positive only on the extreme `8 / 64` target, weakened at `16 / 64`, and a more optimistic `64 / 64` oracle gradient-prototype alignment follow-up is effectively random (`AUC = 0.500977`, `ASR = 0.562500`, zero low-FPR recovery). Fashion-MNIST DDPM now has three weak clean-split scouts: fixed-timestep PIA-style loss (`AUC = 0.535889`, `TPR@1%FPR = 0.03125`), SimA single-query score-norm (`AUC = 0.515137`, zero low-FPR recovery), and score-Jacobian sensitivity (`AUC = 0.511719`, zero low-FPR recovery). The Beans member-LoRA denoising-loss scout repaired pseudo-membership semantics by creating an exact `SD1.5 + Beans-member LoRA` target, but the internal conditional denoising-loss score is weak (`AUC = 0.414400`, reverse `0.585600`, `TPR@1%FPR = 0.080000`) and parameter-delta sensitivity is also weak (`AUC = 0.512000`). Do not run more final-layer gradient norm/cosine variants, Fashion-MNIST timestep/seed/`p`-norm/perturbation/norm/packet-size sweeps, or Beans LoRA train-step/rank/resolution/prompt/timestep/layer matrices by default. See [fashion-mnist-ddpm-score-jacobian-sensitivity-20260514.md](fashion-mnist-ddpm-score-jacobian-sensitivity-20260514.md), [fashion-mnist-ddpm-sima-score-norm-20260514.md](fashion-mnist-ddpm-sima-score-norm-20260514.md), [beans-lora-delta-sensitivity-20260513.md](beans-lora-delta-sensitivity-20260513.md), [beans-lora-member-denoising-loss-scout-20260513.md](beans-lora-member-denoising-loss-scout-20260513.md), [fashion-mnist-ddpm-pia-loss-scout-20260513.md](fashion-mnist-ddpm-pia-loss-scout-20260513.md), [tiny-known-split-gradient-prototype-alignment-20260513.md](tiny-known-split-gradient-prototype-alignment-20260513.md), [gradient-norm-stability-gate-20260512.md](gradient-norm-stability-gate-20260512.md), and [tiny-overfit-gradient-norm-scout-20260512.md](tiny-overfit-gradient-norm-scout-20260512.md). 
| | Black-box `H2 response-strength` | candidate-only | Positive-but-bounded DDPM/CIFAR10 candidate: frozen cutoff-0.50 lowpass follow-up passed, and raw H2 recovered strict-tail signal on the fresh packet. SD/CelebA text-to-image transfer is blocked by protocol mismatch. The frozen SD/CelebA image-to-image micro-packet is runnable, but H2 logistic does not beat the same-cache simple distance comparator, so H2 is not promoted beyond candidate-only. A separate simple-distance line now has bounded single-asset evidence: first 10/10 packet `AUC = 0.92`, non-overlapping 10/10 packet `AUC = 0.99` with 9/10 TP at 0 FP, and non-overlapping 25/25 admission packet `AUC = 0.8768`, `ASR = 0.84`, 11/25 TP at 0 FP. This is not a conditional-diffusion generalization or a `recon` product replacement. See [black-box-response-strength-preflight.md](black-box-response-strength-preflight.md), [h2-lowpass-followup-contract.md](h2-lowpass-followup-contract.md), [h2-cross-asset-contract-preflight.md](h2-cross-asset-contract-preflight.md), [h2-image-to-image-contract.md](h2-image-to-image-contract.md), [h2-img2img-micro-result.md](h2-img2img-micro-result.md), [h2-img2img-simple-distance-review.md](h2-img2img-simple-distance-review.md), [h2-img2img-simple-distance-stability-result.md](h2-img2img-simple-distance-stability-result.md), and [h2-img2img-simple-distance-admission-result.md](h2-img2img-simple-distance-admission-result.md). | | ||
| | Black-box `H2 output-cloud geometry` | candidate-only | CPU-only review on the existing H2 response cache found a strong output-output geometry signal that excludes seed-to-output distance (`AUC = 0.961529`, `ASR = 0.900391`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`). Seed `177` remains stable (`AUC = 0.961048`), and label-shuffle sanity returns random-level (`AUC = 0.507595`). This is not admitted because the source cache used class-ordered sample offsets, so a reseeded or interleaved order-control cache is required before promotion. Do not expand into KDE, shadow-density, repeat-count, same-cache feature sweeps, Platform schema, Runtime runner, or bundle rows. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md). | | ||
| | Black-box `H2 output-cloud geometry` | candidate-only | Output-output geometry is now a strong H2 Research-side candidate after seed-offset control, not a Platform/Runtime row. The existing `512 / 512` cache review gives `AUC = 0.961529`, `ASR = 0.900391`, `TPR@1%FPR = 0.333984`, `TPR@0.1%FPR = 0.117188`; seed `177` remains stable (`AUC = 0.961048`), and label-shuffle sanity returns random-level (`AUC = 0.507595`). The bounded `256 / 256` shared-position order-control scout preserves the signal (`AUC = 0.967819`, `TPR@1%FPR = 0.410156`, `TPR@0.1%FPR = 0.132812`) with random-level label shuffle (`AUC = 0.464066`), and the same controlled boundary at seed `177` remains strong (`AUC = 0.956192`, `TPR@1%FPR = 0.285156`, `TPR@0.1%FPR = 0.109375`) with random-level label shuffle (`AUC = 0.484070`). Class-ordered seed offset is no longer a sufficient explanation, but this remains a single H2 response-cache geometry candidate, not a second public asset or consumer contract. Do not expand into KDE, shadow-density, repeat-count, same-cache feature sweeps, full `512 / 512` reruns, Platform schema, Runtime runner, or bundle rows. See [h2-output-cloud-geometry-20260525.md](h2-output-cloud-geometry-20260525.md). | |
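For readers unfamiliar with the metrics quoted in these rows, the following is a minimal sketch of how `AUC`, `TPR@1%FPR`, and a label-shuffle sanity check are commonly computed from per-sample scores. It is not the repository's scoring code: the function names, the synthetic Gaussian scores, and the sample sizes are all assumptions made for illustration.

```python
import random


def auc(scores, labels):
    """Probability that a random member outscores a random non-member (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(scores, labels, max_fpr):
    """Highest TPR over thresholds whose FPR stays at or below max_fpr."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        # A sample is predicted "member" when its score is >= t.
        fpr = sum(n >= t for n in neg) / len(neg)
        if fpr <= max_fpr:
            best = max(best, sum(p >= t for p in pos) / len(pos))
    return best


# Synthetic scores (assumption): members score higher on average.
rng = random.Random(0)
member_scores = [rng.gauss(1.5, 1.0) for _ in range(200)]
nonmember_scores = [rng.gauss(0.0, 1.0) for _ in range(200)]
scores = member_scores + nonmember_scores
labels = [1] * 200 + [0] * 200

real_auc = auc(scores, labels)
real_tpr = tpr_at_fpr(scores, labels, 0.01)

# Label-shuffle sanity: with labels permuted, AUC should fall back to about 0.5.
shuffled = labels[:]
rng.shuffle(shuffled)
shuffled_auc = auc(scores, shuffled)

print(f"AUC={real_auc:.3f} TPR@1%FPR={real_tpr:.3f} shuffled AUC={shuffled_auc:.3f}")
```

A shuffled-label `AUC` near 0.5 only shows that the scoring pipeline does not leak labels; it does not rule out confounds in how the cache was built, which is why the rows above also rely on an order-control scout.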
The track Black-box H2 output-cloud geometry is duplicated in this table (see line 35). Additionally, the status label used here (candidate-only) is inconsistent with line 35 (hold-candidate), and neither appears in the Status Stages definitions (lines 8-26). Please consolidate these entries to ensure a single source of truth and use a status label that aligns with the defined stages (e.g., deferred-candidate).
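A check of this kind could be automated. The sketch below is one possible approach, not tooling that exists in the repository: it scans a pipe-delimited Markdown table for duplicated track names and for status labels outside an allowed set. The `ALLOWED_STAGES` values and the three-column sample table are hypothetical; the real stage names would come from the Status Stages section of the document.

```python
from collections import Counter

# Hypothetical stage list; replace with the labels defined under Status Stages.
ALLOWED_STAGES = {"hold-weak", "deferred-candidate"}


def parse_rows(markdown):
    """Yield (track, status) for each body row that follows the header separator."""
    seen_separator = False
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if all(c and set(c) <= set("-:") for c in cells):
            seen_separator = True  # the "| --- | --- |" row ends the header
            continue
        if seen_separator and len(cells) >= 3:
            yield cells[0], cells[1].strip("`")


def audit(markdown, allowed=ALLOWED_STAGES):
    """Return (duplicated track names, status labels not in the allowed set)."""
    rows = list(parse_rows(markdown))
    counts = Counter(track for track, _ in rows)
    duplicates = sorted(t for t, n in counts.items() if n > 1)
    unknown = sorted({s for _, s in rows if s not in allowed})
    return duplicates, unknown


SAMPLE = """
| Track | Status | Evidence |
| --- | --- | --- |
| True known-split mechanisms | `hold-weak` | ... |
| Black-box `H2 output-cloud geometry` | hold-candidate | ... |
| Black-box `H2 output-cloud geometry` | candidate-only | ... |
"""

duplicates, unknown = audit(SAMPLE)
print("duplicated tracks:", duplicates)
print("undefined status labels:", unknown)
```

Run against the real table, a non-empty result in either list would flag the same two problems raised in this comment.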
| The current product-bridge decision is recorded in | ||
| [h2-simple-distance-product-bridge-comparison.md](h2-simple-distance-product-bridge-comparison.md). | ||
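| Current H2 output-cloud geometry status: this is a strong Research-side mechanism candidate, not a Platform/Runtime row. The existing H2 `512 / 512` response-cache review, the `256 / 256` shared-position order-control scout, and the seed `177` stability scout all retain a strong signal, and label shuffle returns to random level; however, the result is still only an H2 response-cache geometry candidate, not a second public asset, a consumer contract, or product-admission evidence. Do not add Platform schema, Runtime runner, UI types, bundle rows, same-cache feature sweeps, or a full `512 / 512` rerun. See |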
For consistency with the detailed evidence documentation in docs/evidence/reproduction-status.md, consider including the specific technical exclusions (KDE, shadow-density, repeat-count) in the list of restricted expansions.
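| Current H2 output-cloud geometry status: this is a strong Research-side mechanism candidate, not a Platform/Runtime row. The existing H2 `512 / 512` response-cache review, the `256 / 256` shared-position order-control scout, and the seed `177` stability scout all retain a strong signal, and label shuffle returns to random level; however, the result is still only an H2 response-cache geometry candidate, not a second public asset, a consumer contract, or product-admission evidence. Do not add Platform schema, Runtime runner, UI types, bundle rows, same-cache feature sweeps, or a full `512 / 512` rerun. See | |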
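| Current H2 output-cloud geometry status: this is a strong Research-side mechanism candidate, not a Platform/Runtime row. The existing H2 `512 / 512` response-cache review, the `256 / 256` shared-position order-control scout, and the seed `177` stability scout all retain a strong signal, and label shuffle returns to random level; however, the result is still only an H2 response-cache geometry candidate, not a second public asset, a consumer contract, or product-admission evidence. Do not expand into KDE, shadow-density, repeat-count, same-cache feature sweeps, or a full `512 / 512` rerun, and do not add Platform schema, Runtime runner, UI types, or bundle rows. See |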
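To keep the two documents from drifting apart again, the exclusion list could be checked mechanically. The following is a minimal sketch under stated assumptions: `REQUIRED_TERMS` holds only the three terms named in this comment, and the `before` / `after` strings are shortened English paraphrases of the old and suggested sentences rather than the files' actual contents.

```python
# Terms the review asks to see in both documents (taken from the comment above).
REQUIRED_TERMS = ("KDE", "shadow-density", "repeat-count")


def missing_terms(text, required=REQUIRED_TERMS):
    """Return the required terms that do not occur in text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in required if term.lower() not in lowered]


# Paraphrased exclusion sentences (illustrative, not read from the repository).
before = (
    "Do not add Platform schema, Runtime runner, UI types, bundle rows, "
    "same-cache feature sweeps, or a full 512 / 512 rerun."
)
after = (
    "Do not expand into KDE, shadow-density, repeat-count, same-cache feature "
    "sweeps, or a full 512 / 512 rerun, and do not add Platform schema, "
    "Runtime runner, UI types, or bundle rows."
)

print("missing before:", missing_terms(before))
print("missing after:", missing_terms(after))
```

Substring matching is deliberately crude; it works here because the technical terms are written identically in the English and Chinese documents.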