[F15] Bundle schema: `physical_realism_check` block — DESIGN must justify K vs (model, GPU)

## Problem

A campaign can pick parameter values (in particular, KV cache capacity K) that **demonstrate the desired effect without corresponding to any realistic (model, GPU, workload) tuple**. The math works, the apparatus passes, and the result is real for the synthetic regime, but it says nothing about real-hardware behavior, and a reviewer can fairly object: "you constructed your own contention".

In paper-memorytime-mirage, the campaign locked workload parameters (concurrency=32, P_A=1024, P_B mixture, D=1) and chose K (~500–1000 blocks = 8–16K tokens) to make the mirage manifest. The realistic K for llama-3.1-8b on H100 is ~24,576 blocks ≈ 393K tokens (derived from 48 GiB available for KV at gpu_memory_utilization=0.9, divided by 128 KiB/token). At realistic K, the chosen workload uses ~0.6% of cache — fully un-contended — and the mirage cannot manifest at all.
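The realistic-K figure above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not nous code: the function names are hypothetical, the 48 GiB KV budget is taken from this issue rather than measured, and the block size of 16 tokens is an assumption inferred from the 24,576 blocks ≈ 393K tokens figures. The architecture constants (32 layers, 8 KV heads, head dimension 128, 2-byte dtype) are the published Llama-3.1-8B configuration.

```python
GIB = 1024 ** 3


def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       dtype_bytes: int = 2) -> int:
    """KV cache bytes stored per token; the factor 2 covers keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def realistic_k_blocks(kv_budget_bytes: int, bytes_per_token: int,
                       block_size: int = 16) -> int:
    """Whole KV blocks that fit in the budget (block_size=16 is an assumption)."""
    return kv_budget_bytes // (bytes_per_token * block_size)


# Llama-3.1-8B: 32 layers, 8 KV heads (GQA), head_dim 128, fp16/bf16.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
kv_budget = 48 * GIB  # figure quoted in this issue, not measured here

tokens = kv_budget // per_token
blocks = realistic_k_blocks(kv_budget, per_token)

print(per_token // 1024, "KiB/token")  # 128 KiB/token
print(tokens, "tokens")                # 393216 tokens
print(blocks, "blocks")                # 24576 blocks
```

A derivation like this is what `derived_k_realistic` is meant to record, so a reviewer can see where the number came from.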

The agent picked K based on **bucket-engagement math** (K such that ω·K is below typical occupancy), which is mathematically correct for showing the mechanism, but it answers a different question from "does this matter on real hardware?".

## Desired behavior

A new optional schema block on bundle.yaml:

```yaml
physical_realism_check:
  model: meta-llama/llama-3.1-8b-instruct
  gpu: H100-80GB
  gpu_memory_utilization: 0.9
  derived_k_realistic: 24576
  k_used_in_experiment: 1000
  k_realism_ratio: 0.041   # k_used / k_realistic
  justification: |
    K is set 24x smaller than physical to demonstrate mechanism isolation
    in the contested-cache regime. Production scenarios where K contention
    actually happens: long-context RAG, smaller GPUs, larger models, multi-LoRA.
    A separate campaign at realistic K with a contention-inducing workload
    (RAG-scale P̄=8K, concurrency=64) is needed to test under production conditions.
```

The DESIGN agent must populate this block whenever its `verified_parameters` includes any K-class quantity (KV blocks, max_model_len, batched-tokens budget). If `k_realism_ratio` falls below a configurable threshold (default 0.5), nous emits a soft warning. For the example block above, it would read: "your K is 24× smaller than physical realism. State your reasoning in `justification` or raise K."
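The ratio and the warning text follow directly from the two K values. The sketch below shows one way to compute them; the function names are hypothetical, and the 0.5 default simply mirrors the threshold mentioned above.

```python
from typing import Optional


def k_realism_ratio(k_used: int, k_realistic: int) -> float:
    """k_used / k_realistic, rounded to three decimals as in the YAML example."""
    if k_realistic <= 0:
        raise ValueError("k_realistic must be positive")
    return round(k_used / k_realistic, 3)


def realism_warning(k_used: int, k_realistic: int,
                    threshold: float = 0.5) -> Optional[str]:
    """Return a soft-warning message, or None when K is realistic enough."""
    if k_used / k_realistic >= threshold:
        return None
    shrink = k_realistic // k_used  # whole-number factor, rounded down
    return (f"your K is {shrink}x smaller than physical realism. "
            "State your reasoning in `justification` or raise K.")


print(k_realism_ratio(1000, 24576))  # 0.041
print(realism_warning(1000, 24576))
print(realism_warning(20000, 24576))  # None
```

Rounding the shrink factor down reproduces the "24×" wording used in this issue; a real implementation might prefer to print one decimal place.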

## Suggested implementation sketch

1. Add `physical_realism_check` to bundle.yaml schema, optional but expected when K-class parameters are set.
2. Add a methodology-prompt section for DESIGN: "When you set K-class parameters, populate `physical_realism_check` with derivation and justification."
3. The validator emits a soft warning if `k_realism_ratio < threshold` and `justification` is empty/perfunctory.
4. Document the block in the `nous schema bundle` rendering.
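Steps 1 and 3 above could be sketched as a single validator pass over a parsed bundle. Everything here is an assumption made for illustration: the `K_CLASS_KEYS` names, the shape of `verified_parameters` (a mapping or list of parameter names), and the word-count heuristic for a "perfunctory" justification are not taken from the nous codebase.

```python
from typing import Any, Dict, List

# Hypothetical parameter names that count as "K-class".
K_CLASS_KEYS = {"kv_blocks", "max_model_len", "max_num_batched_tokens"}


def check_physical_realism(bundle: Dict[str, Any], threshold: float = 0.5,
                           min_words: int = 10) -> List[str]:
    """Return soft warnings; an empty list means nothing to report."""
    params = bundle.get("verified_parameters") or {}
    if not K_CLASS_KEYS & set(params):
        return []  # no K-class parameter set, block not expected

    block = bundle.get("physical_realism_check")
    if block is None:
        return ["K-class parameter set but `physical_realism_check` is missing"]

    warnings: List[str] = []
    ratio = block.get("k_realism_ratio")
    justification = (block.get("justification") or "").strip()
    # Assumed heuristic: fewer than `min_words` words counts as perfunctory.
    perfunctory = len(justification.split()) < min_words
    if ratio is not None and ratio < threshold and perfunctory:
        warnings.append(
            f"k_realism_ratio={ratio} is below {threshold} and "
            "`justification` is empty or perfunctory"
        )
    return warnings


bundle = {
    "verified_parameters": {"kv_blocks": 1000},
    "physical_realism_check": {"k_realism_ratio": 0.041, "justification": "n/a"},
}
print(check_physical_realism(bundle))
```

Because the block is optional, the check returns warnings rather than raising, which keeps existing bundles valid.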

## Acceptance criteria

- [ ] Bundle schema documents `physical_realism_check`.
- [ ] DESIGN methodology prompt instructs the agent to populate it for K-class parameter choices.
- [ ] Validator emits a soft warning when `k_realism_ratio` is below the threshold and `justification` is missing or perfunctory.
- [ ] Friction report F15 row in the tracking issue is checked off.

## Severity

HIGH (paper-defensibility): if the realism check isn't surfaced, the paper is vulnerable to the reviewer criticism "you constructed your own contention".

## Source

`friction-report.md` F15, paper-memorytime-mirage campaign (2026-05).


---

Part of friction-report tracking issue #245.

