Commit ee68d7c

Implemented gaps in schema

1 parent 0c55e95 commit ee68d7c

21 files changed
Lines changed: 758 additions & 43 deletions

README.md

Lines changed: 46 additions & 2 deletions
@@ -162,6 +162,7 @@ Supported values for `warnings.contextSize` are `auto`, `off`, `result-only`, `c
 - **Overrides** — Environment and tier-based overrides (base → env → tier → runtime)
 - **5 provider adapters** — OpenAI (Chat), OpenAI (Responses), Anthropic, Gemini, OpenRouter — body-only output
 - **Provider-aware input caching controls** — optional `cache` front matter maps to OpenAI prompt cache hints, Anthropic `cache_control`, and Gemini `cachedContent`
+- **Vendor escape hatch** — optional `raw.<provider>` blocks shallow-merge unmodeled request-body fields into the final provider payload
 - **Validation** — Zod schema validation, Levenshtein-based "did you mean?" for typos, variable usage checks
 - **Context hardening** — structured regexes with flags, `/pattern/i` convenience syntax, and built-in `non_empty` / `reject_secrets` validators
 - **Optional short-circuit messages** — validators can return a structured `returnMessage` instead of throwing when configured
@@ -250,6 +251,48 @@ const request = openaiAdapter.render(prompt, {
 
 In browser or client-side code, keep provider credentials on the server. Use the rendered request body with your own server endpoint, server action, or edge function rather than calling a provider directly from the client.
 
+### Provider-specific fields and raw passthrough
+
+Use normalized fields first (`sampling`, `response`, `cache`, `tools`) so prompts stay portable. `response.schema` is the neutral JSON Schema path; adapters emit it as OpenAI/OpenRouter `response_format`, OpenAI Responses `text.format`, Anthropic `output_config.format`, and Gemini `generationConfig.responseJsonSchema`.
+
+Use `provider_options` when PromptOpsKit has a known provider-specific mapping, such as Anthropic `top_k`, Gemini's native `response_schema`, or OpenRouter routing fields.
+
+```yaml
+response:
+  format: json
+  schema_name: support_reply
+  schema_description: Structured support reply
+  schema:
+    type: object
+    properties:
+      answer:
+        type: string
+provider_options:
+  openrouter:
+    provider:
+      order: ["anthropic", "openai"]
+    transforms: ["middle-out"]
+```
+
+When a provider adds a body field PromptOpsKit does not model yet, use `raw`:
+
+```yaml
+raw:
+  openai:
+    service_tier: flex
+  anthropic:
+    service_tier: auto
+  gemini:
+    safetySettings:
+      - category: HARM_CATEGORY_DANGEROUS_CONTENT
+        threshold: BLOCK_ONLY_HIGH
+  openrouter:
+    usage:
+      include: true
+```
+
+Each adapter reads only its matching raw block and shallow-merges it into the generated request body after normalized mappings. This is intentionally an escape hatch; prefer first-class fields when they exist.
+
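The merge order described in the added section can be sketched with plain objects — a hypothetical illustration of the documented behavior (normalized mappings first, the provider's `raw` block shallow-merged last), not PromptOpsKit's actual source:

```typescript
// Sketch: the final body starts from the normalized mappings; the matching
// raw block is shallow-merged last, so its top-level keys win on collision.
type Body = Record<string, any>;

function finalizeBody(normalized: Body, rawBlock: Body): Body {
  return { ...normalized, ...rawBlock };
}

// Hypothetical OpenAI adapter output plus a `raw.openai` block:
const body = finalizeBody(
  { model: "gpt-5.4", temperature: 0.7 }, // generated from front matter
  { service_tier: "flex" }                // raw.openai passthrough
);
// body: { model: "gpt-5.4", temperature: 0.7, service_tier: "flex" }
```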
 On the server, adapters also provide async prompt-aware helpers so you can use the default `./prompts` and `./.generated-prompts/json` directories without creating a `PromptOpsKit` instance:
 
 ```typescript
@@ -561,10 +604,11 @@ Prompt files use YAML front matter with these fields:
 | `fallback_models` | `string[]` | Fallback model list |
 | `reasoning` | `object` | `{ effort, budget_tokens }` |
 | `sampling` | `object` | `{ temperature, top_p, frequency_penalty, presence_penalty, stop, max_output_tokens }` |
-| `response` | `object` | `{ format, stream, schema, schema_name, schema_strict }` |
+| `response` | `object` | `{ format, stream, schema, schema_name, schema_description, schema_strict }` |
 | `cache` | `object` | Provider-specific cache controls (`openai`, `anthropic`, `gemini`/`google`) |
 | `tools` | `array` | Tool references (string names or inline definitions) |
-| `provider_options` | `object` | Provider-specific non-portable options (`anthropic`, `gemini`) |
+| `provider_options` | `object` | Provider-specific non-portable options (`anthropic`, `gemini`, `openrouter`) |
+| `raw` | `object` | Provider-scoped request-body passthrough (`openai`, `openai-responses`, `anthropic`, `gemini`/`google`, `openrouter`) |
 | `mcp` | `object` | MCP server references |
 | `context` | `object` | `{ inputs, history }` — declare expected variables, with optional per-input `max_size`, `trim`, structured or literal `allow_regex`/`deny_regex`, and built-in `non_empty` / `reject_secrets` validators |
 | `includes` | `string[]` | Paths to included prompt files |

SKILL.md

Lines changed: 61 additions & 3 deletions

@@ -63,10 +63,11 @@ the fields required by that specific file:
 | `fallback_models` | string[] | no | Ordered fallback model list |
 | `reasoning` | object | no | `{ effort: low|medium|high, budget_tokens: number }` |
 | `sampling` | object | no | `{ temperature, top_p, frequency_penalty, presence_penalty, stop, max_output_tokens }` |
-| `response` | object | no | `{ format: text|json|markdown, stream: boolean, schema?: object, schema_name?: string, schema_strict?: boolean }` |
+| `response` | object | no | `{ format: text|json|markdown, stream: boolean, schema?: object, schema_name?: string, schema_description?: string, schema_strict?: boolean }` |
 | `cache` | object | no | Provider-specific cache controls (`openai`, `anthropic`, `gemini`/`google`) |
 | `tools` | array | no | Tool names (strings) or inline definitions with `{ name, description, input_schema }` |
-| `provider_options` | object | no | Provider-specific advanced options (`anthropic`, `gemini`) |
+| `provider_options` | object | no | Provider-specific advanced options (`anthropic`, `gemini`, `openrouter`) |
+| `raw` | object | no | Provider-scoped request-body passthrough for unmodeled vendor fields |
 | `mcp` | object | no | `{ servers: [string | { name, config }] }` |
 | `context.inputs` | `Array<string | { name, max_size?, trim?, allow_regex?, deny_regex?, non_empty?, reject_secrets? }>` | no | Declared variable names used in templates, with optional size budgets and runtime hardening controls |
 | `context.history` | object | no | `{ max_items: number }` |
@@ -232,12 +233,69 @@ tiers:
 ```
 
 Overridable fields: `model`, `fallback_models`, `reasoning`, `sampling`,
-`response`, `cache`, `tools`, `provider_options`.
+`response`, `cache`, `raw`, `tools`, `provider_options`.
 
 Override application order: **base → environment → tier → runtime**.
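The application order above can be illustrated as a chain of shallow merges — a rough sketch under the assumption that object fields merge one level deep, not the library's real implementation:

```typescript
// Rough sketch of base → environment → tier → runtime override resolution.
// Object-valued fields are shallow-merged; scalars are replaced outright.
type Layer = Record<string, any>;

function applyOverrides(...layers: Layer[]): Layer {
  return layers.reduce<Layer>((acc, layer) => {
    const out: Layer = { ...acc };
    for (const [key, value] of Object.entries(layer)) {
      const prev = out[key];
      const bothObjects =
        prev && value &&
        typeof prev === "object" && typeof value === "object" &&
        !Array.isArray(prev) && !Array.isArray(value);
      out[key] = bothObjects ? { ...prev, ...value } : value;
    }
    return out;
  }, {});
}

const resolved = applyOverrides(
  { model: "gpt-5.4-mini", sampling: { temperature: 0.7, top_p: 0.9 } }, // base
  { sampling: { temperature: 0.2 } },                                    // environment
  { model: "gpt-5.4" },                                                  // tier
  { raw: { openai: { service_tier: "flex" } } }                          // runtime
);
// resolved.model === "gpt-5.4"; resolved.sampling keeps top_p while the
// environment's temperature wins.
```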

 ---
 
+## Provider-specific fields
+
+Prefer portable fields first:
+
+- Use `sampling` for common sampling controls
+- Use `response.schema`, `response.schema_name`, `response.schema_description`, and `response.schema_strict` for structured output when possible
+- Use `cache` for provider cache hints
+- Use `tools` for tool definitions
+
+Treat `response.schema` as the provider-neutral JSON Schema contract. The adapters emit it through provider-specific request fields: OpenAI/OpenRouter `response_format`, OpenAI Responses `text.format`, Anthropic `output_config.format`, and Gemini `generationConfig.responseJsonSchema`.
+
+Use `provider_options` for known non-portable mappings:
+
+```yaml
+provider_options:
+  anthropic:
+    top_k: 40
+    tool_choice:
+      type: auto
+    output_config:
+      format:
+        type: json_schema
+        schema:
+          type: object
+  gemini:
+    # Use only when Gemini's native schema dialect is required.
+    response_schema:
+      type: object
+    response_json_schema:
+      type: object
+  openrouter:
+    provider:
+      order: ["anthropic", "openai"]
+    transforms: ["middle-out"]
+```
+
+Use `raw` only when a vendor request-body field is important and PromptOpsKit does not model it yet:
+
+```yaml
+raw:
+  openai:
+    service_tier: flex
+  anthropic:
+    service_tier: auto
+  gemini:
+    safetySettings:
+      - category: HARM_CATEGORY_DANGEROUS_CONTENT
+        threshold: BLOCK_ONLY_HIGH
+  openrouter:
+    usage:
+      include: true
+```
+
+Raw blocks are provider-scoped (`openai`, `openai-responses`/`openai_responses`, `anthropic`, `gemini`/`google`, `openrouter`) and are shallow-merged into the final request body after normalized fields. When adding `raw`, include a short note in `# Notes` explaining why a first-class field is not being used.
+
+---
+
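Provider scoping with the aliases listed above can be sketched as a simple lookup — the `pickRawBlock` helper and alias table are hypothetical, written only to illustrate that each adapter reads just its own block:

```typescript
// Hypothetical alias-aware lookup: each adapter resolves only its own raw block.
const RAW_ALIASES: Record<string, string[]> = {
  openai: ["openai"],
  "openai-responses": ["openai-responses", "openai_responses"],
  anthropic: ["anthropic"],
  gemini: ["gemini", "google"],
  openrouter: ["openrouter"],
};

function pickRawBlock(
  raw: Record<string, any> | undefined,
  provider: string
): Record<string, any> {
  if (!raw) return {};
  for (const key of RAW_ALIASES[provider] ?? [provider]) {
    if (key in raw) return raw[key];
  }
  return {};
}

const raw = { openai: { service_tier: "flex" }, google: { safetySettings: [] } };
const geminiBlock = pickRawBlock(raw, "gemini");       // matches the `google` alias
const anthropicBlock = pickRawBlock(raw, "anthropic"); // nothing declared: {}
```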
 ## Test sidecars
 
 Create a `.test.yaml` file alongside a prompt to define test cases:
docs/api-reference.md

Lines changed: 3 additions & 1 deletion

@@ -247,7 +247,9 @@ const request = adapter.render(resolvedAsset, {
 });
 ```
 
-`RuntimeRenderOptions` for direct adapter rendering supports `environment`, `tier`, `runtime`, `variables`, `onContextOverflow`, `history`, `toolRegistry`, and `strict`.
+`RuntimeRenderOptions` for direct adapter rendering supports `environment`, `tier`, `runtime`, `variables`, `onContextOverflow`, `history`, `toolRegistry`, `strict`, and `openaiResponses`.
+
+Runtime overrides can include the same overridable front matter fields as `environments` and `tiers`, including `raw` provider passthrough blocks. Raw blocks are merged into provider request bodies after normalized fields and provider-specific options.
 
 ## Standalone `renderPrompt`
 
docs/overrides.md

Lines changed: 4 additions & 3 deletions

@@ -51,12 +51,13 @@ Only these fields can be overridden in `environments` and `tiers`:
 | `fallback_models` | `string[]` | Fallback model list |
 | `reasoning` | `object` | `{ effort, budget_tokens }` |
 | `sampling` | `object` | `{ temperature, top_p, frequency_penalty, presence_penalty, stop, max_output_tokens }` |
-| `response` | `object` | `{ format, stream, schema, schema_name, schema_strict }` |
+| `response` | `object` | `{ format, stream, schema, schema_name, schema_description, schema_strict }` |
 | `cache` | `object` | Provider-specific cache controls (`openai`, `anthropic`, `gemini`/`google`) |
+| `raw` | `object` | Provider-specific request-body passthrough blocks |
 | `tools` | `array` | Tool references |
-| `provider_options` | `object` | Provider-specific advanced options (`anthropic`, `gemini`) |
+| `provider_options` | `object` | Provider-specific advanced options (`anthropic`, `gemini`, `openrouter`) |
 
-Object fields (`reasoning`, `sampling`, `response`, `cache`, `provider_options`) are shallow-merged — individual sub-fields are replaced, but you don't need to repeat every sub-field.
+Object fields (`reasoning`, `sampling`, `response`, `cache`, `raw`, `provider_options`) are shallow-merged — individual sub-fields are replaced, but you don't need to repeat every sub-field.
 
 ## Applying overrides
 
docs/prompt-format.md

Lines changed: 55 additions & 1 deletion

@@ -132,6 +132,44 @@ cache:
 - `gemini.cached_content` (or `google.cached_content`) maps to `cachedContent` for requests that reuse a previously created Gemini cache.
 - You can safely include multiple provider blocks in the same prompt. Each adapter only reads its own block (`openai`, `anthropic`, or `gemini`/`google`) and ignores the others.
 
+## Structured JSON output
+
+Use the neutral `response` block for structured JSON whenever possible:
+
+```yaml
+response:
+  format: json
+  schema_name: support_reply
+  schema_description: Structured support reply
+  schema:
+    type: object
+    properties:
+      answer:
+        type: string
+```
+
+Adapters emit this JSON Schema through the provider-specific body shape: OpenAI/OpenRouter `response_format`, OpenAI Responses `text.format`, Anthropic `output_config.format`, and Gemini `generationConfig.responseJsonSchema`.
+
+Use provider-specific schema fields only when the vendor dialect itself matters, such as `provider_options.gemini.response_schema` for Gemini's native schema form.
+
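As a sketch of that mapping for one adapter, the neutral block could translate to OpenAI's documented `response_format` shape roughly like this (the `toOpenAIResponseFormat` helper is hypothetical, not the library's API):

```typescript
// Hypothetical neutral-to-OpenAI mapping: the same `response.schema` would be
// emitted through other field names by the other adapters.
interface NeutralResponse {
  format: "text" | "json" | "markdown";
  schema?: object;
  schema_name?: string;
  schema_description?: string;
  schema_strict?: boolean;
}

function toOpenAIResponseFormat(res: NeutralResponse): any {
  if (res.format !== "json" || !res.schema) return undefined;
  return {
    type: "json_schema",
    json_schema: {
      name: res.schema_name ?? "response",
      description: res.schema_description,
      schema: res.schema,
      strict: res.schema_strict ?? false,
    },
  };
}

const rf = toOpenAIResponseFormat({
  format: "json",
  schema_name: "support_reply",
  schema_description: "Structured support reply",
  schema: { type: "object", properties: { answer: { type: "string" } } },
});
// rf.json_schema carries the neutral schema under the provider's field layout.
```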
+## Raw provider passthrough
+
+Use `raw` when a provider supports a request-body field that PromptOpsKit does not expose yet:
+
+```yaml
+raw:
+  openai:
+    service_tier: flex
+  anthropic:
+    service_tier: auto
+  gemini:
+    safetySettings:
+      - category: HARM_CATEGORY_DANGEROUS_CONTENT
+        threshold: BLOCK_ONLY_HIGH
+```
+
+Each adapter reads only its own raw block. Raw values are shallow-merged into the generated body after normalized mappings, so they can override generated fields. Prefer normalized fields (`sampling`, `response`, `cache`, `tools`) and `provider_options` first; reserve `raw` for vendor-specific fields that would otherwise be impossible to send.
+
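Because the merge is shallow and happens last, a colliding top-level raw key replaces the generated value wholesale — a quick plain-object illustration of why first-class fields are safer:

```typescript
// A raw key that collides with a generated key replaces it entirely.
const generated = {
  model: "gpt-5.4",
  response_format: { type: "json_schema" }, // built from the normalized `response` block
};
const rawBlock = {
  service_tier: "flex",
  response_format: { type: "text" }, // collision: wipes out the generated schema config
};

const finalBody = { ...generated, ...rawBlock };
// finalBody.response_format is now { type: "text" } — the generated object is gone.
```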
 ## Sections
 
 The Markdown body is split on **H1 headings** into named sections. Three section names are recognized (case-insensitive):

@@ -261,7 +299,20 @@ sampling:
   temperature: 0.7
   max_output_tokens: 2048
 response:
-  format: text
+  format: json
+  schema_name: support_reply
+  schema_description: Structured support reply
+  schema:
+    type: object
+    properties:
+      answer:
+        type: string
+    required:
+      - answer
+provider_options:
+  openrouter:
+    transforms:
+      - middle-out
 context:
   inputs:
     - user_message

@@ -285,6 +336,9 @@ tiers:
     model: gpt-5.4-mini
   pro:
     model: gpt-5.4
+    raw:
+      openai:
+        service_tier: flex
 metadata:
   owner: support-platform
   review_required: true