feat(server): opt-in model pruning via [server].prune_models

lightsofapollo · claude · lightsofapollo · commit 7359d67ba147 · 2026-05-10T21:40:38.000-06:00
Adds `[server].prune_models = true` (default false). When enabled the
generator drops `analysis.schemas` entries unreachable from the
picked operations before handing off to CodeGenerator. The OpenAI
example flips it on and drops 280 of 2154 schemas (13%,
trimming src/gen/types.rs by ~3300 lines) with all 4 example tests
still green.

Algorithm:
  1. Seed the keep set with every schema referenced from picked
     ops' request body, response bodies, and parameter shapes.
  2. Also seed with every schema name never `$ref`-d anywhere
     in the spec — these include analyzer-synthesised inline
     enums (e.g. `WebSearchApproximateLocationType` from an inline
     `type: enum` field on a parent struct) that the codegen emits
     as siblings of their parent but doesn't track via $refs.
  3. BFS over each kept schema's raw JSON walking every `$ref`,
     plus the analyzer's `AnalyzedSchema.dependencies` set as
     belt-and-braces.
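
A minimal sketch of the three steps above, under simplifying
assumptions: `SchemaNode` and `reachable` are hypothetical names, each
schema is reduced to the list of names it `$ref`s (the real walk reads
raw JSON and also consults `AnalyzedSchema.dependencies`), and
`op_roots` stands in for the schemas referenced by the picked ops.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical stand-in for an analysed schema: only its outgoing `$ref`s.
pub struct SchemaNode {
    pub refs: Vec<String>,
}

/// Returns the names to keep. Illustrative only, not the shipped code.
pub fn reachable(
    schemas: &BTreeMap<String, SchemaNode>,
    op_roots: &[&str],
) -> BTreeSet<String> {
    // Every name that is the target of at least one schema-to-schema `$ref`.
    let referenced: BTreeSet<&str> = schemas
        .values()
        .flat_map(|s| s.refs.iter().map(String::as_str))
        .collect();

    // Step 1: seed with schemas the picked operations touch.
    let mut seeds: Vec<String> = op_roots.iter().map(|r| r.to_string()).collect();
    // Step 2: seed with every schema that nothing `$ref`s.
    seeds.extend(
        schemas
            .keys()
            .filter(|name| !referenced.contains(name.as_str()))
            .cloned(),
    );

    let mut keep = BTreeSet::new();
    let mut queue = VecDeque::new();
    for name in seeds {
        if schemas.contains_key(&name) && keep.insert(name.clone()) {
            queue.push_back(name);
        }
    }

    // Step 3: breadth-first walk over `$ref` edges from the kept set.
    while let Some(name) = queue.pop_front() {
        for next in &schemas[&name].refs {
            if schemas.contains_key(next) && keep.insert(next.clone()) {
                queue.push_back(next.clone());
            }
        }
    }
    keep
}
```

With schemas A -> B, a C <-> D cycle, and an unreferenced E, picking an
op that touches A keeps A, B, and E. The C/D cycle is dropped because
both names are `$ref`-d somewhere yet unreachable from any seed.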

Limitation: step 2 is conservative. Besides the synthesised enums, it
keeps every real spec schema reached only via operations or multipart
bodies (those bypass schema-to-schema $refs), which limits the
achievable reduction.
A future analyzer change to track synthesised-from edges directly
would unlock >50% pruning safely. Tracked as openapi-generator-id-tbd.
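
The edge-tracking idea could look roughly like this. It is an
illustration of a design that does not exist yet: `OwnedSchema`, its
`synthesised_from` field, and `reachable_with_ownership` are assumed
names, and schemas are again reduced to their outgoing `$ref` lists.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical analysed schema carrying a parent edge for synthetics.
pub struct OwnedSchema {
    pub refs: Vec<String>,
    /// Assumed future field: the schema this one was synthesised from.
    pub synthesised_from: Option<String>,
}

/// Walks `$ref`s from the picked ops and pulls in synthetics owned by
/// any kept schema, without the "never `$ref`-d" seed.
pub fn reachable_with_ownership(
    schemas: &BTreeMap<String, OwnedSchema>,
    op_roots: &[&str],
) -> BTreeSet<String> {
    // Invert the parent edge: parent name -> synthetic children.
    let mut owned_by: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (name, schema) in schemas {
        if let Some(parent) = &schema.synthesised_from {
            owned_by.entry(parent.clone()).or_default().push(name.clone());
        }
    }

    let mut keep = BTreeSet::new();
    let mut queue = VecDeque::new();
    for root in op_roots {
        if schemas.contains_key(*root) && keep.insert(root.to_string()) {
            queue.push_back(root.to_string());
        }
    }

    while let Some(name) = queue.pop_front() {
        let refs = schemas[&name].refs.iter();
        let owned = owned_by.get(&name).into_iter().flatten();
        // Follow both `$ref` edges and ownership edges.
        for next in refs.chain(owned) {
            if schemas.contains_key(next) && keep.insert(next.clone()) {
                queue.push_back(next.clone());
            }
        }
    }
    keep
}
```

Here a synthetic such as `ResponseErrorCode` survives only when its
parent `ResponseError` is kept, so unreferenced schemas outside the
picked ops no longer ride along.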

UX:
  - The generator prints "✂️  Pruned N schema(s) outside the server
    scope (M remain)".
  - When the HTTP client is also enabled, a warning notes that it
    will only see types reachable from picked server ops.
  - Config field documented as informational/opt-in: clients usually
    want all types, so pruning defaults off.

Follow-ups filed:
  - openapi-generator-vl2: parallel [client].operations for symmetric
    server/client selective generation.

318 main suite + 4 OpenAI example + 2 Anthropic example tests green;
clippy -D warnings clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl
@@ -20,6 +20,8 @@
 {"id":"openapi-generator-st8","title":"[Q3] Builder pattern for operations with many parameters","description":"OpenAI's responses_create has 25+ parameters. Even with Option\u003cT\u003e for optionals, the call site is hostile: client.responses_create(model, None, None, ..., Some('system prompt'), None, ...). Goal: emit a \u003cOp\u003eBuilder\u003c'_\u003e per op with .field(value) setters and a final .send().await. Required path/header params remain positional on the entry method; optional + body fields become builder setters. For struct-typed bodies, also generate per-field setters on the builder (delegating into the body struct).\n\n## Context\nFiles: src/client_generator.rs. Evidence: src/client_generator.rs:836 generate_request_param emits flat positional method args. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] [generator.builders] enabled = true; threshold = 3 in TOML config.\n- [ ] Each operation with \u003ethreshold optional params gets a builder struct.\n- [ ] Required params stay positional on the entry method.\n- [ ] .send(self) -\u003e Result\u003c\u003cResponseT\u003e, ApiOpError\u003c...\u003e\u003e runs the existing emitted body.\n- [ ] Snapshot tests for an op with many optional params show the new shape compiles and the existing call compiles.\n- [ ] All 49 currently-compiling specs still compile.","status":"open","priority":2,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:11:55Z","created_by":"James Lal","updated_at":"2026-05-08T23:11:55Z","labels":["codegen","phase4","quality"],"dependency_count":0,"dependent_count":1,"comment_count":0}
 {"id":"openapi-generator-quq","title":"[Q2] Format-typed scalars (date-time, uuid, byte, binary, ipv4, ipv6, uri)","description":"Real-world specs use 'format' tags everywhere. Today everything collapses to String/Vec\u003cu8\u003e. This issue adds typed scalars to the generator with **on-by-default** behavior and per-format opt-out via [generator.types] TOML.\n\n## Defaults (flipped to opt-out model)\n\n| format | default strategy | rust type | opt-out |\n|---|---|---|---|\n| date-time | chrono | chrono::DateTime\u003cUtc\u003e | = \"string\" or \"time\" |\n| date | chrono | chrono::NaiveDate | = \"string\" or \"time\" |\n| time | chrono | chrono::NaiveTime | = \"string\" or \"time\" |\n| duration | chrono | chrono::Duration | = \"string\" or \"iso8601\" |\n| uuid | uuid | uuid::Uuid | = \"string\" |\n| byte | base64 | Vec\u003cu8\u003e + inline base64_serde mod | = \"string\" or \"vec_u8\" |\n| binary | bytes | bytes::Bytes | = \"string\" or \"vec_u8\" |\n| ipv4/ipv6 | std | std::net::Ipv*Addr | = \"string\" |\n| uri | url | url::Url | = \"string\" |\n| email | string (off) | String | = \"email_address\" to opt in |\n\n## Implementation\n\nGoes through new TypeMapper chokepoint (see Q2.0). Each used optional crate is reported via REQUIRED_DEPS.toml (see Q2.8).\n\n## Context\nFiles: src/analysis.rs (lines 2967, 1151), src/generator.rs, src/type_mapping.rs (new). Evidence: src/analysis.rs:2973 returns bare \"String\" for OpenApiSchemaType::String regardless of format. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] [generator.types] TOML section with per-format strategy strings.\n- [ ] Each format's default is on (typed) when crate is small/common; opt-out via = \"string\".\n- [ ] CLI --types-conservative flag sets all strategies back to \"string\" for regression bisects.\n- [ ] date-time uses chrono::serde::rfc3339 codec.\n- [ ] uuid uses uuid::Uuid with serde feature.\n- [ ] byte round-trips via base64 (inline mod base64_serde, no runtime crate).\n- [ ] binary uses bytes::Bytes with serde feature.\n- [ ] One conformance fixture per format under tests/conformance/fixtures/schema/format-*.yaml.\n- [ ] All 49 currently-compiling specs still compile under default config (i.e. with typed scalars on).\n- [ ] All 49 specs also still compile under --types-conservative.","status":"closed","priority":2,"issue_type":"task","assignee":"James Lal","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:11:40Z","created_by":"James Lal","updated_at":"2026-05-09T08:59:12Z","started_at":"2026-05-09T06:44:01Z","closed_at":"2026-05-09T08:59:12Z","close_reason":"Q2 typed-scalar formats land with flipped defaults (chrono/uuid/url/bytes/std::net::Ip*Addr/base64+codec). TypeMappingConfig switched from Option\u003cString\u003e placeholders to enum-typed strategies (DateStrategy/UuidStrategy/ByteStrategy/...) with opt-out per format. Wired through SchemaType::Primitive's new serde_with field, surfaced via #[serde(with = ...)] in generator. base64_serde helper module (with Option submodule for nullable byte fields) emitted only when format:byte is actually used. type_lacks_default extended for chrono/url/time types. --types-conservative CLI flag collapses everything back to String for bisecting. spec-compile gate: all 54 specs pass with default typed-on config; 1 skipped (gitea, baseline). Integration suite: zero failures. New tests: 10 typed-scalar end-to-end + 7 TypeMapper unit tests. Email + duration kept off by default (email less universal; chrono::Duration's native serde is seconds, not ISO 8601 — proper duration support is a follow-up).","labels":["phase4","quality","schema"],"dependencies":[{"issue_id":"openapi-generator-quq","depends_on_id":"openapi-generator-r36","type":"blocks","created_at":"2026-05-08T23:37:02Z","created_by":"James Lal","metadata":"{}"}],"dependency_count":1,"dependent_count":1,"comment_count":0}
 {"id":"openapi-generator-99a","title":"[Q1] Method-name canonicalization","description":"Heuristic post-processor on snake-cased operationId: tokenize path template, drop trailing tokens that match path tokens (in reverse path order), drop trailing HTTP-method verb. Re-check uniqueness; restore tokens for collisions. Goal: Anthropic's betaGetFileMetadataV1FilesFileIdGet + path /v1/files/{fileId} + GET → get_file_metadata.\n\n## Context\nToday get_method_name emits op.operation_id.to_snake_case() verbatim. Anthropic's spec produces names like beta_get_file_metadata_v1_files_file_id_get — the path and HTTP method are literally appended into the operationId. See umbrella issue gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] Heuristic implemented in src/client_generator.rs:get_method_name (line ~859).\n- [ ] Unique across operation set; collisions fall back to original.\n- [ ] CLI/config flag [generator.method_names] strip_path = true (default true).\n- [ ] Snapshot tests confirm anthropic produces get_file_metadata not beta_get_file_metadata_v1_files_file_id_get.\n- [ ] All 49 currently-compiling specs still compile.","status":"open","priority":2,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:10:47Z","created_by":"James Lal","updated_at":"2026-05-08T23:10:47Z","labels":["codegen","phase4","quality"],"dependencies":[{"issue_id":"openapi-generator-99a","depends_on_id":"openapi-generator-st8","type":"blocks","created_at":"2026-05-08T17:11:55Z","created_by":"James Lal","metadata":"{}"}],"dependency_count":1,"dependent_count":0,"comment_count":0}
+{"id":"openapi-generator-vl2","title":"[Client] Selective operations option (parallel to [server].operations)","description":"Today the client generator emits methods for every operation in the spec. For users who only need a subset, a parallel '[client] operations = [\"opId\", ...]' selection would mirror the server-side opt-in.\n\nCombined with model pruning, this becomes the dual scenario: pick the ops you call (client) AND the ops you host (server), prune to the union of both reachable sets. The selector grammar from src/server/selector.rs is reusable as-is.\n\nFor now this is filed under 'maybe useful'. Most client users want every op. But the symmetric server-client design would be cleaner once it exists.","status":"open","priority":3,"issue_type":"feature","owner":"james@littlebearlabs.io","created_at":"2026-05-11T03:40:20Z","created_by":"James Lal","updated_at":"2026-05-11T03:40:20Z","dependency_count":0,"dependent_count":0,"comment_count":0}
+{"id":"openapi-generator-q3k","title":"[Server] Aggressive model pruning via analyzer-tracked synthetic ownership","description":"The current [server].prune_models implementation walks transitive $refs from picked ops, then keeps every schema not referenced by any $ref anywhere as a 'synthetic'. For OpenAI's spec this yields ~13% reduction because many real spec schemas are reached only via operations or multipart bodies, making 'never $ref'd' a poor synthetic signal.\n\nTo get \u003e50% reduction safely, the analyzer needs to track which synthetic enums/structs belong to which parent schema. Concretely: when analysis registers WebSearchApproximateLocationType as a synthetic of WebSearchApproximateLocation's inline 'type: enum' field, it should record the parent→synthetic edge in DependencyGraph or in AnalyzedSchema (new field 'synthesised_from: Option\u003cString\u003e').\n\nWith that edge tracked, the prune walk becomes: walk transitive $refs from picked ops, then for every kept name, also keep all schemas whose synthesised_from points at it. That's both more aggressive and more correct than the current heuristic.\n\nRelated: the analyzer's existing AnalyzedSchema.dependencies field is also incomplete (Response.deps lists ResponseError but ResponseError.deps is empty even though it has a field of type ResponseErrorCode). Same root cause — analyzer registers synthetic siblings but doesn't track ownership.","notes":"Discovered while implementing prune_models in commit (current). Conservative impl ships; aggressive impl requires analyzer changes.","status":"open","priority":3,"issue_type":"feature","owner":"james@littlebearlabs.io","created_at":"2026-05-11T03:40:12Z","created_by":"James Lal","updated_at":"2026-05-11T03:40:12Z","dependency_count":0,"dependent_count":0,"comment_count":0}
 {"id":"openapi-generator-in6","title":"[Server] Anthropic spec missing text/event-stream content type on messages_post","description":"Anthropic's published OpenAPI spec (specs/anthropic.yaml) declares POST /v1/messages 200 response with content-type application/json only. The real API streams when stream:true is set on the request body, but the spec never declares text/event-stream as a valid response content type.\n\nConsequence: 'server list' does not mark messages_post as [SSE], and downstream server codegen will not emit an SSE response variant for it. Both are technically correct given the spec text.\n\nMitigation options:\n1. Use the existing schema-extensions mechanism to overlay a text/event-stream response on /v1/messages.\n2. Add a config knob ('force_stream_for_operations') that promotes nominated ops to streaming regardless of declared response content.\n3. Detect that the request body has a 'stream:bool' field and auto-promote (heuristic — risky).\n\nOption 1 is the path that fits the existing project model. Add an example extension file documenting how to do this, and reference it from the server codegen docs once P6 lands.","notes":"Discovered while validating server P1 against specs/anthropic.yaml. messages_post is one of our two canonical test cases (umbrella sot, P6 9ek).","status":"closed","priority":3,"issue_type":"bug","owner":"james@littlebearlabs.io","created_at":"2026-05-11T01:54:29Z","created_by":"James Lal","updated_at":"2026-05-11T02:41:32Z","closed_at":"2026-05-11T02:41:32Z","close_reason":"Fixed via examples/server-anthropic-messages/sse-overlay.json — declares text/event-stream on POST /v1/messages 200, which makes the generator emit MessagesPostResponse::OkStream. The Anthropic example now exercises both unary and streaming branches. Future docs/PRs should mention this pattern as the canonical fix for missing-content-type spec gaps.","dependency_count":0,"dependent_count":0,"comment_count":0}
 {"id":"openapi-generator-s42","title":"Propagate target schema nullability through $ref properties","description":"When a $ref points to a schema that is itself anyOf[Object, null] (e.g. OpenAI ResponseError), the property using that $ref should be wrapped in Option\u003c\u003e. Currently we strip the null branch when analyzing the target schema and emit a struct, then properties referencing that struct don't pick up nullability. Real hit: OpenAI Response.error — we currently require nullable_overrides to handle it. Fix would record nullability on AnalyzedSchema and OR it in at the property level when the prop_type is a Reference. Lower priority since the override workaround is documented.","status":"open","priority":3,"issue_type":"bug","owner":"james@littlebearlabs.io","created_at":"2026-05-11T00:13:05Z","created_by":"James Lal","updated_at":"2026-05-11T00:13:05Z","dependency_count":0,"dependent_count":0,"comment_count":0}
 {"id":"openapi-generator-tv8","title":"[Q2.5] Optional BTreeSet for uniqueItems arrays (opt-in)","description":"Arrays with uniqueItems: true (13,276 occurrences across specs/) currently emit Vec\u003cT\u003e. Spec-faithful representation is a set. Add [generator.types.shape] unique_items_to_set = false (default) — opt-in to emit BTreeSet\u003cT\u003e instead of Vec\u003cT\u003e. Off by default because flipping this changes the public API of every uniqueItems field across the corpus.\n\n## Context\nFiles: src/type_mapping.rs (Q2.0), src/analysis.rs (array analysis), src/generator.rs. Evidence: 13,276 uniqueItems usages in specs/, today all become Vec. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] [generator.types.shape] unique_items_to_set toggle works.\n- [ ] When on and item type implements Ord + Eq (primitives, strings, enums, named structs deriving them), array becomes BTreeSet\u003cT\u003e.\n- [ ] When on but item type isn't Ord (e.g. floats, complex unions), fall back to Vec\u003cT\u003e with a stderr warning naming the field.\n- [ ] All 49 specs still compile in default (off) mode.","status":"open","priority":3,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-09T05:36:01Z","created_by":"James Lal","updated_at":"2026-05-09T05:36:01Z","dependencies":[{"issue_id":"openapi-generator-tv8","depends_on_id":"openapi-generator-r36","type":"blocks","created_at":"2026-05-08T23:37:06Z","created_by":"James Lal","metadata":"{}"}],"dependency_count":1,"dependent_count":0,"comment_count":0}
diff --git a/examples/server-openai-responses/openapi-to-rust.toml b/examples/server-openai-responses/openapi-to-rust.toml
@@ -14,3 +14,8 @@ framework = "axum"
 #                   different tag, so it also exercises the multi-tag
 #                   combined `build_router` factory.
 operations = ["createResponse", "listInputItems", "usage-costs"]
+# Drop schemas not reachable from the picked operations (280 of the
+# 2154 schemas in the OpenAI spec with the current conservative walk).
+# Safe here because we don't generate the HTTP client, which would
+# otherwise lose types it needs.
+prune_models = true
diff --git a/src/bin/openapi-to-rust.rs b/src/bin/openapi-to-rust.rs
@@ -230,6 +230,29 @@ async fn run(cli: Cli) -> Result<(), Box<dyn std::error::Error>> {
             println!("📊 Found {} schemas", analysis.schemas.len());
             println!("📊 Found {} operations", analysis.operations.len());
 
+            // Optional: prune `analysis.schemas` to the transitive
+            // closure reachable from the picked server operations.
+            // Opt-in via `[server] prune_models = true`. When the HTTP
+            // client is also enabled we warn that it'll lose types not
+            // covered by the server scope.
+            if let Some(server_section) = generator_config.server.as_ref()
+                && server_section.prune_models
+                && !server_section.operations.is_empty()
+            {
+                let pruned_count = prune_analysis_models(&mut analysis, server_section)?;
+                println!(
+                    "✂️  Pruned {pruned_count} schema(s) outside the server scope ({} remain)",
+                    analysis.schemas.len()
+                );
+                if generator_config.enable_async_client || generator_config.enable_sse_client {
+                    eprintln!(
+                        "⚠️  prune_models = true: the HTTP client will only see types \
+                         reachable from picked server operations. Set prune_models = false \
+                         or extend [server].operations to keep additional types."
+                    );
+                }
+            }
+
             // Generate code
             println!("⚙️  Generating code...");
             let generator = CodeGenerator::new(generator_config);
@@ -538,6 +561,38 @@ fn print_add_summary(
     Ok(())
 }
 
+/// Drop schemas from `analysis.schemas` that are unreachable from
+/// the operations picked by `[server].operations`. Returns the count
+/// of schemas removed.
+fn prune_analysis_models(
+    analysis: &mut openapi_to_rust::SchemaAnalysis,
+    server: &openapi_to_rust::config::ServerSection,
+) -> Result<usize, Box<dyn std::error::Error>> {
+    use openapi_to_rust::server::OperationIndex;
+    use openapi_to_rust::server::codegen::reachable_schemas;
+
+    let index = OperationIndex::from_analysis(analysis);
+    let selectors: Vec<Selector> = server
+        .operations
+        .iter()
+        .map(|s| Selector::parse(s))
+        .collect::<Result<_, _>>()?;
+    let resolution = resolve_selectors(&selectors, &index)?;
+
+    // Translate resolved summaries back to full OperationInfo refs
+    // so reachability can walk request/response/parameter shapes.
+    let ops: Vec<&openapi_to_rust::analysis::OperationInfo> = resolution
+        .operations
+        .iter()
+        .filter_map(|s| analysis.operations.get(&s.operation_id))
+        .collect();
+
+    let keep = reachable_schemas(analysis, &ops);
+    let before = analysis.schemas.len();
+    analysis.schemas.retain(|k, _| keep.contains(k));
+    Ok(before - analysis.schemas.len())
+}
+
 /// Surface a paste-ready impl skeleton at the end of `generate`.
 /// Reads the picked operations from the analysis to name the trait,
 /// method, and body type concretely. Goes to stderr so it doesn't
diff --git a/src/config.rs b/src/config.rs
@@ -228,6 +228,14 @@ pub struct ServerSection {
     /// Empty ⇒ section is a no-op.
     #[serde(default)]
     pub operations: Vec<String>,
+    /// Emit only the model types reachable (transitively) from the
+    /// picked operations. Off by default because the bundled
+    /// `types.rs` is shared with the HTTP client generator and
+    /// pruning would silently drop types client code still needs.
+    /// Enable when generating a server-only crate to cut the
+    /// emitted surface (about 13% of schemas in the OpenAI example).
+    #[serde(default)]
+    pub prune_models: bool,
 }
 
 impl ServerSection {
diff --git a/src/server/codegen.rs b/src/server/codegen.rs