Commit 8b20770

fix(generator): compile 49 of 54 specs (was 43); broaden CI gold list
Continued chasing real-world spec failures through scripts/spec-compile.sh. 49 of 54 OpenAPI 3.x specs in specs/ now compile cleanly via cargo check (gitea is Swagger 2.0, skipped). Up from 43 in #18.

## Bugs fixed (in order of how many specs they unblocked)

1. **Wrong fallback arm for typed-error enums.** When an op had only `default` (no specific 2xx) error responses, op_error_type emitted the typed enum but the codegen's "no typed enum" arm tried `typed = Some(v)` where v: serde_json::Value, mismatching the typed slot. Aligned the conditions in client_generator.rs:1206 so the default arm becomes `typed = None` whenever any non-2xx response exists.
2. **Indirect cycles via union wrappers.** stripe's BankAccount → BankAccountCustomer (enum) → Customer → BankAccountCustomer cycle wasn't direct self-reference, so my prior self-ref Box fix didn't catch it. generate_union_enum and generate_discriminated_enum now also Box variant payloads whose target is in analysis.dependencies.recursive_schemas. Closed stripe (17 errs → 0), microsoft-graph (5 → 0), lithic (1535 → 0).
3. **Reserved std type names.** cloudflare has a schema literally named `Result`; emitting `pub enum Result` shadows std::result::Result, breaking every `-> Result<T, ApiOpError<...>>`. Also gcore had a `Default` schema shadowing std::default::Default. to_rust_type_name now appends `Type` to a small reserved-name set (Result, Option, Box, Vec, String, Default, Clone, Debug, Send, Sync, Sized, Iterator, From, Into, TryFrom, TryInto, AsRef, AsMut, Some, None, Ok, Err).
4. **Rust 2024 keyword `gen`.** vercel had fields/types named `gen`. Added to is_rust_keyword.
5. **Default derive on enum with no variant matching default.** telnyx has `default: "en"` on a language enum with values like `en-US`, `en-AU`, … (no exact match). We were emitting `#[derive(Default)]` without `#[default]` on any variant, triggering E0665. Now we drop the Default derive when no variant matches.
6. **Sort-enum negative-prefix collisions.** telnyx and gcore use `["created_at", "-created_at", "ASC", "-ASC", …]` for sort orders. Both PascalCased to the same Rust variant, causing E0428 on the inline param enum. generate_single_param_enum now dedupes variant names with `_2`/`_3`/… suffixes.
7. **Per-method parameter ident collisions.** vercel's `exclude_ids` + `exclude-ids`, modern-treasury's duplicate `name`, twilio's `StartTime`/`StartTime>` produced E0382 (use of moved value) and E0415 (binding declared twice) in generated bodies. Added `ParameterInfo.rust_ident` populated by the analyzer at operation scope; client_generator.rs consults it everywhere instead of sanitizing param.name independently per call site.
8. **Case-sensitive operationId collision detection.** telnyx had two ops with operationIds `getMdrUsageReports` and `GetMdrUsageReports`. These didn't collide string-wise but PascalCased to the same Rust ident, producing two `GetMdrUsageReportsApiError` enum definitions (E0428). T6's collision check now compares PascalCased forms.
9. **Non-string scalars in `enum`.** gitpod has `type: string, enum: [2000, 5000, 10000, ...]`: numeric values on a string-typed schema. string_enum_values used to filter to .as_str() only, producing an empty Vec → empty enum (E0665, E0004). Now coerces non-string scalars via Display.
10. **Unresolvable $refs.** pagerduty uses `#/components/parameters/foo/schema` (last segment `schema` isn't a type name). google-tasks uses Swagger 2.0 carry-over `#/definitions/Foo`. extract_schema_name now (a) recognises `#/definitions/{X}` as an alias for `#/components/schemas/{X}`, (b) tightens the last-segment fallback to require PascalCase and skip JSON Schema sub-path keywords, and (c) when a ref still can't be resolved, falls back to serde_json::Value with a stderr warning instead of failing whole-document analysis.
11. **Nullable-anyOf wrapper collisions with the inner $ref.** `Step.status: anyOf [$ref StepStatus, null]` synthesized a wrapper named `StepStatus` that overwrote the actual top-level schema. Detect `is_nullable_pattern` in property analysis and unwrap to the inner type. When a wrapper IS needed, suffix collisions with `Union2`/`Union3`.
12. **Type-name dedup at emission.** Defensive layer: if two analyzed schemas PascalCase to the same Rust ident, the first occurrence wins and later ones are silently dropped (catches cases where analysis missed the collision).

## CI

The spec-compile job now exercises 49 specs, up from 43:

+ gcore lithic microsoft-graph stripe telnyx vercel

## Quality follow-ups tracked in `bd` (`.beads/issues.jsonl`)

- Q1 Method-name canonicalization
- Q2 Format-typed scalars (date-time, uuid, byte, binary, ipv*, uri)
- Q3 Builder pattern for ops with many parameters (depends on Q1)
- Q4 Tagged discriminator enums
- Q5 Display for ApiOpError that surfaces the typed body

All 205 unit tests still pass; clippy + fmt clean.

Refs #14
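The sort-enum fix in item 6 is not part of the hunks shown on this page, so here is a minimal sketch of how `_2`/`_3` suffixing could work. The helper names `pascal_case` and `dedupe_variant_names` are hypothetical, and the PascalCase step is a simplified stand-in for whatever the generator actually uses; this is an illustration under those assumptions, not the code in generate_single_param_enum.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified PascalCase: split on non-alphanumeric
/// characters, capitalize each piece, lowercase the rest.
fn pascal_case(raw: &str) -> String {
    raw.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            // Safe: empty parts were filtered out above.
            let first = chars.next().unwrap().to_ascii_uppercase();
            let rest: String = chars.map(|c| c.to_ascii_lowercase()).collect();
            format!("{first}{rest}")
        })
        .collect()
}

/// Assigns each raw enum value a variant name, appending `_2`, `_3`, ...
/// when an earlier value already produced the same PascalCase name.
fn dedupe_variant_names(values: &[&str]) -> Vec<String> {
    let mut seen: HashMap<String, usize> = HashMap::new();
    values
        .iter()
        .map(|value| {
            let base = pascal_case(value);
            let count = seen.entry(base.clone()).or_insert(0);
            *count += 1;
            if *count == 1 {
                base
            } else {
                format!("{base}_{count}")
            }
        })
        .collect()
}
```

With the telnyx-style input `["created_at", "-created_at", "ASC", "-ASC"]` this sketch yields `CreatedAt`, `CreatedAt_2`, `Asc`, `Asc_2`. It does not guard against a raw value that already PascalCases to a suffixed name; a real implementation would need to re-check the suffixed candidate.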
1 parent 5c3a44c commit 8b20770

4 files changed

Lines changed: 115 additions & 22 deletions

File tree

.beads/issues.jsonl

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{"id":"openapi-generator-8tu","title":"[Q4] Tagged discriminator enums (drop untagged when discriminator+mapping is present)","description":"When a schema has discriminator: { propertyName: 'type', mapping: { ... } }, we know exactly which type to deserialize at runtime by reading one field. Yet today we still emit #[serde(untagged)] on the union enum, which makes serde try every variant in order on every deserialization (slow) and emits the variant payload's JSON inline instead of a tagged shape on serialization (loses the discriminator on round-trip). Anthropic's content blocks (text/image/tool_use/tool_result) and OpenAI's response items are exactly this pattern. Tagged is much better. Approach: in generate_discriminated_enum, when the spec provides discriminator with mapping, emit #[serde(tag = '\u003cdiscriminator.property_name\u003e')] and rename each variant to the mapping value. For unions WITHOUT a discriminator, untagged remains.\n\n## Context\nFiles: src/generator.rs. Evidence: src/generator.rs:1107 generate_discriminated_enum and 1251 generate_union_enum both emit #[serde(untagged)] regardless of discriminator presence. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] Discriminator + mapping → #[serde(tag = ...)] enum, not untagged.\n- [ ] Round-trip test: deserialize a JSON sample, serialize back, byte-equal modulo whitespace.\n- [ ] Variants ordered to match mapping insertion order (deterministic codegen).\n- [ ] Pet/Cat/Dog allOf-parent pattern (umbrella H12) supported.\n- [ ] All 49 currently-compiling specs still compile.","status":"open","priority":2,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:13:12Z","created_by":"James Lal","updated_at":"2026-05-08T23:13:12Z","labels":["phase4","quality","schema"],"dependency_count":0,"dependent_count":0,"comment_count":0}
{"id":"openapi-generator-st8","title":"[Q3] Builder pattern for operations with many parameters","description":"OpenAI's responses_create has 25+ parameters. Even with Option\u003cT\u003e for optionals, the call site is hostile: client.responses_create(model, None, None, ..., Some('system prompt'), None, ...). Goal: emit a \u003cOp\u003eBuilder\u003c'_\u003e per op with .field(value) setters and a final .send().await. Required path/header params remain positional on the entry method; optional + body fields become builder setters. For struct-typed bodies, also generate per-field setters on the builder (delegating into the body struct).\n\n## Context\nFiles: src/client_generator.rs. Evidence: src/client_generator.rs:836 generate_request_param emits flat positional method args. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] [generator.builders] enabled = true; threshold = 3 in TOML config.\n- [ ] Each operation with \u003ethreshold optional params gets a builder struct.\n- [ ] Required params stay positional on the entry method.\n- [ ] .send(self) -\u003e Result\u003c\u003cResponseT\u003e, ApiOpError\u003c...\u003e\u003e runs the existing emitted body.\n- [ ] Snapshot tests for an op with many optional params show the new shape compiles and the existing call compiles.\n- [ ] All 49 currently-compiling specs still compile.","status":"open","priority":2,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:11:55Z","created_by":"James Lal","updated_at":"2026-05-08T23:11:55Z","labels":["codegen","phase4","quality"],"dependency_count":0,"dependent_count":1,"comment_count":0}
{"id":"openapi-generator-quq","title":"[Q2] Format-typed scalars (date-time, uuid, byte, binary, ipv4, ipv6, uri)","description":"Real-world specs use 'format' tags everywhere. Today everything collapses to String/Vec\u003cu8\u003e. Add typed scalars: date-time → chrono::DateTime\u003cUtc\u003e; date → chrono::NaiveDate; time → chrono::NaiveTime; duration → chrono::Duration; uuid → uuid::Uuid; byte → Vec\u003cu8\u003e + base64 serde; binary → bytes::Bytes; ipv4/ipv6 → std::net::Ipv*Addr; uri/url → url::Url. Configurable via [generator.types] TOML section with per-format choices (chrono vs time vs string, bytes vs vec_u8, etc.). Default: 'string' (current behavior, opt-in).\n\n## Context\nFiles: Cargo.toml, src/analysis.rs, src/generator.rs, scripts/spec-compile.sh. Evidence: src/analysis.rs:3091 get_number_rust_type only handles int32/int64/float/double; string format never produces typed scalars. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] All formats above accept a TOML override.\n- [ ] Default ('string') matches today's behavior — no spec regresses.\n- [ ] When chrono is selected, generated structs use chrono::serde::rfc3339 for format: date-time.\n- [ ] When uuid is selected, generated structs use uuid::Uuid (with serde feature).\n- [ ] byte round-trips via base64 (matches OAS spec).\n- [ ] One end-to-end fixture per format under tests/conformance/fixtures/schema/format-*.yaml proving the types deserialize a real example.\n- [ ] Generated crate's Cargo.toml gets the right feature-gated deps.","status":"open","priority":2,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:11:40Z","created_by":"James Lal","updated_at":"2026-05-08T23:11:40Z","labels":["phase4","quality","schema"],"dependency_count":0,"dependent_count":0,"comment_count":0}
{"id":"openapi-generator-99a","title":"[Q1] Method-name canonicalization","description":"Heuristic post-processor on snake-cased operationId: tokenize path template, drop trailing tokens that match path tokens (in reverse path order), drop trailing HTTP-method verb. Re-check uniqueness; restore tokens for collisions. Goal: Anthropic's betaGetFileMetadataV1FilesFileIdGet + path /v1/files/{fileId} + GET → get_file_metadata.\n\n## Context\nToday get_method_name emits op.operation_id.to_snake_case() verbatim. Anthropic's spec produces names like beta_get_file_metadata_v1_files_file_id_get — the path and HTTP method are literally appended into the operationId. See umbrella issue gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] Heuristic implemented in src/client_generator.rs:get_method_name (line ~859).\n- [ ] Unique across operation set; collisions fall back to original.\n- [ ] CLI/config flag [generator.method_names] strip_path = true (default true).\n- [ ] Snapshot tests confirm anthropic produces get_file_metadata not beta_get_file_metadata_v1_files_file_id_get.\n- [ ] All 49 currently-compiling specs still compile.","status":"open","priority":2,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:10:47Z","created_by":"James Lal","updated_at":"2026-05-08T23:10:47Z","labels":["codegen","phase4","quality"],"dependencies":[{"issue_id":"openapi-generator-99a","depends_on_id":"openapi-generator-st8","type":"blocks","created_at":"2026-05-08T17:11:55Z","created_by":"James Lal","metadata":"{}"}],"dependency_count":1,"dependent_count":0,"comment_count":0}
{"id":"openapi-generator-81u","title":"[Q5] Display for ApiOpError that surfaces the typed body","description":"Today format!('{e}', e: ApiOpError\u003cE\u003e) on an Api variant prints 'API error 404: {full body}'. For a Stripe error that includes a huge param_documentation blob, the message becomes a wall of JSON. Users complain they can't tell at a glance what the typed variant captured. Approach: in ApiError::Display, truncate body to ~500 chars with a '… (truncated)' marker; if typed.is_some(), prepend '(typed: \u003cvariant_name\u003e)' (E: fmt::Debug bound already exists); if parse_error.is_some() and typed.is_none(), append '(parse error: …)'. Full body still accessible via .body field.\n\n## Context\nFiles: src/http_error.rs. Evidence: src/http_error.rs:234 ApiError Display prints body verbatim — for huge JSON bodies this is unreadable; typed.is_some() info is hidden. See umbrella gpu-cli/openapi-to-rust#14.","acceptance_criteria":"- [ ] ApiError Display truncates body at 500 chars (configurable via const).\n- [ ] Typed variant name appears when typed.is_some().\n- [ ] Parse error reason appears when typed parsing failed.\n- [ ] Full body still accessible via .body — no info loss.\n- [ ] Unit test in src/http_error.rs covers all three branches.","status":"open","priority":3,"issue_type":"task","owner":"james@littlebearlabs.io","created_at":"2026-05-08T23:13:13Z","created_by":"James Lal","updated_at":"2026-05-08T23:13:13Z","labels":["codegen","phase4","quality"],"dependency_count":0,"dependent_count":0,"comment_count":0}

.github/workflows/ci.yml

Lines changed: 5 additions & 5 deletions
@@ -65,9 +65,9 @@ jobs:
       - run: |
           scripts/spec-compile.sh \
             anthropic arcade asana box browserbase cartesia cerebras circleci \
-            coda coingecko datadog-v2 digitalocean github gitpod \
+            coda coingecko datadog-v2 digitalocean gcore github gitpod \
             google-calendar google-drive google-gmail google-tasks google-youtube \
-            grafana groq imagekit increase launchdarkly letta luma meta-llama \
-            modern-treasury openai pagerduty perplexity resend retell runway \
-            sentry snyk spotify supabase terminal-shop together twilio val-town \
-            writer
+            grafana groq imagekit increase launchdarkly letta lithic luma \
+            meta-llama microsoft-graph modern-treasury openai pagerduty \
+            perplexity resend retell runway sentry snyk spotify stripe \
+            supabase telnyx terminal-shop together twilio val-town vercel writer

src/analysis.rs

Lines changed: 17 additions & 2 deletions
@@ -3686,11 +3686,26 @@ impl SchemaAnalyzer {
         // be unique, but real-world specs (arcade, cal-com, telnyx,
         // val-town, …) frequently aren't. Auto-disambiguate by suffixing
         // with the method, then a counter, and warn.
-        let operation_id = if analysis.operations.contains_key(&raw_operation_id) {
+        //
+        // The collision key is the PascalCased form so that case-only
+        // differences (telnyx has `getMdrUsageReports` AND
+        // `GetMdrUsageReports`) collide too — otherwise codegen would
+        // produce two `GetMdrUsageReportsApiError` enums in the same
+        // module.
+        use heck::ToPascalCase;
+        let canon = |s: &str| s.replace('.', "_").to_pascal_case();
+        let key_collides = |id: &str| -> bool {
+            let target = canon(id);
+            analysis
+                .operations
+                .keys()
+                .any(|existing| canon(existing) == target)
+        };
+        let operation_id = if key_collides(&raw_operation_id) {
             let method_lower = method.to_lowercase();
             let mut candidate = format!("{}_{}", raw_operation_id, method_lower);
             let mut suffix = 2;
-            while analysis.operations.contains_key(&candidate) {
+            while key_collides(&candidate) {
                 candidate = format!("{}_{}_{}", raw_operation_id, method_lower, suffix);
                 suffix += 1;
             }
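To make the collision rule in this hunk concrete, the following is a small standalone sketch. It assumes a hand-rolled `canon` helper in place of `heck::ToPascalCase` (the real crate also splits on case boundaries and normalizes runs of capitals, which this stand-in does not), and a hypothetical `disambiguate` function operating on a plain slice of existing ids rather than on `analysis.operations`.

```rust
/// Simplified stand-in for `heck::ToPascalCase`: splits on
/// non-alphanumeric characters and uppercases the first letter of each
/// piece, leaving the remainder untouched.
fn canon(id: &str) -> String {
    id.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            // Safe: empty parts were filtered out above.
            let first = chars.next().unwrap().to_ascii_uppercase();
            format!("{first}{}", chars.as_str())
        })
        .collect()
}

/// Returns an operation id whose canonical form does not clash with any
/// id in `existing`: first try `<id>_<method>`, then add a counter.
fn disambiguate(existing: &[String], raw_id: &str, method: &str) -> String {
    let collides = |id: &str| existing.iter().any(|e| canon(e) == canon(id));
    if !collides(raw_id) {
        return raw_id.to_string();
    }
    let method_lower = method.to_lowercase();
    let mut candidate = format!("{raw_id}_{method_lower}");
    let mut suffix = 2;
    while collides(&candidate) {
        candidate = format!("{raw_id}_{method_lower}_{suffix}");
        suffix += 1;
    }
    candidate
}
```

Given an existing `getMdrUsageReports`, calling `disambiguate` with `GetMdrUsageReports` and `GET` returns `GetMdrUsageReports_get` in this sketch, because both ids canonicalize to `GetMdrUsageReports` even though they differ as strings.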

src/generator.rs

Lines changed: 88 additions & 15 deletions
@@ -638,7 +638,7 @@ impl CodeGenerator {
                     nullable: false,
                 })
                 .collect();
-            self.generate_union_enum(schema, &schema_refs)
+            self.generate_union_enum(schema, &schema_refs, analysis)
         } else {
             self.generate_discriminated_enum(
                 schema,
@@ -648,7 +648,7 @@ impl CodeGenerator {
                 )
             }
         }
-        SchemaType::Union { variants } => self.generate_union_enum(schema, variants),
+        SchemaType::Union { variants } => self.generate_union_enum(schema, variants, analysis),
         SchemaType::Reference { target } => {
             // For references, check if we need to generate a type alias
             // This handles cases like nullable patterns
@@ -868,12 +868,21 @@ impl CodeGenerator {
     ) -> Result<TokenStream> {
         let enum_name = format_ident!("{}", self.to_rust_type_name(&schema.name));

-        // Determine which variant should be the default
+        // Determine which variant should be the default. The spec's `default`
+        // may not exactly match any enum value (telnyx has
+        // `default: "en"` on a language enum that lists `en-US`, `en-AU`,
+        // … — no exact match). When that happens, drop the `Default` derive
+        // entirely instead of emitting it on an enum where no variant has
+        // `#[default]` (E0665).
         let default_value = schema
             .default
             .as_ref()
             .and_then(|v| v.as_str())
             .map(|s| s.to_string());
+        let has_default_match = match &default_value {
+            Some(d) => values.iter().any(|v| v == d),
+            None => !values.is_empty(),
+        };

         // Variant-name uniqueness: enum values that PascalCase to the same
         // identifier (e.g. `ASC`/`asc` both → `Asc`) collide and produce
@@ -933,16 +942,23 @@ impl CodeGenerator {
             TokenStream::new()
         };

-        // Generate derives with optional Specta support
-        let derives = if self.config.enable_specta {
-            quote! {
+        // Generate derives with optional Specta support. Drop `Default` if
+        // no variant ends up tagged `#[default]` (would trigger E0665).
+        let derives = match (self.config.enable_specta, has_default_match) {
+            (true, true) => quote! {
                 #[derive(Debug, Clone, PartialEq, Eq, Deserialize, Serialize, Default)]
                 #[cfg_attr(feature = "specta", derive(specta::Type))]
-            }
-        } else {
-            quote! {
+            },
+            (true, false) => quote! {
+                #[derive(Debug, Clone, PartialEq, Eq, Deserialize, Serialize)]
+                #[cfg_attr(feature = "specta", derive(specta::Type))]
+            },
+            (false, true) => quote! {
                 #[derive(Debug, Clone, PartialEq, Eq, Deserialize, Serialize, Default)]
-            }
+            },
+            (false, false) => quote! {
+                #[derive(Debug, Clone, PartialEq, Eq, Deserialize, Serialize)]
+            },
         };

         Ok(quote! {
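The two hunks above decide whether the emitted enum may derive `Default`. As a rough illustration of what the generated output could look like in each case, here is a sketch with two hypothetical enums (`SortOrder`, `Language`) and a standalone copy of the `has_default_match` predicate. The enum names and variants are assumptions for illustration, and serde derives are omitted to keep the block dependency-free.

```rust
/// Case 1: the spec's `default` matches a value, so one variant carries
/// `#[default]` and deriving `Default` is valid.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
enum SortOrder {
    #[default]
    Asc,
    Desc,
}

/// Case 2: the spec says `default: "en"` but only lists `en-US`, `en-AU`.
/// No variant can be tagged `#[default]`, so `Default` is not derived;
/// deriving it here would be rejected by the compiler.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum Language {
    EnUs,
    EnAu,
}

/// Standalone version of the predicate from the hunk: an explicit default
/// must match a listed value exactly; with no explicit default, any
/// non-empty value list counts as having a match.
fn has_default_match(values: &[&str], default: Option<&str>) -> bool {
    match default {
        Some(d) => values.iter().any(|v| *v == d),
        None => !values.is_empty(),
    }
}
```

In this sketch `has_default_match(&["en-US", "en-AU"], Some("en"))` is `false`, which corresponds to the `(_, false)` arms that omit `Default` from the derive list.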
@@ -1119,19 +1135,31 @@ impl CodeGenerator {
                     nullable: false,
                 })
                 .collect();
-            return self.generate_union_enum(schema, &schema_refs);
+            return self.generate_union_enum(schema, &schema_refs, analysis);
         }

+        let enclosing = self.to_rust_type_name(&schema.name);
         let enum_variants = variants.iter().map(|variant| {
             let variant_name = format_ident!("{}", variant.rust_name);
             let variant_value = &variant.discriminator_value;

-            // Always use tuple variant that references the existing type
-            // This ensures the standalone event types are actually used
             let variant_type = format_ident!("{}", self.to_rust_type_name(&variant.type_name));
+            // Box variant payloads that point at the enclosing enum or any
+            // schema in the analysis's recursive set, otherwise the enum has
+            // infinite size (E0072).
+            let payload = if self.to_rust_type_name(&variant.type_name) == enclosing
+                || analysis
+                    .dependencies
+                    .recursive_schemas
+                    .contains(&variant.type_name)
+            {
+                quote! { Box<#variant_type> }
+            } else {
+                quote! { #variant_type }
+            };
             quote! {
                 #[serde(rename = #variant_value)]
-                #variant_name(#variant_type),
+                #variant_name(#payload),
             }
         });

@@ -1224,6 +1252,7 @@ impl CodeGenerator {
         &self,
         schema: &crate::analysis::AnalyzedSchema,
         variants: &[crate::analysis::SchemaRef],
+        analysis: &crate::analysis::SchemaAnalysis,
     ) -> Result<TokenStream> {
         let enum_name = format_ident!("{}", self.to_rust_type_name(&schema.name));

@@ -1332,7 +1361,15 @@ impl CodeGenerator {
             // break the cycle. Observed in microsoft-graph.yaml.
             let target_rust_name = self.to_rust_type_name(&variant.target);
             let enclosing_name = self.to_rust_type_name(&schema.name);
-            let variant_type_tokens = if target_rust_name == enclosing_name {
+            let is_self_ref = target_rust_name == enclosing_name;
+            // Indirect cycles (stripe BankAccount → BankAccountCustomer →
+            // Customer → BankAccountCustomer): variants pointing into the
+            // analysis's recursive_schemas set must also be heap-allocated.
+            let is_recursive_target = analysis
+                .dependencies
+                .recursive_schemas
+                .contains(&variant.target);
+            let variant_type_tokens = if is_self_ref || is_recursive_target {
                 quote! { Box<#variant_type_tokens> }
             } else {
                 variant_type_tokens
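A reduced illustration of why the indirect cycle needs a `Box`: the type names below follow the stripe example from the commit message, but the fields (`id`, `default_source`, `customer`) and the `customer_id` helper are assumptions chosen only to reproduce the cycle shape, and serde attributes are left out so the block compiles on its own.

```rust
#[derive(Debug, Clone, PartialEq)]
struct BankAccount {
    customer: Option<BankAccountCustomer>,
}

/// Union wrapper in the middle of the cycle. `Customer` is not this enum
/// itself, so a direct self-reference check would not box it. Because
/// `Customer` points back here, storing it inline would give both types
/// infinite size; the `Box` breaks the cycle.
#[derive(Debug, Clone, PartialEq)]
enum BankAccountCustomer {
    Id(String),
    Customer(Box<Customer>),
}

#[derive(Debug, Clone, PartialEq)]
struct Customer {
    id: String,
    default_source: Option<BankAccountCustomer>,
}

/// Reads the customer id regardless of whether the API returned a bare
/// id or an expanded customer object.
fn customer_id(account: &BankAccount) -> Option<&str> {
    match account.customer.as_ref()? {
        BankAccountCustomer::Id(id) => Some(id.as_str()),
        BankAccountCustomer::Customer(customer) => Some(customer.id.as_str()),
    }
}
```

Replacing `Box<Customer>` with `Customer` in the enum makes this sketch fail to compile with a recursive-type error, which is the failure mode the `recursive_schemas` check is meant to prevent.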
@@ -1803,6 +1840,40 @@ impl CodeGenerator {
             result = format!("Type{result}");
         }

+        // Avoid masking ubiquitous std types and traits. cloudflare has a
+        // schema literally named `Result`, gcore has `Default`; emitting
+        // `pub enum Result { ... }` shadows std::result::Result and breaks
+        // every method's `-> Result<T, ApiOpError<...>>`. Same for impls
+        // like `impl Default for HttpClient { ... }` when `Default` resolves
+        // to the local type alias.
+        if matches!(
+            result.as_str(),
+            "Result"
+                | "Option"
+                | "Box"
+                | "Vec"
+                | "String"
+                | "Some"
+                | "None"
+                | "Ok"
+                | "Err"
+                | "Default"
+                | "Clone"
+                | "Debug"
+                | "Send"
+                | "Sync"
+                | "Sized"
+                | "Iterator"
+                | "From"
+                | "Into"
+                | "TryFrom"
+                | "TryInto"
+                | "AsRef"
+                | "AsMut"
+        ) {
+            result.push_str("Type");
+        }
+
         result
     }

@@ -1931,6 +2002,8 @@ impl CodeGenerator {
             | "unsized"
             | "virtual"
             | "yield"
+            // Rust 2024 edition reservations.
+            | "gen"
         )
     }
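The reserved-name guard added to to_rust_type_name can be exercised in isolation. The sketch below restates it as a free function with a constant list; `RESERVED_TYPE_NAMES` and `avoid_reserved` are hypothetical names introduced here, and the list is copied from the hunk above rather than from any other part of the generator.

```rust
/// Names from the hunk above: std types, enum constructors, and traits
/// that generated code refers to unqualified.
const RESERVED_TYPE_NAMES: &[&str] = &[
    "Result", "Option", "Box", "Vec", "String", "Some", "None", "Ok", "Err",
    "Default", "Clone", "Debug", "Send", "Sync", "Sized", "Iterator", "From",
    "Into", "TryFrom", "TryInto", "AsRef", "AsMut",
];

/// Appends `Type` to a schema name that would otherwise shadow one of the
/// reserved names; any other name passes through unchanged.
fn avoid_reserved(name: &str) -> String {
    if RESERVED_TYPE_NAMES.contains(&name) {
        format!("{name}Type")
    } else {
        name.to_string()
    }
}
```

Under this sketch cloudflare's `Result` schema becomes `ResultType` and gcore's `Default` becomes `DefaultType`, while an ordinary name such as `Customer` is left alone. The match is exact and case-sensitive, so a schema named `result` would need to be PascalCased before reaching the check.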
