A consolidated read of four reference SDKs (Expedia Group, Square, Airbyte, Google Cloud / gax)
plus the AWS Smithy 2.0 and Microsoft Kiota ecosystems, mapped against our dexpace/java-sdk
design. This document captures what to copy, what to avoid, and what to build next.
Source agent reports are not committed; this document is the synthesized conclusion.
- TL;DR
- Reference SDK Capsule Table
- Subsystem-by-Subsystem Comparison
- Where We Already Lead
- Feature Backlog
- Code Generation Strategy
- Our core architecture is sound. Zero-dep `sdk-core`, single-method `HttpClient` SPI, `IoProvider` seam, body replayability, separate async adapter modules, ReentrantLock + interrupt discipline: every reference SDK gets at least one of these wrong.
- Most of the early gaps are now filled. Retry, auth, pagination, idempotency, the typed exception hierarchy, a rich tracer event vocabulary, a metrics seam, and client lifecycle (`close`) all ship in `sdk-core` today. What remains wholly unbuilt is narrow, chiefly webhook signature verification and a `Settings` → `Context` lifecycle split (the Feature Backlog lists the rest).
- The pipeline architecture absorbed Airbyte's `Hook` taxonomy (`SdkInit` / `BeforeRequest` / `AfterSuccess` / `AfterError`), the cleanest middleware shape we've seen. It maps onto `RequestPipelineStep` / `ResponsePipelineStep` plus a recovery-aware `ResponsePipeline` that folds a sealed `ResponseOutcome` (`Success(Response)` / `Failure(Throwable)`) rather than a bare response.
- For code generation, build our own. A KotlinPoet/JavaPoet-based emitter sitting on `swagger-parser`, distributed as a Gradle plugin, targeting our `sdk-core` runtime. None of the off-the-shelf options (Fern, Speakeasy, Kiota, smithy-java, Expedia's plugin, OpenAPI Generator's stock kotlin emitter) preserves our zero-runtime-dep + Java 8 + Kotlin-first + pluggable-transport constraints. Forking any of them costs more than a typed-IR rewrite, and Mustache-based generators carry a built-in correctness tax (Expedia's plugin alone has 20 cataloged bugs from string-templating typos).
| Repo | Lang | JVM | Transport | Runtime deps | Codegen | Notable |
|---|---|---|---|---|---|---|
| Expedia Group | Kotlin | 8 | OkHttp 4.12 (+ ServiceLoader SPI) | Okio (api!), Jackson/SLF4J compileOnly | OpenAPI Generator 7.15 fork + 25 Mustache templates | Trait composition for operations; per-status typed exceptions; unmaintained, ~20 known bugs |
| Square | Java | 8 | OkHttp 5 + Jackson 2.18 (hardcoded) | OkHttp, Jackson, full set | Fern (TS hosted CLI) | Raw/Cooked/Async client triplet; SyncPagingIterable + BiDirectionalPage; RetryInterceptor with proper Retry-After/X-RateLimit-Reset parsing |
| Airbyte | Java | 11 | java.net.http.HttpClient (hardcoded) | Jackson 2.18, jackson-databind-nullable, commons-io | Speakeasy (Go SaaS, closed) | Hook taxonomy is gold; reflection-driven serialization is awful; one SDKError for everything |
| Google Cloud / gax | Java | 8 | gRPC + HTTP/JSON | Guava, gRPC, protobuf, AutoValue, OTel/OC, threetenbp | gapic-generator-java (proto plugin, Bazel) | Callable decorator chain; rich ApiTracer vocabulary; per-attempt shrinking deadlines; ApiFuture is a cautionary tale |
| AWS Smithy 2.0 / smithy-java | Java 21-only runtime | n/a (codegen) | java.net.http.HttpClient w/ virtual threads | OTel, etc. | codegen-core + JavaCodegenIntegration (typed) | IDL > OpenAPI; runtime locked to JDK 21; ~6 weeks since GA (2026-04-06); no Kotlin target |
| Microsoft Kiota | Java 11+ runtime | n/a (codegen) | RequestAdapter (default OkHttp) | Gson, jakarta.annotation, std-uritemplate, OTel | C#-based generator | Fluent path DSL; locked-down customization; ~3.7K stars; replaces our HttpClient SPI |
| SDK | Shape | Verdict |
|---|---|---|
| Ours | `interface HttpClient { fun execute(Request): Response }` | Right shape |
| Expedia | `Transport` + `AsyncTransport` (two SPIs); `ServiceLoader` discovery | Duplicated hierarchies; `ServiceLoader` = silent classpath ordering bugs |
| Square | OkHttp `Call.Factory` hardcoded in every generated `Raw*Client` | No SPI |
| Airbyte | `HTTPClient { send(HttpRequest): HttpResponse<InputStream> }` over JDK 11 types | SPI exists but leaks JDK 11 types, so an OkHttp adapter is painful |
| gax | `TransportChannel` (marker-thin) + `TransportChannelProvider.needsX()` | The send method does NOT live on the channel; actual sending is per-Callable |
| Kiota | `RequestAdapter` (owns send + deserialize + auth + tracing) | Too thick; replaces our SPI |
Notes:
- `HttpClient` and `AsyncHttpClient` already extend `AutoCloseable` with a no-op default `close()`, so consumers can shut down OkHttp dispatchers, JDK selector threads, and connection pools while SAM literals stay lightweight. Both reference transports close only SDK-managed clients (BYO clients are never closed). (Expedia's `Disposable.kt:25` was the prior art.)
- Keep our existing transport SPI shape. Do not adopt gax's marker-channel design: our model maps to HTTP cleanly; gax's was driven by gRPC.
- Reject `ServiceLoader` discovery (Expedia's pattern). Keep explicit installation à la `Io.installProvider(...)`.
- We're already correct here. Immutable models with `private constructor` + `Builder` + `newBuilder()` match Expedia (which got this right). `RequestBody.isReplayable()` / `toReplayable(provider)` is unique to us: every other reference SDK assumes bodies are `byte[]` and silently breaks retry/auth-refresh on streaming bodies.
- Expedia's `FileRequestBody` opportunity (we have it spec'd in `docs/architecture.md`): transports can type-check and dispatch to `FileChannel.transferTo` for `sendfile(2)`. None of the reference SDKs do this.
- Header handling: Expedia's `Headers` lower-cases names but allows duplicate `add()` calls that keep both values. Our implementation should be canonicalized (HTTP/2 normalizes to lowercase) and deduplicated by header name + value at insertion time. Verify our `Headers.kt` matches.
- Path & query encoding: Expedia uses naïve `String.replace("{name}", value)` for path params, with no percent-encoding (bug #6 in their bug catalog). When we build a query/path builder, percent-encode against RFC 3986 unreserved char tables.
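
The percent-encoding rule in the last bullet can be sketched as follows. This is a minimal illustration, not the planned builder: `PathTemplates`, `encodeSegment`, and `expand` are hypothetical names, and the only assumption taken from a standard is the RFC 3986 section 2.3 unreserved set (letters, digits, `-`, `.`, `_`, `~`).

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper for illustration; not an sdk-core type. */
final class PathTemplates {
    private static final String HEX = "0123456789ABCDEF";

    private PathTemplates() {}

    /** RFC 3986 section 2.3 unreserved set: ALPHA / DIGIT / "-" / "." / "_" / "~". */
    private static boolean isUnreserved(int b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') || (b >= '0' && b <= '9')
                || b == '-' || b == '.' || b == '_' || b == '~';
    }

    /** Percent-encodes every UTF-8 byte that is not unreserved, including "/". */
    static String encodeSegment(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (byte raw : value.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (isUnreserved(b)) {
                out.append((char) b);
            } else {
                out.append('%').append(HEX.charAt(b >> 4)).append(HEX.charAt(b & 0x0F));
            }
        }
        return out.toString();
    }

    /** Substitutes {name} placeholders with encoded values instead of raw strings. */
    static String expand(String template, Map<String, String> params) {
        String path = template;
        for (Map.Entry<String, String> e : params.entrySet()) {
            path = path.replace("{" + e.getKey() + "}", encodeSegment(e.getValue()));
        }
        return path;
    }
}
```

Encoding `/` inside a value is deliberate here: a path parameter should not be able to introduce extra path segments.
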
We lead unambiguously. Every reference SDK hard-codes one I/O library:
- Expedia welds Okio into `sdk-core`'s public API (`RequestBody.writeTo(BufferedSink)` is an Okio type; see `expediagroup-sdk-core/build.gradle:17`'s `api 'com.squareup.okio:okio:3.16.0'`). Changing I/O libs would break every consumer.
- Square reads bodies as `responseBody.string()`: fully buffered, no streaming option.
- Airbyte does `Utils.toUtf8AndClose(InputStream)`: same fully-buffered problem.
- gax has no I/O abstraction; transports operate on byte arrays directly.
Our IoProvider seam + Source/Sink contracts is the architecturally cleanest piece of the SDK. Keep it. Document the install pattern more prominently in the public README — Expedia's late-init IllegalStateException is a good model for failure messages.
| SDK | Architecture | Verdict |
|---|---|---|
| Ours | `RequestPipeline` / recovery-aware `ResponsePipeline` fold over a sealed `ResponseOutcome`; `PipelineStep` fun interfaces | Right shape; recovery semantics in place |
| Expedia | `ExecutionPipeline` = `Request → Request` + `Response → Response` folds | Can't intercept transport, can't loop, can't proceed |
| Square | One OkHttp `Interceptor` (the retry one) | No SDK-level pipeline |
| Airbyte | Hook taxonomy: `SdkInit`, `BeforeRequest`, `AfterSuccess`, `AfterError` | Best design in the cohort |
| gax | Per-call-shape decorator chain (`Callables.retrying(...)`, `TracedUnaryCallable`, ...) | Powerful but N parallel hierarchies for N call shapes |
How Airbyte's hook taxonomy (utils/Hook.java in their repo) maps to our types, as built:
- `SdkInit` → builder configuration (no dedicated type)
- `BeforeRequest` ≡ `RequestPipelineStep`
- `AfterSuccess` ≡ `ResponsePipelineStep` (runs on the success path only)
- `AfterError` ≡ `ResponseRecoveryStep`, taking a sealed `ResponseOutcome` (`Success(Response)` / `Failure(Throwable)`) instead of `Either`
ResponsePipeline folds the outcome through the response steps (success path) and then through
every recovery step, so recovery always observes the terminal outcome — including failures
thrown by a response step. This generalizes Airbyte's Hooks.afterError(...): a recovery step
may rescue a failure into a success, replace the throwable, or pass through.
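
The fold described above can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped `ResponsePipeline`: `Outcome` and `ResponseFold` are hypothetical names, steps are modeled as plain `java.util.function.Function`s, and the sealed type is emulated with a private constructor so the sketch stays Java 8 compatible.

```java
import java.util.List;
import java.util.function.Function;

/** Closed two-case outcome; a Java 8 stand-in for a sealed ResponseOutcome. */
abstract class Outcome<R> {
    private Outcome() {}

    static final class Success<R> extends Outcome<R> {
        final R response;
        Success(R response) { this.response = response; }
    }

    static final class Failure<R> extends Outcome<R> {
        final Throwable error;
        Failure(Throwable error) { this.error = error; }
    }
}

final class ResponseFold {
    private ResponseFold() {}

    /** Never throws: a step failure becomes a Failure and still reaches every recovery step. */
    static <R> Outcome<R> apply(Outcome<R> initial,
                                List<Function<R, R>> responseSteps,
                                List<Function<Outcome<R>, Outcome<R>>> recoverySteps) {
        Outcome<R> outcome = initial;
        if (initial instanceof Outcome.Success) {
            try {
                R current = ((Outcome.Success<R>) initial).response;
                for (Function<R, R> step : responseSteps) {
                    current = step.apply(current); // success path only
                }
                outcome = new Outcome.Success<>(current);
            } catch (Throwable t) {
                outcome = new Outcome.Failure<>(t);
            }
        }
        for (Function<Outcome<R>, Outcome<R>> recovery : recoverySteps) {
            try {
                outcome = recovery.apply(outcome); // rescue, replace, or pass through
            } catch (Throwable t) {
                outcome = new Outcome.Failure<>(t);
            }
        }
        return outcome;
    }
}
```

A recovery step that returns a `Success` for a given `Failure` is the "rescue" case; returning its argument unchanged is the pass-through case.
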
Two design choices stuck:
- No `chain.proceed(...)` looping. Folds stay simple; retry lives in a dedicated step that delegates to `HttpClient.execute` directly, so retry composes into the pipeline without chain semantics.
- All exceptions funnel through one path. Airbyte's design has a corner case: a `BeforeRequest` that throws bypasses `AfterError`. We avoid it: a step throwable is wrapped into `ResponseOutcome.Failure` and fed to the recovery chain, and `apply` never throws.
Retry now ships as a pipeline step (RetryStep over RetrySettings + BackoffCalculator +
RetryAfterParser, plus the stage-based DefaultRetryStep). The table below records which
reference SDK each behavior was modeled on:
| Feature | Best example |
|---|---|
| `Retry-After` (numeric + HTTP date) parsing | Square `RetryInterceptor.java:64-87` |
| `X-RateLimit-Reset` (Unix epoch) parsing | Square `RetryInterceptor.java:90-102` |
| Exponential backoff with capped jitter | Square RetryInterceptor.java:104-107 |
| Per-attempt shrinking deadlines (each attempt timeout caps to remaining total budget) | gax ExponentialRetryAlgorithm.java:119-173 |
| Split algorithm: `ResultRetryAlgorithm` + `TimedRetryAlgorithm` | gax `RetryAlgorithm.java:45-90` |
| Per-method `retryableCodes: Set<StatusCode>` | gax `UnaryCallSettings` |
| Streaming retry resumption | gax StreamResumptionStrategy |
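
Three of the behaviors in the table are small enough to sketch directly. The code below is an illustration, not the shipped `RetryAfterParser` or `BackoffCalculator`: `RetryDelays` and its method names are hypothetical, and the jitter formula is the common "full jitter" variant (uniform between zero and the capped exponential ceiling), which may differ from Square's exact arithmetic.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration only. */
final class RetryDelays {
    private RetryDelays() {}

    /** Parses Retry-After as delta-seconds or an HTTP date; null when absent or unparseable. */
    static Duration parseRetryAfter(String header, Instant now) {
        if (header == null || header.trim().isEmpty()) return null;
        String value = header.trim();
        try {
            long seconds = Long.parseLong(value);
            return seconds < 0 ? null : Duration.ofSeconds(seconds);
        } catch (NumberFormatException notNumeric) {
            // fall through and try the HTTP-date form
        }
        try {
            Instant when = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
            Duration delay = Duration.between(now, when);
            return delay.isNegative() ? Duration.ZERO : delay; // a date in the past means "retry now"
        } catch (DateTimeParseException bad) {
            return null;
        }
    }

    /** Full-jitter backoff: uniform in [0, min(cap, base * 2^attempt)]. */
    static Duration backoff(int attempt, Duration base, Duration cap) {
        int shift = Math.max(0, Math.min(attempt, 20)); // clamp to avoid overflow
        long ceiling = Math.min(cap.toMillis(), base.toMillis() << shift);
        return Duration.ofMillis(ThreadLocalRandom.current().nextLong(ceiling + 1));
    }

    /** Shrinking deadline: an attempt never gets more time than the remaining total budget. */
    static Duration attemptTimeout(Duration perAttempt, Duration remainingBudget) {
        return perAttempt.compareTo(remainingBudget) <= 0 ? perAttempt : remainingBudget;
    }
}
```

Passing `now` in explicitly keeps the date branch testable without a real clock.
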
Critical correctness rules (collected from anti-patterns observed across the SDKs):
- Idempotency awareness. Retry only requests whose HTTP method is idempotent (GET/HEAD/OPTIONS/PUT/DELETE), or other requests whose `RequestBody.isReplayable()` is true. Square retries everything (`shouldRetry` checks status only and ignores method, a real bug for POST timeouts).
- `ScheduledExecutorService` for delay, never `Thread.sleep`. Square and Airbyte both block via `Thread.sleep`, which holds the calling thread for the entire backoff. Use a dedicated scheduler (`CompletableFuture.delayedExecutor` is an option only on Java 9+).
- Restore interrupt flag. Square's `Thread.sleep` swallow + re-throw as `IOException` (`RetryInterceptor.java:44`) violates our cancellation discipline. On `InterruptedException`: `Thread.currentThread().interrupt()` + throw `InterruptedIOException`.
- Retry as a `RequestPipelineStep`, not a transport interceptor. Square's pattern of "retry is an OkHttp Interceptor" means BYO `OkHttpClient` silently loses retry. Pipeline-level retry composes with other steps and works across transports.
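
A compact sketch of the first three rules follows. `RetryRules`, `mayRetry`, and `awaitDelay` are hypothetical names for illustration; the eligibility check mirrors the rule exactly as stated above, and the delay waits on a `ScheduledExecutorService` timer rather than calling `Thread.sleep`.

```java
import java.io.InterruptedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only. */
final class RetryRules {
    private static final Set<String> IDEMPOTENT =
            new HashSet<>(Arrays.asList("GET", "HEAD", "OPTIONS", "PUT", "DELETE"));

    private RetryRules() {}

    /** Idempotent methods may retry; other methods only when the body can be replayed. */
    static boolean mayRetry(String method, boolean bodyReplayable) {
        return IDEMPOTENT.contains(method.toUpperCase(Locale.ROOT)) || bodyReplayable;
    }

    /** Waits for the backoff on a scheduler timer and preserves the interrupt flag. */
    static void awaitDelay(ScheduledExecutorService scheduler, long delayMillis)
            throws InterruptedIOException {
        ScheduledFuture<?> timer = scheduler.schedule(() -> { }, delayMillis, TimeUnit.MILLISECONDS);
        try {
            timer.get();
        } catch (InterruptedException e) {
            timer.cancel(false);
            Thread.currentThread().interrupt(); // restore the flag for callers up the stack
            InterruptedIOException io = new InterruptedIOException("retry delay interrupted");
            io.initCause(e);
            throw io;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause()); // the no-op task is not expected to fail
        }
    }
}
```

In a synchronous pipeline the caller still waits for the timer; the point of the sketch is that cancellation surfaces as `InterruptedIOException` with the flag intact instead of being swallowed.
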
Auth lives in sdk-core today: a sealed Credential family (KeyCredential, NamedKeyCredential,
BearerToken), RFC 7235 challenge parsing, ChallengeHandler implementations (Basic, Digest,
Composite), and pipeline auth steps (BearerTokenAuthStep, KeyCredentialAuthStep). The table
below records where each scheme's design was sourced from:
| Scheme | Best impl to reference |
|---|---|
| OAuth2 client_credentials | Airbyte ClientCredentialsHook.java:36-95 (one class implements 3 hook interfaces; SessionManager.java keyed by MD5(clientId:clientSecret); auto-evict on 401) |
| OAuth token storage with `Clock` | Expedia `OAuthTokenStorage.kt:33-103` (immutable, testable) |
| Basic | Trivial — Expedia's impl is fine |
| Bearer | Generated everywhere; just a header step |
| ADC / multi-source | gax GoogleCredentialsProvider (JWT optimization for service accounts) |
| Pluggable auth provider SPI | Kiota's AuthenticationProvider (taxonomy: AnonymousAuthenticationProvider, BaseBearerTokenAuthenticationProvider, ApiKeyAuthenticationProvider) |
Anti-patterns to avoid:
- Expedia's `OAuthStep` uses `synchronized` around a network call (`pipeline/step/OAuthStep.kt:30,40-44`). Pins virtual thread carriers for the duration of the OAuth round-trip. Use `ReentrantLock` (our rule) plus a coalescing future so concurrent calls share a single refresh.
- Airbyte's `Security` reflects on `@SpeakeasyMetadata` strings per request (`Security.java:26-103`). Don't do runtime reflection over string-DSL metadata.
- Square punts on `idempotency_key`: required field on every write DTO, no auto-gen. (See Idempotency.)
What shipped, and what's left:
- Auth lives in `sdk-core` (`http.auth`), not a separate `sdk-auth` module: a sealed `Credential` family + `ChallengeHandler` impls (Basic, Digest, Composite) + an `AuthStep` pillar (`BearerTokenAuthStep`, `KeyCredentialAuthStep`). The `AuthStep` base requires HTTPS, strips the cross-origin redirect marker so a caller credential is never re-stamped onto a server-chosen host, and exposes a `handleChallenge` hook for token-refresh / step-up flows.
- Token fetch is a `BearerTokenProvider` SAM (`fetch(scopes, params)`). A full OAuth2 client_credentials provider with coalesced refresh and 401 eviction is not yet built: the seam (`handleChallenge`) exists, the policy does not.
- Per-call auth override is not yet exposed; auth is wired per pipeline. Airbyte's per-client-only model is a real limitation for multi-tenant clients; a per-call override remains worth adding.
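
One possible shape for the unbuilt coalesced-refresh policy is sketched below. Everything here is an assumption for illustration: `CoalescingTokenCache` is a hypothetical name, a plain `Supplier<String>` stands in for the token fetch, and expiry handling is left out. The lock only guards the bookkeeping; the fetch itself runs outside it, and concurrent callers share one `CompletableFuture`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch; not the sdk-core auth implementation. */
final class CoalescingTokenCache {
    private final ReentrantLock lock = new ReentrantLock();
    private final Supplier<String> fetcher;          // stand-in for the real token fetch
    private CompletableFuture<String> current;       // guarded by lock

    CoalescingTokenCache(Supplier<String> fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the cached token, or joins the single in-flight refresh. */
    String token() {
        CompletableFuture<String> future;
        boolean owner = false;
        lock.lock();
        try {
            if (current == null) {
                current = new CompletableFuture<>();
                owner = true; // this caller performs the refresh
            }
            future = current;
        } finally {
            lock.unlock();
        }
        if (owner) {
            try {
                future.complete(fetcher.get()); // network call happens outside the lock
            } catch (RuntimeException e) {
                future.completeExceptionally(e);
                clear(future); // let the next caller try again
            }
        }
        return future.join();
    }

    /** Drops the cached token, for example after a 401, so the next call refreshes. */
    void evict() {
        clear(null);
    }

    private void clear(CompletableFuture<String> onlyIfSame) {
        lock.lock();
        try {
            if (onlyIfSame == null || current == onlyIfSame) current = null;
        } finally {
            lock.unlock();
        }
    }
}
```

A failed fetch surfaces to every waiting caller as a `CompletionException` from `join()`, and the slot is cleared so the failure is not cached.
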
- Square requires `idempotency_key` as a typed DTO field on every write (e.g. `RefundPaymentRequest.java:27,93-95`). No auto-generation, no defaulting. For a payments SDK this is a customer-trust risk.
- Airbyte ships an `IdempotencyHook` as a built-in `BeforeRequest` hook (`utils/Hook.java:284-292`). Injects `Idempotency-Key: UUID.randomUUID().toString()` on each call.
IdempotencyKeyStep covers this as a RequestPipelineStep:
- Injects `Idempotency-Key: UUID.randomUUID()` for `POST`/`PUT`/`PATCH` (the `methods` set is configurable, e.g. to add `DELETE`).
- A header already on the request wins: `respectExisting` (default `true`) leaves caller-set keys untouched.
- The key value comes from a pluggable `keyStrategy` (default `UUID.randomUUID().toString()`), so APIs that want deterministic keys from a request hash can supply their own; the header name is configurable too.
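
The three rules above fit in one small class. The sketch below is not the shipped `IdempotencyKeyStep`: `IdempotencyKeySketch` is a hypothetical name, and headers are modeled as a case-insensitive `Map<String, String>` rather than the SDK's `Headers` type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.UUID;
import java.util.function.Supplier;

/** Hypothetical sketch of the injection rules; not an sdk-core type. */
final class IdempotencyKeySketch {
    private final String headerName;
    private final Set<String> methods;
    private final boolean respectExisting;
    private final Supplier<String> keyStrategy;

    IdempotencyKeySketch(String headerName, Set<String> methods,
                         boolean respectExisting, Supplier<String> keyStrategy) {
        this.headerName = headerName;
        this.methods = methods;
        this.respectExisting = respectExisting;
        this.keyStrategy = keyStrategy;
    }

    /** Defaults described in the text: POST/PUT/PATCH, caller key wins, random UUID. */
    static IdempotencyKeySketch defaults() {
        return new IdempotencyKeySketch("Idempotency-Key",
                new HashSet<>(Arrays.asList("POST", "PUT", "PATCH")),
                true,
                () -> UUID.randomUUID().toString());
    }

    /** Returns a copy of the headers, with a key added when the rules call for one. */
    Map<String, String> apply(String method, Map<String, String> headers) {
        Map<String, String> out = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        out.putAll(headers);
        if (!methods.contains(method.toUpperCase(Locale.ROOT))) return out;
        if (respectExisting && out.containsKey(headerName)) return out;
        out.put(headerName, keyStrategy.get());
        return out;
    }
}
```

Swapping the `Supplier` for one that hashes the request is how a deterministic-key API would plug in.
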
sdk-core/serde/ holds the abstractions (Serializer/Deserializer/Serde, plus the
Tristate<T> sealed type), and sdk-serde-jackson is the concrete adapter. Reference SDKs:
- Expedia: hardcoded Jackson Kotlin module. Mapper is consumer-supplied. Polymorphism via `@JsonSubTypes`. Forces every consumer to ship Jackson.
- Square: Jackson + Jdk8Module + JavaTimeModule + custom `DateTimeDeserializer`. `FAIL_ON_UNKNOWN_PROPERTIES=false` + `WRITE_DATES_AS_TIMESTAMPS=false`: the de-facto-correct SDK defaults. `ObjectMappers.java:21-22`.
- Airbyte: Jackson + `jackson-databind-nullable` for tri-state `JsonNullable<T>` (PATCH semantics). Reflection-driven via `@SpeakeasyMetadata` strings; avoid.
- Kiota: Pluggable `ParseNode` / `SerializationWriter` per media type. Default = Gson.
What shipped, and what's left:
- `Serde` / `Serializer` / `Deserializer` stay in `sdk-core` with no Jackson dependency. `sdk-serde-jackson` ships the adapter: Kotlin + JSR-310 + Jdk8 modules, `FAIL_ON_UNKNOWN_PROPERTIES` and `WRITE_DATES_AS_TIMESTAMPS` both disabled (Square's two flags). A `sdk-serde-kotlinx` is still optional/later.
- `Tristate<T>` is defined in `sdk-core/serde` (`Absent`, `Present<T>(value)`, `Null`); `sdk-serde-jackson`'s `TristateModule` maps it to Jackson's missing-field / null / present distinction for PATCH semantics.
- `oneOf` deserialization is still open; codegen will drive it. The rule stands: prefer discriminator-driven; fall back to ordered candidate probing only with explicit hints; fail loudly on ambiguity. Airbyte's `OneOfDeserializer.java:104-107` silently picks first match by default, a real data-corruption risk.
| SDK | Approach | Verdict |
|---|---|---|
| Ours | `Status` value class (total `fromCode`) + typed `HttpException` hierarchy with a derived `retryable` flag | Typed hierarchy in place |
| Expedia | Per-operation `{Op}{StatusCode}Exception` extending `ExpediaGroupApiException` | Best per-status DX |
| Square | `SquareException` (base) + `SquareApiException` (non-2xx); tolerant body parser handles 3 error shapes | Good error-body parsing |
| Airbyte | One `SDKError` for all non-2xx | Worst: no typed access |
| gax | `ApiException` + 14 typed subclasses (`NotFoundException`, `UnavailableException`, ...); retryable flag baked at construction | Cleanest baseline taxonomy |
What shipped, and what's left:
- `HttpException` is the base, with status-code-keyed subclasses (`NotFoundException`, `UnauthorizedException`, `TooManyRequestsException`, `InternalServerErrorException`, etc.) plus `ClientErrorException` / `ServerErrorException` fallbacks for unmapped 4xx/5xx. `NetworkException` is a sibling for transport failures. The set mirrors gax's taxonomy scaled to HTTP statuses.
- `isRetryable: Boolean` (from the `Retryable` interface) is a `val` derived once at construction from `RetryUtils.isRetryable(status.code)`, not a per-subclass constant. This is the single source of truth, so it can never disagree with the live retry policy (408 retryable; 501/505 not). `NetworkException` implements the same interface (always `true`), so a retry predicate keys off the interface: `(t as? Retryable)?.isRetryable == true`.
- The base exposes `status` + `headers` + a lazy `body: ResponseBody?` (not eagerly buffered), plus a non-consuming `bodySnapshot()` that reads from a `peek()` view so the primary read path is undisturbed and large 5xx bodies don't OOM. Per-operation per-status subclasses carrying typed bodies (Expedia pattern) are still codegen's job.
- Tolerant error-body parsing (Square's `SquareApiException.parseErrors`) remains future work: never throw inside an exception constructor; pass through the raw body on parse failure.
Paginator<T> + Page<T> ship in sdk-core, driven by a PaginationStrategy (cursor,
page-number, token, link-header), and http.paging.PagedIterable wraps the result. Reference
designs we drew on:
- Square: `SyncPagingIterable<T>` (`Iterable<T>` lazy iterator), `SyncPage<T>` (per-page holder), `BiDirectionalPage<T>` (forward + backward cursors), `CustomPager<T>` (user-implementation stub for HATEOAS).
- gax: `PagedListResponse<RequestT, ResponseT, ResourceT>` driven by `PagedListDescriptor` (`injectToken`, `extractNextToken`, `extractResources`). `iterateAll()` returns lazy `Iterable<ResourceT>`. `FixedSizeCollection` repaginates to consumer-chosen page sizes.
- Expedia GraphQL: `Paginator` (abstract `Iterator<T>` base) + `PaginatedStream` (`Stream<T>` over the paginator). Synchronous only.
What shipped, and what's left:
- `Paginator<T>` and `Page<T>` are in `sdk-core`. `iterateAll()` returns a lazy `Iterable<T>`; `streamAll()` returns a Java 8 `Stream<T>`. Each call hands back an independent iterator with its own state.
- The strategy is injected via `PaginationStrategy`, with concrete impls covering cursor (`next_cursor` / `prev_cursor`), page-number, token, and link-header (RFC 8288). A `maxPages` cap guards against servers that never advance their cursor.
- Async variants for `sdk-async-coroutines` (`Flow<T>`) and `sdk-async-reactor` (`Flux<T>`) are not yet built.
- `BiDirectionalPage` is deferred until a real API needs it; Square's pattern is good when needed.
Streaming already ships in the form that matters for HTTP APIs: a WHATWG-compliant Server-Sent
Events reader (http.sse) in sdk-core, surfaced as a backpressured Reactor Flux<ServerSentEvent>
in sdk-async-reactor. Long-running-operation polling is still aspirational. Reference designs:
- gax `OperationFuture<R, M>`: extends `ApiFuture<R>` with `getName()`, `peekMetadata()`, `getPollingFuture()`. LRO modeled as retry-with-different-result-predicate (`OperationResponsePollAlgorithm`). `resumeFutureCall(operationName, ctx)` allows reattaching across restarts. (`gax/.../longrunning/OperationFuture.java:42-128`.)
- gax server-streaming: `ResponseObserver<V>` (gRPC-style) with explicit backpressure via `StreamController.disableAutoInboundFlowControl()` + `request(int)`. Watchdog closes streams with no demand→response progress within `waitTimeout`.
For us: defer LRO until a specific consumer needs it. Our SSE streaming already leans on
reactive-streams backpressure via Reactor; gax's explicit StreamController model (request(int))
is the reference if a non-Reactor server-streaming surface is ever needed.
| SDK | Vocabulary | Verdict |
|---|---|---|
| Ours | `Span` / `TracingScope` + an `HttpTracer` with named retry/request/response events | Event-rich vocabulary in place |
| Expedia | SLF4J only | None |
| Square | None | None |
| Airbyte | `System.out` debug logger | None |
| gax | `ApiTracer` with 15+ named events (`operationSucceeded`, `attemptStarted`, `attemptFailed`, `responseReceived`, `requestUrlResolved`, ...) | Gold standard |
gax's ApiTracer (gax/.../tracing/ApiTracer.java:47-219) treats retry/streaming/LRO as
first-class events. A generic OpenTelemetry tracer can't render meaningful retry dashboards
without convention; an event-rich tracer can.
What shipped, and what's left:
- `HttpTracer` carries the named events with no-op defaults: `operationStarted` / `operationSucceeded` / `operationFailed`, `attemptStarted(attemptNumber)`, `attemptFailed(...)`, `attemptRetriesExhausted(throwable)`, `requestUrlResolved(url)`, `requestSent`, `responseHeadersReceived(...)`, `responseReceived`, `connectionAcquired(...)`. `NoopHttpTracer` is the default factory.
- A `MetricsRecorder` seam exists separate from tracing: `Meter` with `counter(...)` (`LongCounter`) and `histogram(...)` (`DoubleHistogram`), `NoopMeter` as the default. Latency/errors/retries are the SRE-relevant signals; gax's `MetricsTracer` + `GoldenSignalsMetricsRecorder` was the model.
- An `sdk-instrumentation-otel` adapter wiring our events to OpenTelemetry spans + metrics is not yet built.
- Client identity header (`User-Agent`). `ClientIdentityStep` builds the composite token line (`dexpace-sdk/<ver> jvm/<javaver>`, custom tokens prepended), modeled on gax's `ApiClientHeaderProvider`.
We have the right architecture (separate sdk-async-* modules). Anti-patterns observed:
- gax invented `ApiFuture` because Guava `ListenableFuture` couldn't be in the public API (shading). Now every Cloud client returns `ApiFuture` and ergonomic composition is impossible. Lesson: never invent a new future type. Keep `HttpClient.execute` sync; let `sdk-async-coroutines` return `suspend`, `sdk-async-reactor` return `Mono`, etc.
- Expedia has `Transport` + `AsyncTransport` as two parallel SPIs, with twin executor hierarchies (`AbstractRequestExecutor` + `AbstractAsyncRequestExecutor`) and twin OAuth managers (`OAuthManager` + `OAuthAsyncManager`). ~90% code duplication. We avoid this by having one core + per-async-flavor adapters.
- gax has the cleanest separation: `StubSettings` (immutable user-facing) → `ClientSettings` (facade) → `ClientContext` (resolved runtime: connected transport, fetched creds, built executor, default call context). `ClientContext.create(settings)` is the resolution boundary. `BackgroundResource` semantics: closing the context cleans up owned executors.
- Square's `Suppliers.memoize` for lazy sub-client init (`Suppliers.java:13-22`): a 10-line `AtomicReference.updateAndGet`, lock-free, thread-safe. Worth copying verbatim for our future generated client shell so we don't eagerly construct 30+ sub-clients on every `new XxxClient(...)`.
Action items:
- Split `Configuration` into immutable user-facing `XxxSettings` and resolved runtime `XxxContext`. Resolution stage owns executor lifecycle (`AutoCloseable`).
- Adopt `Suppliers.memoize` for any client-of-clients pattern we ship.
- Resource ownership: per gax's `BackgroundResource` discipline, distinguish user-supplied (don't close) from SDK-owned (close on `client.close()`) executors and transports.
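
A lock-free memoizer in the spirit of the `AtomicReference.updateAndGet` approach looks roughly like this. It is written from the description above, not copied from Square's file, and `Memoize` is a hypothetical class name.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch of a lock-free memoizing supplier. */
final class Memoize {
    private Memoize() {}

    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        AtomicReference<T> cached = new AtomicReference<>();
        return () -> {
            T value = cached.get();
            if (value == null) {
                // Under contention the delegate may run more than once,
                // but every caller observes the single value that won the CAS.
                value = cached.updateAndGet(
                        cur -> cur != null ? cur : Objects.requireNonNull(delegate.get()));
            }
            return value;
        };
    }
}
```

That trade-off suits sub-client construction, where building a duplicate instance is cheap and harmless; a delegate with side effects would need a lock instead.
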
- Square `WebhooksHelper.verifySignature`: HMAC-SHA256, but uses `String.equals` (`WebhooksHelper.java:53`), which is vulnerable to timing attacks. No timestamp/replay check.
Action items:
- Build `sdk-webhooks` module. `WebhookVerifier.verify(secret, payload, signature, timestamp, tolerance)`. Use `MessageDigest.isEqual` for constant-time comparison. Require timestamp + tolerance (e.g. ±5min) for replay protection.
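
A verifier along those lines could look like the sketch below. The signing scheme is an assumption made for illustration (HMAC-SHA256 over the epoch-second timestamp, a `.` separator, then the payload, with a Base64 signature); real providers differ, and `WebhookVerifierSketch` is a hypothetical name. `now` is passed in so the replay window can be tested deterministically.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch; the signed-content layout is an assumption, not a provider spec. */
final class WebhookVerifierSketch {
    private WebhookVerifierSketch() {}

    /** HMAC-SHA256 over "<epochSeconds>.<payload>". */
    static byte[] sign(byte[] secret, Instant timestamp, byte[] payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            mac.update(Long.toString(timestamp.getEpochSecond()).getBytes(StandardCharsets.UTF_8));
            mac.update((byte) '.');
            return mac.doFinal(payload);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    static boolean verify(byte[] secret, byte[] payload, String signatureBase64,
                          Instant timestamp, Duration tolerance, Instant now) {
        // Replay protection: reject timestamps outside the tolerance window.
        if (Duration.between(timestamp, now).abs().compareTo(tolerance) > 0) return false;
        byte[] provided;
        try {
            provided = Base64.getDecoder().decode(signatureBase64);
        } catch (IllegalArgumentException malformed) {
            return false;
        }
        byte[] expected = sign(secret, timestamp, payload);
        return MessageDigest.isEqual(expected, provided); // constant-time comparison
    }
}
```

Binding the timestamp into the signed content matters: a tolerance check on an unsigned timestamp can be bypassed by simply rewriting the header.
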
These are not gaps. Stay the course.
- Zero non-SLF4J runtime deps in `sdk-core`. Unique among the cohort. gax pulls Guava+gRPC+Protobuf+AutoValue+OTel+threetenbp; Expedia welds Okio into the public API; Square+Airbyte are full Jackson+OkHttp/JDK11.
- Java 8 bytecode target. None of Kiota / smithy-java / Airbyte support Java 8. gax does, Square does, Expedia does, but with much heavier deps.
- Single `HttpClient` SPI with our own `Request` / `Response`. Doesn't leak transport types (Airbyte leaks JDK 11; Square leaks OkHttp).
- `IoProvider` seam with `Source` / `Sink` contracts. No reference SDK has this level of I/O abstraction.
- `RequestBody.isReplayable()` / `toReplayable(provider)`. First-class replayability; nobody else has it.
- Body logging via `TeeSink` (request) + eager-buffer-then-peek (response). `LoggableRequestBody` / `LoggableResponseBody` design preserves streaming semantics; Expedia eagerly buffers every request body when logging is on (a real regression for streaming uploads).
- ReentrantLock + interrupt-restore discipline. Documented in `CLAUDE.md`. Square, Expedia, and Airbyte all violate this in their retry/auth paths.
- Explicit Kotlin visibility (Strict mode) + `internal` + `@JvmSynthetic`. Real enforcement. gax uses `@InternalApi` as marker-annotations on public types, which has no compiler enforcement.
- Async as adapter modules, not invented futures. No `DexpaceFuture`. We sidestep gax's `ApiFuture` debt.
- Separate transport modules (OkHttp + JDK HttpClient). Already in place; most ref SDKs only ship one.
The foundational and most of the DX-win work from this survey now ships in sdk-core (with
adapters where noted):
- Retry pipeline step. `Retry-After` + `X-RateLimit-Reset` parsing, backoff with jitter, idempotency-aware (HTTP method + replayable body), `ScheduledExecutorService`-based delay. [`pipeline/step/retry/`, `http/pipeline/steps/DefaultRetryStep.kt`]
- Typed exception hierarchy. `HttpException` base + status-code subclasses; `retryable` derived from `RetryUtils.isRetryable(status.code)`. [`http/response/exception/`]
- Recovery step. Recovery-aware `ResponsePipeline` folding a sealed `ResponseOutcome` (`Success` / `Failure`); `ResponseRecoveryStep` is the `AfterError` analog. [`pipeline/`]
- `HttpClient.close()` / lifecycle. `AutoCloseable` on both SPIs and both transports; SDK-managed clients close, BYO clients don't. [`client/`]
- Idempotency-key step. Auto-injects `Idempotency-Key: UUID.randomUUID()` for `POST`/`PUT`/`PATCH`; caller-set header wins; pluggable key strategy. [`pipeline/step/IdempotencyKeyStep.kt`]
- Auth. `Credential` family + RFC 7235 challenge parsing + Basic/Digest/Composite `ChallengeHandler`s + `AuthStep` pillar. [`http/auth/`, `http/pipeline/steps/`]
- `sdk-serde-jackson` adapter. Kotlin + JSR-310 + Jdk8 modules; `FAIL_ON_UNKNOWN_PROPERTIES` and `WRITE_DATES_AS_TIMESTAMPS` disabled; `Tristate<T>` via `TristateModule`.
- Pagination primitives. `Paginator<T>` + `Page<T>` + `PaginationStrategy` (cursor / page-number / token / link-header) with a `maxPages` cap; `PagedIterable` wrapper.
- Client identity header. `ClientIdentityStep` building the `dexpace-sdk/<ver> jvm/<javaver>` token line.
- Tracer event vocabulary + metrics seam. `HttpTracer` with named retry/request/response events; `Meter` / `LongCounter` / `DoubleHistogram` separate from tracing.
- SSE streaming. WHATWG reader in `sdk-core`; backpressured `Flux<ServerSentEvent>` in `sdk-async-reactor`.
Ordered by leverage:
- OAuth2 client_credentials flow. Coalesced refresh + 401 eviction over the existing `BearerTokenProvider` / `handleChallenge` seam; per-call auth override via `RequestContext`.
- `sdk-instrumentation-otel` adapter. Wire the `HttpTracer` events and `Meter` seam to OpenTelemetry spans + metrics.
- `sdk-webhooks` module. HMAC-SHA256/SHA1 verifier, constant-time compare, timestamp+tolerance replay check.
- Configuration `Settings` → `Context` resolution split. Apply gax's `BackgroundResource` discipline at the resolution boundary.
- Async pagination variants. `Flow<T>` (`sdk-async-coroutines`) and `Flux<T>` (`sdk-async-reactor`) over `Paginator`.
- Tolerant error-body parsing. Decode typed/structured error payloads without throwing inside an exception constructor.
- Long-running operation polling helpers. Only when a real consumer needs them.
- Batching primitives. Only when a real consumer needs them.
| Option | Verdict | Reason |
|---|---|---|
| Fork Expedia's OpenAPI plugin | Reject | Inherits 20 cataloged bugs (uppercase discriminators, path params not URL-encoded, accept-header status-200-only, etc.); Mustache fragility; Kotlin-only; locks us to OpenAPI Generator 7.15 |
| OpenAPI Generator (stock kotlin emitter) | Reject | Same Mustache fragility; per-language quality variance; runtime emitted alongside generated code (consumer must ship Jackson/OkHttp) |
| Smithy 2.0 + smithy-java runtime | Reject | smithy-java requires JDK 21 (we're Java 8); brand-new GA (April 2026); thick runtime duplicates our sdk-core |
| Smithy 2.0 + custom `JavaCodegenIntegration` targeting our runtime | Defer | Viable but 7-10 person-weeks; only justified once we own multiple services in Smithy IDL |
| Microsoft Kiota | Reject | No Kotlin target; Java 11+ default; replaces our HttpClient SPI with RequestAdapter; 5-8 runtime deps; locked-down customization |
| Fern (Square's generator) | Reject | Generates its own runtime (effectively replacing sdk-core); SaaS-recommended workflow; OkHttp + Jackson hardcoded; Java only |
| Speakeasy (Airbyte's generator) | Reject | Closed-source SaaS; venture-funded vendor in critical path of every release; no self-host |
| gapic-generator-java | Reject | Proto + Bazel; would require an OpenAPI→proto front-end; co-versioned runtime model is too heavy for us |
| Build our own (KotlinPoet + swagger-parser) | Recommend | Typed IR + typed code emission; no Mustache; reuses our sdk-core types; comparable cost to forking/integrating any of the above |
A new Gradle module sdk-codegen plus a Gradle plugin sdk-codegen-gradle-plugin.
Stack:
- `io.swagger.parser.v3:swagger-parser` as front-end. Handles OpenAPI 2.0/3.0/3.1, resolves `$ref`s, well-maintained. (Smithy IDL can become an alternative source later via a separate parser adapter.)
- Internal normalized IR mapped from the swagger-parser `OpenAPI` object. Resolves discriminators, flattens `allOf`, names inline schemas deterministically, splits `readOnly` / `writeOnly` properties into request-vs-response models.
- KotlinPoet for Kotlin emission; JavaPoet for Java emission. Same IR, two emitters.
- Generator-side `Integration` hooks (Smithy-inspired) so consumers can inject preprocessing, decorate type/symbol resolution, intercept named code sections (e.g. "operation-error-handling") without forking templates.
- Golden file tests. Every meaningful spec feature gets a fixture spec + a checked-in expected output. CI compiles the generated output.
Why this beats Mustache-based alternatives:
- Compiler-checked emission. KotlinPoet's `FileSpec` / `TypeSpec` / `FunSpec` is typed Kotlin; refactors are safe; IDE supports it. Every Mustache bug in Expedia's plugin (path params not URL-encoded, discriminators uppercased, unused imports, dead validation, builder param type drops nullability) is a string-typo that the compiler would have caught.
- Targets our runtime directly. Generator emits `org.dexpace.sdk.core.http.Request` / `Response` / `Headers` and our pipeline steps. No Mustache template indirection.
- Hot-path: no reflection. Generated code is straight-line builder calls. Compare Airbyte's `@SpeakeasyMetadata` runtime reflection: slow, opaque, fragile.
- Pluggable serde. Codegen emits an abstraction (`Serializer<T>`) that adapter modules implement. Don't hardcode Jackson into templates.
- Multi-language without a runtime fork. Same IR → KotlinPoet for Kotlin, JavaPoet for Java. We don't get TS/Go/Python for free, but we never claimed multi-language as a goal.
From Expedia's plugin (expediagroup-sdk-openapi-plugin):
- Operation-trait composition pattern. Each operation implements only the traits relevant to its spec features (`UrlPathTrait`, `HeadersTrait`, `UrlQueryParamsTrait`, `OperationRequestBodyTrait<T>`, `OperationResponseBodyTrait<T>`, `OperationNoResponseBodyTrait`). Cleaner than annotation-based codegen; no reflection at runtime. Define these traits in a new `sdk-rest` (or `sdk-operations`) module.
- Operation / Params split. Keep path/query/header concerns in a dedicated `*OperationParams` class with `pathParams()`/`queryParams()`/`headers()` projections. Operation class becomes a pure request-info adapter.
- Per-status typed exceptions. Emit `{OperationName}{StatusCode}Exception` in a `<modelpkg>.exception` sub-package. Improvement over Expedia: on parse failure, attach the raw body as a suppressed throwable rather than silently `null`.
- `@JsonDeserialize(builder = Builder::class)` + private constructor + Builder. Jackson-friendly immutability.
- IR processor hooks. Expedia's `processOperation(CodegenOperation): CodegenOperation` lambdas let users adapt to spec quirks without forking templates. Same idea, but typed (`(OperationIR) -> OperationIR`).
- Spec preprocessor pattern. Expedia uses an external npm tool; we should build a JVM-native preprocessor pipeline: `$ref` resolution, `allOf` flattening, `operationId → tag` normalization, inline-schema naming, header injection. All testable separately from the emitter.
- Mustache template merge mechanism is replaced by IR processor hooks in our world; the value was the layered defaults idea, which we retain.
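A minimal sketch of the trait-composition and Operation / Params ideas from the list above. The trait names are taken from the list, but the accessor signatures, the `ListBookings*` classes, and the `/bookings` path are hypothetical and chosen only for illustration; this is not Expedia's code or our planned module layout.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Trait names come from the list above; the method signatures are assumptions.
interface UrlPathTrait { String getUrlPath(); }
interface UrlQueryParamsTrait { Map<String, String> getUrlQueryParams(); }
interface HeadersTrait { Map<String, String> getHeaders(); }

/** Params class owns the path/query/header projections (the Operation / Params split). */
final class ListBookingsOperationParams {
    private final String status;  // optional query param, may be null
    private final Integer limit;  // optional query param, may be null

    ListBookingsOperationParams(String status, Integer limit) {
        this.status = status;
        this.limit = limit;
    }

    Map<String, String> pathParams() { return Collections.emptyMap(); }

    Map<String, String> queryParams() {
        Map<String, String> q = new LinkedHashMap<>();
        if (status != null) q.put("status", status);
        if (limit != null) q.put("limit", String.valueOf(limit));
        return Collections.unmodifiableMap(q);
    }

    Map<String, String> headers() { return Collections.emptyMap(); }
}

/** The spec declares no header params here, so HeadersTrait is simply not implemented. */
final class ListBookingsOperation implements UrlPathTrait, UrlQueryParamsTrait {
    private final ListBookingsOperationParams params;

    ListBookingsOperation(ListBookingsOperationParams params) {
        this.params = params;
    }

    @Override
    public String getUrlPath() { return "/bookings"; }

    @Override
    public Map<String, String> getUrlQueryParams() { return params.queryParams(); }
}
```

An executor can then branch on `instanceof HeadersTrait` (or similar) to decide which parts of the request to populate, with no annotations or reflection involved.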
From Square (Fern's Java output):
- `Suppliers.memoize` for lazy sub-client init. Verbatim, 10 lines, Java 8.
- Raw/Cooked client split. `*Client` returns `T`, `Raw*Client` returns `Response<T>` (body + headers). Single `withRawResponse()` accessor for crossing the boundary. (We have our own `Response` already; raw clients return it, cooked clients call `.body()`.)
- Async client mirror. Optional. Generated as `Async*Client` returning `CompletableFuture<T>`, delegating to our `sdk-async-coroutines`-style adapter under the hood.
- Forward-compatible enums. Square's `enable-forward-compatible-enums: true` setting emits an `UNKNOWN` sentinel rather than throwing on unrecognized enum values. Adopt by default; SDK releases shouldn't break on server-side enum additions.
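For the lazy sub-client point, a memoizing supplier can be sketched in Java 8 as below. This is an independent illustration of the technique (double-checked locking over a `volatile` flag), not a copy of Fern's generated `Suppliers` class; the `Memoized` name is hypothetical.

```java
import java.util.Objects;
import java.util.function.Supplier;

final class Memoized {
    private Memoized() {}

    /** Wraps a supplier so the delegate runs at most once, even under concurrent first access. */
    static <T> Supplier<T> memoize(final Supplier<T> delegate) {
        Objects.requireNonNull(delegate, "delegate");
        return new Supplier<T>() {
            private volatile boolean initialized;
            private T value; // published safely by the volatile write to 'initialized'

            @Override
            public T get() {
                if (!initialized) {
                    synchronized (this) {
                        if (!initialized) {
                            value = delegate.get();
                            initialized = true;
                        }
                    }
                }
                return value;
            }
        };
    }
}
```

A generated root client could hold one such supplier per sub-client and expose it through an accessor, so sub-clients that are never touched are never constructed. The lock here guards only object construction, not a network call.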
From Airbyte (Speakeasy's Java output):
- Hook taxonomy (`SdkInit`/`BeforeRequest`/`AfterSuccess`/`AfterError`). Already mapped onto our pipeline; see Pipeline / Middleware.
- `IdempotencyHook` pattern. Realized as `IdempotencyKeyStep`, a `BeforeRequest` step injecting `Idempotency-Key`.
- `ClientCredentialsHook` pattern. Single class implements 3 hook interfaces (`SdkInit` for client init, `BeforeRequest` for token injection, `AfterError` for 401 eviction); this is the model for the OAuth2 flow still on the Remaining backlog.
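The idempotency item can be illustrated with a deliberately simplified sketch. It does not reproduce `IdempotencyKeyStep` or the real `RequestPipelineStep` signature: the `BeforeRequestStep` interface, the header-map representation, and the POST/PATCH-only policy are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

/** Hypothetical stand-in for a request-side pipeline step; the real SDK works on Request objects. */
interface BeforeRequestStep {
    Map<String, String> apply(String method, Map<String, String> headers);
}

final class IdempotencyKeySketch implements BeforeRequestStep {
    static final String HEADER = "Idempotency-Key";

    private final Supplier<String> keySource;

    IdempotencyKeySketch(Supplier<String> keySource) {
        this.keySource = keySource;
    }

    static IdempotencyKeySketch withRandomUuid() {
        return new IdempotencyKeySketch(() -> UUID.randomUUID().toString());
    }

    @Override
    public Map<String, String> apply(String method, Map<String, String> headers) {
        Map<String, String> out = new LinkedHashMap<>(headers);
        // Assumed policy: only non-idempotent methods get a key.
        boolean needsKey = "POST".equalsIgnoreCase(method) || "PATCH".equalsIgnoreCase(method);
        // A key that is already present wins, so a retried request reuses the same key.
        // Simplification: header-name matching here is case-sensitive.
        if (needsKey && !out.containsKey(HEADER)) {
            out.put(HEADER, keySource.get());
        }
        return out;
    }
}
```

Keeping an existing key is the property that matters: the key must be assigned once per logical call, before the retry loop, or retries would look like new requests to the server.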
From gax:
- Per-method `CallSettings`. Each generated operation has typed `retryableCodes: Set<Int>` + `retrySettings: RetrySettings` defaults. Consumers override at call site via `RequestContext`.
- Composite client header. `dexpace-sdk/<ver> jvm/<javaver> okhttp/<okver>` token line.
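The composite client header is simple enough to sketch directly. The `ClientHeaderSketch` builder below is hypothetical; only the `name/version` token format separated by spaces comes from the item above, and the validation rules are an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ClientHeaderSketch {
    // Insertion order is preserved so the SDK token stays first.
    private final Map<String, String> tokens = new LinkedHashMap<>();

    ClientHeaderSketch add(String name, String version) {
        requireToken(name, "name");
        requireToken(version, "version");
        tokens.put(name, version);
        return this;
    }

    /** Renders "name/version" pairs separated by single spaces. */
    String build() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : tokens.entrySet()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(e.getKey()).append('/').append(e.getValue());
        }
        return sb.toString();
    }

    // Assumed rule: tokens may not contain whitespace or '/', which would make the line ambiguous.
    private static void requireToken(String s, String what) {
        if (s == null || s.isEmpty()) {
            throw new IllegalArgumentException(what + " must be non-empty");
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c) || c == '/') {
                throw new IllegalArgumentException(what + " contains an illegal character: " + s);
            }
        }
    }
}
```

At client construction this might be called as `new ClientHeaderSketch().add("dexpace-sdk", sdkVersion).add("jvm", System.getProperty("java.version")).add("okhttp", okVersion).build()`, where `sdkVersion` and `okVersion` are placeholders supplied by the core and the transport adapter respectively.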
From Smithy (without adopting smithy-java):
- Trait-based extensibility. Custom OpenAPI vendor extensions (`x-dexpace-*`) get first-class treatment via Integration hooks; generator can be taught about new traits without forking.
- `smithy-build.json`-style projections. Eventually consider model transforms (filter operations by tag, rename namespaces) as first-class config.
(Distilled from the bug catalogs and anti-patterns observed across the cohort. None of these should ever appear in our generator.)
- Mustache templating. Compiler-unchecked string concatenation. Source of every Expedia bug.
- Auto-uppercasing or transforming discriminator values (Expedia `Discriminator.kt:33`). Never mutate spec values on the wire.
- `!!` non-null assertions in generator code (Expedia `Discriminator.kt:33`). Use explicit error messages.
- Hardcoded language (Expedia `setGeneratorName("kotlin")`). Treat target language as a first-class config axis.
- Hardcoded serde library (Expedia + Airbyte hardcode Jackson). Emit through a `Serializer<T>` abstraction.
- Path-param interpolation without percent-encoding (Expedia, Airbyte). Use RFC 3986 unreserved char tables.
- Reflection-driven serialization with string-DSL metadata (Airbyte `@SpeakeasyMetadata("...")`). Generate code, not metadata.
- `Thread.sleep` in retry loops (Square, Airbyte). Use `ScheduledExecutorService`. Restore interrupt flag on `InterruptedException`.
- `synchronized` around network calls (Expedia `OAuthStep`). Use `ReentrantLock` + coalescing futures.
- `HttpClient.newHttpClient()` per call (Airbyte). Pool transports at the client level.
- One `SDKError` for everything (Airbyte). Typed per-status exceptions.
- Inventing your own future type (gax `ApiFuture`). Use `CompletableFuture` + adapter modules.
- Annotation-only visibility (gax `@InternalApi` on public types). Use real visibility modifiers.
- Threetenbp dual API surface (gax). Use `java.time` only.
- Reading entire error/response bodies as String (Airbyte `Utils.toUtf8AndClose`, Square `responseBody.string()`). Stream via our `Source`.
- ServiceLoader for transport discovery (Expedia). Explicit installation only.
- `String.equals` for cryptographic comparison (Square `WebhooksHelper`). `MessageDigest.isEqual`.
- Eager-buffering request bodies during logging (Expedia `RequestLoggingStep`). Use our `TeeSink` design.
- Dead code shipped with no tests (Expedia's `linkType` Mustache references, `CustomPager` thrown-on-use stub). If a feature isn't implemented end-to-end, don't ship the seam.
- 100% coverage enforcement on every module (Expedia's Kover). Forces coverage on getters; encourages package exclusions. Target logic packages; exempt POJOs.
- Status-200-only Accept header aggregation (Expedia `HttpAcceptHeaderLambda`). Aggregate across all 2xx responses.
- Catch-and-null on error response parse (Expedia `getExceptionForCode`). Propagate parse failure as suppressed throwable.
- Auto-fallback to `String` for unschematized error bodies (Expedia `OperationExceptionsLambda`). Pass-through raw bytes.
- Duplicate sync/async hierarchies (Expedia `Transport`/`AsyncTransport`, `OAuthManager`/`OAuthAsyncManager`). Single core + async adapters.
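The percent-encoding item in the list above is the kind of rule that is easy to state and easy to get wrong, so a sketch may help. The `PathSegmentEncoder` helper is hypothetical. It encodes every byte outside the RFC 3986 section 2.3 unreserved set, which is stricter than a path segment strictly requires (sub-delimiters would also be legal) but never produces an ambiguous URL.

```java
import java.nio.charset.StandardCharsets;

final class PathSegmentEncoder {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private PathSegmentEncoder() {}

    /** RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~". */
    private static boolean isUnreserved(int b) {
        return (b >= 'A' && b <= 'Z')
            || (b >= 'a' && b <= 'z')
            || (b >= '0' && b <= '9')
            || b == '-' || b == '.' || b == '_' || b == '~';
    }

    /** Percent-encodes one path segment, operating on its UTF-8 bytes. */
    static String encode(String segment) {
        byte[] bytes = segment.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder(bytes.length);
        for (byte raw : bytes) {
            int b = raw & 0xFF; // treat the byte as unsigned
            if (isUnreserved(b)) {
                out.append((char) b);
            } else {
                out.append('%').append(HEX[b >> 4]).append(HEX[b & 0x0F]);
            }
        }
        return out.toString();
    }
}
```

Generated operations would call this once per path parameter value and never on the template itself, so a `/` inside a value becomes `%2F` instead of silently adding a path segment. `java.net.URLEncoder` is not a substitute here, since it implements form encoding and turns spaces into `+`.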
Order of operations once the Tier 1 backlog above is partially complete (retry + auth + typed exceptions land first; codegen can target them):
- Week 1-2. `sdk-codegen` module skeleton. swagger-parser integration. Define internal IR (`Spec`, `Operation`, `Model`, `Enum`, `OneOf`, `Param`). Initial OpenAPI 3.x → IR mapping.
- Week 3-4. Spec preprocessor pipeline (`$ref` resolution, `allOf` flatten, inline-schema naming). Golden-file tests against fixture specs.
- Week 5-6. KotlinPoet emission for: models (immutable + Builder + `@JsonDeserialize`), enums (forward-compatible with `UNKNOWN`), per-operation `*Params` classes, per-status exception classes.
- Week 7-8. Operation emission: `*Operation` classes implementing operation traits; sync + async (`Async*Client`) clients; `Raw*Client` and cooked variant.
- Week 9-10. Gradle plugin packaging; Integration SPI for consumer customization (preprocessor + IR-processor + named-section hooks).
- Week 11-12. Round-trip a non-trivial fixture spec (oneOf + allOf + discriminator + recursive ref + multipart + binary + readOnly/writeOnly split). Compile output. Run JVM smoke tests against MockWebServer.
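To make the IR and IR-processor hooks in the plan above less abstract, here is a minimal sketch of what a typed processor chain could look like. The two-field `OperationIR` and the `IrProcessorChain` class are hypothetical and far smaller than a real IR node would be; the point is only that each hook is an ordinary typed function from IR to IR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical, deliberately tiny slice of the IR; a real node would carry params, bodies, etc. */
final class OperationIR {
    final String operationId;
    final String tag; // may be null when the spec omits tags

    OperationIR(String operationId, String tag) {
        this.operationId = operationId;
        this.tag = tag;
    }

    // Immutable "wither" methods keep each processor side-effect free.
    OperationIR withOperationId(String newId) { return new OperationIR(newId, tag); }

    OperationIR withTag(String newTag) { return new OperationIR(operationId, newTag); }
}

/** Applies consumer-supplied processors in registration order. */
final class IrProcessorChain {
    private final List<UnaryOperator<OperationIR>> processors = new ArrayList<>();

    IrProcessorChain add(UnaryOperator<OperationIR> processor) {
        processors.add(processor);
        return this;
    }

    OperationIR process(OperationIR operation) {
        OperationIR current = operation;
        for (UnaryOperator<OperationIR> p : processors) {
            current = p.apply(current);
            if (current == null) {
                throw new IllegalStateException("IR processor returned null for " + operation.operationId);
            }
        }
        return current;
    }
}
```

A consumer working around a spec quirk might register `op -> op.tag == null ? op.withTag("default") : op` through the Integration SPI. Because the hook is typed, a renamed IR field breaks the consumer's build rather than silently producing wrong output.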
Out of scope for v1: JavaPoet emission, Smithy IDL front-end, gRPC, multi-language. All deferrable.
Total estimated effort: 8-12 person-weeks for a v1 that emits a usable Kotlin SDK against
our runtime. Comparable to a Smithy JavaCodegenIntegration (7-10 weeks) or forking Expedia's
plugin and fixing its bugs (4-6 weeks but inherits Mustache fragility forever).