Skip to content

Clean-architecture restart: SDK foundation + utilities resource#8

Merged
MarketDataDev03 merged 57 commits into
mainfrom
clean-architecture-restart
May 28, 2026
Merged

Clean-architecture restart: SDK foundation + utilities resource#8
MarketDataDev03 merged 57 commits into
mainfrom
clean-architecture-restart

Conversation

@MarketDataDev03

@MarketDataDev03 MarketDataDev03 commented May 26, 2026

Copy link
Copy Markdown
Collaborator

Clean-architecture restart: SDK foundation + utilities resource

📖 Reviewer's guide: docs/REFACTOR_REVIEW_GUIDE.md — flow-by-flow walkthrough with mermaid diagrams, file:line citations, and the 14 subtle corners worth a second look. Open it side-by-side with the diff. If you only have 30 min, read §2 (sync request flow), §4 (retry), §10 (subtle corners).

Foundation for the Java SDK: configuration cascade, async-first HTTP transport, retry + Retry-After + status cache, 50-permit concurrency cap, sealed exception hierarchy, Response<T> surface, parallel-arrays JSON decoding, and the utilities resource façade. stocks / options / funds / markets are deliberately deferred to follow-up PRs.

Why this exists

The previous main carried an early, partial shape that did not satisfy the cross-language SDK Requirements: no retry/backoff, no rate-limit tracking, no sealed exceptions, no Response<T> surface, no async/sync parity, no concurrency cap. This PR establishes the canonical foundation the way ADRs 001–007 dictate, then wires the smallest possible resource (utilities — three endpoints: /status/, /user/, /headers/, all unversioned) on top of it so the foundation can be exercised end-to-end. Future per-resource PRs (stocks, options, funds, markets) plug into the same transport / parser / response surface — they only declare their endpoints, wire-format columns, and response models.

What's in this PR

Grouped by SDK layer. Each item references the spec section that demands it.

Configuration & lifecycle (§4, §5, §16)

  • MarketDataClient — public no-arg and 4-arg constructors; package-private 6-arg seam for hermetic tests; close(), utilities(), getRateLimits().
  • Configuration.resolve(...) — explicit → MARKETDATA_* env var → .env → default cascade, with baseUrl / apiVersion normalization + validation and printable-ASCII apiKey rejection (CRLF, NUL, high-bit bytes — issue Bump org.gradle.toolchains.foojay-resolver-convention from 0.10.0 to 1.0.0 #23).
  • DotEnvLoader.env file loading with explicit allowed-keys whitelist; warnings buffered until the logger is configured (issue Bump com.fasterxml.jackson.core:jackson-databind from 2.18.2 to 2.22.0 #25 attaches them as suppressed if resolve itself throws).
  • EnvVars — the only place that knows the MARKETDATA_* variable names; systemLookup() restricts reads to the allowed set so SDK code cannot accidentally read unrelated env vars.
  • DemoModeapiKey == null predicate driving the §5 startup-validation skip.
  • Startup validation — single-attempt GET /user/ (uses RetryPolicy.noRetry() so a slow/down API surfaces within seconds instead of burning the ~6.75 min retry budget); skipped in demo mode; partial-construction guard closes the transport and attaches the close failure as a suppressed exception if the probe throws.

HTTP transport (§4, ADR-004, ADR-006)

  • HttpTransport — single point of contact between resources and the network. Owns the URL/request builders, applies status routing (2xx + 404 → success envelope; 4xx/5xx → typed exception). Async-first per ADR-006; joinSync is a thin wrapper that unwraps CompletionException so consumers see MarketDataException directly.
  • HttpDispatcher — single-shot send under the 50-permit semaphore. Captures sync-throws from HttpClient.sendAsync so the permit cannot leak. Maps IOException to NetworkError exactly once.
  • HttpResponseEnvelopebyte[] body + status + request-id + headers + URL. Format-agnostic (resources decide JSON / CSV / HTML).
  • HttpStatusMapper — status → sealed-exception table; default-bucket logic for 3xx / 1xx / out-of-range codes that still routes to a permit.
  • RequestSpec — declarative GET specification: path, query params, format, versioned flag.

Retry & backoff (§9)

  • RetryPolicy — 4 attempts default, exponential 1s × 2^n capped at 30s; retries NetworkError (only with IOException somewhere in the cause chain — issue docs(examples): rewrite the consumer examples as teaching material #15) and ServerError 501–599; never 500, never 4xx, never ParseError/RateLimitError/AuthenticationError/BadRequestError/NotFoundError. noRetry() factory for startup validation.
  • RetryExecutor — pure orchestration of Supplier<CompletableFuture<T>> retries via CompletableFuture.delayedExecutor (no scheduled-thread pool); single cancellation handler installed once, in-flight attempt tracked via AtomicReference.
  • RetryAfterHeader — RFC 7231 parser: delta-seconds and RFC 1123 HTTP-date both supported; negative deltas and past dates clamp to Duration.ZERO.
  • Retry-After 10-minute cap (issue refactor: remove retired Options Strikes endpoint #21) — pathological server values (Retry-After: 9999999999) are logged and the SDK falls back to exponential. The raw value remains visible on ServerError.getRetryAfter().
  • StatusCache — stale-while-revalidate (270 s refresh threshold / 300 s expiry); longest-path-prefix match with trailing-slash normalization (issue feat: add of(...) convenience factory to StockQuotesRequest and OptionsQuotesRequest #18); re-check after triggerRefresh (issue docs: sync from documentation@55d35fe34473617fb1009ba844cffec94a57f0bc #19) so synchronous-fetcher test setups don't always answer ALLOW on first call; self-bypass for the /status/ URI so the gate cannot trap its own refresh.

Rate-limit tracking & preflight (§8, §10.3)

  • RateLimitHeaders.parse(...) — reads the four x-api-ratelimit-* headers as an all-or-nothing set; partial deliveries return null (no phantom zeros that would freeze the preflight gate).
  • RateLimitSnapshot — record: limit, remaining, reset (Instant), consumed.
  • HttpTransport.checkRateLimitPreflight — fails fast with RateLimitError when the latest snapshot reports remaining == 0 and now < reset. Returns null once reset elapses so the next response refreshes the snapshot.
  • Preflight bypass on server-hinted retry — when the previous attempt was a ServerError carrying a parsed Retry-After, the snapshot is ignored for that retry so the server-orchestrated backoff prevails.

Concurrency (§12, ADR-007)

  • AsyncSemaphore — async-safe 50-permit limiter. Replaces j.u.c.Semaphore so executeAsync never parks the caller's thread. Two invariants: every permit accounted in exactly one place (available / held / pending), and future.complete(...) always runs outside the lock.

Response surface (§13.5)

  • Response<T> — typed data(), defensive-copied rawBody() (clone on both write and read), statusCode(), requestId() (Cloudflare cf-ray), requestUrl(), isJson()/isCsv()/isHtml(), isNoData() (true for 404, the API's no-data envelope status), saveToFile(Path), redacted toString().
  • Format enum — JSON/CSV/HTML, package-private (queried via the boolean accessors so the enum can grow without breaking compiled consumers).

JSON parsing (§11, ADR-005)

  • JsonResponseParser — one ObjectMapper per client; resources self-register wire-format SimpleModules in their constructors. Pre-checks empty body (issue Bump actions/checkout from 4 to 7 #29) and emits a précis ParseError ("Empty response body…") instead of Jackson's generic "No content to map".
  • ParallelArrays.zip + listDeserializer(...) — declarative factory that collapses the ~30-line custom-deserializer skeleton to three pieces: column names, row builder, container wrapper. Row accessors are strict by default (null/wrong-type cells raise JsonMappingException) so a server-side dropped column cannot silently degrade to sentinel values.
  • RequestHeadersDeserializer, UserDeserializer — hand-written deserializers for the two wire shapes that aren't parallel-arrays.

Resources

  • UtilitiesResource — sync + async pair (ADR-006) for each of /status/ / /user/ / /headers/ (all unversioned per the backend's URL conf); package-private constructor (ADR-007); registers its wire-format SimpleModule on the parser at construction time; package-private validateAuth() intent-named for the startup probe.

Exceptions (§6, ADR-002)

  • MarketDataException — sealed base, 7 permits: AuthenticationError, BadRequestError, NotFoundError, RateLimitError, ServerError, NetworkError, ParseError.
  • ErrorContext — record: requestId?, requestUrl, statusCode, timestamp. Factories forResponse(...) / forNoResponse(...).
  • getSupportInfo() — multi-line dump with America/New_York timestamp and exception-type tag, ready to paste into a support ticket.
  • getRequestUrl() returns the query-redacted URL (§16); getContext().requestUrl() exposes the raw URL for discretionary diagnostic use.

Logging (§7, §16)

  • MarketDataLogging.configure(...) — installs CanonicalLogFormatter ({timestamp} - {logger} - {level} - {message}) on the SDK root logger (com.marketdata.sdk). First-install-wins; consumer-pre-config detection backs off entirely if the consumer already attached a handler or set a level (idempotency flag not latched on the skip, so the SDK can claim the slot later).
  • CanonicalLogFormatter — JUL Formatter enforcing the spec shape.
  • MARKETDATA_LOGGING_LEVEL env var — DEBUG/INFO/WARNING/ERROR (case-insensitive); unrecognized values fall back to INFO with a logged warning.
  • Query-string redaction (HttpDispatcher.safeUri) — log lines never persist ?token=, ?account_id=, ?symbol=; the full URI stays available on ErrorContext for consumer-side diagnostics.
  • Token redaction (Tokens.redact) — ≤8 chars → fully masked; >8 chars → ***…***ABCD (enough to disambiguate, not enough to use).

Packaging & CI

  • build.gradle.ktsjava-library, maven-publish, Spotless, JaCoCo, Vanniktech Maven Publish. -PtestJdk=N toolchain hook applies uniformly to test and integrationTest.
  • .github/workflows/ (existing — not in this diff, but exercised by it): pull-request.yml (JDK 17 only), main.yml (matrix {17, 21, 25} + integration tests on push to main), pr-matrix-on-demand.yml (slash command), pr-integration-on-demand.yml (slash command, branch-protection check).

What's deliberately not in this PR

(Copied from CLAUDE.md "Deliberately deferred"; each item requires the per-resource layer to land first.)

  • §1.2 resource groupings — only client.utilities() is wired today; client.stocks, client.options, client.funds, client.markets are TODO.
  • §2 endpoint method coverage and §3 universal parameters beyond utilities.
  • §11 wire-format decoding for resources beyond utilities. The plumbing (ParallelArrays.zip, JsonResponseParser, Response<T>) is in place — each new endpoint just declares its fields and row builder.
  • §8 per-request rate-limit attachment to Response<T>. Today the snapshot is client-level (via client.getRateLimits()).
  • §13 100 % coverage threshold via JaCoCo violationRules. Deferred until the resource layer lands so the ratchet meaningfully gates functional code, not scaffolding.

Diff at a glance

92 files changed, +8,533 −3,143
Area Files Notes
src/main 53 The whole SDK foundation. ~3,700 LoC in ~47 files.
src/test 34 One unit test class per source class, plus end-to-end (HttpTransportE2ETest).
src/integrationTest 2 Live-API smoke (gated by MARKETDATA_TOKEN).
build.gradle.kts 1 Plugins + -PtestJdk wiring.
CLAUDE.md 1 Onboarding context for AI coding assistants.
.gitignore 1

The branch carries 54 commits — many small "after review" / "fix" commits. The final shape is what matters; the commit history is incremental, not a logical squash.

How to review this

Open docs/REFACTOR_REVIEW_GUIDE.md alongside the diff. The guide is structured by flow, not by file, so a reviewer can pick the slice that interests them:

  1. SDK topology — class inventory by layer, dependency arrows, the ADR-007 internal boundary.
  2. Sync request flow end-to-end — a client.utilities().status() call traced through every class it touches. Mermaid sequence diagram. Start here if you've never read this codebase.
  3. Construction flow — what happens inside new MarketDataClient(...): cascade, validation, logger, partial-construction guard, startup probe. Mermaid flowchart.
  4. Retry, Retry-After, and StatusCache — the §9 contract, with all the cap / bypass / synchronous-fetcher corners that look weird but exist for a reason. Mermaid flowchart.
  5. Rate-limit tracking + preflight — §8 and §10.3 with the server-hinted-retry bypass.
  6. Concurrency (AsyncSemaphore) — invariants, why not j.u.c.Semaphore, FIFO drain on close.
  7. Response<T> + JSON parsing — surface, defensive copies, ParallelArrays.zip and the strict Row accessors.
  8. Sealed exception hierarchy — 7 permits, ErrorContext, getSupportInfo(), redacted vs raw URL.
  9. Configuration & logging — cascade rungs, token redaction, consumer-pre-config detection, warning replay.
  10. Subtle corners (issue-driven) — 14 spots a reviewer would otherwise rabbit-hole on, each one with a file:line citation and a one-paragraph "what's weird / why" note.

If you only have 30 minutes, read sections 2 → 4 → 10 in that order — that covers the load-bearing flows and the things most likely to look wrong on first read but aren't.

Test plan

The repo's Makefile wraps both ./gradlew and the demos under examples/. Run make help for the full list.

Unit tests + build

  • make build — unit tests + Spotless + JaCoCo on JDK 17.
  • ./gradlew -PtestJdk=21 test — unit tests on JDK 21. (No Make wrapper; -PtestJdk=N is passed to gradlew directly.)
  • ./gradlew -PtestJdk=25 test — unit tests on JDK 25.
  • PR-on-demand integration tests via /integrationtest (JDK 17) or /integrationtestfull (matrix) — exercises the live API.

Consumer demos (run make publish once first; the mock-server-based ones need make mock-server in a separate terminal)

  • make demo-quickstart — idiomatic per-resource usage tour (today: utilities; grows with each new resource).
  • make demo-live — happy path against api.marketdata.app.
  • make demo-config — cascade + validation + redaction (needs mock server).
  • make demo-exceptions — sealed-hierarchy round-trip (needs mock server).
  • make demo-retry — backoff, Retry-After, preflight (needs mock server).
  • make demo-responseResponse<T> surface (needs mock server).
  • make demo-concurrency — 50-permit cap end-to-end (needs mock server).
  • make demos-all — all five mock-server-based demos back-to-back.

Out of scope / known caveats

  • NotFoundError is currently unreachable. HttpTransport.routeAndEnvelope short-circuits all 404 responses to the success-envelope branch (the API's no-data convention), so the case 404 -> NotFoundError(...) in HttpStatusMapper never executes. The permit stays in the sealed hierarchy because ADR-002 fixes the 7 canonical permits — removing it would be an amendment, not a silent code change. Re-introducing reachability requires the SDK-side decision: only treat 404 as no-data when the body carries s:"no_data"; otherwise route through NotFoundError. Tracked for a follow-up.
  • CSV / HTML formats are not exercised in this PR. The isCsv()/isHtml() predicates work; no current utility endpoint negotiates a non-JSON format, so the path is in the surface but unproven end-to-end. Future resources that emit CSV will close the gap.
  • RetryPolicy customization is not yet a public API. Today a consumer can't pass a custom policy; the only public knobs are the policy defaults and the constructor's validateOnStartup flag. Promoting RetryPolicy (or a builder around it) to public requires deciding the shape — defer until a real consumer asks.
  • Markets / stocks / options / funds endpoints are removed from main in this PR. A prior MarketsResource was an early shape that didn't fit the new foundation; rebuilding it cleanly on top of Response<T> / ParallelArrays is the next per-resource PR.

@socket-security

socket-security Bot commented May 26, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedpypi/​uvicorn@​0.32.098100100100100
Addedpypi/​fastapi@​0.115.0100100100100100

View full report

@codecov

codecov Bot commented May 26, 2026

Copy link
Copy Markdown

The author of this PR, MarketDataDev03, is not an activated member of this organization on Codecov.
Please activate this user on Codecov to display this PR comment.
Coverage data is still being uploaded to Codecov.io for purposes of overall coverage calculations.
Please don't hesitate to email us at support@codecov.io with any questions.

@MarketDataDev03 MarketDataDev03 self-assigned this May 26, 2026
@MarketDataDev03 MarketDataDev03 requested review from MarketDataDev01 and removed request for MarketDataDev00 May 26, 2026 18:29
@MarketDataDev03 MarketDataDev03 marked this pull request as ready for review May 26, 2026 18:30
Function<String, @Nullable String> env,
Path dotEnvPath,
Consumer<DotEnvLoader.Warning> warnings) {
Map<String, String> dotEnv = DotEnvLoader.load(dotEnvPath, warnings, EnvVars.ALLOWED_KEYS);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check for inline comments in env file not being ignored, it would be nice to add that feature

public record RequestHeaders(Map<String, String> headers) {

public RequestHeaders {
headers = Map.copyOf(headers);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null validation to avoid NullPointerException

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only caller of new RequestHeaders(...) in main code is RequestHeadersDeserializer, which already guarantees a non-null map (via getNullValue override + raw == null guard). Defaulting null to Map.of() or remapping it to IllegalArgumentException would protect no real path — it would just obfuscate future @NullMarked contract violations.
Since the constructor can't actually fail today, I think it's better to fail loud and fast on any future violation than to silently swallow it.

public RequestHeaders { Objects.requireNonNull(headers, "headers"); headers = Map.copyOf(headers); }

@MarketDataDev03 MarketDataDev03 merged commit 77489b8 into main May 28, 2026
5 checks passed
@MarketDataDev03 MarketDataDev03 deleted the clean-architecture-restart branch May 28, 2026 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants