Skip to content

feat(router): entity caching with L1/L2, shadow mode, and per-request cache controls (3/6)#2955

Draft
SkArchon wants to merge 10 commits into
milinda/entity-caching-1b-demofrom
milinda/entity-caching-2-router
Draft

feat(router): entity caching with L1/L2, shadow mode, and per-request cache controls (3/6)#2955
SkArchon wants to merge 10 commits into
milinda/entity-caching-1b-demofrom
milinda/entity-caching-2-router

Conversation

@SkArchon

@SkArchon SkArchon commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Replaces #2945, which was auto-closed as merged when a bad force-push momentarily collapsed the stack branches onto one commit. Same content, same stack position.

Part 2 of 4 — split out from #2777 (jensneuse/entity-caching-v2). Stacked on #2944.

What's included

  • router/pkg/entitycache: in-memory L1, Redis L2, circuit breaker
  • router/core: factoryresolver mapping of proto cache config to planner metadata, graph server wiring, per-request cache-control headers (graphql_handler), plan generator nil guard, config schema + defaults
  • metrics/otel: entity cache metrics (opt-in telemetry) and attributes
  • graphql-go-tools bumped to PR feat: add caching to loader graphql-go-tools#1259 HEAD (resolver/planner side of the feature) — the two must land together
  • router-tests/entity_caching: full integration suite with self-contained gqlgen subgraphs, Redis tests, and a compose tool for testdata/config.json
  • demo: cachegraph/cachegraph_ext/viewer subgraphs, cache-demo runner, latency injector, router cache configs (demo go.mod moves in lockstep with router deps; demo is path-replaced into router-tests builds)

Excluded

router/internal/graphiql/graphiql.html (playground embed) ships in part 3/4 — this PR runs with the previous embedded playground.

Verification

go build ./... passes in router/ and demo/; go vet ./entity_caching/... passes in router-tests/.

🤖 Generated with Claude Code

Update: test subgraphs coupled to demo

The six entity_caching test subgraphs were moved into demo/pkg/subgraphs/cachetest/ and are imported by the suite — same coupling pattern as the main suite (testenvdemo/pkg/subgraphs). router-tests no longer carries subgraph implementations. Pure move: schemas/resolvers/data unchanged, make compose reproduces an identical testdata/config.json, full suite green.

Diff cleanup

  • The no-op-tracer-provider optimization (HasNonDefaultExporter + bootstrap guard) moved to the benchmark/tooling PR (6/6) — it is a general tracing perf change motivated by benchmarking, not entity caching.
  • Reverted ~1.5k lines of demo/pkg/subgraphs/projects/generated proto churn — leftover reserved markers from fields added and removed during development; the GraphQL schema is unchanged.
  • Dropped router/entity-caching.config.yaml (unreferenced sample) and router-tests/entity_caching/graph.yaml (duplicated the subgraph list that cmd/compose owns; nothing consumed it).
  • The gqlgen generate target moved to demo/Makefile (generate-cachetest) next to the subgraphs it regenerates.

Rebased onto main (2026-06-10)

  • graphql-go-tools re-pinned to 63fa1c88 — PR feat: add caching to loader graphql-go-tools#1259 with v2.4.4 merged in (branch entity-caching-v244-merge; conflicts in resolve/context.go + resolve/resolvable.go resolved keeping caching fields + the actualListSizestypeNameStats rename). Needed because main's router uses v2.4.4 APIs (TypeNameStats) that the PR head lacked.
  • All subscription-overhaul adaptation shims dropped in favor of main's versions (websocket, flushwriter, executor, updater, mocks, errors, operation_processor, plan_generator, headers_test).
  • One astjson adaptation the other way: the caching arena rework makes MergeValuesWithPath return 2 values; main's flushwriter call site adapted.
  • cmd/compose replaced with a wgc-based Makefile target (+ restored graph.yaml input) since composition-go was removed on main (chore: remove composition-go #2830); testdata/config.json regenerated with main's composition.
  • entity_caching suite green throughout (incl. the WS subscription populate/invalidate tests, which depend on ExecutionOptions.Caching wiring in the WS path).

Summary by CodeRabbit

  • New Features

    • Added support for entity, query, and request-scoped caching across the demo router and subgraphs.
    • Introduced new cache-focused demo endpoints and router configurations for cache-only and cache test scenarios.
    • Added a standalone cache demo server and viewer support.
  • Bug Fixes

    • Updated configuration handling so cache settings are mapped correctly and new fields are recognized.
  • Tests

    • Added broad coverage for cache behavior, including Redis, in-memory caching, cache invalidation, and concurrency scenarios.

Milinda Dias and others added 3 commits June 11, 2026 00:19
… generated bindings

Extracted from jensneuse/entity-caching-v2 (PR #2777) — composition + proto layer only.

- composition: @openfed__entityCache / @openfed__queryCache / @openfed__requestScoped
  directive definitions, extraction in normalization-factory, ConfigurationData types,
  warnings, and tests (incl. fuzz + mapping rules)
- proto: additive DataSourceConfiguration fields 17-21 and new messages
  (EntityCacheConfiguration, RootFieldCacheConfiguration, EntityKeyMapping,
  EntityCacheFieldMapping, CachePopulateConfiguration, CacheInvalidateConfiguration,
  RequestScopedFieldConfiguration)
- regenerated bindings: router/gen, connect-go, connect (incl. graphqlmetrics regen
  side-effects), composition-go bundle
- shared: router-config serializer for the new proto fields

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…t subgraphs

Extracted from jensneuse/entity-caching-v2 (PR #2777) — demo layer, stacked on
the composition+proto PR.

- cachegraph / cachegraph_ext / viewer: demo subgraphs exercising the caching
  directives (mutable data store, latency injector, auth profiles)
- cmd/cache-demo: standalone cache-only runner (ports 4012-4014)
- pkg/subgraphs/cachetest: the six deterministic subgraphs consumed by the
  router-tests entity_caching suite (articles, articlesmeta, details,
  inventory, items, viewer)
- graph-cache-only.yaml + config-cache-only.json + Makefile compose targets

Builds and tests green against the base router (no router changes required).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… cache controls

Extracted from jensneuse/entity-caching-v2 (PR #2777) — router layer, stacked on
the demo PR. Entity cache OTEL/Prometheus metrics are split into the next PR.

- router/pkg/entitycache: in-memory L1, Redis L2, circuit breaker
- router/core: factoryresolver mapping of proto cache config to planner metadata,
  graph server wiring, per-request cache-control headers (incl. WS subscriptions),
  EntityCacheKeyInterceptor module hook
- graphql-go-tools pinned to 63fa1c88 (PR #1259 + v2.4.4 merged via
  entity-caching-v244-merge branch) — this also moves astjson to the caching
  arena rework (MergeValuesWithPath now returns 2 values; flushwriter adapted)
- router-tests/entity_caching: integration suite consuming demo's cachetest
  subgraphs; config composed via wgc (composition-go was removed on main),
  testdata/config.json regenerated with main's composition
- subscription-overhaul adaptation shims from the old base were dropped in
  favor of main's versions (websocket, flushwriter, executor, updater, mocks)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc5033c2-a911-47bc-aad1-94c177957737

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

This PR implements multi-layer (L1 per-request, L2 cross-request) entity caching in the Cosmo router. It extends the protobuf contract with new cache configuration message types, introduces in-memory (ristretto), Redis, and circuit-breaker cache backends, wires entity caching through the router core (executor, factoryresolver, GraphQL handler, graph server), adds demo subgraphs exercising all @openfed__entityCache/@openfed__queryCache/@openfed__requestScoped directives, and provides a full integration test harness.

Changes

Entity Caching Infrastructure

Layer / File(s) Summary
Proto/TS contract: EntityCachingConfiguration rename and new message types
proto/wg/cosmo/node/v1/node.proto, connect/src/wg/cosmo/node/v1/node_pb.ts, shared/src/router-config/builder.ts
Renames entity_cache_configurationsentity_cache, adds request_scoped_configurations (tag 4) and query_cache_configurations (tag 5) to EntityCachingConfiguration, and introduces four new messages: RequestScopedConfiguration, QueryCacheConfiguration, EntityKeyMapping, EntityCacheFieldMapping. Fixes the entityCache property assignment in the shared TS builder.
Router config model, JSON schema, and validation
router/pkg/config/config.go, router/pkg/config/config.schema.json, router/pkg/config/config_test.go, router/pkg/config/testdata/*
Adds MemoryStorageProvider, EntityCachingConfiguration (L1/L2/circuit-breaker/subgraph-override structs), and FileSystemStorageProvider to router config. Extends JSON schema with storage_providers.memory and a full entity_caching block. Adds validateMemoryProviderUsage to reject memory providers in non-entity-caching storage contexts. Updates default and full config test fixtures.
Cache backends: MemoryEntityCache, RedisEntityCache, CircuitBreakerCache, and tests
router/pkg/entitycache/memory.go, router/pkg/entitycache/redis.go, router/pkg/entitycache/circuit_breaker.go, router/pkg/entitycache/*_test.go
Implements three resolve.LoaderCache backends: ristretto-backed memory cache with atomic item counter and TTL, pipelined Redis cache with key prefixing, and a circuit-breaker wrapper with atomic closed/open/half-open state machine. Comprehensive unit and concurrency tests using miniredis cover all three backends.
Router core entity caching wiring
router/core/modules.go, router/core/router_config.go, router/core/graphql_handler.go, router/core/executor.go, router/core/factoryresolver.go, router/core/graph_server.go, router/core/router.go, router/core/supervisor_instance.go, router/core/websocket.go, router/core/*_test.go
Defines EntityCacheKeyInterceptor interface; extends Config with cache instances and interceptors; GraphQLHandler derives per-request CachingOptions from handler config and trace headers; executor.go passes EntityCachingConfig into planner and builds invalidation config map; factoryresolver.go maps entity/query/mutation/subscription/request-scoped cache configs into DataSourceMetadata.FederationMetaData; graph_server.go builds cache instances, wires interceptors, and registers metrics; router.go constructs Redis/memory cache instances with optional circuit-breaker and closes them on shutdown.
Demo cachegraph primary subgraph
demo/pkg/subgraphs/cachegraph/**
Introduces the main demo subgraph with a federation v2.5 schema using all OpenFed cache directives across entity types (Article, Listing, Venue, UserProfile, Catalog, Metric). Implements mutex-guarded articleStore and listingStore with clone-safe concurrent access, auth-header-aware UserProfile resolver, and race-detector tests.
Demo cachegraph_ext and viewer (production) subgraphs
demo/pkg/subgraphs/cachegraph_ext/**, demo/pkg/subgraphs/viewer/**
Adds cachegraph_ext providing relatedArticles and personalizedRecommendation via @requires, and partial-cache-load Catalog testing. Adds production viewer subgraph with @openfed__requestScoped directive, Authorization-header-based viewer resolution, and WithHTTPRequest context helper. Both wired into subgraphs.go with latency injection.
Demo cachetest subgraphs
demo/pkg/subgraphs/cachetest/**
Adds six gqlgen subgraphs for the entity-caching integration test suite: items (mutation/subscription cache populate/invalidate, composite product keys, @openfed__is), details (multi-key Item/Product/Warehouse), inventory (availability), articles (@requires currentViewer personalization), articlesmeta (article metadata federation), and cachetest/viewer (@openfed__requestScoped with @interfaceObject Personalized).
Demo configs, CLI, and subgraph HTTP wiring
demo/Makefile, demo/cmd/cache-demo/main.go, demo/cmd/all/main.go, demo/graph*.yaml, demo/router-cache*.yaml, demo/config-cache-only.json, demo/pkg/injector/latency.go, demo/pkg/subgraphs/subgraphs.go, demo/go.mod
Adds compose-cache/compose-cachetest/generate-cachetest Makefile targets, cache-demo standalone server, new cachegraph/cachegraph_ext/viewer port flags, graph and router YAML configs, full engine config JSON, injector.Latency middleware, and extends Ports struct and New constructor.
Integration tests, testenv, tracing fixture, and miscellaneous fixes
router-tests/entitycaching/**, router-tests/entity_caching_standard_subgraphs_test.go, router-tests/testenv/testenv.go, router-tests/protocol/testdata/tracing.json, router-tests/go.mod, router/core/flushwriter.go, router/pkg/graphqlschemausage/*, router/go.mod
Adds entitycaching test harness with httptest subgraph servers, FailingEntityCache, ControllableCache, config template, cache option builders, invalidation extension middleware, and Redis subtests. Adds entity_caching_standard_subgraphs_test.go exercising L2 miss/hit, key separation, shadow mode, and list queries. Updates testenv to retry HTTP 501. Adds cache_trace fields to tracing fixture. Fixes astjson.MergeValues call-site return values in flushwriter and schemausage.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • wundergraph/cosmo#2980: Adds @openfed__entityCache normalization that populates EntityCachingConfiguration.entityCacheConfigurations — the field this PR renames to entityCache and extends with request/query cache sections.
  • wundergraph/cosmo#2986: Implements @openfed__queryCache and @openfed__is directives that produce the QueryCacheConfiguration/EntityKeyMapping/EntityCacheFieldMapping proto messages introduced in this PR.
  • wundergraph/cosmo#2996: Directly modifies the same EntityCachingConfiguration field naming in node_pb.ts/node.proto (entity_cacheentity_cache_configurations) that this PR also updates.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main router entity-caching feature and key capabilities introduced in the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

Comment @coderabbitai help to get the list of available commands.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown

❌ Internal Query Planner CI checks failed

The Internal Query Planner CI checks failed in the celestial repository, and this is going to stop the merge of this PR.
If you are part of the WunderGraph organization, you can see the PR with more details.

@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 9.80736% with 515 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.33%. Comparing base (15b3265) to head (9a6df7b).
⚠️ Report is 1 commits behind head on milinda/entity-caching-1b-demo.

Files with missing lines Patch % Lines
router/core/factoryresolver.go 5.50% 98 Missing and 5 partials ⚠️
router/pkg/entitycache/circuit_breaker.go 0.00% 78 Missing ⚠️
router/core/graph_server.go 12.94% 69 Missing and 5 partials ⚠️
router/core/router.go 8.97% 68 Missing and 3 partials ⚠️
router/pkg/entitycache/memory.go 0.00% 65 Missing ⚠️
router/pkg/entitycache/redis.go 0.00% 44 Missing ⚠️
router/pkg/config/config.go 0.00% 32 Missing ⚠️
router/core/executor.go 31.70% 28 Missing ⚠️
router/core/graphql_handler.go 47.05% 13 Missing and 5 partials ⚠️
router/core/supervisor_instance.go 0.00% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@                        Coverage Diff                         @@
##           milinda/entity-caching-1b-demo    #2955      +/-   ##
==================================================================
- Coverage                           46.48%   44.33%   -2.15%     
==================================================================
  Files                                1121     1108      -13     
  Lines                              155128   155099      -29     
  Branches                            10065    10065              
==================================================================
- Hits                                72107    68768    -3339     
- Misses                              81185    84452    +3267     
- Partials                             1836     1879      +43     
Files with missing lines Coverage Δ
router/core/flushwriter.go 77.56% <100.00%> (-3.21%) ⬇️
router/core/modules.go 47.72% <ø> (-20.46%) ⬇️
router/core/router_config.go 93.82% <ø> (ø)
router/core/websocket.go 76.26% <100.00%> (-0.51%) ⬇️
router/core/supervisor_instance.go 0.00% <0.00%> (ø)
router/gen/proto/wg/cosmo/node/v1/node.pb.go 26.88% <50.00%> (+0.32%) ⬆️
router/core/graphql_handler.go 50.78% <47.05%> (-14.24%) ⬇️
router/core/executor.go 74.69% <31.70%> (-14.37%) ⬇️
router/pkg/config/config.go 20.31% <0.00%> (-61.98%) ⬇️
router/pkg/entitycache/redis.go 0.00% <0.00%> (ø)
... and 5 more

... and 134 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@SkArchon SkArchon force-pushed the milinda/entity-caching-1b-demo branch from 5ef33a6 to 243a7a4 Compare June 22, 2026 17:32
SkArchon added 5 commits June 22, 2026 23:09
…hing-2-router

Brings the reworked entity-caching composition/proto format (from the
queryCache-directive stack via the demo branch) into the router branch.
Composition and proto files resolved to the new format; router runtime
code is adjusted to the new proto format in a follow-up commit.
The queryCache-directive rework nests all entity-caching config under a
single EntityCachingConfiguration on DataSourceConfiguration, and renames
RootFieldCacheConfiguration -> QueryCacheConfiguration and
RequestScopedFieldConfiguration -> RequestScopedConfiguration. Route the
router config builder through DataSourceConfiguration.GetEntityCachingConfiguration()
and its typed getters, and handle CachePopulateConfiguration.max_age_seconds
now being a plain int64 (was optional *int64).
The entity-caching graphql-go-tools fork pinned by this branch predates
plan.Configuration.IgnoreImplementingTypeWeights (added in released v2.5.0),
while the released v2.5.0 lacks the entity-cache resolve.LoaderCache/CacheEntry
APIs this branch needs. Guard the CostControl assignment with a TODO until the
entity-caching fork is rebased onto v2.5.0; the toggle is a no-op until then.
Nest entity-caching test fixtures under DataSourceConfiguration.EntityCachingConfiguration,
rename RootFieldCacheConfiguration->QueryCacheConfiguration and
RequestScopedFieldConfiguration->RequestScopedConfiguration, and pass
CachePopulateConfiguration.max_age_seconds as a plain int64 (was *int64).
…g-2-router

# Conflicts:
#	composition/src/directive-definition-data/directive-definition-data.ts
#	composition/src/errors/errors.ts
#	composition/src/router-configuration/types.ts
#	composition/src/v1/constants/constants.ts
#	composition/src/v1/constants/directive-definitions.ts
#	composition/src/v1/normalization/normalization-factory.ts
#	composition/src/v1/normalization/types/types.ts
#	composition/src/v1/normalization/utils.ts
#	composition/tests/v1/directives/cache-populate.test.ts
#	connect-go/gen/proto/wg/cosmo/node/v1/node.pb.go
#	connect/src/wg/cosmo/node/v1/node_pb.ts
#	proto/wg/cosmo/node/v1/node.proto
#	router/gen/proto/wg/cosmo/node/v1/node.pb.go
#	shared/src/router-config/builder.ts
@SkArchon SkArchon changed the base branch from milinda/entity-caching-1b-demo to main June 30, 2026 14:50

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🧹 Nitpick comments (3)
demo/Makefile (1)

47-49: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Normalize config-cache-only.json the same way as compose-cachetest.

Right now this target writes a one-line artifact, which makes the committed demo/config-cache-only.json diff hard to review and hides contract regressions much more easily.

♻️ Proposed change
 compose-cache:
 	$(wgc_router_ci) compose -i ./graph-cache-only.yaml -o ./config-cache-only.json
+	jq . ./config-cache-only.json > ./config-cache-only.json.tmp
+	mv ./config-cache-only.json.tmp ./config-cache-only.json
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@demo/Makefile` around lines 47 - 49, The compose-cache target is generating
an unformatted one-line config-cache-only.json artifact, unlike
compose-cachetest. Update the compose-cache rule in the Makefile to run the same
normalization step used by compose-cachetest after calling $(wgc_router_ci)
compose, so demo/config-cache-only.json is pretty-printed and reviewable.
router/core/graphql_handler_caching_options_test.go (1)

30-172: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add coverage for KeyInterceptors.

This suite locks down the flag and prefix branches, but it never exercises the new buildL2CacheKeyInterceptor() path. A small test with one or two fake EntityCacheKeyInterceptors would catch regressions in cache-key rewriting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@router/core/graphql_handler_caching_options_test.go` around lines 30 - 172,
Add a test in the GraphQL handler caching options suite that exercises the new
KeyInterceptors path by verifying cachingOptions() wires in the
L2CacheKeyInterceptor produced by buildL2CacheKeyInterceptor(). Use a new
caching-options handler setup with one or two fake EntityCacheKeyInterceptor
implementations, then assert the returned resolve.CachingOptions includes a
non-nil L2CacheKeyInterceptor and that it rewrites keys as expected. Keep the
existing flag/prefix coverage intact and place the new assertions near the other
TestGraphQLHandler_cachingOptions_* cases.
router/pkg/entitycache/circuit_breaker_test.go (1)

256-266: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use sync.WaitGroup.Go in these test goroutine fan-outs.
These blocks can drop the manual Add/Done bookkeeping and use wg.Go(...) instead, which keeps the accounting in one place and matches the Go 1.25 concurrency style.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@router/pkg/entitycache/circuit_breaker_test.go` around lines 256 - 266, The
goroutine fan-out in the circuit breaker test still uses manual WaitGroup
bookkeeping, which should be updated to the newer style. In the test block that
launches the `cb.Get` calls, replace the explicit `wg.Add(1)`/`defer wg.Done()`
pattern around the anonymous goroutine with `wg.Go(...)` on the same
`sync.WaitGroup` variable, so the launch and completion accounting are handled
in one place and the test matches the Go 1.25 concurrency style.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@demo/cmd/cache-demo/main.go`:
- Around line 4-19: Set sensible timeouts on the demo HTTP servers used in
cache-demo main so slow or stalled clients cannot hold connections open
indefinitely. Update the server setup in main and the related server
construction for the cache demo handlers (including the gqlgen handler and
playground listener) to configure read, header, write, and idle timeouts on the
http.Server instances.

In `@demo/pkg/injector/latency.go`:
- Around line 19-24: The Latency middleware in Latency should stop blocking on
time.Sleep after the request is canceled. Update the artificial delay path to
wait on the request context from r.Context() alongside the timer, and return
early if the context is done so canceled requests do not keep goroutines pinned.
Keep the existing validation and logging, but make the sleep in Latency
cancellation-aware.

In `@demo/pkg/subgraphs/cachegraph/subgraph/data.go`:
- Around line 226-233: Both listingStore.all() and allVenues() currently range
over maps directly, so their returned slices can come back in different orders
between calls. Update these helpers to collect items and sort them before
returning, matching the stable ordering already used by allCatalogs(). Use the
existing clone-and-return pattern in listingStore.all() and the venue listing
path in allVenues() so callers get deterministic demo responses.
- Around line 143-156: The articleStore.byIDs helper currently deduplicates IDs
into a set and iterates over s.articles, which reorders results instead of
preserving the caller’s requested sequence. Update byIDs to return articles in
the same order as the input ids slice (while still cloning matches and handling
missing IDs), and keep the lookup logic centered around articleStore, byIDs, and
cloneArticle so query/resolver callers receive ordered results.

In `@demo/pkg/subgraphs/cachetest/articles/subgraph/data.go`:
- Around line 46-65: The article fixture lookup currently treats missing IDs as
zero-value articles, so invalid IDs appear to resolve successfully. Update
buildArticle to fail fast when articleFixtures[id] or any related fixture is
missing, returning an error instead of constructing a partial Article, and
change FindArticleByID to propagate that error back to the caller. Keep the fix
localized around buildArticle and FindArticleByID so cache-test fixture/key
mismatches surface immediately.

In `@demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/entity.resolvers.go`:
- Around line 15-23: Update entityResolver.FindArticleByID so it checks whether
the id exists in articleMetaFixtures before building the Article; if the lookup
fails, return nil (and no Article) instead of fabricating a zero-valued
model.Article. Keep the existing field mapping for the success path, but gate it
behind the ok check so unknown keys are treated as not found.

In `@demo/pkg/subgraphs/cachetest/items/subgraph/data.go`:
- Around line 11-15: The item ID generator is still package-global, so parallel
resolvers/tests can share mutable state and produce nondeterministic IDs. Move
ID allocation into the store instance created by newItemStore() and have
create() use that store-local counter instead of the package-level nextID.
Update the related item store/resolver flow in data.go and the associated
create/newItemStore paths so each resolver gets its own deterministic sequence.

In `@demo/pkg/subgraphs/cachetest/items/subgraph/schema.resolvers.go`:
- Around line 24-29: The non-blocking send in the item update cache event path
silently drops invalidation/population notifications when the channel is full or
not ready, which breaks the cache-driven flows. Update the event emission in the
resolver methods that use ItemUpdatedCh so the mutation does not silently
succeed without delivering the event; instead make the send reliable or
explicitly handle/report the failure. Apply the same fix in all affected
resolver blocks that publish cache events.

In `@demo/router-cache.yaml`:
- Around line 24-51: The router-cache demo forwards X-Artificial-Latency to
subgraphs but does not whitelist it in CORS, so browser preflight fails for that
header. Update the CORS/header allowlist in the demo router config to mirror the
behavior in demo/router-cache-cp.yaml, and keep the change aligned with the
existing headers section and any related CORS settings in this config.

In `@router-tests/entitycaching/harness_test.go`:
- Around line 500-510: The test helper in harness_test.go is overwriting any
existing GraphQL extensions when injecting cacheInvalidation, so update the
response-building logic to preserve and merge the wrapped handler’s current
extensions payload instead of replacing it. In the modified response assembly
near the ext/resp handling, read any existing extensions from
resp["extensions"], decode or reuse them, add the cacheInvalidation entry, then
re-marshal the combined extensions back onto the response before marshaling
modified.

In `@router-tests/entitycaching/testdata/config.json`:
- Around line 119-306: Update the embedded RouterConfig fixture to use the
renamed proto JSON keys instead of the stale cache field names. In this file,
replace the old datasource-style names under the config object that feeds
nodev1.RouterConfig, including the sections currently labeled
entityCacheConfigurations, rootFieldCacheConfigurations, and
requestScopedFields, with the new entity/query/request-scoped cache field names
so unmarshaling continues to work after the proto contract change. Use the
surrounding RouterConfig fixture structure to locate the affected blocks and
keep the nested cache mappings unchanged aside from the top-level key renames.

In `@router-tests/testenv/testenv.go`:
- Around line 753-768: The 501 retry override is only being applied in the
TLS-specific branch, while the shared retryable client setup still uses the
default policy. Update the client creation path in the relevant test setup
functions so the retry override is installed in the common/shared retry client
initialization, then reuse that helper in both the non-TLS setup and the TLS
override branch. Reference the retry client setup around the
retryClient.CheckRetry override in the affected testenv helpers to keep both
paths consistent.

In `@router/core/executor.go`:
- Around line 268-272: The
`security.cost_control.ignore_implementing_type_weights` setting is currently
defined but not applied in `executor.go`, so it becomes a silent no-op. Update
the planner wiring in `routerEngineCfg`/`planConfig` handling to either reject
this config when the pinned `graphql-go-tools` fork cannot support
`plan.Configuration.IgnoreImplementingTypeWeights`, or restore the assignment
behind a clear version check so the option actually takes effect. Keep the
change centered around `executor.go` and the `planConfig` setup.

In `@router/core/graph_server.go`:
- Around line 239-243: The startup path in graph_server.go leaks partially
initialized entity cache resources because buildEntityCacheInstances() succeeds
before later setup failures can return. Add the same cleanup-on-error guard
already used elsewhere in GraphServer initialization: after assigning
s.entityCacheInstances, ensure every subsequent error path releases those
instances before returning. Use the existing GraphServer startup flow and the
buildEntityCacheInstances/entityCacheInstances symbols to place the cleanup
consistently across later fallible steps.
- Around line 1070-1080: The shared entity-cache metrics are being collected
inside configureCacheMetrics via buildGraphMux, so the same
srv.entityCacheInstances get exported multiple times for each mux. Move the
entityMetricSources / entityCacheMetricRegistrations path to server-level setup
in newGraphServer, or ensure buildGraphMux only registers them for the base mux,
so shared caches are emitted once and not duplicated across feature-flag muxes.

In `@router/core/router.go`:
- Around line 2035-2042: The shutdown loop in router cleanup is closing aliased
cache instances twice because buildEntityCacheInstances() stores the same cache
under both "default" and the provider_id. Update the close logic around
r.entityCacheInstances to deduplicate instances before calling
io.Closer.Close(), using the cache identity rather than the map key so each
underlying cache is closed only once. Keep the change localized to the
entity-cache shutdown path and preserve the existing error aggregation via
err.Append.

In `@router/pkg/entitycache/circuit_breaker.go`:
- Around line 44-48: `NewCircuitBreakerCache` currently narrows
`cfg.FailureThreshold` to `int32` without validation, which can wrap large
values and allows non-positive thresholds to behave incorrectly. Add a bounds
check in `NewCircuitBreakerCache` before assigning `failureThreshold` on
`CircuitBreakerCache`, rejecting or clamping values that are <= 0 or exceed the
`int32` range, and only store a safe validated threshold.

In `@router/pkg/entitycache/memory.go`:
- Around line 69-90: The invalidation path in MemoryEntityCache should not
depend on a prior Get check because queued SetWithTTL writes can make the key
appear absent and skip deletion. Update the Delete method on MemoryEntityCache
to always call c.cache.Del(k) for each key, and keep the len accounting separate
from the delete call so cache invalidation is unconditional.

In `@shared/src/router-config/builder.ts`:
- Around line 136-140: The EntityCachingConfiguration builder is still only
passing entityCache, cacheInvalidateConfigurations, and
cachePopulateConfigurations, so the new request-scoped and query-cache directive
fields are being dropped and can leave the result undefined when they are the
only cache settings. Update the builder path that constructs
EntityCachingConfiguration to include the new cache directive fields alongside
the existing ones, and make sure the aggregation logic in builder.ts preserves
them even when entityCacheConfigurations is absent.

---

Nitpick comments:
In `@demo/Makefile`:
- Around line 47-49: The compose-cache target is generating an unformatted
one-line config-cache-only.json artifact, unlike compose-cachetest. Update the
compose-cache rule in the Makefile to run the same normalization step used by
compose-cachetest after calling $(wgc_router_ci) compose, so
demo/config-cache-only.json is pretty-printed and reviewable.

In `@router/core/graphql_handler_caching_options_test.go`:
- Around line 30-172: Add a test in the GraphQL handler caching options suite
that exercises the new KeyInterceptors path by verifying cachingOptions() wires
in the L2CacheKeyInterceptor produced by buildL2CacheKeyInterceptor(). Use a new
caching-options handler setup with one or two fake EntityCacheKeyInterceptor
implementations, then assert the returned resolve.CachingOptions includes a
non-nil L2CacheKeyInterceptor and that it rewrites keys as expected. Keep the
existing flag/prefix coverage intact and place the new assertions near the other
TestGraphQLHandler_cachingOptions_* cases.

In `@router/pkg/entitycache/circuit_breaker_test.go`:
- Around line 256-266: The goroutine fan-out in the circuit breaker test still
uses manual WaitGroup bookkeeping, which should be updated to the newer style.
In the test block that launches the `cb.Get` calls, replace the explicit
`wg.Add(1)`/`defer wg.Done()` pattern around the anonymous goroutine with
`wg.Go(...)` on the same `sync.WaitGroup` variable, so the launch and completion
accounting are handled in one place and the test matches the Go 1.25 concurrency
style.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f20110f4-4144-448f-bd8a-8bc8a759a8f7

📥 Commits

Reviewing files that changed from the base of the PR and between 31f9596 and a9e3827.

⛔ Files ignored due to path filters (29)
  • connect-go/gen/proto/wg/cosmo/node/v1/node.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • demo/pkg/subgraphs/cachegraph/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachegraph/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/federation.requires.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articles/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articles/subgraph/generated/federation.requires.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articles/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articles/subgraph/generated/staticcheck.conf is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/generated/staticcheck.conf is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/details/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/details/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/details/subgraph/generated/staticcheck.conf is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/items/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/items/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/items/subgraph/generated/staticcheck.conf is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/generated/generated.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/generated/staticcheck.conf is excluded by !**/generated/**
  • demo/pkg/subgraphs/viewer/subgraph/generated/federation.go is excluded by !**/generated/**
  • demo/pkg/subgraphs/viewer/subgraph/generated/generated.go is excluded by !**/generated/**
  • router-tests/go.sum is excluded by !**/*.sum
  • router/gen/proto/wg/cosmo/node/v1/node.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (136)
  • connect/src/wg/cosmo/node/v1/node_pb.ts
  • demo/Makefile
  • demo/cmd/all/main.go
  • demo/cmd/cache-demo/main.go
  • demo/config-cache-only.json
  • demo/go.mod
  • demo/graph-cache-only.yaml
  • demo/graph-cachetest.yaml
  • demo/graph.yaml
  • demo/pkg/injector/latency.go
  • demo/pkg/subgraphs/cachegraph/cachegraph.go
  • demo/pkg/subgraphs/cachegraph/generate.go
  • demo/pkg/subgraphs/cachegraph/gqlgen.yml
  • demo/pkg/subgraphs/cachegraph/subgraph/auth_profiles.go
  • demo/pkg/subgraphs/cachegraph/subgraph/data.go
  • demo/pkg/subgraphs/cachegraph/subgraph/data_race_test.go
  • demo/pkg/subgraphs/cachegraph/subgraph/data_test.go
  • demo/pkg/subgraphs/cachegraph/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachegraph/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachegraph/subgraph/resolver.go
  • demo/pkg/subgraphs/cachegraph/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachegraph/subgraph/schema.resolvers.go
  • demo/pkg/subgraphs/cachegraph_ext/cachegraph_ext.go
  • demo/pkg/subgraphs/cachegraph_ext/generate.go
  • demo/pkg/subgraphs/cachegraph_ext/gqlgen.yml
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/data.go
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/resolver.go
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers.go
  • demo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers_test.go
  • demo/pkg/subgraphs/cachetest/articles/articles.go
  • demo/pkg/subgraphs/cachetest/articles/generate.go
  • demo/pkg/subgraphs/cachetest/articles/gqlgen.yml
  • demo/pkg/subgraphs/cachetest/articles/subgraph/data.go
  • demo/pkg/subgraphs/cachetest/articles/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachetest/articles/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachetest/articles/subgraph/resolver.go
  • demo/pkg/subgraphs/cachetest/articles/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachetest/articles/subgraph/schema.resolvers.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/articlesmeta.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/generate.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/gqlgen.yml
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/data.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/resolver.go
  • demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachetest/details/details.go
  • demo/pkg/subgraphs/cachetest/details/generate.go
  • demo/pkg/subgraphs/cachetest/details/gqlgen.yml
  • demo/pkg/subgraphs/cachetest/details/subgraph/data.go
  • demo/pkg/subgraphs/cachetest/details/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachetest/details/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachetest/details/subgraph/resolver.go
  • demo/pkg/subgraphs/cachetest/details/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachetest/inventory/generate.go
  • demo/pkg/subgraphs/cachetest/inventory/gqlgen.yml
  • demo/pkg/subgraphs/cachetest/inventory/inventory.go
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/data.go
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/resolver.go
  • demo/pkg/subgraphs/cachetest/inventory/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachetest/items/generate.go
  • demo/pkg/subgraphs/cachetest/items/gqlgen.yml
  • demo/pkg/subgraphs/cachetest/items/items.go
  • demo/pkg/subgraphs/cachetest/items/subgraph/data.go
  • demo/pkg/subgraphs/cachetest/items/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachetest/items/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachetest/items/subgraph/resolver.go
  • demo/pkg/subgraphs/cachetest/items/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachetest/items/subgraph/schema.resolvers.go
  • demo/pkg/subgraphs/cachetest/viewer/generate.go
  • demo/pkg/subgraphs/cachetest/viewer/gqlgen.yml
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/data.go
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/resolver.go
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/schema.graphqls
  • demo/pkg/subgraphs/cachetest/viewer/subgraph/schema.resolvers.go
  • demo/pkg/subgraphs/cachetest/viewer/viewer.go
  • demo/pkg/subgraphs/subgraphs.go
  • demo/pkg/subgraphs/viewer/generate.go
  • demo/pkg/subgraphs/viewer/gqlgen.yml
  • demo/pkg/subgraphs/viewer/subgraph/context.go
  • demo/pkg/subgraphs/viewer/subgraph/data.go
  • demo/pkg/subgraphs/viewer/subgraph/entity.resolvers.go
  • demo/pkg/subgraphs/viewer/subgraph/model/models_gen.go
  • demo/pkg/subgraphs/viewer/subgraph/resolver.go
  • demo/pkg/subgraphs/viewer/subgraph/schema.graphqls
  • demo/pkg/subgraphs/viewer/subgraph/schema.resolvers.go
  • demo/pkg/subgraphs/viewer/viewer.go
  • demo/router-cache-cp.yaml
  • demo/router-cache.yaml
  • proto/wg/cosmo/node/v1/node.proto
  • router-tests/entity_caching_standard_subgraphs_test.go
  • router-tests/entitycaching/entitycaching_test.go
  • router-tests/entitycaching/harness_test.go
  • router-tests/entitycaching/redis_test.go
  • router-tests/entitycaching/setup_test.go
  • router-tests/entitycaching/testdata/config.json
  • router-tests/go.mod
  • router-tests/protocol/testdata/tracing.json
  • router-tests/testenv/testenv.go
  • router/core/executor.go
  • router/core/executor_entity_cache_test.go
  • router/core/factoryresolver.go
  • router/core/factoryresolver_entity_cache_test.go
  • router/core/factoryresolver_test.go
  • router/core/flushwriter.go
  • router/core/graph_server.go
  • router/core/graphql_handler.go
  • router/core/graphql_handler_caching_options_test.go
  • router/core/modules.go
  • router/core/router.go
  • router/core/router_config.go
  • router/core/router_entity_cache_test.go
  • router/core/supervisor_instance.go
  • router/core/websocket.go
  • router/go.mod
  • router/pkg/config/config.go
  • router/pkg/config/config.schema.json
  • router/pkg/config/config_test.go
  • router/pkg/config/testdata/config_defaults.json
  • router/pkg/config/testdata/config_full.json
  • router/pkg/entitycache/circuit_breaker.go
  • router/pkg/entitycache/circuit_breaker_test.go
  • router/pkg/entitycache/memory.go
  • router/pkg/entitycache/memory_test.go
  • router/pkg/entitycache/redis.go
  • router/pkg/entitycache/redis_test.go
  • router/pkg/graphqlschemausage/schemausage_bench_test.go
  • router/pkg/graphqlschemausage/schemausage_test.go
  • shared/src/router-config/builder.ts

Comment on lines +4 to +19
import (
"log"
"net/http"
"strconv"

"github.com/99designs/gqlgen/graphql"
"github.com/99designs/gqlgen/graphql/handler"
"github.com/99designs/gqlgen/graphql/handler/transport"
"github.com/99designs/gqlgen/graphql/playground"
"golang.org/x/sync/errgroup"

"github.com/wundergraph/cosmo/demo/pkg/injector"
"github.com/wundergraph/cosmo/demo/pkg/subgraphs/cachegraph"
"github.com/wundergraph/cosmo/demo/pkg/subgraphs/cachegraph_ext"
"github.com/wundergraph/cosmo/demo/pkg/subgraphs/viewer"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "http.Server constructions in demo code:"
rg -n -C2 'http\.Server\s*\{' demo/cmd demo/pkg

echo
echo "Existing timeout fields in demo code:"
rg -n -C2 'ReadTimeout|ReadHeaderTimeout|WriteTimeout|IdleTimeout' demo/cmd demo/pkg

Repository: wundergraph/cosmo

Length of output: 2720


Set timeouts on these demo HTTP servers. Both http.Server instances here have no read/header/write/idle timeouts, so a slow or stalled client can tie up the demo listeners indefinitely and starve the cache demo.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@demo/cmd/cache-demo/main.go` around lines 4 - 19, Set sensible timeouts on
the demo HTTP servers used in cache-demo main so slow or stalled clients cannot
hold connections open indefinitely. Update the server setup in main and the
related server construction for the cache demo handlers (including the gqlgen
handler and playground listener) to configure read, header, write, and idle
timeouts on the http.Server instances.

Source: Linters/SAST tools

Comment on lines +19 to +24
func Latency(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if v := r.Header.Get(ArtificialLatencyHeader); v != "" {
if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 {
log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix"))
time.Sleep(time.Duration(ms) * time.Millisecond)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Honor request cancellation while injecting latency.

time.Sleep keeps the handler blocked for up to 10s even after the client disconnects or the request context is canceled. Since this wrapper is now applied broadly in demo/pkg/subgraphs/subgraphs.go, canceled demo traffic can still tie up goroutines unnecessarily.

Proposed fix
 func Latency(next http.Handler) http.Handler {
 	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 		if v := r.Header.Get(ArtificialLatencyHeader); v != "" {
 			if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 {
 				log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix"))
-				time.Sleep(time.Duration(ms) * time.Millisecond)
+				timer := time.NewTimer(time.Duration(ms) * time.Millisecond)
+				defer timer.Stop()
+				select {
+				case <-timer.C:
+				case <-r.Context().Done():
+					return
+				}
 			}
 		}
 		next.ServeHTTP(w, r)
 	})
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func Latency(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if v := r.Header.Get(ArtificialLatencyHeader); v != "" {
if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 {
log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix"))
time.Sleep(time.Duration(ms) * time.Millisecond)
func Latency(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if v := r.Header.Get(ArtificialLatencyHeader); v != "" {
if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 {
log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix"))
timer := time.NewTimer(time.Duration(ms) * time.Millisecond)
defer timer.Stop()
select {
case <-timer.C:
case <-r.Context().Done():
return
}
}
}
next.ServeHTTP(w, r)
})
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@demo/pkg/injector/latency.go` around lines 19 - 24, The Latency middleware in
Latency should stop blocking on time.Sleep after the request is canceled. Update
the artificial delay path to wait on the request context from r.Context()
alongside the timer, and return early if the context is done so canceled
requests do not keep goroutines pinned. Keep the existing validation and
logging, but make the sleep in Latency cancellation-aware.

Comment on lines +143 to +156
func (s *articleStore) byIDs(ids []string) []*model.Article {
idSet := make(map[string]struct{}, len(ids))
for _, id := range ids {
idSet[id] = struct{}{}
}
s.mu.RLock()
defer s.mu.RUnlock()
var out []*model.Article
for _, a := range s.articles {
if _, ok := idSet[a.ID]; ok {
out = append(out, cloneArticle(a))
}
}
return out

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Preserve the caller’s ID order in byIDs.

This helper turns ids into a set and then walks s.articles, so a request like ["3", "1"] comes back in store order instead of request order. Any query/resolver using it will return a reordered list.

Suggested fix
 func (s *articleStore) byIDs(ids []string) []*model.Article {
-	idSet := make(map[string]struct{}, len(ids))
-	for _, id := range ids {
-		idSet[id] = struct{}{}
-	}
 	s.mu.RLock()
 	defer s.mu.RUnlock()
-	var out []*model.Article
+	byID := make(map[string]*model.Article, len(s.articles))
 	for _, a := range s.articles {
-		if _, ok := idSet[a.ID]; ok {
-			out = append(out, cloneArticle(a))
-		}
+		byID[a.ID] = a
+	}
+	out := make([]*model.Article, 0, len(ids))
+	for _, id := range ids {
+		if a, ok := byID[id]; ok {
+			out = append(out, cloneArticle(a))
+		}
 	}
 	return out
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (s *articleStore) byIDs(ids []string) []*model.Article {
idSet := make(map[string]struct{}, len(ids))
for _, id := range ids {
idSet[id] = struct{}{}
}
s.mu.RLock()
defer s.mu.RUnlock()
var out []*model.Article
for _, a := range s.articles {
if _, ok := idSet[a.ID]; ok {
out = append(out, cloneArticle(a))
}
}
return out
func (s *articleStore) byIDs(ids []string) []*model.Article {
s.mu.RLock()
defer s.mu.RUnlock()
byID := make(map[string]*model.Article, len(s.articles))
for _, a := range s.articles {
byID[a.ID] = a
}
out := make([]*model.Article, 0, len(ids))
for _, id := range ids {
if a, ok := byID[id]; ok {
out = append(out, cloneArticle(a))
}
}
return out
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@demo/pkg/subgraphs/cachegraph/subgraph/data.go` around lines 143 - 156, The
articleStore.byIDs helper currently deduplicates IDs into a set and iterates
over s.articles, which reorders results instead of preserving the caller’s
requested sequence. Update byIDs to return articles in the same order as the
input ids slice (while still cloning matches and handling missing IDs), and keep
the lookup logic centered around articleStore, byIDs, and cloneArticle so
query/resolver callers receive ordered results.

Comment on lines +226 to +233
func (s *listingStore) all() []*model.Listing {
s.mu.RLock()
defer s.mu.RUnlock()
out := make([]*model.Listing, 0, len(s.listings))
for _, l := range s.listings {
out = append(out, cloneListing(l))
}
return out

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Return these lists in a stable order.

Both listingStore.all() and allVenues() iterate Go maps directly, so their response order can change between calls. That makes demo responses nondeterministic, and it is inconsistent with allCatalogs(), which already sorts before returning.

Also applies to: 298-303

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@demo/pkg/subgraphs/cachegraph/subgraph/data.go` around lines 226 - 233, Both
listingStore.all() and allVenues() currently range over maps directly, so their
returned slices can come back in different orders between calls. Update these
helpers to collect items and sort them before returning, matching the stable
ordering already used by allCatalogs(). Use the existing clone-and-return
pattern in listingStore.all() and the venue listing path in allVenues() so
callers get deterministic demo responses.

Comment on lines +46 to +65
func buildArticle(id string) *model.Article {
fixture := articleFixtures[id]
relatedIDs := relatedArticleIDs[id]
related := make([]*model.Article, 0, len(relatedIDs))
for _, rid := range relatedIDs {
relatedFixture := articleFixtures[rid]
related = append(related, &model.Article{
ID: rid,
Title: relatedFixture.Title,
Body: relatedFixture.Body,
Tags: relatedFixture.Tags,
})
}
return &model.Article{
ID: id,
Title: fixture.Title,
Body: fixture.Body,
Tags: fixture.Tags,
RelatedArticles: related,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Fail fast on unknown article IDs.

buildArticle currently turns a missing fixture lookup into a zero-value Article, so an invalid id still looks like a successful entity resolution. That hides fixture/key mismatches and weakens the cache-test coverage. Please return an error for unknown ids and propagate it back through FindArticleByID.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@demo/pkg/subgraphs/cachetest/articles/subgraph/data.go` around lines 46 - 65,
The article fixture lookup currently treats missing IDs as zero-value articles,
so invalid IDs appear to resolve successfully. Update buildArticle to fail fast
when articleFixtures[id] or any related fixture is missing, returning an error
instead of constructing a partial Article, and change FindArticleByID to
propagate that error back to the caller. Keep the fix localized around
buildArticle and FindArticleByID so cache-test fixture/key mismatches surface
immediately.

Comment on lines +1070 to +1080
entityMetricSources := make(map[string]cacheMetricSource)
for name, cache := range srv.entityCacheInstances {
source, ok := cache.(cacheMetricSource)
if !ok {
continue
}
entityMetricSources[name] = source
}
for _, registration := range entityCacheMetricRegistrations(entityMetricSources) {
metricInfos = append(metricInfos, rmetric.NewCacheMetricInfo(registration.cacheType, registration.maxCost, registration.metrics))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don't register shared entity-cache metrics once per graph mux.

srv.entityCacheInstances is server-wide state built once in newGraphServer, but Lines 1070-1080 append it every time buildGraphMux() calls configureCacheMetrics(). With feature-flag muxes, the same cache is exported multiple times under different mux attributes, so telemetry for one shared cache gets duplicated. Register these metrics once at the server level, or gate them to the base mux only.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@router/core/graph_server.go` around lines 1070 - 1080, The shared
entity-cache metrics are being collected inside configureCacheMetrics via
buildGraphMux, so the same srv.entityCacheInstances get exported multiple times
for each mux. Move the entityMetricSources / entityCacheMetricRegistrations path
to server-level setup in newGraphServer, or ensure buildGraphMux only registers
them for the base mux, so shared caches are emitted once and not duplicated
across feature-flag muxes.

Comment thread router/core/router.go
Comment on lines +2035 to +2042
// Close entity cache instances that implement io.Closer (e.g. ristretto-backed MemoryEntityCache).
for _, cache := range r.entityCacheInstances {
if closer, ok := cache.(io.Closer); ok {
if closeErr := closer.Close(); closeErr != nil {
err.Append(fmt.Errorf("failed to close entity cache: %w", closeErr))
}
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Deduplicate aliased caches before closing them.

buildEntityCacheInstances() intentionally stores the same instance under "default" and the real provider_id. This shutdown loop closes every map entry, so aliased caches are closed twice. For caches backed by closers that are not idempotent, a normal shutdown can now report an error on the second Close().

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@router/core/router.go` around lines 2035 - 2042, The shutdown loop in router
cleanup is closing aliased cache instances twice because
buildEntityCacheInstances() stores the same cache under both "default" and the
provider_id. Update the close logic around r.entityCacheInstances to deduplicate
instances before calling io.Closer.Close(), using the cache identity rather than
the map key so each underlying cache is closed only once. Keep the change
localized to the entity-cache shutdown path and preserve the existing error
aggregation via err.Append.

Comment on lines +44 to +48
func NewCircuitBreakerCache(cache resolve.LoaderCache, cfg CircuitBreakerConfig) *CircuitBreakerCache {
cb := &CircuitBreakerCache{
cache: cache,
failureThreshold: int32(cfg.FailureThreshold),
cooldownPeriod: cfg.CooldownPeriod,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

fd -p 'config.schema.json' router/pkg/config -x sed -n '/failure_threshold/,+12p' {}

rg -n --context 2 'FailureThreshold|NewCircuitBreakerCache' \
  router/pkg/config \
  router/pkg/entitycache \
  router/core \
  -g '*.go'

Repository: wundergraph/cosmo

Length of output: 4453


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the circuit breaker implementation and config validation/loading path.
sed -n '1,220p' router/pkg/entitycache/circuit_breaker.go
printf '\n--- CONFIG ---\n'
sed -n '1170,1235p' router/pkg/config/config.go
printf '\n--- SEARCH VALIDATION ---\n'
rg -n --context 3 'minimum: 1|FailureThreshold|validate|validation|schema' router/pkg/config router/pkg/entitycache -g '*.go' -g '*.json' -g '*.yaml' -g '*.yml'

Repository: wundergraph/cosmo

Length of output: 50373


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the config loading order and the schema entry for the breaker threshold.
sed -n '1520,1645p' router/pkg/config/config.go
printf '\n--- SCHEMA ENTRY ---\n'
rg -n --context 4 '"failure_threshold"' router/pkg/config/config.schema.json
printf '\n--- ENV PARSING PATH ---\n'
rg -n --context 2 'env\.Parse|caarlos0/env|LoadConfig\(|ValidateConfig\(' router/pkg/config/config.go router/pkg/config/config_test.go -g '*.go'

Repository: wundergraph/cosmo

Length of output: 42212


Guard FailureThreshold before narrowing it to int32.
int32(cfg.FailureThreshold) can wrap for large values, and non-positive thresholds make the breaker open on the first failure. Add a bounds check here before storing it.

🧰 Tools
🪛 ast-grep (0.44.0)

[warning] 46-46: Narrowing a non-constant integer to a smaller fixed-width type (int8/int16/int32, uint8/uint16/uint32) can silently overflow or wrap, yielding negative or truncated values that are dangerous in size, length, or index logic. Validate the source value is within the target type's range before converting (e.g. bounds-check, or use a checked helper), and avoid narrowing untrusted or len()/parsed values.
Context: int32(cfg.FailureThreshold)
Note: [CWE-190] Integer Overflow or Wraparound.

(integer-overflow-narrowing-conversion-go)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@router/pkg/entitycache/circuit_breaker.go` around lines 44 - 48,
`NewCircuitBreakerCache` currently narrows `cfg.FailureThreshold` to `int32`
without validation, which can wrap large values and allows non-positive
thresholds to behave incorrectly. Add a bounds check in `NewCircuitBreakerCache`
before assigning `failureThreshold` on `CircuitBreakerCache`, rejecting or
clamping values that are <= 0 or exceed the `int32` range, and only store a safe
validated threshold.

Source: Linters/SAST tools

Comment on lines +69 to +90
func (c *MemoryEntityCache) Set(_ context.Context, entries []*resolve.CacheEntry) error {
if len(entries) == 0 {
return nil
}
for _, entry := range entries {
if entry == nil {
continue
}
// Negative TTL means "no expiration" per the LoaderCache contract.
// Ristretto treats ttl<=0 as no expiration, so clamp negatives to 0.
ttl := entry.TTL
if ttl < 0 {
ttl = 0
}
// Check if key already exists (update vs new entry)
_, exists := c.cache.Get(entry.Key)
if c.cache.SetWithTTL(entry.Key, entry.Value, int64(len(entry.Value)), ttl) && !exists {
c.len.Add(1)
}
}
c.cache.Wait()
return nil

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Map the relevant file and nearby methods
ast-grep outline router/pkg/entitycache/memory.go --view expanded || true
echo '---'
cat -n router/pkg/entitycache/memory.go | sed -n '1,220p'

echo '--- SEARCH RISTRETTO USAGE ---'
rg -n "SetWithTTL|Wait\(|Get\(|Del\(" router/pkg/entitycache -S || true

echo '--- GO MODULE / DEP VERSION ---'
rg -n 'dgraph-io/ristretto' go.mod go.sum router/go.mod router/go.sum -S || true

Repository: wundergraph/cosmo

Length of output: 11175


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "entity cache|MemoryEntityCache|ristretto|invalidation|Delete\(" . -S || true

Repository: wundergraph/cosmo

Length of output: 31305


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '980,1015p' router/core/graph_server.go
echo '---'
sed -n '220,245p' router/pkg/entitycache/memory_test.go
echo '---'
sed -n '60,90p' router/pkg/entitycache/redis.go

Repository: wundergraph/cosmo

Length of output: 2707


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
import os, glob
paths = []
for base in [os.path.expanduser('~/go/pkg/mod'), '/go/pkg/mod']:
    if os.path.isdir(base):
        paths.extend(glob.glob(base + '/github.com/dgraph-io/ristretto*/**/*.go', recursive=True))
print('\n'.join(sorted(paths)[:40]))
PY

Repository: wundergraph/cosmo

Length of output: 3493


🏁 Script executed:

#!/bin/bash
set -euo pipefail

CACHE_FILE="/home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go"
sed -n '1,260p' "$CACHE_FILE"

Repository: wundergraph/cosmo

Length of output: 10404


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '260,520p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go
echo '---'
sed -n '1,240p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/store.go

Repository: wundergraph/cosmo

Length of output: 13481


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '520,760p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go

Repository: wundergraph/cosmo

Length of output: 5765


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,220p' router/pkg/entitycache/memory_test.go
echo '---'
sed -n '220,320p' router/pkg/entitycache/memory_test.go

Repository: wundergraph/cosmo

Length of output: 7857


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "itemDelete|itemUpdate|itemNew|setBuf|storedItems.Set|storedItems.Del" /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go -n -S
echo '---'
sed -n '760,940p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go

Repository: wundergraph/cosmo

Length of output: 2223


Make Delete unconditional in router/pkg/entitycache/memory.go:93-103. Get can miss a key that’s already been queued by SetWithTTL, so the invalidation is skipped and stale data survives. Call c.cache.Del(k) regardless, and keep len bookkeeping separate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@router/pkg/entitycache/memory.go` around lines 69 - 90, The invalidation path
in MemoryEntityCache should not depend on a prior Get check because queued
SetWithTTL writes can make the key appear absent and skip deletion. Update the
Delete method on MemoryEntityCache to always call c.cache.Del(k) for each key,
and keep the len accounting separate from the delete call so cache invalidation
is unconditional.

Comment on lines 136 to 140
return new EntityCachingConfiguration({
cacheInvalidateConfigurations,
cachePopulateConfigurations,
entityCacheConfigurations,
entityCache: entityCacheConfigurations,
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Propagate the new cache directive fields in this builder.

This constructor still emits only entityCache, cacheInvalidateConfigurations, and cachePopulateConfigurations. The updated contract also carries request-scoped and query-cache metadata, so this path silently drops those directives and can even return undefined when they are the only cache config present.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shared/src/router-config/builder.ts` around lines 136 - 140, The
EntityCachingConfiguration builder is still only passing entityCache,
cacheInvalidateConfigurations, and cachePopulateConfigurations, so the new
request-scoped and query-cache directive fields are being dropped and can leave
the result undefined when they are the only cache settings. Update the builder
path that constructs EntityCachingConfiguration to include the new cache
directive fields alongside the existing ones, and make sure the aggregation
logic in builder.ts preserves them even when entityCacheConfigurations is
absent.

@SkArchon SkArchon changed the base branch from main to milinda/entity-caching-1b-demo June 30, 2026 15:21
…o entity_cache_configurations

Align the proto field with origin/main (entity_cache_configurations). The old
name serialized to JSON as entityCache, so the router silently dropped the
control-plane's entity caching config on unmarshal. Regenerated Go/TS bindings
and updated all GetEntityCache() consumers in the router core and tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant