feat(router): entity caching with L1/L2, shadow mode, and per-request cache controls (3/6)#2955
feat(router): entity caching with L1/L2, shadow mode, and per-request cache controls (3/6)#2955SkArchon wants to merge 10 commits into
Conversation
… generated bindings Extracted from jensneuse/entity-caching-v2 (PR #2777) — composition + proto layer only. - composition: @openfed__entityCache / @openfed__queryCache / @openfed__requestScoped directive definitions, extraction in normalization-factory, ConfigurationData types, warnings, and tests (incl. fuzz + mapping rules) - proto: additive DataSourceConfiguration fields 17-21 and new messages (EntityCacheConfiguration, RootFieldCacheConfiguration, EntityKeyMapping, EntityCacheFieldMapping, CachePopulateConfiguration, CacheInvalidateConfiguration, RequestScopedFieldConfiguration) - regenerated bindings: router/gen, connect-go, connect (incl. graphqlmetrics regen side-effects), composition-go bundle - shared: router-config serializer for the new proto fields Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…t subgraphs Extracted from jensneuse/entity-caching-v2 (PR #2777) — demo layer, stacked on the composition+proto PR. - cachegraph / cachegraph_ext / viewer: demo subgraphs exercising the caching directives (mutable data store, latency injector, auth profiles) - cmd/cache-demo: standalone cache-only runner (ports 4012-4014) - pkg/subgraphs/cachetest: the six deterministic subgraphs consumed by the router-tests entity_caching suite (articles, articlesmeta, details, inventory, items, viewer) - graph-cache-only.yaml + config-cache-only.json + Makefile compose targets Builds and tests green against the base router (no router changes required). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… cache controls Extracted from jensneuse/entity-caching-v2 (PR #2777) — router layer, stacked on the demo PR. Entity cache OTEL/Prometheus metrics are split into the next PR. - router/pkg/entitycache: in-memory L1, Redis L2, circuit breaker - router/core: factoryresolver mapping of proto cache config to planner metadata, graph server wiring, per-request cache-control headers (incl. WS subscriptions), EntityCacheKeyInterceptor module hook - graphql-go-tools pinned to 63fa1c88 (PR #1259 + v2.4.4 merged via entity-caching-v244-merge branch) — this also moves astjson to the caching arena rework (MergeValuesWithPath now returns 2 values; flushwriter adapted) - router-tests/entity_caching: integration suite consuming demo's cachetest subgraphs; config composed via wgc (composition-go was removed on main), testdata/config.json regenerated with main's composition - subscription-overhaul adaptation shims from the old base were dropped in favor of main's versions (websocket, flushwriter, executor, updater, mocks) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR implements multi-layer (L1 per-request, L2 cross-request) entity caching in the Cosmo router. It extends the protobuf contract with new cache configuration message types, introduces in-memory (ristretto), Redis, and circuit-breaker cache backends, wires entity caching through the router core (executor, factoryresolver, GraphQL handler, graph server), adds demo subgraphs exercising all ChangesEntity Caching Infrastructure
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Comment |
❌ Internal Query Planner CI checks failedThe Internal Query Planner CI checks failed in the celestial repository, and this is going to stop the merge of this PR. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## milinda/entity-caching-1b-demo #2955 +/- ##
==================================================================
- Coverage 46.48% 44.33% -2.15%
==================================================================
Files 1121 1108 -13
Lines 155128 155099 -29
Branches 10065 10065
==================================================================
- Hits 72107 68768 -3339
- Misses 81185 84452 +3267
- Partials 1836 1879 +43
🚀 New features to boost your workflow:
|
5ef33a6 to
243a7a4
Compare
…hing-2-router Brings the reworked entity-caching composition/proto format (from the queryCache-directive stack via the demo branch) into the router branch. Composition and proto files resolved to the new format; router runtime code is adjusted to the new proto format in a follow-up commit.
The queryCache-directive rework nests all entity-caching config under a single EntityCachingConfiguration on DataSourceConfiguration, and renames RootFieldCacheConfiguration -> QueryCacheConfiguration and RequestScopedFieldConfiguration -> RequestScopedConfiguration. Route the router config builder through DataSourceConfiguration.GetEntityCachingConfiguration() and its typed getters, and handle CachePopulateConfiguration.max_age_seconds now being a plain int64 (was optional *int64).
The entity-caching graphql-go-tools fork pinned by this branch predates plan.Configuration.IgnoreImplementingTypeWeights (added in released v2.5.0), while the released v2.5.0 lacks the entity-cache resolve.LoaderCache/CacheEntry APIs this branch needs. Guard the CostControl assignment with a TODO until the entity-caching fork is rebased onto v2.5.0; the toggle is a no-op until then.
Nest entity-caching test fixtures under DataSourceConfiguration.EntityCachingConfiguration, rename RootFieldCacheConfiguration->QueryCacheConfiguration and RequestScopedFieldConfiguration->RequestScopedConfiguration, and pass CachePopulateConfiguration.max_age_seconds as a plain int64 (was *int64).
…g-2-router # Conflicts: # composition/src/directive-definition-data/directive-definition-data.ts # composition/src/errors/errors.ts # composition/src/router-configuration/types.ts # composition/src/v1/constants/constants.ts # composition/src/v1/constants/directive-definitions.ts # composition/src/v1/normalization/normalization-factory.ts # composition/src/v1/normalization/types/types.ts # composition/src/v1/normalization/utils.ts # composition/tests/v1/directives/cache-populate.test.ts # connect-go/gen/proto/wg/cosmo/node/v1/node.pb.go # connect/src/wg/cosmo/node/v1/node_pb.ts # proto/wg/cosmo/node/v1/node.proto # router/gen/proto/wg/cosmo/node/v1/node.pb.go # shared/src/router-config/builder.ts
…into milinda/entity-caching-2-router
There was a problem hiding this comment.
Actionable comments posted: 19
🧹 Nitpick comments (3)
demo/Makefile (1)
47-49: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winNormalize
config-cache-only.jsonthe same way ascompose-cachetest.Right now this target writes a one-line artifact, which makes the committed
demo/config-cache-only.jsondiff hard to review and hides contract regressions much more easily.♻️ Proposed change
compose-cache: $(wgc_router_ci) compose -i ./graph-cache-only.yaml -o ./config-cache-only.json + jq . ./config-cache-only.json > ./config-cache-only.json.tmp + mv ./config-cache-only.json.tmp ./config-cache-only.json🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/Makefile` around lines 47 - 49, The compose-cache target is generating an unformatted one-line config-cache-only.json artifact, unlike compose-cachetest. Update the compose-cache rule in the Makefile to run the same normalization step used by compose-cachetest after calling $(wgc_router_ci) compose, so demo/config-cache-only.json is pretty-printed and reviewable.router/core/graphql_handler_caching_options_test.go (1)
30-172: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd coverage for
KeyInterceptors.This suite locks down the flag and prefix branches, but it never exercises the new
buildL2CacheKeyInterceptor()path. A small test with one or two fakeEntityCacheKeyInterceptors would catch regressions in cache-key rewriting.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/core/graphql_handler_caching_options_test.go` around lines 30 - 172, Add a test in the GraphQL handler caching options suite that exercises the new KeyInterceptors path by verifying cachingOptions() wires in the L2CacheKeyInterceptor produced by buildL2CacheKeyInterceptor(). Use a new caching-options handler setup with one or two fake EntityCacheKeyInterceptor implementations, then assert the returned resolve.CachingOptions includes a non-nil L2CacheKeyInterceptor and that it rewrites keys as expected. Keep the existing flag/prefix coverage intact and place the new assertions near the other TestGraphQLHandler_cachingOptions_* cases.router/pkg/entitycache/circuit_breaker_test.go (1)
256-266: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winUse
sync.WaitGroup.Goin these test goroutine fan-outs.
These blocks can drop the manualAdd/Donebookkeeping and usewg.Go(...)instead, which keeps the accounting in one place and matches the Go 1.25 concurrency style.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/pkg/entitycache/circuit_breaker_test.go` around lines 256 - 266, The goroutine fan-out in the circuit breaker test still uses manual WaitGroup bookkeeping, which should be updated to the newer style. In the test block that launches the `cb.Get` calls, replace the explicit `wg.Add(1)`/`defer wg.Done()` pattern around the anonymous goroutine with `wg.Go(...)` on the same `sync.WaitGroup` variable, so the launch and completion accounting are handled in one place and the test matches the Go 1.25 concurrency style.Source: Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@demo/cmd/cache-demo/main.go`:
- Around line 4-19: Set sensible timeouts on the demo HTTP servers used in
cache-demo main so slow or stalled clients cannot hold connections open
indefinitely. Update the server setup in main and the related server
construction for the cache demo handlers (including the gqlgen handler and
playground listener) to configure read, header, write, and idle timeouts on the
http.Server instances.
In `@demo/pkg/injector/latency.go`:
- Around line 19-24: The Latency middleware in Latency should stop blocking on
time.Sleep after the request is canceled. Update the artificial delay path to
wait on the request context from r.Context() alongside the timer, and return
early if the context is done so canceled requests do not keep goroutines pinned.
Keep the existing validation and logging, but make the sleep in Latency
cancellation-aware.
In `@demo/pkg/subgraphs/cachegraph/subgraph/data.go`:
- Around line 226-233: Both listingStore.all() and allVenues() currently range
over maps directly, so their returned slices can come back in different orders
between calls. Update these helpers to collect items and sort them before
returning, matching the stable ordering already used by allCatalogs(). Use the
existing clone-and-return pattern in listingStore.all() and the venue listing
path in allVenues() so callers get deterministic demo responses.
- Around line 143-156: The articleStore.byIDs helper currently deduplicates IDs
into a set and iterates over s.articles, which reorders results instead of
preserving the caller’s requested sequence. Update byIDs to return articles in
the same order as the input ids slice (while still cloning matches and handling
missing IDs), and keep the lookup logic centered around articleStore, byIDs, and
cloneArticle so query/resolver callers receive ordered results.
In `@demo/pkg/subgraphs/cachetest/articles/subgraph/data.go`:
- Around line 46-65: The article fixture lookup currently treats missing IDs as
zero-value articles, so invalid IDs appear to resolve successfully. Update
buildArticle to fail fast when articleFixtures[id] or any related fixture is
missing, returning an error instead of constructing a partial Article, and
change FindArticleByID to propagate that error back to the caller. Keep the fix
localized around buildArticle and FindArticleByID so cache-test fixture/key
mismatches surface immediately.
In `@demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/entity.resolvers.go`:
- Around line 15-23: Update entityResolver.FindArticleByID so it checks whether
the id exists in articleMetaFixtures before building the Article; if the lookup
fails, return nil (and no Article) instead of fabricating a zero-valued
model.Article. Keep the existing field mapping for the success path, but gate it
behind the ok check so unknown keys are treated as not found.
In `@demo/pkg/subgraphs/cachetest/items/subgraph/data.go`:
- Around line 11-15: The item ID generator is still package-global, so parallel
resolvers/tests can share mutable state and produce nondeterministic IDs. Move
ID allocation into the store instance created by newItemStore() and have
create() use that store-local counter instead of the package-level nextID.
Update the related item store/resolver flow in data.go and the associated
create/newItemStore paths so each resolver gets its own deterministic sequence.
In `@demo/pkg/subgraphs/cachetest/items/subgraph/schema.resolvers.go`:
- Around line 24-29: The non-blocking send in the item update cache event path
silently drops invalidation/population notifications when the channel is full or
not ready, which breaks the cache-driven flows. Update the event emission in the
resolver methods that use ItemUpdatedCh so the mutation does not silently
succeed without delivering the event; instead make the send reliable or
explicitly handle/report the failure. Apply the same fix in all affected
resolver blocks that publish cache events.
In `@demo/router-cache.yaml`:
- Around line 24-51: The router-cache demo forwards X-Artificial-Latency to
subgraphs but does not whitelist it in CORS, so browser preflight fails for that
header. Update the CORS/header allowlist in the demo router config to mirror the
behavior in demo/router-cache-cp.yaml, and keep the change aligned with the
existing headers section and any related CORS settings in this config.
In `@router-tests/entitycaching/harness_test.go`:
- Around line 500-510: The test helper in harness_test.go is overwriting any
existing GraphQL extensions when injecting cacheInvalidation, so update the
response-building logic to preserve and merge the wrapped handler’s current
extensions payload instead of replacing it. In the modified response assembly
near the ext/resp handling, read any existing extensions from
resp["extensions"], decode or reuse them, add the cacheInvalidation entry, then
re-marshal the combined extensions back onto the response before marshaling
modified.
In `@router-tests/entitycaching/testdata/config.json`:
- Around line 119-306: Update the embedded RouterConfig fixture to use the
renamed proto JSON keys instead of the stale cache field names. In this file,
replace the old datasource-style names under the config object that feeds
nodev1.RouterConfig, including the sections currently labeled
entityCacheConfigurations, rootFieldCacheConfigurations, and
requestScopedFields, with the new entity/query/request-scoped cache field names
so unmarshaling continues to work after the proto contract change. Use the
surrounding RouterConfig fixture structure to locate the affected blocks and
keep the nested cache mappings unchanged aside from the top-level key renames.
In `@router-tests/testenv/testenv.go`:
- Around line 753-768: The 501 retry override is only being applied in the
TLS-specific branch, while the shared retryable client setup still uses the
default policy. Update the client creation path in the relevant test setup
functions so the retry override is installed in the common/shared retry client
initialization, then reuse that helper in both the non-TLS setup and the TLS
override branch. Reference the retry client setup around the
retryClient.CheckRetry override in the affected testenv helpers to keep both
paths consistent.
In `@router/core/executor.go`:
- Around line 268-272: The
`security.cost_control.ignore_implementing_type_weights` setting is currently
defined but not applied in `executor.go`, so it becomes a silent no-op. Update
the planner wiring in `routerEngineCfg`/`planConfig` handling to either reject
this config when the pinned `graphql-go-tools` fork cannot support
`plan.Configuration.IgnoreImplementingTypeWeights`, or restore the assignment
behind a clear version check so the option actually takes effect. Keep the
change centered around `executor.go` and the `planConfig` setup.
In `@router/core/graph_server.go`:
- Around line 239-243: The startup path in graph_server.go leaks partially
initialized entity cache resources because buildEntityCacheInstances() succeeds
before later setup failures can return. Add the same cleanup-on-error guard
already used elsewhere in GraphServer initialization: after assigning
s.entityCacheInstances, ensure every subsequent error path releases those
instances before returning. Use the existing GraphServer startup flow and the
buildEntityCacheInstances/entityCacheInstances symbols to place the cleanup
consistently across later fallible steps.
- Around line 1070-1080: The shared entity-cache metrics are being collected
inside configureCacheMetrics via buildGraphMux, so the same
srv.entityCacheInstances get exported multiple times for each mux. Move the
entityMetricSources / entityCacheMetricRegistrations path to server-level setup
in newGraphServer, or ensure buildGraphMux only registers them for the base mux,
so shared caches are emitted once and not duplicated across feature-flag muxes.
In `@router/core/router.go`:
- Around line 2035-2042: The shutdown loop in router cleanup is closing aliased
cache instances twice because buildEntityCacheInstances() stores the same cache
under both "default" and the provider_id. Update the close logic around
r.entityCacheInstances to deduplicate instances before calling
io.Closer.Close(), using the cache identity rather than the map key so each
underlying cache is closed only once. Keep the change localized to the
entity-cache shutdown path and preserve the existing error aggregation via
err.Append.
In `@router/pkg/entitycache/circuit_breaker.go`:
- Around line 44-48: `NewCircuitBreakerCache` currently narrows
`cfg.FailureThreshold` to `int32` without validation, which can wrap large
values and allows non-positive thresholds to behave incorrectly. Add a bounds
check in `NewCircuitBreakerCache` before assigning `failureThreshold` on
`CircuitBreakerCache`, rejecting or clamping values that are <= 0 or exceed the
`int32` range, and only store a safe validated threshold.
In `@router/pkg/entitycache/memory.go`:
- Around line 69-90: The invalidation path in MemoryEntityCache should not
depend on a prior Get check because queued SetWithTTL writes can make the key
appear absent and skip deletion. Update the Delete method on MemoryEntityCache
to always call c.cache.Del(k) for each key, and keep the len accounting separate
from the delete call so cache invalidation is unconditional.
In `@shared/src/router-config/builder.ts`:
- Around line 136-140: The EntityCachingConfiguration builder is still only
passing entityCache, cacheInvalidateConfigurations, and
cachePopulateConfigurations, so the new request-scoped and query-cache directive
fields are being dropped and can leave the result undefined when they are the
only cache settings. Update the builder path that constructs
EntityCachingConfiguration to include the new cache directive fields alongside
the existing ones, and make sure the aggregation logic in builder.ts preserves
them even when entityCacheConfigurations is absent.
---
Nitpick comments:
In `@demo/Makefile`:
- Around line 47-49: The compose-cache target is generating an unformatted
one-line config-cache-only.json artifact, unlike compose-cachetest. Update the
compose-cache rule in the Makefile to run the same normalization step used by
compose-cachetest after calling $(wgc_router_ci) compose, so
demo/config-cache-only.json is pretty-printed and reviewable.
In `@router/core/graphql_handler_caching_options_test.go`:
- Around line 30-172: Add a test in the GraphQL handler caching options suite
that exercises the new KeyInterceptors path by verifying cachingOptions() wires
in the L2CacheKeyInterceptor produced by buildL2CacheKeyInterceptor(). Use a new
caching-options handler setup with one or two fake EntityCacheKeyInterceptor
implementations, then assert the returned resolve.CachingOptions includes a
non-nil L2CacheKeyInterceptor and that it rewrites keys as expected. Keep the
existing flag/prefix coverage intact and place the new assertions near the other
TestGraphQLHandler_cachingOptions_* cases.
In `@router/pkg/entitycache/circuit_breaker_test.go`:
- Around line 256-266: The goroutine fan-out in the circuit breaker test still
uses manual WaitGroup bookkeeping, which should be updated to the newer style.
In the test block that launches the `cb.Get` calls, replace the explicit
`wg.Add(1)`/`defer wg.Done()` pattern around the anonymous goroutine with
`wg.Go(...)` on the same `sync.WaitGroup` variable, so the launch and completion
accounting are handled in one place and the test matches the Go 1.25 concurrency
style.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f20110f4-4144-448f-bd8a-8bc8a759a8f7
⛔ Files ignored due to path filters (29)
connect-go/gen/proto/wg/cosmo/node/v1/node.pb.gois excluded by!**/*.pb.go,!**/gen/**demo/pkg/subgraphs/cachegraph/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachegraph/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/federation.requires.gois excluded by!**/generated/**demo/pkg/subgraphs/cachegraph_ext/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articles/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articles/subgraph/generated/federation.requires.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articles/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articles/subgraph/generated/staticcheck.confis excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/articlesmeta/subgraph/generated/staticcheck.confis excluded by!**/generated/**demo/pkg/subgraphs/cachetest/details/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/details/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/details/subgraph/generated/staticcheck.confis excluded by!**/generated/**demo/pkg/subgraphs/cachetest/inventory/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/inventory/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/items/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/items/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/items/subgraph/generated/staticcheck.confis excluded by!**/generated/**demo/pkg/subgraphs/cachetest/viewer/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/viewer/subgraph/generated/generated.gois excluded by!**/generated/**demo/pkg/subgraphs/cachetest/viewer/subgraph/generated/staticcheck.confis excluded by!**/generated/**demo/pkg/subgraphs/viewer/subgraph/generated/federation.gois excluded by!**/generated/**demo/pkg/subgraphs/viewer/subgraph/generated/generated.gois excluded by!**/generated/**router-tests/go.sumis excluded by!**/*.sumrouter/gen/proto/wg/cosmo/node/v1/node.pb.gois excluded by!**/*.pb.go,!**/gen/**router/go.sumis excluded by!**/*.sum
📒 Files selected for processing (136)
connect/src/wg/cosmo/node/v1/node_pb.tsdemo/Makefiledemo/cmd/all/main.godemo/cmd/cache-demo/main.godemo/config-cache-only.jsondemo/go.moddemo/graph-cache-only.yamldemo/graph-cachetest.yamldemo/graph.yamldemo/pkg/injector/latency.godemo/pkg/subgraphs/cachegraph/cachegraph.godemo/pkg/subgraphs/cachegraph/generate.godemo/pkg/subgraphs/cachegraph/gqlgen.ymldemo/pkg/subgraphs/cachegraph/subgraph/auth_profiles.godemo/pkg/subgraphs/cachegraph/subgraph/data.godemo/pkg/subgraphs/cachegraph/subgraph/data_race_test.godemo/pkg/subgraphs/cachegraph/subgraph/data_test.godemo/pkg/subgraphs/cachegraph/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachegraph/subgraph/model/models_gen.godemo/pkg/subgraphs/cachegraph/subgraph/resolver.godemo/pkg/subgraphs/cachegraph/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachegraph/subgraph/schema.resolvers.godemo/pkg/subgraphs/cachegraph_ext/cachegraph_ext.godemo/pkg/subgraphs/cachegraph_ext/generate.godemo/pkg/subgraphs/cachegraph_ext/gqlgen.ymldemo/pkg/subgraphs/cachegraph_ext/subgraph/data.godemo/pkg/subgraphs/cachegraph_ext/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachegraph_ext/subgraph/model/models_gen.godemo/pkg/subgraphs/cachegraph_ext/subgraph/resolver.godemo/pkg/subgraphs/cachegraph_ext/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers.godemo/pkg/subgraphs/cachegraph_ext/subgraph/schema.resolvers_test.godemo/pkg/subgraphs/cachetest/articles/articles.godemo/pkg/subgraphs/cachetest/articles/generate.godemo/pkg/subgraphs/cachetest/articles/gqlgen.ymldemo/pkg/subgraphs/cachetest/articles/subgraph/data.godemo/pkg/subgraphs/cachetest/articles/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachetest/articles/subgraph/model/models_gen.godemo/pkg/subgraphs/cachetest/articles/subgraph/resolver.godemo/pkg/subgraphs/cachetest/articles/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachetest/articles/subgraph/schema.resolvers.godemo/pkg/subgraphs/cachetest/articlesmeta/articlesmeta.godemo/pkg/subgraphs/cachetest/articlesmeta/generate.godemo/pkg/subgraphs/cachetest/articlesmeta/gqlgen.ymldemo/pkg/subgraphs/cachetest/articlesmeta/subgraph/data.godemo/pkg/subgraphs/cachetest/articlesmeta/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachetest/articlesmeta/subgraph/model/models_gen.godemo/pkg/subgraphs/cachetest/articlesmeta/subgraph/resolver.godemo/pkg/subgraphs/cachetest/articlesmeta/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachetest/details/details.godemo/pkg/subgraphs/cachetest/details/generate.godemo/pkg/subgraphs/cachetest/details/gqlgen.ymldemo/pkg/subgraphs/cachetest/details/subgraph/data.godemo/pkg/subgraphs/cachetest/details/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachetest/details/subgraph/model/models_gen.godemo/pkg/subgraphs/cachetest/details/subgraph/resolver.godemo/pkg/subgraphs/cachetest/details/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachetest/inventory/generate.godemo/pkg/subgraphs/cachetest/inventory/gqlgen.ymldemo/pkg/subgraphs/cachetest/inventory/inventory.godemo/pkg/subgraphs/cachetest/inventory/subgraph/data.godemo/pkg/subgraphs/cachetest/inventory/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachetest/inventory/subgraph/model/models_gen.godemo/pkg/subgraphs/cachetest/inventory/subgraph/resolver.godemo/pkg/subgraphs/cachetest/inventory/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachetest/items/generate.godemo/pkg/subgraphs/cachetest/items/gqlgen.ymldemo/pkg/subgraphs/cachetest/items/items.godemo/pkg/subgraphs/cachetest/items/subgraph/data.godemo/pkg/subgraphs/cachetest/items/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachetest/items/subgraph/model/models_gen.godemo/pkg/subgraphs/cachetest/items/subgraph/resolver.godemo/pkg/subgraphs/cachetest/items/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachetest/items/subgraph/schema.resolvers.godemo/pkg/subgraphs/cachetest/viewer/generate.godemo/pkg/subgraphs/cachetest/viewer/gqlgen.ymldemo/pkg/subgraphs/cachetest/viewer/subgraph/data.godemo/pkg/subgraphs/cachetest/viewer/subgraph/entity.resolvers.godemo/pkg/subgraphs/cachetest/viewer/subgraph/model/models_gen.godemo/pkg/subgraphs/cachetest/viewer/subgraph/resolver.godemo/pkg/subgraphs/cachetest/viewer/subgraph/schema.graphqlsdemo/pkg/subgraphs/cachetest/viewer/subgraph/schema.resolvers.godemo/pkg/subgraphs/cachetest/viewer/viewer.godemo/pkg/subgraphs/subgraphs.godemo/pkg/subgraphs/viewer/generate.godemo/pkg/subgraphs/viewer/gqlgen.ymldemo/pkg/subgraphs/viewer/subgraph/context.godemo/pkg/subgraphs/viewer/subgraph/data.godemo/pkg/subgraphs/viewer/subgraph/entity.resolvers.godemo/pkg/subgraphs/viewer/subgraph/model/models_gen.godemo/pkg/subgraphs/viewer/subgraph/resolver.godemo/pkg/subgraphs/viewer/subgraph/schema.graphqlsdemo/pkg/subgraphs/viewer/subgraph/schema.resolvers.godemo/pkg/subgraphs/viewer/viewer.godemo/router-cache-cp.yamldemo/router-cache.yamlproto/wg/cosmo/node/v1/node.protorouter-tests/entity_caching_standard_subgraphs_test.gorouter-tests/entitycaching/entitycaching_test.gorouter-tests/entitycaching/harness_test.gorouter-tests/entitycaching/redis_test.gorouter-tests/entitycaching/setup_test.gorouter-tests/entitycaching/testdata/config.jsonrouter-tests/go.modrouter-tests/protocol/testdata/tracing.jsonrouter-tests/testenv/testenv.gorouter/core/executor.gorouter/core/executor_entity_cache_test.gorouter/core/factoryresolver.gorouter/core/factoryresolver_entity_cache_test.gorouter/core/factoryresolver_test.gorouter/core/flushwriter.gorouter/core/graph_server.gorouter/core/graphql_handler.gorouter/core/graphql_handler_caching_options_test.gorouter/core/modules.gorouter/core/router.gorouter/core/router_config.gorouter/core/router_entity_cache_test.gorouter/core/supervisor_instance.gorouter/core/websocket.gorouter/go.modrouter/pkg/config/config.gorouter/pkg/config/config.schema.jsonrouter/pkg/config/config_test.gorouter/pkg/config/testdata/config_defaults.jsonrouter/pkg/config/testdata/config_full.jsonrouter/pkg/entitycache/circuit_breaker.gorouter/pkg/entitycache/circuit_breaker_test.gorouter/pkg/entitycache/memory.gorouter/pkg/entitycache/memory_test.gorouter/pkg/entitycache/redis.gorouter/pkg/entitycache/redis_test.gorouter/pkg/graphqlschemausage/schemausage_bench_test.gorouter/pkg/graphqlschemausage/schemausage_test.goshared/src/router-config/builder.ts
| import ( | ||
| "log" | ||
| "net/http" | ||
| "strconv" | ||
|
|
||
| "github.com/99designs/gqlgen/graphql" | ||
| "github.com/99designs/gqlgen/graphql/handler" | ||
| "github.com/99designs/gqlgen/graphql/handler/transport" | ||
| "github.com/99designs/gqlgen/graphql/playground" | ||
| "golang.org/x/sync/errgroup" | ||
|
|
||
| "github.com/wundergraph/cosmo/demo/pkg/injector" | ||
| "github.com/wundergraph/cosmo/demo/pkg/subgraphs/cachegraph" | ||
| "github.com/wundergraph/cosmo/demo/pkg/subgraphs/cachegraph_ext" | ||
| "github.com/wundergraph/cosmo/demo/pkg/subgraphs/viewer" | ||
| ) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "http.Server constructions in demo code:"
rg -n -C2 'http\.Server\s*\{' demo/cmd demo/pkg
echo
echo "Existing timeout fields in demo code:"
rg -n -C2 'ReadTimeout|ReadHeaderTimeout|WriteTimeout|IdleTimeout' demo/cmd demo/pkgRepository: wundergraph/cosmo
Length of output: 2720
Set timeouts on these demo HTTP servers. Both http.Server instances here have no read/header/write/idle timeouts, so a slow or stalled client can tie up the demo listeners indefinitely and starve the cache demo.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/cmd/cache-demo/main.go` around lines 4 - 19, Set sensible timeouts on
the demo HTTP servers used in cache-demo main so slow or stalled clients cannot
hold connections open indefinitely. Update the server setup in main and the
related server construction for the cache demo handlers (including the gqlgen
handler and playground listener) to configure read, header, write, and idle
timeouts on the http.Server instances.
Source: Linters/SAST tools
| func Latency(next http.Handler) http.Handler { | ||
| return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
| if v := r.Header.Get(ArtificialLatencyHeader); v != "" { | ||
| if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 { | ||
| log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix")) | ||
| time.Sleep(time.Duration(ms) * time.Millisecond) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Honor request cancellation while injecting latency.
time.Sleep keeps the handler blocked for up to 10s even after the client disconnects or the request context is canceled. Since this wrapper is now applied broadly in demo/pkg/subgraphs/subgraphs.go, canceled demo traffic can still tie up goroutines unnecessarily.
Proposed fix
func Latency(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if v := r.Header.Get(ArtificialLatencyHeader); v != "" {
if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 {
log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix"))
- time.Sleep(time.Duration(ms) * time.Millisecond)
+ timer := time.NewTimer(time.Duration(ms) * time.Millisecond)
+ defer timer.Stop()
+ select {
+ case <-timer.C:
+ case <-r.Context().Done():
+ return
+ }
}
}
next.ServeHTTP(w, r)
})
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func Latency(next http.Handler) http.Handler { | |
| return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | |
| if v := r.Header.Get(ArtificialLatencyHeader); v != "" { | |
| if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 { | |
| log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix")) | |
| time.Sleep(time.Duration(ms) * time.Millisecond) | |
| func Latency(next http.Handler) http.Handler { | |
| return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | |
| if v := r.Header.Get(ArtificialLatencyHeader); v != "" { | |
| if ms, err := strconv.Atoi(v); err == nil && ms > 0 && ms <= 10000 { | |
| log.Printf("[latency] %s %s sleep %dms prefix=%q", r.Method, r.Host, ms, r.Header.Get("X-WG-Cache-Key-Prefix")) | |
| timer := time.NewTimer(time.Duration(ms) * time.Millisecond) | |
| defer timer.Stop() | |
| select { | |
| case <-timer.C: | |
| case <-r.Context().Done(): | |
| return | |
| } | |
| } | |
| } | |
| next.ServeHTTP(w, r) | |
| }) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/injector/latency.go` around lines 19 - 24, The Latency middleware in
Latency should stop blocking on time.Sleep after the request is canceled. Update
the artificial delay path to wait on the request context from r.Context()
alongside the timer, and return early if the context is done so canceled
requests do not keep goroutines pinned. Keep the existing validation and
logging, but make the sleep in Latency cancellation-aware.
| func (s *articleStore) byIDs(ids []string) []*model.Article { | ||
| idSet := make(map[string]struct{}, len(ids)) | ||
| for _, id := range ids { | ||
| idSet[id] = struct{}{} | ||
| } | ||
| s.mu.RLock() | ||
| defer s.mu.RUnlock() | ||
| var out []*model.Article | ||
| for _, a := range s.articles { | ||
| if _, ok := idSet[a.ID]; ok { | ||
| out = append(out, cloneArticle(a)) | ||
| } | ||
| } | ||
| return out |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Preserve the caller’s ID order in byIDs.
This helper turns ids into a set and then walks s.articles, so a request like ["3", "1"] comes back in store order instead of request order. Any query/resolver using it will return a reordered list.
Suggested fix
func (s *articleStore) byIDs(ids []string) []*model.Article {
- idSet := make(map[string]struct{}, len(ids))
- for _, id := range ids {
- idSet[id] = struct{}{}
- }
s.mu.RLock()
defer s.mu.RUnlock()
- var out []*model.Article
+ byID := make(map[string]*model.Article, len(s.articles))
for _, a := range s.articles {
- if _, ok := idSet[a.ID]; ok {
- out = append(out, cloneArticle(a))
- }
+ byID[a.ID] = a
+ }
+ out := make([]*model.Article, 0, len(ids))
+ for _, id := range ids {
+ if a, ok := byID[id]; ok {
+ out = append(out, cloneArticle(a))
+ }
}
return out
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (s *articleStore) byIDs(ids []string) []*model.Article { | |
| idSet := make(map[string]struct{}, len(ids)) | |
| for _, id := range ids { | |
| idSet[id] = struct{}{} | |
| } | |
| s.mu.RLock() | |
| defer s.mu.RUnlock() | |
| var out []*model.Article | |
| for _, a := range s.articles { | |
| if _, ok := idSet[a.ID]; ok { | |
| out = append(out, cloneArticle(a)) | |
| } | |
| } | |
| return out | |
| func (s *articleStore) byIDs(ids []string) []*model.Article { | |
| s.mu.RLock() | |
| defer s.mu.RUnlock() | |
| byID := make(map[string]*model.Article, len(s.articles)) | |
| for _, a := range s.articles { | |
| byID[a.ID] = a | |
| } | |
| out := make([]*model.Article, 0, len(ids)) | |
| for _, id := range ids { | |
| if a, ok := byID[id]; ok { | |
| out = append(out, cloneArticle(a)) | |
| } | |
| } | |
| return out | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/cachegraph/subgraph/data.go` around lines 143 - 156, The
articleStore.byIDs helper currently deduplicates IDs into a set and iterates
over s.articles, which reorders results instead of preserving the caller’s
requested sequence. Update byIDs to return articles in the same order as the
input ids slice (while still cloning matches and handling missing IDs), and keep
the lookup logic centered around articleStore, byIDs, and cloneArticle so
query/resolver callers receive ordered results.
| func (s *listingStore) all() []*model.Listing { | ||
| s.mu.RLock() | ||
| defer s.mu.RUnlock() | ||
| out := make([]*model.Listing, 0, len(s.listings)) | ||
| for _, l := range s.listings { | ||
| out = append(out, cloneListing(l)) | ||
| } | ||
| return out |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Return these lists in a stable order.
Both listingStore.all() and allVenues() iterate Go maps directly, so their response order can change between calls. That makes demo responses nondeterministic, and it is inconsistent with allCatalogs(), which already sorts before returning.
Also applies to: 298-303
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/cachegraph/subgraph/data.go` around lines 226 - 233, Both
listingStore.all() and allVenues() currently range over maps directly, so their
returned slices can come back in different orders between calls. Update these
helpers to collect items and sort them before returning, matching the stable
ordering already used by allCatalogs(). Use the existing clone-and-return
pattern in listingStore.all() and the venue listing path in allVenues() so
callers get deterministic demo responses.
| func buildArticle(id string) *model.Article { | ||
| fixture := articleFixtures[id] | ||
| relatedIDs := relatedArticleIDs[id] | ||
| related := make([]*model.Article, 0, len(relatedIDs)) | ||
| for _, rid := range relatedIDs { | ||
| relatedFixture := articleFixtures[rid] | ||
| related = append(related, &model.Article{ | ||
| ID: rid, | ||
| Title: relatedFixture.Title, | ||
| Body: relatedFixture.Body, | ||
| Tags: relatedFixture.Tags, | ||
| }) | ||
| } | ||
| return &model.Article{ | ||
| ID: id, | ||
| Title: fixture.Title, | ||
| Body: fixture.Body, | ||
| Tags: fixture.Tags, | ||
| RelatedArticles: related, | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Fail fast on unknown article IDs.
buildArticle currently turns a missing fixture lookup into a zero-value Article, so an invalid id still looks like a successful entity resolution. That hides fixture/key mismatches and weakens the cache-test coverage. Please return an error for unknown ids and propagate it back through FindArticleByID.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/cachetest/articles/subgraph/data.go` around lines 46 - 65,
The article fixture lookup currently treats missing IDs as zero-value articles,
so invalid IDs appear to resolve successfully. Update buildArticle to fail fast
when articleFixtures[id] or any related fixture is missing, returning an error
instead of constructing a partial Article, and change FindArticleByID to
propagate that error back to the caller. Keep the fix localized around
buildArticle and FindArticleByID so cache-test fixture/key mismatches surface
immediately.
| entityMetricSources := make(map[string]cacheMetricSource) | ||
| for name, cache := range srv.entityCacheInstances { | ||
| source, ok := cache.(cacheMetricSource) | ||
| if !ok { | ||
| continue | ||
| } | ||
| entityMetricSources[name] = source | ||
| } | ||
| for _, registration := range entityCacheMetricRegistrations(entityMetricSources) { | ||
| metricInfos = append(metricInfos, rmetric.NewCacheMetricInfo(registration.cacheType, registration.maxCost, registration.metrics)) | ||
| } |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Don't register shared entity-cache metrics once per graph mux.
srv.entityCacheInstances is server-wide state built once in newGraphServer, but Lines 1070-1080 append it every time buildGraphMux() calls configureCacheMetrics(). With feature-flag muxes, the same cache is exported multiple times under different mux attributes, so telemetry for one shared cache gets duplicated. Register these metrics once at the server level, or gate them to the base mux only.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/core/graph_server.go` around lines 1070 - 1080, The shared
entity-cache metrics are being collected inside configureCacheMetrics via
buildGraphMux, so the same srv.entityCacheInstances get exported multiple times
for each mux. Move the entityMetricSources / entityCacheMetricRegistrations path
to server-level setup in newGraphServer, or ensure buildGraphMux only registers
them for the base mux, so shared caches are emitted once and not duplicated
across feature-flag muxes.
| // Close entity cache instances that implement io.Closer (e.g. ristretto-backed MemoryEntityCache). | ||
| for _, cache := range r.entityCacheInstances { | ||
| if closer, ok := cache.(io.Closer); ok { | ||
| if closeErr := closer.Close(); closeErr != nil { | ||
| err.Append(fmt.Errorf("failed to close entity cache: %w", closeErr)) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Deduplicate aliased caches before closing them.
buildEntityCacheInstances() intentionally stores the same instance under "default" and the real provider_id. This shutdown loop closes every map entry, so aliased caches are closed twice. For caches backed by closers that are not idempotent, a normal shutdown can now report an error on the second Close().
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/core/router.go` around lines 2035 - 2042, The shutdown loop in router
cleanup is closing aliased cache instances twice because
buildEntityCacheInstances() stores the same cache under both "default" and the
provider_id. Update the close logic around r.entityCacheInstances to deduplicate
instances before calling io.Closer.Close(), using the cache identity rather than
the map key so each underlying cache is closed only once. Keep the change
localized to the entity-cache shutdown path and preserve the existing error
aggregation via err.Append.
| func NewCircuitBreakerCache(cache resolve.LoaderCache, cfg CircuitBreakerConfig) *CircuitBreakerCache { | ||
| cb := &CircuitBreakerCache{ | ||
| cache: cache, | ||
| failureThreshold: int32(cfg.FailureThreshold), | ||
| cooldownPeriod: cfg.CooldownPeriod, |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
fd -p 'config.schema.json' router/pkg/config -x sed -n '/failure_threshold/,+12p' {}
rg -n --context 2 'FailureThreshold|NewCircuitBreakerCache' \
router/pkg/config \
router/pkg/entitycache \
router/core \
-g '*.go'Repository: wundergraph/cosmo
Length of output: 4453
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the circuit breaker implementation and config validation/loading path.
sed -n '1,220p' router/pkg/entitycache/circuit_breaker.go
printf '\n--- CONFIG ---\n'
sed -n '1170,1235p' router/pkg/config/config.go
printf '\n--- SEARCH VALIDATION ---\n'
rg -n --context 3 'minimum: 1|FailureThreshold|validate|validation|schema' router/pkg/config router/pkg/entitycache -g '*.go' -g '*.json' -g '*.yaml' -g '*.yml'Repository: wundergraph/cosmo
Length of output: 50373
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show the config loading order and the schema entry for the breaker threshold.
sed -n '1520,1645p' router/pkg/config/config.go
printf '\n--- SCHEMA ENTRY ---\n'
rg -n --context 4 '"failure_threshold"' router/pkg/config/config.schema.json
printf '\n--- ENV PARSING PATH ---\n'
rg -n --context 2 'env\.Parse|caarlos0/env|LoadConfig\(|ValidateConfig\(' router/pkg/config/config.go router/pkg/config/config_test.go -g '*.go'Repository: wundergraph/cosmo
Length of output: 42212
Guard FailureThreshold before narrowing it to int32.
int32(cfg.FailureThreshold) can wrap for large values, and non-positive thresholds make the breaker open on the first failure. Add a bounds check here before storing it.
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 46-46: Narrowing a non-constant integer to a smaller fixed-width type (int8/int16/int32, uint8/uint16/uint32) can silently overflow or wrap, yielding negative or truncated values that are dangerous in size, length, or index logic. Validate the source value is within the target type's range before converting (e.g. bounds-check, or use a checked helper), and avoid narrowing untrusted or len()/parsed values.
Context: int32(cfg.FailureThreshold)
Note: [CWE-190] Integer Overflow or Wraparound.
(integer-overflow-narrowing-conversion-go)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/entitycache/circuit_breaker.go` around lines 44 - 48,
`NewCircuitBreakerCache` currently narrows `cfg.FailureThreshold` to `int32`
without validation, which can wrap large values and allows non-positive
thresholds to behave incorrectly. Add a bounds check in `NewCircuitBreakerCache`
before assigning `failureThreshold` on `CircuitBreakerCache`, rejecting or
clamping values that are <= 0 or exceed the `int32` range, and only store a safe
validated threshold.
Source: Linters/SAST tools
| func (c *MemoryEntityCache) Set(_ context.Context, entries []*resolve.CacheEntry) error { | ||
| if len(entries) == 0 { | ||
| return nil | ||
| } | ||
| for _, entry := range entries { | ||
| if entry == nil { | ||
| continue | ||
| } | ||
| // Negative TTL means "no expiration" per the LoaderCache contract. | ||
| // Ristretto treats ttl<=0 as no expiration, so clamp negatives to 0. | ||
| ttl := entry.TTL | ||
| if ttl < 0 { | ||
| ttl = 0 | ||
| } | ||
| // Check if key already exists (update vs new entry) | ||
| _, exists := c.cache.Get(entry.Key) | ||
| if c.cache.SetWithTTL(entry.Key, entry.Value, int64(len(entry.Value)), ttl) && !exists { | ||
| c.len.Add(1) | ||
| } | ||
| } | ||
| c.cache.Wait() | ||
| return nil |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Map the relevant file and nearby methods
ast-grep outline router/pkg/entitycache/memory.go --view expanded || true
echo '---'
cat -n router/pkg/entitycache/memory.go | sed -n '1,220p'
echo '--- SEARCH RISTRETTO USAGE ---'
rg -n "SetWithTTL|Wait\(|Get\(|Del\(" router/pkg/entitycache -S || true
echo '--- GO MODULE / DEP VERSION ---'
rg -n 'dgraph-io/ristretto' go.mod go.sum router/go.mod router/go.sum -S || trueRepository: wundergraph/cosmo
Length of output: 11175
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "entity cache|MemoryEntityCache|ristretto|invalidation|Delete\(" . -S || trueRepository: wundergraph/cosmo
Length of output: 31305
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '980,1015p' router/core/graph_server.go
echo '---'
sed -n '220,245p' router/pkg/entitycache/memory_test.go
echo '---'
sed -n '60,90p' router/pkg/entitycache/redis.goRepository: wundergraph/cosmo
Length of output: 2707
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import os, glob
paths = []
for base in [os.path.expanduser('~/go/pkg/mod'), '/go/pkg/mod']:
if os.path.isdir(base):
paths.extend(glob.glob(base + '/github.com/dgraph-io/ristretto*/**/*.go', recursive=True))
print('\n'.join(sorted(paths)[:40]))
PYRepository: wundergraph/cosmo
Length of output: 3493
🏁 Script executed:
#!/bin/bash
set -euo pipefail
CACHE_FILE="/home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go"
sed -n '1,260p' "$CACHE_FILE"Repository: wundergraph/cosmo
Length of output: 10404
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '260,520p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go
echo '---'
sed -n '1,240p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/store.goRepository: wundergraph/cosmo
Length of output: 13481
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '520,760p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.goRepository: wundergraph/cosmo
Length of output: 5765
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,220p' router/pkg/entitycache/memory_test.go
echo '---'
sed -n '220,320p' router/pkg/entitycache/memory_test.goRepository: wundergraph/cosmo
Length of output: 7857
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "itemDelete|itemUpdate|itemNew|setBuf|storedItems.Set|storedItems.Del" /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.go -n -S
echo '---'
sed -n '760,940p' /home/jailuser/go/pkg/mod/github.com/dgraph-io/ristretto/v2@v2.1.0/cache.goRepository: wundergraph/cosmo
Length of output: 2223
Make Delete unconditional in router/pkg/entitycache/memory.go:93-103. Get can miss a key that’s already been queued by SetWithTTL, so the invalidation is skipped and stale data survives. Call c.cache.Del(k) regardless, and keep len bookkeeping separate.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/entitycache/memory.go` around lines 69 - 90, The invalidation path
in MemoryEntityCache should not depend on a prior Get check because queued
SetWithTTL writes can make the key appear absent and skip deletion. Update the
Delete method on MemoryEntityCache to always call c.cache.Del(k) for each key,
and keep the len accounting separate from the delete call so cache invalidation
is unconditional.
| return new EntityCachingConfiguration({ | ||
| cacheInvalidateConfigurations, | ||
| cachePopulateConfigurations, | ||
| entityCacheConfigurations, | ||
| entityCache: entityCacheConfigurations, | ||
| }); |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Propagate the new cache directive fields in this builder.
This constructor still emits only entityCache, cacheInvalidateConfigurations, and cachePopulateConfigurations. The updated contract also carries request-scoped and query-cache metadata, so this path silently drops those directives and can even return undefined when they are the only cache config present.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@shared/src/router-config/builder.ts` around lines 136 - 140, The
EntityCachingConfiguration builder is still only passing entityCache,
cacheInvalidateConfigurations, and cachePopulateConfigurations, so the new
request-scoped and query-cache directive fields are being dropped and can leave
the result undefined when they are the only cache settings. Update the builder
path that constructs EntityCachingConfiguration to include the new cache
directive fields alongside the existing ones, and make sure the aggregation
logic in builder.ts preserves them even when entityCacheConfigurations is
absent.
…o entity_cache_configurations Align the proto field with origin/main (entity_cache_configurations). The old name serialized to JSON as entityCache, so the router silently dropped the control-plane's entity caching config on unmarshal. Regenerated Go/TS bindings and updated all GetEntityCache() consumers in the router core and tests.
Part 2 of 4 — split out from #2777 (
jensneuse/entity-caching-v2). Stacked on #2944.What's included
factoryresolvermapping of proto cache config to planner metadata, graph server wiring, per-request cache-control headers (graphql_handler), plan generator nil guard, config schema + defaultstestdata/config.jsonExcluded
router/internal/graphiql/graphiql.html(playground embed) ships in part 3/4 — this PR runs with the previous embedded playground.Verification
go build ./...passes inrouter/anddemo/;go vet ./entity_caching/...passes inrouter-tests/.🤖 Generated with Claude Code
Update: test subgraphs coupled to demo
The six entity_caching test subgraphs were moved into
demo/pkg/subgraphs/cachetest/and are imported by the suite — same coupling pattern as the main suite (testenv→demo/pkg/subgraphs). router-tests no longer carries subgraph implementations. Pure move: schemas/resolvers/data unchanged,make composereproduces an identicaltestdata/config.json, full suite green.Diff cleanup
HasNonDefaultExporter+ bootstrap guard) moved to the benchmark/tooling PR (6/6) — it is a general tracing perf change motivated by benchmarking, not entity caching.demo/pkg/subgraphs/projects/generatedproto churn — leftoverreservedmarkers from fields added and removed during development; the GraphQL schema is unchanged.router/entity-caching.config.yaml(unreferenced sample) androuter-tests/entity_caching/graph.yaml(duplicated the subgraph list thatcmd/composeowns; nothing consumed it).generatetarget moved todemo/Makefile(generate-cachetest) next to the subgraphs it regenerates.Rebased onto main (2026-06-10)
63fa1c88— PR feat: add caching to loader graphql-go-tools#1259 with v2.4.4 merged in (branchentity-caching-v244-merge; conflicts inresolve/context.go+resolve/resolvable.goresolved keeping caching fields + theactualListSizes→typeNameStatsrename). Needed because main's router uses v2.4.4 APIs (TypeNameStats) that the PR head lacked.MergeValuesWithPathreturn 2 values; main's flushwriter call site adapted.cmd/composereplaced with awgc-based Makefile target (+ restoredgraph.yamlinput) sincecomposition-gowas removed on main (chore: remove composition-go #2830);testdata/config.jsonregenerated with main's composition.ExecutionOptions.Cachingwiring in the WS path).Summary by CodeRabbit
New Features
Bug Fixes
Tests