Skip to content

Commit 495abc2

Browse files
KyleAMathewsclaudeautofix-ci[bot]
authored
Fix unbounded expression growth in DeduplicatedLoadSubset (#1348)
* fix: track original predicates in DeduplicatedLoadSubset to prevent unbounded WHERE growth updateTracking() was called with the modified loadOptions (containing the minusWherePredicates difference expression) instead of the original request options. When a "load all" request (where=undefined) was made after accumulating eq() predicates, the difference expression NOT(IN([...])) was tracked instead of setting hasLoadedAllData=true. Each subsequent "load all" request compounded the expression: NOT(IN(...) OR NOT(IN(...))), producing deeply nested 193KB+ WHERE clauses that crashed Electric's parser. The fix separates loadOptions (sent to backend, may contain optimized difference query) from trackingOptions (used for tracking, preserves the original predicate). This ensures "load all" correctly sets hasLoadedAllData=true regardless of the optimization applied to the actual request. https://claude.ai/code/session_01Vp75RhjVR4tV5FdjKJbBWE * Apply code review and simplification fixes - Condense block comment explaining trackingOptions/loadOptions split - Remove duplicate inline comments at updateTracking call sites - Remove stray console.log and unused minusWherePredicates import - Fix "exponentially" → "unboundedly" in test comment (growth is linear) - Remove product-specific "sessions index page" reference from test - Strengthen expression assertion to verify full NOT(IN(...)) structure - Add sync return path test for the tracking fix - Simplify test by removing unused calls array Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add changeset for unbounded WHERE growth fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: apply automated fixes --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
1 parent eeb5321 commit 495abc2

3 files changed

Lines changed: 148 additions & 20 deletions

File tree

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
---
2+
'@tanstack/db': patch
3+
---
4+
5+
Fix unbounded WHERE expression growth in `DeduplicatedLoadSubset` when loading all data after accumulating specific predicates. The deduplication layer now correctly tracks the original request predicate (e.g., `where: undefined` for "load all") instead of the optimized difference query sent to the backend, ensuring `hasLoadedAllData` is properly set and subsequent requests are deduplicated.

packages/db/src/query/subset-dedupe.ts

Lines changed: 14 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -126,28 +126,29 @@ export class DeduplicatedLoadSubset {
126126
return prom
127127
}
128128

129-
// Not fully covered by existing data
130-
// Compute the subset of data that is not covered by the existing data
131-
// such that we only have to load that subset of missing data
132-
const clonedOptions = cloneOptions(options)
129+
// Not fully covered by existing data — load the missing subset.
130+
// We need two clones: trackingOptions preserves the original predicate for
131+
// accurate tracking (e.g., where=undefined means "all data"), while loadOptions
132+
// may be narrowed with a difference expression for the actual backend request.
133+
const trackingOptions = cloneOptions(options)
134+
const loadOptions = cloneOptions(options)
133135
if (this.unlimitedWhere !== undefined && options.limit === undefined) {
134136
// Compute difference to get only the missing data
135137
// We can only do this for unlimited queries
136138
// and we can only remove data that was loaded from unlimited queries
137139
// because with limited queries we have no way to express that we already loaded part of the matching data
138-
clonedOptions.where =
139-
minusWherePredicates(clonedOptions.where, this.unlimitedWhere) ??
140-
clonedOptions.where
140+
loadOptions.where =
141+
minusWherePredicates(loadOptions.where, this.unlimitedWhere) ??
142+
loadOptions.where
141143
}
142144

143145
// Call underlying loadSubset to load the missing data
144-
const resultPromise = this._loadSubset(clonedOptions)
146+
const resultPromise = this._loadSubset(loadOptions)
145147

146148
// Handle both sync (true) and async (Promise<void>) return values
147149
if (resultPromise === true) {
148-
// Sync return - update tracking synchronously
149-
// Clone options before storing to protect against caller mutation
150-
this.updateTracking(clonedOptions)
150+
// Sync return - update tracking with the original predicate
151+
this.updateTracking(trackingOptions)
151152
return true
152153
} else {
153154
// Async return - track the promise and update tracking after it resolves
@@ -158,16 +159,14 @@ export class DeduplicatedLoadSubset {
158159

159160
// We need to create a reference to the in-flight entry so we can remove it later
160161
const inflightEntry = {
161-
options: clonedOptions, // Store cloned options for subset matching
162+
options: loadOptions, // Store load options for subset matching of in-flight requests
162163
promise: resultPromise
163164
.then((result) => {
164165
// Only update tracking if this request is still from the current generation
165166
// If reset() was called, the generation will have incremented and we should
166167
// not repopulate the state that was just cleared
167168
if (capturedGeneration === this.generation) {
168-
// Use the cloned options that we captured before any caller mutations
169-
// This ensures we track exactly what was loaded, not what the caller changed
170-
this.updateTracking(clonedOptions)
169+
this.updateTracking(trackingOptions)
171170
}
172171
return result
173172
})

packages/db/tests/query/subset-dedupe.test.ts

Lines changed: 129 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,6 @@ import {
44
cloneOptions,
55
} from '../../src/query/subset-dedupe'
66
import { Func, PropRef, Value } from '../../src/query/ir'
7-
import { minusWherePredicates } from '../../src/query/predicate-utils'
87
import type { BasicExpression, OrderBy } from '../../src/query/ir'
98
import type { LoadSubsetOptions } from '../../src/types'
109

@@ -517,9 +516,6 @@ describe(`createDeduplicatedLoadSubset`, () => {
517516
eq(ref(`status`), val(`active`)),
518517
)
519518

520-
const test = minusWherePredicates(secondPredicate, firstPredicate)
521-
console.log(`test`, test)
522-
523519
await deduplicated.loadSubset({ where: secondPredicate })
524520
expect(callCount).toBe(2)
525521
expect(calls[1]).toEqual({
@@ -596,7 +592,135 @@ describe(`createDeduplicatedLoadSubset`, () => {
596592
// i.e. should request NOT (age > 20)
597593
await deduplicated.loadSubset({})
598594
expect(callCount).toBe(2)
599-
expect(calls[1]).toEqual({ where: not(gt(ref(`age`), val(20))) }) // Should request all data except what we already loaded
595+
expect(calls[1]).toEqual({ where: not(gt(ref(`age`), val(20))) })
596+
597+
// After loading all data, subsequent calls should be deduplicated
598+
const result = await deduplicated.loadSubset({
599+
where: gt(ref(`age`), val(5)),
600+
})
601+
expect(result).toBe(true)
602+
expect(callCount).toBe(2)
603+
})
604+
605+
it(`should not produce unbounded WHERE expressions when loading all data after eq accumulation`, async () => {
606+
// This test reproduces the production bug where accumulating many eq predicates
607+
// and then loading all data (no WHERE clause) caused unboundedly growing
608+
// expressions instead of correctly setting hasLoadedAllData=true.
609+
let callCount = 0
610+
const calls: Array<LoadSubsetOptions> = []
611+
const mockLoadSubset = (options: LoadSubsetOptions) => {
612+
callCount++
613+
calls.push(cloneOptions(options))
614+
return Promise.resolve()
615+
}
616+
617+
const deduplicated = new DeduplicatedLoadSubset({
618+
loadSubset: mockLoadSubset,
619+
})
620+
621+
// Simulate visiting multiple tasks, each adding an eq predicate
622+
for (let i = 0; i < 10; i++) {
623+
await deduplicated.loadSubset({
624+
where: eq(ref(`task_id`), val(`uuid-${i}`)),
625+
})
626+
}
627+
// After 10 eq calls, unlimitedWhere should be IN(task_id, [uuid-0, ..., uuid-9])
628+
expect(callCount).toBe(10)
629+
630+
// Now load all data (no WHERE clause)
631+
// This should send NOT(IN(...)) to the backend but track as "all data loaded"
632+
await deduplicated.loadSubset({})
633+
expect(callCount).toBe(11)
634+
635+
// The load request should be NOT(IN(task_id, [all accumulated uuids]))
636+
const loadWhere = calls[10]!.where as any
637+
expect(loadWhere.name).toBe(`not`)
638+
expect(loadWhere.args[0].name).toBe(`in`)
639+
expect(loadWhere.args[0].args[0].path).toEqual([`task_id`])
640+
const loadedUuids = (
641+
loadWhere.args[0].args[1].value as Array<string>
642+
).sort()
643+
const expectedUuids = Array.from(
644+
{ length: 10 },
645+
(_, i) => `uuid-${i}`,
646+
).sort()
647+
expect(loadedUuids).toEqual(expectedUuids)
648+
649+
// Critical: after loading all data, subsequent requests should be deduplicated
650+
const result1 = await deduplicated.loadSubset({
651+
where: eq(ref(`task_id`), val(`uuid-999`)),
652+
})
653+
expect(result1).toBe(true) // Covered by "all data" load
654+
expect(callCount).toBe(11) // No additional call
655+
656+
// Loading all data again should also be deduplicated
657+
const result2 = await deduplicated.loadSubset({})
658+
expect(result2).toBe(true)
659+
expect(callCount).toBe(11) // Still no additional call
660+
})
661+
662+
it(`should not produce unbounded WHERE expressions with synchronous loadSubset`, async () => {
663+
// Same scenario as the async accumulation test, but with a sync mock
664+
// to exercise the sync return path (line 150 of subset-dedupe.ts)
665+
let callCount = 0
666+
const mockLoadSubset = () => {
667+
callCount++
668+
return true as const
669+
}
670+
671+
const deduplicated = new DeduplicatedLoadSubset({
672+
loadSubset: mockLoadSubset,
673+
})
674+
675+
// Accumulate eq predicates via sync returns
676+
for (let i = 0; i < 10; i++) {
677+
deduplicated.loadSubset({
678+
where: eq(ref(`task_id`), val(`uuid-${i}`)),
679+
})
680+
}
681+
expect(callCount).toBe(10)
682+
683+
// Load all data (no WHERE clause) — should track as "all data loaded"
684+
deduplicated.loadSubset({})
685+
expect(callCount).toBe(11)
686+
687+
// Subsequent requests should be deduplicated
688+
const result1 = deduplicated.loadSubset({
689+
where: eq(ref(`task_id`), val(`uuid-999`)),
690+
})
691+
expect(result1).toBe(true)
692+
expect(callCount).toBe(11)
693+
694+
const result2 = deduplicated.loadSubset({})
695+
expect(result2).toBe(true)
696+
expect(callCount).toBe(11)
697+
})
698+
699+
it(`should handle multiple all-data loads without expression growth`, async () => {
700+
let callCount = 0
701+
const mockLoadSubset = () => {
702+
callCount++
703+
return Promise.resolve()
704+
}
705+
706+
const deduplicated = new DeduplicatedLoadSubset({
707+
loadSubset: mockLoadSubset,
708+
})
709+
710+
// First: load some specific data
711+
await deduplicated.loadSubset({
712+
where: eq(ref(`task_id`), val(`uuid-1`)),
713+
})
714+
expect(callCount).toBe(1)
715+
716+
// Load all data (first time)
717+
await deduplicated.loadSubset({})
718+
expect(callCount).toBe(2)
719+
720+
// Load all data (second time) - should be deduplicated since we already have everything
721+
const result = await deduplicated.loadSubset({})
722+
expect(result).toBe(true)
723+
expect(callCount).toBe(2) // No additional call - all data already loaded
600724
})
601725

602726
it(`should handle multiple overlapping unlimited calls`, async () => {

0 commit comments

Comments
 (0)