Commit c4cac2a
Fix singleton stack-corruption NPE in DatetimeUdtNormalizeRule (opensearch-project#5458)
* fix: instantiate DatetimeUdt normalize/output-cast rules per plan() call
DatetimeUdtNormalizeRule and DatetimeOutputCastRule extend
RelHomogeneousShuttle, which inherits a stateful Deque<RelNode> stack
from RelShuttleImpl. DatetimeExtension.postAnalysisRules() returned
the static INSTANCE of each rule, sharing the same shuttle (and the
same stack) across every UnifiedQueryPlanner.plan() invocation.
If any traversal ever ends with an unbalanced stack, residual entries
persist to the next query. The next query's visitChild() then pops a
stale or empty stack and throws NoSuchElementException at
RelShuttleImpl.visitChild line 67 (the stack.pop() in the finally
block) — surfacing as the cluster-side stack trace reported on
analytics-engine-routed parquet indices for queries that combine
aggregations over datetime UDT columns (e.g.
"stats count() as field_count, distinct_count(field)").
Return fresh instances per plan() instead. Drop the INSTANCE
constants and the Lombok @NoArgsConstructor on both rules; document
the singleton-unsafety on each class JavaDoc.
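The failure mode and the fix described above can be illustrated with a minimal sketch. It does not use Calcite: `StackShuttleSketch` is a hypothetical stand-in that only assumes the push-then-pop-in-finally shape this message attributes to RelShuttleImpl.visitChild, and its `postAnalysisRules()` factory is an illustrative analogue of the per-call instantiation, not the project's actual code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a shuttle that records its traversal path on an
// instance-level stack; it is not Calcite's RelShuttleImpl.
class StackShuttleSketch {
    // Instance state: every caller that reuses this object shares it.
    final Deque<String> stack = new ArrayDeque<>();

    // Push the parent, run the child visit, and always pop afterwards.
    void visitChild(String parent, Runnable childVisit) {
        stack.push(parent);
        try {
            childVisit.run();
        } finally {
            stack.pop(); // throws NoSuchElementException on an empty stack
        }
    }

    // Hypothetical factory mirroring the fix: new instances on every call,
    // so no traversal can see another traversal's stack.
    static List<StackShuttleSketch> postAnalysisRules() {
        return List.of(new StackShuttleSketch(), new StackShuttleSketch());
    }

    // A balanced traversal on a fresh instance leaves the stack empty.
    static int depthAfterBalancedVisit() {
        StackShuttleSketch rule = postAnalysisRules().get(0);
        rule.visitChild("Aggregate", () -> rule.visitChild("Project", () -> {}));
        return rule.stack.size();
    }

    // Simulates interference on a shared instance: something else removes the
    // entry this visit pushed, so the pop in the finally block finds nothing.
    static boolean popFailsAfterForeignPop() {
        StackShuttleSketch shared = new StackShuttleSketch();
        try {
            shared.visitChild("Aggregate", () -> shared.stack.pop());
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("depth after balanced visit: " + depthAfterBalancedVisit());
        System.out.println("foreign pop fails: " + popFailsAfterForeignPop());
    }
}
```

The foreign pop is simulated deterministically here; in the scenario this message describes, the equivalent interference would come from another plan() call that reuses the same instance.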
Add a regression test that runs several plan() calls in sequence
against the same context, covering stats+distinct_count over both
schema-declared and eval-derived datetime columns.
Signed-off-by: Kai Huang <ahkcs@amazon.com>
* test: add analytics-engine regression IT for singleton stack-corruption
CalciteDatetimeUdtNormalizeRegressionIT exercises the failure pattern
that triggered the cluster-side NoSuchElementException:
stats + distinct_count over datetime columns, repeated 20 times to
amplify any plan() carry-over.
The IT is harness-aware:
- Without `-Dtests.analytics.force_routing=true`: queries go through
the V2 / Calcite engine path. The DatetimeUdtNormalizeRule path is
not exercised, so the IT passes as a baseline correctness check.
- With `-Dtests.analytics.force_routing=true
-Dtests.analytics.parquet_indices=true`: every query routes through
the analytics-engine path and hits the DatetimeUdtNormalizeRule
shuttle that this PR fixes. The 20-iteration pattern surfaces any
remaining singleton-stack carry-over.
CI's :integTest task (in-process testCluster without analytics-engine)
runs the IT through the V2 path, which is safe and fast. To verify the
analytics-engine path, run :integTestRemote against an
externally-managed cluster built per
`docs/dev/ppl-analytics-engine-routing.md`.
Signed-off-by: Kai Huang <ahkcs@amazon.com>
* test: switch regression IT to concurrent query pattern
The sequential iteration variant passed even with the singleton bug in
place: on a local cluster, calls made from a single thread do not carry
over enough state to trip it. The actual production trigger is parallel
queries from a dashboard "field statistics" panel: multiple cluster
threads call plan() simultaneously, all using the shared singleton's
non-thread-safe ArrayDeque. Their push/pop operations interleave and
corrupt the stack.
Verified locally against analytics-engine path with parquet indices:
- Unfixed cluster: 2-3 / 80 queries fail with NoSuchElementException
(HTTP 500), matching the production stack trace exactly.
- Fixed cluster: 0 / 80 failures.
Each test uses CompletableFuture with an 8-thread pool to fire 80
queries. The two tests are:
- testConcurrentStatsDistinctCountOverDatetime: same shape, varied
datetime fields.
- testConcurrentMixedDatetimePlans: three different plan shapes
interleaved — mixed visitChild call counts amplify the race.
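The concurrent pattern above can be sketched roughly as follows. This is an illustrative outline, not the IT's code: `ConcurrentQuerySketch`, `buildQueries`, `countFailures`, the `runQuery` callback, and the index name `my_index` are all hypothetical, and a real test would send each query to the cluster instead of calling a local function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical harness: fires a batch of queries on a fixed-size pool and
// counts how many of them failed with an exception.
class ConcurrentQuerySketch {

    // Builds `total` queries of the same shape, cycling through the given
    // fields. The index name is an assumption made for illustration.
    static List<String> buildQueries(int total, List<String> fields) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            String field = fields.get(i % fields.size());
            queries.add("source=my_index | stats count() as field_count, distinct_count("
                + field + ")");
        }
        return queries;
    }

    // Submits every query to the pool, waits for all of them, and returns the
    // number that threw. `runQuery` stands in for the real query call.
    static int countFailures(List<String> queries, int threads,
                             Function<String, String> runQuery) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Boolean>> futures = new ArrayList<>();
            for (String query : queries) {
                futures.add(CompletableFuture.supplyAsync(() -> {
                    try {
                        runQuery.apply(query);
                        return true;
                    } catch (RuntimeException e) {
                        return false; // count the failure instead of aborting the batch
                    }
                }, pool));
            }
            int failures = 0;
            for (CompletableFuture<Boolean> future : futures) {
                if (!future.join()) {
                    failures++;
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> queries = buildQueries(80, List.of("field_a", "field_b"));
        // The identity function stands in for a query call that always succeeds.
        System.out.println("failures: " + countFailures(queries, 8, q -> q));
    }
}
```

Counting failures rather than stopping at the first one matches how the results above are reported (failures out of 80), which is useful when only a few interleavings trigger the race.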
Signed-off-by: Kai Huang <ahkcs@amazon.com>
* test: rename to CalcitePlannerConcurrencyIT (review nit)
@dai-chen flagged that the IT name was over-scoped to a single rule and
the file would read better as a general bucket for planner-level
concurrency / state-isolation regressions. The actual surface under test
is UnifiedQueryPlanner's post-analysis pipeline — any RelShuttle
extension that doesn't isolate per-call state is unsafe under concurrent
load, not just the datetime rules.
Renames the file and class, updates the JavaDoc to describe the planner-
level invariant rather than the specific Datetime* rules, and notes the
current cases as the regression that motivated the suite. Test method
bodies and assertions are unchanged.
Signed-off-by: Kai Huang <ahkcs@amazon.com>
---------
Signed-off-by: Kai Huang <ahkcs@amazon.com>
1 parent dc05942 commit c4cac2a
5 files changed
Lines changed: 187 additions & 12 deletions
File tree
- api/src
- main/java/org/opensearch/sql/api/spec/datetime
- test/java/org/opensearch/sql/api/spec/datetime
- integ-test/src/test/java/org/opensearch/sql/calcite/remote
Lines changed: 2 additions & 1 deletion
Lines changed: 6 additions & 6 deletions
Lines changed: 3 additions & 5 deletions
Lines changed: 21 additions & 0 deletions
Lines changed: 155 additions & 0 deletions