Commit fda259a
[SPARK-XXXXX][SS] Widen stateful operator output and state schema nullability
### What changes were proposed in this pull request?
Introduce a three-component fix for stateful-operator nullability drift,
gated by `spark.sql.streaming.statefulOperator.alwaysNullableOutput.enabled`
(pinned per-query via the offset log):
- (a) `WidenStatefulOpNullability.widenStateSchema`: every stateful physical
exec widens its state key/value schema to fully nullable at construction.
- (b) `WidenStatefulOpNullability.widenOutputForStatefulOp`: every stateful
logical and physical operator widens its declared `output` to fully nullable.
- (c) `WidenStatefulOperatorAttributeNullability`: an optimizer rule that
widens `AttributeReference`s inside stateful ops' internal expressions and
propagates upward through ancestor expressions.
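
A minimal sketch of what "fully nullable" could mean for a state schema, assuming the widening recurses into structs, arrays, and maps. The `WidenSketch` object and its method names are hypothetical and are not the helpers added by this commit; only the public `org.apache.spark.sql.types` classes are real.

```scala
import org.apache.spark.sql.types._

// Hypothetical illustration, not the code from this commit.
object WidenSketch {
  // Recursively mark struct fields, array elements, and map values as nullable.
  def deepWiden(dt: DataType): DataType = dt match {
    case StructType(fields) =>
      StructType(fields.map { f =>
        f.copy(dataType = deepWiden(f.dataType), nullable = true)
      })
    case ArrayType(elementType, _) =>
      ArrayType(deepWiden(elementType), containsNull = true)
    case MapType(keyType, valueType, _) =>
      // Map keys themselves can never be null; only their nested types are widened.
      MapType(deepWiden(keyType), deepWiden(valueType), valueContainsNull = true)
    case other => other
  }

  // Widen a top-level schema, e.g. a state key or value schema.
  def widenSchema(schema: StructType): StructType =
    StructType(schema.fields.map { f =>
      f.copy(dataType = deepWiden(f.dataType), nullable = true)
    })
}
```

Because the result no longer depends on the nullability the optimizer happened to infer, two plans that differ only in nullability would produce the same widened schema under this sketch.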
### Why are the changes needed?
`PropagateEmptyRelation` can drop empty `Union` branches. A `Union`
output column is nullable only if at least one remaining branch declares
it nullable, so dropping a branch can flip per-column nullability, and
the flip propagates into a stateful operator's state schema across
microbatches or restarts. The result is either
`STATE_STORE_KEY_SCHEMA_NOT_COMPATIBLE` on restart or a codegen NPE
when state-restored rows carry nulls in columns declared non-nullable.
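
The nullability flip can be illustrated on schemas alone. The sketch below assumes a union column is nullable exactly when some surviving branch declares it nullable; `UnionNullabilitySketch` and `unionSchema` are hypothetical names used only for illustration, and the function models the schema arithmetic rather than Spark's `Union` operator.

```scala
import org.apache.spark.sql.types._

// Hypothetical illustration of per-column nullability across union branches.
object UnionNullabilitySketch {
  // Combine branch schemas column by column; all branches must have the same arity.
  def unionSchema(branches: Seq[StructType]): StructType = {
    require(branches.nonEmpty, "at least one branch is required")
    val columns = branches.map(_.fields.toSeq).transpose
    StructType(columns.map { cols =>
      // Nullable if any surviving branch is nullable for this column.
      cols.head.copy(nullable = cols.exists(_.nullable))
    })
  }
}
```

With one branch declaring `id` non-nullable and another declaring it nullable, the combined column is nullable; once the nullable branch is dropped as empty, the same column becomes non-nullable, which is the drift described above.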
### Does this PR introduce _any_ user-facing change?
No behavior change for new queries beyond schema nullability: all
stateful operator outputs become nullable, which is semantically correct.
Existing queries keep their original behavior because the config value
is pinned per query in the offset log.
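
One way to picture the offset log gate is the sketch below. It assumes that a new query takes the session value, and that a restarted query uses whatever its checkpoint recorded, falling back to `false` when the key is absent. That fallback, and the names `GateSketch` and `effectiveValue`, are assumptions made for illustration and are not taken from the commit; only the config key comes from the description above.

```scala
// Hypothetical illustration of pinning a config per query via the offset log.
object GateSketch {
  val Key = "spark.sql.streaming.statefulOperator.alwaysNullableOutput.enabled"

  // checkpointConfs is None for a brand-new query, or Some(confs) holding the
  // SQL confs recorded in the offset log of an existing checkpoint.
  def effectiveValue(
      sessionValue: Boolean,
      checkpointConfs: Option[Map[String, String]]): Boolean =
    checkpointConfs match {
      case None => sessionValue
      // Assumption: a checkpoint written before this change has no entry,
      // so the query keeps its original (non-widened) behavior.
      case Some(confs) => confs.get(Key).exists(_.toBoolean)
    }
}
```

Under these assumptions, changing the session config after a query has started does not change how that query's state schema is derived on restart.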
### How was this patch tested?
New `StreamingStatefulOperatorNullabilityDriftSuite` covering:
- New-query path: Union-branch-drop restart scenarios for aggregate,
dropDuplicates, dropDuplicatesWithinWatermark.
- Codegen NPE regression with struct grouping keys.
- Existing-query path: widening forced off still triggers schema mismatch.
- Rule-level: scope check (non-stateful subtrees skipped).
- Helper-level: `deepWidenAttribute` recursion into nested types.
### Was this patch authored or co-authored using generative AI tooling?
Yes.

1 parent 3e8c865 commit fda259a
20 files changed
Lines changed: 877 additions & 277 deletions
File tree
- sql
  - catalyst/src/main/scala/org/apache/spark/sql
    - catalyst
      - analysis
      - plans/logical
    - internal
  - connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/streaming
  - core/src
    - main/scala/org/apache/spark/sql/execution
      - adaptive
      - python/streaming
      - streaming
        - checkpointing
        - operators/stateful
          - flatmapgroupswithstate
          - join
          - transformwithstate
        - runtime
    - test/scala/org/apache/spark/sql
      - execution/streaming/state
      - streaming
Per-file lines changed (file names and diff bodies were not captured; 10 of the 20 files are shown):
- 129 additions & 0 deletions
- 25 additions & 7 deletions
- 11 additions & 1 deletion
- 7 additions & 3 deletions
- 16 additions & 0 deletions
- 1 addition & 1 deletion
- 3 additions & 2 deletions
- 3 additions & 1 deletion
- 6 additions & 2 deletions
- 4 additions & 2 deletions