Commit bfcc62b
[SPARK-56971][SS] Add CommitMetadataV3 and SinkMetadataInfo for sink evolution
### What changes were proposed in this pull request?
Backport of [SPARK-56971] ([#56019](#56019)) to `branch-4.2`.
Add the commit log data structures for streaming sink evolution:
- `CommitMetadataV3` (`VERSION_3` of the commit log wire format) carries a `sinkMetadataMap: Map[String, SinkMetadataInfo]` keyed by sink name, in addition to the V2 fields (`nextBatchWatermarkMs`, `stateUniqueIds`).
- `SinkMetadataInfo` records per-sink metadata: `sinkName`, `commitOffset` (serialized via `OffsetV2.json()`), `providerName`, `apiVersion`, and an `isActive` flag used to distinguish the current sink from historical sinks that were used in earlier batches but are no longer in use.
- `CommitMetadataV3.activeSinkMetadataInfo` returns the entry with `isActive = true`; `CommitMetadataV3` requires exactly one active sink.
- `CommitLog.createMetadata` learns to produce a `CommitMetadataV3` when `commitLogFormatVersion = VERSION_3`, requiring a non-empty `sinkMetadataMap`.
- `CommitLog.readCommitMetadata` dispatches `v3` files to the new class.
The V3 metadata is dormant in this PR: no caller produces it yet. Wiring through `MicroBatchExecution` is the SPARK-56972 follow-up.
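The shape described above can be sketched roughly as follows. This is an illustration, not the code from this commit: the field types (in particular `apiVersion: String` and the simplified `stateUniqueIds` type), the use of `require` for the invariant, and the `SinkEvolutionExample` object with its sample values are all assumptions made for the example.

```scala
// Illustrative sketch only. Field types and the validation style are assumptions,
// not the actual definitions added by this commit.
final case class SinkMetadataInfo(
    sinkName: String,
    commitOffset: String, // JSON string, e.g. as produced by OffsetV2.json()
    providerName: String,
    apiVersion: String,   // type assumed for illustration
    isActive: Boolean)

final case class CommitMetadataV3(
    nextBatchWatermarkMs: Long,
    stateUniqueIds: Option[Map[Long, Seq[Seq[String]]]], // simplified type, assumed
    sinkMetadataMap: Map[String, SinkMetadataInfo]) {

  // Exactly one entry may be the current sink; all others are historical.
  // An empty map has zero active sinks, so it is rejected here as well.
  require(
    sinkMetadataMap.values.count(_.isActive) == 1,
    "CommitMetadataV3 requires exactly one active sink")

  /** Returns the single entry whose isActive flag is true. */
  def activeSinkMetadataInfo: SinkMetadataInfo =
    sinkMetadataMap.values.find(_.isActive).get
}

// Hypothetical usage: one current sink plus one historical sink from earlier batches.
object SinkEvolutionExample {
  val current: SinkMetadataInfo =
    SinkMetadataInfo("sink-b", """{"offset":42}""", "parquet", "v2", isActive = true)
  val previous: SinkMetadataInfo =
    SinkMetadataInfo("sink-a", """{"offset":17}""", "console", "v2", isActive = false)

  val metadata: CommitMetadataV3 = CommitMetadataV3(
    nextBatchWatermarkMs = 0L,
    stateUniqueIds = None,
    sinkMetadataMap = Map(current.sinkName -> current, previous.sinkName -> previous))
}
```

Keying the map by sink name keeps historical entries addressable after the active sink changes, which is what allows a restart to look up the offset last committed under a given name.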
**Prerequisite commit.** SPARK-56971 was built on top of [SPARK-56970] ([#56018](#56018)), which splits `CommitMetadata` into a `CommitMetadataBase` trait with concrete `CommitMetadata` (V1) and `CommitMetadataV2` case classes. `branch-4.2` does not yet have SPARK-56970, so this PR includes it as the first commit and adds SPARK-56971 on top. Both commits are cherry-picked from the `branch-4.x` backports (`5322ec30c02` and `706ce2f3743`). The only conflicts were import-line collisions in `CommitLogSuite.scala` (the suite extends `SparkFunSuite with SharedSparkSession` on `branch-4.2`); the resolved `CommitLog.scala` is identical to `branch-4.x`.
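A minimal sketch of what the SPARK-56970 split might look like is shown below, with one shared trait and one case class per wire-format version. The `sealed` modifier, the `version` member, the simplified `stateUniqueIds` type, and the `CommitMetadataSketch.create` helper are hypothetical and exist only to illustrate version-based dispatch; the real `CommitLog.createMetadata` signature is not shown on this page.

```scala
// Illustrative sketch only. Member names other than the class/trait names and
// the V2 fields are assumptions, not the code from SPARK-56970.
sealed trait CommitMetadataBase {
  def version: Int
  def nextBatchWatermarkMs: Long
}

// V1: carries only the watermark.
final case class CommitMetadata(nextBatchWatermarkMs: Long = 0L)
    extends CommitMetadataBase {
  val version: Int = 1
}

// V2: adds the state store unique ids (type simplified for the example).
final case class CommitMetadataV2(
    nextBatchWatermarkMs: Long = 0L,
    stateUniqueIds: Option[Map[Long, Seq[Seq[String]]]] = None)
    extends CommitMetadataBase {
  val version: Int = 2
}

object CommitMetadataSketch {
  /** Hypothetical helper: picks the concrete class from the format version. */
  def create(
      formatVersion: Int,
      watermarkMs: Long,
      stateUniqueIds: Option[Map[Long, Seq[Seq[String]]]]): CommitMetadataBase =
    formatVersion match {
      case 1 => CommitMetadata(watermarkMs)
      case 2 => CommitMetadataV2(watermarkMs, stateUniqueIds)
      case other =>
        throw new IllegalArgumentException(s"Unsupported commit log version: $other")
    }
}
```

With a layout like this, adding a V3 class on top (as SPARK-56971 does) amounts to one more subtype and one more `case` in the dispatch, which is presumably why the refactor is a prerequisite.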
### Why are the changes needed?
SPARK-56719 added `DataStreamWriter.name()` as the API surface for sink evolution. Without a place in the commit log to durably record the sink name and offset alongside the rest of a committed batch's metadata, sink names cannot be observed on restart and the evolution feature cannot be completed. This PR introduces that storage in the 4.2 release line.
### Does this PR introduce _any_ user-facing change?
No. `CommitMetadataV3` is in the internal `org.apache.spark.sql.execution.streaming.checkpointing` package and is not produced by any code path yet. As part of the SPARK-56970 refactor, V1 commit log files no longer serialize `stateUniqueIds: null`; old V1 files continue to be read because the V1 deserializer ignores the (now-unknown) field.
### How was this patch tested?
- Cherry-picked the two `branch-4.x` commits; resolved import conflicts in `CommitLogSuite.scala`.
- Existing and new `CommitLogSuite` cases (V1/V2/V3 SerDe, historical-sink retention, `createMetadata` V3 empty-map failure, exactly-one-active-sink invariant).
- `sql/core` main and test sources compile cleanly on `branch-4.2` (`build/sbt sql/Test/compile`).
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (claude-opus-4-8)
Closes #56548 from ericm-db/SPARK-56971-branch-4.2.
Lead-authored-by: Eric Marnadi <eric.marnadi@databricks.com>
Co-authored-by: ericm-db <eric.marnadi@databricks.com>
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
1 parent c655abe commit bfcc62b
9 files changed
Lines changed: 393 additions & 93 deletions
File tree
- sql/core/src
- main/scala/org/apache/spark/sql/execution/streaming
- checkpointing
- runtime
- state
- test/scala/org/apache/spark/sql
- execution
- datasources/v2/state
- streaming/state
- streaming
Lines changed: 2 additions & 2 deletions
Lines changed: 190 additions & 21 deletions
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/runtime/MicroBatchExecution.scala
Lines changed: 4 additions & 2 deletions
Lines changed: 3 additions & 1 deletion
Lines changed: 13 additions & 21 deletions
Lines changed: 3 additions & 3 deletions