Commit a202283
**Closes #930.**
`verifyOwnerFromSnapshot` was comparing the **raw mutation key** against
the engine's **routing-key ranges**, which sit in a completely different
lexicographical band. This caused every adapter write (DynamoDB, SQS,
S3, Redis internal) that crossed into a non-default Raft group to be
rejected with a spurious `ErrComposed1Violation`.
## Reproduction (Issue #930)
1. Start a multi-group cluster with `--shardRanges ":<T3_KEY>=1,<T3_KEY>:=2"`.
2. Run the M5a multi-table workload (PRs #911 / #916 / #924 / #925).
3. Every CreateTable / Put / Get against a table routed to group 2 fails with the
following FSM error:
```
observed-version v=1: key "!ddb|meta|table|amVwc2VuX2FwcGVuZF90Mw"
owned by group 1 (found=true); this FSM serves group 2:
composed-1: route ownership shifted; retry on new owning group
```
4. Workers get `DynamoDB ResourceNotFoundException` on every txn, and Elle reports
`:empty-transaction-graph`.
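
To make the `--shardRanges` value in step 1 easier to read, here is a minimal parsing sketch. The grammar (`start:end=group` entries separated by commas, with an empty bound meaning "unbounded") is inferred from that single example value and is an assumption; `shardRange` and `parseShardRanges` are hypothetical names, not the project's actual parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// shardRange is a hypothetical parsed entry: keys in [Start, End) go to Group.
// An empty Start or End is read here as "unbounded" (an assumption).
type shardRange struct {
	Start, End string
	Group      uint64
}

// parseShardRanges parses comma-separated "start:end=group" entries.
// The group follows the last '=' and the bounds split at the first ':'.
func parseShardRanges(spec string) ([]shardRange, error) {
	var out []shardRange
	for _, entry := range strings.Split(spec, ",") {
		eq := strings.LastIndex(entry, "=")
		if eq < 0 {
			return nil, fmt.Errorf("entry %q: missing '='", entry)
		}
		group, err := strconv.ParseUint(entry[eq+1:], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("entry %q: bad group id: %w", entry, err)
		}
		bounds := entry[:eq]
		colon := strings.Index(bounds, ":")
		if colon < 0 {
			return nil, fmt.Errorf("entry %q: missing ':'", entry)
		}
		out = append(out, shardRange{
			Start: bounds[:colon],
			End:   bounds[colon+1:],
			Group: group,
		})
	}
	return out, nil
}

func main() {
	// "T3" stands in for <T3_KEY>, which is a routing key in the real cluster.
	ranges, err := parseShardRanges(":T3=1,T3:=2")
	if err != nil {
		panic(err)
	}
	for _, r := range ranges {
		fmt.Printf("[%q, %q) -> group %d\n", r.Start, r.End, r.Group)
	}
}
```

Read this way, the value splits the keyspace at `<T3_KEY>`: everything below it goes to group 1 and everything at or above it goes to group 2.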
## Root cause
How `OwnerOf(rawKey)` behaves:
- `routes[].Start` and `routes[].End` live in the **routing key** namespace
(`!ddb|route|table|...`).
- `mut.Key`, however, is a **raw user key** (`!ddb|meta|table|...`).
- `!ddb|meta|...` (ASCII `m`=109) and `!ddb|route|...` (ASCII `r`=114) diverge at the
sixth character (byte offset 5).
- Result: every `!ddb|meta|...` key sorts lexicographically below `T3_KEY=!ddb|route|...`
and therefore "falls" into group 1.
- The actual routing (`engine.GetRoute(routeKey(rawKey))`) correctly returns group 2,
so Dispatch and verify disagree.
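
The mismatch can be reproduced in isolation. The following is a minimal sketch rather than the engine's code: `route`, `ownerOf`, and the sample boundary key are hypothetical stand-ins, and `routeKey` is assumed here to simply swap the `!ddb|meta|` prefix for `!ddb|route|`.

```go
package main

import (
	"bytes"
	"fmt"
)

// route is a hypothetical half-open key range [Start, End) owned by one Raft group.
// A nil End means the range is unbounded above.
type route struct {
	Start, End []byte
	Group      uint64
}

// ownerOf returns the group whose range contains key, using plain byte-wise ordering.
func ownerOf(routes []route, key []byte) (uint64, bool) {
	for _, r := range routes {
		if bytes.Compare(key, r.Start) >= 0 && (r.End == nil || bytes.Compare(key, r.End) < 0) {
			return r.Group, true
		}
	}
	return 0, false
}

// routeKey is an assumed normalization: replace the meta prefix with the route prefix.
func routeKey(raw []byte) []byte {
	const metaPrefix, routePrefix = "!ddb|meta|", "!ddb|route|"
	if bytes.HasPrefix(raw, []byte(metaPrefix)) {
		return append([]byte(routePrefix), raw[len(metaPrefix):]...)
	}
	return raw
}

func main() {
	boundary := []byte("!ddb|route|table|t3") // stands in for T3_KEY
	routes := []route{
		{Start: []byte(""), End: boundary, Group: 1},
		{Start: boundary, End: nil, Group: 2},
	}
	raw := []byte("!ddb|meta|table|t3")

	g, _ := ownerOf(routes, raw)
	fmt.Println("raw key owner:", g) // 'm' < 'r' at byte offset 5, so group 1

	g, _ = ownerOf(routes, routeKey(raw))
	fmt.Println("routing key owner:", g) // group 2
}
```

Because the comparison is byte-wise, the table name never gets a chance to matter for raw keys: the ordering is already decided by `m` versus `r`.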
## Fix
Normalize the lookup to `snap.OwnerOf(routeKey(mut.Key))` so that the comparison
happens in the routing-key namespace. This matches the semantics of
`ShardRouter.ResolveGroup` and `ShardStore.GetAt`, which both call
`engine.GetRoute(routeKey(rawKey))`.
## E2E verification
Result of running `./scripts/run-jepsen-m5-local.sh` after the fix:
```
results.edn: {:valid? true}
```
Workers complete their txns normally, reads observe every preceding append, and the
pre-fix `ResourceNotFoundException` is fully resolved.
## Caller audit (loop directive)
`verifyOwnerFromSnapshot` is called from two places in `verifyComposed1`:
- observed-version check (line 618)
- current-version check (line 626)
Both receive the same mutation list, so the normalization applies equally to both.
The function is unexported, so it has no external callers.
## Test plan
- [x] `./scripts/run-jepsen-m5-local.sh`: `{:valid? true}`
- [ ] Add FSM-level unit tests (follow-up, to be done together with building test
infrastructure for multi-group FSM verification)
- [x] `go build ./...`: 0 issues
## Related PRs
- #926 (`fix/m5a-local-host`): `--host 127.0.0.1` + `--shardRanges`
default coverage. This is also a routing problem, but PR #926 fixes the `Dispatch`
failure path, whereas this PR fixes routing-key normalization in `verifyComposed1`.
Both are needed.
- #911 / #916 / #924 / #925: Composed-1 M5a implementation (all merged)
- Parent design doc:
`docs/design/2026_05_29_partial_composed1_cross_group_commit_guard.md`
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
  * Fixed a route ownership verification issue that was incorrectly
    flagging certain adapter write operations as violations. The system now
    properly handles key comparisons during ownership checks, preventing
    spurious errors and improving system reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
3 files changed
Lines changed: 50 additions & 5 deletions