fix: compact nested projection list elements by index #171

Merged: pdf-amzn merged 1 commit into ExtendDB:main from yesyayen:fix/nested-projection-compaction on Jun 10, 2026

@yesyayen (Collaborator) commented Jun 8, 2026

What

This makes nested ProjectionExpression results match Amazon DynamoDB when a projection selects more than one element of the same list (for example a[0], a[2]). Amazon DynamoDB returns those elements compacted into a new list, ordered by their original index. ExtendDB collapsed them to a single element: the one selected by the last path in the expression.

I replaced the per-path resolve / wrap / merge assembly in crates/core/src/expression/projection.rs with a projection trie. apply_projection now builds one trie from all paths (resolving #name references and rejecting paths that start with an index) and walks the item once: a map keeps only the selected keys, and a list emits the selected indices in ascending order.
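To make the trie walk concrete, here is a minimal Python sketch of the idea. It is not the Rust code in projection.rs: plain dicts, lists, and None stand in for DynamoDB attribute values, and every name in it (Node, parse_path, build_trie, walk, the simplified path grammar) is an assumption made for illustration. The apply_projection here is a stand-in that takes the expression as a string, which need not match the real signature. It does not validate overlapping paths, and it assumes out-of-range indices are simply skipped.

```python
import re

# Hypothetical, simplified path grammar: names, ".name" steps, and "[n]" index steps.
_TOKEN = re.compile(r"\.?([#\w]+)|\[(\d+)\]")
_MISSING = object()  # sentinel: nothing was selected under this node


class Node:
    def __init__(self):
        self.whole = False   # True when a path ends here: keep the entire value
        self.children = {}   # str key or int index -> Node


def parse_path(path, names=None):
    """Split 'a[0].b' into ['a', 0, 'b'], resolving '#name' placeholders."""
    names = names or {}
    path = path.strip()
    steps, pos = [], 0
    while pos < len(path):
        m = _TOKEN.match(path, pos)
        if not m:
            raise ValueError(f"bad path: {path!r}")
        if m.group(2) is not None:
            steps.append(int(m.group(2)))
        else:
            key = m.group(1)
            steps.append(names[key] if key.startswith("#") else key)
        pos = m.end()
    if not steps or isinstance(steps[0], int):
        raise ValueError(f"path must start with an attribute name: {path!r}")
    return steps


def build_trie(expression, names=None):
    """Insert every comma-separated path into a single trie."""
    root = Node()
    for path in expression.split(","):
        node = root
        for step in parse_path(path, names):
            node = node.children.setdefault(step, Node())
        node.whole = True
    return root


def walk(value, node):
    """Walk the value once, keeping only what the trie selects."""
    if node.whole:
        return value  # includes a projected NULL (None in this sketch)
    if isinstance(value, dict):
        out = {}
        for key, child in node.children.items():
            if isinstance(key, str) and key in value:
                got = walk(value[key], child)
                if got is not _MISSING:
                    out[key] = got
        return out if out else _MISSING
    if isinstance(value, list):
        out = []
        # Ascending index order, regardless of the order in the expression.
        for index in sorted(k for k in node.children if isinstance(k, int)):
            if index < len(value):
                got = walk(value[index], node.children[index])
                if got is not _MISSING:
                    out.append(got)
        return out if out else _MISSING
    return _MISSING


def apply_projection(item, expression, names=None):
    got = walk(item, build_trie(expression, names))
    return {} if got is _MISSING else got
```

Because both mylist[2] and mylist[0] land under the same mylist node, the list branch sees both indices at once and can emit them in sorted order. Sub-paths that share an index (for example listOfMaps[0].val and listOfMaps[0].x) merge naturally because they share the trie node for index 0.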

Behavior: before this PR vs Amazon DynamoDB

The item under test has mylist = [zero, one, two, three], listOfMaps = [{val:a0,x:x0},{val:a1,x:x1},{val:a2,x:x2}], and listWithNull = [keep0, NULL, keep2].

| ProjectionExpression | ExtendDB (before this PR) | Amazon DynamoDB |
| --- | --- | --- |
| mylist[0], mylist[2] | {"mylist":["two"]} | {"mylist":["zero","two"]} |
| mylist[2], mylist[0] | {"mylist":["zero"]} | {"mylist":["zero","two"]} (ordered by index, not by expression) |
| mylist[1], mylist[3] | {"mylist":["three"]} | {"mylist":["one","three"]} |
| listOfMaps[0].val, listOfMaps[2].val | {"listOfMaps":[{"val":"a2"}]} | {"listOfMaps":[{"val":"a0"},{"val":"a2"}]} |
| listWithNull[0], listWithNull[2] | {"listWithNull":["keep2"]} | {"listWithNull":["keep0","keep2"]} |
| listWithNull (whole list) | {"listWithNull":["keep0",NULL,"keep2"]} | same (already correct) |
| listWithNull[1] (the NULL) | {"listWithNull":[NULL]} | same (already correct) |
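The "before" column is consistent with the mechanism the commit message describes: each path is wrapped into its own one-element list, and the merge then overwrites the list from the previous path. The Python sketch below reconstructs that failure mode under assumptions. It is not the removed Rust code, all function names are hypothetical, and paths are given pre-parsed as step lists.

```python
def resolve(item, steps):
    """Follow a pre-parsed path such as ['mylist', 2] down to its leaf value."""
    value = item
    for step in steps:
        value = value[step]
    return value


def wrap(steps, leaf):
    """Wrap a leaf back into containers: an index step always yields a one-element list."""
    for step in reversed(steps):
        leaf = [leaf] if isinstance(step, int) else {step: leaf}
    return leaf


def merge_overwrite(dst, src):
    """Merge maps key by key; anything else, lists included, is overwritten."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            merge_overwrite(dst[key], value)
        else:
            dst[key] = value  # the earlier one-element list is lost here
    return dst


def old_projection(item, paths):
    """Per-path resolve / wrap / merge: the last path touching a list wins."""
    out = {}
    for steps in paths:
        merge_overwrite(out, wrap(steps, resolve(item, steps)))
    return out
```

With mylist = [zero, one, two, three], the paths mylist[0], mylist[2] produce {"mylist": ["two"]} in this sketch, and reversing them produces {"mylist": ["zero"]}, matching the first two rows of the table.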

Why

This is a conformance gap and also a data-correctness bug: customers got the wrong elements back from a list projection. #162 corrected the read-path validation and parse-error labels and called out nested projection correctness as a separate follow-up; this is that follow-up.

Closes: n/a (no separate issue; tracked as the deferred nested-projection item from #162).

Testing done

Added tests/test_nested_projection.py (9 dual-target cases). Every expected result was captured from Amazon DynamoDB.

  • New pytest: 9 pass against ExtendDB and the same 9 pass against Amazon DynamoDB. On the unfixed binary, 5 of the 9 fail (the multi-index cases) and the 4 controls pass.
  • Rust unit tests in projection.rs: 17 pass, including 8 new cases covering compaction, index ordering, NULL handling, list-of-maps sub-field projection, and same-index merge.

Out of scope

The following are tracked separately:

  • apply_projection rebuilds the projection trie per returned item on Query and Scan. The paths are identical across items, so building the trie once per request and reusing it is a follow-up optimization. This matches the old apply path, which also did per-item work, so it is not a regression.
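The follow-up optimization could look roughly like the Python sketch below: build the trie once per request, then reuse it for every returned item. This is only an illustration of the idea, not a proposal for the Rust API. The names (build_trie, project, project_items) are hypothetical, paths are given pre-parsed as step lists, and the trie is a plain nested dict with a marker key where a path ends.

```python
WHOLE = object()    # marker key: a path ends at this node, keep the whole value
MISSING = object()  # sentinel: nothing selected under this node


def build_trie(paths):
    """Build one trie from pre-parsed paths such as ['mylist', 2]."""
    root = {}
    for steps in paths:
        node = root
        for step in steps:
            node = node.setdefault(step, {})
        node[WHOLE] = True
    return root


def project(value, node):
    """Read-only walk: the trie is never mutated, so it is safe to share."""
    if WHOLE in node:
        return value
    if isinstance(value, dict):
        picked = {}
        for key, child in node.items():
            if isinstance(key, str) and key in value:
                got = project(value[key], child)
                if got is not MISSING:
                    picked[key] = got
        return picked or MISSING
    if isinstance(value, list):
        indices = sorted(k for k in node if isinstance(k, int) and k < len(value))
        picked = [project(value[i], node[i]) for i in indices]
        picked = [v for v in picked if v is not MISSING]
        return picked or MISSING
    return MISSING


def project_items(items, paths):
    """Query/Scan-style loop: one trie per request, one walk per item."""
    trie = build_trie(paths)  # built once, outside the per-item loop
    results = []
    for item in items:
        got = project(item, trie)
        results.append({} if got is MISSING else got)
    return results
```

Sharing the trie works here because the walk only reads it. An implementation that stored per-item state on the trie nodes would need to reset that state between items or keep it elsewhere.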

Checklist

  • I have read CONTRIBUTING.md
  • All tests pass (cargo test --workspace)
  • Code is formatted (cargo fmt --check)
  • Clippy is clean (cargo clippy -- -W clippy::pedantic)
  • I have added or updated tests for new functionality
  • I have updated documentation if behavior changed (no docs change: this only makes ExtendDB more conformant)
  • Breaking changes are noted below (if any)
  • If this changes the wire protocol, Storage trait, auth model, on-disk format, or public CLI surface, an RFC has been accepted or is linked below. Otherwise, an ADR captures the decision (link below).

ADR / RFC: n/a.

Breaking changes

n/a


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache License 2.0 and I agree to the Developer Certificate of Origin (DCO). See CONTRIBUTING.md for details.

Multi-index list projections (e.g. `a[0], a[2]`) collapsed to one
element: each path built a one-element list and the merge overwrote it.
Replace the per-path wrap/merge with a projection trie that walks the
item once, emitting selected indices in ascending order. Preserves
projected NULLs and merges same-index sub-paths.
@yesyayen yesyayen marked this pull request as ready for review June 8, 2026 22:20
@pdf-amzn pdf-amzn added this pull request to the merge queue Jun 10, 2026
Merged via the queue into ExtendDB:main with commit cbea20f Jun 10, 2026
7 checks passed