feat: add lance_dataset_restore for rolling back to a prior version#18
Conversation
Commits a new manifest that aliases the fragments of an older version as the new latest. Returns a fresh LanceDataset* at the new manifest; the caller's original handle is untouched. If the target is already the latest, the call is a no-op (no new manifest is written). Version 0 is rejected up front since lance_dataset_open already uses 0 as the "latest" sentinel; overloading it here would be confusing. Non-existent versions surface upstream's Error::NotFound as LANCE_ERR_NOT_FOUND; concurrent commits surface as LANCE_ERR_COMMIT_CONFLICT. C++ gets a matching lance::Dataset::restore(version) member that returns a new Dataset. Tests: 5 new Rust unit tests (restore-to-prior with version/rowcount assertions, restore-to-latest no-op, non-existent version, version == 0 rejection, NULL dataset rejection). C and C++ integration tests smoke the no-op path against the shared read-only dataset. Closes lance-format#12.
|
cc @jja725 |
jja725
left a comment
There was a problem hiding this comment.
LGTM overall — clean handle pattern, sensible rejection of version == 0, original handle untouched. One inline note on the no-op short-circuit's behavior under concurrent writes.
| let restored = block_on(async { | ||
| let latest = ds.inner.latest_version_id().await?; | ||
| let mut checked_out = ds.inner.checkout_version(version).await?; | ||
| if version != latest { |
There was a problem hiding this comment.
There's a TOCTOU window between latest_version_id().await? (line 61) and this comparison: if a concurrent writer commits a new manifest in between, latest is stale, but we still skip the restore() call. The user's call returns success even though their stated intent — "make version the new latest" — was silently violated (the actual latest is now the concurrent writer's version, not theirs).
The restore() path itself surfaces concurrent commits as LANCE_ERR_COMMIT_CONFLICT, but the no-op path swallows them.
A few options:
- Drop the optimization — always call
restore(). Worst case is one extra manifest when the user intentionally restores to the current latest, which is unusual. - Keep but document — note in the docstring that the no-op decision is based on a snapshot of
latest_version_idand may be incorrect under concurrent writes. - Re-read after checkout — read
latestaftercheckout_versionand compare toversion; still racy, but tighter window.
My preference would be option 1 — the optimization is small and the surprise is real.
There was a problem hiding this comment.
Good catch — went with option 1 (always commit). Pushed in bf0c3e4: dropped the latest_version_id/skip optimization so restore() is unconditionally called after checkout_version(). Updated the Rust + C + C++ tests to assert the new always-bumps semantics, and refreshed the docstring + module doc to call out the TOCTOU rationale.
Branch was also rebased over the Phase 2 merges; restore.rs picked up the upstream LanceDataset.inner: RwLock<Arc<Dataset>> change via ds.snapshot(). All tests green (cargo test + cargo test --test compile_and_run_test -- --ignored).
Drops the skip-when-version==latest optimization in lance_dataset_restore. The prior fast-path read latest_version_id before deciding whether to commit, opening a TOCTOU window: a concurrent writer landing a newer manifest in between would silently leave their version as latest, even though the caller's stated intent was "make `version` the new latest". Always calling restore() costs at most one extra manifest in the rare case the caller intentionally restores to the current latest, and makes the API's contract hold unconditionally under concurrent writers. Updates Rust + C + C++ tests to assert the new always-bumps behavior. Addresses review on PR lance-format#18.
Pulls in the Phase 2 index/scanner additions (a663d53, c6b2daf, 7cd4ab6). Conflict resolutions: - src/lib.rs: keep both `mod index` and `mod restore` plus their re-exports. - tests/c_api_test.rs: place the Restore section before the Index lifecycle section so both test groups coexist. - src/restore.rs: adapt to the upstream change of `LanceDataset.inner` from `Arc<Dataset>` to `RwLock<Arc<Dataset>>`. Take a snapshot via the new `LanceDataset::snapshot()` accessor and wrap the restored dataset in a fresh `RwLock<Arc<...>>` to match the new field type.
|
Could you please take another look at this PR when you have time? Thanks @jja725 |
## Summary Bring the README roadmap in line with what actually shipped on `main`: - Fix the **header path links** at the top — `include/lance.h` / `include/lance.hpp` 404'd after #26 (Phase 4 packaging) moved the headers under `include/lance/`. - **Phase 3 → Dataset write**: flip `[ ]` → `[x]` (`lance_dataset_write` from #16, `lance_dataset_write_with_params` from #20). - **Phase 3 → Version management**: flip `[ ]` → `[x]` — list (`lance_dataset_versions`, #18), rollback (`lance_dataset_restore`, #18), and version checkout (`lance_dataset_open(uri, opts, version)`, documented in #19). - **Additional features**: add `Substrait filter pushdown` row pointing at `lance_scanner_set_substrait_filter` (#25). No source changes; docs only. ## Test plan - [x] Markdown renders correctly (links resolve to `include/lance/lance.h` and `include/lance/lance.hpp`) - [x] `git diff origin/main` shows only the four targeted edits
Summary
lance_dataset_restore(dataset, version)— commits a new manifest whose fragments matchversionand returns a fresh handle at that versionlance::Dataset::restore(version)member returning a newDatasetMotivation
With
lance_dataset_versions(#17) C/C++ callers can list the history, but there was no way to actually roll back to an older version. This closes the remaining piece of read-side version management from the Phase 3 roadmap.Contract
versionalready matches the current latest, so the caller's "makeversionthe new latest" intent holds unconditionally under concurrent writers.version == 0is rejected withLANCE_ERR_INVALID_ARGUMENT(it islance_dataset_open's "latest" sentinel).LANCE_ERR_INVALID_ARGUMENT(NULL handle orversion == 0),LANCE_ERR_NOT_FOUND(unknown version),LANCE_ERR_COMMIT_CONFLICT(concurrent commit, typically another restore).Test plan
cargo test— 5 Rust tests:NotFoundversion == 0rejectioncargo clippy --all-targets -- -D warningsandcargo fmt --checkcleancargo test --test compile_and_run_test -- --ignored— C and C++ end-to-end tests exercise restore-to-current and assert the version bumpCloses #12.