Add lance_dataset_restore() for restoring a prior version #12

@LuciferYang

Summary

Add an FFI API to restore a Lance dataset to a prior version. Closes the "restore" portion of the Version management row in the README Phase 3 roadmap.

Motivation

Dataset::restore creates a new manifest version whose fragments match those of an older version, effectively "rolling back" the visible latest. The C API does not expose this operation today. This is the first metadata-only commit path added to the C FFI.

Proposed API

C:

/* Returns a fresh open handle at V_new (the new manifest version).
   The input handle is unaffected and remains usable. Returns NULL on error. */
LanceDataset* lance_dataset_restore(const LanceDataset* dataset, uint64_t version);

C++:

// Member on lance::Dataset; throws lance::Error on failure.
Dataset restore(uint64_t version) const;

Upstream mapping

Confirmed against upstream lance source:

  • checkout_version(&self, version: impl Into<Ref>) -> Result<Self> — returns a fresh Dataset at the target version.
  • restore(&mut self) -> Result<()> — mutates the checked-out Dataset in place.

FFI shim call sequence:

let mut ds = original.checkout_version(version).await?;  // owned value, not shared
ds.restore().await?;                                     // in-place mutation on the local value
// Wrap ds in Arc<Dataset> and return a fresh LanceDataset* handle.

Because ds is a locally-owned value before wrapping, the existing Arc<Dataset> handle model is preserved — no RwLock required.

Semantics

  • On success, the returned handle points at V_new (the new manifest), which has the same fragments as V_old, and V_new.version > V_old.version. The returned handle never reports V_old.version: lance_dataset_version() on it returns V_new.version.
  • version == 0 → LANCE_ERR_INVALID_ARGUMENT (0 is reserved for "latest" in lance_dataset_open; do not overload).
  • Restore to current latest → no-op success via skip-commit: return a fresh handle at the already-latest version without writing a new manifest. Rationale: the caller's intent ("make latest match this version") is already satisfied.
  • Non-existent target version → LANCE_ERR_NOT_FOUND.
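
The four rules above can be sketched with a small in-memory model. This is only an illustration of the intended behaviour, not the FFI implementation: every toy_* name, the ToyStatus values, and the fixed-size store are hypothetical, and the model assumes contiguous version numbers starting at 1, which the real manifest history need not satisfy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the LANCE_ERR_* codes; numeric values are arbitrary. */
typedef enum {
    TOY_OK = 0,
    TOY_ERR_INVALID_ARGUMENT,
    TOY_ERR_NOT_FOUND,
    TOY_ERR_FULL /* toy-only: the fixed-size store has no room left */
} ToyStatus;

#define TOY_MAX_VERSIONS 16

/* Toy manifest log: rows[i] is the row count visible at version i + 1. */
typedef struct {
    uint64_t rows[TOY_MAX_VERSIONS];
    uint64_t latest; /* highest committed version; 0 means nothing committed */
} ToyStore;

/* Commit a new version with the given row count.
   Returns the new version number, or 0 if the store is full. */
uint64_t toy_commit(ToyStore *s, uint64_t row_count) {
    if (s->latest >= TOY_MAX_VERSIONS) return 0;
    s->rows[s->latest] = row_count;
    s->latest += 1;
    return s->latest;
}

/* Models the proposed restore rules. On success, *out_version is the
   version that the returned handle would report. */
ToyStatus toy_restore(ToyStore *s, uint64_t version, uint64_t *out_version) {
    if (s == NULL || out_version == NULL) return TOY_ERR_INVALID_ARGUMENT;
    if (version == 0) return TOY_ERR_INVALID_ARGUMENT; /* 0 means "latest" elsewhere */
    if (version > s->latest) return TOY_ERR_NOT_FOUND; /* no such version */
    if (version == s->latest) {                        /* already latest: skip commit */
        *out_version = version;
        return TOY_OK;
    }
    /* Write a new manifest whose contents match the target version. */
    uint64_t v_new = toy_commit(s, s->rows[version - 1]);
    if (v_new == 0) return TOY_ERR_FULL;
    *out_version = v_new;
    return TOY_OK;
}

/* Walks through the scenario: write, append, then exercise each rule.
   Returns the version produced by restoring to V1. */
uint64_t toy_scenario(void) {
    ToyStore s = {0};
    uint64_t v1 = toy_commit(&s, 100); /* initial write */
    uint64_t v2 = toy_commit(&s, 250); /* append */
    uint64_t got = 0;

    assert(toy_restore(&s, 0, &got) == TOY_ERR_INVALID_ARGUMENT);
    assert(toy_restore(&s, 99, &got) == TOY_ERR_NOT_FOUND);

    /* Restore to latest: success, and no new version is written. */
    assert(toy_restore(&s, v2, &got) == TOY_OK);
    assert(got == v2 && s.latest == v2);

    /* Restore to V1: a new version above V2 with V1's row count. */
    assert(toy_restore(&s, v1, &got) == TOY_OK);
    assert(got > v2 && s.rows[got - 1] == 100);
    return got;
}
```

Calling toy_scenario() from a test harness exercises each rule once. The model also shows why restore-to-latest can skip the commit: the version counter does not move, so there is nothing new for other readers to observe.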

Tests

  • Write → record V1 → append → record V2 → restore(V1) → verify version() reports V_new (> V2) and count_rows() matches V1's row count.
  • Restore to non-existent version → LANCE_ERR_NOT_FOUND.
  • Restore to 0 → LANCE_ERR_INVALID_ARGUMENT.
  • Restore to current latest → success, returned handle's version() equals pre-call latest_version(), no new manifest file written.
  • C/C++ integration: round-trip happy path in tests/cpp/test_c_api.c + test_cpp_api.cpp.

Locked design decisions

  • Returns a fresh handle rather than mutating in place (no RwLock).
  • Restore-to-latest is a no-op success (skip commit).

Related

  • Part of the plan in docs/plan-write-path.md.
  • Sibling tickets: A1 (versions list), A3 (checkout docs), B1 (dataset write), B2 (write params).
