Commit d721501

jja725 and claude authored
docs: add roadmap and usage examples to README (#10)
## Summary
- Adds the [liblance RFC](lance-format/lance#6035) roadmap with implementation status checkboxes
- Adds project description, build instructions, and C/C++ usage examples

## Test plan
- [x] Documentation-only change, no code modified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 5771d6f commit d721501

1 file changed

Lines changed: 117 additions & 1 deletion

README.md

@@ -1,2 +1,118 @@
 # lance-c
-The C binding to Lance.
+
+The C/C++ binding to [Lance](https://github.com/lancedb/lance), providing native access to the Lance columnar format via a stable C ABI and header-only C++ RAII wrappers.
+
+- **C header:** [`include/lance.h`](include/lance.h)
+- **C++ wrappers:** [`include/lance.hpp`](include/lance.hpp) (header-only, RAII, exceptions)
+- **Data exchange:** [Arrow C Data Interface](https://arrow.apache.org/docs/format/CDataInterface.html) for zero-copy interop
+
+## Roadmap
+
+Based on the [liblance RFC](https://github.com/lance-format/lance/discussions/6035).
+
+### Phase 1: Core Read Path + C++ Wrappers (MVP)
+
+| Status | Component | Description |
+|--------|-----------|-------------|
+| [x] | Infrastructure | `lance-c` crate with Cargo.toml, Tokio runtime initialization |
+| [x] | Error handling | Thread-local error codes/messages for cross-FFI safety |
+| [x] | C header | `lance.h` with Arrow C Data Interface structs |
+| [x] | Dataset operations | Open/close with URI + storage options + version support |
+| [x] | Schema export | Arrow C Data Interface for zero-copy schema exchange |
+| [x] | Scanner builder | Column projection, SQL filters, limit/offset, batch size, row ID, fragment filtering |
+| [x] | ArrowArrayStream export | `lance_scanner_to_arrow_stream()` blocking API |
+| [x] | Batch iteration | `lance_scanner_next()` blocking function |
+| [x] | Poll + waker iteration | `lance_scanner_poll_next()` for async engines (Velox, Presto) |
+| [x] | Random access | Index-based row retrieval via `lance_dataset_take()` |
+| [x] | C++ wrappers | Header-only RAII library (`lance::Dataset`, `lance::Scanner`, `lance::Batch`) |
+| [x] | Builder pattern | Fluent Scanner API (`.limit().offset().batch_size().with_row_id()`) |
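
Note on the "Error handling" row: the thread-local error pattern it names can be sketched in plain C as follows. This is only an illustration of the general technique, not the `lance-c` implementation (which lives in Rust). Every `demo_*` name and both error codes are hypothetical and do not come from `lance.h`. It assumes a C11 compiler for `_Thread_local`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error codes; the real lance.h values may differ. */
enum demo_error_code { DEMO_OK = 0, DEMO_ERR_INVALID_ARGUMENT = 1 };

/* One slot per thread (C11 _Thread_local), so concurrent callers never
 * overwrite each other's error state. */
static _Thread_local int demo_last_code = DEMO_OK;
static _Thread_local char demo_last_message[256];

/* Record an error for the calling thread; long messages are truncated. */
static void demo_set_error(int code, const char* message) {
    assert(message != NULL);
    demo_last_code = code;
    snprintf(demo_last_message, sizeof demo_last_message, "%s", message);
}

int demo_last_error_code(void) { return demo_last_code; }
const char* demo_last_error_message(void) { return demo_last_message; }

/* Example fallible call: returns 0 on success, -1 on failure, and records
 * the reason in the calling thread's slot. */
int demo_open(const char* uri) {
    if (uri == NULL || uri[0] == '\0') {
        demo_set_error(DEMO_ERR_INVALID_ARGUMENT, "uri must be non-empty");
        return -1;
    }
    demo_set_error(DEMO_OK, "");
    return 0;
}
```

The point of the pattern is that no error object has to cross the FFI boundary: the call returns a plain status, and the caller reads the code and message from its own thread afterwards, as the README's C example does with `lance_last_error_message()`.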
+
+### Phase 2: Vector Search & Indexing
+
+| Status | Component | Description |
+|--------|-----------|-------------|
+| [ ] | Vector search | Nearest-neighbor via scanner with metric/k/nprobes |
+| [ ] | Full-text search | FTS queries through scanner interface |
+| [ ] | Vector index creation | IVF_PQ, IVF_FLAT, IVF_SQ, HNSW variants |
+| [ ] | Scalar index creation | BTree, Bitmap, Inverted, Label-List indexes |
+| [ ] | Index management | List and drop index operations |
+| [ ] | C++ wrappers | `create_vector_index()` and `create_scalar_index()` methods |
+
+### Phase 3: Write Path & Mutations
+
+| Status | Component | Description |
+|--------|-----------|-------------|
+| [ ] | Dataset write | Create / append / overwrite from ArrowArrayStream |
+| [x] | Fragment writer | Batch-at-a-time fragment file writing (no commit) via `lance_write_fragments()` |
+| [ ] | Delete operations | Predicate-based deletion |
+| [ ] | Update operations | Expression-based row updates |
+| [ ] | Merge-insert | Upsert functionality with builder pattern |
+| [ ] | Schema evolution | Add/drop/alter columns with expressions |
+| [ ] | Version management | Checkout, restore, list version operations |
+
+### Phase 4: Advanced Features
+
+| Status | Component | Description |
+|--------|-----------|-------------|
+| [x] | Fragment-level access | Fragment enumeration, ID listing, scanner fragment filtering |
+| [ ] | Compaction | Fragment consolidation operations |
+| [ ] | Statistics export | Row counts, column stats for query planning |
+| [x] | Cloud storage | S3, GCS, Azure via storage options pass-through |
+| [ ] | Package distribution | vcpkg and Conan recipe packaging |
+
+### Additional (not in RFC)
+
+| Status | Component | Description |
+|--------|-----------|-------------|
+| [x] | Async scan | Callback-based `lance_scanner_scan_async()` for non-blocking scans |
+| [x] | Dataset metadata | `lance_dataset_version()`, `lance_dataset_count_rows()`, `lance_dataset_latest_version()` |
+
+## Building
+
+```bash
+cargo build --release
+```
+
+The build produces `liblance_c.{so,dylib,dll}` and the headers in `include/`.
+
+## Usage
+
+### C
+
+```c
+#include "lance.h"
+
+LanceDataset* ds = lance_dataset_open("data.lance", NULL, 0);
+if (!ds) {
+    printf("Error: %s\n", lance_last_error_message());
+    return 1;
+}
+
+struct ArrowArrayStream stream;
+LanceScanner* scanner = lance_scanner_new(ds, NULL, NULL);
+lance_scanner_to_arrow_stream(scanner, &stream);
+// consume stream...
+
+lance_scanner_close(scanner);
+lance_dataset_close(ds);
+```
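
Note on the `// consume stream...` placeholder: one way to drain an `ArrowArrayStream` is sketched below, following the published Arrow C Stream Interface. There, `get_next` returns 0 on success, and a batch whose `release` is `NULL` marks the end of the stream. The struct layouts are copied from the Arrow specification; skip them if `lance.h` or another Arrow header already defines them. The `demo_*` producer is a hypothetical stand-in so the loop can run without a dataset. It leaves `get_schema` and `get_last_error` unset, which a conforming producer would not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layouts as published in the Arrow C Data / C Stream Interface. */
struct ArrowSchema {
    const char* format; const char* name; const char* metadata;
    int64_t flags; int64_t n_children;
    struct ArrowSchema** children; struct ArrowSchema* dictionary;
    void (*release)(struct ArrowSchema*);
    void* private_data;
};
struct ArrowArray {
    int64_t length; int64_t null_count; int64_t offset;
    int64_t n_buffers; int64_t n_children;
    const void** buffers;
    struct ArrowArray** children; struct ArrowArray* dictionary;
    void (*release)(struct ArrowArray*);
    void* private_data;
};
struct ArrowArrayStream {
    int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
    int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
    const char* (*get_last_error)(struct ArrowArrayStream*);
    void (*release)(struct ArrowArrayStream*);
    void* private_data;
};

/* Drain a stream and release it. Returns the total row count, or -1 if
 * get_next reported an error. */
int64_t count_stream_rows(struct ArrowArrayStream* stream) {
    assert(stream != NULL && stream->get_next != NULL);
    int64_t total = 0;
    for (;;) {
        struct ArrowArray batch;
        if (stream->get_next(stream, &batch) != 0) {
            total = -1;  /* a real caller would read get_last_error here */
            break;
        }
        if (batch.release == NULL) break;  /* end of stream */
        total += batch.length;
        batch.release(&batch);             /* consumer owns each batch */
    }
    stream->release(stream);
    return total;
}

/* --- Hypothetical toy producer, only for exercising the loop --- */
struct demo_state { int remaining; int64_t rows_per_batch; };

static void demo_array_release(struct ArrowArray* a) { a->release = NULL; }

static int demo_get_next(struct ArrowArrayStream* s, struct ArrowArray* out) {
    struct demo_state* st = s->private_data;
    memset(out, 0, sizeof *out);  /* release == NULL signals end of stream */
    if (st->remaining > 0) {
        st->remaining--;
        out->length = st->rows_per_batch;
        out->release = demo_array_release;
    }
    return 0;
}

static void demo_stream_release(struct ArrowArrayStream* s) { s->release = NULL; }

/* Build a toy stream of `batches` batches and count its rows. */
int64_t demo_count(int batches, int64_t rows_per_batch) {
    struct demo_state st = { batches, rows_per_batch };
    struct ArrowArrayStream stream;
    memset(&stream, 0, sizeof stream);
    stream.get_next = demo_get_next;
    stream.release = demo_stream_release;
    stream.private_data = &st;
    return count_stream_rows(&stream);
}
```

With a real scanner, `count_stream_rows(&stream)` would presumably sit where the placeholder is, after checking whatever status `lance_scanner_to_arrow_stream()` reports. The README does not show that function's return type, so that check is left out here.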
+
+### C++
+
+```cpp
+#include "lance.hpp"
+
+auto ds = lance::Dataset::open("data.lance");
+printf("rows: %llu, version: %llu\n", ds.count_rows(), ds.version());
+
+ArrowArrayStream stream;
+ds.scan()
+    .limit(100)
+    .batch_size(1024)
+    .to_arrow_stream(&stream);
+// consume stream...
+```
+
+## License
+
+Apache-2.0
