feat: implement background compaction for Lance fragments (#16) #28
Merged
beinan merged 5 commits into lance-format:main on Jan 28, 2026
beinan
approved these changes
Jan 28, 2026
Collaborator
lgtm, but could you fix the ci?
Implements issue lance-format#16 with comprehensive compaction functionality:

**Core Features:**
- Manual compaction via `compact()` method
- Optional background compaction with configurable intervals
- Comprehensive configuration (thresholds, quiet hours, intervals)
- Advanced observability (stats API, metrics, logging)

**Implementation Details:**
- Rust: Added CompactionConfig, CompactionStats types to store.rs
- Rust: Implemented compact(), should_compact(), compaction_stats()
- Rust: Background task with Tokio interval timer and graceful shutdown
- Python: PyO3 bindings for all compaction methods
- Python: High-level API with full docstrings
- Tests: 10 comprehensive tests (all passing)

**Configuration Options:**
- enable_background_compaction: Enable auto-compaction
- compaction_interval_secs: Check interval (default: 300s)
- compaction_min_fragments: Trigger threshold (default: 5)
- compaction_target_rows: Target rows per fragment (default: 1M)
- quiet_hours: Skip compaction during specified hours

**Metrics Returned:**
- fragments_removed/added
- files_removed/added
- is_compacting status
- last_compaction timestamp
- total_compactions count

All tests pass. Documentation updated with usage examples.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
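To illustrate the "interval timer and graceful shutdown" pattern named in the commit message above, here is a minimal Python sketch. It is an analogue only: the actual background task is written in Rust on Tokio, and `BackgroundCompactor`, `should_compact_fn`, and `compact_fn` are hypothetical names introduced for this example.

```python
import threading


class BackgroundCompactor:
    """Hypothetical sketch: run compact_fn on a timer until stopped."""

    def __init__(self, should_compact_fn, compact_fn, interval_secs=300.0):
        self._should_compact = should_compact_fn
        self._compact = compact_fn
        self._interval = interval_secs
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.total_compactions = 0

    def start(self):
        self._thread.start()

    def _run(self):
        # Event.wait() returns True as soon as stop() is called, so the
        # loop exits without waiting out the rest of the interval.
        while not self._stop.wait(self._interval):
            if self._should_compact():
                self._compact()
                self.total_compactions += 1

    def stop(self):
        # Graceful shutdown: signal the loop, then wait for it to finish
        # any compaction that is already in progress.
        self._stop.set()
        self._thread.join()
```

Waiting on an event instead of sleeping is what makes the shutdown prompt: `stop()` wakes the loop immediately rather than after up to `interval_secs`.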
Addresses lance-format#21 - Release the Global Interpreter Lock during all blocking operations to allow Python threads to run concurrently.

**Changes:**
- Wrapped all `runtime.block_on()` calls in `py.allow_threads()`
- Applies to: create(), add(), compact(), compaction_stats(), checkout(), search(), list()

**Benefits:**
- Python interpreter no longer freezes during operations
- Background threads (heartbeats, UI) remain responsive
- Critical for S3-backed stores (50-500ms+ latency)
- Critical for long-running compaction operations

**Pattern:**
```rust
py.allow_threads(|| {
    self.runtime
        .block_on(async_operation())
        .map_err(to_py_err)
})?
```

This ensures concurrent Python execution while Rust performs expensive I/O and computation.

All tests pass (19 passed, 2 skipped).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
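The benefit described above can be observed from the Python side with a small heartbeat check. This is a sketch under stated assumptions: `count_heartbeats_during` is a hypothetical helper, and `time.sleep` stands in for a binding call such as `ctx.compact()` that releases the GIL while it blocks.

```python
import threading
import time


def count_heartbeats_during(blocking_call, tick_secs=0.01):
    """Count how often a background thread runs while blocking_call executes."""
    ticks = 0
    done = threading.Event()

    def heartbeat():
        nonlocal ticks
        # Tick until the main thread signals that the call has returned.
        while not done.wait(tick_secs):
            ticks += 1

    worker = threading.Thread(target=heartbeat)
    worker.start()
    try:
        blocking_call()
    finally:
        done.set()
        worker.join()
    return ticks


# time.sleep releases the GIL, so the heartbeat keeps ticking meanwhile.
ticks = count_heartbeats_during(lambda: time.sleep(0.2))
print(f"heartbeats while blocked: {ticks}")
```

If the blocking call held the GIL for its whole duration, the heartbeat thread could not run and the count would stay near zero.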
321b21c to 83d56e7
Summary
Implements #16 - Background compaction for Lance fragments with comprehensive features:
- Manual compaction via `compact()` method

Changes
Rust Core:
- `CompactionConfig` and `CompactionStats` types
- `compact()`, `should_compact()`, `compaction_stats()` methods

Python API:
- `Context.create()`

Tests:
Usage Example
Manual Compaction:
```python
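ctx = Context.create("context.lance")
# Many small writes tend to leave many small fragments behind
for i in range(100):
    ctx.add("user", f"message {i}")
# Merge the fragments and report what changed
metrics = ctx.compact()
print(f"Removed {metrics['fragments_removed']} fragments")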
```
Background Compaction:
```python
ctx = Context.create(
    "context.lance",
    enable_background_compaction=True,
    compaction_interval_secs=300,
    compaction_min_fragments=10,
    quiet_hours=[(22, 6)],  # 10pm-6am
)
```
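One way to read the `quiet_hours` and `compaction_min_fragments` options is sketched below. This is not the PR's Rust implementation: `in_quiet_hours` and this `should_compact` are hypothetical Python helpers, and the sketch assumes that each window is `(start_hour, end_hour)` with the end hour exclusive, that a window may wrap past midnight, and that the threshold is met when the fragment count is at least the minimum.

```python
def in_quiet_hours(hour, quiet_hours):
    """Return True if hour (0-23) falls in any (start, end) window."""
    for start, end in quiet_hours:
        if start <= end:
            # Same-day window, e.g. (9, 17)
            if start <= hour < end:
                return True
        else:
            # Window wraps past midnight, e.g. (22, 6)
            if hour >= start or hour < end:
                return True
    return False


def should_compact(fragment_count, hour, min_fragments=5, quiet_hours=()):
    """Assumed rule: skip during quiet hours, otherwise apply the threshold."""
    if in_quiet_hours(hour, quiet_hours):
        return False
    return fragment_count >= min_fragments


print(should_compact(12, hour=23, min_fragments=10, quiet_hours=[(22, 6)]))
print(should_compact(12, hour=12, min_fragments=10, quiet_hours=[(22, 6)]))
```

Under these assumptions the first call is skipped because 23:00 lies inside the 10pm-6am window, while the second proceeds because 12 fragments meet the threshold of 10.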
Check Status:
```python
stats = ctx.compaction_stats()
print(f"Fragments: {stats['total_fragments']}")
print(f"Last compaction: {stats['last_compaction']}")
```
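A caller that needs to wait for a background run to finish could poll the stats, as in the sketch below. It assumes that `compaction_stats()` returns a dict containing the `is_compacting` flag listed in the commit message; `wait_until_idle` is a hypothetical helper, not part of the PR.

```python
import time


def wait_until_idle(ctx, timeout_secs=30.0, poll_secs=0.1):
    """Poll ctx.compaction_stats() until no compaction is running.

    Returns True once idle, or False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_secs
    while time.monotonic() < deadline:
        # Assumption: the stats dict exposes an "is_compacting" boolean.
        if not ctx.compaction_stats()["is_compacting"]:
            return True
        time.sleep(poll_secs)
    return False
```

Usage would look like `if wait_until_idle(ctx): ...` before taking a snapshot or shutting the process down.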
Test Results
```
10 passed in 5.39s
✅ Manual compaction reduces fragments
✅ Data integrity preserved
✅ Concurrent writes work
✅ Compaction stats accurate
✅ Custom options work
✅ Background compaction triggers
✅ Quiet hours respected
✅ Metrics structure correct
✅ Empty context handled
✅ Multiple compactions work
```
Architecture
Checklist