251 changes: 83 additions & 168 deletions CLAUDE.md
@@ -1,204 +1,119 @@
# Contributing to Lance Namespace
# CLAUDE.md

The Lance Namespace codebase is at [lance-format/lance-namespace](https://github.com/lance-format/lance-namespace).
This codebase contains:
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

- The Lance Namespace specification
- The core `LanceNamespace` interface and generic connect functionality for all languages except Rust
(for Rust, these are located in the [lance-format/lance](https://github.com/lance-format/lance) repo)
- Generated clients and servers using OpenAPI generator
## What This Repo Is

This project should only be used to make spec and interface changes to Lance Namespace,
or to add new clients and servers to be generated based on community demand.
In general, we welcome more generated components to be added as long as
the contributor is willing to set up all the automations for generation and publication.
Lance Namespace specification, core interfaces, and OpenAPI-generated clients/servers.
The single source of truth is the OpenAPI spec at `docs/src/rest.yaml`.

For contributing changes to directory and REST namespaces, please go to the [lance](https://github.com/lance-format/lance) repo.
**Scope:** Only spec changes, interface changes, and new generated clients/servers belong here.
Implementation changes (directory/REST namespaces) go to [lance-format/lance](https://github.com/lance-format/lance).
Other namespace implementations go to [lance-format/lance-namespace-impls](https://github.com/lance-format/lance-namespace-impls).

For contributing changes to implementations other than the directory and REST namespace,
or for adding new namespace implementations,
please go to the [lance-namespace-impls](https://github.com/lance-format/lance-namespace-impls) repo.
## Build Commands

## Project Dependency
Requires [uv](https://docs.astral.sh/uv/getting-started/installation/). First run: `make sync`

This project contains the core Lance Namespace specification, interface and generated modules across all languages.
The dependency structure varies by language due to different build and distribution models.
| Command | Description |
|---------|-------------|
| `make lint` | Validate OpenAPI spec with openapi-spec-validator |
| `make gen` | Clean + codegen + lint all languages |
| `make build` | Full build: lint + docs + gen + build all languages |
| `make gen-<lang>` | Codegen one language: `gen-python`, `gen-java`, `gen-rust` |
| `make build-<lang>` | Build one language: `build-python`, `build-java`, `build-rust` |
| `make serve-docs` | Preview docs (auto-runs `gen-java` first) |

### Rust
Inside `java/` or `python/`, you can target specific modules:
`make gen-java-apache-client`, `make build-java-springboot-server`, etc.

For Rust, the interface module `lance-namespace` and implementations (`lance-namespace-impls` for REST and directory namespaces)
are located in the core [lance-format/lance](https://github.com/lance-format/lance) repository.
This is because Rust uses source code builds, and separating modules across repositories makes dependency management complicated.
### Running Tests

The dependency chain is: `lance-namespace` → `lance` → `lance-namespace-impls`

### Other Languages (e.g. Python, Java)

For Python, Java, and other languages, the core `LanceNamespace` interface and generic connect functionality
are maintained in **this repository** (e.g., `lance-namespace` for Python, `lance-namespace-core` for Java).
The core [lance-format/lance](https://github.com/lance-format/lance) repository then imports these modules.

The reason for this import direction is that `lance-namespace-impls` (REST and directory namespace implementations)
are used in the Lance Python and Java bindings, and are exposed back through the corresponding language interfaces.
These language interfaces can also be imported dynamically without the need to have a dependency of the Lance core library bindings in those languages.

### Other Implementations

For namespace implementations other than directory and REST namespaces,
those are stored in the [lance-format/lance-namespace-impls](https://github.com/lance-format/lance-namespace-impls) repository,
with one implementation per language.

### Dependency Diagram

```mermaid
flowchart TB
subgraph this_repo["lance-namespace repo"]
spec["Spec & Generated Clients"]
py_core["Python: lance-namespace"]
java_core["Java: lance-namespace-core"]
end

subgraph lance_repo["lance repo"]
subgraph rust_modules["Rust Modules"]
rs_ns["lance-namespace"]
rs_lance["lance"]
rs_impls["lance-namespace-impls<br/>(dir, rest)"]
end
py_lance["Python: lance"]
java_lance["Java: lance"]
end

subgraph impls_repo["namespace-impls repo"]
polaris["Apache Polaris"] ~~~ hive["Apache Hive"] ~~~ iceberg_rest["Apache Iceberg REST"] ~~~ unity["Unity Catalog"] ~~~ glue["AWS Glue"]
end

%% Rust dependencies (source build)
rs_ns --> rs_lance
rs_lance --> rs_impls

%% Python/Java dependencies
py_core --> py_lance
java_core --> java_lance
rs_impls -.-> py_lance
rs_impls -.-> java_lance

%% Other implementations depend on core interfaces and lance bindings
py_core -.-> impls_repo
java_core -.-> impls_repo
py_lance -.-> impls_repo
java_lance -.-> impls_repo

style this_repo fill:#1565c0,color:#fff
style lance_repo fill:#e65100,color:#fff
style impls_repo fill:#7b1fa2,color:#fff
style rust_modules fill:#ff8a65,color:#000
```

## Repository structure

This repository currently contains the following components:

| Component | Language | Path | Description |
|-----------------------|----------|----------------------------------------|------------------------------------------------------------|
| Spec | | docs/src | Lance Namespace Specification |
| Python Core | Python | python/lance_namespace | Core LanceNamespace interface and connect functionality |
| Python UrlLib3 Client | Python | python/lance_namespace_urllib3_client | Generated Python urllib3 client for Lance REST Namespace |
| Java Core | Java | java/lance-namespace-core | Core LanceNamespace interface and connect functionality |
| Java Apache Client | Java | java/lance-namespace-apache-client | Generated Java Apache HTTP client for Lance REST Namespace |
| Java SpringBoot Server| Java | java/lance-namespace-springboot-server | Generated Java SpringBoot server for Lance REST Namespace |
| Rust Reqwest Client | Rust | rust/lance-namespace-reqwest-client | Generated Rust reqwest client for Lance REST Namespace |


## Install uv
```bash
# Python
cd python/lance_namespace && uv sync && uv run pytest
cd python/lance_namespace_urllib3_client && uv sync && uv run pytest

We use [uv](https://docs.astral.sh/uv/getting-started/installation/) for development.
Make sure it is installed, and run:
# Java (checkstyle + spotless + maven build with tests)
cd java && make check # style checks only
cd java && make build # full build including tests

```bash
make sync
# Rust
cd rust && cargo test --all-features
```

## Lint

To ensure the OpenAPI definition is valid, you can use the lint command to check it.
### Java Style Checks

Java uses Spotless (formatting) and Checkstyle (linting). The `java/Makefile` `check` target
runs both. These are enforced in CI. Fix formatting issues with:
```bash
make lint
cd java && mvn spotless:apply
```

## Build

There are 3 commands available at the top level as well as inside each language folder:
## Generated vs Hand-Written Code

- `make clean`: remove all codegen modules
- `make gen`: codegen and lint all modules (depends on `clean`)
- `make build`: build all modules (depends on `gen`)
**Never manually edit generated code.** CI (`spec.yml`) verifies that running `make clean && make gen`
produces no diff — any manual edits to generated files will be rejected.

You can also run `make <command>-<language>` to only run the command in the specific language, for example:
### Hand-written (edit these):
- `docs/src/rest.yaml` — OpenAPI spec, the single source of truth
- `python/lance_namespace/` — Python core interface, connect factory, error hierarchy
- `java/lance-namespace-core/` — Java core interface, connect factory, errors
- `java/lance-namespace-core-async/` — Java async wrapper around core
- `java/openapi-templates/` — Custom Mustache templates for Java codegen

- `make gen-python`: codegen and lint all Python modules
- `make build-rust`: build all Rust modules
### Generated (do not edit):
- `python/lance_namespace_urllib3_client/` — Python HTTP client + all model classes
- `java/lance-namespace-apache-client/` — Java Apache HttpClient implementation
- `java/lance-namespace-async-client/` — Java native async HttpClient implementation
- `java/lance-namespace-springboot-server/` — Spring Boot server skeleton
- `rust/lance-namespace-reqwest-client/` — Rust reqwest client

You can also run `make <command>-<language>-<module>` inside a language folder to run the command against a specific module, for example:
Codegen uses `openapi-generator-cli` (v7.12.0 via uv). Language-specific ignore files
(e.g., `.apache-client-ignore`) control which generated artifacts are committed.

- `make gen-rust-reqwest-client`: codegen and lint the Rust reqwest client module
- `make build-java-springboot-server`: build the Java Spring Boot server module
## Architecture

## Documentation
### Plugin/Registry Pattern

### Setup
Both Python and Java use a plugin system where implementations are discovered at runtime:

The documentation website is built using [mkdocs-material](https://pypi.org/project/mkdocs-material).
Start the server with:

```shell
make serve-docs
```

### Generated Model Documentation

The operation request and response model documentation is generated from the Java Apache Client.
When building or serving docs, the Java client must be generated first to produce the model Markdown files,
which are then copied to `docs/src/operations/models/`.

This happens automatically when running:

```shell
make build-docs # or make serve-docs
```
**Python** (`lance_namespace/__init__.py`):
- `connect(impl, properties)` — factory that resolves an implementation name
- `register_namespace_impl(name, class_path)` — register external implementations
- Resolution: native aliases ("dir", "rest") → registered impls → full module.Class path
- Uses `importlib.import_module()` for dynamic loading
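
The resolution order listed above can be sketched as follows. This is a minimal illustration, not the code in `lance_namespace/__init__.py`: the registry names `NATIVE_ALIASES` and `REGISTERED_IMPLS`, the helper `resolve_class`, and the stdlib classes used as stand-in implementations are all assumptions made so the example runs on its own.

```python
import importlib

# Hypothetical registries. Stdlib classes stand in for real namespace
# implementations so the sketch is runnable without lance_namespace.
NATIVE_ALIASES = {"dir": "collections.OrderedDict"}
REGISTERED_IMPLS = {}


def register_namespace_impl(name, class_path):
    """Record an external implementation under a short name."""
    REGISTERED_IMPLS[name] = class_path


def resolve_class(impl):
    """Resolve impl: native alias, then registered name, then full path."""
    class_path = NATIVE_ALIASES.get(impl) or REGISTERED_IMPLS.get(impl) or impl
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Unknown namespace implementation: {impl!r}")
    # Dynamic loading: import the module, then look up the class on it.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

A short name that matches neither registry and contains no dot raises `ValueError`, so a typo in an alias fails early instead of surfacing as an import error.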

These commands depend on `gen-java` to ensure the Java client docs are up-to-date before building the documentation.
**Java** (`LanceNamespace.java`):
- `LanceNamespace.connect(impl, properties, allocator)` — static factory
- `registerNamespaceImpl(name, className)` / `unregisterNamespaceImpl(name)`
- Resolution: `NATIVE_IMPLS` map → `REGISTERED_IMPLS` concurrent map → full class name
- Uses reflection with no-arg constructor + `initialize()` call
- Requires Apache Arrow `BufferAllocator` parameter

### Understanding the Build Process
### Error System

The contents of `lance-namespace/docs` exist so that contributors can edit and preview them easily.
After a change is merged, the contents are added to the
[main Lance documentation](https://github.com/lance-format/lance/tree/main/docs)
during the Lance docs CI build, and are presented on the Lance website under
[Lance Namespace Spec](https://lance.org/lance/format/namespace).
Consistent error codes (0-21) across all languages in `ErrorCode` enum/class.
Each code has a corresponding exception class. Factory function `from_error_code()` maps codes to exceptions.
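
A code-to-exception factory of this shape might look like the sketch below. The enum members, their numeric values, the exception class names, and the `from_error_code(code, message)` signature are placeholders chosen for illustration; the real codes 0-21 and their names are defined by the spec and are not reproduced here.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    # Placeholder names and values, not the ones defined by the spec.
    UNSUPPORTED = 0
    NOT_FOUND = 1


class NamespaceError(Exception):
    """Hypothetical base class for all namespace errors."""


class UnsupportedError(NamespaceError):
    pass


class NotFoundError(NamespaceError):
    pass


# One exception class per code.
_EXCEPTIONS = {
    ErrorCode.UNSUPPORTED: UnsupportedError,
    ErrorCode.NOT_FOUND: NotFoundError,
}


def from_error_code(code, message):
    """Build the exception for code, falling back to the base class."""
    exc_class = _EXCEPTIONS.get(code, NamespaceError)
    return exc_class(message)
```

Falling back to the base class for an unrecognized code is a design choice of this sketch: a client built against an older spec can still raise something catchable when a server returns a newer code.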

## Release Process
### API Operations

This section describes the CI/CD workflows for automated version management, releases, and publishing.
The REST spec defines 40+ endpoints under `/v1/` organized as:
- **Namespace ops:** create, list, describe, drop, exists
- **Table ops:** CRUD, schema mutations, versioning, indexing, tags, query/insert/merge
- **Transaction ops:** describe, alter
- **Batch ops:** batch version create, batch commit (atomic multi-table)

### Version Scheme
All operations are default methods on `LanceNamespace` that throw `UnsupportedOperationError`,
allowing implementations to opt into only the operations they support.
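
The opt-in pattern described above can be illustrated in Python as follows. The class `InMemoryNamespace`, the method names, and the argument shapes are hypothetical; only the idea (the base interface raises for every operation, and an implementation overrides what it supports) comes from the text.

```python
class UnsupportedOperationError(Exception):
    """Raised when an implementation does not support an operation."""


class LanceNamespace:
    # Every operation has a default body that raises, so subclasses
    # only override the operations they actually support.
    def list_tables(self):
        raise UnsupportedOperationError("list_tables")

    def drop_table(self, name):
        raise UnsupportedOperationError("drop_table")


class InMemoryNamespace(LanceNamespace):
    """Hypothetical read-only implementation: listing only."""

    def __init__(self, tables):
        self._tables = list(tables)

    def list_tables(self):
        return list(self._tables)
```

Calling `drop_table` on `InMemoryNamespace` raises `UnsupportedOperationError`, while `list_tables` returns the stored names.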

- **Stable releases:** `X.Y.Z` (e.g., 1.2.3)
- **Preview releases:** `X.Y.Z-beta.N` (e.g., 1.2.3-beta.1)
### Documentation Build

### Creating a Release
Model docs are generated from the Java Apache Client's Javadoc and copied to `docs/src/operations/models/`.
This is why `build-docs` and `serve-docs` depend on `gen-java`.

1. **Create Release Draft**
- Go to Actions → "Create Release"
- Select parameters:
- Release type (major/minor/patch)
- Release channel (stable/preview)
- Dry run (test without pushing)
- Run workflow (creates a draft release)
## Dependency Structure

2. **Review and Publish**
- Go to the [Releases page](../../releases) to review the draft
- Edit release notes if needed
- Click "Publish release" to:
- For stable releases: Trigger automatic publishing for Java, Python, Rust
- For preview releases: Create a beta release (not published)
- **Rust:** Interface lives in the [lance](https://github.com/lance-format/lance) repo, not here. Only the generated reqwest client is here.
- **Python/Java:** Core interfaces live here; implementations are in the lance repo and consume these interfaces.
- The Python core package re-exports all model types from the generated urllib3 client, so downstream code only needs to import `lance_namespace`.
19 changes: 16 additions & 3 deletions java/Makefile
@@ -133,11 +133,24 @@ check-apache-client:
check-springboot-server:
./mvnw checkstyle:check spotless:check -pl lance-namespace-springboot-server -am

# lance-namespace-base module (hand-written, no codegen)
.PHONY: lint-base
lint-base: gen-apache-client
./mvnw spotless:apply -pl lance-namespace-base -am

.PHONY: build-base
build-base: build-apache-client build-core lint-base
./mvnw install -pl lance-namespace-base -am

.PHONY: check-base
check-base:
./mvnw checkstyle:check spotless:check -pl lance-namespace-base -am

.PHONY: check
check: check-apache-client check-async-client check-springboot-server check-core check-core-async
check: check-apache-client check-async-client check-springboot-server check-core check-core-async check-base

.PHONY: lint
lint: lint-apache-client lint-async-client lint-springboot-server lint-core lint-core-async
lint: lint-apache-client lint-async-client lint-springboot-server lint-core lint-core-async lint-base

.PHONY: build
build: build-apache-client build-async-client build-springboot-server build-core build-core-async
build: build-apache-client build-async-client build-springboot-server build-core build-core-async build-base