
Commit 816a7c0

feat(archetype): scaffold lance-graph-archetype crate (DU-2.1..2.6)
Adds the Archetype Transcode Crate to the workspace: ECS-style Component + Processor traits, World meta-state, CommandBroker deferred-mutation queue, and ArchetypeError — all Arrow-backed, inside-BBB. 12 unit tests pass. Workspace member entry added to root Cargo.toml. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
1 parent ddb3017 commit 816a7c0

10 files changed

Lines changed: 613 additions & 0 deletions


Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
# Archetype Transcode Crate Scaffold — v1

> **Status:** In progress (2026-04-24)
> **Owner:** @archetype-specialist, @truth-architect
> **Scope:** NEW crate `crates/lance-graph-archetype/`; deps only on `lance-graph-contract`, `arrow`, `lance` (peer-dep, optional).
> **Depends on:** ADR-0001 Decision 1 (transcode-not-bridge). No runtime dependency on upstream Python.

## Goal

Flip `lance-graph-archetype` from "does-not-exist" to "scaffolded-and-locked." Ship the 6 foundational trait/struct files per ADR-0001 Decision 1. No runtime behaviour yet: this is the LOCKED-MAPPING-INCOMPLETE → LOCKED-AND-SCAFFOLDED pivot.

## Deliverables

- **DU-2.1**: `crates/lance-graph-archetype/Cargo.toml` + `src/lib.rs` + workspace `members` entry in root `Cargo.toml`.
- **DU-2.2**: `src/component.rs: pub trait Component { fn arrow_field() -> arrow::datatypes::Field; fn type_id() -> &'static str; }` plus a test-only `MockComponent` impl asserting trait-object construction.
- **DU-2.3**: `src/processor.rs: pub trait Processor { fn matches(schema: &arrow::datatypes::Schema) -> bool; fn process(batch: arrow::record_batch::RecordBatch) -> Result<arrow::record_batch::RecordBatch, ArchetypeError>; }`.
- **DU-2.4**: `src/world.rs: pub struct World { tick: u64, dataset_uri: String }` with `new() / tick() / current_tick() / fork(&self, branch: &str) / at_tick(&self, tick: u64)` methods. `fork()` and `at_tick()` return `Err(ArchetypeError::Unimplemented { method: "..." })` stubs; docstrings tie to ADR-0001:61-72 / 95.
- **DU-2.5**: `src/command_broker.rs: pub struct CommandBroker { queue: Vec<Command>, ... }` + `pub enum Command { Spawn, Despawn, Update }`, a channel-based drain interface with `submit() / drain()` method stubs.
- **DU-2.6**: `src/error.rs: pub enum ArchetypeError { Unimplemented { method: &'static str }, SchemaMismatch { ... }, LanceIo(...) }` with `thiserror::Error` impl.

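As an illustration of DU-2.4, the following is a minimal, std-only sketch of what the `World` stubs could look like. The constructor argument, the return types of `fork()` / `at_tick()`, the `dataset_uri()` accessor, and the trimmed-down `ArchetypeError` are assumptions made for this sketch; the plan only names the methods, and the actual `world.rs` in the commit may differ.

```rust
/// Trimmed stand-in for the crate's error type (assumption: only the
/// variant the stubs need is reproduced here).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ArchetypeError {
    Unimplemented { method: &'static str },
}

/// Archetype meta-state: a tick counter plus a placeholder dataset URI.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct World {
    tick: u64,
    dataset_uri: String,
}

impl World {
    /// Assumed constructor signature; the plan only lists `new()`.
    pub fn new(dataset_uri: impl Into<String>) -> Self {
        Self { tick: 0, dataset_uri: dataset_uri.into() }
    }

    /// Advance the counter by one and return the new value.
    pub fn tick(&mut self) -> u64 {
        self.tick += 1;
        self.tick
    }

    /// Read the counter without advancing it.
    pub fn current_tick(&self) -> u64 {
        self.tick
    }

    /// Hypothetical accessor, added so the placeholder field is readable.
    pub fn dataset_uri(&self) -> &str {
        &self.dataset_uri
    }

    /// Stub: branch checkout is deferred to DU-2.8.
    pub fn fork(&self, _branch: &str) -> Result<World, ArchetypeError> {
        Err(ArchetypeError::Unimplemented { method: "World::fork" })
    }

    /// Stub: time travel to an earlier tick is not wired yet.
    pub fn at_tick(&self, _tick: u64) -> Result<World, ArchetypeError> {
        Err(ArchetypeError::Unimplemented { method: "World::at_tick" })
    }
}
```

Returning a typed `Unimplemented` error rather than panicking lets callers and tests distinguish "not wired yet" from a real failure, which is what the `fork_returns_unimplemented` test named in the plan relies on.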
## Non-goals (explicit)

- Runtime World tick behaviour: stubs only.
- `AsyncProcessor` (Python async equivalent): future follow-up.
- Entity=`PersonaCard` wiring: DU-2.7, later PR.
- Lance dataset integration beyond the `dataset_uri: String` placeholder; the `fork()` → `lance::checkout(branch)` wiring is DU-2.8.

## Acceptance criteria

- `cargo check -p lance-graph-archetype` compiles cleanly.
- `cargo test -p lance-graph-archetype`: minimum 4 tests pass (one per core trait + one per stub-returns-Unimplemented).
- `cargo test --workspace`: no regressions in other crates.
- Root `Cargo.toml` workspace.members updated.
- `STATUS_BOARD.md` DU-2 row status: Queued → In progress.
- Verdict flip in `.claude/plans/unified-integration-v1.md §6`: Archetype row `LOCKED-MAPPING-INCOMPLETE` → `LOCKED-AND-SCAFFOLDED`.
- `.claude/board/INTEGRATION_PLANS.md`: prepend entry pointing to this plan file.
- `.claude/board/LATEST_STATE.md § Contract Inventory`: add a new block for `lance-graph-archetype` naming the shipped types.
- `.claude/board/EPIPHANIES.md`: prepend short FINDING entry noting scaffold landed.

## Architecture notes

Per ADR-0001 Decision 1 (`.claude/adr/0001-archetype-transcode-stack.md:14-102`): this crate defines its OWN Rust interface. It does NOT mirror the Python `VangelisTech/archetype` API. The Python repo is a DESIGN SPEC, not a runtime dependency. "Upstream Python API unstable" is NOT a blocker.

Per ADR-0001 Decision 3 (`adr/0001-archetype-transcode-stack.md:320-334`): BBB invariant bans `Vsa16kF32` / `RoleKey` / `NarsTruth` / `BlackboardEntry` from crossing the membrane. Archetype types defined in this crate are INSIDE-BBB; they do NOT appear on `CognitiveEventRow`. The scalar projection for "archetype tick happened" is already covered by `CognitiveEventRow.cycle_fp_hi/lo` + `MetaWord`.

Mapping (locked, do not re-litigate):

| ECS concept | lance-graph-contract type | This crate |
|---|---|---|
| Entity | `contract::persona::PersonaCard` | imported, not redefined |
| World | `contract::a2a_blackboard::Blackboard` (runtime) + `World { dataset_uri, tick }` (archetype meta) | the latter is new here |
| Tick | `contract::collapse_gate::GateDecision` fire | imported, not redefined |
| Component | trait in this crate | **DU-2.2** |
| Processor | trait in this crate | **DU-2.3** |
| CommandBroker | struct in this crate | **DU-2.5** |

## File layout

```
crates/lance-graph-archetype/
  Cargo.toml
  src/
    lib.rs              # pub use component::*; etc.
    component.rs        # trait Component
    processor.rs        # trait Processor
    world.rs            # struct World
    command_broker.rs   # struct CommandBroker, enum Command
    error.rs            # enum ArchetypeError (thiserror)
```

## Test layout

Each module gets a `#[cfg(test)] mod tests` with at minimum one test. Minimum 4 tests total:

1. `component::tests::mock_component_has_arrow_field`
2. `processor::tests::trait_object_is_constructable`
3. `world::tests::fork_returns_unimplemented`
4. `world::tests::tick_increments`

## Dependencies

```toml
[dependencies]
lance-graph-contract = { path = "../lance-graph-contract" }
arrow = { workspace = true }
thiserror = { workspace = true }

[dev-dependencies]
# nothing initially
```

Cargo.lock

Lines changed: 9 additions & 0 deletions

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ members = [
     "crates/lance-graph-contract",
     "crates/neural-debug",
     "crates/lance-graph-callcenter",
+    "crates/lance-graph-archetype",
 ]
 exclude = [
     # Python bindings (upstream-inherited, opt-in via --manifest-path)
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
[package]
name = "lance-graph-archetype"
version = "0.1.0"
edition = "2021"
description = "Archetype transcode scaffold (ECS-style types transcoded to Arrow/Lance). Per ADR-0001 Decision 1: defines its OWN Rust interface (not a mirror of upstream Python). Inside-BBB only: types in this crate never cross the CognitiveEventRow membrane."
license = "Apache-2.0"
keywords = ["lance", "graph", "archetype", "ecs", "transcode"]

[dependencies]
lance-graph-contract = { path = "../lance-graph-contract" }
# NOTE: plan specified { workspace = true } for arrow/thiserror, but this
# workspace has no shared [workspace.dependencies] table today; using
# explicit versions consistent with the rest of the codebase (arrow 57,
# thiserror 2). See PR description for the single-line deviation.
arrow = "57"
thiserror = "2"

[dev-dependencies]
# nothing initially
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
//! `CommandBroker`: a queue for deferred world mutations (`Vec`-backed
//! at the scaffold stage).
//!
//! Per ADR-0001 Decision 1, `CommandBroker` is the archetype-side
//! equivalent of Bevy's `Commands` / the Python ECS `CommandBroker`:
//! a FIFO queue of world mutations that accumulates during a
//! Processor pass and is drained by the World at tick boundaries.
//!
//! Stub-only at this stage: `submit` accepts commands, `drain` returns
//! what was submitted in order. Actual application-to-World logic lands
//! in DU-2.7 together with Entity wiring.

/// A deferred world-mutation command. The three variants cover the
/// ECS-standard operations (spawn a new entity, despawn an existing
/// one, or update components on one). Payloads are opaque at the
/// scaffold stage; DU-2.7 will parameterise them over concrete
/// `Component` types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Command {
    /// Spawn a new entity. The `u64` is a placeholder for the
    /// eventually-to-be-typed component bundle identifier.
    Spawn(u64),

    /// Despawn an entity by its integer ID.
    Despawn(u64),

    /// Update the entity identified by the first `u64` with the
    /// component-bundle identifier in the second `u64`.
    Update(u64, u64),
}

/// FIFO queue of deferred commands.
///
/// Used by Processors to schedule world mutations without mutating the
/// World mid-pass (which would break the Arrow-batch transcode model).
/// The World's tick driver calls `drain` at tick boundaries and applies
/// the commands in order. The scaffold uses a `Vec<Command>`; DU-2.7
/// may upgrade to a `std::sync::mpsc::channel` for multi-processor
/// concurrency.
#[derive(Debug, Default, Clone)]
pub struct CommandBroker {
    queue: Vec<Command>,
}

impl CommandBroker {
    /// Construct an empty broker. No allocation is performed until the
    /// first `submit`.
    pub fn new() -> Self {
        Self { queue: Vec::new() }
    }

    /// Enqueue a command. O(1) amortised.
    pub fn submit(&mut self, cmd: Command) {
        self.queue.push(cmd);
    }

    /// Drain all queued commands in insertion order. Returns an owned
    /// `Vec<Command>`; the broker is empty afterwards. O(n).
    pub fn drain(&mut self) -> Vec<Command> {
        std::mem::take(&mut self.queue)
    }

    /// Read the current queue length without draining.
    pub fn len(&self) -> usize {
        self.queue.len()
    }

    /// `true` iff no commands are queued.
    pub fn is_empty(&self) -> bool {
        self.queue.is_empty()
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn new_broker_is_empty() {
        let b = CommandBroker::new();
        assert_eq!(b.len(), 0);
        assert!(b.is_empty());
    }

    #[test]
    fn submit_and_drain_preserves_order() {
        let mut b = CommandBroker::new();
        b.submit(Command::Spawn(1));
        b.submit(Command::Update(1, 7));
        b.submit(Command::Despawn(1));
        assert_eq!(b.len(), 3);

        let drained = b.drain();
        assert_eq!(
            drained,
            vec![Command::Spawn(1), Command::Update(1, 7), Command::Despawn(1)]
        );
        assert!(b.is_empty());
    }
}
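The doc comments describe a tick driver that drains the broker at tick boundaries and applies the commands in order, but that application logic is deferred to DU-2.7. The following std-only sketch shows one way such a drain-and-apply step might look. `EntityTable`, its sequential ID allocation, and `run_tick_boundary` are hypothetical names and semantics invented for this illustration; `Command` and `CommandBroker` are re-declared in reduced form so the block stands alone.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Command {
    Spawn(u64),
    Despawn(u64),
    Update(u64, u64),
}

/// Reduced copy of the broker: just enough to submit and drain.
#[derive(Debug, Default)]
pub struct CommandBroker {
    queue: Vec<Command>,
}

impl CommandBroker {
    pub fn submit(&mut self, cmd: Command) {
        self.queue.push(cmd);
    }

    pub fn drain(&mut self) -> Vec<Command> {
        std::mem::take(&mut self.queue)
    }
}

/// Hypothetical entity store: entity ID -> component-bundle ID.
#[derive(Debug, Default)]
pub struct EntityTable {
    next_id: u64,
    bundles: BTreeMap<u64, u64>,
}

impl EntityTable {
    /// Apply drained commands in submission order (assumed semantics).
    pub fn apply(&mut self, commands: Vec<Command>) {
        for cmd in commands {
            match cmd {
                // Assumption: the table allocates sequential entity IDs.
                Command::Spawn(bundle) => {
                    self.bundles.insert(self.next_id, bundle);
                    self.next_id += 1;
                }
                Command::Despawn(entity) => {
                    self.bundles.remove(&entity);
                }
                // Updates to unknown entities are silently ignored here.
                Command::Update(entity, bundle) => {
                    if let Some(slot) = self.bundles.get_mut(&entity) {
                        *slot = bundle;
                    }
                }
            }
        }
    }

    pub fn bundle_of(&self, entity: u64) -> Option<u64> {
        self.bundles.get(&entity).copied()
    }

    pub fn len(&self) -> usize {
        self.bundles.len()
    }
}

/// One tick boundary: drain the broker, apply everything, report the count.
pub fn run_tick_boundary(broker: &mut CommandBroker, table: &mut EntityTable) -> usize {
    let drained = broker.drain();
    let applied = drained.len();
    table.apply(drained);
    applied
}
```

Because `drain` hands back an owned `Vec` and leaves the broker empty, the apply step never observes commands submitted while it runs, which is the deferral property the module docs describe.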
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
//! The `Component` trait: ECS-style component definition transcoded to Arrow.
//!
//! Per ADR-0001 Decision 1 (Archetype Transcode, not bridge): a `Component`
//! is a Rust-side type that declares how it projects into an Arrow
//! `Field`. The transcode surface is Arrow because every downstream
//! consumer of this crate lands in a Lance dataset; the `arrow_field`
//! method is what a `Processor` keys its `matches(schema)` check against.
//!
//! This trait deliberately stays Sized-agnostic at the scaffold stage:
//! only associated functions, no self-receiver. Implementors declare
//! static metadata (field shape, type ID) and the runtime machinery
//! lives elsewhere.

use arrow::datatypes::Field;

/// An ECS-style component that knows how to project itself into an Arrow
/// `Field`. Components do not carry row data at this stage; they declare
/// SHAPE. Row data flows through `RecordBatch`es handed to `Processor`.
///
/// **BBB-invariant:** component types defined by implementors live
/// INSIDE-BBB. They do not cross the external membrane (see
/// `lance_graph_contract::external_membrane`). The scalar projection
/// "a component tick happened" is carried by `CognitiveEventRow`'s
/// existing columns (`cycle_fp_hi/lo`, `MetaWord`); this crate does
/// not extend that row.
pub trait Component {
    /// Arrow field descriptor for this component. Called once at
    /// `Processor::matches` time, not per-row. Implementors should
    /// return a `Field` with a stable name and dtype.
    fn arrow_field() -> Field;

    /// Stable string identifier for this component type. Used by the
    /// `CommandBroker` drain path to address entities-by-component
    /// without relying on Rust's `TypeId` (which is not stable across
    /// builds). Convention: `"<crate>::<type>"`.
    fn type_id() -> &'static str;
}

#[cfg(test)]
mod tests {
    use super::*;
    use arrow::datatypes::DataType;

    /// Test-only component used to assert that the trait is implementable
    /// and that its metadata is reachable without constructing a value.
    struct MockComponent;

    impl Component for MockComponent {
        fn arrow_field() -> Field {
            Field::new("mock_component", DataType::Int64, false)
        }

        fn type_id() -> &'static str {
            "lance_graph_archetype::tests::MockComponent"
        }
    }

    #[test]
    fn mock_component_has_arrow_field() {
        let field = MockComponent::arrow_field();
        assert_eq!(field.name(), "mock_component");
        assert_eq!(field.data_type(), &DataType::Int64);
        assert!(!field.is_nullable());
    }

    #[test]
    fn mock_component_type_id_is_stable() {
        assert_eq!(
            MockComponent::type_id(),
            "lance_graph_archetype::tests::MockComponent"
        );
    }
}
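The module docs say that `arrow_field` is what a `Processor` keys its `matches(schema)` check against. A rough sketch of that idea follows. To keep the block std-only, `DataType` and `Field` here are simplified stand-ins for the `arrow::datatypes` types, a schema is modelled as a slice of fields, and `Health` and `schema_has_component` are hypothetical names; none of this is taken from the commit's `processor.rs`.

```rust
/// Stand-in for `arrow::datatypes::DataType` (two variants, illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DataType {
    Int64,
    Utf8,
}

/// Stand-in for `arrow::datatypes::Field`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Field {
    pub name: &'static str,
    pub data_type: DataType,
    pub nullable: bool,
}

/// Same shape as the crate's trait: associated functions only.
pub trait Component {
    fn arrow_field() -> Field;
    fn type_id() -> &'static str;
}

/// Hypothetical component declaring a non-nullable Int64 column.
pub struct Health;

impl Component for Health {
    fn arrow_field() -> Field {
        Field { name: "health", data_type: DataType::Int64, nullable: false }
    }

    fn type_id() -> &'static str {
        "demo::Health"
    }
}

/// Returns true when the schema has a column with the component's
/// name and data type. Nullability is deliberately not compared here.
pub fn schema_has_component<C: Component>(schema: &[Field]) -> bool {
    let wanted = C::arrow_field();
    schema
        .iter()
        .any(|f| f.name == wanted.name && f.data_type == wanted.data_type)
}
```

Because the trait has only associated functions and no `self` receiver, it is reached through a generic parameter (`C: Component`) rather than through a `dyn Component` trait object. With the real `arrow` crate, a comparable check could look the column up via `Schema::field_with_name` and compare `data_type()`.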
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
//! Error type for the archetype transcode crate.
//!
//! Per ADR-0001 Decision 1, this crate defines its own error surface rather
//! than mirroring the Python `VangelisTech/archetype` exceptions. The
//! variants below are scoped to the scaffold (DU-2.1..2.6); Lance I/O
//! wiring for `World::fork` / `World::at_tick` is deliberately parked
//! behind DU-2.8.

use thiserror::Error;

/// Top-level error type for archetype transcode operations.
///
/// All fallible methods in this crate return `Result<T, ArchetypeError>`.
/// The `Unimplemented` variant is used for stubs that will be wired in
/// follow-up deliverables; see `World::fork` / `World::at_tick` for the
/// canonical example.
#[derive(Debug, Error)]
pub enum ArchetypeError {
    /// A stub method that has not yet been wired. The `method` field names
    /// the specific method (for example, `"World::fork"`). Once the
    /// corresponding deliverable (DU-2.7 / DU-2.8) lands, the variant
    /// stays but the call site no longer returns it.
    #[error("archetype method `{method}` is not yet implemented (scaffold stub)")]
    Unimplemented {
        /// Fully-qualified method name, for example `"World::fork"`.
        method: &'static str,
    },

    /// A `Processor::process` invocation received a `RecordBatch` whose
    /// schema does not match what the processor declared via `matches`.
    /// The `expected` / `actual` fields are human-readable descriptions;
    /// no Arrow schema equality is defined at the scaffold stage.
    #[error("schema mismatch: expected {expected}, got {actual}")]
    SchemaMismatch {
        /// Human-readable description of the expected schema.
        expected: String,
        /// Human-readable description of the actual schema that arrived.
        actual: String,
    },

    /// Placeholder for Lance dataset I/O errors. Once DU-2.8 wires
    /// `lance::checkout(branch)` into `World::fork`, the inner type will
    /// be upgraded from `String` to `lance::Error`. Today it carries a
    /// bare message: no `lance` dependency on this PR per the plan's
    /// Non-goals section.
    #[error("lance I/O error: {0}")]
    LanceIo(String),
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn unimplemented_carries_method_name() {
        let err = ArchetypeError::Unimplemented { method: "World::fork" };
        let msg = format!("{err}");
        assert!(msg.contains("World::fork"));
        assert!(msg.contains("not yet implemented"));
    }

    #[test]
    fn schema_mismatch_formats() {
        let err = ArchetypeError::SchemaMismatch {
            expected: "Schema{a: Int32}".to_string(),
            actual: "Schema{a: Utf8}".to_string(),
        };
        let msg = format!("{err}");
        assert!(msg.contains("expected"));
        assert!(msg.contains("Int32"));
        assert!(msg.contains("Utf8"));
    }
}
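For readers unfamiliar with `thiserror`, the `#[derive(Error)]` plus `#[error("...")]` attributes above amount to a `Display` impl and a `std::error::Error` impl. The following hand-written, std-only equivalent is a sketch for illustration, not the macro's literal expansion; `describe` is a hypothetical helper added to show the type being used as a `dyn Error`.

```rust
use std::fmt;

#[derive(Debug)]
pub enum ArchetypeError {
    Unimplemented { method: &'static str },
    SchemaMismatch { expected: String, actual: String },
    LanceIo(String),
}

// Roughly what the `#[error("...")]` attributes provide: one format
// string per variant, with fields interpolated by name or position.
impl fmt::Display for ArchetypeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Unimplemented { method } => write!(
                f,
                "archetype method `{method}` is not yet implemented (scaffold stub)"
            ),
            Self::SchemaMismatch { expected, actual } => {
                write!(f, "schema mismatch: expected {expected}, got {actual}")
            }
            Self::LanceIo(msg) => write!(f, "lance I/O error: {msg}"),
        }
    }
}

// Roughly what `#[derive(Error)]` provides when no source is declared.
impl std::error::Error for ArchetypeError {}

/// Hypothetical helper: accepts any error through the std trait object.
pub fn describe(err: &dyn std::error::Error) -> String {
    format!("error: {err}")
}
```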
