Commit bc98b7e

bitnergadomski and Pete Gadomski authored
feat: add generic search client traits and adapters (#994)
### Description

Introduce a family of search client traits with blanket implementations and adapter utilities, replacing ad-hoc per-backend boilerplate with a consistent, extensible design.

#### New traits (`stac::api`)

| Trait | Purpose | Required method |
| ----- | ------- | --------------- |
| `ItemsClient` | Single-page item search | `search` |
| `StreamItemsClient` | Stream items across all pages | `search_stream` |
| `CollectionsClient` | Fetch all collections at once | `collections` |
| `PagedCollectionsClient` | Cursor-paginated collections (future-proofing) | `collections_page` |
| `StreamCollectionsClient` | Stream all collections | `collections_stream` |
| `ArrowItemsClient` *(geoarrow)* | Arrow record batch output | `search_to_arrow` |
| `TransactionClient` | Write items and collections | `add_item`, `add_collection` |

#### Blanket implementations

- `CollectionsClient + Clone + Sync → StreamCollectionsClient` — eagerly fetches all collections and yields them as a stream; no wrapper struct needed
- `ArrowItemsClient + Sync → ItemsClient + StreamItemsClient` *(geoarrow feature)* — collects record batches synchronously and returns owned items

#### Adapter utilities

- `PagedItemsStream<T>` — wraps any `ItemsClient` to provide `StreamItemsClient` via token/skip pagination (`ItemCollection::next`)
- `stream_pages_generic` — free function driving the items pagination loop; used by `PagedItemsStream` and all server backends
- `stream_pages_collections_generic` — collections equivalent for `PagedCollectionsClient` backends (ready for future paginated `/collections` support)
- `RecordBatchReaderAdapter<I>` *(geoarrow)* — bridges `Iterator<Item = Result<RecordBatch, E>>` to `arrow_array::RecordBatchReader`

#### Backend implementations

All three server backends (memory, duckdb, pgstac) implement the full trait family. `stac-duckdb` provides `HrefClient` (`ArrowItemsClient`) and `SyncHrefClient` (`ItemsClient` + `CollectionsClient` + `StreamItemsClient` via `Mutex`). `stac-io`'s `Client` implements `StreamItemsClient` with HATEOAS link-following rather than token pagination.

#### Design notes

- The `ItemsClient + Clone → StreamItemsClient` blanket cannot be added because it would overlap with the `ArrowItemsClient` blanket under Rust's coherence rules. Server backends use `stream_pages_generic` directly in their explicit `StreamItemsClient` impls.
- `PagedCollectionsClient` has no blanket `StreamCollectionsClient` for the same reason (it would overlap with the `CollectionsClient` blanket). Paginated backends call `stream_pages_collections_generic` in their own impl.

### Related issues

- Groundwork for federated search / streaming backends

### Checklist

- [x] Unit tests
- [x] Documentation, including doctests
- [x] Pull request title follows [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/)
- [x] Pre-commit hooks pass (`prek run --all-files`)

---------

Co-authored-by: Pete Gadomski <pete.gadomski@gmail.com>
1 parent 0f0158e commit bc98b7e

18 files changed — 1006 additions & 126 deletions

File tree

.github/copilot-instructions.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
# rustac - Agent Instructions

## Workspace map

- `crates/core` (`stac`): canonical types, API traits, shared logic.
- `crates/io`: HTTP and object-store I/O.
- `crates/duckdb`: DuckDB-backed querying.
- `crates/server`: API server backends and wiring.
- `crates/pgstac`: pgstac integration.
- `crates/extensions`, `crates/validate`, `crates/wasm`, `crates/derive`, `crates/cli`: supporting crates.
## Build and validation

```sh
cargo test
cargo test -p stac
prek run --all-files
```

DuckDB tests may require `DUCKDB_LIB_DIR`; alternatively use `--features duckdb-bundled` where supported.
## High-level rules

- Keep changes minimal and crate-local.
- Keep default builds lightweight; gate optional capabilities behind features.
- Preserve cross-crate consistency in naming, error shape, and docs style.
## Error handling

- One public error enum per crate in `src/error.rs` (typically `crate::Error`).
- Use `thiserror`, `#[non_exhaustive]`, and `#[error(transparent)]`/`#[from]` for wrappers.
- Document each error variant with a short doc comment.
- Do not introduce parallel error enums when a new variant on crate `Error` is sufficient.
## API and naming

- Use names that describe behavior (for traits/functions), not implementation detail.
- Avoid redundant suffixes/prefixes (`_generic`, duplicated context words).
- Keep trait families coherent (`Items*`, `Collections*`, streaming variants).
- Prefer explicit conversion adapters where coherence or ownership prevents a blanket impl.
## Features and dependencies

- Heavy or optional functionality must be feature-gated.
- Keep Arrow/GeoArrow/GeoParquet dependencies scoped to relevant features.
- Avoid widening default feature surfaces without strong need.
- Prefer workspace dependency versions and existing crate patterns.
## Async and memory behavior

- Treat `Stream` as the async equivalent of `Iterator`.
- Prefer streaming paths for large result sets.
- If buffering is required, document the reason and bound memory when practical.
- Be explicit about sync/async boundaries (for example, borrowed readers and blocking APIs).
## Documentation

- Library/API behavior belongs in Rust doc comments (`//!` and `///`).
- Use module-level docs for design overviews and trait relationships.
- Keep `docs/` focused on user-facing MkDocs content (CLI/history/site pages), not deep library internals.
- Keep examples realistic and compile-aware (`no_run` when needed).
## Testing

- Favor real implementations for behavior tests.
- Reserve mocks mainly for network/HTTP boundary simulation.
- Unit tests: colocated in `#[cfg(test)] mod tests`.
- Integration tests: crate `tests/` directories.
- Use `#[tokio::test]` for async code paths.
## Code style

- Remove stray/placeholder comments and dead code.
- Keep function docs and type docs short, factual, and behavior-focused.
- Match existing formatting and module organization.
## PR and git hygiene

- Use conventional commit style for PR titles.
- Run `prek run --all-files` before finalizing.
- Keep history clean; squash fixups when requested.
- Include tests and docs updates for behavior changes.

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@ node_modules/
 package-lock.json
 crates/wasm/tests/__screenshots__
 .yarn/
+.plans/

crates/core/Cargo.toml

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,7 @@ rust-version.workspace = true
 [features]
 std = []
+async = ["dep:async-stream", "dep:futures", "dep:futures-core"]
 geo = ["dep:geo"]
 geoarrow = [
     "dep:geoarrow-array",
@@ -28,13 +29,16 @@ geoarrow = [
 geoparquet = ["geoarrow", "dep:geoparquet", "dep:parquet"]

 [dependencies]
+async-stream = { workspace = true, optional = true }
+futures = { workspace = true, optional = true }
 arrow-array = { workspace = true, optional = true, features = ["chrono-tz"] }
 arrow-cast = { workspace = true, optional = true }
 arrow-json = { workspace = true, optional = true }
 arrow-schema = { workspace = true, optional = true }
 bytes.workspace = true
 chrono = { workspace = true, features = ["serde"] }
 cql2.workspace = true
+futures-core = { workspace = true, optional = true }
 geo = { workspace = true, optional = true }
 geo-traits = { workspace = true, optional = true }
 geo-types = { workspace = true, optional = true }
