Commit 406bf21

📄 docs(agents): update AGENTS.md files for v1.0.0 structure
Sync the root, containers and storage AGENTS.md with the post-V1 repo:

- root: fix project tree (drop removed bridges/pipelines/post/examples, add cli/types/viewer/downloadable_examples), fix Core abstractions table (Dataset/Features/FeatureIdentifier removed -> Sample/Infos/ProblemDefinition), doc tooling Sphinx -> Zensical, build command -> docs/generate_doc.sh
- containers: Dataset/Features/FeatureIdentifier no longer exist; document Sample (pydantic BaseModel), DefaultManager and utils helpers
- storage: add backend_api.py (BackendModule Protocol) and the BACKENDS registry
1 parent 8da3198 commit 406bf21

3 files changed

Lines changed: 55 additions & 31 deletions

AGENTS.md

Lines changed: 24 additions & 15 deletions
@@ -48,7 +48,7 @@ issues, or documentation).
 - **Build backend**: setuptools with setuptools-scm (dynamic versioning)
 - **Linter/formatter**: ruff
 - **Test framework**: pytest
-- **Documentation**: Sphinx (ReadTheDocs)
+- **Documentation**: Zensical (with mkdocstrings for the API reference), published on ReadTheDocs
 - **CI/CD**: GitHub Actions
 
 ## Project structure
@@ -61,38 +61,47 @@ issues, or documentation).
 ├── CHANGELOG.md <- Version history
 ├── CONTRIBUTING.md <- Contribution guidelines
 ├── src/plaid/ <- Source code
-│ ├── __init__.py
+│ ├── __init__.py <- Public API: Sample, Infos, ProblemDefinition
 │ ├── constants.py <- Global constants
 │ ├── problem_definition.py <- ProblemDefinition (core concept)
-│ ├── containers/ <- Dataset, Sample, Features (see nested AGENTS.md)
+│ ├── infos.py <- Infos (dataset/problem metadata)
+│ ├── containers/ <- Sample container + helpers (see nested AGENTS.md)
 │ ├── storage/ <- Storage backends: zarr, hf_datasets, cgns (see nested AGENTS.md)
-│ ├── bridges/ <- HuggingFace bridge utilities
-│ ├── pipelines/ <- sklearn-compatible processing blocks
-│ ├── post/ <- Post-processing (metrics, bisection)
-│ └── examples/ <- Built-in example datasets
+│ ├── types/ <- Shared type aliases and definitions
+│ ├── cli/ <- Command-line entry points (e.g. plaidcheck)
+│ ├── viewer/ <- Dataset visualization services
+│ └── downloadable_examples/ <- Built-in downloadable example datasets
 ├── tests/ <- Test suite
 ├── docs/ <- Sphinx documentation source
-├── examples/ <- Usage examples
-└── benchmarks/ <- Performance benchmarks
+└── examples/ <- Usage examples
 ```
 
+> Note: the v1.0.0 reorganization removed the top-level `Dataset` re-export and the
+> `bridges/`, `pipelines/`, `post/` and `examples/` source packages. Data is now handled
+> through `Sample` objects and the `storage` layer. See `docs/source/upgrade_guide.md`.
+
 ## Architecture and key concepts
 
 ### Core abstractions
 
 | Concept | Module | Description |
 |---------|--------|-------------|
 | `ProblemDefinition` | `problem_definition.py` | Declares fields, meshes, and their roles (input/output/context) for a physics problem |
-| `Sample` | `containers/sample.py` | One simulation snapshot: mesh + field values |
-| `Dataset` | `containers/dataset.py` | Ordered collection of Samples with shared ProblemDefinition |
-| `Features` | `containers/features.py` | Named tensor-like data with metadata |
-| `FeatureIdentifier` | `containers/feature_identifier.py` | Unique key to identify a feature across samples |
+| `Infos` | `infos.py` | Metadata describing a dataset/problem (legal, data production, etc.) |
+| `Sample` | `containers/sample.py` | One simulation snapshot: mesh + field values (a pydantic `BaseModel`) |
+
+`Sample`, `Infos` and `ProblemDefinition` are re-exported at the top level of the
+`plaid` package, together with the helpers `get_number_of_samples` and `get_sample_ids`
+from `containers/utils.py`.
 
 ### Storage pattern
 
 Storage uses a **Registry pattern** (`storage/registry.py`) to dispatch read/write
 operations to the correct backend (zarr, hf_datasets, cgns). Each backend implements
-a `reader.py` and `writer.py` following a common interface defined in `storage/common/`.
+a `reader.py` and `writer.py` following the backend contract defined in
+`storage/backend_api.py` and the shared interfaces in `storage/common/`.
+Reading/writing a collection of samples is done through this storage layer rather
+than through a dedicated `Dataset` class.
 
 ## Code conventions
 
@@ -182,7 +191,7 @@ uv run ruff check --fix .
 uv run ruff format .
 
 # Build documentation
-cd docs && make html
+bash docs/generate_doc.sh
 ```
 
 ## Contribution workflow

src/plaid/containers/AGENTS.md

Lines changed: 21 additions & 12 deletions
@@ -1,28 +1,37 @@
 # AGENTS.md -- plaid/containers
 
-This module defines the core data containers of the PLAID data model.
+This module defines the core data container of the PLAID data model.
 
 ## Key classes
 
 | Class | File | Description |
 |-------|------|-------------|
-| `Dataset` | `dataset.py` | Ordered collection of `Sample` objects sharing a common `ProblemDefinition`. Main entry point for loading and manipulating simulation data. |
-| `Sample` | `sample.py` | Single simulation snapshot containing mesh coordinates and field values as `Features`. |
-| `Features` | `features.py` | Named tensor-like container with shape and dtype metadata. Wraps numpy arrays. |
-| `FeatureIdentifier` | `feature_identifier.py` | Immutable key (name + location) used to uniquely identify a feature across samples. |
-| `DefaultManager` | `managers/default_manager.py` | Manages default values and missing data for features within a dataset. |
+| `Sample` | `sample.py` | Single simulation snapshot containing mesh coordinates and field values. Implemented as a pydantic `BaseModel`. This is the main data container exposed by plaid. |
+| `DefaultManager` | `managers/default_manager.py` | Manages default values and missing data for features within a sample. |
+
+Helper functions live in `utils.py` (e.g. `get_number_of_samples`, `get_sample_ids`)
+and are re-exported at the top level of the `plaid` package.
+
+> Note: the v1.0.0 reorganization removed the `Dataset`, `Features` and
+> `FeatureIdentifier` classes. A collection of samples is now read/written through the
+> `storage` layer rather than a dedicated `Dataset` class. See `docs/source/upgrade_guide.md`.
 
 ## Design constraints
 
-- `Dataset` is a **large class** (~1800 lines). Avoid adding new responsibilities to it. Prefer extracting logic into helper functions or dedicated modules.
-- `Sample` and `Features` are **value objects** -- they should remain simple, with minimal business logic.
-- `FeatureIdentifier` is **immutable and hashable** -- it is used as dictionary keys throughout the codebase. Do not add mutable state.
-- All containers must support **serialization** through the storage backends (zarr, hf_datasets, cgns).
+- `Sample` is a **value object** built on pydantic -- keep it focused on holding mesh
+and field data, with minimal business logic. Prefer extracting heavy logic into
+helper functions or dedicated modules.
+- `DefaultManager` centralizes default/missing-data handling -- do not duplicate this
+logic inside `Sample`.
+- All containers must support **serialization** through the storage backends
+(zarr, hf_datasets, cgns).
 
 ## Downstream impact
 
-These classes are the public API surface consumed by downstream libraries and end users. Any signature change is a **breaking change** that requires a major version bump.
+`Sample` is part of the public API surface consumed by downstream libraries and end
+users. Any signature change is a **breaking change** that requires a major version bump.
 
 ## Testing
 
-Tests are in `tests/`. When modifying a container class, verify that storage round-trips (write then read) still produce identical data.
+Tests are in `tests/`. When modifying a container class, verify that storage round-trips
+(write then read) still produce identical data.
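
The round-trip check recommended in the Testing section above could look roughly like the sketch below. It only illustrates the pattern: `write_samples`, `read_samples` and `check_round_trip` are hypothetical stand-ins that store plain dicts as JSON, not plaid's writer/reader API, and a real test would build `Sample` objects and go through the storage backends instead.

```python
import json
import tempfile
from pathlib import Path


def write_samples(path: Path, samples: list[dict]) -> None:
    """Hypothetical stand-in writer: one JSON file per sample."""
    path.mkdir(parents=True, exist_ok=True)
    for index, sample in enumerate(samples):
        (path / f"sample_{index:06d}.json").write_text(json.dumps(sample))


def read_samples(path: Path) -> list[dict]:
    """Hypothetical stand-in reader: load the samples back in index order."""
    files = sorted(path.glob("sample_*.json"))
    return [json.loads(file.read_text()) for file in files]


def check_round_trip(samples: list[dict]) -> bool:
    """Write the samples to a temporary directory, read them back, compare."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "dataset"
        write_samples(target, samples)
        return read_samples(target) == samples
```

In a pytest suite, the same comparison would typically be parametrized over the format names (`"cgns"`, `"hf_datasets"`, `"zarr"`) so that each backend is exercised by the same assertion.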

src/plaid/storage/AGENTS.md

Lines changed: 10 additions & 4 deletions
@@ -9,6 +9,7 @@ Storage follows a **Registry pattern**:
 ```
 storage/
 ├── registry.py <- Dispatches to the correct backend based on format
+├── backend_api.py <- Backend contract (BackendModule Protocol)
 ├── reader.py <- Public read API (delegates to backend readers)
 ├── writer.py <- Public write API (delegates to backend writers)
 ├── common/ <- Abstract interfaces and shared utilities
@@ -23,15 +24,20 @@ storage/
 
 ## How it works
 
-1. The **registry** (`registry.py`) maps format identifiers to backend modules.
+1. The **registry** (`registry.py`) holds a `BACKENDS` dict mapping each format name
+(`"cgns"`, `"hf_datasets"`, `"zarr"`) to its backend class, exposed through
+`get_backend(name)` and `available_backends()`.
 2. The public `reader.py` and `writer.py` at the top level accept a format parameter and delegate to the appropriate backend.
-3. Each backend implements the interfaces defined in `common/reader.py` and `common/writer.py`.
+3. Each backend exposes a backend class (e.g. `ZarrBackend`, `HFBackend`, `CgnsBackend`)
+that conforms to the `BackendModule` Protocol in `backend_api.py`, and implements the
+read/write logic in its `reader.py` and `writer.py`.
 
 ## Adding a new backend
 
 1. Create a new subdirectory under `storage/` (e.g., `storage/my_format/`).
-2. Implement `reader.py` and `writer.py` following the interfaces in `common/`.
-3. Register the new backend in `registry.py`.
+2. Implement a backend class conforming to the `BackendModule` Protocol
+(`backend_api.py`), with its `reader.py` and `writer.py` following the interfaces in `common/`.
+3. Register the new backend by adding it to the `BACKENDS` dict in `registry.py`.
 4. Add round-trip tests (write then read) to verify data integrity.
 
 ## Design constraints
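
The registry described in this file's diff (a `BACKENDS` dict, `get_backend(name)`, `available_backends()`, and a `BackendModule` Protocol) can be sketched as follows. This is a minimal illustration under assumptions, not plaid's code: the diff does not show the Protocol's members, so `name`, `write` and `read` are hypothetical, as are `InMemoryBackend` and the error handling in `get_backend`.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class BackendModule(Protocol):
    """Hypothetical contract; the real Protocol's members are not shown in the diff."""

    name: str

    def write(self, path: str, samples: list[dict[str, Any]]) -> None: ...

    def read(self, path: str) -> list[dict[str, Any]]: ...


class InMemoryBackend:
    """Toy backend (hypothetical) that keeps samples in a dict instead of on disk."""

    name = "in_memory"

    def __init__(self) -> None:
        self._store: dict[str, list[dict[str, Any]]] = {}

    def write(self, path: str, samples: list[dict[str, Any]]) -> None:
        # Copy each sample so later mutations by the caller do not leak in.
        self._store[path] = [dict(sample) for sample in samples]

    def read(self, path: str) -> list[dict[str, Any]]:
        return [dict(sample) for sample in self._store[path]]


# Format name -> backend class, mirroring the BACKENDS dict named in the diff.
BACKENDS: dict[str, type] = {"in_memory": InMemoryBackend}


def available_backends() -> list[str]:
    """Return the registered format names in sorted order."""
    return sorted(BACKENDS)


def get_backend(name: str) -> type:
    """Look up a backend class; raising ValueError here is an assumption."""
    try:
        return BACKENDS[name]
    except KeyError:
        raise ValueError(
            f"Unknown backend {name!r}; available: {available_backends()}"
        ) from None
```

Under this sketch, registering a new format is a single entry in `BACKENDS`. Because the Protocol is structural, a backend class conforms by having matching members and does not need to inherit from `BackendModule`.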
