-This module defines the core data containers of the PLAID data model.
+This module defines the core data container of the PLAID data model.

 ## Key classes

 | Class | File | Description |
 |-------|------|-------------|
-| `Dataset` | `dataset.py` | Ordered collection of `Sample` objects sharing a common `ProblemDefinition`. Main entry point for loading and manipulating simulation data. |
-| `Sample` | `sample.py` | Single simulation snapshot containing mesh coordinates and field values as `Features`. |
-| `Features` | `features.py` | Named tensor-like container with shape and dtype metadata. Wraps numpy arrays. |
-| `FeatureIdentifier` | `feature_identifier.py` | Immutable key (name + location) used to uniquely identify a feature across samples. |
-| `DefaultManager` | `managers/default_manager.py` | Manages default values and missing data for features within a dataset. |
+| `Sample` | `sample.py` | Single simulation snapshot containing mesh coordinates and field values. Implemented as a pydantic `BaseModel`. This is the main data container exposed by plaid. |
+| `DefaultManager` | `managers/default_manager.py` | Manages default values and missing data for features within a sample. |
+
+Helper functions live in `utils.py` (e.g. `get_number_of_samples`, `get_sample_ids`)
+and are re-exported at the top level of the `plaid` package.
+
+> Note: the v1.0.0 reorganization removed the `Dataset`, `Features` and
+> `FeatureIdentifier` classes. A collection of samples is now read/written through the
+> `storage` layer rather than a dedicated `Dataset` class. See `docs/source/upgrade_guide.md`.

 ## Design constraints

-- `Dataset` is a **large class** (~1800 lines). Avoid adding new responsibilities to it. Prefer extracting logic into helper functions or dedicated modules.
-- `Sample` and `Features` are **value objects** -- they should remain simple, with minimal business logic.
-- `FeatureIdentifier` is **immutable and hashable** -- it is used as dictionary keys throughout the codebase. Do not add mutable state.
-- All containers must support **serialization** through the storage backends (zarr, hf_datasets, cgns).
+- `Sample` is a **value object** built on pydantic -- keep it focused on holding mesh
+  and field data, with minimal business logic. Prefer extracting heavy logic into
+  helper functions or dedicated modules.
+- `DefaultManager` centralizes default/missing-data handling -- do not duplicate this
+  logic inside `Sample`.
+- All containers must support **serialization** through the storage backends
+  (zarr, hf_datasets, cgns).
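
The first two constraints can be illustrated with a minimal sketch. It does not use PLAID's actual classes: `SampleSketch`, `DefaultsSketch` and `mean_field_value` are hypothetical names, and a frozen standard-library dataclass stands in for the pydantic `BaseModel` so that the example runs without third-party packages.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass(frozen=True)
class SampleSketch:
    """Hypothetical value object: it only holds data, no business logic."""

    coordinates: tuple[tuple[float, float], ...]
    fields: dict[str, tuple[float, ...]] = field(default_factory=dict)


class DefaultsSketch:
    """Hypothetical single home for default / missing-data handling."""

    def __init__(self, defaults: dict[str, float]):
        self._defaults = dict(defaults)

    def get_field(self, sample: SampleSketch, name: str) -> tuple[float, ...]:
        # Use the stored field if present, otherwise a constant default
        # with one value per mesh node.
        if name in sample.fields:
            return sample.fields[name]
        return (self._defaults[name],) * len(sample.coordinates)


def mean_field_value(sample: SampleSketch, name: str, defaults: DefaultsSketch) -> float:
    """Heavy logic lives in a helper function, not on the value object."""
    return fmean(defaults.get_field(sample, name))


sample = SampleSketch(
    coordinates=((0.0, 0.0), (1.0, 0.0)),
    fields={"pressure": (1.0, 3.0)},
)
defaults = DefaultsSketch({"temperature": 20.0})

pressure_mean = mean_field_value(sample, "pressure", defaults)        # stored field
temperature_mean = mean_field_value(sample, "temperature", defaults)  # default fallback
```

The value object never learns about defaults, and the helper never reaches into default tables directly, so each concern can change without touching the other two.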

 ## Downstream impact

-These classes are the public API surface consumed by downstream libraries and end users. Any signature change is a **breaking change** that requires a major version bump.
+`Sample` is part of the public API surface consumed by downstream libraries and end
+users. Any signature change is a **breaking change** that requires a major version bump.
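
One way to catch an accidental signature change before release is a guard test that pins the published signature. The sketch below is an assumption about how such a check could look, not something taken from the PLAID test suite; `load_sample` is a hypothetical public function, and only the standard-library `inspect` module is used.

```python
import inspect


def load_sample(path: str, *, validate: bool = True) -> dict:
    """Hypothetical public function that downstream code calls."""
    return {"path": path, "validated": validate}


def describe_signature(func) -> list[tuple[str, str, bool]]:
    """Summarize each parameter as (name, kind, has_default)."""
    return [
        (p.name, p.kind.name, p.default is not inspect.Parameter.empty)
        for p in inspect.signature(func).parameters.values()
    ]


# Snapshot recorded when the API was published; update it only together
# with a major version bump.
EXPECTED = [
    ("path", "POSITIONAL_OR_KEYWORD", False),
    ("validate", "KEYWORD_ONLY", True),
]

signature_is_stable = describe_signature(load_sample) == EXPECTED
```

Renaming a parameter, removing a default, or changing a keyword-only argument to positional would all make `signature_is_stable` false.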

 ## Testing

-Tests are in `tests/`. When modifying a container class, verify that storage round-trips (write then read) still produce identical data.
+Tests are in `tests/`. When modifying a container class, verify that storage round-trips
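
A round-trip check of the kind described here (write, then read, then compare) can be sketched as follows. A JSON file stands in for a real storage backend, and `write_sample`, `read_sample` and `round_trips` are hypothetical helpers, not PLAID functions.

```python
import json
import tempfile
from pathlib import Path


def write_sample(sample: dict, path: Path) -> None:
    """Hypothetical writer: JSON stands in for zarr, hf_datasets or cgns."""
    path.write_text(json.dumps(sample))


def read_sample(path: Path) -> dict:
    """Hypothetical reader matching `write_sample`."""
    return json.loads(path.read_text())


def round_trips(sample: dict) -> bool:
    """Write the sample to a temporary file, read it back, and compare."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.json"
        write_sample(sample, path)
        return read_sample(path) == sample


original = {
    "coordinates": [[0.0, 0.0], [1.0, 0.5]],
    "fields": {"pressure": [1.0, 3.0]},
}
round_trip_ok = round_trips(original)
```

With array-valued data the equality check would need an array-aware comparison such as `numpy.array_equal`, and the same test would typically be repeated once per storage backend.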