Commit 3381060

📄 (docs) update with respect to recent code changes
1 parent 848763e commit 3381060

5 files changed

Lines changed: 41 additions & 30 deletions

docs/source/concepts/dataset.md

Lines changed: 8 additions & 8 deletions
@@ -106,18 +106,19 @@ sample = converter.to_plaid(
 ## Metadata and problem definitions

-`save_to_disk(...)` writes shared metadata (`infos.yaml`, schemas, CGNS types,
-constants) and can also persist one or more `ProblemDefinition` objects.
+`save_to_disk(...)` writes `infos.yaml` and can also persist one or more
+`ProblemDefinition` objects. For non-CGNS backends, it also writes shared
+schemas, CGNS types, and constants. The CGNS backend stores self-contained
+samples and therefore does not write these derived metadata files.

 The dataset-level `infos.yaml` payload is represented by
 [`Infos`](infos.md). It stores metadata such as legal ownership, licensing,
 data production context, data description, split sample counts, and the storage
-backend. The `infos` argument accepts either an `Infos` instance or a plain
-dictionary with the same schema:
+backend. The `infos` argument accepts an `Infos` instance:

 ```python
 from plaid import ProblemDefinition
-from plaid.infos import DataDescription, Infos
+from plaid.infos import Infos
 from plaid.storage import save_to_disk

 pb_def = ProblemDefinition(
@@ -129,8 +130,7 @@ pb_def = ProblemDefinition(
 infos = Infos(
     owner="CompanyX",
     license="proprietary",
-    data_description=DataDescription(number_of_samples=3),
-    num_samples={"train": 3},
+    data_description="Example dataset with three training samples.",
 )

 save_to_disk(
@@ -148,7 +148,7 @@ The metadata and problem definitions can be loaded later with:
 from plaid.infos import Infos
 from plaid.storage import load_problem_definitions_from_disk

-infos = Infos.from_path("my_plaid_dataset")
+infos = Infos.from_path("my_plaid_dataset/infos.yaml")
 pb_defs = load_problem_definitions_from_disk("my_plaid_dataset")
 pb_def = pb_defs["regression_1"]
 ```

docs/source/concepts/disk_format.md

Lines changed: 4 additions & 6 deletions
@@ -69,10 +69,8 @@ plaid-check /path/to/plaid_dataset --split train --json
 plaid-check /path/to/plaid_dataset --strict
 ```

-The minimal required layout checked by the CLI is:
+The minimal required layout checked by the CLI depends on the declared backend:

-- `infos.yaml`
-- `variable_schema.yaml`
-- `cgns_types.yaml`
-- `constants/`
-- `data/`
+- for `cgns`: `infos.yaml` and `data/`;
+- for non-CGNS backends: `infos.yaml`, `variable_schema.yaml`,
+  `cgns_types.yaml`, `constants/`, and `data/`.
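
Reviewer note: the backend-dependent layout rule in the added lines can be sketched as a small check. This is only an illustration built from the two lists in the diff. `missing_layout_entries` is a hypothetical helper, not part of `plaid-check`, and any backend name other than `cgns` is treated as "non-CGNS".

```python
from pathlib import Path

# Required entries for the CGNS backend, as listed in the diff.
CGNS_LAYOUT = ("infos.yaml", "data")
# Required entries for every other backend, as listed in the diff.
NON_CGNS_LAYOUT = (
    "infos.yaml",
    "variable_schema.yaml",
    "cgns_types.yaml",
    "constants",
    "data",
)


def missing_layout_entries(root, backend):
    """Return the required entries that do not exist under root.

    Hypothetical helper: it only tests for existence and does not
    reproduce whatever else the real CLI validates.
    """
    required = CGNS_LAYOUT if backend == "cgns" else NON_CGNS_LAYOUT
    root = Path(root)
    return [name for name in required if not (root / name).exists()]
```

A directory holding only `infos.yaml` and `data/` would pass this check for `cgns` and report three missing entries for any other backend.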

docs/source/concepts/infos.md

Lines changed: 17 additions & 12 deletions
@@ -12,9 +12,8 @@ In the current API, infos stores:
 - `owner` and `license`, required string entries describing the dataset
   ownership and licensing
 - `data_production`, for optional production context such as simulator,
-  hardware, contact, or location
-- `data_description`, for optional dataset description entries such as the
-  number of samples, DOE, inputs, and outputs
+  hardware, computation duration, script, or contact
+- `data_description`, for an optional free-form dataset description string
 - `num_samples`, as a dictionary keyed by split name, populated by storage writers
 - `storage_backend`, as a storage backend identifier, populated by storage writers

@@ -48,14 +47,19 @@ writing `infos.yaml`.
 ## Loading from disk

-Load infos from a complete dataset path or directly from an `infos.yaml` file:
+Load infos directly from an `infos.yaml` file:

 ```python
-infos = Infos.from_path("/path/to/plaid_dataset")
+infos = Infos.from_path("/path/to/plaid_dataset/infos.yaml")
 ```

-When a directory is provided, `Infos.from_path(...)` looks for `infos.yaml`
-inside that directory.
+Use `Infos.from_path(...)` when you have the YAML file path. Use
+`plaid.storage.load_infos_from_disk("/path/to/plaid_dataset")` when you have the
+dataset root directory.
+
+When the path has no suffix, `Infos.from_path(...)` appends `.yaml`. For
+example, `Infos.from_path("/path/to/plaid_dataset/infos")` reads
+`/path/to/plaid_dataset/infos.yaml`. Existing directories are rejected.

 ## Saving

@@ -65,11 +69,12 @@ Save to YAML:
 infos.save_to_file("/path/to/plaid_dataset/infos.yaml")
 ```

-If a directory path is provided, the file is saved as `infos.yaml` inside that
-directory. Direct YAML writing requires complete persisted metadata: `owner`,
-`license`, `num_samples`, and `storage_backend`. When using
-`save_to_disk(..., infos=...)`, PLAID fills `num_samples` and `storage_backend`
-automatically before writing `infos.yaml`.
+When the path has no suffix, `save_to_file(...)` appends `.yaml`; if a non-YAML
+suffix is provided, it is replaced with `.yaml`. Existing directories are
+rejected. Direct YAML writing requires complete persisted metadata: `owner`,
+`license`, `num_samples`, and `storage_backend`. When using `save_to_disk(...,
+infos=...)`, PLAID fills `num_samples` and `storage_backend` automatically before
+writing `infos.yaml`.

 ## Typed access and serialization
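
Reviewer note: the suffix rules documented in the `infos.md` changes can be approximated with `pathlib`. The sketch below is not PLAID's implementation. `normalize_yaml_path` and its `replace_other_suffix` flag are hypothetical, the exception type is an assumption, and so is the choice to treat both `.yaml` and `.yml` as YAML suffixes. The diff does not say how `Infos.from_path(...)` handles a non-YAML suffix, so the sketch leaves such paths untouched unless the flag is set.

```python
from pathlib import Path

# Assumption: which suffixes count as "YAML"; the docs only say "non-YAML".
YAML_SUFFIXES = {".yaml", ".yml"}


def normalize_yaml_path(path, replace_other_suffix=False):
    """Apply the documented suffix rules to a path (hypothetical helper).

    - Existing directories are rejected (the exception type is an assumption).
    - A path without a suffix gets `.yaml` appended.
    - With replace_other_suffix=True (the documented save_to_file behavior),
      a non-YAML suffix is replaced with `.yaml`.
    """
    path = Path(path)
    if path.is_dir():
        raise IsADirectoryError(f"expected a YAML file path, got a directory: {path}")
    if path.suffix == "":
        return path.with_suffix(".yaml")
    if replace_other_suffix and path.suffix not in YAML_SUFFIXES:
        return path.with_suffix(".yaml")
    return path
```

Under these assumptions, `normalize_yaml_path("/path/to/plaid_dataset/infos")` yields `/path/to/plaid_dataset/infos.yaml`, matching the example in the added documentation.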

docs/source/concepts/viewer.md

Lines changed: 7 additions & 1 deletion
@@ -9,14 +9,20 @@ the mesh and fields interactively.
 From a terminal, run:

+```bash
+plaid-viewer
+```
+
+From a `uv`-managed development checkout, use `uv run plaid-viewer` instead.
+
 ```bash
 uv run plaid-viewer
 ```

 You can also start directly from a local datasets folder:

 ```bash
-uv run plaid-viewer --datasets-root /path/to/datasets
+plaid-viewer --datasets-root /path/to/datasets
 ```

 Then open the address shown by the command, usually:

docs/source/tutorials/storage.md

Lines changed: 5 additions & 3 deletions
@@ -96,9 +96,11 @@ infos = Infos(
     owner="NeuralOperator (https://zenodo.org/records/13993629)",
     license="cc-by-4.0",
     data_description="No changes to data content from original dataset",
-    type="simulation",
-    physics="phase-field fracture models for brittle fracture",
-    script="Subset 'res-SENS' of the initial dataset, 1/5th time steps, converted to PLAID format for standardized access; no changes to data content."
+    data_production={
+        "type": "simulation",
+        "physics": "phase-field fracture models for brittle fracture",
+        "script": "Subset 'res-SENS' of the initial dataset, 1/5th time steps, converted to PLAID format for standardized access; no changes to data content.",
+    },
 )
