📄 (docs) update with respect to recent code changes

casenave · casenave · commit 33810607f45e · 2026-06-06T21:49:54.000+02:00
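
Note on the path rules documented in `infos.md` below: the following is a minimal sketch of the behavior the updated text describes, not the PLAID implementation. The helper name `resolve_infos_path` and its `replace_non_yaml_suffix` flag are hypothetical. Treating `.yml` as a YAML suffix is also an assumption, since the docs only mention `.yaml`.

```python
from pathlib import Path


def resolve_infos_path(path, replace_non_yaml_suffix=False):
    """Hypothetical helper mirroring the documented path rules; not PLAID code."""
    p = Path(path)
    # Existing directories are rejected instead of being searched for infos.yaml.
    if p.is_dir():
        raise ValueError(f"expected a YAML file path, got a directory: {p}")
    if p.suffix == "":
        # No suffix: append ".yaml" (documented for both loading and saving).
        return p.with_name(p.name + ".yaml")
    if replace_non_yaml_suffix and p.suffix not in {".yaml", ".yml"}:
        # Saving only: a non-YAML suffix is replaced with ".yaml".
        # Accepting ".yml" here is an assumption of this sketch.
        return p.with_suffix(".yaml")
    return p
```

With `replace_non_yaml_suffix=False` the sketch follows the loading rules described for `Infos.from_path(...)`. With `True` it follows the additional suffix replacement described for `save_to_file(...)`.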
diff --git a/docs/source/concepts/dataset.md b/docs/source/concepts/dataset.md
@@ -106,18 +106,19 @@ sample = converter.to_plaid(
 
 ## Metadata and problem definitions
 
-`save_to_disk(...)` writes shared metadata (`infos.yaml`, schemas, CGNS types,
-constants) and can also persist one or more `ProblemDefinition` objects.
+`save_to_disk(...)` writes `infos.yaml` and can also persist one or more
+`ProblemDefinition` objects. For non-CGNS backends, it also writes shared
+schemas, CGNS types, and constants. The CGNS backend stores self-contained
+samples and therefore does not write these derived metadata files.
 
 The dataset-level `infos.yaml` payload is represented by
 [`Infos`](infos.md). It stores metadata such as legal ownership, licensing,
 data production context, data description, split sample counts, and the storage
-backend. The `infos` argument accepts either an `Infos` instance or a plain
-dictionary with the same schema:
+backend. The `infos` argument accepts an `Infos` instance:
 
 ```python
 from plaid import ProblemDefinition
-from plaid.infos import DataDescription, Infos
+from plaid.infos import Infos
 from plaid.storage import save_to_disk
 
 pb_def = ProblemDefinition(
@@ -129,8 +130,7 @@ pb_def = ProblemDefinition(
 infos = Infos(
     owner="CompanyX",
     license="proprietary",
-    data_description=DataDescription(number_of_samples=3),
-    num_samples={"train": 3},
+    data_description="Example dataset with three training samples.",
 )
 
 save_to_disk(
@@ -148,7 +148,7 @@ The metadata and problem definitions can be loaded later with:
 from plaid.infos import Infos
 from plaid.storage import load_problem_definitions_from_disk
 
-infos = Infos.from_path("my_plaid_dataset")
+infos = Infos.from_path("my_plaid_dataset/infos.yaml")
 pb_defs = load_problem_definitions_from_disk("my_plaid_dataset")
 pb_def = pb_defs["regression_1"]
 ```
diff --git a/docs/source/concepts/disk_format.md b/docs/source/concepts/disk_format.md
@@ -69,10 +69,8 @@ plaid-check /path/to/plaid_dataset --split train --json
 plaid-check /path/to/plaid_dataset --strict
 ```
 
-The minimal required layout checked by the CLI is:
+The minimal required layout checked by the CLI depends on the declared backend:
 
-- `infos.yaml`
-- `variable_schema.yaml`
-- `cgns_types.yaml`
-- `constants/`
-- `data/`
+- for `cgns`: `infos.yaml` and `data/`;
+- for non-CGNS backends: `infos.yaml`, `variable_schema.yaml`,
+  `cgns_types.yaml`, `constants/`, and `data/`.
diff --git a/docs/source/concepts/infos.md b/docs/source/concepts/infos.md
@@ -12,9 +12,8 @@ In the current API, infos stores:
 - `owner` and `license`, required string entries describing the dataset
   ownership and licensing
 - `data_production`, for optional production context such as simulator,
-  hardware, contact, or location
-- `data_description`, for optional dataset description entries such as the
-  number of samples, DOE, inputs, and outputs
+  hardware, computation duration, script, or contact
+- `data_description`, for an optional free-form dataset description string
 - `num_samples`, as a dictionary keyed by split name, populated by storage writers
 - `storage_backend`, as a storage backend identifier, populated by storage writers
 
@@ -48,14 +47,19 @@ writing `infos.yaml`.
 
 ## Loading from disk
 
-Load infos from a complete dataset path or directly from an `infos.yaml` file:
+Load infos directly from an `infos.yaml` file:
 
 ```python
-infos = Infos.from_path("/path/to/plaid_dataset")
+infos = Infos.from_path("/path/to/plaid_dataset/infos.yaml")
 ```
 
-When a directory is provided, `Infos.from_path(...)` looks for `infos.yaml`
-inside that directory.
+Use `Infos.from_path(...)` when you have the YAML file path. Use
+`plaid.storage.load_infos_from_disk("/path/to/plaid_dataset")` when you have the
+dataset root directory.
+
+When the path has no suffix, `Infos.from_path(...)` appends `.yaml`. For
+example, `Infos.from_path("/path/to/plaid_dataset/infos")` reads
+`/path/to/plaid_dataset/infos.yaml`. Existing directories are rejected.
 
 ## Saving
 
@@ -65,11 +69,12 @@ Save to YAML:
 infos.save_to_file("/path/to/plaid_dataset/infos.yaml")
 ```
 
-If a directory path is provided, the file is saved as `infos.yaml` inside that
-directory. Direct YAML writing requires complete persisted metadata: `owner`,
-`license`, `num_samples`, and `storage_backend`. When using
-`save_to_disk(..., infos=...)`, PLAID fills `num_samples` and `storage_backend`
-automatically before writing `infos.yaml`.
+When the path has no suffix, `save_to_file(...)` appends `.yaml`; if a non-YAML
+suffix is provided, it is replaced with `.yaml`. Existing directories are
+rejected. Direct YAML writing requires complete persisted metadata: `owner`,
+`license`, `num_samples`, and `storage_backend`. When using
+`save_to_disk(..., infos=...)`, PLAID fills `num_samples` and `storage_backend`
+automatically before writing `infos.yaml`.
 
 ## Typed access and serialization
 
diff --git a/docs/source/concepts/viewer.md b/docs/source/concepts/viewer.md
@@ -9,14 +9,20 @@ the mesh and fields interactively.
 
 From a terminal, run:
 
+```bash
+plaid-viewer
+```
+
+From a `uv`-managed development checkout, run the command through `uv` instead:
+
 ```bash
 uv run plaid-viewer
 ```
 
 You can also start directly from a local datasets folder:
 
 ```bash
-uv run plaid-viewer --datasets-root /path/to/datasets
+plaid-viewer --datasets-root /path/to/datasets
 ```
 
 Then open the address shown by the command, usually:
diff --git a/docs/source/tutorials/storage.md b/docs/source/tutorials/storage.md
@@ -96,9 +96,11 @@ infos = Infos(
     owner="NeuralOperator (https://zenodo.org/records/13993629)",
     license="cc-by-4.0",
     data_description="No changes to data content from original dataset",
-    type="simulation",
-    physics="phase-field fracture models for brittle fracture",
-    script="Subset 'res-SENS' of the initial dataset, 1/5th time steps, converted to PLAID format for standardized access; no changes to data content."
+    data_production={
+        "type": "simulation",
+        "physics": "phase-field fracture models for brittle fracture",
+        "script": "Subset 'res-SENS' of the initial dataset, 1/5th time steps, converted to PLAID format for standardized access; no changes to data content.",
+    },
 )