
Commit c675b6d

feat: Pydantic-zarr V3 model for Sentinel-1 GRD γ0T RTC stores (#138)
* phase 1: S1 RTC Pydantic models aligned with S2 pattern

  - Add src/eopf_geozarr/data_api/s1_rtc.py: Zarr V3 Pydantic models for S1 GRD γ0T RTC GeoZarr stores, using pyz.v3 GroupSpec/ArraySpec with TypedDict members (same pattern as s2.py, which uses pyz.v2)
  - Models: S1RtcRoot, S1RtcOrbitGroup, S1RtcNativeResolutionDataset, S1RtcOverviewResolutionDataset, S1RtcConditionsGroup
  - Validation: convention UUIDs, spatial:dimensions, multiscales layout, required data arrays (vv/vh/border_mask), gamma_area presence
  - Add tests/_test_data/s1_rtc_examples/s1-grd-rtc-31TCH.json: realistic fixture with 3 timesteps, 6 overview levels, 3 gamma_area conditions
  - Add tests/test_data_api/test_s1_rtc.py: 11 tests covering round-trip, structure validation, and negative cases (missing orbit, r10m, UUIDs, etc.)
  - Add conftest fixture s1_rtc_json_example parametrized over all fixtures

* refactor: improve Pydantic model definitions and streamline imports in S1 RTC module

* fix: standardize spatial dimensions to lowercase in S1 RTC models and test cases

* refactor: use zcm.Multiscales typed model per reviewer feedback

  - Replace dict[str, Any] multiscales field with zcm.Multiscales import
  - Remove inline MultiscalesTransform/ScaleLevel/Multiscales classes
  - Update test assertions for Pydantic model attribute access

* fix: configure ruff TC001 for Pydantic runtime-evaluated base classes

  - Add [tool.ruff.lint.flake8-type-checking] runtime-evaluated-base-classes for pydantic.BaseModel so Pydantic field type imports aren't flagged
  - Remove 4 stale noqa comments auto-fixed by ruff

* refactor: replace validators with precise type annotations per review

  - spatial_dimensions: tuple[Literal['y'], Literal['x']] (removes validator)
  - spatial_bbox: tuple[float, float, float, float] (removes validator)
  - spatial_shape: tuple[int, int] (removes validator)
  - spatial_transform: fixed-length tuple of six floats (removes validator)
  - S1RtcOrbitGroupMembers: r10m required, others NotRequired (removes validator)
  - Add resolution_levels() method to S1RtcOrbitGroup
  - Apply same tuple types to S1RtcConditionsAttrs

* ci: temporarily disable pre-commit job in ci.yml

  Comment out pre-commit job in CI workflow

* ci: enable pre-commit checks in CI workflow

* ci: transitive actions/cache@v4 dependency

---------

Co-authored-by: Loïc Houpert <10154151+lhoupert@users.noreply.github.com>
1 parent b75f6e7 commit c675b6d
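
The "replace validators with precise type annotations" step in the commit message can be illustrated with a small sketch. It assumes Pydantic v2 is installed; `SpatialAttrs`, `is_valid`, and the sample values are hypothetical names and data made up for this illustration, not part of the commit. The point is that a fixed-length tuple of `Literal` and `float` types rejects wrong case, order, or length without a hand-written validator.

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class SpatialAttrs(BaseModel):
    """Hypothetical stand-in for the spatial fields of the S1 RTC attrs models."""

    model_config = ConfigDict(populate_by_name=True)

    # Must be exactly ("y", "x"): wrong case, order, or length fails validation.
    spatial_dimensions: tuple[Literal["y"], Literal["x"]] = Field(alias="spatial:dimensions")
    # Must contain exactly four floats.
    spatial_bbox: tuple[float, float, float, float] = Field(alias="spatial:bbox")


def is_valid(attrs: dict) -> bool:
    """Return True if attrs passes validation, False on any ValidationError."""
    try:
        SpatialAttrs.model_validate(attrs)
    except ValidationError:
        return False
    return True


# Illustrative values only; they do not come from a real store.
good = {"spatial:dimensions": ["y", "x"], "spatial:bbox": [0.0, 0.0, 1.0, 1.0]}
upper_case = {**good, "spatial:dimensions": ["Y", "X"]}
short_bbox = {**good, "spatial:bbox": [0.0, 0.0, 1.0]}
```

Here `is_valid(good)` should hold, while `upper_case` and `short_bbox` should both be rejected by the type annotations alone.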

10 files changed

Lines changed: 2367 additions & 5 deletions

.github/workflows/ci.yml

Lines changed: 4 additions & 1 deletion
@@ -14,7 +14,10 @@ jobs:
       - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
         with:
           python-version: '3.12'
-      - uses: pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd # v3.0.1
+      - name: Install pre-commit
+        run: pip install pre-commit
+      - name: Run pre-commit
+        run: pre-commit run --all-files
 
   test:
     runs-on: ${{ matrix.os }}

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -150,6 +150,9 @@ ignore = [
     "TRY003", # Long exception messages outside class - common pattern
 ]
 
+[tool.ruff.lint.flake8-type-checking]
+runtime-evaluated-base-classes = ["pydantic.BaseModel"]
+
 [tool.mypy]
 python_version = "3.12"
 warn_return_any = true
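
Why the `runtime-evaluated-base-classes` setting matters: Pydantic evaluates field annotations at runtime, so an import that ruff's TC rules would move under `TYPE_CHECKING` must stay a real import. The sketch below is a standard-library analogy, not Pydantic's internals. It uses `typing.get_type_hints` to show that a name imported only under `TYPE_CHECKING` cannot be resolved at runtime. The class and function names are hypothetical.

```python
from fractions import Fraction  # real runtime import: the name exists in module globals
from typing import TYPE_CHECKING, get_type_hints

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch never runs.
    from decimal import Decimal


class TypeCheckingOnly:
    value: "Decimal"  # refers to a name that is absent at runtime


class RuntimeImport:
    value: "Fraction"  # refers to a name that is present at runtime


def annotations_resolve(cls: type) -> bool:
    """Return True if every annotation on cls can be evaluated at runtime."""
    try:
        get_type_hints(cls, globalns=globals())
    except NameError:
        return False
    return True
```

Under these assumptions `annotations_resolve(RuntimeImport)` succeeds and `annotations_resolve(TypeCheckingOnly)` does not, which is the failure mode the ruff setting avoids for `pydantic.BaseModel` subclasses.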

src/eopf_geozarr/data_api/geozarr/common.py

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@
 from pydantic.experimental.missing_sentinel import MISSING
 from typing_extensions import runtime_checkable
 
-from eopf_geozarr.data_api.geozarr.projjson import ProjJSON  # noqa: TC001
+from eopf_geozarr.data_api.geozarr.projjson import ProjJSON
 
 if TYPE_CHECKING:
     from collections.abc import Mapping

src/eopf_geozarr/data_api/geozarr/geoproj.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 from zarr_cm import geo_proj
 
 from eopf_geozarr.data_api.geozarr.common import is_none
-from eopf_geozarr.data_api.geozarr.projjson import ProjJSON  # noqa: TC001
+from eopf_geozarr.data_api.geozarr.projjson import ProjJSON
 
 PROJ_UUID = geo_proj.UUID
 

src/eopf_geozarr/data_api/geozarr/multiscales/geozarr.py

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 from pydantic import BaseModel, model_validator
 from pydantic.experimental.missing_sentinel import MISSING
 from typing_extensions import TypedDict
-from zarr_cm import ConventionMetadataObject  # noqa: TC002
+from zarr_cm import ConventionMetadataObject
 
 from . import tms, zcm
 

src/eopf_geozarr/data_api/geozarr/multiscales/tms.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 from pydantic import BaseModel
 
-from eopf_geozarr.data_api.geozarr.types import ResamplingMethod  # noqa: TC001
+from eopf_geozarr.data_api.geozarr.types import ResamplingMethod
 
 
 class TileMatrix(BaseModel):

src/eopf_geozarr/data_api/s1_rtc.py

Lines changed: 285 additions & 0 deletions
@@ -0,0 +1,285 @@
+"""
+Pydantic-zarr integrated models for Sentinel-1 GRD gamma0T RTC GeoZarr stores.
+
+Uses the pyz.v3 GroupSpec/ArraySpec with TypedDict members to enforce strict
+structure validation — same pattern as s2.py (which uses pyz.v2 for Zarr V2).
+
+These models validate time-series Zarr V3 stores built from S1Tiling GeoTIFFs
+on the Sentinel-2 MGRS grid. This is a *different data product* from the EOPF
+L1 GRD models in s1.py — those describe radar-geometry Zarr V2 products.
+
+Store hierarchy::
+
+    s1-grd-rtc-{tile}.zarr/
+    ├── zarr.json
+    ├── ascending/
+    │   ├── zarr.json              # zarr_conventions, multiscales, proj:, spatial:
+    │   ├── r10m/                  # native resolution dataset
+    │   │   ├── vv/                # (time, Y, X) float32
+    │   │   ├── vh/                # (time, Y, X) float32
+    │   │   ├── border_mask/       # (time, Y, X) uint8
+    │   │   ├── time/              # (time,) int64 datetime
+    │   │   ├── absolute_orbit/
+    │   │   ├── relative_orbit/
+    │   │   └── platform/
+    │   ├── r20m/ … r720m/         # overview levels (vv, vh, border_mask only)
+    │   └── conditions/
+    │       └── gamma_area_{orbit}/  # (Y, X) float32
+    └── descending/
+        └── (same structure)
+"""
+
+from __future__ import annotations
+
+from typing import Any, Literal, NotRequired, Self
+
+from pydantic import BaseModel, Field, model_validator
+from typing_extensions import TypedDict
+from zarr_cm import geo_proj
+from zarr_cm import multiscales as multiscales_cm
+from zarr_cm import spatial as spatial_cm
+
+from eopf_geozarr.data_api.geozarr.common import DatasetAttrs
+from eopf_geozarr.data_api.geozarr.multiscales.zcm import Multiscales
+from eopf_geozarr.pyz.v3 import ArraySpec, GroupSpec
+
+# ============================================================================
+# Constants
+# ============================================================================
+
+MULTISCALES_UUID = multiscales_cm.UUID
+GEO_PROJ_UUID = geo_proj.UUID
+SPATIAL_UUID = spatial_cm.UUID
+
+REQUIRED_CONVENTION_UUIDS = frozenset({MULTISCALES_UUID, GEO_PROJ_UUID, SPATIAL_UUID})
+
+ResolutionLevel = Literal["r10m", "r20m", "r60m", "r120m", "r360m", "r720m"]
+OrbitDirection = Literal["ascending", "descending"]
+Polarisation = Literal["vv", "vh"]
+
+# ============================================================================
+# Attributes models
+# ============================================================================
+
+
+class S1RtcOrbitGroupAttrs(BaseModel):
+    """Attributes for an orbit-direction group (ascending or descending).
+
+    Carries the three GeoZarr conventions plus proj:/spatial:/multiscales metadata.
+    """
+
+    zarr_conventions: list[dict[str, Any]]
+    multiscales: Multiscales
+    proj_code: str = Field(alias="proj:code")
+    spatial_dimensions: tuple[Literal["y"], Literal["x"]] = Field(alias="spatial:dimensions")
+    spatial_bbox: tuple[float, float, float, float] = Field(alias="spatial:bbox")
+
+    model_config = {"extra": "allow", "populate_by_name": True, "serialize_by_alias": True}
+
+    @model_validator(mode="after")
+    def validate_zarr_conventions(self) -> Self:
+        """Ensure all three required convention UUIDs are present."""
+        present = {c["uuid"] for c in self.zarr_conventions if "uuid" in c}
+        missing = REQUIRED_CONVENTION_UUIDS - present
+        if missing:
+            raise ValueError(f"Missing required zarr_conventions UUIDs: {missing}")
+        return self
+
+
+class S1RtcResolutionAttrs(BaseModel):
+    """Attributes for a resolution-level group (r10m, r20m, ...)."""
+
+    spatial_shape: tuple[int, int] = Field(alias="spatial:shape")
+    spatial_transform: tuple[float, float, float, float, float, float] = Field(
+        alias="spatial:transform"
+    )
+
+    model_config = {"extra": "allow", "populate_by_name": True, "serialize_by_alias": True}
+
+
+class S1RtcConditionsAttrs(BaseModel):
+    """Attributes for the conditions group."""
+
+    proj_code: str = Field(alias="proj:code")
+    spatial_dimensions: tuple[Literal["y"], Literal["x"]] = Field(alias="spatial:dimensions")
+    spatial_transform: tuple[float, float, float, float, float, float] = Field(
+        alias="spatial:transform"
+    )
+
+    model_config = {"extra": "allow", "populate_by_name": True, "serialize_by_alias": True}
+
+
+# ============================================================================
+# TypedDict members (same pattern as S2 Sentinel2ResolutionMembers)
+# ============================================================================
+
+
+class S1RtcNativeResolutionMembers(TypedDict, closed=True, total=False):  # type: ignore[call-arg]
+    """Members for the native resolution dataset (r10m).
+
+    Data variables (time, Y, X) plus 1-D coordinate variables (time,).
+    All fields optional since not all arrays are present during incremental construction.
+    """
+
+    vv: ArraySpec[Any]
+    vh: ArraySpec[Any]
+    border_mask: ArraySpec[Any]
+    time: ArraySpec[Any]
+    absolute_orbit: ArraySpec[Any]
+    relative_orbit: ArraySpec[Any]
+    platform: ArraySpec[Any]
+
+
+class S1RtcOverviewResolutionMembers(TypedDict, closed=True, total=False):  # type: ignore[call-arg]
+    """Members for overview resolution datasets (r20m … r720m).
+
+    Only data variables, no coordinate arrays.
+    """
+
+    vv: ArraySpec[Any]
+    vh: ArraySpec[Any]
+    border_mask: ArraySpec[Any]
+
+
+# ============================================================================
+# Group models (same pattern as S2 Sentinel2ResolutionDataset etc.)
+# ============================================================================
+
+
+class S1RtcNativeResolutionDataset(
+    GroupSpec[S1RtcResolutionAttrs, S1RtcNativeResolutionMembers]  # type: ignore[type-var]
+):
+    """The r10m dataset: data variables + coordinate arrays."""
+
+    @model_validator(mode="after")
+    def validate_data_variables(self) -> Self:
+        """Ensure vv, vh, and border_mask are present."""
+        for name in ("vv", "vh", "border_mask"):
+            if name not in self.members:
+                raise ValueError(f"Native resolution dataset must contain '{name}' array")
+        return self
+
+    @property
+    def vv(self) -> ArraySpec[Any]:
+        return self.members["vv"]
+
+    @property
+    def vh(self) -> ArraySpec[Any]:
+        return self.members["vh"]
+
+    @property
+    def border_mask(self) -> ArraySpec[Any]:
+        return self.members["border_mask"]
+
+
+class S1RtcOverviewResolutionDataset(
+    GroupSpec[S1RtcResolutionAttrs, S1RtcOverviewResolutionMembers]  # type: ignore[type-var]
+):
+    """An overview resolution dataset (r20m-r720m): data variables only."""
+
+
+class S1RtcConditionsGroup(GroupSpec[S1RtcConditionsAttrs, dict[str, ArraySpec[Any]]]):
+    """Time-invariant condition arrays, keyed by name (e.g. gamma_area_008)."""
+
+    @model_validator(mode="after")
+    def validate_has_gamma_area(self) -> Self:
+        """At least one gamma_area_* array should be present."""
+        if not any(k.startswith("gamma_area_") for k in self.members):
+            raise ValueError("Conditions group must contain at least one 'gamma_area_*' array")
+        return self
+
+
+class S1RtcOrbitGroupMembers(TypedDict, closed=True):  # type: ignore[call-arg]
+    """Members for an orbit-direction group.
+
+    r10m is always required; overview levels and conditions are optional.
+    """
+
+    r10m: S1RtcNativeResolutionDataset
+    r20m: NotRequired[S1RtcOverviewResolutionDataset]
+    r60m: NotRequired[S1RtcOverviewResolutionDataset]
+    r120m: NotRequired[S1RtcOverviewResolutionDataset]
+    r360m: NotRequired[S1RtcOverviewResolutionDataset]
+    r720m: NotRequired[S1RtcOverviewResolutionDataset]
+    conditions: NotRequired[S1RtcConditionsGroup]
+
+
+class S1RtcOrbitGroup(
+    GroupSpec[S1RtcOrbitGroupAttrs, S1RtcOrbitGroupMembers]  # type: ignore[type-var]
+):
+    """One orbit direction (ascending or descending) with multiscale layout."""
+
+    @property
+    def r10m(self) -> S1RtcNativeResolutionDataset:
+        return self.members["r10m"]
+
+    @property
+    def conditions(self) -> S1RtcConditionsGroup | None:
+        return self.members.get("conditions")
+
+    def get_resolution(self, level: ResolutionLevel) -> GroupSpec[Any, Any] | None:
+        """Retrieve a resolution dataset by level name."""
+        return self.members.get(level)
+
+    def resolution_levels(self) -> list[ResolutionLevel]:
+        """List available resolution levels in this orbit group."""
+        all_levels: tuple[ResolutionLevel, ...] = (
+            "r10m",
+            "r20m",
+            "r60m",
+            "r120m",
+            "r360m",
+            "r720m",
+        )
+        return [lvl for lvl in all_levels if lvl in self.members]
+
+
+# ============================================================================
+# Root model (same pattern as S2 Sentinel2Root)
+# ============================================================================
+
+
+class S1RtcRootMembers(TypedDict, closed=True, total=False):  # type: ignore[call-arg]
+    """Members for the root group. At least one orbit direction must be present."""
+
+    ascending: S1RtcOrbitGroup
+    descending: S1RtcOrbitGroup
+
+
+class S1RtcRoot(GroupSpec[DatasetAttrs, S1RtcRootMembers]):  # type: ignore[type-var]
+    """Complete S1 GRD RTC GeoZarr V3 hierarchy.
+
+    The hierarchy follows the implementation plan::
+
+        s1-grd-rtc-{tile}.zarr/
+        ├── zarr.json
+        ├── ascending/
+        │   ├── zarr.json              # zarr_conventions, multiscales, proj:, spatial:
+        │   ├── r10m/
+        │   │   ├── vv/                # (time, Y, X) float32
+        │   │   ├── vh/                # (time, Y, X) float32
+        │   │   ├── border_mask/       # (time, Y, X) uint8
+        │   │   ├── time/              # (time,) int64
+        │   │   ├── absolute_orbit/
+        │   │   ├── relative_orbit/
+        │   │   └── platform/
+        │   ├── r20m/ … r720m/
+        │   └── conditions/
+        │       └── gamma_area_{orbit}/
+        └── descending/
+            └── (same)
+    """
+
+    @model_validator(mode="after")
+    def validate_at_least_one_orbit(self) -> Self:
+        if "ascending" not in self.members and "descending" not in self.members:
+            raise ValueError("Store must contain at least one orbit group (ascending/descending)")
+        return self
+
+    @property
+    def ascending(self) -> S1RtcOrbitGroup | None:
+        return self.members.get("ascending")
+
+    @property
+    def descending(self) -> S1RtcOrbitGroup | None:
+        return self.members.get("descending")
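
The structural rules the orbit-group models encode (required convention UUIDs, a mandatory `r10m` member, ordered resolution levels) can be sketched over plain dictionaries with the standard library alone. This is an illustrative approximation, not the module's implementation. The UUID strings are placeholders (the real values come from `zarr_cm` and are not reproduced here), and `missing_conventions` and `check_orbit_group` are hypothetical helpers. Unlike the models, which raise `ValueError`, this sketch collects problems into a list.

```python
# Placeholder identifiers, NOT the real convention UUIDs from zarr_cm.
REQUIRED_CONVENTION_UUIDS = frozenset({"uuid-multiscales", "uuid-geo-proj", "uuid-spatial"})

# Finest to coarsest, matching the ResolutionLevel literal in the module.
ALL_LEVELS = ("r10m", "r20m", "r60m", "r120m", "r360m", "r720m")


def missing_conventions(zarr_conventions: list[dict]) -> set[str]:
    """Return the required convention UUIDs that no entry declares."""
    present = {c["uuid"] for c in zarr_conventions if "uuid" in c}
    return set(REQUIRED_CONVENTION_UUIDS - present)


def resolution_levels(members: dict) -> list[str]:
    """List the levels present in an orbit group, finest first."""
    return [lvl for lvl in ALL_LEVELS if lvl in members]


def check_orbit_group(attrs: dict, members: dict) -> list[str]:
    """Collect human-readable problems instead of raising on the first one."""
    problems = []
    missing = missing_conventions(attrs.get("zarr_conventions", []))
    if missing:
        problems.append(f"missing convention UUIDs: {sorted(missing)}")
    if "r10m" not in members:
        problems.append("r10m is required")
    return problems


# Made-up inputs: one convention is absent and one entry has no "uuid" key.
attrs = {
    "zarr_conventions": [
        {"uuid": "uuid-multiscales"},
        {"uuid": "uuid-spatial"},
        {"name": "entry-without-uuid"},
    ]
}
members = {"r60m": {}, "r10m": {}, "conditions": {}}
```

With these inputs, `resolution_levels(members)` lists `r10m` before `r60m` regardless of insertion order, and `check_orbit_group(attrs, members)` reports only the absent convention.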
