feat/geozarr model#10
Conversation
…to feat/geozarr-model
…to feat/geozarr-model
…to feat/geozarr-model
…to feat/geozarr-model
|
@emmanuelmathot this is ready for review. The models right now are pretty narrow -- I define a A The If this looks good, maybe we merge this and then I can start working on integrating these classes with the rest of the codebase? |
emmanuelmathot
left a comment
There was a problem hiding this comment.
Thank you for the initial pydantic model @d-v-b ! I probably have more questions as it is fully clear how this work to me. Let's have a review meeting.
| ) | ||
|
|
||
|
|
||
| CFStandardName = Annotated[str, AfterValidator(check_standard_name)] |
There was a problem hiding this comment.
I think we miss the grid_mapping attribute verification defaulted to spatial_ref scalar with the EPSG code. https://zarr.dev/geozarr-spec/documents/standard/template/geozarr-spec.html#_e15d59bd-f2ec-28e8-8016-4e541c95e10f
There was a problem hiding this comment.
I can add these, and if they are required we should make that more clear in the spec. right now the spec says
CF Conventions – Including attributes such as standard_name, units, axis, and grid_mapping to express spatiotemporal semantics and coordinate system properties.
but it isn't clear which CF attributes are required, optional, etc
| CF_STANDARD_NAMES = get_cf_standard_names(url=CF_STANDARD_NAME_URL) | ||
|
|
||
|
|
||
| def check_standard_name(name: str) -> str: |
There was a problem hiding this comment.
Please bear with my limited knowledge of pydantic but how is made the link with the actual standard_name field name?
There was a problem hiding this comment.
Pydantic does most of its validation routines based on type annotations. When we annotate an attribute on a pydantic model with this type: https://github.com/d-v-b/data-model/blob/3d11af412e460993f8e603dcff0555c5342c4e8f/src/eopf_geozarr/data_api/geozarr/common.py#L70, then pydantic will run the check_standard_name function after checking that the input is a string.
| return model | ||
|
|
||
|
|
||
| class Dataset(GroupSpec[DatasetAttrs, GroupSpec[Any, Any] | DataArray]): |
There was a problem hiding this comment.
Shouldn't we call this GeoZarrDataset?
Then, how does pydantic-zarr discriminate the "normal" groups from a geoZarr Dataset group?
There was a problem hiding this comment.
- we could definitely call it a
GeoZarrDataset
Then, how does pydantic-zarr discriminate the "normal" groups from a geoZarr Dataset group?
see https://docs.pydantic.dev/latest/concepts/unions/. Basically there are 3 options for resolving a union: order-based, best-match-based, and discriminant-based.
Ideally a geozarr dataset group would have some specific requirements or traits that distinguish it from a regular zarr group, or a zarr group that happens to have cf metadata, but I don't think the spec as written expresses these requirements, so this might require some work
|
@d-v-b What is the status of this PR? |
|
still in progress, I was previously blocked on the semantics of the |
|
@emmanuelmathot I am getting an interesting test failure that would be helpful for you to check. Within a dataset, for each data variable that declares coordinates ( |
|
probably bugs from the EOPF CPM as we simply copy this dataset. I will give a look and report if any. I'll trace that here. |
|
@d-v-b As a more general question. Is this |
|
2 related issues created following the |
…to feat/geozarr-model
|
I added a new |
initial working pydantic models for geozarr