fix: handle non-iterable hashables as dim names#10638
Conversation
- Core dimension parsing fixes in namedarray/core.py - Type definition updates in _typing.py - Index handling fixes in core/indexes.py - DataArray dimension parsing in core/dataarray.py - Variable docstring updates in core/variable.py
|
There's been a lot of struggles with this in the past, see #8210 #6142 #8199 for more reading. The idea with the previous typing is that xarray promises to normalize NamedArray handles this fine I think on Main, so I think this PR should try to solve this at the DataArray/Variable level: import numpy as np
import xarray as xr
some_id = uuid.uuid4()
some_id_2 = uuid.uuid4()
dims = (some_id, some_id_2)
xr.NamedArray(dims, np.asarray([[0.0, 1.0]]))
# <xarray.NamedArray (14a722b6-c795-4557-9255-e3a35b76f383: 1,
# d9ebaff1-47b9-4266-8f3f-cb4f2e66719e: 2)> Size: 16B
# array([[0., 1.]])It probably would've been much easier if |
|
Tweaking types usually opens a can of worms.
DataArray is just missing |
|
|
||
| Note: Tuples are treated as sequences, so ('a', 'b') means two | ||
| dimensions named 'a' and 'b'. To use a tuple as a single dimension | ||
| name, wrap it in a list: [('a', 'b')]. |
There was a problem hiding this comment.
| name, wrap it in a list: [('a', 'b')]. | |
| name, wrap it in a another tuple: (('a', 'b'),) or list: [('a', 'b')]. |
Should recommend what xarray tries to use internally first, tuple[Hashable, ...].
|
|
||
| Note: Tuples are treated as sequences, so ('a', 'b') means two | ||
| dimensions named 'a' and 'b'. To use a tuple as a single dimension | ||
| name, wrap it in a list: [('a', 'b')]. |
There was a problem hiding this comment.
| name, wrap it in a list: [('a', 'b')]. | |
| name, wrap it in a tuple: (('a', 'b'),) or list: [('a', 'b')]. |
| if isinstance(dims, str): | ||
| dims = (dims,) | ||
| elif isinstance(dims, Iterable): | ||
| dims = tuple(dims) | ||
| else: | ||
| # Single non-string, non-iterable hashable (int, UUID, etc.) | ||
| dims = (dims,) |
There was a problem hiding this comment.
_parse_dimensions is triggered every time NamedArray is initialized. Adding more if-cases will slow down for the people (and NamedArray internally) using the correct type dims: tuple[Hashable, ...].
Would be nice to invert the check to promote correct typing.
Fixes: #10634
This exposed that
('a', 'b')is ambiguous as it is both hashable and a sequence of hashable, so I added an error message in a case where it is ambiguous.There's some error duplication logic, but that is matching the existing duplication of logic.
whats-new.rstapi.rst