@@ -511,40 +511,167 @@ The name controls serialization format; each metadata DTO class provides its own
511
511
512
512
## Migration
513
513
514
-
### Backwards compatibility
514
+
### Public API compatibility
515
515
516
-
A module-level `__getattr__` shim in `chunk_grids.py` preserves the common downstream import pattern. Importing the old names emits a `DeprecationWarning` and returns the renamed metadata class:
516
+
The user-facing API is fully backward-compatible. Existing code that creates, opens, reads, and writes zarr arrays continues to work without changes:
517
+
518
+
- `zarr.create_array`, `zarr.open`, `zarr.open_array`, `zarr.open_group` -- unchanged signatures. The `chunks` parameter type is *widened* (now also accepts nested sequences for rectilinear grids), but all existing call patterns still work.
519
+
- `arr.chunks` -- returns `tuple[int, ...]` for regular arrays, same as before.
- Rectilinear chunks are gated behind `zarr.config.set({'array.rectilinear_chunks': True})`, so they cannot be created accidentally.
523
+
524
+
New additions (purely additive): `arr.read_chunk_sizes`, `arr.write_chunk_sizes`, `zarr.experimental.ChunkGrid`, `zarr.experimental.ChunkSpec`.
525
+
526
+
The breaking changes discussed below are confined to **internal modules** (`zarr.core.chunk_grids`, `zarr.core.metadata.v3`, `zarr.core.indexing`) that downstream libraries like cubed and VirtualiZarr access directly.
527
+
528
+
### Internal API compatibility trade-off analysis
529
+
530
+
This section analyzes the internal breaking changes from the metadata/array separation and evaluates two strategies: (A) add backward-compatibility shims in zarr-python, vs. (B) require downstream packages to update. The baseline is **no shims at all**.
531
+
532
+
#### What breaks without any shims
533
+
534
+
Three API changes affect downstream code:
535
+
536
+
1. **`RegularChunkGrid` class removed from `zarr.core.chunk_grids`.** On `main`, `RegularChunkGrid` is defined in `chunk_grids.py` as a `Metadata` subclass. This branch replaces it with `RegularChunkGridMetadata` in `metadata/v3.py`. Without a shim, `from zarr.core.chunk_grids import RegularChunkGrid` raises `ImportError`.
537
+
538
+
2. **`RegularChunkGrid` no longer available from `zarr.core.metadata.v3`.** On `main`, `v3.py` imports `RegularChunkGrid` from `chunk_grids.py` for internal use. VirtualiZarr imports it from this location (`from zarr.core.metadata.v3 import RegularChunkGrid`). Without the internal import, this raises `ImportError`.
539
+
540
+
3. **`OrthogonalIndexer` constructor expects `ChunkGrid`, not `RegularChunkGrid`/`RegularChunkGridMetadata`.** Even if the import shims above resolve to `RegularChunkGridMetadata`, the indexer constructors access `chunk_grid._dimensions`, which only exists on the runtime `ChunkGrid` class. Cubed constructs `OrthogonalIndexer(selection, shape, RegularChunkGrid(chunk_shape=chunks))` directly.
541
+
542
+
#### Downstream impact without shims
543
+
544
+
**VirtualiZarr** (5 line changes across 2 files):
517
545
518
546
```python
519
-
from zarr.core.chunk_grids import RegularChunkGrid # DeprecationWarning, returns RegularChunkGridMetadata
```
The same shim exists for `RectilinearChunkGrid` → `RectilinearChunkGridMetadata`.
568
+
The `manifests/array.py` import is from `zarr.core.metadata.v3` (never a documented export; VirtualiZarr relied on a transitive import). The `parsers/zarr.py` import is from `zarr.core.chunk_grids` (the canonical location on `main`). Both are straightforward renames. The `.chunk_shape` attribute is unchanged on the new class.
569
+
570
+
If VirtualiZarr needs to support both old and new zarr-python, a version-conditional import adds ~5 more lines.
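One way such a version-conditional import could look is sketched below. This is an illustrative helper, not VirtualiZarr's actual code: `import_first` and `ZARR_REGULAR_GRID_CANDIDATES` are hypothetical names introduced here, and only the two module/class pairs come from the text above. The demonstration uses standard-library names so that it runs without zarr installed.

```python
import importlib


def import_first(candidates):
    """Return the first attribute importable from a list of (module, name) pairs.

    Hypothetical helper for illustration; a plain try/except ImportError
    around two import statements achieves the same thing.
    """
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{module_name}.{attr_name}: {exc!r}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


# Locations named in this document, newest first (assumed ordering).
ZARR_REGULAR_GRID_CANDIDATES = [
    ("zarr.core.metadata.v3", "RegularChunkGridMetadata"),
    ("zarr.core.chunk_grids", "RegularChunkGrid"),
]

# Demonstration with stdlib names: the first candidate does not exist,
# so the helper falls through to the second.
ordered_dict_cls = import_first(
    [("collections", "NoSuchName"), ("collections", "OrderedDict")]
)
```

With zarr installed, `import_first(ZARR_REGULAR_GRID_CANDIDATES)` would return whichever of the two classes the installed version provides.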
Note that `ChunkGrid` is *not* a renamed class. `RegularChunkGrid(chunk_shape=chunks)` took only chunk sizes; `ChunkGrid.from_sizes(shape, chunks)` also requires the array shape. The `shape` parameter is already available at this call site.
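To illustrate why a runtime grid needs the array shape in addition to the chunk sizes, the standalone sketch below counts and enumerates the chunks of a regular grid. It does not use zarr; `chunks_per_dim` and `enumerate_chunk_coords` are hypothetical names, and the arithmetic assumes a regular grid whose last chunk along each dimension may be partial.

```python
import itertools
import math


def chunks_per_dim(shape, chunk_shape):
    """Number of chunks along each dimension of a regular grid."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunk_shape))


def enumerate_chunk_coords(shape, chunk_shape):
    """All chunk coordinates, in row-major order."""
    ranges = (range(n) for n in chunks_per_dim(shape, chunk_shape))
    return list(itertools.product(*ranges))


# A 5x4 array with 2x2 chunks: the chunk sizes alone say nothing about
# how many chunks exist; the shape determines that.
counts = chunks_per_dim((5, 4), (2, 2))
coords = enumerate_chunk_coords((5, 4), (2, 2))
```

Here `counts` is `(3, 2)` and `coords` holds six coordinates, from `(0, 0)` to `(2, 1)`. Neither can be computed from the chunk sizes alone.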
586
+
587
+
If cubed needs to support both old and new zarr-python:
|**Maintenance burden**| None | Low (deprecation shims are well-understood) | Medium (indexer coercion blurs metadata/runtime boundary) |
652
+
|**API clarity**| Clean (metadata DTOs and runtime types are distinct) | Good (old names redirect to new names) | Weaker (indexers implicitly accept two type families) |
653
+
654
+
With Shims 1+2 only, VirtualiZarr's `manifests/array.py` import from `zarr.core.metadata.v3` is covered by Shim 2, and the `parsers/zarr.py` import from `zarr.core.chunk_grids` is covered by Shim 1. The `isinstance` checks work because both shims resolve to `RegularChunkGridMetadata`. The `cast` works because `.chunk_shape` is unchanged. So VirtualiZarr needs 0 changes with Shims 1+2. The 3 lines for cubed remain because Shim 1 resolves the import but `OrthogonalIndexer` still needs a runtime `ChunkGrid`.
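The general mechanism behind an import shim of this kind is a module-level `__getattr__` (PEP 562). The sketch below is a self-contained illustration, not zarr-python's actual shim: `demo_chunk_grids` is a hypothetical stand-in module, and the `RegularChunkGridMetadata` class defined here is a minimal placeholder for the real one.

```python
import sys
import types
import warnings


class RegularChunkGridMetadata:
    """Placeholder for the renamed metadata class (not zarr's implementation)."""

    def __init__(self, chunk_shape):
        self.chunk_shape = tuple(chunk_shape)


# Hypothetical module standing in for the one that used to export the old name.
demo = types.ModuleType("demo_chunk_grids")

# Old name -> new class.
_RENAMED = {"RegularChunkGrid": RegularChunkGridMetadata}


def _module_getattr(name):
    # Python calls this only when normal attribute lookup on the module fails.
    if name in _RENAMED:
        new_cls = _RENAMED[name]
        warnings.warn(
            f"{name} is deprecated; use {new_cls.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_cls
    raise AttributeError(f"module {demo.__name__!r} has no attribute {name!r}")


demo.__getattr__ = _module_getattr
sys.modules[demo.__name__] = demo

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The old import still works, but resolves to the new class.
    from demo_chunk_grids import RegularChunkGrid

old_name_is_new_class = RegularChunkGrid is RegularChunkGridMetadata
```

Because the old name resolves to the very same class object, `isinstance` checks and `.chunk_shape` access written against the old name keep working. A shim like this does nothing for code that passes the object to an API expecting a different runtime type.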
526
655
527
656
### Downstream migration
528
657
529
-
| Old pattern | New pattern |
658
+
Migration from `main` (where only `RegularChunkGrid` and the abstract `ChunkGrid` ABC exist):
|`isinstance(cg, RegularChunkGrid)`|`isinstance(cg, RegularChunkGridMetadata)` or `grid.is_regular` on the runtime `ChunkGrid`|
665
+
|`cg.chunk_shape` on `RegularChunkGrid`|`cg.chunk_shape` on `RegularChunkGridMetadata` (unchanged) |
666
+
|`ChunkGrid.from_dict(data)`|`parse_chunk_grid(data)` from `zarr.core.metadata.v3`|
667
+
|`chunk_grid.all_chunk_coords(array_shape)`|`chunk_grid.all_chunk_coords()` (shape now stored in grid) |
668
+
|`chunk_grid.get_nchunks(array_shape)`|`chunk_grid.get_nchunks()` (shape now stored in grid) |
542
669
543
-
**[Icechunk#1338](https://github.com/earth-mover/icechunk/issues/1338):** Minimal impact — format changes driven by spec, not class hierarchy.
670
+
During the earlier [#3534](https://github.com/zarr-developers/zarr-python/pull/3534) effort (which used separate `RegularChunkGrid`/`RectilinearChunkGrid` classes), downstream PRs and issues were opened to explore compatibility:
544
671
545
-
**[cubed#876](https://github.com/cubed-dev/cubed/issues/876):** Switch store creation to `ChunkGrid` API. @tomwhite confirmed in #3534 that rechunking with variable-sized intermediate chunks works.
**HEALPix use case:** @tinaok demonstrated in #3534 that variable-chunked arrays arise naturally when grouping HEALPix cells by parent pixel: the chunk sizes come from `np.unique(parents, return_counts=True)`.
674
+
These target #3534's API, not this branch's unified `ChunkGrid` design. New downstream POC branches for this design are linked in [Proofs of concepts](#proofs-of-concepts).