@@ -611,6 +611,171 @@ In this example a shard shape of (1000, 1000) and a chunk shape of (100, 100) is
611 611 This means that `10*10` chunks are stored in each shard, and there are `10*10` shards in total.
612 612 Without the `shards` argument, there would be 10,000 chunks stored as individual files.
613 613
614+ ## Rectilinear (variable) chunk grids
615+
616+ !!! warning "Experimental"
617+     Rectilinear chunk grids are an experimental feature and may change in
618+     future releases. This feature is expected to stabilize in Zarr version 3.3.
619+
620+ Because the feature is still stabilizing, it is disabled by default and
621+ must be explicitly enabled:
622+
623+ ```python
624+ import zarr
625+ zarr.config.set({"array.rectilinear_chunks": True})
626+ ```
627+
628+ Alternatively, set the environment variable `ZARR_ARRAY__RECTILINEAR_CHUNKS=True`.
629+
630+ The examples below assume this config has been set.
631+
632+ By default, Zarr arrays use a regular chunk grid where every chunk along a
633+ given dimension has the same size (except possibly the final boundary chunk).
634+ Rectilinear chunk grids allow each chunk along a dimension to have a different
635+ size. This is useful when the natural partitioning of the data is not uniform —
636+ for example, satellite swaths of varying width, time series with irregular
637+ intervals, or spatial tiles of different extents.
638+
639+ ### Creating arrays with rectilinear chunks
640+
641+ To create an array with rectilinear chunks, pass a nested list to the `chunks`
642+ parameter where each inner list gives the chunk sizes along one dimension:
643+
644+ ```python exec="true" session="arrays" source="above" result="ansi"
645+ zarr.config.set({"array.rectilinear_chunks": True})
646+ z = zarr.create_array(
647+     store=zarr.storage.MemoryStore(),
648+     shape=(60, 100),
649+     chunks=[[10, 20, 30], [50, 50]],
650+     dtype='int32',
651+ )
652+ print(z.info)
653+ ```
654+
655+ In this example the first dimension is split into three chunks of sizes 10, 20,
656+ and 30, while the second dimension is split into two equal chunks of size 50.
657+
658+ ### Reading and writing data
659+
660+ Rectilinear arrays support the same indexing interface as regular arrays.
661+ Reads and writes that cross chunk boundaries of different sizes are handled
662+ automatically:
663+
664+ ```python exec="true" session="arrays" source="above" result="ansi"
665+ import numpy as np
666+ data = np.arange(60 * 100, dtype='int32').reshape(60, 100)
667+ z[:] = data
668+ # Read a slice that spans the first two chunks (sizes 10 and 20) along axis 0
669+ print(z[5:25, 0:5])
670+ ```
671+
672+ ### Inspecting chunk sizes
673+
674+ The `.write_chunk_sizes` property returns the actual data size of each storage
675+ chunk along every dimension. It works for both regular and rectilinear arrays
676+ and returns a tuple of tuples (matching the dask `Array.chunks` convention).
677+ When sharding is used, `.read_chunk_sizes` returns the inner chunk sizes instead:
678+
679+ ```python exec="true" session="arrays" source="above" result="ansi"
680+ print(z.write_chunk_sizes)
681+ ```
682+
683+ For regular arrays, the result includes the final boundary chunk, which may be smaller than the nominal chunk size:
684+
685+ ```python exec="true" session="arrays" source="above" result="ansi"
686+ z_regular = zarr.create_array(
687+     store=zarr.storage.MemoryStore(),
688+     shape=(100, 80),
689+     chunks=(30, 40),
690+     dtype='int32',
691+ )
692+ print(z_regular.write_chunk_sizes)
693+ ```
694+
695+ Note that the `.chunks` property is only available for regular chunk grids. For
696+ rectilinear arrays, use `.write_chunk_sizes` (or `.read_chunk_sizes`) instead.
697+
698+ ### Resizing and appending
699+
700+ Rectilinear arrays can be resized. When an array grows beyond the sum of its
701+ current chunk sizes along a dimension, a new chunk is appended to cover the
702+ additional extent. When it shrinks, the chunk sizes are preserved and only the
703+ array extent changes (chunks beyond the new extent simply become inactive):
704+
705+ ```python exec="true" session="arrays" source="above" result="ansi"
706+ z = zarr.create_array(
707+     store=zarr.storage.MemoryStore(),
708+     shape=(30,),
709+     chunks=[[10, 20]],
710+     dtype='float64',
711+ )
712+ z[:] = np.arange(30, dtype='float64')
713+ print(f"Before resize: chunk_sizes={z.write_chunk_sizes}")
714+ z.resize((50,))
715+ print(f"After resize: chunk_sizes={z.write_chunk_sizes}")
716+ ```
717+
718+ The `append` method also works with rectilinear arrays:
719+
720+ ```python exec="true" session="arrays" source="above" result="ansi"
721+ z.append(np.arange(10, dtype='float64'))
722+ print(f"After append: shape={z.shape}, chunk_sizes={z.write_chunk_sizes}")
723+ ```
724+
725+ ### Compressors and filters
726+
727+ Rectilinear arrays work with all codecs — compressors, filters, and checksums.
728+ Since each chunk may have a different size, the codec pipeline processes each
729+ chunk independently:
730+
731+ ```python exec="true" session="arrays" source="above" result="ansi"
732+ z = zarr.create_array(
733+     store=zarr.storage.MemoryStore(),
734+     shape=(60, 100),
735+     chunks=[[10, 20, 30], [50, 50]],
736+     dtype='float64',
737+     filters=[zarr.codecs.TransposeCodec(order=(1, 0))],
738+     compressors=[zarr.codecs.BloscCodec(cname='zstd', clevel=3)],
739+ )
740+ z[:] = np.arange(60 * 100, dtype='float64').reshape(60, 100)
741+ np.testing.assert_array_equal(z[:], np.arange(60 * 100, dtype='float64').reshape(60, 100))
742+ print("Roundtrip OK")
743+ ```
744+
745+ ### Rectilinear shard boundaries
746+
747+ Rectilinear chunk grids can also be used for shard boundaries when combined
748+ with sharding. In this case, the outer grid (shards) is rectilinear while the
749+ inner chunks remain regular. Each shard size must be divisible by the inner
750+ chunk size along the same dimension:
751+
752+ ```python exec="true" session="arrays" source="above" result="ansi"
753+ z = zarr.create_array(
754+     store=zarr.storage.MemoryStore(),
755+     shape=(120, 100),
756+     chunks=(10, 10),
757+     shards=[[60, 40, 20], [50, 50]],
758+     dtype='int32',
759+ )
760+ z[:] = np.arange(120 * 100, dtype='int32').reshape(120, 100)
761+ print(z[50:70, 40:60])
762+ ```
763+
764+ Note that rectilinear inner chunks with sharding are not supported — only the
765+ shard boundaries can be rectilinear.
766+
767+ ### Metadata format
768+
769+ Rectilinear chunk grid metadata uses run-length encoding (RLE) for compact
770+ serialization. When reading metadata, both bare integers and `[value, count]`
771+ pairs are accepted:
772+
773+ - ` [10, 20, 30] ` — three chunks with explicit sizes
774+ - ` [[10, 3]] ` — three chunks of size 10 (RLE shorthand)
775+ - ` [[10, 3], 5] ` — three chunks of size 10, then one chunk of size 5
776+
777+ When writing, Zarr automatically compresses repeated values into RLE format.
778+
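The encoding described above can be sketched in plain Python. Both functions are hypothetical helpers written for illustration, not Zarr's implementation. In particular, leaving a run of length 1 as a bare integer is an assumption, and the exact form Zarr writes may differ.

```python
from itertools import groupby


def expand_rle(encoded):
    """Expand bare integers and [value, count] pairs into explicit chunk sizes."""
    sizes = []
    for item in encoded:
        if isinstance(item, int):
            sizes.append(item)             # bare integer: one chunk of that size
        else:
            value, count = item            # [value, count]: `count` chunks of size `value`
            sizes.extend([value] * count)
    return sizes


def compress_rle(sizes):
    """Collapse runs of equal sizes into [value, count] pairs.

    Assumption: a run of length 1 is kept as a bare integer.
    """
    encoded = []
    for value, run in groupby(sizes):
        count = len(list(run))
        encoded.append(value if count == 1 else [value, count])
    return encoded


print(expand_rle([[10, 3], 5]))       # [10, 10, 10, 5]
print(compress_rle([10, 10, 10, 5]))  # [[10, 3], 5]
```

Expanding and then re-compressing a list returns the same chunk sizes, so either form describes the same grid.
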
614 779 ## Missing features in 3.0
615 780
616 781 The following features have not been ported to 3.0 yet.