Commit 5fa3fdf

d-v-b and claude committed
refactor: simplify PhasedCodecPipeline — remove layout abstraction, use codec chain directly
Remove ~840 lines of ChunkLayout hierarchy (ShardIndex, SimpleChunkLayout, ShardedChunkLayout, fetch_chunks_sync/async, decode_chunks_from_index, merge_and_encode_from_index). The pipeline now uses ChunkTransform directly for sync decode/encode and falls back to the async codec API otherwise.

Also fix ShardingCodec._encode_sync to respect write_empty_chunks config by skipping inner chunks that are all fill_value.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent ed2e230 commit 5fa3fdf

3 files changed

Lines changed: 269 additions & 1097 deletions

src/zarr/codecs/sharding.py

Lines changed: 11 additions & 2 deletions
@@ -536,10 +536,19 @@ def _encode_sync(
                 morton_order_iter(chunks_per_shard)
             )
 
+        chunk_spec = self._get_chunk_spec(shard_spec)
+        skip_empty = not shard_spec.config.write_empty_chunks
+        fill_value = shard_spec.fill_value
+        if fill_value is None:
+            fill_value = shard_spec.dtype.default_scalar()
+
         for chunk_coords, _chunk_selection, out_selection, _ in indexer:
             chunk_array = shard_array[out_selection]
-            encoded = inner_transform.encode_chunk(chunk_array)
-            shard_builder[chunk_coords] = encoded
+            if skip_empty and chunk_array.all_equal(fill_value):
+                shard_builder[chunk_coords] = None
+            else:
+                encoded = inner_transform.encode_chunk(chunk_array)
+                shard_builder[chunk_coords] = encoded
 
         return self._encode_shard_dict_sync(
             shard_builder,
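
The skip-empty behavior in this hunk can be illustrated outside of zarr with a minimal NumPy sketch. It is not the zarr implementation: `encode_inner_chunks` is a hypothetical helper, raw `tobytes()` stands in for the inner codec chain, chunks are visited in row-major rather than Morton order, and NaN fill values (which compare unequal to themselves) are not handled.

```python
import itertools

import numpy as np


def encode_inner_chunks(shard, chunk_shape, fill_value, write_empty_chunks):
    """Map each inner-chunk coordinate to encoded bytes, or None if skipped.

    Hypothetical helper for illustration only; assumes the shard shape is an
    exact multiple of chunk_shape.
    """
    chunks_per_shard = tuple(s // c for s, c in zip(shard.shape, chunk_shape))
    skip_empty = not write_empty_chunks
    encoded = {}
    for coords in itertools.product(*(range(n) for n in chunks_per_shard)):
        # Slice out the inner chunk at these chunk coordinates.
        selection = tuple(
            slice(i * c, (i + 1) * c) for i, c in zip(coords, chunk_shape)
        )
        chunk = shard[selection]
        if skip_empty and np.all(chunk == fill_value):
            # Chunk holds only fill_value: record it as absent, do not encode.
            encoded[coords] = None
        else:
            # Stand-in for the inner codec chain: raw bytes of the chunk.
            encoded[coords] = np.ascontiguousarray(chunk).tobytes()
    return encoded


# A 4x4 shard of 2x2 inner chunks where only the first chunk has data.
shard = np.zeros((4, 4), dtype="uint8")
shard[0, 0] = 7
result = encode_inner_chunks(shard, (2, 2), fill_value=0, write_empty_chunks=False)
stored = sorted(k for k, v in result.items() if v is not None)
```

With `write_empty_chunks=False`, only chunk `(0, 0)` is encoded and the other three map to `None`. With `write_empty_chunks=True`, all four chunks are encoded, which matches the pre-fix behavior of the loop.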
