Commit 79d5b8f
Optimize partial shard reads (zarr-developers#3004)
* Add performance test of partial shard reads
* WIP Consolidate reads of multiple chunks in the same shard
Add test and make max gap and max coalesce size config options
Code clarity and comments
Test that chunk request coalescing reduces calls to store
Profile a few values for coalesce_max_gap
Update [doc]tests to include new sharding.read.* values
document sharded read config options in user-guide/config.rst
tweak logic: start new coalesced group if coalescing would exceed `coalesce_max_bytes`
previous logic only started a new group if the existing group's
size already exceeded coalesce_max_bytes.
set `mypy_path = "src"` to help pre-commit mypy find imported classes
Reorder methods in sharding.py, add docstring + commenting
wording docs fix
docstring clarification
trigger precommit on all python files changed in this pull request
trying to get the ruff format that's happening locally during pre-commit
to match the pre-commit run that is failing on CI.
revert trigger for pre-commit ruff format
* Add changes/3004.feature.rst
* Consistently return None on failure and test partial shard read failure modes
Use range of integers as out_selection not slice in CoordinateIndexer
To fix issue when using vindex with repeated indexes in indexer
test: improve formatting and add debugging breakpoint in array property tests
test: disable hypothesis deadline for test_array_roundtrip to prevent timeout
fix: initialize decode buffers with shard_spec.fill_value instead of 0 to fix partial shard holes
style: reformat code for improved readability and consistency in sharding.py
fix: revert incorrect RangeByteRequest length fix in sharding byte retrieval
* Fix and test for case where some chunks in shard are all fill
* Self review
* Removing profiling code masquerading as a skipped test
* revert change to indexing.py, not needed
* Add test for duplicate integer indexing into a coalesced group
* Undo change to fill value when initializing shard arrays
* Undo change to set mypy_path = "src"
* Commenting and revert unnecessary changes to files for smaller diff
* remove now redundant cast
* Document runtime config keys
* Improve changelog entry and .rst -> .md
* .coords -> .chunk_coords in _ChunkCoordsByteSlice dataclass
* Update test env in docs/contributing.md
* Move `config.get` calls up into `_decode_partial_single`
* Ensure no change in behavior when ByteGetter.get returns None + comment
* Add test_sharing_unit.py, focusing on coalesce behavior, but with basic tests for other components
* Fix typing errors in test_sharing_unit.py
* Only use concurrent_map over chunks within a shard if > 1 groups after coalescing
* no-op test change to retry CI after unavailable runner failure
* Another no-op test change to retry CI after unavailable runner failure
* use get_partial_values(), remove explicit coalescing and concurrent_map
* cleanup
* self review
* Unit tests for new get_partial_values implementations
* tests: work around mypy not seeing equivalent protocols as equivalent
* Re-work to use Store.get_ranges. Simplify significantly.
* Add missing assert in test_sharding.py
* Remove unit tests that tested now-removed _ShardIndex.is_dense
---------
Co-authored-by: Davis Bennett <davis.v.bennett@gmail.com>
1 parent 7531de5 commit 79d5b8f
4 files changed
Lines changed: 828 additions & 17 deletions
File tree
- changes
- src/zarr/codecs
- tests/test_codecs
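
The commit message mentions coalescing reads of multiple chunks in the same shard, bounded by a maximum gap and a maximum coalesced size, and a tweak so that a new group starts when coalescing *would* exceed `coalesce_max_bytes`. The sketch below illustrates that rule only. It is not the code from this commit (later entries in the message say the explicit coalescing was removed in favour of `get_partial_values()` and then `Store.get_ranges`), and the names `ByteRange`, `coalesce_ranges`, `max_gap`, and `max_bytes` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteRange:
    """Hypothetical half-open byte range [start, end) of one chunk in a shard."""

    start: int
    end: int


def coalesce_ranges(ranges, max_gap, max_bytes):
    """Group byte ranges so each group can be fetched with one request.

    A range joins the current group only if the gap to the group is at most
    `max_gap` AND the merged span would stay within `max_bytes`. Otherwise a
    new group is started. Assumes the ranges do not overlap.
    """
    groups = []
    for r in sorted(ranges, key=lambda x: x.start):
        if groups:
            current = groups[-1]
            group_start = current[0].start
            group_end = current[-1].end
            gap = r.start - group_end
            merged_size = r.end - group_start
            # Check the size the group WOULD have after merging, not the
            # size it already has.
            if gap <= max_gap and merged_size <= max_bytes:
                current.append(r)
                continue
        groups.append([r])
    return groups


example = [
    ByteRange(0, 100),
    ByteRange(110, 200),    # small gap, merged span of 200 bytes: coalesced
    ByteRange(1000, 1100),  # gap of 800 bytes: new group
    ByteRange(1100, 1300),  # no gap, but merged span of 300 bytes: new group
]
groups = coalesce_ranges(example, max_gap=50, max_bytes=250)
for g in groups:
    print(g[0].start, g[-1].end, len(g))
```

The last range shows the difference the tweak makes: the existing group holds only 100 bytes, so a check on the existing group's size alone would have merged it and produced a 300-byte request, above the 250-byte limit used here.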