Commit 6316ef4
Cap dask graph size in read_geotiff_dask and batch adler32 transfers (#1211)
read_geotiff_dask built one delayed task per chunk with no upper bound.
For very large files at small chunk sizes the Python graph itself OOMs
the driver before any pixel read runs (30TB at chunks=256 would produce
~125M chunks, ~500M tasks, ~500GB graph on the host). Cap total chunks
at 1,000,000 and auto-scale the requested chunk size upward, emitting
a UserWarning so callers know their request was adjusted.
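
The cap-and-rescale behaviour described above could look roughly like the sketch below. It is an illustration only, not the code from this commit: the helper name `cap_chunk_size`, its signature, and the assumption of square chunks on a 2D raster are all hypothetical. Only the 1,000,000 cap and the UserWarning come from the message.

```python
import math
import warnings

MAX_CHUNKS = 1_000_000  # cap stated in the commit message


def cap_chunk_size(height, width, chunk, max_chunks=MAX_CHUNKS):
    """Hypothetical helper: return a square chunk edge >= `chunk` such that
    the chunk grid over a (height, width) raster has at most `max_chunks` cells."""

    def n_chunks(c):
        # Integer ceil-division in both dimensions.
        return -(-height // c) * -(-width // c)

    if n_chunks(chunk) <= max_chunks:
        return chunk  # request already fits under the cap

    # First estimate from the area ratio, never smaller than the request.
    scaled = max(chunk, math.isqrt(height * width // max_chunks))
    # Partial edge chunks can push the count over the cap, so nudge upward.
    while n_chunks(scaled) > max_chunks:
        scaled += 1

    warnings.warn(
        f"chunks={chunk} would create {n_chunks(chunk):,} chunks; "
        f"using chunks={scaled} to stay under {max_chunks:,}",
        UserWarning,
        stacklevel=2,
    )
    return scaled
```

Scaling the chunk edge rather than rejecting the request keeps the call usable, and the warning makes the adjustment visible to the caller.
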
_nvcomp_batch_compress on the deflate path copied every uncompressed
tile GPU->CPU one at a time with .get().tobytes() purely to compute the
zlib adler32 trailer. Each per-tile .get() is a sync point on the default
stream. Batch all tiles into a single contiguous device buffer, transfer
once, then compute adler32 from a host memoryview slice per tile.

1 parent d05d9b7 commit 6316ef4
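
The batching in the second paragraph of the message can be sketched on the host as follows. This is a stand-in under stated assumptions, not the commit's code: the function name `adler32_per_tile` is hypothetical, and plain `bytes` objects play the role of device arrays so the example runs without a GPU. In the real path the join would presumably be a device-side concatenation followed by a single device-to-host copy.

```python
import zlib


def adler32_per_tile(tiles):
    """Hypothetical helper: compute one adler32 per tile from a single
    contiguous buffer instead of copying each tile separately."""
    sizes = [len(t) for t in tiles]

    # Stand-in for "concatenate on device, transfer once". With device
    # arrays this single step replaces one synchronizing copy per tile.
    host = memoryview(b"".join(tiles))

    checksums = []
    offset = 0
    for n in sizes:
        # Slicing a memoryview does not copy; zlib.adler32 accepts any
        # bytes-like object.
        checksums.append(zlib.adler32(host[offset:offset + n]))
        offset += n
    return checksums
```

The checksums match what a per-tile computation would give; only the number of transfers changes.
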
2 files changed
Lines changed: 35 additions & 5 deletions
[Diff 1, file name and line contents not preserved: 21 lines added as new lines 940-960, between original lines 939 and 940.]
[Diff 2, file name and line contents not preserved: original line 1816 replaced by new lines 1816-1819; original lines 1821-1824 replaced by new lines 1824-1833.]