# GeoTIFF Performance and Memory Controls

Adds three parameters to `open_geotiff` and `to_geotiff` that let callers
control memory usage, compression speed, and large-raster write strategy.
All three are opt-in; default behaviour is unchanged.

## 1. `dtype` parameter on `open_geotiff`

### API

```python
open_geotiff(source, *, dtype=None, ...)
```

`dtype` accepts any numpy dtype string or object (`np.float32`, `'float32'`,
etc.). `None` preserves the file's native dtype (current behaviour).

### Read paths

| Path | Behaviour |
|------|-----------|
| Eager (numpy) | Output array allocated at the target dtype. Each decoded tile/strip is cast before copy-in. Peak overhead: one tile at native dtype. |
| Dask | Each delayed chunk function casts after decode. Output chunks are target dtype. Same per-tile overhead. |
| GPU (CuPy) | Cast on device after decode. |
| Dask + CuPy | Combination of the dask and GPU paths. |
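
The eager path can be sketched as follows. This is illustrative only: `read_eager` and the shape of the tile iterator are assumptions for the sketch, not the library's actual internals.

```python
import numpy as np

def read_eager(tiles, shape, native_dtype, target_dtype=None):
    """Sketch of the eager read path: allocate the output at the target
    dtype, then cast each decoded tile before copy-in, so peak overhead
    is one tile at the native dtype. `tiles` yields
    (row_slice, col_slice, raw_bytes) triples."""
    out_dtype = np.dtype(target_dtype or native_dtype)
    out = np.empty(shape, dtype=out_dtype)
    for rows, cols, raw in tiles:
        tile = np.frombuffer(raw, dtype=native_dtype).reshape(
            rows.stop - rows.start, cols.stop - cols.start)
        # Cast happens per tile, not on the whole raster at once.
        out[rows, cols] = tile.astype(out_dtype, copy=False)
    return out
```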

### Numba LZW fast path

The LZW decoder is a numba JIT function that emits values one at a time into a
byte buffer. A variant of this decoder will cast each value inline to the
target dtype as it is emitted, so the per-tile buffer is never allocated at
the native dtype. Other codecs (deflate, zstd) return byte buffers from C
libraries where per-value interception isn't possible, so they fall back to
the tile-level cast.

### Validation

- Narrowing float casts (float64 to float32): allowed.
- Narrowing int casts (int64 to int16): allowed (the caller asked for it explicitly).
- Widening casts (float32 to float64, uint8 to int32): allowed.
- Float to int: `ValueError` (lossy in a way users often don't intend).
- Unsupported casts (e.g. complex128 to uint8): `ValueError`.
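
A minimal sketch of these rules; `validate_dtype_cast` is a hypothetical helper name, not the library's actual function:

```python
import numpy as np

def validate_dtype_cast(native, target):
    """Sketch of the dtype validation rules: narrowing and widening
    within numeric kinds pass through; float-to-int and complex-to-real
    casts raise. The function name is illustrative."""
    native, target = np.dtype(native), np.dtype(target)
    if native.kind == "c" and target.kind != "c":
        raise ValueError(f"unsupported cast {native} -> {target}")
    if native.kind == "f" and target.kind in "iu":
        raise ValueError(f"refusing lossy float-to-int cast {native} -> {target}")
    return target
```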

## 2. `compression_level` parameter on `to_geotiff`

### API

```python
to_geotiff(data, path, *, compression='zstd', compression_level=None, ...)
```

`compression_level` is `int | None`. `None` uses the codec's existing default.

### Ranges

| Codec | Range | Default | Direction |
|-------|-------|---------|-----------|
| deflate | 1-9 | 6 | 1 = fastest, 9 = smallest |
| zstd | 1-22 | 3 | 1 = fastest, 22 = smallest |
| lz4 | 0-16 | 0 | 0 = fastest |
| lzw | n/a | n/a | No level support; ignored silently |
| jpeg | n/a | n/a | Quality is a separate axis; ignored |
| packbits | n/a | n/a | Ignored |
| none | n/a | n/a | Ignored |
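
The table can be expressed as a small lookup consulted at validation time; `LEVEL_RANGES` and `validate_level` are illustrative names for this sketch:

```python
# Level-aware codecs and their (min, max) ranges, per the table above.
LEVEL_RANGES = {"deflate": (1, 9), "zstd": (1, 22), "lz4": (0, 16)}

def validate_level(compression, level):
    """Sketch of level validation: out-of-range levels for a level-aware
    codec raise; a level passed to a codec without level support is
    silently ignored (returns None)."""
    if level is None or compression not in LEVEL_RANGES:
        return None
    lo, hi = LEVEL_RANGES[compression]
    if not lo <= level <= hi:
        raise ValueError(
            f"{compression} level must be in [{lo}, {hi}], got {level}")
    return level
```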

### Plumbing

`to_geotiff` passes `compression_level` to `write()`, which passes it to
`compress()`. The internal `compress()` already accepts a `level` argument; we
just thread it through the two intermediate call sites that currently hardcode
it.

### Validation

- Out-of-range level for a codec that supports levels: `ValueError`.
- Level set for a codec without level support: silently ignored.

### GPU path

`write_geotiff_gpu` also accepts the level and forwards it to nvCOMP batch
compression, which supports levels for zstd and deflate.

## 3. VRT output from `to_geotiff` via `.vrt` extension

### Trigger

When `path` ends in `.vrt`, `to_geotiff` writes a tiled VRT instead of a
monolithic TIFF. No new parameter is needed.

### Output layout

```
output.vrt
output_tiles/
  tile_0000_0000.tif  # row_col, zero-padded
  tile_0000_0001.tif
  ...
```

The directory name is derived from the VRT stem (`foo.vrt` -> `foo_tiles/`).
Zero-padding width scales with the grid dimensions.
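
The naming scheme can be sketched as below; `tile_path` and the minimum width of 4 (matching the example layout above) are assumptions for this sketch:

```python
from pathlib import Path

def tile_path(vrt_path, row, col, n_rows, n_cols):
    """Sketch of tile naming: directory from the VRT stem plus
    '_tiles', filenames row_col zero-padded wide enough for the grid."""
    vrt_path = Path(vrt_path)
    width = max(4, len(str(n_rows - 1)), len(str(n_cols - 1)))
    tiles_dir = vrt_path.with_name(vrt_path.stem + "_tiles")
    return tiles_dir / f"tile_{row:0{width}d}_{col:0{width}d}.tif"
```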

### Behaviour per input type

| Input | Tiling strategy | Memory profile |
|-------|----------------|----------------|
| Dask DataArray | One tile per dask chunk. Each task computes its chunk and writes one `.tif`. | One chunk in RAM at a time (scheduler controlled). |
| Dask + CuPy | Same, GPU compress per tile. | One chunk in GPU memory at a time. |
| Numpy / ndarray | Slice into `tile_size`-sized pieces, write each. | Source array already in RAM; tile slices are views (no duplication). |
| CuPy | Same as numpy but GPU compress. | Source on GPU; tiles are views. |
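
The one-tile-per-chunk mapping amounts to turning dask-style chunk sizes into per-tile slices. A minimal sketch, with `chunk_slices` as an illustrative name:

```python
from itertools import accumulate, product

def chunk_slices(chunks):
    """Map per-axis chunk sizes to one slice tuple per tile, in
    row-major order. `chunks` mirrors dask's `.chunks` attribute,
    e.g. ((512, 512, 100), (512, 12)) for a 1124 x 524 raster."""
    edges = [list(accumulate(axis, initial=0)) for axis in chunks]
    per_axis = [
        [slice(e[i], e[i + 1]) for i in range(len(e) - 1)] for e in edges
    ]
    return list(product(*per_axis))
```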

### Per-tile properties

- Same `compression`, `compression_level`, `predictor`, `nodata`, `crs` as the
  parent call.
- `tiled=True` with the caller's `tile_size` (internal TIFF tiling within each
  chunk-file).
- GeoTransform adjusted to each tile's spatial position (row/col offset from
  the full raster origin).
- No COG overviews on individual tiles.
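
The geotransform adjustment follows from the standard GDAL six-element affine form; `tile_geotransform` is an illustrative name for this sketch:

```python
def tile_geotransform(gt, row_off, col_off):
    """Shift a GDAL-style geotransform
    (x_origin, pixel_w, row_rot, y_origin, col_rot, pixel_h)
    to a tile's row/col pixel offset within the full raster, using
    x = GT0 + col*GT1 + row*GT2 and y = GT3 + col*GT4 + row*GT5."""
    x0, pw, rr, y0, cr, ph = gt
    return (x0 + col_off * pw + row_off * rr, pw, rr,
            y0 + col_off * cr + row_off * ph, cr, ph)
```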

### VRT generation

After all tiles are written, call `write_vrt()` with relative paths. The VRT
XML references each tile by its spatial extent and band mapping.

### Edge cases and validation

- `cog=True` with a `.vrt` path: `ValueError` (mutually exclusive).
- Tiles directory exists and is non-empty: `FileExistsError` to prevent silent
  overwrites.
- Tiles directory doesn't exist: created automatically.
- `overview_levels` with `.vrt` path: `ValueError` (overviews don't apply).

### Dask scheduling

For dask inputs, all delayed tile-write tasks are submitted to
`dask.compute()` at once. The scheduler manages parallelism and memory. Each
task is: compute chunk, compress, write tile file. No coordination between
tasks.
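
A minimal sketch of the fan-out, using `concurrent.futures` as a stand-in for the dask scheduler (the real path builds delayed tasks and hands them to `dask.compute()`); `write_tiles` and `write_tile` are illustrative names:

```python
from concurrent.futures import ThreadPoolExecutor

def write_tiles(tile_jobs, write_tile, max_workers=4):
    """Each job is independent: compute the chunk, compress, write one
    tile file. No coordination between tasks; the pool (standing in for
    the dask scheduler) bounds parallelism and therefore peak memory."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: write_tile(*job), tile_jobs))
```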

## Out of scope

- Streaming write of a monolithic `.tif` from dask input (tracked as a separate
  issue). Users who need a single file from a large dask array can write to VRT
  and convert externally, or ensure sufficient RAM.
- JPEG quality parameter (separate concern from compression level).
- Automatic chunk-size recommendation based on available memory.