Commit 268618d

Add design spec for geotiff performance and memory controls
Covers dtype on read, compression_level on write, and VRT tiled output from to_geotiff for streaming dask writes.
1 parent 7d25252 commit 268618d

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
# GeoTIFF Performance and Memory Controls

Adds three opt-in controls to `open_geotiff` and `to_geotiff` that let callers
manage memory usage, compression speed, and large-raster write strategy: a
`dtype` parameter on read, a `compression_level` parameter on write, and tiled
VRT output selected by the output path's extension. Default behaviour is
unchanged.

## 1. `dtype` parameter on `open_geotiff`

### API

```python
open_geotiff(source, *, dtype=None, ...)
```

`dtype` accepts any numpy dtype string or object (`np.float32`, `'float32'`,
etc.). `None` preserves the file's native dtype (current behaviour).

### Read paths

| Path | Behaviour |
|------|-----------|
| Eager (numpy) | Output array allocated at target dtype. Each decoded tile/strip is cast before copy-in. Peak overhead: one tile at native dtype. |
| Dask | Each delayed chunk function casts after decode. Output chunks are target dtype. Same per-tile overhead. |
| GPU (CuPy) | Cast on device after decode. |
| Dask + CuPy | Combination of dask and GPU paths. |

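The eager path in the table can be sketched roughly as follows. This is only an illustration under stated assumptions: `assemble_tiles` and `fake_decoder` are hypothetical names, and the `(row_offset, col_offset, tile)` layout is invented for the example rather than taken from the reader's internals.

```python
import numpy as np


def assemble_tiles(tiles, shape, dtype):
    """Copy decoded tiles into an output array allocated at the target dtype.

    `tiles` yields (row_offset, col_offset, tile_array) triples; this layout
    is an assumption made for the sketch.
    """
    out = np.empty(shape, dtype=dtype)  # allocated once, at the target dtype
    for r0, c0, tile in tiles:
        h, w = tile.shape
        # Slice assignment casts the native-dtype tile while copying it in,
        # so only one tile at native dtype needs to be alive at a time.
        out[r0:r0 + h, c0:c0 + w] = tile
    return out


def fake_decoder():
    """Hypothetical decoder: yields four 2x2 float64 tiles of a 4x4 raster."""
    for r0 in (0, 2):
        for c0 in (0, 2):
            yield r0, c0, np.full((2, 2), r0 * 10 + c0, dtype=np.float64)


result = assemble_tiles(fake_decoder(), shape=(4, 4), dtype=np.float32)
print(result.dtype)
```

Because the decoder is a generator, each native-dtype tile can be released as soon as it has been copied, which is where the "one tile at native dtype" overhead comes from.
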
### Numba LZW fast path

The LZW decoder is a numba JIT function that emits values one at a time into a
byte buffer. A variant will decode each value and cast inline to the target
dtype so the per-tile buffer is never allocated at native dtype. Other codecs
(deflate, zstd) return byte buffers from C libraries where per-value
interception isn't possible, so those fall back to the tile-level cast.

### Validation

- Narrowing float casts (float64 to float32): allowed.
- Narrowing int casts (int64 to int16): allowed (user asked for it explicitly).
- Widening casts (float32 to float64, uint8 to int32): allowed.
- Float to int: `ValueError` (lossy in a way users often don't intend).
- Unsupported casts (e.g. complex128 to uint8): `ValueError`.

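The rules above could be checked with something like the sketch below. `validate_dtype_cast` is a hypothetical helper, and two choices in it are assumptions not stated in the spec: only real integer and float dtypes are treated as supported, and int-to-float casts are allowed.

```python
import numpy as np


def validate_dtype_cast(native, target):
    """Return the dtype to read as, or raise ValueError for rejected casts."""
    native = np.dtype(native)
    if target is None:
        return native  # None keeps the file's native dtype
    target = np.dtype(target)
    real_kinds = "iuf"  # signed int, unsigned int, float (assumed supported set)
    if native.kind not in real_kinds or target.kind not in real_kinds:
        raise ValueError(f"unsupported cast: {native} to {target}")
    if native.kind == "f" and target.kind in "iu":
        raise ValueError(f"float to int cast is not allowed: {native} to {target}")
    return target  # narrowing and widening within the supported kinds pass


print(validate_dtype_cast("float64", "float32"))
print(validate_dtype_cast("int64", "int16"))
```

Note the explicit `None` check: `np.dtype(None)` resolves to float64, so passing `None` straight through would silently request a cast.
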
## 2. `compression_level` parameter on `to_geotiff`

### API

```python
to_geotiff(data, path, *, compression='zstd', compression_level=None, ...)
```

`compression_level` is `int | None`. `None` uses the codec's existing default.

### Ranges

| Codec | Range | Default | Direction |
|-------|-------|---------|-----------|
| deflate | 1 -- 9 | 6 | 1 = fastest, 9 = smallest |
| zstd | 1 -- 22 | 3 | 1 = fastest, 22 = smallest |
| lz4 | 0 -- 16 | 0 | 0 = fastest |
| lzw | n/a | n/a | No level support; ignored silently |
| jpeg | n/a | n/a | Quality is a separate axis; ignored |
| packbits | n/a | n/a | Ignored |
| none | n/a | n/a | Ignored |

### Plumbing

`to_geotiff` passes `compression_level` to `write()`, which passes it to
`compress()`. The internal `compress()` already accepts a `level` argument, so
the change only threads the value through the two intermediate call sites that
currently hardcode it.

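A rough sketch of that threading, using the standard-library `zlib` as a stand-in for the deflate codec. The function names `write` and `compress` come from the text above, but their signatures here are assumptions, and only deflate is modelled.

```python
import zlib


def compress(data, compression, level=None):
    """Innermost stand-in: only deflate is modelled, via stdlib zlib."""
    if compression != "deflate":
        raise NotImplementedError(compression)
    # -1 asks zlib for its own default level when the caller gave none.
    return zlib.compress(data, -1 if level is None else level)


def write(data, compression, compression_level=None):
    """Intermediate layer: forwards the level instead of hardcoding it."""
    return compress(data, compression, level=compression_level)


payload = b"elevation " * 10_000
fast = write(payload, "deflate", compression_level=1)
small = write(payload, "deflate", compression_level=9)
default = write(payload, "deflate")
print(len(payload), len(fast), len(small), len(default))
```

Keeping `None` as the pass-through value means each codec's existing default stays in one place, at the innermost layer.
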
### Validation

- Out-of-range level for a codec that supports levels: `ValueError`.
- Level set for a codec without level support: silently ignored.

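One way to express both rules together with the ranges table is sketched below. `LEVEL_RANGES` and `resolve_compression_level` are hypothetical names introduced for illustration; the bounds are copied from the table in this spec.

```python
# Inclusive (min, max) bounds for codecs that support levels, per the spec.
LEVEL_RANGES = {"deflate": (1, 9), "zstd": (1, 22), "lz4": (0, 16)}


def resolve_compression_level(compression, level):
    """Return the level to forward, or None when the codec default applies."""
    if level is None:
        return None
    bounds = LEVEL_RANGES.get(compression)
    if bounds is None:
        return None  # lzw, jpeg, packbits, none: level silently ignored
    lo, hi = bounds
    if not lo <= level <= hi:
        raise ValueError(
            f"compression_level={level} out of range for {compression} ({lo} to {hi})"
        )
    return level


print(resolve_compression_level("zstd", 19))
print(resolve_compression_level("lzw", 5))
```
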
### GPU path

`write_geotiff_gpu` also accepts and forwards the level to nvCOMP batch
compression, which supports levels for zstd and deflate.

## 3. VRT output from `to_geotiff` via `.vrt` extension

### Trigger

When `path` ends in `.vrt`, `to_geotiff` writes a tiled VRT instead of a
monolithic TIFF. No new parameter needed.

### Output layout

```
output.vrt
output_tiles/
  tile_0000_0000.tif   # row_col, zero-padded
  tile_0000_0001.tif
  ...
```

Directory name derived from the VRT stem (`foo.vrt` -> `foo_tiles/`).
Zero-padding width scales to the grid dimensions.

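The naming scheme can be sketched as below. `tile_layout` is a hypothetical helper, and the minimum padding width of 4 is an assumption inferred from the `tile_0000_0000.tif` example; the spec only says the width scales with the grid.

```python
from pathlib import Path


def tile_layout(vrt_path, n_rows, n_cols, min_width=4):
    """Return the tiles directory and a row-major grid of tile file names."""
    vrt_path = Path(vrt_path)
    tiles_dir = vrt_path.with_name(vrt_path.stem + "_tiles")
    # Wide enough for the largest row or column index (assumed rule).
    width = max(min_width, len(str(max(n_rows, n_cols) - 1)))
    names = [
        [f"tile_{r:0{width}d}_{c:0{width}d}.tif" for c in range(n_cols)]
        for r in range(n_rows)
    ]
    return tiles_dir, names


tiles_dir, names = tile_layout("out/foo.vrt", n_rows=2, n_cols=3)
print(tiles_dir.name, names[0][1], names[1][2])
```
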
### Behaviour per input type

| Input | Tiling strategy | Memory profile |
|-------|----------------|----------------|
| Dask DataArray | One tile per dask chunk. Each task computes its chunk and writes one `.tif`. | One chunk per running task in RAM (scheduler controlled). |
| Dask + CuPy | Same, GPU compress per tile. | One chunk per running task in GPU memory. |
| NumPy ndarray | Slice into `tile_size`-sized pieces, write each. | Source array already in RAM; tile slices are views (no duplication). |
| CuPy | Same as numpy but GPU compress. | Source on GPU; tiles are views. |

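The "tile slices are views" claim for in-memory arrays can be checked with a small sketch. The array size and `tile_size` value here are arbitrary illustration values.

```python
import numpy as np

raster = np.arange(36, dtype=np.float32).reshape(6, 6)
tile_size = 4

tiles = {}
for r0 in range(0, raster.shape[0], tile_size):
    for c0 in range(0, raster.shape[1], tile_size):
        # Basic slicing returns a view; edge tiles are simply smaller.
        key = (r0 // tile_size, c0 // tile_size)
        tiles[key] = raster[r0:r0 + tile_size, c0:c0 + tile_size]

# Every tile shares its buffer with the source array, so nothing is copied.
shares = all(np.shares_memory(raster, t) for t in tiles.values())
print(shares, {k: v.shape for k, v in tiles.items()})
```
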
### Per-tile properties

- Same `compression`, `compression_level`, `predictor`, `nodata`, `crs` as the
  parent call.
- `tiled=True` with the caller's `tile_size` (internal TIFF tiling within each
  tile file).
- GeoTransform adjusted to each tile's spatial position (row/col offset from
  the full raster origin).
- No COG overviews on individual tiles.

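The GeoTransform adjustment amounts to moving the origin by the tile's pixel offset. The sketch below assumes the transform is held in GDAL's six-coefficient order; `tile_geotransform` is a hypothetical helper, and how the library stores the transform internally is not stated in this spec.

```python
def tile_geotransform(gt, row_off, col_off):
    """Shift a GDAL-order geotransform to a tile's top-left pixel.

    gt = (x_origin, pixel_width, row_rotation,
          y_origin, col_rotation, pixel_height)
    """
    x0, dx, rx, y0, ry, dy = gt
    # Same affine mapping GDAL uses for pixel -> georeferenced coordinates.
    tile_x0 = x0 + col_off * dx + row_off * rx
    tile_y0 = y0 + col_off * ry + row_off * dy
    return (tile_x0, dx, rx, tile_y0, ry, dy)


full = (500000.0, 30.0, 0.0, 4100000.0, 0.0, -30.0)  # north-up, 30 m pixels
print(tile_geotransform(full, row_off=512, col_off=1024))
```

Only the two origin terms change; pixel size and rotation are shared by every tile.
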
### VRT generation

After all tiles are written, call `write_vrt()` with relative paths. The VRT
XML references each tile by its spatial extent and band mapping.

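For orientation, a single-band VRT of the kind described might be assembled as in the sketch below. `build_vrt_xml` is a hypothetical function and says nothing about `write_vrt()`'s actual signature; the element names follow the GDAL VRT format, and the SRS element is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_vrt_xml(width, height, geotransform, tiles, data_type="Float32"):
    """Build VRT XML; tiles is a list of (rel_path, x_off, y_off, x_size, y_size)."""
    root = ET.Element("VRTDataset", rasterXSize=str(width), rasterYSize=str(height))
    ET.SubElement(root, "GeoTransform").text = ", ".join(
        repr(float(v)) for v in geotransform
    )
    band = ET.SubElement(root, "VRTRasterBand", dataType=data_type, band="1")
    for rel_path, x_off, y_off, x_size, y_size in tiles:
        src = ET.SubElement(band, "SimpleSource")
        # relativeToVRT="1" makes the path resolve against the .vrt location.
        ET.SubElement(src, "SourceFilename", relativeToVRT="1").text = rel_path
        ET.SubElement(src, "SourceBand").text = "1"
        size = {"xSize": str(x_size), "ySize": str(y_size)}
        ET.SubElement(src, "SrcRect", xOff="0", yOff="0", **size)
        ET.SubElement(src, "DstRect", xOff=str(x_off), yOff=str(y_off), **size)
    return ET.tostring(root, encoding="unicode")


xml_text = build_vrt_xml(
    512, 256, (500000.0, 30.0, 0.0, 4100000.0, 0.0, -30.0),
    [("foo_tiles/tile_0000_0000.tif", 0, 0, 256, 256),
     ("foo_tiles/tile_0000_0001.tif", 256, 0, 256, 256)],
)
print(xml_text)
```
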
### Edge cases and validation

- `cog=True` with a `.vrt` path: `ValueError` (mutually exclusive).
- Tiles directory exists and is non-empty: `FileExistsError` to prevent silent
  overwrites.
- Tiles directory doesn't exist: created automatically.
- `overview_levels` with `.vrt` path: `ValueError` (overviews don't apply).

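These checks could be gathered into one pre-flight step, sketched below. `prepare_vrt_output` is a hypothetical helper; treating `overview_levels=None` as "not set" is an assumption.

```python
import tempfile
from pathlib import Path


def prepare_vrt_output(path, cog=False, overview_levels=None):
    """Validate options for a .vrt path and return a usable tiles directory."""
    path = Path(path)
    if cog:
        raise ValueError("cog=True cannot be combined with a .vrt path")
    if overview_levels is not None:
        raise ValueError("overview_levels does not apply to a .vrt path")
    tiles_dir = path.with_name(path.stem + "_tiles")
    if tiles_dir.exists() and any(tiles_dir.iterdir()):
        raise FileExistsError(f"{tiles_dir} exists and is not empty")
    tiles_dir.mkdir(parents=True, exist_ok=True)  # missing dir is created
    return tiles_dir


with tempfile.TemporaryDirectory() as tmp:
    vrt = Path(tmp) / "foo.vrt"
    created = prepare_vrt_output(vrt)          # creates foo_tiles/
    (created / "tile_0000_0000.tif").touch()   # simulate an earlier write
    try:
        prepare_vrt_output(vrt)
    except FileExistsError:
        second_call_refused = True
    else:
        second_call_refused = False

print(created.name, second_call_refused)
```
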
### Dask scheduling

For dask inputs, all delayed tile-write tasks are submitted in a single
`dask.compute()` call. The scheduler manages parallelism and memory. Each
task computes its chunk, compresses it, and writes one tile file. Tasks do not
coordinate with each other.

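The scheduling pattern might look like the sketch below, which needs `dask` and `numpy` installed. `write_tile` is a hypothetical stand-in that saves `.npy` files instead of compressed TIFF tiles, so it only illustrates the task structure.

```python
import tempfile
from pathlib import Path

import dask
import dask.array as da
import numpy as np


def write_tile(block, path):
    """Hypothetical per-tile writer: saves .npy rather than a real GeoTIFF."""
    np.save(path, np.asarray(block))
    return str(path)


with tempfile.TemporaryDirectory() as tmp:
    data = da.ones((4, 6), chunks=(2, 3), dtype="float32")
    blocks = data.to_delayed()  # 2-D object array with one Delayed per chunk
    tasks = [
        dask.delayed(write_tile)(blocks[r, c], Path(tmp) / f"tile_{r:04d}_{c:04d}.npy")
        for r in range(blocks.shape[0])
        for c in range(blocks.shape[1])
    ]
    written = dask.compute(*tasks)  # every tile task handed over in one call
    n_files = len(list(Path(tmp).glob("tile_*.npy")))

print(len(written), n_files)
```

Each task depends only on its own chunk, so the scheduler is free to run as many at once as its workers allow.
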
## Out of scope

- Streaming write of a monolithic `.tif` from dask input (tracked as a separate
  issue). Users who need a single file from a large dask array can write to VRT
  and convert externally, or ensure sufficient RAM.
- JPEG quality parameter (separate concern from compression level).
- Automatic chunk-size recommendation based on available memory.
