Commit 26b6404
Fix nvCOMP deflate: use CUDA backend (backend=2) instead of DEFAULT
nvCOMP deflate decompression now works on all CUDA GPUs by using
backend=2 (CUDA software implementation) instead of backend=0
(DEFAULT, which tries hardware decompression first and fails on
pre-Ada GPUs).
Benchmarks (read + slope, A6000 GPU, nvCOMP via libnvcomp.so):
  Deflate:
    8192x8192 (1024 tiles): GPU 769ms vs CPU 1364ms = 1.8x
    16384x16384 (4096 tiles): GPU 2417ms vs CPU 5788ms = 2.4x
  ZSTD:
    8192x8192 (1024 tiles): GPU 349ms vs CPU 404ms = 1.2x
    16384x16384 (4096 tiles): GPU 1325ms vs CPU 2087ms = 1.6x
Both codecs decompress entirely on GPU via nvCOMP batch API.
No CPU decompression fallback needed when nvCOMP is available.
100% pixel-exact match verified.

1 parent 339581f commit 26b6404
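
The speedup figures in the benchmark lines can be re-derived from the raw timings. The sketch below is only a sanity check on the numbers quoted in the commit message, not code from the commit. The helper names `speedup` and `tile_side` are hypothetical, and `tile_side` assumes square tiles laid out on a square grid, which the message does not state.

```python
import math

# Timings in milliseconds, copied from the benchmark lines in the commit message.
# Columns: codec, raster side in pixels, tile count, GPU ms, CPU ms.
BENCHMARKS = [
    ("Deflate", 8192, 1024, 769, 1364),
    ("Deflate", 16384, 4096, 2417, 5788),
    ("ZSTD", 8192, 1024, 349, 404),
    ("ZSTD", 16384, 4096, 1325, 2087),
]


def speedup(gpu_ms, cpu_ms):
    """Ratio of CPU time to GPU time; values above 1 mean the GPU path is faster."""
    return cpu_ms / gpu_ms


def tile_side(raster_side, tile_count):
    """Tile edge length in pixels, ASSUMING square tiles on a square grid."""
    tiles_per_side = math.isqrt(tile_count)
    return raster_side // tiles_per_side


if __name__ == "__main__":
    for codec, side, tiles, gpu_ms, cpu_ms in BENCHMARKS:
        print(
            f"{codec:8s} {side}x{side} ({tiles} tiles, "
            f"{tile_side(side, tiles)} px per tile edge): "
            f"{speedup(gpu_ms, cpu_ms):.1f}x"
        )
```

Rounded to one decimal place, the ratios match the 1.8x, 2.4x, 1.2x and 1.6x reported in the message. Under the square-tile assumption, both raster sizes work out to 256-pixel tile edges.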
1 file changed, +2 -1 lines changed (original line 824 replaced by new lines 824 and 825; line contents not preserved)