Skip to content

Commit e8448c8

Browse files
authored
Add JPEG 2000 codec with optional nvJPEG2000 GPU acceleration (#1049)
* Add JPEG 2000 codec with optional nvJPEG2000 GPU acceleration (#1048) CPU path via glymur (same pattern as JPEG/Pillow and ZSTD/zstandard). GPU path via nvJPEG2000 ctypes bindings (same pattern as nvCOMP). Both are optional -- graceful fallback if libraries aren't installed. * Add JPEG 2000 test coverage and fix glymur numres constraint (#1048) - 14 new tests: codec roundtrip, TIFF write/read roundtrip, public API, availability checks, and ImportError fallback - Fix jpeg2000_compress: calculate numres from tile dimensions, remove pre-existing temp file before glymur write - Update test_edge_cases: use 'webp' as unsupported compression example since 'jpeg2000' is now supported * Add JPEG 2000 compression user guide notebook (#1048) Covers write/read roundtrip, file size comparison with deflate, multi-band RGB example, and GPU acceleration notes. * Update README: add JPEG 2000 / nvJPEG2000 to codec lists (#1048)
1 parent 66fc110 commit e8448c8

File tree

7 files changed

+750
-6
lines changed

7 files changed

+750
-6
lines changed

README.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -159,9 +159,9 @@ write_geotiff(data, 'out.tif', gpu=True) # force GPU compress
159159
write_vrt('mosaic.vrt', ['tile1.tif', 'tile2.tif']) # generate VRT
160160
```
161161

162-
**Compression codecs:** Deflate, LZW (Numba JIT), ZSTD, PackBits, JPEG (Pillow), uncompressed
162+
**Compression codecs:** Deflate, LZW (Numba JIT), ZSTD, PackBits, JPEG (Pillow), JPEG 2000 (glymur), uncompressed
163163

164-
**GPU codecs:** Deflate and ZSTD via nvCOMP batch API; LZW via Numba CUDA kernels
164+
**GPU codecs:** Deflate and ZSTD via nvCOMP batch API; JPEG 2000 via nvJPEG2000; LZW via Numba CUDA kernels
165165

166166
**Features:**
167167
- Tiled, stripped, BigTIFF, multi-band (RGB/RGBA), sub-byte (1/2/4/12-bit)
@@ -540,7 +540,7 @@ Check out the user guide [here](/examples/user_guide/).
540540

541541
- **Zero GDAL installation hassle.** `pip install xarray-spatial` gets you everything needed to read and write GeoTIFFs, COGs, and VRT files.
542542
- **Pure Python, fully extensible.** All codec, header parsing, and metadata code is readable Python/Numba, not wrapped C/C++.
543-
- **GPU-accelerated reads.** With optional nvCOMP, compressed tiles decompress directly on the GPU via CUDA -- something GDAL cannot do.
543+
- **GPU-accelerated reads.** With optional nvCOMP and nvJPEG2000, compressed tiles decompress directly on the GPU via CUDA -- something GDAL cannot do.
544544

545545
The native reader is pixel-exact against rasterio/GDAL across Landsat 8, Copernicus DEM, USGS 1-arc-second, and USGS 1-meter DEMs. For uncompressed files it reads 5-7x faster than rioxarray; for compressed COGs it is comparable or faster with GPU acceleration.
546546

Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,101 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "yox7s6qx13e",
6+
"source": "# JPEG 2000 compression for GeoTIFFs\n\nThe geotiff package supports JPEG 2000 (J2K) as a compression codec for both reading and writing. This is useful for satellite imagery workflows where J2K is common (Sentinel-2, Landsat, etc.).\n\nTwo acceleration tiers are available:\n- **CPU** via `glymur` (pip install glymur) -- works anywhere OpenJPEG is installed\n- **GPU** via NVIDIA's nvJPEG2000 library -- same optional pattern as nvCOMP for deflate/ZSTD\n\nThis notebook demonstrates write/read roundtrips with JPEG 2000 compression.",
7+
"metadata": {}
8+
},
9+
{
10+
"cell_type": "code",
11+
"id": "kamu534xsm",
12+
"source": "import numpy as np\nimport xarray as xr\nimport matplotlib.pyplot as plt\nimport tempfile\nimport os\n\nfrom xrspatial.geotiff import read_geotiff, write_geotiff",
13+
"metadata": {},
14+
"execution_count": null,
15+
"outputs": []
16+
},
17+
{
18+
"cell_type": "markdown",
19+
"id": "w7tlml1cyqj",
20+
"source": "## Generate synthetic elevation data\n\nWe'll create a small terrain-like raster to use as test data.",
21+
"metadata": {}
22+
},
23+
{
24+
"cell_type": "code",
25+
"id": "9fzhnpcn4xq",
26+
"source": "# Create a 256x256 synthetic terrain (uint16, typical for satellite imagery)\nrng = np.random.RandomState(42)\nyy, xx = np.meshgrid(np.linspace(-2, 2, 256), np.linspace(-2, 2, 256), indexing='ij')\nterrain = np.exp(-(xx**2 + yy**2)) * 10000 + rng.normal(0, 100, (256, 256))\nterrain = np.clip(terrain, 0, 65535).astype(np.uint16)\n\nda = xr.DataArray(\n terrain,\n dims=['y', 'x'],\n coords={\n 'y': np.linspace(45.0, 44.0, 256),\n 'x': np.linspace(-120.0, -119.0, 256),\n },\n attrs={'crs': 4326},\n name='elevation',\n)\n\nfig, ax = plt.subplots(figsize=(6, 5))\nda.plot(ax=ax, cmap='terrain')\nax.set_title('Synthetic elevation (uint16)')\nplt.tight_layout()\nplt.show()",
27+
"metadata": {},
28+
"execution_count": null,
29+
"outputs": []
30+
},
31+
{
32+
"cell_type": "markdown",
33+
"id": "8tsuyr3jbay",
34+
"source": "## Write with JPEG 2000 (lossless)\n\nPass `compression='jpeg2000'` to `write_geotiff`. The default is lossless encoding.",
35+
"metadata": {}
36+
},
37+
{
38+
"cell_type": "code",
39+
"id": "ystjp6v30d",
40+
"source": "tmpdir = tempfile.mkdtemp(prefix='j2k_demo_')\n\n# Write with JPEG 2000 compression\nj2k_path = os.path.join(tmpdir, 'elevation_j2k.tif')\nwrite_geotiff(da, j2k_path, compression='jpeg2000')\n\n# Compare file sizes with deflate\ndeflate_path = os.path.join(tmpdir, 'elevation_deflate.tif')\nwrite_geotiff(da, deflate_path, compression='deflate')\n\nnone_path = os.path.join(tmpdir, 'elevation_none.tif')\nwrite_geotiff(da, none_path, compression='none')\n\nj2k_size = os.path.getsize(j2k_path)\ndeflate_size = os.path.getsize(deflate_path)\nnone_size = os.path.getsize(none_path)\n\nprint(f\"Uncompressed: {none_size:>8,} bytes\")\nprint(f\"Deflate: {deflate_size:>8,} bytes ({deflate_size/none_size:.1%} of original)\")\nprint(f\"JPEG 2000: {j2k_size:>8,} bytes ({j2k_size/none_size:.1%} of original)\")",
41+
"metadata": {},
42+
"execution_count": null,
43+
"outputs": []
44+
},
45+
{
46+
"cell_type": "markdown",
47+
"id": "89y9zun97nb",
48+
"source": "## Read it back and verify lossless roundtrip\n\n`read_geotiff` auto-detects the compression from the TIFF header. No special arguments needed.",
49+
"metadata": {}
50+
},
51+
{
52+
"cell_type": "code",
53+
"id": "8vf9ljxkx03",
54+
"source": "# Read back and check lossless roundtrip\nda_read = read_geotiff(j2k_path)\n\nprint(f\"Shape: {da_read.shape}\")\nprint(f\"Dtype: {da_read.dtype}\")\nprint(f\"CRS: {da_read.attrs.get('crs')}\")\nprint(f\"Exact match: {np.array_equal(da_read.values, terrain)}\")\n\nfig, axes = plt.subplots(1, 2, figsize=(12, 5))\nda.plot(ax=axes[0], cmap='terrain')\naxes[0].set_title('Original')\nda_read.plot(ax=axes[1], cmap='terrain')\naxes[1].set_title('After JPEG 2000 roundtrip')\nplt.tight_layout()\nplt.show()",
55+
"metadata": {},
56+
"execution_count": null,
57+
"outputs": []
58+
},
59+
{
60+
"cell_type": "markdown",
61+
"id": "gcj96utnd3u",
62+
"source": "## Multi-band example (RGB)\n\nJPEG 2000 also handles multi-band imagery, which is the common case for satellite data.",
63+
"metadata": {}
64+
},
65+
{
66+
"cell_type": "code",
67+
"id": "mgv9xhsrcen",
68+
"source": "# Create a 3-band uint8 image\nrgb = np.zeros((128, 128, 3), dtype=np.uint8)\nrgb[:, :, 0] = np.linspace(0, 255, 128).astype(np.uint8)[None, :] # red gradient\nrgb[:, :, 1] = np.linspace(0, 255, 128).astype(np.uint8)[:, None] # green gradient\nrgb[:, :, 2] = 128 # constant blue\n\nda_rgb = xr.DataArray(\n rgb, dims=['y', 'x', 'band'],\n coords={'y': np.arange(128), 'x': np.arange(128), 'band': [0, 1, 2]},\n)\n\nrgb_path = os.path.join(tmpdir, 'rgb_j2k.tif')\nwrite_geotiff(da_rgb, rgb_path, compression='jpeg2000')\n\nda_rgb_read = read_geotiff(rgb_path)\nprint(f\"RGB shape: {da_rgb_read.shape}, dtype: {da_rgb_read.dtype}\")\nprint(f\"Exact match: {np.array_equal(da_rgb_read.values, rgb)}\")\n\nfig, axes = plt.subplots(1, 2, figsize=(10, 4))\naxes[0].imshow(rgb)\naxes[0].set_title('Original RGB')\naxes[1].imshow(da_rgb_read.values)\naxes[1].set_title('After J2K roundtrip')\nplt.tight_layout()\nplt.show()",
69+
"metadata": {},
70+
"execution_count": null,
71+
"outputs": []
72+
},
73+
{
74+
"cell_type": "markdown",
75+
"id": "zzga5hc3a99",
76+
"source": "## GPU acceleration\n\nOn systems with nvJPEG2000 installed (CUDA toolkit, RAPIDS environments), pass `gpu=True` to use GPU-accelerated J2K encode/decode. The API is the same -- it falls back to CPU automatically if the library isn't found.\n\n```python\n# GPU write (nvJPEG2000 if available, else CPU fallback)\nwrite_geotiff(cupy_data, \"output.tif\", compression=\"jpeg2000\", gpu=True)\n\n# GPU read (nvJPEG2000 decode if available)\nda = read_geotiff(\"satellite.tif\", gpu=True)\n```",
77+
"metadata": {}
78+
},
79+
{
80+
"cell_type": "code",
81+
"id": "x74nrht8kx",
82+
"source": "# Cleanup temp files\nimport shutil\nshutil.rmtree(tmpdir, ignore_errors=True)",
83+
"metadata": {},
84+
"execution_count": null,
85+
"outputs": []
86+
}
87+
],
88+
"metadata": {
89+
"kernelspec": {
90+
"display_name": "Python 3",
91+
"language": "python",
92+
"name": "python3"
93+
},
94+
"language_info": {
95+
"name": "python",
96+
"version": "3.11.0"
97+
}
98+
},
99+
"nbformat": 4,
100+
"nbformat_minor": 5
101+
}

xrspatial/geotiff/_compression.py

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -727,13 +727,77 @@ def zstd_compress(data: bytes, level: int = 3) -> bytes:
727727
return _zstd.ZstdCompressor(level=level).compress(data)
728728

729729

730+
# -- JPEG 2000 codec (via glymur) --------------------------------------------
731+
732+
JPEG2000_AVAILABLE = False
733+
try:
734+
import glymur as _glymur
735+
JPEG2000_AVAILABLE = True
736+
except ImportError:
737+
_glymur = None
738+
739+
740+
def jpeg2000_decompress(data: bytes, width: int = 0, height: int = 0,
741+
samples: int = 1) -> bytes:
742+
"""Decompress a JPEG 2000 codestream. Requires ``glymur``."""
743+
if not JPEG2000_AVAILABLE:
744+
raise ImportError(
745+
"glymur is required to read JPEG 2000-compressed TIFFs. "
746+
"Install it with: pip install glymur")
747+
import tempfile
748+
import os
749+
# glymur reads from files, so write the codestream to a temp file
750+
fd, tmp = tempfile.mkstemp(suffix='.j2k')
751+
try:
752+
os.write(fd, data)
753+
os.close(fd)
754+
jp2 = _glymur.Jp2k(tmp)
755+
arr = jp2[:]
756+
return arr.tobytes()
757+
finally:
758+
os.unlink(tmp)
759+
760+
761+
def jpeg2000_compress(data: bytes, width: int, height: int,
762+
samples: int = 1, dtype: np.dtype = np.dtype('uint8'),
763+
lossless: bool = True) -> bytes:
764+
"""Compress raw pixel data as JPEG 2000 codestream. Requires ``glymur``."""
765+
if not JPEG2000_AVAILABLE:
766+
raise ImportError(
767+
"glymur is required to write JPEG 2000-compressed TIFFs. "
768+
"Install it with: pip install glymur")
769+
import math
770+
import tempfile
771+
import os
772+
if samples == 1:
773+
arr = np.frombuffer(data, dtype=dtype).reshape(height, width)
774+
else:
775+
arr = np.frombuffer(data, dtype=dtype).reshape(height, width, samples)
776+
fd, tmp = tempfile.mkstemp(suffix='.j2k')
777+
os.close(fd)
778+
os.unlink(tmp) # glymur needs the file to not exist
779+
try:
780+
cratios = [1] if lossless else [20]
781+
# numres must be <= log2(min_dim) + 1 to avoid OpenJPEG errors
782+
min_dim = max(min(width, height), 1)
783+
numres = min(6, int(math.log2(min_dim)) + 1)
784+
numres = max(numres, 1)
785+
_glymur.Jp2k(tmp, data=arr, cratios=cratios, numres=numres)
786+
with open(tmp, 'rb') as f:
787+
return f.read()
788+
finally:
789+
if os.path.exists(tmp):
790+
os.unlink(tmp)
791+
792+
730793
# -- Dispatch helpers ---------------------------------------------------------
731794

732795
# TIFF compression tag values
733796
COMPRESSION_NONE = 1
734797
COMPRESSION_LZW = 5
735798
COMPRESSION_JPEG = 7
736799
COMPRESSION_DEFLATE = 8
800+
COMPRESSION_JPEG2000 = 34712
737801
COMPRESSION_ZSTD = 50000
738802
COMPRESSION_PACKBITS = 32773
739803
COMPRESSION_ADOBE_DEFLATE = 32946
@@ -771,6 +835,9 @@ def decompress(data, compression: int, expected_size: int = 0,
771835
dtype=np.uint8)
772836
elif compression == COMPRESSION_ZSTD:
773837
return np.frombuffer(zstd_decompress(data), dtype=np.uint8)
838+
elif compression == COMPRESSION_JPEG2000:
839+
return np.frombuffer(
840+
jpeg2000_decompress(data, width, height, samples), dtype=np.uint8)
774841
else:
775842
raise ValueError(f"Unsupported compression type: {compression}")
776843

@@ -803,5 +870,7 @@ def compress(data: bytes, compression: int, level: int = 6) -> bytes:
803870
return zstd_compress(data, level)
804871
elif compression == COMPRESSION_JPEG:
805872
raise ValueError("Use jpeg_compress() directly with width/height/samples")
873+
elif compression == COMPRESSION_JPEG2000:
874+
raise ValueError("Use jpeg2000_compress() directly with width/height/samples/dtype")
806875
else:
807876
raise ValueError(f"Unsupported compression type: {compression}")

0 commit comments

Comments
 (0)