"source": "# Preview: memory-safe thumbnails of large rasters\n\nWhen a raster is backed by dask (e.g. loaded lazily from Zarr or a stack of GeoTIFFs),\ncalling `.compute()` to visualize it materializes the full array and can exhaust\nmemory. `xrspatial.preview()` downsamples the data to a target size in pixels using\nblock averaging, and the whole operation stays lazy until you ask for the result.\nPeak memory is bounded by the largest chunk plus the small output array.\n\nThis notebook generates a roughly 1 TB dask-backed terrain raster and previews it at\n1000x1000 pixels. A `dask.distributed` `LocalCluster` is started so you can\nwatch the task graph and worker memory in the dashboard.",
"metadata": {}
},
{
"cell_type": "code",
"id": "ivhk3f6ui7",
"source": "import numpy as np\nimport xarray as xr\nimport dask.array as da\nimport matplotlib.pyplot as plt\n\nimport xrspatial\nfrom xrspatial import generate_terrain, preview",
"source": "## Generate a terrain tile\n\nFirst, create a 1024x1024 terrain tile using `generate_terrain`. This is the\nbuilding block we'll replicate into a massive dask array.",
"source": "## Tile it into a 1 TB dask array\n\nWe replicate the tile 512x512 times using `dask.array.tile` to get a\n524,288 x 524,288 raster. At float32 that's 1.1 TB of data. Nothing is\nactually computed here -- dask just records the tiling as a lazy graph.",
"metadata": {}
},
{
"cell_type": "code",
"id": "ire1hxtder",
"source": "# Tile the small terrain into a ~1 TB dask array\nreps = 512\nbig_dask = da.tile(\n    da.from_array(tile.values, chunks=(1024, 1024)),\n    (reps, reps),\n)\nrows, cols = big_dask.shape\nbig = xr.DataArray(\n    big_dask,\n    dims=[\"y\", \"x\"],\n    coords={\"y\": np.arange(rows, dtype=np.float64), \"x\": np.arange(cols, dtype=np.float64)},\n)\n\nprint(f\"Shape: {big.shape[0]:,} x {big.shape[1]:,}\")\nprint(f\"Chunk size: {big_dask.chunksize}\")\nprint(f\"Num chunks: {big_dask.numblocks}\")\nprint(f\"Total size: {big_dask.nbytes / 1e12:.2f} TB\")\nprint(f\"Dtype: {big_dask.dtype}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
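The size figures quoted for the tiled raster follow from simple arithmetic, sketched here in plain Python. It assumes float32 data (4 bytes per value), as the text states, and that each tile copy stays in its own 1024x1024 chunk; if `generate_terrain` returned float64, the byte totals would double.

```python
# Worked arithmetic for the tiled raster's size (assumes float32 data).
tile_side = 1024        # pixels per tile edge
reps = 512              # tile repetitions along each axis
bytes_per_value = 4     # float32 (assumption taken from the text)

side = tile_side * reps                       # pixels per edge of the full raster
total_bytes = side * side * bytes_per_value   # size if fully materialized
num_chunks = reps * reps                      # assumes one chunk per tile copy
chunk_bytes = tile_side * tile_side * bytes_per_value

print(f"Side length: {side:,} px")
print(f"Total size:  {total_bytes / 1e12:.2f} TB ({total_bytes / 2**40:.0f} TiB)")
print(f"Chunks:      {num_chunks:,} of {chunk_bytes / 2**20:.0f} MiB each")
```

The total is exactly 2^40 bytes, which is 1 TiB or about 1.1 TB in decimal units, so the "1 TB" and "1.1 TB" figures describe the same array.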
{
"cell_type": "markdown",
"id": "3n94gc0t1tg",
"source": "## Preview at 1000x1000\n\n`preview()` builds a lazy coarsen-then-mean graph. Calling `.compute()` on the\nresult materializes only the 1000x1000 output -- about 4 MB.",
"source": "fig, ax = plt.subplots(figsize=(8, 8))\nsmall.plot(ax=ax, cmap=\"terrain\")\nax.set_title(f\"1000x1000 preview of a {big_dask.nbytes / 1e12:.1f} TB raster\")\nax.set_aspect(\"equal\")\nplt.tight_layout()",
"metadata": {},
"execution_count": null,
"outputs": []
},
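To make the coarsen-then-mean idea concrete, here is a minimal NumPy sketch of block averaging on a small in-memory array. It illustrates the general technique only and is not the actual `preview()` implementation; the helper name `block_mean` and its edge-trimming behaviour are assumptions made for this example.

```python
import numpy as np


def block_mean(arr, factor):
    """Downsample a 2D array by averaging non-overlapping factor x factor blocks.

    Hypothetical helper for illustration; trailing rows/columns that do not
    fill a whole block are trimmed.
    """
    rows = (arr.shape[0] // factor) * factor
    cols = (arr.shape[1] // factor) * factor
    trimmed = arr[:rows, :cols]
    # Split each axis into (n_blocks, factor) and average over the factor axes.
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


demo = np.arange(16, dtype=np.float32).reshape(4, 4)
thumb = block_mean(demo, 2)
print(thumb)
```

Each output pixel depends only on the input block it covers, which is what lets a chunked, lazy version reduce one chunk at a time instead of holding the whole raster in memory.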
{
"cell_type": "markdown",
"id": "nrbcb74q9oa",
"source": "## Different preview sizes\n\nYou can control both width and height. Omitting height preserves the aspect ratio.",
"metadata": {}
},
{
"cell_type": "code",
"id": "mqzjqxdvj4",
"source": "fig, axes = plt.subplots(1, 3, figsize=(14, 4))\nfor ax, w in zip(axes, [100, 500, 2000]):\n    p = preview(big, width=w).compute()\n    p.plot(ax=ax, cmap=\"terrain\", add_colorbar=False)\n    ax.set_title(f\"{p.shape[0]}x{p.shape[1]} ({p.nbytes / 1e6:.1f} MB)\")\n    ax.set_aspect(\"equal\")\nplt.tight_layout()",
"metadata": {},
"execution_count": null,
"outputs": []
},
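One plausible way to derive the missing dimension when only `width` is given is to scale the row count by the source aspect ratio. The sketch below is an assumption about the behaviour described in this section, not code taken from `xrspatial`; `infer_height` is a hypothetical name.

```python
def infer_height(src_rows, src_cols, width):
    """Return a height that keeps the source aspect ratio (hypothetical helper)."""
    # Scale the row count by the same factor applied to the columns,
    # and never return fewer than one row.
    return max(1, round(src_rows * width / src_cols))


# The square 524,288 x 524,288 raster keeps a square preview at every width.
for w in (100, 500, 2000):
    print(w, infer_height(524_288, 524_288, w))

# A raster twice as wide as it is tall gives a preview half as tall as it is wide.
print(infer_height(1000, 2000, 500))
```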
{
"cell_type": "markdown",
"id": "82h89j8n7em",
"source": "## Accessor syntax\n\nYou can also call `preview` directly on a DataArray or Dataset via the `.xrs` accessor.",
"metadata": {}
},
{
"cell_type": "code",
"id": "jastfcpb3i",
"source": "# Accessor on a DataArray\nsmall = big.xrs.preview(width=500).compute()\nprint(f\"DataArray accessor: {small.shape}\")\n\n# Accessor on a Dataset\nds = xr.Dataset({\"elevation\": big, \"slope_proxy\": big * 0.1})\nsmall_ds = ds.xrs.preview(width=500)\nfor name, var in small_ds.data_vars.items():\n    print(f\"Dataset var '{name}': {var.shape}\")",