Skip to content

Commit 9b83984

Browse files
committed
Merge origin/master, resolve conflicts with all_touched boundary drawing from #996
2 parents fae1f4a + 4125b34 commit 9b83984

File tree

3 files changed

+1132
-15
lines changed

3 files changed

+1132
-15
lines changed

examples/user_guide/25_GLCM_Texture.ipynb

Lines changed: 296 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,14 @@
2525
"cell_type": "code",
2626
"execution_count": null,
2727
"id": "3a06tk9sxuh",
28-
"metadata": {},
28+
"metadata": {
29+
"execution": {
30+
"iopub.execute_input": "2026-03-09T15:30:49.123715Z",
31+
"iopub.status.busy": "2026-03-09T15:30:49.123629Z",
32+
"iopub.status.idle": "2026-03-09T15:30:50.272830Z",
33+
"shell.execute_reply": "2026-03-09T15:30:50.272259Z"
34+
}
35+
},
2936
"outputs": [],
3037
"source": [
3138
"import numpy as np\n",
@@ -48,7 +55,14 @@
4855
"cell_type": "code",
4956
"execution_count": null,
5057
"id": "mqtgqmsnvlp",
51-
"metadata": {},
58+
"metadata": {
59+
"execution": {
60+
"iopub.execute_input": "2026-03-09T15:30:50.274712Z",
61+
"iopub.status.busy": "2026-03-09T15:30:50.274450Z",
62+
"iopub.status.idle": "2026-03-09T15:30:50.338586Z",
63+
"shell.execute_reply": "2026-03-09T15:30:50.338001Z"
64+
}
65+
},
5266
"outputs": [],
5367
"source": [
5468
"# Build a 100x100 raster with four texture quadrants\n",
@@ -93,7 +107,14 @@
93107
"cell_type": "code",
94108
"execution_count": null,
95109
"id": "2fgm4xz3uwj",
96-
"metadata": {},
110+
"metadata": {
111+
"execution": {
112+
"iopub.execute_input": "2026-03-09T15:30:50.339809Z",
113+
"iopub.status.busy": "2026-03-09T15:30:50.339694Z",
114+
"iopub.status.idle": "2026-03-09T15:30:51.046768Z",
115+
"shell.execute_reply": "2026-03-09T15:30:51.046259Z"
116+
}
117+
},
97118
"outputs": [],
98119
"source": [
99120
"contrast = glcm_texture(agg, metric='contrast', window_size=7, levels=64)\n",
@@ -122,7 +143,14 @@
122143
"cell_type": "code",
123144
"execution_count": null,
124145
"id": "7ah4bm6v5ut",
125-
"metadata": {},
146+
"metadata": {
147+
"execution": {
148+
"iopub.execute_input": "2026-03-09T15:30:51.047924Z",
149+
"iopub.status.busy": "2026-03-09T15:30:51.047834Z",
150+
"iopub.status.idle": "2026-03-09T15:30:51.568614Z",
151+
"shell.execute_reply": "2026-03-09T15:30:51.567915Z"
152+
}
153+
},
126154
"outputs": [],
127155
"source": [
128156
"metrics = ['contrast', 'dissimilarity', 'homogeneity', 'energy', 'correlation', 'entropy']\n",
@@ -160,7 +188,14 @@
160188
"cell_type": "code",
161189
"execution_count": null,
162190
"id": "nkzw9wmyxio",
163-
"metadata": {},
191+
"metadata": {
192+
"execution": {
193+
"iopub.execute_input": "2026-03-09T15:30:51.570036Z",
194+
"iopub.status.busy": "2026-03-09T15:30:51.569932Z",
195+
"iopub.status.idle": "2026-03-09T15:30:51.920501Z",
196+
"shell.execute_reply": "2026-03-09T15:30:51.919825Z"
197+
}
198+
},
164199
"outputs": [],
165200
"source": [
166201
"fig, axes = plt.subplots(1, 4, figsize=(16, 4))\n",
@@ -188,7 +223,14 @@
188223
"cell_type": "code",
189224
"execution_count": null,
190225
"id": "keurc3dmugs",
191-
"metadata": {},
226+
"metadata": {
227+
"execution": {
228+
"iopub.execute_input": "2026-03-09T15:30:51.921620Z",
229+
"iopub.status.busy": "2026-03-09T15:30:51.921522Z",
230+
"iopub.status.idle": "2026-03-09T15:30:52.045363Z",
231+
"shell.execute_reply": "2026-03-09T15:30:52.044567Z"
232+
}
233+
},
192234
"outputs": [],
193235
"source": [
194236
"import dask.array as da\n",
@@ -228,6 +270,254 @@
228270
"\n",
229271
"Lower `levels` values run faster but lose gray-level resolution. For most remote sensing work, 32-64 levels is a good balance."
230272
]
273+
},
274+
{
275+
"cell_type": "markdown",
276+
"id": "r8ezsmezbt9",
277+
"source": [
278+
"## Ok, how do we apply this to GIS?\n",
279+
"\n",
280+
"The synthetic quadrants above make the math obvious, but the real payoff is on actual imagery. Water and land have very different spatial textures in satellite data: water is smooth and uniform, land is rough and varied. GLCM features capture that difference even when raw pixel brightness alone is ambiguous (shadows, mudflats, dark pavement).\n",
281+
"\n",
282+
"Below we'll grab a Sentinel-2 NIR band crop of the San Francisco Bay coastline, compute GLCM texture features, and use KMeans clustering to classify water vs. land \u2014 no training labels required."
283+
],
284+
"metadata": {}
285+
},
286+
{
287+
"cell_type": "markdown",
288+
"id": "ec79xdunce9",
289+
"source": [
290+
"### Step 1 \u2014 Download a Sentinel-2 NIR band\n",
291+
"\n",
292+
"We read a 500 x 500 pixel window (5 km x 5 km at 10 m resolution) straight from a\n",
293+
"Cloud-Optimized GeoTIFF hosted on AWS. The scene is\n",
294+
"[S2B_10SEG_20230921](https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/10/S/EG/2023/9/S2B_10SEG_20230921_0_L2A/),\n",
295+
"a near-cloudless capture of the Bay Area from September 2023.\n",
296+
"\n",
297+
"If the remote file is unreachable, a synthetic coastal raster is used instead\n",
298+
"so the rest of the notebook still runs."
299+
],
300+
"metadata": {}
301+
},
302+
{
303+
"cell_type": "code",
304+
"id": "2hogpjtllw7",
305+
"source": [
306+
"import os\n",
307+
"import rioxarray\n",
308+
"\n",
309+
"os.environ['AWS_NO_SIGN_REQUEST'] = 'YES'\n",
310+
"os.environ['GDAL_DISABLE_READDIR_ON_OPEN'] = 'EMPTY_DIR'\n",
311+
"\n",
312+
"COG_URL = (\n",
313+
" 'https://sentinel-cogs.s3.us-west-2.amazonaws.com/'\n",
314+
" 'sentinel-s2-l2a-cogs/10/S/EG/2023/9/'\n",
315+
" 'S2B_10SEG_20230921_0_L2A/B08.tif'\n",
316+
")\n",
317+
"\n",
318+
"try:\n",
319+
" nir_da = rioxarray.open_rasterio(COG_URL).isel(band=0, y=slice(2100, 2600), x=slice(5300, 5800))\n",
320+
" nir = nir_da.load().values.astype(np.float64)\n",
321+
" print(f'Downloaded NIR band: {nir.shape}, range {nir.min():.0f}\u2013{nir.max():.0f}')\n",
322+
"except Exception as exc:\n",
323+
" print(f'Remote read failed ({exc}), using synthetic fallback')\n",
324+
" rng_sat = np.random.default_rng(99)\n",
325+
" nir = np.zeros((500, 500), dtype=np.float64)\n",
326+
" # Water: low reflectance, smooth\n",
327+
" nir[:, 250:] = rng_sat.normal(80, 10, (500, 250)).clip(20, 200)\n",
328+
" # Land: high reflectance, rough\n",
329+
" nir[:, :250] = rng_sat.normal(1800, 400, (500, 250)).clip(300, 4000)\n",
330+
"\n",
331+
"satellite = xr.DataArray(nir, dims=['y', 'x'])\n",
332+
"\n",
333+
"fig, ax = plt.subplots(figsize=(6, 6))\n",
334+
"ax.imshow(nir, cmap='gray', vmin=0, vmax=np.percentile(nir, 98))\n",
335+
"ax.set_title('Sentinel-2 NIR \u2014 SF Bay coastline')\n",
336+
"plt.tight_layout()\n",
337+
"plt.show()"
338+
],
339+
"metadata": {
340+
"execution": {
341+
"iopub.execute_input": "2026-03-09T15:30:52.046807Z",
342+
"iopub.status.busy": "2026-03-09T15:30:52.046700Z",
343+
"iopub.status.idle": "2026-03-09T15:30:53.261434Z",
344+
"shell.execute_reply": "2026-03-09T15:30:53.260895Z"
345+
}
346+
},
347+
"execution_count": null,
348+
"outputs": []
349+
},
350+
{
351+
"cell_type": "markdown",
352+
"id": "joxz7n8olpc",
353+
"source": [
354+
"### Step 2 \u2014 Compute GLCM texture features\n",
355+
"\n",
356+
"We pick four metrics that tend to separate water (uniform, high energy, high homogeneity) from land (rough, high contrast):\n",
357+
"\n",
358+
"| Metric | Water (smooth) | Land (rough) |\n",
359+
"|---|---|---|\n",
360+
"| contrast | low | high |\n",
361+
"| homogeneity | high | low |\n",
362+
"| energy | high | low |\n",
363+
"| correlation | high (uniform) | lower (varied) |\n",
364+
"\n",
365+
"An 11 x 11 window gives enough spatial context at 10 m resolution (110 m footprint)."
366+
],
367+
"metadata": {}
368+
},
369+
{
370+
"cell_type": "code",
371+
"id": "ytdru4ssilp",
372+
"source": [
373+
"sat_metrics = ['contrast', 'homogeneity', 'energy', 'correlation']\n",
374+
"sat_textures = glcm_texture(satellite, metric=sat_metrics, window_size=11, levels=64)\n",
375+
"\n",
376+
"fig, axes = plt.subplots(1, 4, figsize=(18, 4))\n",
377+
"for ax, name in zip(axes, sat_metrics):\n",
378+
" vals = sat_textures.sel(metric=name).values\n",
379+
" im = ax.imshow(vals, cmap='viridis')\n",
380+
" ax.set_title(name.capitalize())\n",
381+
" plt.colorbar(im, ax=ax, shrink=0.7)\n",
382+
"plt.suptitle('GLCM features on satellite NIR', fontsize=13, y=1.02)\n",
383+
"plt.tight_layout()\n",
384+
"plt.show()"
385+
],
386+
"metadata": {
387+
"execution": {
388+
"iopub.execute_input": "2026-03-09T15:30:53.262766Z",
389+
"iopub.status.busy": "2026-03-09T15:30:53.262453Z",
390+
"iopub.status.idle": "2026-03-09T15:30:58.515207Z",
391+
"shell.execute_reply": "2026-03-09T15:30:58.514628Z"
392+
}
393+
},
394+
"execution_count": null,
395+
"outputs": []
396+
},
397+
{
398+
"cell_type": "markdown",
399+
"id": "xya2v71uhb",
400+
"source": [
401+
"### Step 3 \u2014 Classify water vs. land with KMeans\n",
402+
"\n",
403+
"We stack the four texture layers into a feature vector per pixel, normalize them,\n",
404+
"and let KMeans find two clusters. No training labels needed \u2014 the texture\n",
405+
"difference between water and land is large enough that unsupervised clustering\n",
406+
"picks it up cleanly.\n",
407+
"\n",
408+
"After clustering, we check which cluster has lower mean NIR reflectance and\n",
409+
"label that one \"water.\""
410+
],
411+
"metadata": {}
412+
},
413+
{
414+
"cell_type": "code",
415+
"id": "keramnxr509",
416+
"source": [
417+
"from sklearn.cluster import KMeans\n",
418+
"from sklearn.preprocessing import StandardScaler\n",
419+
"\n",
420+
"# Stack texture features: (H, W, 4)\n",
421+
"feature_stack = np.stack(\n",
422+
" [sat_textures.sel(metric=m).values for m in sat_metrics], axis=-1\n",
423+
")\n",
424+
"h, w, n_feat = feature_stack.shape\n",
425+
"\n",
426+
"# Mask out NaN edges left by the GLCM window\n",
427+
"valid = ~np.any(np.isnan(feature_stack), axis=-1)\n",
428+
"X = feature_stack[valid]\n",
429+
"\n",
430+
"# Normalize and cluster\n",
431+
"X_scaled = StandardScaler().fit_transform(X)\n",
432+
"labels = KMeans(n_clusters=2, random_state=42, n_init=10).fit_predict(X_scaled)\n",
433+
"\n",
434+
"# Build the classification raster\n",
435+
"class_map = np.full((h, w), np.nan)\n",
436+
"class_map[valid] = labels\n",
437+
"\n",
438+
"# Assign \"water\" to the cluster with lower NIR reflectance\n",
439+
"cluster_nir = [np.nanmean(nir[class_map == c]) for c in range(2)]\n",
440+
"water_id = int(np.argmin(cluster_nir))\n",
441+
"\n",
442+
"water_mask = np.where(valid, class_map == water_id, np.nan)\n",
443+
"print(f'Water cluster: {water_id} (mean NIR {cluster_nir[water_id]:.0f})')\n",
444+
"print(f'Land cluster: {1 - water_id} (mean NIR {cluster_nir[1 - water_id]:.0f})')"
445+
],
446+
"metadata": {
447+
"execution": {
448+
"iopub.execute_input": "2026-03-09T15:30:58.517250Z",
449+
"iopub.status.busy": "2026-03-09T15:30:58.517125Z",
450+
"iopub.status.idle": "2026-03-09T15:30:59.860739Z",
451+
"shell.execute_reply": "2026-03-09T15:30:59.860156Z"
452+
}
453+
},
454+
"execution_count": null,
455+
"outputs": []
456+
},
457+
{
458+
"cell_type": "markdown",
459+
"id": "pbro14h2x4",
460+
"source": [
461+
"### Result\n",
462+
"\n",
463+
"Left: the NIR image (bright = land, dark = water). Right: the GLCM-based classification. The shoreline comes through cleanly without any hand-drawn training polygons."
464+
],
465+
"metadata": {}
466+
},
467+
{
468+
"cell_type": "code",
469+
"id": "q1zl6bioi7e",
470+
"source": [
471+
"from matplotlib.colors import ListedColormap\n",
472+
"\n",
473+
"fig, axes = plt.subplots(1, 2, figsize=(13, 6))\n",
474+
"\n",
475+
"# NIR image\n",
476+
"axes[0].imshow(nir, cmap='gray', vmin=0, vmax=np.percentile(nir, 98))\n",
477+
"axes[0].set_title('Sentinel-2 NIR band')\n",
478+
"\n",
479+
"# Classification\n",
480+
"water_cmap = ListedColormap(['#a6611a', '#2166ac']) # 0=land(brown), 1=water(blue)\n",
481+
"im = axes[1].imshow(\n",
482+
" np.where(valid, class_map == water_id, np.nan),\n",
483+
" cmap=water_cmap, vmin=0, vmax=1, interpolation='nearest',\n",
484+
")\n",
485+
"axes[1].set_title('GLCM water / land classification')\n",
486+
"\n",
487+
"# Legend\n",
488+
"from matplotlib.patches import Patch\n",
489+
"axes[1].legend(\n",
490+
" handles=[Patch(color='#2166ac', label='Water'),\n",
491+
" Patch(color='#a6611a', label='Land')],\n",
492+
" loc='lower right', framealpha=0.9,\n",
493+
")\n",
494+
"\n",
495+
"plt.tight_layout()\n",
496+
"plt.show()"
497+
],
498+
"metadata": {
499+
"execution": {
500+
"iopub.execute_input": "2026-03-09T15:30:59.862223Z",
501+
"iopub.status.busy": "2026-03-09T15:30:59.861896Z",
502+
"iopub.status.idle": "2026-03-09T15:31:00.062349Z",
503+
"shell.execute_reply": "2026-03-09T15:31:00.061769Z"
504+
}
505+
},
506+
"execution_count": null,
507+
"outputs": []
508+
},
509+
{
510+
"cell_type": "markdown",
511+
"id": "yqyikmy8is",
512+
"source": [
513+
"### Where to go from here\n",
514+
"\n",
515+
"- **More clusters.** Bump `n_clusters` to 3 or 4 to split out mud flats, urban areas, or vegetation.\n",
516+
"- **Supervised classification.** Replace KMeans with a Random Forest trained on labeled polygons for higher accuracy.\n",
517+
"- **Larger areas.** Wrap the satellite DataArray in `dask.array.from_array` and `glcm_texture` will process it chunk-by-chunk, the same way we showed in the Dask section above.\n",
518+
"- **Multi-band features.** Compute GLCM on several spectral bands and stack the results for a richer feature space."
519+
],
520+
"metadata": {}
231521
}
232522
],
233523
"metadata": {

0 commit comments

Comments
 (0)