
Commit 173e719

updated classification notebook with more reliable data source and rioxarray loading
1 parent a37d5a8 commit 173e719

File tree

1 file changed (+31, −156 lines)


examples/xarray-spatial_classification-methods.ipynb

Lines changed: 31 additions & 156 deletions
@@ -5,7 +5,7 @@
    "id": "89bdd2bf",
    "metadata": {},
    "source": [
-    "## Grand Canyon elevation classification using the NASADEM dataset"
+    "## Elevation binning using the NASADEM dataset"
    ]
   },
   {
@@ -22,7 +22,7 @@
    "2. Classifying the data using xarray-spatial's [natural breaks](https://xarray-spatial.readthedocs.io/reference/_autosummary/xrspatial.classify.natural_breaks.html), [equal interval](https://xarray-spatial.readthedocs.io/reference/_autosummary/xrspatial.classify.equal_interval.html), [quantile](https://xarray-spatial.readthedocs.io/reference/_autosummary/xrspatial.classify.quantile.html), and [reclassify](https://xarray-spatial.readthedocs.io/reference/_autosummary/xrspatial.classify.reclassify.html) functions.\n",
    "\n",
    "\n",
-    "This tutorial uses the [NASADEM](https://github.com/microsoft/AIforEarthDatasets#nasadem) dataset from the [Microsoft Planetary Computer Data Catalog](https://planetarycomputer.microsoft.com/catalog). The area of interest roughly covers the Grand Canyon National Park. The [NASADEM](https://github.com/microsoft/AIforEarthDatasets#nasadem) dataset provides global topographic data at 1 arc-second (~30m) horizontal resolution. The data is derived primarily from data captured via the [Shuttle Radar Topography Mission](https://www2.jpl.nasa.gov/srtm/) (SRTM) and is stored on Azure Storage in [cloud-optimized GeoTIFF](https://www.cogeo.org/) format.\n"
+    "The data is derived primarily from data captured via the [Shuttle Radar Topography Mission](https://www2.jpl.nasa.gov/srtm/) (SRTM) and is stored on Azure Storage in [cloud-optimized GeoTIFF](https://www.cogeo.org/) format.\n"
    ]
   },
   {
@@ -33,111 +33,6 @@
    "### 1. Load the area of interest data"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "90ef8812",
-   "metadata": {},
-   "source": [
-    "To load NASADEM data for the Grand Canyon, use the following approach described in [Accessing NASADEM data on Azure (NetCDF)](https://github.com/microsoft/AIforEarthDataSets/blob/main/data/nasadem-nc.ipynb)."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d69cc680-4ee5-4ed6-a785-31aa44173618",
-   "metadata": {},
-   "source": [
-    "First, set up the necessary constants and generate a list of all available GeoTIFF files:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "4c4c7fae",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import requests\n",
-    "\n",
-    "nasadem_blob_root = \"https://nasademeuwest.blob.core.windows.net/nasadem-cog/v001/\"\n",
-    "nasadem_file_index_url = nasadem_blob_root + \"index/nasadem_cog_list.txt\"\n",
-    "nasadem_content_extension = \".tif\"\n",
-    "nasadem_file_prefix = \"NASADEM_HGT_\"\n",
-    "nasadem_file_list = None\n",
-    "\n",
-    "nasadem_file_list = requests.get(nasadem_file_index_url).text.split(\"\\n\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "b257fbfb-889c-4686-a527-2eb9b81c63e3",
-   "metadata": {},
-   "source": [
-    "Next, define a function that selects a filename from the list generated in the previous step. This function accepts a list of latitude and longitude coordinates and returns the name of the file matching these coordinates:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0fc048ca",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import math\n",
-    "\n",
-    "\n",
-    "def get_nasadem_filename(coord):\n",
-    "    \"\"\"\n",
-    "    Get the NASADEM filename for a specified latitude and longitude.\n",
-    "    \"\"\"\n",
-    "    lat = coord[0]\n",
-    "    lon = coord[1]\n",
-    "\n",
-    "    ns_token = \"n\" if lat >= 0 else \"s\"\n",
-    "    ew_token = \"e\" if lon >= 0 else \"w\"\n",
-    "\n",
-    "    lat_index = abs(math.floor(lat))\n",
-    "    lon_index = abs(math.floor(lon))\n",
-    "\n",
-    "    lat_string = ns_token + \"{:02d}\".format(lat_index)\n",
-    "    lon_string = ew_token + \"{:03d}\".format(lon_index)\n",
-    "\n",
-    "    filename = nasadem_file_prefix + lat_string + lon_string + nasadem_content_extension\n",
-    "\n",
-    "    if filename not in nasadem_file_list:\n",
-    "        print(\"Lat/lon {},{} not available\".format(lat, lon))\n",
-    "        filename = None\n",
-    "\n",
-    "    return filename"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "c581d16d-c1a8-4cce-ad77-94b9f98ca9da",
-   "metadata": {},
-   "source": [
-    "Finally, use the function defined above to generate a URL pointing to the geodata for the Grand Canyon area:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "074aa0f3",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "grand_canyon_coord = [36.101690, -112.107676]\n",
-    "\n",
-    "url = nasadem_blob_root + get_nasadem_filename(grand_canyon_coord)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "9149231c",
-   "metadata": {},
-   "source": [
-    "After retrieving the raw data for the Grand Canyon, use xarray's [open_rasterio](http://xarray.pydata.org/en/stable/generated/xarray.open_rasterio.html) function to load the data into an array:"
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -146,8 +41,16 @@
    "outputs": [],
    "source": [
    "import xarray as xr\n",
+    "import rioxarray\n",
+    "\n",
+    "file_name = 'data/colorado_merge_3arc_resamp.tif'\n",
+    "raster = rioxarray.open_rasterio(file_name).sel(band=1).drop_vars('band')\n",
+    "raster.name = 'Colorado Elevation Raster'\n",
    "\n",
-    "img_arr = xr.open_rasterio(url).squeeze().drop(\"band\")"
+    "xmin, xmax = raster.x.data.min(), raster.x.data.max()\n",
+    "ymin, ymax = raster.y.data.min(), raster.y.data.max()\n",
+    "\n",
+    "xmin, xmax, ymin, ymax"
    ]
   },
   {
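The replacement cell relies on rioxarray's convention that `open_rasterio` returns a `(band, y, x)` DataArray, which `.sel(band=1).drop_vars('band')` reduces to a 2-D grid. Since the local `data/colorado_merge_3arc_resamp.tif` file is not reproduced here, a minimal sketch of that dimension handling with a hand-built DataArray standing in for the opened raster:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the (band, y, x) DataArray that
# rioxarray.open_rasterio returns for a single-band GeoTIFF.
data = np.arange(12, dtype="float32").reshape(1, 3, 4)
raster = xr.DataArray(
    data,
    dims=("band", "y", "x"),
    coords={
        "band": [1],
        "y": [40.0, 39.0, 38.0],
        "x": [-109.0, -108.0, -107.0, -106.0],
    },
)

# Same reduction as the new cell: select band 1, drop the scalar coord.
raster = raster.sel(band=1).drop_vars("band")
raster.name = "Colorado Elevation Raster"

# Bounding box, as computed at the end of the cell.
xmin, xmax = raster.x.data.min(), raster.x.data.max()
ymin, ymax = raster.y.data.min(), raster.y.data.max()
print(raster.dims, (xmin, xmax, ymin, ymax))
```

The `drop_vars('band')` step matters because xarray-spatial's classify functions expect a 2-D aggregate; a leftover `band` coordinate is harmless but noisy in plots.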
@@ -157,7 +60,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "img_arr.plot.imshow(figsize=(15, 10));"
+    "raster.plot.imshow(figsize=(15, 10));"
    ]
   },
   {
@@ -195,7 +98,7 @@
    "import matplotlib.pyplot as plt\n",
    "from xrspatial.classify import natural_breaks\n",
    "\n",
-    "natural_breaks_agg = natural_breaks(img_arr, num_sample=20000, k=15)\n",
+    "natural_breaks_agg = natural_breaks(raster, num_sample=20000, k=15)\n",
    "\n",
    "shade(natural_breaks_agg, cmap=plt.get_cmap(\"terrain\"), how=\"linear\")"
    ]
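Only the input name changes in this hunk. As background, natural (Jenks-style) breaks choose class edges that minimize within-class variance; the following is a brute-force 1-D sketch for k=2 classes, not xarray-spatial's actual implementation:

```python
import numpy as np

def natural_breaks_k2(values):
    """Brute-force Jenks-style split of values into 2 classes,
    minimizing the total within-class sum of squared deviations."""
    v = np.sort(np.asarray(values, dtype=float))
    best_split, best_cost = 1, np.inf
    for i in range(1, len(v)):
        left, right = v[:i], v[i:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_split, best_cost = i, cost
    return v[best_split - 1]  # upper edge of the lower class

# Two well-separated clusters: the break falls between them.
edge = natural_breaks_k2([1, 2, 3, 100, 101, 102])
print(edge)  # -> 3.0
```

The real function samples the data (`num_sample=20000` above) because exact Jenks optimization is expensive on full-resolution rasters.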
@@ -233,7 +136,7 @@
    "source": [
    "from xrspatial.classify import equal_interval\n",
    "\n",
-    "equal_interval_agg = equal_interval(img_arr, k=15)\n",
+    "equal_interval_agg = equal_interval(raster, k=15)\n",
    "\n",
    "shade(equal_interval_agg, cmap=plt.get_cmap(\"terrain\"), how=\"linear\")"
    ]
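`equal_interval` splits the data range into k equal-width bins and labels each pixel with its bin index. A numpy sketch of that binning logic (illustrative only, not xarray-spatial's code):

```python
import numpy as np

def equal_interval_sketch(values, k):
    """Label each value with the index of its equal-width bin."""
    values = np.asarray(values, dtype=float)
    # k classes need k - 1 interior edges between min and max.
    edges = np.linspace(values.min(), values.max(), k + 1)[1:-1]
    return np.digitize(values, edges)

labels = equal_interval_sketch([0, 1, 5, 9, 10], k=2)
print(labels)  # -> [0 0 1 1 1]
```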
@@ -263,7 +166,7 @@
    "source": [
    "from xrspatial.classify import quantile\n",
    "\n",
-    "quantile_agg = quantile(img_arr, k=15)\n",
+    "quantile_agg = quantile(raster, k=15)\n",
    "\n",
    "shade(quantile_agg, cmap=plt.get_cmap(\"terrain\"), how=\"linear\")"
    ]
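Unlike equal intervals, `quantile` places roughly the same number of pixels in each class by using quantiles of the data as bin edges. A numpy sketch of that idea (again illustrative, not the library's implementation):

```python
import numpy as np

def quantile_sketch(values, k):
    """Label each value by quantile-based class, ~equal counts per class."""
    values = np.asarray(values, dtype=float)
    # Interior quantile edges: k classes need k - 1 cut points.
    edges = np.quantile(values, np.linspace(0, 1, k + 1)[1:-1])
    return np.digitize(values, edges)

labels = quantile_sketch([1, 2, 3, 4, 5, 6, 7, 8], k=4)
print(labels)  # -> [0 0 1 1 2 2 3 3], two values per class
```

On skewed elevation data this evens out the color distribution, at the cost of class widths that no longer correspond to fixed elevation ranges.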
@@ -297,60 +200,32 @@
    "new_vals = [val if val > 2500 else 0 for val in bins]\n",
    "\n",
    "reclass_agg = reclassify(\n",
-    "    agg=img_arr,\n",
+    "    agg=raster,\n",
    "    bins=bins,\n",
    "    new_values=new_vals,\n",
    ")\n",
    "\n",
    "shade(reclass_agg, cmap=plt.get_cmap(\"terrain\"), how=\"linear\")"
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "bd7822fe",
-   "metadata": {},
-   "source": [
-    "## Next steps: classify different datasets"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "cea4d59c",
-   "metadata": {},
-   "source": [
-    "The [Microsoft Planetary Computer Data Catalog](https://planetarycomputer.microsoft.com/catalog) includes petabytes of environmental monitoring data. All data sets are available in consistent, analysis-ready formats. You can access them through APIs as well as directly via [Azure Storage](https://docs.microsoft.com/en-us/azure/storage/). \n",
-    "\n",
-    "Try using [xarray-spatial's](https://xarray-spatial.readthedocs.io/index.html) classification methods with these datasets:"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5bb5b5a0",
-   "metadata": {},
-   "source": [
-    "<div style=\"width: 100%; overflow: hidden;\">\n",
-    "    <div style=\"width: 50%; float: left;\"> \n",
-    "    \n",
-    "    <center><img src=\"https://ai4edatasetspublicassets.blob.core.windows.net/assets/pc_thumbnails/additional_datasets/RE3CbUs.jpg\" /></center>\n",
-    "<br>\n",
-    "<center><font size=\"5\">Daymet</font>\n",
-    "<center><font size=\"2\">Gridded temperature data across North America</font>\n",
-    "<center><a href=\"http://aka.ms/ai4edata-daymet\" target=\"_blank\">Get Daymet temperature data</a>\n",
-    "    </div>\n",
-    "    <div style=\"margin-left: 50%;\"> \n",
-    "    <center><img src=\"https://ai4edatasetspublicassets.blob.core.windows.net/assets/pc_thumbnails/additional_datasets/gbif.jpg\" /><center>\n",
-    "<br>\n",
-    "<center><font size=\"5\">GBIF</font>\n",
-    "<center><font size=\"2\">Species occurrences shared through the Global Biodiversity Information Facility</font>\n",
-    "<center><a href=\"http://aka.ms/ai4edata-gbif/\" target=\"_blank\">Get GBIF occurrence data</a>\n",
-    "    </div>\n",
-    "</div>"
-   ]
   }
  ],
  "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
   "language_info": {
-   "name": "python"
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.14.2"
   }
  },
  "nbformat": 4,
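The reclassify cell in the diff maps each input bin to a new value, zeroing everything at or below 2500. xarray-spatial's `reclassify` operates on DataArrays; the numpy sketch below shows just the bin-remapping logic under the assumption that each value takes the new value of the first bin edge at or above it:

```python
import numpy as np

def reclassify_sketch(values, bins, new_values):
    """Map each value to new_values[i] for the first bin edge >= the value."""
    values = np.asarray(values, dtype=float)
    # right=True: index of the first bin whose upper edge is >= the value.
    idx = np.digitize(values, bins, right=True)
    idx = np.clip(idx, 0, len(new_values) - 1)
    return np.asarray(new_values, dtype=float)[idx]

bins = [1000, 2500, 4000]
new_vals = [val if val > 2500 else 0 for val in bins]  # -> [0, 0, 4000]
out = reclassify_sketch([500, 2000, 3000], bins, new_vals)
print(out)  # -> [0. 0. 4000.]
```

This mirrors the notebook's pattern of building `new_vals` from `bins` with a comprehension, so only classes above the 2500 threshold keep a nonzero value.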
