Commit 2e5cf73

DOC: Contributing instructions, BytesIO, switch from rasterio, GDAL config, & external courses (#910)
- Update Contributing page with building documentation section
- Add documentation about writing into BytesIO #668, solves also #584 about writing raster in the cloud
- Add a page dedicated to the switch from rasterio to rioxarray, and notably equivalences between rasterio functions and rioxarray methods
- Add a page about GDAL configuration options #693
- Add and initialize external courses toctree
  - Carpentries courses #279
  - Geospatial Python courses by Earth Lab at University of Colorado, Boulder
1 parent 74b9878 commit 2e5cf73

7 files changed

Lines changed: 222 additions & 2 deletions

CONTRIBUTING.rst

Lines changed: 26 additions & 0 deletions
@@ -126,6 +126,32 @@ This assumes you have cloned the rioxarray repository and are in the base folder
     'source /venv/bin/activate && python -m pytest'
 
 
+Build documentation
+-------------------
+
+This assumes you have cloned the rioxarray repository, installed the rioxarray conda environment, and are in the base folder.
+
+1. Install documentation libraries
+
+.. code-block:: bash
+
+    conda activate rioxarray
+    conda install -c conda-forge pandoc
+    pip install -e . --group doc
+
+2. Build the documentation
+
+.. code-block:: bash
+
+    make docs
+
+If you are on Windows or don't have ``make`` installed:
+
+.. code-block:: bash
+
+    sphinx-build -b html docs/ docs/_build/
+
+
 Pull Request Guidelines
 -----------------------
 

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@
 
 # General information about the project.
 project = "rioxarray"
-copyright = "2019-2025, Corteva Agriscience™"
+copyright = "2019-2026, Corteva Agriscience™"
 author = "rioxarray Contributors"
 
 # The version info for the project you're documenting, acts as replacement

docs/examples/convert_to_raster.ipynb

Lines changed: 23 additions & 1 deletion
@@ -632,7 +632,7 @@
     "    \"planet_scope_tiled.tif\",\n",
     "    tiled=True, # GDAL: By default striped TIFF files are created. This option can be used to force creation of tiled TIFF files.\n",
     "    windowed=True, # rioxarray: read & write one window at a time\n",
-    ") "
+    ")"
    ]
   },
   {
@@ -651,6 +651,28 @@
    "source": [
     "!rio info planet_scope_tiled.tif"
    ]
+  },
+  {
+   "metadata": {},
+   "cell_type": "markdown",
+   "source": [
+    "## Write your raster into a `BytesIO` object\n",
+    "\n",
+    "As recommended in the [Planetary Computer documentation](https://planetarycomputer.microsoft.com/docs/quickstarts/storage/#Write-to-Azure-Blob-Storage), your raster can be written directly into a `BytesIO` object. It can then be uploaded to your favorite cloud storage."
+   ]
+  },
+  {
+   "metadata": {},
+   "cell_type": "code",
+   "source": [
+    "import io\n",
+    "\n",
+    "with io.BytesIO() as buffer:\n",
+    "    rds.rio.to_raster(buffer, driver=\"COG\")\n",
+    "    buffer.seek(0)"
+   ],
+   "outputs": [],
+   "execution_count": null
   }
  ],
  "metadata": {
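
The notebook cell above rewinds the buffer and then leaves the ``with`` block, which closes it. Any upload therefore has to read the bytes while the block is still open. The following is a minimal sketch of that ordering using only the standard library: ``write_fake_raster`` stands in for ``rds.rio.to_raster(buffer, driver="COG")`` and ``upload`` is a hypothetical placeholder for a cloud SDK call; neither name comes from rioxarray.

```python
import io


def write_fake_raster(target):
    """Stand-in for ``rds.rio.to_raster(target, driver="COG")`` (hypothetical helper)."""
    target.write(b"II*\x00 raster bytes")


def upload(data: bytes) -> int:
    """Hypothetical upload function; a real one would call a cloud storage SDK."""
    return len(data)


with io.BytesIO() as buffer:
    write_fake_raster(buffer)
    buffer.seek(0)  # rewind: writing left the position at the end of the buffer
    payload = buffer.read()  # read while the buffer is still open
    uploaded = upload(payload)

# After the ``with`` block the buffer is closed and can no longer be read.
print(uploaded, buffer.closed)
```

Keeping the upload inside the ``with`` block avoids a ``ValueError`` from reading a closed buffer; copying the bytes out with ``buffer.getvalue()`` before the block ends would work as well.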

docs/examples/examples.rst

Lines changed: 8 additions & 0 deletions
@@ -36,3 +36,11 @@ This page contains links to a collection of examples of how to use rioxarray.
    How to mask NetCDF time series data from a shapefile in Python? <https://gis.stackexchange.com/a/354798/144357>
    Extract data from raster at a point <https://gis.stackexchange.com/a/358058/144357>
    Convert raster to CSV with lat, lon, and value columns <https://gis.stackexchange.com/a/358057/144357>
+
+
+.. toctree::
+   :maxdepth: 1
+   :caption: External courses:
+
+   Geospatial Python courses by The Carpentries <https://carpentries-incubator.github.io/geospatial-python/06-raster-intro.html>
+   Geospatial Python courses by Earth Lab at University of Colorado, Boulder <https://earthdatascience.org/courses/use-data-open-source-python/intro-raster-data-python/>

docs/getting_started/gdal_env.rst

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+.. _gdal_env:
+
+Configure GDAL environment
+==========================
+
+``rioxarray`` relies on ``rasterio``, so the GDAL environment is configured the same way. See ``rasterio``'s `documentation <https://rasterio.readthedocs.io/en/latest/topics/configuration.html#rasterio>`__ for details.
+
+Configuring the GDAL environment is particularly useful when working with cloud-stored data.
+Development Seed's recommended configuration for TiTiler is described `here <https://developmentseed.org/titiler/advanced/performance_tuning/#recommended-configuration-for-dynamic-tiling>`__.
+
+With Dask clusters
+~~~~~~~~~~~~~~~~~~
+
+When setting up a Dask cluster, be sure to pass the GDAL environment (and AWS session) to every worker.
+First create a function that sets the environment variables, then run it on every worker.
+
+.. code-block:: python
+
+    import os
+    from dask.distributed import Client
+
+
+    def set_env():
+        # Set these on the Dask workers so that processes=True works with cloud storage
+        os.environ["AWS_S3_ENDPOINT"] = os.getenv("AWS_S3_ENDPOINT")
+        os.environ["AWS_S3_AWS_ACCESS_KEY_ID"] = os.getenv("AWS_S3_AWS_ACCESS_KEY_ID")
+        os.environ["AWS_S3_AWS_SECRET_ACCESS_KEY"] = os.getenv(
+            "AWS_S3_AWS_SECRET_ACCESS_KEY"
+        )
+        os.environ["CPL_VSIL_CURL_ALLOWED_EXTENSIONS"] = ".vrt"
+
+
+    # Run the client
+    with Client(processes=True) as client:
+
+        # Propagate the env variables
+        client.run(set_env)
+        ...
+
+
+.. note::
+
+    There are gotchas with the environment and Dask workers; see this `discussion <https://github.com/corteva/rioxarray/discussions/630>`__ for details.
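
Because ``set_env`` in the example above runs on the worker, ``os.getenv`` reads the worker's own environment there, and assigning its result to ``os.environ`` raises ``TypeError`` whenever a variable is unset. One possible variant, sketched below under the assumption that the variables are defined on the client, reads the values on the client and passes them to the workers as an argument. ``collect_env`` and ``ENV_NAMES`` are hypothetical names introduced for this sketch; the variable names are the ones used above.

```python
import os

# Variable names taken from the example above.
ENV_NAMES = (
    "AWS_S3_ENDPOINT",
    "AWS_S3_AWS_ACCESS_KEY_ID",
    "AWS_S3_AWS_SECRET_ACCESS_KEY",
)


def collect_env(names=ENV_NAMES):
    """Read the variables in the current (client) process, skipping unset ones."""
    return {name: os.environ[name] for name in names if name in os.environ}


def set_env(env):
    """Apply previously collected variables in the process that runs this function."""
    os.environ.update(env)
    os.environ["CPL_VSIL_CURL_ALLOWED_EXTENSIONS"] = ".vrt"


# Local demonstration; with a running cluster the call would instead be:
#     with Client(processes=True) as client:
#         client.run(set_env, collect_env())
set_env(collect_env())
```

``Client.run`` forwards extra positional arguments to the function, so the dictionary built on the client is what each worker applies.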

docs/getting_started/getting_started.rst

Lines changed: 2 additions & 0 deletions
@@ -79,6 +79,8 @@ Introductory Information
 .. toctree::
    :maxdepth: 1
 
+   switching_from_rasterio
    crs_management.ipynb
    nodata_management.ipynb
    manage_information_loss.ipynb
+   gdal_env

docs/getting_started/switching_from_rasterio.rst

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+.. _switching_from_rasterio:
+
+Switching from ``rasterio``
+===========================
+
+Reasons to switch from ``rasterio`` to ``rioxarray``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Switching from ``rasterio`` to ``rioxarray`` usually means you already work with rasters and need to adapt your code to ``xarray``.
+
+``xarray`` is a powerful abstraction of both the raster dataset and the raster array. Uniting these two notions in a single object has many advantages: functions become simpler to use because they read attributes stored on the object instead of taking them as arguments.
+
+``xarray`` also comes with many useful built-in functions and can leverage several backends to replace ``numpy`` where it is limiting (out-of-memory computation, running code on clusters, on GPU...). Dask is one of the best known. ``rioxarray`` handles some basic ``dask`` features in I/O (see `Dask I/O example <https://corteva.github.io/rioxarray/html/examples/dask_read_write.html>`__) but is not designed to support ``dask`` in more advanced functions such as reproject.
+
+Beware, ``xarray`` also comes with gotchas! You can see some of them in `the dedicated section <https://corteva.github.io/rioxarray/html/getting_started/manage_information_loss.html>`__.
+
+
+.. note::
+
+    A ``rasterio`` Dataset and an ``xarray`` Dataset are two completely different things! Please be careful with these overlapping names.
+
+Equivalences between ``rasterio`` and ``rioxarray``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To ease the switch from ``rasterio`` to ``rioxarray``, here are tables of the commonly used parameters and functions.
+
+``ds`` stands for a ``rasterio`` Dataset.
+
+Profile
+-------
+
+Here are the parameters that you can derive from a ``rasterio`` Dataset's profile:
+
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| ``rasterio`` from ``ds.profile``             | ``rioxarray`` from DataArray                                                                                          |
++==============================================+=======================================================================================================================+
+| :attr:`blockxsize`                           | :attr:`.encoding["preferred_chunks"]["x"]`                                                                            |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`blockysize`                           | :attr:`.encoding["preferred_chunks"]["y"]`                                                                            |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`compress`                             | *Unused in rioxarray*                                                                                                 |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.count`     | :attr:`rio.count <rioxarray.rioxarray.XRasterBase.count>`                                                             |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.crs`       | :attr:`rio.crs <rioxarray.rioxarray.XRasterBase.crs>`                                                                 |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.driver`    | Unused in rioxarray                                                                                                   |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.dtype`     | :attr:`.encoding["rasterio_dtype"]`                                                                                   |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.height`    | :attr:`rio.height <rioxarray.rioxarray.XRasterBase.height>`                                                           |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`interleave`                           | Unused in rioxarray                                                                                                   |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.nodata`    | :attr:`rio.nodata <rioxarray.raster_array.RasterArray.nodata>` (or `encoded_nodata <nodata_management.html>`_ )       |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`tiled`                                | Unused in rioxarray                                                                                                   |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.transform` | :func:`rio.transform() <rioxarray.rioxarray.XRasterBase.transform>`                                                   |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.width`     | :attr:`rio.width <rioxarray.rioxarray.XRasterBase.width>`                                                             |
++----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
+
+The values not used in ``rioxarray`` result from the abstraction of the dataset in ``xarray``: a dataset is no longer tied to a file on disk, even if it was read from one. The driver and other file-related notions are meaningless in this context.
+
+Other dataset parameters
+------------------------
+
++----------------------------------------------------------------------------------------+-------------------------------------------------------------------+
+| ``rasterio`` from ``ds``                                                               | ``rioxarray`` from DataArray                                      |
++========================================================================================+===================================================================+
+| :attr:`~rasterio.io.DatasetReader.gcps` or :func:`~rasterio.io.DatasetReader.get_gcps` | :func:`rio.get_gcps() <rioxarray.rioxarray.XRasterBase.get_gcps>` |
++----------------------------------------------------------------------------------------+-------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.rpcs`                                                | :func:`rio.get_rpcs() <rioxarray.rioxarray.XRasterBase.get_rpcs>` |
++----------------------------------------------------------------------------------------+-------------------------------------------------------------------+
+| :attr:`~rasterio.io.DatasetReader.bounds`                                              | :func:`rio.bounds() <rioxarray.rioxarray.XRasterBase.bounds>`     |
++----------------------------------------------------------------------------------------+-------------------------------------------------------------------+
+
+Functions
+---------
+
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| ``rasterio``                                                   | ``rioxarray``                                                                                                                     |
++================================================================+===================================================================================================================================+
+| :func:`rasterio.open`                                          | :func:`rioxarray.open_rasterio` or :func:`xarray.open_dataset(..., engine="rasterio", decode_coords="all") <xarray.open_dataset>` |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`~rasterio.io.DatasetReader.read`                        | :func:`compute() <xarray.DataArray.compute>` (load data into memory)                                                              |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`ds.read(..., window=) <rasterio.io.DatasetReader.read>` | :func:`rio.isel_window() <rioxarray.rioxarray.XRasterBase.isel_window>`                                                           |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`~rasterio.io.DatasetWriter.write`                       | :func:`rio.to_raster() <rioxarray.raster_array.RasterArray.to_raster>`                                                            |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`mask(..., crop=False) <rasterio.mask.mask>`             | :func:`rio.clip(..., drop=False) <rioxarray.raster_array.RasterArray.clip>`                                                       |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`mask(..., crop=True) <rasterio.mask.mask>`              | :func:`rio.clip(..., drop=True) <rioxarray.raster_array.RasterArray.clip>`                                                        |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`~rasterio.warp.reproject`                               | :func:`rio.reproject() <rioxarray.raster_array.RasterArray.reproject>`                                                            |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`~rasterio.merge.merge`                                  | :func:`rioxarray.merge.merge_arrays`                                                                                              |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+| :func:`~rasterio.fill.fillnodata`                              | :func:`rio.interpolate_na() <rioxarray.raster_array.RasterArray.interpolate_na>`                                                  |
++----------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
+
+
+
+By default, ``xarray`` is lazy: the data is not loaded into memory until it is requested, hence ``compute`` as the equivalent of ``read``.
+
+
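
As an illustration of the equivalence tables, the sketch below gathers with ``rioxarray`` the values that would usually be read from a ``rasterio`` dataset or its profile. ``summarize_raster`` and its ``path`` argument are hypothetical names chosen for this example; the accessor attributes and methods are the ones listed in the tables, and the comments name the assumed ``rasterio`` counterpart.

```python
def summarize_raster(path):
    """Return, via rioxarray, values usually read from a rasterio dataset.

    ``summarize_raster`` and ``path`` are illustrative names, not library API.
    """
    import rioxarray  # third-party import kept local to this illustrative helper

    da = rioxarray.open_rasterio(path)  # lazy: pixel values are not read yet
    return {
        "count": da.rio.count,  # ds.count
        "crs": da.rio.crs,  # ds.crs
        "height": da.rio.height,  # ds.height
        "width": da.rio.width,  # ds.width
        "nodata": da.rio.nodata,  # ds.nodata
        "transform": da.rio.transform(),  # ds.transform (a method in rioxarray)
        "bounds": da.rio.bounds(),  # ds.bounds (a method in rioxarray)
        "dtype": da.encoding.get("rasterio_dtype"),  # ds.profile["dtype"]
    }
```

Calling ``summarize_raster`` on a GeoTIFF path should not load the pixel data; calling ``.compute()`` on the opened array is what brings it into memory.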
+Going back to ``rasterio``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``rioxarray`` 0.21+ can recreate a ``rasterio`` Dataset from a ``rioxarray`` object.
+This is useful when translating your code from ``rasterio`` to ``rioxarray``, even though it is sub-optimal, because the array is loaded and written in memory under the hood.
+It is always better to look for ``rioxarray``'s native functions.
+
+.. code-block:: python
+
+    with dataarray.rio.to_rasterio_dataset() as rio_ds:
+        ...