This repository was archived by the owner on Feb 23, 2026. It is now read-only.

Commit 6b307dd

Merge pull request #241 from raphaelrpl/bdc-catalog-v1
🔥 Add minimal support to generate data cube from local dir and minor changes
2 parents d7ac24a + 87dec46 commit 6b307dd

14 files changed

Lines changed: 720 additions & 71 deletions


CHANGES.rst

Lines changed: 9 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -21,19 +21,20 @@ Changes
2121
=======
2222

2323

24-
Version 0.8.3 (2022-10-03)
25-
--------------------------
26-
27-
- Add support to customize data cube path and data cube item (`#236 <https://github.com/brazil-data-cube/cube-builder/issues/236>`_)
28-
- Review docs related with new path format cubes
29-
30-
31-
Version 1.0.0a1 (2022-09-27)
24+
Version 1.0.0a1 (2022-10-20)
3225
----------------------------
3326

3427
- Add integration with BDC-Catalog 1.0 `#233 <https://github.com/brazil-data-cube/cube-builder/issues/233>`_.
3528
- Add support to generate datacube from Sentinel-2 Zip (experimental) `#222 <https://github.com/brazil-data-cube/cube-builder/issues/222>`_.
3629
- Improve docs setup
30+
- Add support to generate data cube from local directories (`#25 <https://github.com/brazil-data-cube/cube-builder/issues/25>`_)
31+
32+
33+
Version 0.8.3 (2022-10-03)
34+
--------------------------
35+
36+
- Add support to customize data cube path and data cube item (`#236 <https://github.com/brazil-data-cube/cube-builder/issues/236>`_)
37+
- Review docs related with new path format cubes
3738

3839

3940
Version 0.8.2 (2022-09-21)

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -28,3 +28,4 @@ recursive-include examples *.py
2828
recursive-include tests *.py
2929
recursive-include tests *.json
3030
recursive-include cube_builder *.yaml
31+
recursive-include examples *.json

USING.rst

Lines changed: 180 additions & 0 deletions
@@ -267,6 +267,8 @@ Once the data cube definition is created, you can trigger a data cube using the
267267
if you would like to generate the data cube using a different STAC provider. Remember that the ``--collection`` must exist.
268268

269269

270+
.. _create_sentinel:
271+
270272
Creating data cube Sentinel 2
271273
-----------------------------
272274

@@ -545,3 +547,181 @@ You can change any parameter with the command ``cube-builder configure`` with ``
545547
Be careful when changing parameters so that you do not affect the integrity of the data cube.
546548
For example, if you change the masking ``clear_data`` after an area has already been generated,
547549
make sure to re-generate all the periods and tiles.
550+
551+
552+
553+
Advanced User Guide
554+
-------------------
555+
556+
Generate data cubes from local dir
557+
++++++++++++++++++++++++++++++++++
558+
559+
.. versionadded:: 1.0.0
560+
561+
.. note::
562+
563+
To follow this step, you will need a set of image files on disk.
564+
We do not provide these files, since this is just a brief overview of the feature. You will need
565+
to supply your own files.
566+
567+
568+
Since ``Cube-Builder`` 1.0, you can generate data cubes from local directories containing
569+
images. This is useful when you already have a set of image files locally and would like
570+
to apply a temporal composition function over them. In this case, a ``STAC Server`` is not required.
571+
To use this feature, pass the parameters ``--directory DIRECTORY`` and ``--format PATH_TO_FORMAT.json``.
572+
The format file follows the structure of `GDALCubes Formats <https://github.com/appelmar/gdalcubes/tree/master/formats>`_ to read
573+
directories.
574+
Essentially, a format contains the following properties:
575+
576+
- ``images`` (REQUIRED): Object describing how to locate images on disk.
577+
578+
- ``pattern`` (REQUIRED): Regex pattern to match image files on disk.
579+
- ``datetime`` (REQUIRED): Object describing how to extract date/time values from a directory path or file path.
580+
581+
- ``pattern`` (REQUIRED): A regular expression that matches the date/time.
582+
- ``format`` (REQUIRED): ISO format used to parse the date/time from the matched `str`.
583+
584+
- ``bands`` (REQUIRED): The dataset bands to capture while traversing the directory. You can also add extra fields to enrich the band metadata. The following properties are required for each band:
585+
586+
- ``pattern``: Regex pattern to identify the band on disk.
587+
- ``nodata``: No-data value for the band.
588+
- ``tags`` (OPTIONAL): List of keywords describing the given format.
589+
- ``description`` (OPTIONAL): A detailed multi-line description to fully explain the format.
590+
591+
You can find a minimal example in ``examples/formats/bdc-sentinel-2-l2a-cogs.json``, which supports
592+
locating ``Sentinel-2`` Cloud Optimized GeoTIFF files. You may also take a look at `GDALCubes Formats <https://github.com/appelmar/gdalcubes/tree/master/formats>`_
593+
for other formats.
594+
595+
For this example, let's create a simple Sentinel-2 data cube called ``S2-LOCAL-16D``. The definition is similar to
596+
:ref:`create_sentinel`. We just need to change the cube parameters to something like::
597+
598+
...
599+
"parameters": {
600+
"mask": {
601+
"clear_data": [4, 5, 6],
602+
"not_clear_data": [2, 3, 7, 8, 9, 10, 11],
603+
"nodata": 0,
604+
"saturated_data": [1]
605+
},
606+
"local": "/path/to/local/files",
607+
"recursive": true,
608+
"format": "examples/formats/bdc-sentinel-2-l2a-cogs.json",
609+
"pattern": ".tif"
610+
}
611+
612+
You can then create the data cube with the command::
613+
614+
curl --location \
615+
--request POST '127.0.0.1:5000/cubes' \
616+
--header 'Content-Type: application/json' \
617+
--data-raw '
618+
{
619+
"datacube": "S2-LOCAL-16D",
620+
"datacube_identity": "S2-LOCAL",
621+
"grs": "BRAZIL_SM",
622+
"title": "Sentinel-2 SR - Cube LCF 16 days -v001",
623+
"resolution": 10,
624+
"version": 1,
625+
"metadata": {
626+
"license": "MIT",
627+
"platform": {
628+
"code": "Sentinel-2",
629+
"instruments": "MSI"
630+
}
631+
},
632+
"temporal_composition": {
633+
"schema": "Cyclic",
634+
"step": 16,
635+
"unit": "day",
636+
"cycle": {
637+
"unit": "year",
638+
"step": 1
639+
}
640+
},
641+
"composite_function": "LCF",
642+
"bands_quicklook": [
643+
"B04",
644+
"B03",
645+
"B02"
646+
],
647+
"bands": [
648+
{"name": "B01", "common_name": "coastal", "data_type": "int16", "nodata": 0},
649+
{"name": "B02", "common_name": "blue", "data_type": "int16", "nodata": 0},
650+
{"name": "B03", "common_name": "green", "data_type": "int16", "nodata": 0},
651+
{"name": "B04", "common_name": "red", "data_type": "int16", "nodata": 0},
652+
{"name": "B05", "common_name": "rededge", "data_type": "int16", "nodata": 0},
653+
{"name": "B06", "common_name": "rededge", "data_type": "int16", "nodata": 0},
654+
{"name": "B07", "common_name": "rededge", "data_type": "int16", "nodata": 0},
655+
{"name": "B08", "common_name": "nir", "data_type": "int16", "nodata": 0},
656+
{"name": "B8A", "common_name": "nir08", "data_type": "int16", "nodata": 0},
657+
{"name": "B11", "common_name": "swir16", "data_type": "int16", "nodata": 0},
658+
{"name": "B12", "common_name": "swir22", "data_type": "int16", "nodata": 0},
659+
{"name": "SCL", "common_name": "quality","data_type": "uint8", "nodata": 0}
660+
],
661+
"indexes": [
662+
{
663+
"name": "EVI",
664+
"common_name": "evi",
665+
"data_type": "int16",
666+
"nodata": -9999,
667+
"metadata": {
668+
"expression": {
669+
"bands": [
670+
"B8A",
671+
"B04",
672+
"B02"
673+
],
674+
"value": "(10000. * 2.5 * (B8A - B04) / (B8A + 6. * B04 - 7.5 * B02 + 10000.))"
675+
}
676+
}
677+
},
678+
{
679+
"name": "NDVI",
680+
"common_name": "ndvi",
681+
"data_type": "int16",
682+
"nodata": -9999,
683+
"metadata": {
684+
"expression": {
685+
"bands": [
686+
"B8A",
687+
"B04"
688+
],
689+
"value": "10000. * ((B8A - B04)/(B8A + B04))"
690+
}
691+
}
692+
}
693+
],
694+
"quality_band": "SCL",
695+
"description": "This data cube contains all available images from Sentinel-2, resampled to 10 meters of spatial resolution, reprojected, cropped and mosaicked to BDC_SM grid and time composed each 16 days using LCF temporal composition function.",
696+
"parameters": {
697+
"mask": {
698+
"clear_data": [4, 5, 6],
699+
"not_clear_data": [2, 3, 7, 8, 9, 10, 11],
700+
"nodata": 0,
701+
"saturated_data": [1]
702+
},
703+
"local": "/path/to/local/files",
704+
"recursive": true,
705+
"format": "examples/formats/bdc-sentinel-2-l2a-cogs.json",
706+
"pattern": ".tif"
707+
}
708+
}'
709+
710+
Once the cube definition is created, you can use the command ``cube-builder build-local``::
711+
712+
SQLALCHEMY_DATABASE_URI="postgresql://postgres:postgres@localhost/bdc" \
713+
cube-builder build-local S2-LOCAL-16D \
714+
--tiles 003011 \
715+
--start-date 2021-08-29 \
716+
--end-date 2021-09-13 \
717+
--directory /path/to/local/files \
718+
--format examples/formats/bdc-sentinel-2-l2a-cogs.json
719+
720+
721+
.. note::
722+
723+
This example just illustrates how to trigger the data cube generation from a local directory. You may need to
724+
change values such as ``directory``, ``format``, ``start-date``, ``end-date`` and ``tiles``.
725+
726+
Right now, only ``--tiles`` is supported as a parameter. It will be replaced in the next release
727+
to support any ``Region of Interest (ROI)`` or shapefile.
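
The format properties described in the USING.rst changes (``images``, ``datetime``, ``bands``) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ``FORMAT`` dictionary, the file names, and the ``describe`` helper are hypothetical and are not copied from ``examples/formats/bdc-sentinel-2-l2a-cogs.json`` or from Cube-Builder itself. It also assumes the ``datetime.format`` value is a ``strptime``-style string, as used by the GDALCubes formats.

```python
import re
from datetime import datetime

# Hypothetical format definition shaped like the documented properties.
# The patterns are illustrative only, not the ones shipped with cube-builder.
FORMAT = {
    "images": {"pattern": r".*\.tif$"},
    "datetime": {"pattern": r".*_(\d{8})_.*", "format": "%Y%m%d"},
    "bands": {
        "B04": {"pattern": r".*_B04\.tif$", "nodata": 0},
        "B8A": {"pattern": r".*_B8A\.tif$", "nodata": 0},
    },
}


def describe(path, fmt=FORMAT):
    """Return (date, band) for an image path, or None if the path is skipped."""
    # 1. Keep only paths that look like images.
    if not re.match(fmt["images"]["pattern"], path):
        return None
    # 2. Extract the date from the first capture group of the datetime pattern.
    found = re.match(fmt["datetime"]["pattern"], path)
    if found is None:
        return None
    date = datetime.strptime(found.group(1), fmt["datetime"]["format"]).date()
    # 3. Identify the band whose pattern matches the path (None if no band matches).
    band = next(
        (name for name, spec in fmt["bands"].items() if re.match(spec["pattern"], path)),
        None,
    )
    return date, band


paths = [
    "/path/to/local/files/S2_20210829_B04.tif",
    "/path/to/local/files/S2_20210829_B8A.tif",
    "/path/to/local/files/readme.txt",
]
results = [describe(p) for p in paths]
```

Here the two ``.tif`` paths resolve to a date and a band name, while ``readme.txt`` is skipped because it does not match ``images.pattern``.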

cube_builder/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@
2121
from json import JSONEncoder
2222

2323
from bdc_catalog.ext import BDCCatalog
24+
from bdc_catalog.models import db
2425
from flask import Flask, abort, request
2526
from flask_redoc import Redoc
2627
from werkzeug.exceptions import HTTPException, InternalServerError

cube_builder/celery/tasks.py

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@
3030

3131
# Cube Builder
3232
from ..config import Config
33-
from ..constants import CLEAR_OBSERVATION_NAME, DATASOURCE_NAME, PROVENANCE_NAME, TOTAL_OBSERVATION_NAME, IDENTITY
33+
from ..constants import CLEAR_OBSERVATION_NAME, DATASOURCE_NAME, IDENTITY, PROVENANCE_NAME, TOTAL_OBSERVATION_NAME
3434
from ..models import Activity
3535
from ..utils import get_srid_column
3636
from ..utils.image import check_file_integrity, create_empty_raster, match_histogram_with_merges

cube_builder/cli.py

Lines changed: 44 additions & 0 deletions
@@ -169,6 +169,50 @@ def build(datacube: str, collections: str, tiles: str, start: str, end: str, ban
169169
assert res['ok']
170170

171171

172+
@cli.command()
173+
@click.argument('datacube')
174+
@click.option('--tiles', type=click.STRING, required=True, help='Comma delimited tiles')
175+
@click.option('-d', '--directory', type=click.Path(exists=True, readable=True), required=True,
176+
help='Directory containing files to read')
177+
@click.option('--format', type=click.STRING, required=True, help='Path to the JSON format file describing how to read files')
178+
@click.option('-r', '--recursive', is_flag=True, default=False, help='Recursive listing files')
179+
@click.option('-p', '--pattern', type=click.STRING, default='.tif', help='Pattern to seek for files')
180+
@click.option('--start-date', type=click.STRING, required=False, help='Start date')
181+
@click.option('--end-date', type=click.STRING, required=False, help='End date')
182+
@click.option('--force', '-f', is_flag=True, help='Build data cube without cache')
183+
@click.option('--token', type=click.STRING, help='Token to access data from STAC.')
184+
@click.option('--export-files', type=click.Path(writable=True), help='Export Identity Merges in file')
185+
@with_appcontext
186+
def build_local(datacube: str, tiles: str, directory: str, format: str,
187+
force=False, with_rgb=False, export_files=None, **kwargs):
188+
"""Build data cube from local files."""
189+
from .forms import DataCubeProcessForm
190+
191+
mask = kwargs.get('mask')
192+
193+
if mask:
194+
kwargs['mask'] = eval(mask)
195+
196+
data = dict(
197+
datacube=datacube,
198+
tiles=tiles.split(','),
199+
local=directory,
200+
format=format,
201+
force=force,
202+
with_rgb=with_rgb,
203+
**kwargs
204+
)
205+
206+
parser = DataCubeProcessForm()
207+
parsed_data = parser.load(data)
208+
parsed_data['export_files'] = export_files
209+
parsed_data['collections'] = None
210+
211+
res = CubeController.maestro(**parsed_data)
212+
213+
assert res['ok']
214+
215+
172216
@cli.command('configure')
173217
@click.argument('datacube')
174218
@click.option('--stac-url', type=click.STRING, help='STAC to search')
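
The ``build_local`` command in this hunk passes an optional ``mask`` string through ``eval`` and splits ``tiles`` on commas. As a hedged sketch of a more defensive way to handle the same inputs, the snippet below uses ``ast.literal_eval`` from the standard library, which accepts only Python literals. The helper names ``parse_mask`` and ``parse_tiles`` are hypothetical and are not part of cube-builder.

```python
import ast


def parse_mask(mask):
    """Parse a mask given as a literal string (for example a dict) without executing code."""
    if not mask:
        return None
    # literal_eval only accepts literals (dict, list, numbers, strings, ...),
    # so function calls or attribute access raise ValueError instead of running.
    value = ast.literal_eval(mask)
    if not isinstance(value, dict):
        raise ValueError("mask must be a mapping")
    return value


def parse_tiles(tiles):
    """Split a comma-delimited tile list, dropping surrounding whitespace and empty entries."""
    return [tile.strip() for tile in tiles.split(",") if tile.strip()]


mask = parse_mask('{"clear_data": [4, 5, 6], "nodata": 0, "saturated_data": [1]}')
tiles = parse_tiles("003011, 003012")

# An expression that is not a plain literal is rejected rather than evaluated.
rejected = False
try:
    parse_mask('__import__("os").getcwd()')
except ValueError:
    rejected = True
```

One trade-off: ``ast.literal_eval`` expects Python literal syntax, so JSON-only tokens such as ``true`` would need ``json.loads`` instead.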

cube_builder/controller.py

Lines changed: 3 additions & 3 deletions
@@ -31,8 +31,8 @@
3131
from werkzeug.exceptions import NotFound, abort
3232

3333
from .constants import (CLEAR_OBSERVATION_ATTRIBUTES, CLEAR_OBSERVATION_NAME, COG_MIME_TYPE, DATASOURCE_ATTRIBUTES,
34-
PROVENANCE_ATTRIBUTES, PROVENANCE_NAME, TOTAL_OBSERVATION_ATTRIBUTES, TOTAL_OBSERVATION_NAME,
35-
IDENTITY)
34+
IDENTITY, PROVENANCE_ATTRIBUTES, PROVENANCE_NAME, TOTAL_OBSERVATION_ATTRIBUTES,
35+
TOTAL_OBSERVATION_NAME)
3636
from .forms import CollectionForm
3737
from .grids import create_grids
3838
from .models import Activity, CubeParameters
@@ -328,7 +328,7 @@ def get_cube(cls, cube_id: int):
328328
list(filter(lambda b: b.id == cube.quicklook[0].green, cube.bands))[0].name,
329329
list(filter(lambda b: b.id == cube.quicklook[0].blue, cube.bands))[0].name
330330
]
331-
dump_cube['extent'] = None
331+
dump_cube['spatial_extent'] = None
332332
dump_cube['grid'] = cube.grs.name
333333
dump_cube['composite_function'] = dict(
334334
name=cube.composite_function.name,
