Skip to content

[WIP] Enhancements CLI#948

Open
marinebcht wants to merge 28 commits into
GlacioHack:mainfrom
marinebcht:854_enhancements_CLI_None
Open

[WIP] Enhancements CLI#948
marinebcht wants to merge 28 commits into
GlacioHack:mainfrom
marinebcht:854_enhancements_CLI_None

Conversation

@marinebcht
Copy link
Copy Markdown
Contributor

@marinebcht marinebcht commented Apr 9, 2026

1. Inputs config in topo workflow

Resolves #854 (point 1)
Resolves #921

Now, we can run several Topo Workflows for several DEMs :

inputs:
  - path_to_elev: "path_dem_1"
    [other_param: xxx]
  - path_to_elev: "path_dem_2"
    [other_param: xxx]

Corresponding tree of outputs for level 1:

output_path (from outputs/path in the yml)
  ├─ dem_[X]  (if several inputs)
  │   ├─ tables ...
  │   ├─ plots ...
  │   ├─ report.html
  │   ├─ report.pdf
  └─ used_config.yaml

If only one dem needed to be study, the input section can look like this :

inputs:
    path_to_elev: "path_dem_1"
    [other_param: xxx]

And the outputs tables/plots(/raster) are now at the root directory (from outputs/path in the yml) as before.

2.A VCRS parameters simplification

Resolves #854 (point 2)

Currently, input elev have two parameters :
image

As VCRS becomes part of the CRS, the DEM can have already a VCRS. So, we can simplify to only one optional "force_vcrs" :

  • if the actual CRS elev has no vcrs: set it
  • else: replace it
inputs:
    path_to_elev: "path_dem_1"
    force_vcrs: XXXX

The possibility to offer this possibility for the CRS: CRS are mostly present and correct, unlike VCRS, which can be absent from the metadata or erroneous => NO add

2.B Attributes computaion in case of geographic CRS

For the topo workflow, as not projected elev can lead to wrong attributes outputs, we propose the following keys :

a. no reproject key, or {reproject: None}, or {reproject: {crs: None}} => reproject the elev to the corresponting CRS UTM (.get_metric_crs())

[reproject:]
   [crs:]

b. {reproject: {crs: None}} => no reprojection => warning As the CRS is a geographic CRS, the following surface fit attributes might be wrong. Please use a projected CRS or let it empty to reproject in default projected CRS.

reproject:
   crs: False

c. If we wan't to customize the projection :
- if CRS is projected => projection before atributes computation (and only!)
- CRS projected covering the area => OK
- Need to raises an error (GDAL EXCEPTION = > issues #955)

  • if CRS is not projected => warning As the CRS is a geographic CRS and the "reproject/crs" either, the following surface fit attributes might be wrong.
reproject:
    crs: xxx

In two words :

  • the reprojection is use ONLY if the initial proj is geographic and if the attributes need as input projection
  • the default behaviour = reproject when needed to the corresponding UTM
  • no reproject (leading to possible errors) if {reproject: {crs: None}}
  • personal choice projection {reproject: {crs: [CRS]}}

If level > 1 and a reprojection is done, "elev_reprojected.tif" is saved in rasters
/!\ NEED TO USE THE NEW GEOUTIL WITH XARRAY AND VCRS IN CRS to complete validation #956

3. Add direct CLI tests (arguments + workflow) in test_cli.py

Resolves #864 (point 1)

  • test_invalid_parameters
  • test_raises_help (xdem -h, xdem --help, xdem)
  • test_missing_param_after_workflow (xdem topo)
  • test topo and accuracy workflow
  • test default config (print and file)
  • test config file errors (does not exist, not .yml or .yaml, error in it)

4. Fixes other CLI problems/little change in it

In topo workflows:

  • if attributes : None, no error anymore
  • attributes are presented in columns of 3 if greater than 6
  • in the config, if extra parameters needed (dict version), the extra_information key is not more needed
  • the attribtues computation is now call once (before: once in png generation function and one in raster function)
  • fractal_roughness attibutes was misscall fractal_dimension Resolve Fractal roughness error in topo workflow #949
  • As fractal roughness need a window_size > 5 and the other attributes contrary to window-needed-attributes, I added another window_size_fractal parameters in every function used to its computaion (especially get_terrain_attribute() *)

In accuracy :

image

For both workflows :

  • create_html and generate_pdf functions moved to [workflow] as we call it for each input in topo, same for creation_output_dir function and _load_data takes an input dict.
  • the _validate_crs call for previous set_vcrs and to_vcrs, renamed _validate_vcrs for force_vcrs and _validate_crs validated CRS.from_user_input(value) (@ topo key reproject/crs)
  • Add outputs/generate_pdf (boolean) to deactivate this part (error in the cluster)

(*) Added test_attributes_default_call in test_terrain.py to check if default parameters in get_terrain_attribute() and in [attribute]() are the same (and lead to same results for each terrain attributes).

5. Documentation

  • Change cli_topo for (1)
  • Change cli_topo + cli_accuracy for (2.A) and (generate_pdf in 4)
  • Change cli_topo for (2.B)
  • Remove double sub sections for reference/tba_elev in cli_accuracy Resolves Enhancements for workflows and CLI interface #854 (point 3)

6. Propositions for the now/future

@marinebcht
Copy link
Copy Markdown
Contributor Author

marinebcht commented Apr 14, 2026

This PR is almost ready (question point 2 + its doc) :)
@belletva @rhugonnet @adehecq

Maybe another proposition for the point 2 (with all the dev simplification to do):
image

Furthermore, the set/from_vcrs does not appear to persist after "data preparation", but this will be automatically fixed when xDem will be merged to use the brand new Geoutils and its VCRS-part-of-the-CRS functionality 🥳

@marinebcht
Copy link
Copy Markdown
Contributor Author

Output example for two inputs @ topo workflow :

image

Comment thread doc/source/cli_topo.md Outdated
Comment thread doc/source/cli_topo.md Outdated
Comment thread doc/source/cli_topo.md Outdated
Comment thread doc/source/cli_topo.md Outdated
│ │ └─ elev_with_mask_stats.csv (if mask_elev is given in input)
│ ├─ plots
├─ elev_map.png
│ ├─ masked_elev_map.png (if mask_elev is given in input)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add [ ] for optional files

Comment thread doc/source/cli_topo.md Outdated
@rhugonnet
Copy link
Copy Markdown
Member

Thanks a lot @marinebcht! This is amazing (especially the direct CLI tests 😜)

I can't review fully today, but my quick reaction on vcrs management:

This is not specific to vertical CRS, but CRS in general, and there's two questions mixed in here:

1/ Do we provide support for edge cases where a user has a badly defined input, with potentially no CRS/VCRS written in the metadata, or not? Currently we don't, except for the VCRS. This is not an issue in Python, where the user can set it manually beforehand, but I can see how this could be limiting in the CLI.

2/ Do we allow users to reproject through xDEM for workflows on single-source data like Topo? (Accuracy is relative change, so not an issue there). This is actually quite relevant, as some terrain attributes are nonsensical in some common coordinate systems (geographic), and need to be projected to local UTM/other beforehand. (We could even have a default that reprojects/derives/reprojects back for the geographic case)

I think the way forward could be simply:

1/ We always accept optional inputs force_crs and force_vcrs that override the DEM crs/vcrs, in inputs (or we can keep from_vcrs/crs naming too),
2/ We add an optional processing step reprojection where the user can provide a crs/vcrs, or even a string "utm" (which would wrap .get_metric_crs()), so that it's clear in the workflow that this step is done. I don't think it works in outputs because this happens ahead in the processing chain.

In this context, we won't need to_vcrs anymore (only from/force_crs/vcrs), as this is covered in the reprojection step for the whole 3D CRS.

@marinebcht
Copy link
Copy Markdown
Contributor Author

Hello @rhugonnet :) Thank you for all of your comments !

Several questions/proposition :

  • Is that pretty common to can CRS error / missing CRS ? I thought that was just the case on purpose (like sensor grid) 😥
    In this context of wrong metadata : if we have a dem with (V)RS A, and the user give an (V)CRS B, it means that we have to trust hit and force elev to (V)CRS B with a warning ?
  • I don't quite understand this sentence "We could even have a default that reprojects/derives/reprojects back for the geographic case" ? 🤔
  • In the topo workflow, I feel like the output reprojection will be more accurate in new section (same for all inputs = one purpose/process by .yml) than in each inputs

@rhugonnet
Copy link
Copy Markdown
Member

@marinebcht Thanks! 🙂

Some quick answers:

  • CRS missing: It is quite uncommon, especially nowadays, but I've seen it (usually messed up by the user through some software, often the nodata too). Vertical CRS is the most frequent one.
  • Reprojecting back: If the CRS is geographic, most attributes fail (unsupported), or give erroneous units. We could have a default behaviour that detects (if EPSG=4326 or same family), projects the DEM to the local utm (get_metric_crs), derives the attributes (right units and value), then projects back to the original EPSG (simple resampling, values/units stay valid).
  • Last point: Not sure I understand. You mean the new "reprojection" section I propose would be clearer? (I liked the idea of a new section, but I think "outputs" that was proposed doesn't work in terms of processing order and logic of other CLI steps)

@marinebcht
Copy link
Copy Markdown
Contributor Author

marinebcht commented Apr 27, 2026

@belletva @rhugonnet @adehecq
ready to review (modulo doc point 2.B) :)

Added:

@marinebcht marinebcht marked this pull request as ready for review April 27, 2026 11:09
Comment thread doc/source/cli_topo.md Outdated
Comment thread doc/source/cli_topo.md Outdated

1. Level 1 → aligned elevation only
2. Level 2 → more detailed output
1. Level 1 → save reports and stats and plots used in it
Copy link
Copy Markdown
Contributor

@belletva belletva Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save reports (HTML and PDF formats) in parent directory, stats (CSV formats) in tables directory and plots (PNG formats) in plots directory

Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md
Comment thread doc/source/cli_accuracy.md Outdated
Comment thread doc/source/cli_accuracy.md Outdated

:::{note}
For transforming between vertical CRS with ``from_vcrs``/``to_vcrs`` please refer to {ref}`vertical-ref`.
To set the vertical CRS with ``force_vcrs``, please refer to {ref}`vertical-ref`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe an additional note: "Overrides a vertical CRS that might exist in the metadata".

@marinebcht
Copy link
Copy Markdown
Contributor Author

marinebcht commented May 4, 2026

Hello, I took some time to understand the use of the param output_grid in outputs.
Here's what I saw, depending on its value, the value of sampling_grid and if the coreg is activated or not.

https://docs.google.com/spreadsheets/d/109xGCD0ILPQTy0bbZuMFch2qr7C-ZdZHpkOkSOmBe-8/edit?usp=sharing

When output_grid when equals to to_be_aligned_elev gives diff_before and diff_elev null. The only purpose can be to compute the difference between to_be_aligned_elev before and after coreg (if coreg!) => to see the error of an elev relative to the reference it should coregister.

To me, the main accuracy metric is really the difference between to_be_aligned_elev (before/after/like that) and the reference_elev. So, maybe we can "fix" output_grid to reference_elev (so remove the key from the config file) and add a new key dedicated to compute the diff before/after to_be_aligned_elev (if coreg) (if useful ?).

Corresponding PR: #767

belletva
belletva previously approved these changes May 5, 2026
Copy link
Copy Markdown
Contributor

@belletva belletva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this PR is OK. We need to open a new issue concerning the output_grid and sampling_grid.

@rhugonnet
Copy link
Copy Markdown
Member

Good for me too.

On output_grid: I don't understand the explanation on the purpose, sorry 😅.
If it's useless, we scrap it!

@marinebcht
Copy link
Copy Markdown
Contributor Author

Hello @belletva
New titles for the plots in the accuracy workflow with coreg and level = 2 :
report_acc_11_05.pdf

Add the (visual) doc for the reprojection parameters in the topo workflow :
image

@marinebcht
Copy link
Copy Markdown
Contributor Author

Little note after f9c6d7c
When we use multi process to compute several attributes, ex:
mp_config = MultiprocConfig(chunk_size=50, outfile="tmp_mp_output.tif")
The outputs files are "tmp_mp_output_[attributes].tif"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants