Skip to content

Separate "sensor" attribute concerns#3381

Closed
sfinkens wants to merge 1 commit intopytroll:mainfrom
sfinkens:oscar-test
Closed

Separate "sensor" attribute concerns#3381
sfinkens wants to merge 1 commit intopytroll:mainfrom
sfinkens:oscar-test

Conversation

@sfinkens
Copy link
Copy Markdown
Member

This is an alternative approach to #3291 . I realized that the "sensor" attribute currently serves two purposes

  • Internal: Finding available composites and enhancements for the dataset at hand
  • External: Annotating the dataset for the user

So I tried separating those two concerns by

  • Renaming the sensor attribute to _satpy_sensor. This is still lowercase but for internal usage in filenames and yaml definitions I actually find that more appropriate.
  • Adding a new instruments attribute with the WMO names. Currently ABI L1B only for testing purpose.
  • Adding a config switch legacy_sensor_attribute to get the sensor attribute back. At the moment this is True, with v1.0 we could set it to False.

Notes:

  • The instruments attribute is of type set, because a dataset can be derived from multiple instruments.
  • The _satpy_sensor attribute is of type set|str. Some parts of Satpy take that into account, others just assume it's a string. The advantage of this approach is, that we can leave the internal handling as it is, while the new attribute still draws the users' attention that it might be more than one instrument.

Questions:

  • Is it really enough to add the legacy attribute in Scene.__getitem__ or are there other paths that I missed here?
  • Closes #xxxx
  • Tests added
  • Fully documented
  • Add your name to AUTHORS.md if not there already

@djhoese
Copy link
Copy Markdown
Member

djhoese commented Apr 29, 2026

Without looking at the code here is my gut reaction:

  • "instruments" 👍
  • "_satpy_sensor": Not a big fan of this. I'd really like us to avoid satpy-specific attributes if at all possible. And yes, I dislike the need for _satpy_id already and have always wished we could get rid of area and represent geolocation in a more xarray-specific way, but that is easier said than done.

I like your point about there being two uses for the metadata, but I'd like to propose an alternative: Could we have a instrument(s) "serialization" function that converts sensor names/sets to a single "canonical" string version. This could, maybe, even be a configurable thing like satpy.config.set(instrument_serialization_func=my_func). So the basic version would be:

"-".join(inst_name.replace("-", "").replace("/", "").lower() for inst_name in instruments)

Or something like that. If we break down where satpy-style sensor names are used we have:

  1. composite YAML names
  2. enhancement YAML names
  3. output filename

The first two seem pretty simple as a "convert instruments to something useful" but it does bring up the issue that there isn't an obvious name for the YAML filenames when users create them. It isn't an exact mapping from something in the .attrs to the YAML filename.

For the output filename, it may be as simple as documenting additional keys that can be used that are dynamically generated.

I think the benefit of this style of approach is that each use of the instrument information has control over how it is used. For example, whether to join multiple sensors into a single string ("-".join) or only used the first sensor or don't support multiple sensors. It also reduces the need for two metadata items (instuments and _satpy_sensor) to be linked together and needing to keep them in sync.

@djhoese
Copy link
Copy Markdown
Member

djhoese commented Apr 29, 2026

Another use of sensor I think is in some of the "finding readers" code where a user can specify what sensor they want to read data for and Satpy's utility functions will find all the readers that say they support that sensor. I'm not sure if this functionality is still supported or if it is even useful to keep supporting it.

@sfinkens
Copy link
Copy Markdown
Member Author

(instuments and _satpy_sensor) to be linked together and needing to keep them in sync.

Good point, it's the same information in two different spellings

Could we have a instrument(s) "serialization" function that converts sensor names/sets to a single "canonical" string version

Yes! I like that very much. Just to double check that I understand correctly:

  • Dataset attributes provide instruments as a set of WMO names
  • Satpy internally calls serialize_func(data_array.attrs["instruments"]) each time that information is needed
  • YAML filenames and contents don't have to be changed

Is that what you had in mind?

@djhoese
Copy link
Copy Markdown
Member

djhoese commented Apr 29, 2026

Yes, I think that's what I had in mind. We may want the serialization function to be internal and not configurable to start OR define a separate function for each use type: yaml filename serialization versus output filename or something similar. Or it could be defined as a "serialization function" but then the output filename doesn't use that publicly, but more like the output filenaming has various options available (joined versus first with or without lowercasing, etc). Much of that is probably customizable by the user just with plain python string formatting syntax, especially if we always do output_filename.format(instruments=sorted(instrument_set)) as that would, I think, allow a user to do "{instruments[0]!l}" (see https://trollsift.readthedocs.io/en/latest/api.html#trollsift.parser.StringFormatter) which would give you the first instrument lowercased.

@sfinkens
Copy link
Copy Markdown
Member Author

Next try -> #3382

@sfinkens sfinkens closed this Apr 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants