Skip to content

Fix dmrpp error handling#880

Merged
TomNicholas merged 6 commits intozarr-developers:mainfrom
betolink:main
Feb 27, 2026
Merged

Fix dmrpp error handling#880
TomNicholas merged 6 commits intozarr-developers:mainfrom
betolink:main

Conversation

@betolink
Copy link
Copy Markdown
Contributor

@betolink betolink commented Feb 20, 2026

I think a while ago we talked about eventually moving the parser to pydap as that would be the more native place to improve it and fix bugs, I think we should move in that direction and offer a thin wrapper for virtualization. However that integration is still TDB, this PR addresses some of the quirkiness of some DMRpp files mostly by adding a safe attribute getter function. Unit tests were added and now the parsers shows up as one of the valid parsers for VirtualiZarr.

  • Closes #xxxx
  • Tests added
  • Tests passing
  • Full type hint coverage
  • Changes are documented in docs/releases.rst
  • New functions/methods are listed in api.rst
  • New functionality has documentation

p.s. once #794 gets merged, we should be able to create and append Zarr stores from NASA earthdata using DMRpp + kerchunk really fast!

cc @Mikejmnez @owenlittlejohns @TomNicholas @danielfromearth

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 88.67925% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.21%. Comparing base (785de91) to head (8d56c09).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
virtualizarr/parsers/dmrpp.py 88.67% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #880      +/-   ##
==========================================
+ Coverage   89.11%   89.21%   +0.10%     
==========================================
  Files          34       34              
  Lines        1956     1985      +29     
==========================================
+ Hits         1743     1771      +28     
- Misses        213      214       +1     
Files with missing lines Coverage Δ
virtualizarr/xarray.py 87.93% <ø> (+0.86%) ⬆️
virtualizarr/parsers/dmrpp.py 85.85% <88.67%> (+2.42%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@betolink betolink marked this pull request as ready for review February 23, 2026 22:40
@Mikejmnez
Copy link
Copy Markdown
Contributor

Mikejmnez commented Feb 23, 2026

@betolink the dmrpp parser has been integrated into pydap (see here), and see for example: pydap/PR#649 for parsing inline missing values (that cause some issues with attributes when parsing variables).

I just have not yet made a release since then, and I still need to make sure that the improvements I have made work with a virtualizarr local dev branch that I have not pushed yet to fully migrate the parser away from vz (it was working during AGU but I need to update a few things... ).

@betolink
Copy link
Copy Markdown
Contributor Author

@Mikejmnez nice!! should we try to merge this and then migrate once the pydap parser is ready and gets released? we could also integrate this PR (the idea of the safe attr getter) into your branch and close this PR. The next PR here would be to use PyDAP. I'm open to both options.

@Mikejmnez
Copy link
Copy Markdown
Contributor

Mikejmnez commented Feb 23, 2026

Yeah good question - I have also been looking at parsing various attributes currently unsupported. I find the idea of "required=True|False" attributes very interesting. What constitutes required? That logic may need to live outside of pydap (and so perhaps in VZ) because all attributes should be parsed fine by pydap, the question is whether these are supported by Virtualizarrt... For example, dmrpps may have:

chunks:missingvalues & chunks:compact as attributes with inline values.

These should be parsed by the dmrpp parser, but are these supported by other APIs?

@TomNicholas
Copy link
Copy Markdown
Member

TomNicholas commented Feb 24, 2026

The next PR here would be to use PyDAP.

Do you guys want to just move this parser out of VirtualiZarr entirely? We have a 3rd-party extension system now, which this DMRPP parser predates. It's something of a smell that VZ devs will have to approve PRs changing the DMRPP parser, when we don't really know anything about it.

EDIT: We could still advertise it in VZ, or even import then re-export it to retain backwards compatibility. But if you are re-implementing it in PyDAP then we surely should just move development of the parser over there?

@betolink
Copy link
Copy Markdown
Contributor Author

Do you guys want to just move this parser out of VirtualiZarr entirely?

I think that's the idea, I'd would like to merge this PR as the last work we do here and then start working towards removing the code, then we could import it when is ready in pydap and make it backwards compatible like you said.
This PR along with the PR from Max #794 will allow us to build and append L4 virtual collections for several missions using dmrpp

I wonder when you mentioned 3rd party extensions, what is the scope? will the parsers become extensions? or what is the criteria of core vs 3rd party.

@TomNicholas
Copy link
Copy Markdown
Member

I wonder when you mentioned 3rd party extensions, what is the scope? will the parsers become extensions?

I was referring to the parsers. They are a mechanism for extension by 3rd parties.

I think that's the idea, I'd would like to merge this PR as the last work we do here and then start working towards removing the code

👍 Just let me know when you want me to review/merge this.

chunks:missingvalues & chunks:compact as attributes with inline values.

@Mikejmnez can you explain more about what these mean? Or maybe open another issue if that's a separate topic?

@Mikejmnez
Copy link
Copy Markdown
Contributor

Mikejmnez commented Feb 24, 2026

Do you guys want to just move this parser out of VirtualiZarr entirely?

I think that's the idea, I'd would like to merge this PR as the last work we do here and then start working towards removing the code

@betolink Sounds good with me.

@Mikejmnez can you explain more about what these mean? Or maybe open another issue if that's a separate topic?

@TomNicholas These are two kinds of inline values that can occur within a DMRpp. These are not parsed currently by the parser, and results in the parser dropping the variable when attempting to parse it with a warning. This PR that @betolink has helps log these variables to track them I think. But the parser should parse them (which is what I am currently working on the pydap side). My understanding was that it was unsupported by VZ, but I am glad to see Max's work on enabling this !

@betolink
Copy link
Copy Markdown
Contributor Author

@Mikejmnez one note, Max PR is for when VZ already parsed these into Kerchunk, so then we can open that store and append without having to recreate it all from scratch.

This PR is only logging "{var} could not be parsed because of {x}" (instead of just the parser blowing) and also enumerates phony dimensions, right now all they become "phony_dim" instead of "phony_dim_{n}". I'm excited to see your work and Max's in VZ, we'll be in a position to roundtrip chunkmanifest with inline values!

@TomNicholas
Copy link
Copy Markdown
Member

Max PR is for when VZ already parsed these into Kerchunk, so then we can open that store and append without having to recreate it all from scratch.

Max's (draft) PR is actually more general than that. It was motivated by Kerchunk, but has other uses too. The idea is it will allow storing some chunks inlined into VZ's in-memory manifests, regardless of the parser they came from.

@betolink
Copy link
Copy Markdown
Contributor Author

Alright, I think this PR is ready for review then @TomNicholas @Mikejmnez

Copy link
Copy Markdown
Member

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lemme know when you want me to merge

@betolink
Copy link
Copy Markdown
Contributor Author

Thanks Tom! anytime is good.

@TomNicholas TomNicholas merged commit 085bcb6 into zarr-developers:main Feb 27, 2026
15 checks passed
@Mikejmnez Mikejmnez mentioned this pull request Mar 6, 2026
7 tasks
TomNicholas added a commit to TomNicholas/VirtualiZarr that referenced this pull request Mar 23, 2026
- zarr-developers#880 Fix dmrpp error handling
- zarr-developers#868 Fix error with Zarr-Python 3.1.0
- zarr-developers#924 Fix coordinate name issue
- zarr-developers#916 Fix ZarrParser to use public attribute

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TomNicholas added a commit that referenced this pull request Mar 23, 2026
* add empty release notes

* Fix wrong PR link references in release notes

- Line 57: PR text said #927 but URL pointed to #932 (correct PR)
- Line 115: PR text said #565 but URL pointed to #822 (correct PR)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Move PR #818 from v2.2.0 to unreleased section

This PR was merged after v2.4.0 but was incorrectly listed
under the v2.2.0 release notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add missing bug fixes to unreleased section

- #880 Fix dmrpp error handling
- #868 Fix error with Zarr-Python 3.1.0
- #924 Fix coordinate name issue
- #916 Fix ZarrParser to use public attribute

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add missing documentation entries to unreleased section

- #918 FAQ answer on "why still write native zarr?"
- #893 Update FAQ regarding virtualizing existing Zarr V2 data
- #937 R2 docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add missing internal change #909 to unreleased section

Compiled regular expressions for improved performance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add missing documentation entries to v2.4.0 release notes

- #855 Add example of virtualizing GOES
- #856 Update kerchunk comparison in FAQ

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add missing entries to v2.2.0 release notes

- #829 Raise informative error on Zarr V2 parsing with Zarr-Python<3.1.3
- #805 Revert unnecessary dtype conversion in icechunk writer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants