#124 describes @ayushnag's idea to allow loading data variables directly from the in-memory ManifestArrays, rather than having to write to kerchunk/icechunk and then read back from that. PR #458 will add a zarr-compliant in-memory virtual ManifestStore that wraps a virtual dataset and would allow loading data from it via xr.open_zarr.
This issue is to track the idea that once #458 is implemented we should refactor the implementation of loadable_variables to use this ManifestStore + xr.open_zarr approach internally, for all backends. Currently this is instead done by each virtual backend calling out to a different xarray backend, depending on the filetype.
There are multiple reasons to re-implement this:
We would no longer need every virtual backend to have a corresponding xarray backend,
We would be able to guarantee (and create property-based tests - see Refactor tests around expected properties #394) that loading data via loadable_variables will give the same result as creating a virtual dataset, writing to icechunk, then loading,
It would be easier to centralize file handle management, so we can close file handles in the way xarray can (see File handle resource leak #468).
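
To make the proposal concrete, here is a minimal, standard-library-only sketch of the idea: every loadable variable is read through one manifest-backed store, whatever the original file type. It is a toy under stated assumptions, not the `ManifestStore` from #458 or any VirtualiZarr API. `ChunkRef`, `ToyManifestStore`, `load_variables`, `demo` and the fake binary file layout are all hypothetical names invented for illustration.

```python
import os
import struct
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Location of one chunk inside an existing file (hypothetical name)."""
    path: str
    offset: int
    length: int


class ToyManifestStore:
    """Toy stand-in for a manifest-backed store: chunk key -> byte range."""

    def __init__(self, manifest: dict[str, ChunkRef]):
        self._manifest = manifest

    def get(self, key: str) -> bytes:
        ref = self._manifest[key]
        # File handles are opened and closed in exactly one place,
        # independent of the format of the underlying file.
        with open(ref.path, "rb") as f:
            f.seek(ref.offset)
            return f.read(ref.length)


def load_variables(store, names, n_chunks):
    """Load only the named variables by reading their chunks via the store."""
    loaded = {}
    for name in names:
        raw = b"".join(store.get(f"{name}/{i}") for i in range(n_chunks[name]))
        # Assumes little-endian float64 data, purely for this toy example.
        loaded[name] = list(struct.unpack(f"<{len(raw) // 8}d", raw))
    return loaded


def demo():
    # Fake "legacy" file: a 16-byte header followed by two variables.
    time = [0.0, 1.0, 2.0, 3.0]
    temp = [280.5, 281.0]
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
        f.write(b"H" * 16)
        f.write(struct.pack("<4d", *time))
        f.write(struct.pack("<2d", *temp))
        path = f.name
    try:
        manifest = {
            "time/0": ChunkRef(path, 16, 16),  # first two float64 values
            "time/1": ChunkRef(path, 32, 16),  # last two float64 values
            "temp/0": ChunkRef(path, 48, 16),
        }
        store = ToyManifestStore(manifest)
        # Only "time" is requested as loadable; "temp" stays virtual.
        return load_variables(store, ["time"], {"time": 2, "temp": 1})
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

In this sketch the loading code never needs to know what kind of file the chunks came from, which is the property that would let `loadable_variables` drop the per-filetype xarray backends. Comparing the loaded values against the originals is also the kind of round-trip check a property-based test could generalize.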
FYI @chuckwondo