Make consolidated metadata optional #449
consolidate_metadata = click.option(
    "--consolidate-metadata",
To turn off consolidated metadata, you would do:
vcf2zarr convert --consolidate-metadata=false ...
But perhaps this is better (more consistent with other options):
vcf2zarr convert --no-consolidated-metadata ...
Thoughts?
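For comparison, here is a minimal sketch of the paired on/off flag pattern that click supports. The command body and the exact flag spelling (`--consolidate-metadata/--no-consolidate-metadata`) are assumptions for illustration, not the code in this PR; the `run` helper is hypothetical and only exists to exercise the command.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option(
    # Declaring both spellings makes click treat this as a boolean flag.
    # The flag names here are hypothetical, not taken from bio2zarr.
    "--consolidate-metadata/--no-consolidate-metadata",
    default=True,
    show_default=True,
    help="Write consolidated Zarr metadata after conversion.",
)
def convert(consolidate_metadata):
    # Stand-in for the real conversion: just report the parsed value.
    click.echo(f"consolidate_metadata={consolidate_metadata}")


def run(args):
    """Invoke the command in-process and return (exit_code, output)."""
    result = CliRunner().invoke(convert, args)
    return result.exit_code, result.output.strip()
```

With this shape the user never types `=false`; passing `--no-consolidate-metadata` switches the default off, which matches how on/off options are usually spelled in click-based CLIs.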
I think it's worth stepping back here and asking why we consolidate metadata at all - I think it's just for xarray support? I'd be happier dropping it entirely tbh.
It can be useful for reducing latency for metadata operations with cloud stores when there's a large number of groups - something that xarray does benefit from, which is discussed a bit here: zarr-developers/zarr-python#3119. That said, it's easy enough to add it (or remove it) from a VCZ store if it's needed (or not). So I don't have a strong opinion either way.
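To make "easy enough to add or remove" concrete, here is a rough standard-library sketch. It assumes a Zarr format 2 store laid out as a plain local directory, where consolidated metadata is a single `.zmetadata` JSON file at the store root; other store types and Zarr format 3 keep it elsewhere, so this would not apply to them. Both function names are hypothetical. In practice, adding it is normally done with zarr-python's `zarr.consolidate_metadata` rather than by hand.

```python
import json
import os
from pathlib import Path

# Per-node metadata files in a Zarr format 2 directory store.
META_NAMES = {".zgroup", ".zarray", ".zattrs"}


def add_consolidated_metadata(store: Path) -> Path:
    """Collect every per-node metadata file into <store>/.zmetadata."""
    metadata = {}
    for dirpath, _dirnames, filenames in os.walk(store):
        for name in filenames:
            if name in META_NAMES:
                path = Path(dirpath) / name
                # Keys are store-relative, e.g. "call_genotype/.zarray".
                key = path.relative_to(store).as_posix()
                metadata[key] = json.loads(path.read_text())
    out = store / ".zmetadata"
    doc = {
        "zarr_consolidated_format": 1,
        "metadata": dict(sorted(metadata.items())),
    }
    out.write_text(json.dumps(doc, indent=4))
    return out


def remove_consolidated_metadata(store: Path) -> bool:
    """Delete <store>/.zmetadata; return True if a file was removed."""
    target = store / ".zmetadata"
    if target.exists():
        target.unlink()
        return True
    return False
```

Removing the file leaves the per-node metadata untouched, so readers that do not rely on consolidated metadata should see the same store either way.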
I think it's probably redundant as far as bio2zarr is concerned and VCZ should be agnostic to whether it's there or not. My recollection is that I added it in the early days just to get sgkit/xarray support working, and never really examined the assumption. If a particular client needs the consolidated metadata then I think they should be responsible for managing it (which as you say is easy). It would be a breaking change, in that older versions of bio2zarr wouldn't support newer VCZ repos, but I think that's a pretty minor problem in practice. Does vcztools require consolidated metadata?
No, it doesn't (I just tried it without). I'll rework this PR (or create a new one) to remove it.
Great, thanks! Making things simpler is good.
Closing in favour of #450
Some Zarr stores don't support consolidated metadata (e.g. Icechunk), and some users may not want to use it since it can get out of date with the data in the store.
Fixes #276