I recently put together an example of using VirtualiZarr + Icechunk over HTTPS (below), which could be helpful for our docs. However, people still got stuck because authentication does not work with Icechunk + HTTPS. We should also add a caveat that HTTPS auth is not yet implemented, linking to earth-mover/icechunk#997 as the upstream tracking issue. Our team at https://github.com/NASA-IMPACT/veda-odd is evaluating whether we could contribute a fix in our next project increment, which starts this month. cc @edshred2000 @owenlittlejohns @abarciauskas-bgse
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "virtualizarr",
# "virtual-tiff",
# "obstore",
# "obspec-utils",
# "icechunk",
# "xarray",
# "zarr",
# ]
# ///
from obstore.store import HTTPStore
from obspec_utils.registry import ObjectStoreRegistry
from virtual_tiff import VirtualTIFF
from virtualizarr import open_virtual_dataset
import icechunk
import xarray as xr
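# Keep the trailing slash: the same prefix is reused for the Icechunk virtual chunk container
base_url = "https://storage.googleapis.com/"
file = "solus100pub/cec7_0_cm_p.tif"
# base_url already ends with "/", so join without adding a second slash
url = f"{base_url}{file}"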
print(f"URL: {url}")
store = HTTPStore.from_url(base_url)
registry = ObjectStoreRegistry({base_url: store})
parser = VirtualTIFF(ifd=4)
# Open as virtual dataset
with open_virtual_dataset(url=url, parser=parser, registry=registry) as vds:
    print("\nVirtual dataset:")
    print(vds)
    # Write to in-memory Icechunk store
    storage = icechunk.Storage.new_in_memory()
    config = icechunk.RepositoryConfig.default()
    container = icechunk.VirtualChunkContainer(
        url_prefix=base_url,
        store=icechunk.http_store(),
    )
    config.set_virtual_chunk_container(container)
    repo = icechunk.Repository.create(
        storage=storage,
        config=config,
        authorize_virtual_chunk_access={base_url: None},
    )
    session = repo.writable_session("main")
    vds.vz.to_icechunk(session.store)
    session.commit("Write virtual TIFF references")
    print("\nCommitted to Icechunk")
# Read back from Icechunk
read_session = repo.readonly_session("main")
ds = xr.open_zarr(read_session.store, consolidated=False, zarr_format=3)
print("\nRoundtripped dataset from Icechunk:")
print(ds)
# Load actual data
loaded = ds.load()
print("\nLoaded data:")
print(loaded)