⬆️ Zarr V3 compatibility#904

Draft
measty wants to merge 59 commits into develop from fix-zarr-check

@measty
Collaborator

@measty measty commented Jan 16, 2025

Fixes the error related to zarr.errors.FSPathExistNotDir

It no longer exists as of zarr v3.0.0, which raises a standard FileNotFoundError instead.

Rather than checking for different error types in different versions of zarr, I've removed the check for a specific error type and now catch any error.
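
A minimal sketch of the version-agnostic check described above. The helper name `is_zarr_path` and its `opener` parameter are hypothetical and only serve the illustration; this is not the tiatoolbox implementation. Any failure to open the path is treated as "not a zarr store", so the code does not need to know which exception class a given zarr version raises.

```python
from __future__ import annotations

from collections.abc import Callable
from pathlib import Path
from typing import Any


def is_zarr_path(path: str | Path, opener: Callable[..., Any] | None = None) -> bool:
    """Return True if `path` can be opened as a zarr store (hypothetical helper)."""
    if opener is None:
        # Imported lazily so the sketch is not tied to a particular zarr version.
        import zarr

        opener = zarr.open
    try:
        opener(str(path), mode="r")
    except Exception:  # noqa: BLE001
        # Older zarr exposed zarr.errors.FSPathExistNotDir; per the description
        # above, v3 raises FileNotFoundError. Catching broadly covers both.
        return False
    return True
```

The trade-off is that a broad `except` also hides unrelated failures (for example permission errors), so logging the exception may be worth considering.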

There are some other changes in 3.0 that we also need to deal with, as it has removed or changed zarr.LRUStoreCache, zarr.DirectoryStore, and zarr.SQLiteStore, all of which tiatoolbox uses.

We would also need to wait for tifffile to become compatible with zarr 3.0.

See https://zarr.readthedocs.io/en/latest/user-guide/v3_migration.html for more details.

  • 🆙 Upgrade Zarr >=3.08, tifffile>=2025.5.21
  • Remove Python 3.10 support, as Zarr v3 only supports Python 3.11+
  • Remove object_codec in dask.to_zarr
  • Refactor zarr.core.Array to zarr.Array
  • Refactor canvas_zarr.store.path to canvas_zarr.store.root
  • Refactor zarr.DirectoryStore to zarr.storage.LocalStore
  • Pass the new shape as a tuple to zarr.resize
  • Replace output.items() with output.members()
  • Remove ngff.sqlitestore as zarr.SQLiteStore is no longer supported.
  • Zarr paths explicitly use str keys: key / i is replaced by key / str(i)
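
For readers who need to support both major versions during a transition, the DirectoryStore to LocalStore rename in the list above could be bridged with a small shim. This is a sketch under assumptions: `local_store_class` is a hypothetical helper, the fake modules below stand in for a real `import zarr`, and this PR itself targets zarr v3 only rather than using such a shim.

```python
from types import SimpleNamespace
from typing import Any


def local_store_class(zarr_module: Any) -> type:
    """Pick the on-disk store class exposed by the given zarr module."""
    storage = getattr(zarr_module, "storage", None)
    local_store = getattr(storage, "LocalStore", None)
    if local_store is not None:
        return local_store  # zarr v3 name
    return zarr_module.DirectoryStore  # zarr v2 name


# Stand-ins for the two module layouts, so the sketch runs without zarr installed.
class FakeLocalStore: ...
class FakeDirectoryStore: ...

fake_v3 = SimpleNamespace(storage=SimpleNamespace(LocalStore=FakeLocalStore))
fake_v2 = SimpleNamespace(storage=SimpleNamespace(), DirectoryStore=FakeDirectoryStore)

chosen_v3 = local_store_class(fake_v3)
chosen_v2 = local_store_class(fake_v2)
```

In real code the call would be `local_store_class(zarr)` after `import zarr`.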

@measty measty changed the title from "fix zarr checking" to "Zarr 3.0 compatibility" Jan 16, 2025
@measty measty marked this pull request as draft January 16, 2025 14:11
@codecov

codecov bot commented Jan 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.88%. Comparing base (d82df5c) to head (2212eef).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #904      +/-   ##
===========================================
- Coverage    99.90%   99.88%   -0.03%     
===========================================
  Files           70       70              
  Lines         8735     8735              
  Branches      1149     1149              
===========================================
- Hits          8727     8725       -2     
- Misses           3        5       +2     
  Partials         5        5              


@measty measty mentioned this pull request Jan 16, 2025
@shaneahmed shaneahmed added the bug and dependencies labels Jan 24, 2025
@shaneahmed
Member

shaneahmed commented Mar 19, 2025

Zarr v3 is not compatible with tifffile. See cgohlke/tifffile#282 and czbiohub-sf/iohub#292.

@shaneahmed shaneahmed added the stale label Mar 21, 2025
@shaneahmed shaneahmed added this to the Release v1.7.0 milestone Mar 21, 2025
@shaneahmed
Member

Zarr 3 is supported by tifffile. However, zarr 3 only supports Python 3.11+.

@shaneahmed
Member

@aacic Current implementation of FsspecJsonWSIReader is not compatible with Zarr V3. I have tested multiple options but they fail. Do you have any suggestions?

class FsspecJsonWSIReader(WSIReader):
    """Reader for fsspec zarr json generated by: tiatoolbox/utils/tiff_to_fsspec.py.

    The fsspec zarr json file represents a SVS or TIFF file
    that can be accessed using the byte-range HTTP API.
    All the information on the chunk locations in the SVS or TIFF file
    is outlined as byte-ranges in the JSON,
    so the reader requests only chunks that are needed to display requested tiles,
    rather than the entire SVS or TIFF file.
    """

    def __init__(
        self: FsspecJsonWSIReader,
        input_img: str | Path | np.ndarray,
        mpp: tuple[Number, Number] | None = None,
        power: Number | None = None,
        cache_size: int = 2**28,
    ) -> None:
        """Initialize :class:`FsspecJsonWSIReader`."""
        super().__init__(input_img=input_img, mpp=mpp, power=power)
        jpeg_codec = Jpeg()
        register_codec(jpeg_codec, "imagecodecs_jpeg")
        jpeg2k_codec = Jpeg2k()
        register_codec(jpeg2k_codec, "imagecodecs_jpeg2k")
        lzw_codec = Lzw()
        register_codec(lzw_codec, "imagecodecs_lzw")
        delta_codec = Delta()
        register_codec(delta_codec, "imagecodecs_delta")
        mapper = fsspec.get_mapper(
            "reference://", fo=str(input_img), target_protocol="file"
        )
        self._zarr_array = zarr.open(mapper, mode="r")
        self.__set_axes()
        self._zarr_store = self._zarr_array.store
        self._zarr_lru_cache = zarr.LRUStoreCache(self._zarr_store, max_size=cache_size)
        self._zarr_group = zarr.open(self._zarr_lru_cache)
        if not isinstance(self._zarr_group, zarr.hierarchy.Group):  # pragma: no cover
            group = zarr.hierarchy.group()
            group[0] = self._zarr_group
            self._zarr_group = group
        self.level_arrays = {
            int(key): ArrayView(array, axes=self._axes)
            for key, array in self._zarr_group.items()
        }
        # ensure level arrays are sorted by descending area
        self.level_arrays = dict(
            sorted(
                self.level_arrays.items(),
                key=lambda x: (
                    -np.prod(
                        TIFFWSIReaderDelegate.canonical_shape(
                            self._axes, x[1].array.shape[:2]
                        )
                    )
                ),
            )
        )
        self.tiff_reader_delegate = TIFFWSIReaderDelegate(self, self.level_arrays)

@shaneahmed shaneahmed self-requested a review March 19, 2026 09:28
@aacic
Collaborator

aacic commented Mar 19, 2026

Please give me some time to look into this. I'll try to look at it next week. I don't know the answer off the top of my head.

@shaneahmed shaneahmed changed the title from "Zarr 3.0 compatibility" to ":Uparrow: Zarr V3 compatibility" Mar 25, 2026
@shaneahmed shaneahmed changed the title from ":Uparrow: Zarr V3 compatibility" to "⬆️ Zarr V3 compatibility" Mar 25, 2026
@shaneahmed
Member

All the errors have been fixed except FsspecJsonWSIReader.

@shaneahmed
Member

All the errors have been fixed except FsspecJsonWSIReader.

The updates to the engines' output break test_hovernet_on_box, which also needs to be fixed.

@aacic
Collaborator

aacic commented Mar 26, 2026

All the errors have been fixed except FsspecJsonWSIReader.

@shaneahmed I was able to make it work locally. I need some more time to verify the solution, and I'll try to submit a PR sometime next week. I'll keep you posted.

@shaneahmed
Member

All the errors have been fixed except FsspecJsonWSIReader.

@shaneahmed I was able to make it work locally. I need some more time to verify the solution, and I'll try to submit a PR sometime next week. I'll keep you posted.

Thank you, @aacic. That would be wonderful.

# Conflicts:
#	requirements/requirements.txt
@shaneahmed
Member

shaneahmed commented Mar 31, 2026

All the errors have been fixed except FsspecJsonWSIReader.

The updates to the engines' output break test_hovernet_on_box, which also needs to be fixed.

@measty Would you be able to check the error with test_hovernet_on_box, please? It is probably due to changes in the way the new zarr output is saved.

shaneahmed and others added 2 commits March 31, 2026 12:36
* Zarr 3 fix.

* Fix multilayer image `FsspecJsonWSIReader` support.

@JiwaniZakir JiwaniZakir left a comment

The padding-removal logic added in assert_annotation_store_patch_output and assert_qupath_json_patch_output uses np.iinfo(contours.dtype).min as the sentinel value, but np.iinfo only accepts integer dtypes — if contours.dtype is ever a float (e.g., float32 or float64), this will raise a ValueError at runtime. A safer approach would be to guard with np.iinfo only when np.issubdtype(contours.dtype, np.integer) and fall back to np.finfo or a different sentinel otherwise.
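
The guard suggested above could look roughly like this. It is a sketch only: `padding_sentinel` and `strip_padding` are hypothetical names, and the assumption that padded rows consist entirely of the sentinel value is mine, not taken from the test helpers in this PR.

```python
from __future__ import annotations

import numpy as np


def padding_sentinel(dtype: np.dtype | type) -> int | float:
    """Return the smallest representable value for integer or float dtypes."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        return np.iinfo(dtype).min
    if np.issubdtype(dtype, np.floating):
        # np.iinfo would raise ValueError for float dtypes, so use np.finfo.
        return np.finfo(dtype).min
    msg = f"Unsupported contour dtype: {dtype}"
    raise TypeError(msg)


def strip_padding(contours: np.ndarray) -> np.ndarray:
    """Drop rows made up entirely of the padding sentinel (assumed layout)."""
    sentinel = padding_sentinel(contours.dtype)
    keep = ~np.all(contours == sentinel, axis=-1)
    return contours[keep]
```

Raising `TypeError` for other dtypes is a design choice in this sketch; falling back to a different sentinel would work equally well if the helpers need to accept them.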

The pervasive switch from integer to string keys (e.g., output_zarr["x"][0] → output_zarr["x"]["0"] in test_nucleus_detection_engine.py and the str(i) conversion in assert_output_lengths) reflects a real zarr v3 API change, but it's worth verifying that this pattern is applied consistently in the production code paths as well. The tests only cover the assertion helpers, and any production code that still uses integer indexing into zarr groups would silently fail or return wrong results.
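
One way to keep the string-key convention in a single place is a pair of small helpers. The names `get_level` and `sorted_level_keys` are hypothetical, and a plain `dict` stands in for a zarr group so the sketch runs without zarr installed.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def get_level(group: Mapping[str, Any], index: int | str) -> Any:
    """Look up a child by position, always using a string key."""
    return group[str(index)]


def sorted_level_keys(group: Mapping[str, Any]) -> list[str]:
    """Return keys in numeric order; plain string sorting would put "10" before "2"."""
    return sorted(group.keys(), key=int)


# A dict standing in for a zarr group whose children are named "0", "2", "10".
fake_group = {"10": "level-10", "2": "level-2", "0": "level-0"}
first_level = get_level(fake_group, 0)
ordered_keys = sorted_level_keys(fake_group)
```

Routing every lookup through one helper also makes it easier to grep for any remaining integer indexing in production code.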

The switch from output_.items() to output_.members() in test_semantic_segmentor.py is correct for zarr v3, but the truncated diff makes it unclear whether similar items() calls in production code (not just tests) have been updated — a quick grep for .items() on zarr group objects in the main source would be worth confirming.
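
If some call sites must handle both group-like objects and plain mappings, a helper along these lines could centralise the `items()` versus `members()` difference. `iter_children` is a hypothetical name, and the sketch assumes only that a v3 group exposes `members()` returning `(name, child)` pairs, as the change in this PR implies.

```python
from __future__ import annotations

from typing import Any


def iter_children(group: Any) -> list[tuple[str, Any]]:
    """Return (name, child) pairs from a zarr-v3-style group or a plain mapping."""
    members = getattr(group, "members", None)
    if callable(members):
        # zarr v3 style: members() yields (name, array_or_group) pairs.
        return list(members())
    # Fallback for mapping-like objects, including zarr v2 groups.
    return list(group.items())
```

Usage would be `for name, child in iter_children(output): ...` in place of a direct `output.items()` call.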


Labels

bug, dependencies, help wanted

4 participants