Closed
Changes from 4 commits
Commits
138 commits
eaafdd1
changed pyproject to zarr>=3
LucaMarconato Aug 18, 2025
3047662
adjust dependencies
LucaMarconato Aug 18, 2025
9f26ce0
wip
LucaMarconato Aug 18, 2025
241bbfe
more fixes
LucaMarconato Aug 18, 2025
bdb6da0
update ome-zarr dep
melonora Aug 26, 2025
e3a53df
Add zarr v3 formats
melonora Aug 26, 2025
41b980f
refactoring formats
LucaMarconato Aug 26, 2025
cf6e90b
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Aug 26, 2025
67d2dd9
wip replace parse_url() with _open_zarr_store()
LucaMarconato Aug 26, 2025
5d0b0cf
update way of writing transforms, update typehints
melonora Aug 27, 2025
2c486b5
attempt fix consolidated metadata, fix channel names write
melonora Aug 29, 2025
fa4622f
fix partial read tests
melonora Aug 29, 2025
4137c5e
fix path
melonora Aug 29, 2025
90c8f92
access name in group directly
melonora Aug 29, 2025
c9a610b
fix read in consolidated metadata
melonora Aug 31, 2025
e6236e6
initial ugly fix groups and consolidated metadata when deleting
melonora Aug 31, 2025
3e63bfe
fix reading back table
melonora Sep 1, 2025
3c298a4
open group without using consolidated metadata when writing
melonora Sep 1, 2025
b6c2885
revert adding labels
melonora Sep 1, 2025
1127edd
fix read / write issues with consolidated metadata
melonora Sep 2, 2025
2502629
use_consolidated False when writing transforms
melonora Sep 2, 2025
d398e96
add ome_format arg
melonora Sep 2, 2025
0fd4cce
check valid element formats in container format
melonora Sep 3, 2025
484489e
uncomment code arrayNotFoundError
melonora Sep 3, 2025
ff4cfee
remove future annotations import
melonora Sep 3, 2025
4d9003c
Merge branch 'main' into zarrv3
melonora Sep 3, 2025
89b53f0
update workflow mac version
melonora Sep 3, 2025
607580a
drop python 3.10
melonora Sep 3, 2025
75e656b
change target python version and readthedocs python version
melonora Sep 3, 2025
b977b36
minor updates
melonora Sep 3, 2025
0cfd29f
update Self import
melonora Sep 3, 2025
3d0c3eb
fix ome errors
melonora Sep 3, 2025
fc7efda
add windows workflow
melonora Sep 3, 2025
46c7b34
prevent consolidation labels group when deleting element
melonora Sep 3, 2025
38f29fe
add shapes test
melonora Sep 3, 2025
a88c7e7
fix shape conversion
melonora Sep 3, 2025
16aa529
update dask dependency because of zarr v3
melonora Sep 3, 2025
bdea210
fix multipoly
melonora Sep 3, 2025
ebc5080
fix invalid read name test
melonora Sep 3, 2025
6ba53e4
use UPath
melonora Sep 3, 2025
9608232
fix paths
melonora Sep 3, 2025
dd5bc23
resolve paths
melonora Sep 3, 2025
8d788e0
refactor overwrite transformations
melonora Sep 4, 2025
a48ba23
refactor
melonora Sep 4, 2025
6c5ee98
further refactor, add docstrings
melonora Sep 4, 2025
25aca86
refactor io_raster
melonora Sep 4, 2025
934c2bf
several refactors io
melonora Sep 4, 2025
ee78d30
war on warnings
melonora Sep 5, 2025
42923c4
checks backward compatibility
melonora Sep 6, 2025
f11d683
correct test
melonora Sep 6, 2025
2b71939
further reduce warnings
melonora Sep 6, 2025
94971cb
remove log with no useful info
melonora Sep 6, 2025
5672e19
remove log as it is stated in doc string
melonora Sep 6, 2025
6a67034
move log to info in docstring
melonora Sep 6, 2025
68ce29f
remove deprecated warning
melonora Sep 7, 2025
82f12b0
get rid of categorical and str casting warnings
melonora Sep 7, 2025
bf607b3
below 1000 warnings
melonora Sep 7, 2025
20469f8
further reduction of warnings
melonora Sep 7, 2025
429db46
remove deprecated code
melonora Sep 7, 2025
2a5ec5b
correct location for storing transforms
melonora Sep 7, 2025
016da2e
consistent naming
melonora Sep 8, 2025
3e5490e
update according to ome-zarr-py
melonora Sep 8, 2025
be1ae2f
correct docstring
melonora Sep 8, 2025
8873fa0
update docstring
melonora Sep 8, 2025
b1ade7e
remove todo
melonora Sep 8, 2025
421b56f
remove unassigned function call
melonora Sep 8, 2025
0bb8bc9
refactor new _open_zarr_store to _resolve_zarr_store
melonora Sep 8, 2025
fd08907
silence zarr parquet warnings
melonora Sep 8, 2025
74a81cd
change overwriting warning, silence in tests
melonora Sep 8, 2025
2bc809c
silence overwriting warnings
melonora Sep 8, 2025
6e43b42
silence chunk warning
melonora Sep 8, 2025
bd0468e
remove argument from docstring, update typehint
melonora Sep 9, 2025
54e0d87
update typehint
melonora Sep 9, 2025
fc052d7
small fixes
LucaMarconato Sep 9, 2025
421d45c
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 9, 2025
fc41701
merge
melonora Sep 9, 2025
7c1bd7e
fail if not root does not exist
melonora Sep 9, 2025
b9e5d92
write out function name
melonora Sep 9, 2025
8544684
small fixes
LucaMarconato Sep 9, 2025
6840304
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 9, 2025
fa62038
initial replacement parse_url
melonora Sep 9, 2025
c21ac8c
initial replacement parse_url
melonora Sep 9, 2025
6a0c2d0
alter docstring
melonora Sep 10, 2025
0c44ddb
adjust argument docstring
melonora Sep 10, 2025
a2a2a45
remove overwrite warnings
melonora Sep 10, 2025
ac56372
fix test
melonora Sep 10, 2025
22fce41
replace parse_url
melonora Sep 10, 2025
92390d2
change version
melonora Sep 10, 2025
ffc7ab0
remove type hints from docstrings
melonora Sep 10, 2025
1decd22
refactor to one function
melonora Sep 10, 2025
daa804b
change typehint
melonora Sep 10, 2025
564abae
remove typehints from docstring
melonora Sep 10, 2025
aa8e686
remove type hint return in docstring
melonora Sep 10, 2025
36c5987
remove comment
melonora Sep 10, 2025
48f0b81
ensure comment added back
melonora Sep 10, 2025
b6e23f7
fix channel metadata
melonora Sep 10, 2025
7959e02
get rid of TableValidateMixin
melonora Sep 10, 2025
8e16c8e
code fixes
LucaMarconato Sep 10, 2025
62101b6
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 10, 2025
906e3a9
fix
LucaMarconato Sep 10, 2025
878dce1
remove format without effect
melonora Sep 10, 2025
a9f4ca0
remove unnecessary catch warnings
melonora Sep 10, 2025
09ac152
add todo
melonora Sep 10, 2025
f6bae29
fixes
LucaMarconato Sep 10, 2025
76157b6
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 10, 2025
7997f3b
remove unnecessary .get('ome')
melonora Sep 10, 2025
28e8c3e
merge
melonora Sep 10, 2025
b19256b
add clarifying comment
melonora Sep 10, 2025
2bed0ce
wip tests readwrite across formats
LucaMarconato Sep 10, 2025
a616e77
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 10, 2025
21e5794
return None instead of AnnData
melonora Sep 10, 2025
46c0753
remove TODO
melonora Sep 10, 2025
9c3914e
remove invalid characters from test
melonora Sep 10, 2025
0f230b7
remove unused fixture and commented code
melonora Sep 10, 2025
5f5b8d7
almost completed extending readwrite tests to all container versions
LucaMarconato Sep 10, 2025
a190626
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 10, 2025
02a5df7
add OSError
melonora Sep 10, 2025
f10c9bf
partial fix writing empty spatialdata
melonora Sep 10, 2025
e731261
fix type
melonora Sep 11, 2025
b431781
fix overwrite when no zarr store
melonora Sep 11, 2025
4e33d98
fix write element to empty directory location
melonora Sep 11, 2025
04aae9f
correct no zarr store write test
melonora Sep 11, 2025
594003b
remove unnecessary code
melonora Sep 11, 2025
17eaf8d
fix write_element
LucaMarconato Sep 11, 2025
e38e5ff
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 11, 2025
ffdc9da
delete group isntead of .zattrs
melonora Sep 11, 2025
8f0f438
remove parse_url
LucaMarconato Sep 11, 2025
dacac00
Merge branch 'zarrv3' of https://github.com/scverse/spatialdata into …
LucaMarconato Sep 11, 2025
d3191a3
removed logger.ingo() and most of remaining warnings from tests
LucaMarconato Sep 11, 2025
48ad6fd
addressing review comments
LucaMarconato Sep 11, 2025
a9e7242
addressed consolidate metadata comment
LucaMarconato Sep 11, 2025
71adfe1
make full coverage of _validate_can_safely_write_to_path() easier to …
LucaMarconato Sep 11, 2025
7c56b25
restore partial read/write tests
melonora Sep 12, 2025
c88b488
minor changes in test_partial_read(); code review finished
LucaMarconato Sep 12, 2025
51292d5
support ome-zarr-py master
LucaMarconato Sep 16, 2025
d2b0463
fix docs
LucaMarconato Sep 16, 2025
709bdd8
ensure multiscales written correctly
melonora Sep 17, 2025
59df1ca
refactor read_zarr (#982)
melonora Sep 17, 2025
2 changes: 1 addition & 1 deletion .github/workflows/test.yaml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python: ["3.10", "3.12"]
python: ["3.11", "3.13"]
os: [ubuntu-latest]
include:
- os: macos-latest
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -33,7 +33,8 @@ dependencies = [
"networkx",
"numba>=0.55.0",
"numpy",
"ome_zarr>=0.8.4",
"ome_zarr>=0.12rc1",
# "ome_zarr>=0.8.4",
"pandas",
"pooch",
"pyarrow",
@@ -47,7 +48,7 @@ dependencies = [
"xarray>=2024.10.0",
"xarray-schema",
"xarray-spatial>=0.3.5",
"zarr<3",
"zarr>=3.0.0",
]

[project.optional-dependencies]
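The dependency hunk above moves the pin from `zarr<3` to `zarr>=3.0.0`. As a rough sketch of how calling code could check which major version is installed before choosing an API path (the helper names `major_version` and `zarr_major_version` are hypothetical and not part of this PR):

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '3.0.0'."""
    match = re.match(r"\d+", version_string)
    if match is None:
        raise ValueError(f"Cannot parse version string: {version_string!r}")
    return int(match.group())


def zarr_major_version() -> Optional[int]:
    """Major version of the installed zarr distribution, or None if absent."""
    try:
        return major_version(version("zarr"))
    except PackageNotFoundError:
        # zarr is not installed in this environment
        return None
```

With the new pin, `zarr_major_version()` would be expected to return 3 or higher in any environment that satisfies `pyproject.toml`.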
63 changes: 42 additions & 21 deletions src/spatialdata/_core/spatialdata.py
@@ -621,10 +621,13 @@ def _get_groups_for_element(
-------
either the existing Zarr subgroup or a new one.
"""
from spatialdata._io.format import SpatialDataFormat

if not isinstance(zarr_path, Path):
raise ValueError("zarr_path should be a Path object")
store = parse_url(zarr_path, mode="r+").store
root = zarr.group(store=store)
store = SpatialDataFormat().init_store(str(zarr_path), mode="r+")
# store = parse_url(zarr_path, mode="r+", fmt=SpatialDataFormat()).store
root = zarr.open_group(store=store, mode="r+")
if element_type not in ["images", "labels", "points", "polygons", "shapes", "tables"]:
raise ValueError(f"Unknown element type {element_type}")
element_type_group = root.require_group(element_type)
@@ -646,8 +649,10 @@ def _group_for_element_exists(self, zarr_path: Path, element_type: str, element_
-------
True if the group exists, False otherwise.
"""
store = parse_url(zarr_path, mode="r").store
root = zarr.group(store=store)
from spatialdata._io.format import SpatialDataFormat

store = parse_url(zarr_path, mode="r", fmt=SpatialDataFormat()).store
root = zarr.open_group(store=store, mode="r")
assert element_type in ["images", "labels", "points", "polygons", "shapes", "tables"]
exists = element_type in root and element_name in root[element_type]
store.close()
@@ -1068,19 +1073,27 @@ def elements_paths_on_disk(self) -> list[str]:
-------
A list of paths of the elements saved in the Zarr store.
"""
from spatialdata._io.format import SpatialDataFormat

if self.path is None:
raise ValueError("The SpatialData object is not backed by a Zarr store.")
store = parse_url(self.path, mode="r").store
root = zarr.group(store=store)

store = parse_url(self.path, mode="r", fmt=SpatialDataFormat()).store
root = zarr.open_group(store=store, mode="r")
elements_in_zarr = []

def find_groups(obj: zarr.Group, path: str) -> None:
# with the current implementation, a path of a zarr group if the path for an element if and only if its
# with the current implementation, a path of a zarr group is the path for an element if and only if its
# string representation contains exactly one "/"
if isinstance(obj, zarr.Group) and path.count("/") == 1:
elements_in_zarr.append(path)

root.visit(lambda path: find_groups(root[path], path))
for element_type in root:
if element_type in ["images", "labels", "points", "shapes", "tables"]:
for element_name in root[element_type]:
path = f"{element_type}/{element_name}"
elements_in_zarr.append(path)
# root.visit(lambda path: find_groups(root[path], path))
store.close()
return elements_in_zarr
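
The hunk above replaces the recursive `root.visit(...)` walk with an explicit two-level loop over element types and element names. A minimal sketch of that loop, using a plain nested dict as a stand-in for the zarr group (the function name `element_paths` and the sample data are assumptions for illustration):

```python
from collections.abc import Mapping

# Element types accepted by the new loop in elements_paths_on_disk()
ELEMENT_TYPES = ("images", "labels", "points", "shapes", "tables")


def element_paths(root: Mapping[str, Mapping[str, object]]) -> list[str]:
    """Collect 'element_type/element_name' paths from a two-level mapping."""
    paths = []
    for element_type in root:
        # Skip any top-level key that is not a known element type
        if element_type in ELEMENT_TYPES:
            for element_name in root[element_type]:
                paths.append(f"{element_type}/{element_name}")
    return paths


# A nested dict stands in for the zarr root group here
root = {"images": {"blobs": {}}, "tables": {"table": {}}, "other": {"x": {}}}
print(element_paths(root))  # ['images/blobs', 'tables/table']
```

Unlike the visitor, this form never descends below the second level, so it does not depend on how nested arrays inside an element are reported.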

@@ -1115,6 +1128,7 @@ def _validate_can_safely_write_to_path(
saving_an_element: bool = False,
) -> None:
from spatialdata._io._utils import _backed_elements_contained_in_path, _is_subfolder
from spatialdata._io.format import SpatialDataFormat

if isinstance(file_path, str):
file_path = Path(file_path)
@@ -1123,7 +1137,7 @@ def _validate_can_safely_write_to_path(
raise ValueError(f"file_path must be a string or a Path object, type(file_path) = {type(file_path)}.")

if os.path.exists(file_path):
if parse_url(file_path, mode="r") is None:
if parse_url(file_path, mode="r", fmt=SpatialDataFormat()) is None:
raise ValueError(
"The target file path specified already exists, and it has been detected to not be a Zarr store. "
"Overwriting non-Zarr stores is not supported to prevent accidental data loss."
@@ -1205,13 +1219,15 @@ def write(
:class:`~spatialdata._io.format.CurrentRasterFormat`, :class:`~spatialdata._io.format.CurrentShapesFormat`,
:class:`~spatialdata._io.format.CurrentPointsFormat`, :class:`~spatialdata._io.format.CurrentTablesFormat`.
"""
from spatialdata._io.format import SpatialDataFormat

if isinstance(file_path, str):
file_path = Path(file_path)
self._validate_can_safely_write_to_path(file_path, overwrite=overwrite)
self._validate_all_elements()

store = parse_url(file_path, mode="w").store
zarr_group = zarr.group(store=store, overwrite=overwrite)
store = parse_url(file_path, mode="w", fmt=SpatialDataFormat()).store
zarr_group = zarr.open_group(store=store, mode="w" if overwrite else "a")
self.write_attrs(zarr_group=zarr_group)
store.close()
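
In the `write()` hunk above, `zarr.group(store=store, overwrite=overwrite)` becomes `zarr.open_group(...)` with the mode derived from the `overwrite` flag. A small sketch of that mapping, with the zarr persistence modes summarised in a dict (the helper name `group_open_mode` and the `ZARR_MODES` table are illustrative assumptions, not code from this PR):

```python
# Persistence modes as described in the zarr documentation
ZARR_MODES = {
    "r": "read only (must exist)",
    "r+": "read/write (must exist)",
    "a": "read/write (create if it does not exist)",
    "w": "create (overwrite if it exists)",
    "w-": "create (fail if it exists)",
}


def group_open_mode(overwrite: bool) -> str:
    """Mirror the expression used in write(): 'w' if overwrite else 'a'."""
    return "w" if overwrite else "a"


for flag in (True, False):
    mode = group_open_mode(flag)
    print(f"overwrite={flag} -> mode={mode!r}: {ZARR_MODES[mode]}")
```

Note that `"a"` opens an existing group without clearing it, so the protection against clobbering existing data still rests on `_validate_can_safely_write_to_path()` being called first.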

@@ -1370,14 +1386,15 @@ def delete_element_from_disk(self, element_name: str | list[str]) -> None:
environment (e.g. operating system, local vs network storage, file permissions, ...) and call this function
appropriately (or implement a tailored solution), to prevent data loss.
"""
from spatialdata._io._utils import _backed_elements_contained_in_path
from spatialdata._io.format import SpatialDataFormat

if isinstance(element_name, list):
for name in element_name:
assert isinstance(name, str)
self.delete_element_from_disk(name)
return

from spatialdata._io._utils import _backed_elements_contained_in_path

if self.path is None:
raise ValueError("The SpatialData object is not backed by a Zarr store.")

@@ -1417,8 +1434,8 @@ def delete_element_from_disk(self, element_name: str | list[str]) -> None:
)

# delete the element
store = parse_url(self.path, mode="r+").store
root = zarr.group(store=store)
store = parse_url(self.path, mode="r+", fmt=SpatialDataFormat()).store
root = zarr.open_group(store=store, mode="r+")
root[element_type].pop(element_name)
store.close()

@@ -1438,15 +1455,19 @@ def _check_element_not_on_disk_with_different_type(self, element_type: str, elem
)

def write_consolidated_metadata(self) -> None:
store = parse_url(self.path, mode="r+").store
from spatialdata._io.format import SpatialDataFormat

store = parse_url(self.path, mode="r+", fmt=SpatialDataFormat()).store
# consolidate metadata to more easily support remote reading bug in zarr. In reality, 'zmetadata' is written
# instead of '.zmetadata' see discussion https://github.com/zarr-developers/zarr-python/issues/1121
zarr.consolidate_metadata(store, metadata_key=".zmetadata")
zarr.consolidate_metadata(store)
store.close()

def has_consolidated_metadata(self) -> bool:
from spatialdata._io.format import SpatialDataFormat

return_value = False
store = parse_url(self.path, mode="r").store
store = parse_url(self.path, mode="r", fmt=SpatialDataFormat()).store
if "zmetadata" in store:
return_value = True
store.close()
@@ -1614,16 +1635,16 @@ def _element_type_and_name_from_element_path(self, element_path: str) -> tuple[s
return element_type, element_name

def write_attrs(self, format: SpatialDataFormat | None = None, zarr_group: zarr.Group | None = None) -> None:
from spatialdata._io.format import _parse_formats
from spatialdata._io.format import SpatialDataFormat, _parse_formats

parsed = _parse_formats(formats=format)

store = None

if zarr_group is None:
assert self.is_backed(), "The SpatialData object must be backed by a Zarr store to write attrs."
store = parse_url(self.path, mode="r+").store
zarr_group = zarr.group(store=store, overwrite=False)
store = parse_url(self.path, mode="r+", fmt=SpatialDataFormat()).store
zarr_group = zarr.open_group(store=store, overwrite=False, mode="r+")

version = parsed["SpatialData"].spatialdata_format_version
version_specific_attrs = parsed["SpatialData"].attrs_to_dict()
4 changes: 2 additions & 2 deletions src/spatialdata/_io/format.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
import ome_zarr.format
import zarr
from anndata import AnnData
from ome_zarr.format import CurrentFormat, Format, FormatV01, FormatV02, FormatV03, FormatV04
from ome_zarr.format import Format, FormatV01, FormatV02, FormatV03, FormatV04
from pandas.api.types import CategoricalDtype
from shapely import GeometryType

@@ -46,7 +46,7 @@ def _parse_version(group: zarr.Group, expect_attrs_key: bool) -> str | None:
return version


class SpatialDataFormat(CurrentFormat):
class SpatialDataFormat(FormatV04):
pass


7 changes: 5 additions & 2 deletions src/spatialdata/_io/io_points.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,8 @@ def _read_points(
assert version is not None
format = PointsFormats[version]

path = os.path.join(f._store.path, f.path, "points.parquet")
store_root = f.store_path.store.root
path = os.path.join(store_root, f.path, "points.parquet")
# cache on remote file needed for parquet reader to work
# TODO: allow reading in the metadata without caching all the data
points = read_parquet("simplecache::" + path if path.startswith("http") else path)
@@ -57,7 +58,9 @@ def write_points(
t = _get_transformations(points)

points_groups = group.require_group(name)
path = Path(points_groups._store.path) / points_groups.path / "points.parquet"
store_root = points_groups.store_path.store.root
group_path = points_groups.path
path = Path(store_root) / group_path / "points.parquet"

# The following code iterates through all columns in the 'points' DataFrame. If the column's datatype is
# 'category', it checks whether the categories of this column are known. If not, it explicitly converts the
6 changes: 4 additions & 2 deletions src/spatialdata/_io/io_shapes.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,8 @@ def _read_shapes(
geometry = from_ragged_array(typ, coords, offsets)
geo_df = GeoDataFrame({"geometry": geometry}, index=index)
elif isinstance(format, ShapesFormatV02):
path = Path(f._store.path) / f.path / "shapes.parquet"
store_root = f.store_path.store.root
path = Path(store_root) / f.path / "shapes.parquet"
geo_df = read_parquet(path)
else:
raise ValueError(
@@ -93,7 +94,8 @@ def write_shapes(
attrs = format.attrs_to_dict(geometry)
attrs["version"] = format.spatialdata_format_version
elif isinstance(format, ShapesFormatV02):
path = Path(shapes_group._store.path) / shapes_group.path / "shapes.parquet"
store_root = shapes_group.store_path.store.root
path = Path(store_root) / shapes_group.path / "shapes.parquet"
shapes.to_parquet(path)

attrs = format.attrs_to_dict(shapes.attrs)
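Both `io_points.py` and `io_shapes.py` now build the parquet path from `store_path.store.root` plus the group path, instead of the private `_store.path`. A minimal sketch of that path construction using only `pathlib` (the helper name `parquet_path` and the sample paths are hypothetical; the sketch also assumes the store exposes a filesystem `root`, as a local directory store does):

```python
from pathlib import PurePosixPath


def parquet_path(store_root: str, group_path: str, filename: str) -> PurePosixPath:
    """Join the store root, the group's path inside the store, and the file name."""
    # group_path is expected to be relative (no leading '/'); an absolute
    # segment would make pathlib discard store_root.
    return PurePosixPath(store_root) / group_path / filename


print(parquet_path("/data/sample.zarr", "shapes/circles", "shapes.parquet"))
# /data/sample.zarr/shapes/circles/shapes.parquet
```

`PurePosixPath` is used here only to keep the printed result identical across operating systems; the diff itself uses `Path` and `os.path.join`.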
9 changes: 7 additions & 2 deletions src/spatialdata/_io/io_table.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,8 @@
from anndata import read_zarr as read_anndata_zarr
from anndata._io.specs import write_elem as write_adata
from ome_zarr.format import Format
from zarr.errors import ArrayNotFoundError

# from zarr.errors import ArrayNotFoundError # removed in zarr 3.0
from spatialdata._io._utils import BadFileHandleMethod, handle_read_errors
from spatialdata._io.format import CurrentTablesFormat, TablesFormats, _parse_version
from spatialdata._logging import logger
@@ -53,7 +53,12 @@ def _read_table(
with handle_read_errors(
on_bad_files=on_bad_files,
location=f"{subgroup.path}/{table_name}",
exc_types=(JSONDecodeError, KeyError, ValueError, ArrayNotFoundError),
exc_types=(
JSONDecodeError,
KeyError,
ValueError,
# ArrayNotFoundError, # removed in zarr 3.0
),
):
tables[table_name] = read_anndata_zarr(f_elem_store)

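The `io_table.py` hunk comments out `ArrayNotFoundError` because the import is reported as unavailable in zarr 3.0. One possible alternative, sketched here as an assumption rather than as what the PR does, is to add the exception to the tuple only when the installed zarr provides it (the helper name `read_error_types` is hypothetical):

```python
from json import JSONDecodeError


def read_error_types() -> tuple:
    """Build the exception tuple for handle_read_errors-style usage."""
    exc_types = [JSONDecodeError, KeyError, ValueError]
    try:
        # Present in some zarr releases, missing in others (or zarr not installed)
        from zarr.errors import ArrayNotFoundError
    except ImportError:
        pass
    else:
        exc_types.append(ArrayNotFoundError)
    return tuple(exc_types)


print([exc.__name__ for exc in read_error_types()])
```

This keeps the call site unchanged across zarr versions, at the cost of hiding which exceptions are actually caught in a given environment.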