Zarr-Python 3 represents a major refactor of the Zarr-Python codebase. Some of the goals motivating this refactor included:
- adding support for the Zarr format 3 specification (along with the Zarr format 2 specification)
- cleaning up internal and user facing APIs
- improving performance (particularly in high latency storage environments like cloud object stores)
To accommodate this, Zarr-Python 3 introduces a number of changes to the API, including a number of significant breaking changes and deprecations.
This page provides a guide explaining breaking changes and deprecations to help you migrate your code from version 2 to version 3. If we have missed anything, please open a GitHub issue so we can improve this guide.
The goals described above necessitated some breaking changes to the API (hence the
major version update), but where possible we have maintained backwards compatibility
in the most widely used parts of the API. This includes the [zarr.Array][] and
[zarr.Group][] classes and the "top-level API" (e.g. [zarr.open_array][] and
[zarr.open_group][]).
Before migrating to Zarr-Python 3, we suggest projects that depend on Zarr-Python take the following actions in order:
1. Pin the supported Zarr-Python version to `zarr>=2,<3`. This is a best practice and will protect your users from any incompatibilities that may arise during the release of Zarr-Python 3. This pin can be removed after migrating to Zarr-Python 3.
2. Limit your imports from the Zarr-Python package. Most of the primary API `zarr.*` will be compatible in Zarr-Python 3. However, the following breaking API changes are planned:

    - `numcodecs.*` will no longer be available in `zarr.*`. To migrate, import codecs directly from `numcodecs`:

        ```python
        from numcodecs import Blosc
        # instead of:
        # from zarr import Blosc
        ```

    - The `zarr.v3_api_available` feature flag is being removed. In Zarr-Python 3 the v3 API is always available, so you shouldn't need to use this flag.
    - The following internal modules are being removed or significantly changed. If your application relies on imports from any of the modules below, you will need to either a) modify your application to no longer rely on these imports or b) vendor the parts of the specific modules that you need.

        - `zarr.attrs` has gone, with no replacement
        - `zarr.codecs` has changed, see "Codecs" section below for more information
        - `zarr.context` has gone, with no replacement
        - `zarr.core` remains but should be considered private API
        - `zarr.hierarchy` has gone, with no replacement (use `zarr.Group` in place of `zarr.hierarchy.Group`)
        - `zarr.indexing` has gone, with no replacement
        - `zarr.meta` has gone, with no replacement
        - `zarr.meta_v1` has gone, with no replacement
        - `zarr.sync` has gone, with no replacement
        - `zarr.types` has gone, with no replacement
        - `zarr.util` has gone, with no replacement
        - `zarr.n5` has gone, see below for alternative N5 options

3. Test that your package works with version 3.
4. Update the pin to include `zarr>=3,<4`.
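
A package that has to run against both major versions during the migration window could branch on the installed major version. The snippet below is a minimal sketch of that idea, not an official recipe: `installed_major` and the `DirectoryLikeStore` alias are hypothetical names introduced here, and the v3 store name follows the store renames described later in this guide.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_major(package: str = "zarr") -> Optional[int]:
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        # "3.0.1" -> 3; only the leading component is inspected.
        return int(version(package).split(".")[0])
    except PackageNotFoundError:
        return None


ZARR_MAJOR = installed_major()

if ZARR_MAJOR is not None and ZARR_MAJOR >= 3:
    # Zarr-Python 3: stores live in zarr.storage.
    from zarr.storage import LocalStore as DirectoryLikeStore
elif ZARR_MAJOR == 2:
    # Zarr-Python 2: stores are importable from the top-level module.
    from zarr import DirectoryStore as DirectoryLikeStore
```

Branches like this are only a stopgap; once the pin is updated to `zarr>=3,<4` the v2 branch can be deleted.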
Zarr-Python 2.x is still available, though we recommend migrating to Zarr-Python 3 for its performance improvements and new features. Security and bug fixes will be made to the 2.x series for at least six months following the first Zarr-Python 3 release. If you need to use the latest Zarr-Python 2 release, you can install it with:
```console
$ pip install "zarr==2.*"
```

!!! note
    Development and maintenance of the 2.x release series has moved to the `support/v2` branch. Issues and pull requests related to this branch are tagged with the `V2` label.
The following sections provide details on breaking changes in Zarr-Python 3.
Changes to the [zarr.Array][] class:

- Disallow direct construction - the signature for initializing the `Array` class has changed significantly. Please use [zarr.create_array][] or [zarr.open_array][] instead of directly constructing the [zarr.Array][] class.
- Defaulting to `zarr_format=3` - newly created arrays will use version 3 of the Zarr specification. To continue using version 2, set `zarr_format=2` when creating arrays or set `default_zarr_format=2` in Zarr's runtime configuration.
- Function signature change to [zarr.Array.resize][] - the `resize` function now takes a `zarr.core.common.ShapeLike` input rather than separate arguments for each dimension. Use `resize((10,10))` in place of `resize(10,10)`.
Changes to the [zarr.Group][] class:

- Disallow direct construction - use [zarr.open_group][] or [zarr.create_group][] instead of directly constructing the `zarr.Group` class.
- The h5py compatibility methods `create_dataset` and `require_dataset` have been removed. Use the following replacements:
    - [zarr.Group.create_array][] in place of `Group.create_dataset`
    - [zarr.Group.require_array][] in place of `Group.require_dataset`
- Disallow "." syntax for getting group members. To get a member of a group named `foo`, use `group["foo"]` in place of `group.foo`.
- The `zarr.storage.init_group` low-level helper function has been removed. Use [zarr.open_group][] or [zarr.create_group][] instead:

    ```diff
    - from zarr.storage import init_group
    - init_group(store, overwrite=True, path="my/path")
    + import zarr
    + zarr.open_group(store, mode="w", path="my/path")
    ```
The Store API has changed significantly in Zarr-Python 3.
The MutableMapping base class has been replaced in favor of a custom abstract base class ([zarr.abc.store.Store][]).
An asynchronous interface is used for all store methods that use I/O.
This change ensures that these store methods are non-blocking and are as performant as possible.
Store implementations have moved from the top-level module to `zarr.storage`:

```diff
# Before (v2)
- from zarr import MemoryStore
+ from zarr.storage import MemoryStore
```

The following stores have been renamed or changed:

| v2 | v3 |
|---|---|
| `DirectoryStore` | [zarr.storage.LocalStore][] |
| `FSStore` | [zarr.storage.FsspecStore][] |
| `TempStore` | Use [tempfile.TemporaryDirectory][] with [LocalStore][zarr.storage.LocalStore] |
| `zarr. |
A number of deprecated stores were also removed. See issue #1274 for more details on the removal of these stores.

- `N5Store` - see https://github.com/zarr-developers/n5py for an alternative interface to N5 formatted data.
- `ABSStore` - use the [zarr.storage.FsspecStore][] instead along with fsspec's adlfs backend.
- `DBMStore`
- `LMDBStore`
- `SQLiteStore`
- `MongoDBStore`
- `RedisStore`

The latter five stores in this list do not have an equivalent in Zarr-Python 3. If you are interested in developing a custom store that targets these backends, see developing custom stores or open an issue to discuss your use case.
Codecs defined in numcodecs (and also imported into the zarr.codecs namespace in Zarr-Python 2)
should still be used when creating Zarr format 2 arrays.
Codecs for creating Zarr format 3 arrays are available in two locations:

- `zarr.codecs` contains Zarr format 3 codecs that are defined in the codecs section of the Zarr format 3 specification.
- `numcodecs.zarr3` contains codecs from `numcodecs` that can be used to create Zarr format 3 arrays, but are not necessarily part of the Zarr format 3 specification.

When installing using pip:

- The new `remote` dependency group can be used to install a supported version of `fsspec`, required for remote data access.
- The new `gpu` dependency group can be used to install a supported version of `cuda`, required for GPU functionality.
- The `jupyter` optional dependency group has been removed, since v3 contains no Jupyter-specific functionality.

The keyword argument `zarr_version` in most creation functions in `zarr` (e.g. [zarr.create][], [zarr.open][], [zarr.group][], [zarr.array][]) has been removed. Use `zarr_format` instead.

Zarr-Python 3 is still under active development, and is not yet fully complete. The following list summarizes areas of the codebase that we expect to build out after the 3.0.0 release. If features listed below are important to your use case of Zarr-Python, please open (or comment on) a GitHub issue.
The following functions / methods have not been ported to Zarr-Python 3 yet:

- `zarr.copy` (issue #2407)
- `zarr.copy_all` (issue #2407)
- `zarr.copy_store` (issue #2407)
- `zarr.Group.move` (issue #2108)

The following features (corresponding to function arguments to functions in
zarr) have not been ported to Zarr-Python 3 yet. Using these features
will raise a warning or a NotImplementedError:

- `cache_attrs`
- `cache_metadata`
- `chunk_store` (issue #2495)
- `meta_array`
- `object_codec` (issue #2617)
- `synchronizer` (issue #1596)
- `dimension_separator`

The following features that were supported by Zarr-Python 2 have not been ported to Zarr-Python 3 yet:
- Object dtypes (issue #2616)
- Ragged arrays (issue #2618)
- Groups and Arrays do not implement `__enter__` and `__exit__` protocols (issue #2619)
- Default filters for object dtypes for Zarr format 2 arrays (issue #2627)