Commit 932788f

Merge branch 'main' into ig/spec0_py314
2 parents 2e3a48d + 536ce9d commit 932788f

29 files changed

Lines changed: 761 additions & 84 deletions

.github/workflows/codspeed.yml

Lines changed: 5 additions & 9 deletions
@@ -1,13 +1,10 @@
 name: CodSpeed Benchmarks

 on:
-  push:
-    branches:
-      - "main"
+  schedule:
+    - cron: '0 9 * * 1' # Every Monday at 9am UTC
   pull_request:
-    types: [labeled, synchronize]
-  # `workflow_dispatch` allows CodSpeed to trigger backtest
-  # performance analysis in order to generate initial data.
+    types: [labeled]
   workflow_dispatch:

 permissions:
@@ -17,15 +14,14 @@ jobs:
   benchmarks:
     name: Run benchmarks
     runs-on: codspeed-macro
-    # Only run benchmarks for: pushes to main, manual triggers, or PRs with 'benchmark' label
     if: |
-      github.event_name == 'push' ||
+      github.event_name == 'schedule' ||
       github.event_name == 'workflow_dispatch' ||
       (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'benchmark'))
     steps:
       - uses: actions/checkout@v6
         with:
-          fetch-depth: 0 # grab all branches and tags
+          fetch-depth: 0
       - name: Set up Python
         uses: actions/setup-python@v6
         with:
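
The job-level `if:` expression in this workflow can be read as a small predicate. The sketch below mirrors it in Python for illustration only. The function name `should_run_benchmarks` and its arguments are hypothetical and are not part of the workflow or of any GitHub API. The comparison is made case-insensitive on the assumption that GitHub expression string comparisons ignore case.

```python
from typing import Sequence


def should_run_benchmarks(event_name: str, pr_labels: Sequence[str] = ()) -> bool:
    """Hypothetical mirror of the workflow's `if:` expression."""
    event = event_name.lower()
    # Scheduled runs and manual triggers always run the benchmarks.
    if event in ("schedule", "workflow_dispatch"):
        return True
    # Pull requests run only when they carry the 'benchmark' label.
    labels = [label.lower() for label in pr_labels]
    return event == "pull_request" and "benchmark" in labels


print(should_run_benchmarks("schedule"))                     # scheduled cron run
print(should_run_benchmarks("pull_request", ["benchmark"]))  # labeled PR
print(should_run_benchmarks("push"))                         # push no longer qualifies
```

Because `pull_request` is now restricted to `types: [labeled]`, new commits pushed to a labeled pull request no longer start the workflow by themselves.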

changes/3778.misc.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+`Group.tree()` no longer requires the `rich` dependency. Tree rendering now uses built-in ANSI bold for terminals and HTML bold for Jupyter. New parameters: `plain=True` for unstyled output, and `max_nodes` (default 500) to truncate large hierarchies with early bailout.
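
The behavior described in this changelog entry (bold names without `rich`, a plain mode, and truncation after a node budget) can be sketched in a few lines. This is a minimal standalone illustration, not zarr-python's implementation. The function `render_tree`, the nested-dict input, the truncation message, and the choice to count the root toward `max_nodes` are all assumptions made for the example. Only the ANSI case is shown, not the HTML rendering for Jupyter.

```python
BOLD, RESET = "\x1b[1m", "\x1b[0m"  # ANSI escape codes for bold on/off


def render_tree(name, children, *, plain=False, max_nodes=500):
    """Render a nested dict as a text tree (hypothetical stand-in, not zarr's code)."""
    style = (lambda s: s) if plain else (lambda s: f"{BOLD}{s}{RESET}")
    lines = [style(name)]  # assumption: the root counts toward max_nodes
    truncated = False

    def walk(kids, prefix):
        nonlocal truncated
        items = list(kids.items())
        for i, (label, sub) in enumerate(items):
            if len(lines) >= max_nodes:
                truncated = True  # stop descending as soon as the budget is spent
                return
            last = i == len(items) - 1
            lines.append(prefix + ("└── " if last else "├── ") + style(label))
            walk(sub, prefix + ("    " if last else "│   "))
            if truncated:
                return

    walk(children, "")
    if truncated:
        lines.append(f"... (truncated at {max_nodes} nodes)")
    return "\n".join(lines)


hierarchy = {"foo": {"bar": {}, "baz": {}}, "qux": {}}
print(render_tree("/", hierarchy, plain=True))
print(render_tree("/", hierarchy, plain=True, max_nodes=2))
```

Returning from the walk as soon as the budget is reached is what keeps the cost bounded for very large hierarchies: nodes past the limit are never visited.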

docs/overrides/stylesheets/extra.css

Lines changed: 0 additions & 5 deletions
@@ -56,11 +56,6 @@
 .md-header .md-search__input {
   background-color: rgba(255, 255, 255, 0.15);
   border: 1px solid rgba(255, 255, 255, 0.2);
-  color: white;
-}
-
-.md-header .md-search__input::placeholder {
-  color: rgba(255, 255, 255, 0.7);
 }

 /* Navigation tabs */

docs/user-guide/glossary.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+# Glossary
+
+This page defines key terms used throughout the zarr-python documentation and API.
+
+## Array Structure
+
+### Array
+
+An N-dimensional typed array stored in a Zarr [store](#store). An array's
+[metadata](#metadata) defines its shape, data type, chunk layout, and codecs.
+
+### Chunk
+
+The fundamental unit of data in a Zarr array. An array is divided into chunks
+along each dimension according to the [chunk grid](#chunk-grid), which is currently
+part of Zarr's private API. Each chunk is independently compressed and encoded
+through the array's [codec](#codec) pipeline.
+
+When [sharding](#shard) is used, "chunk" refers to the inner chunks within each
+shard, because those are the compressible units. The chunks are the smallest units
+that can be read independently.
+
+!!! warning "Convention specific to zarr-python"
+    The use of "chunk" to mean the inner sub-chunk within a shard is a convention
+    adopted by zarr-python's `Array` API. In the Zarr V3 specification and in other
+    Zarr implementations, "chunk" may refer to the top-level grid cells (which
+    zarr-python calls "shards" when the sharding codec is used). Be aware of this
+    distinction when working across libraries.
+
+**API**: [`Array.chunks`][zarr.Array.chunks] returns the chunk shape. When
+sharding is used, this is the inner chunk shape.
+
+### Chunk Grid
+
+The partitioning of an array's elements into [chunks](#chunk). In Zarr V3, the
+chunk grid is defined in the array [metadata](#metadata) and determines the
+boundaries of each storage object.
+
+When sharding is used, the chunk grid defines the [shard](#shard) boundaries,
+not the inner chunk boundaries. The inner chunk shape is defined within the
+[sharding codec](#shard).
+
+**API**: The `chunk_grid` field in array metadata contains the storage-level
+grid.
+
+### Shard
+
+A storage object that contains one or more [chunks](#chunk). Sharding reduces the
+number of objects in a [store](#store) by grouping chunks together, which
+improves performance on file systems and object storage.
+
+Within each shard, chunks are compressed independently and can be read
+individually. However, writing requires updating the full shard for consistency,
+making shards the unit of writing and chunks the unit of reading.
+
+Sharding is implemented as a [codec](#codec) (the sharding indexed codec).
+When sharding is used:
+
+- The [chunk grid](#chunk-grid) in metadata defines the shard boundaries
+- The sharding codec's `chunk_shape` defines the inner chunk size
+- Each shard contains `shard_shape / chunk_shape` chunks per dimension
+
+**API**: [`Array.shards`][zarr.Array.shards] returns the shard shape, or `None`
+if sharding is not used. [`Array.chunks`][zarr.Array.chunks] returns the inner
+chunk shape.
+
+## Storage
+
+### Store
+
+A key-value storage backend that holds Zarr data and metadata. Stores implement
+the [`zarr.abc.store.Store`][] interface. Examples include local file systems,
+cloud object storage (S3, GCS, Azure), zip files, and in-memory dictionaries.
+
+Each [chunk](#chunk) or [shard](#shard) is stored as a single value (object or
+file) in the store, addressed by a key derived from its grid coordinates.
+
+### Metadata
+
+The JSON document (`zarr.json`) that describes an [array](#array) or group. For
+arrays, metadata includes the shape, data type, [chunk grid](#chunk-grid), fill
+value, and [codec](#codec) pipeline. Metadata is stored alongside the data in
+the [store](#store). Zarr-Python does not yet expose its internal metadata
+representation as part of its public API.
+
+## Codecs
+
+### Codec
+
+A transformation applied to array data during reading and writing. Codecs are
+chained into a pipeline and come in three types:
+
+- **Array-to-array**: Transforms like transpose that rearrange array elements
+- **Array-to-bytes**: Serialization that converts an array to a byte sequence
+  (exactly one required)
+- **Bytes-to-bytes**: Compression or checksums applied to the serialized bytes
+
+The [sharding indexed codec](#shard) is a special array-to-bytes codec that
+groups multiple [chunks](#chunk) into a single storage object.
+
+## API Properties
+
+The following properties are available on [`zarr.Array`][]:
+
+| Property | Description |
+|----------|-------------|
+| `.chunks` | Chunk shape — the inner chunk shape when sharding is used |
+| `.shards` | Shard shape, or `None` if no sharding |
+| `.nchunks` | Total number of independently compressible units across the array |
+| `.cdata_shape` | Number of independently compressible units per dimension |
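
The relationship between shard shape, inner chunk shape, and the counts in the table above can be checked with plain arithmetic. The sketch below does not call zarr. The function `layout_summary` and its dictionary keys are hypothetical names chosen to echo the glossary terms. It assumes the array shape is an exact multiple of the shard shape, so partial shards at the array edge are out of scope.

```python
from math import prod


def layout_summary(shape, shard_shape, chunk_shape):
    """Count shards and inner chunks per dimension (illustrative arithmetic only)."""
    if any(s % c for s, c in zip(shard_shape, chunk_shape)):
        raise ValueError("shard shape must be a multiple of the inner chunk shape")
    if any(n % s for n, s in zip(shape, shard_shape)):
        raise ValueError("this sketch assumes shape is a multiple of the shard shape")
    # Each shard contains shard_shape / chunk_shape chunks per dimension.
    chunks_per_shard = tuple(s // c for s, c in zip(shard_shape, chunk_shape))
    # The storage-level grid: one stored object per shard.
    shard_grid = tuple(n // s for n, s in zip(shape, shard_shape))
    # Independently compressible units (inner chunks) per dimension.
    chunk_counts = tuple(g * k for g, k in zip(shard_grid, chunks_per_shard))
    return {
        "chunks_per_shard": chunks_per_shard,
        "shard_grid": shard_grid,
        "chunks_per_dimension": chunk_counts,
        "total_chunks": prod(chunk_counts),
        "stored_objects": prod(shard_grid),
    }


summary = layout_summary(shape=(1000, 1000), shard_shape=(200, 200), chunk_shape=(100, 100))
for key, value in summary.items():
    print(f"{key}: {value}")
```

In this example the store holds 25 objects (one per shard), while the array has 100 independently readable inner chunks, which is the reduction in object count that sharding is meant to provide.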

docs/user-guide/groups.md

Lines changed: 0 additions & 2 deletions
@@ -133,5 +133,3 @@ Groups also have the [`zarr.Group.tree`][] method, e.g.:
 print(root.tree())
 ```

-!!! note
-    [`zarr.Group.tree`][] requires the optional [rich](https://rich.readthedocs.io/en/stable/) dependency. It can be installed with the `[tree]` extra.

docs/user-guide/index.md

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,10 @@ Take your skills to the next level:
 - **[Extending](extending.md)** - Extend functionality with custom code
 - **[Consolidated Metadata](consolidated_metadata.md)** - Advanced metadata management

+## Reference
+
+- **[Glossary](glossary.md)** - Definitions of key terms (chunks, shards, codecs, etc.)
+
 ## Need Help?

 - Browse the [API Reference](../api/zarr/index.md) for detailed function documentation

docs/user-guide/installation.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ These can be installed using `pip install "zarr[<extra>]"`, e.g. `pip install "z
 - `gpu`: support for GPUs
 - `remote`: support for reading/writing to remote data stores

-Additional optional dependencies include `rich`, `universal_pathlib`. These must be installed separately.
+Additional optional dependencies include `universal_pathlib`. These must be installed separately.

 ## conda

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ nav:
     - user-guide/gpu.md
     - user-guide/consolidated_metadata.md
     - user-guide/experimental.md
+    - user-guide/glossary.md
   - Examples:
     - user-guide/examples/custom_dtype.md
   - API Reference:

pyproject.toml

Lines changed: 1 addition & 3 deletions
@@ -70,7 +70,7 @@ gpu = [
     "cupy-cuda12x",
 ]
 cli = ["typer"]
-optional = ["rich", "universal-pathlib"]
+optional = ["universal-pathlib"]

 [project.scripts]
 zarr = "zarr._cli.cli:app"
@@ -122,7 +122,6 @@ docs = [
     "towncrier",
     # Optional dependencies to run examples
     "numcodecs[msgpack]",
-    "rich",
     "s3fs>=2023.10.0",
     "astroid<4",
     "pytest",
@@ -131,7 +130,6 @@ dev = [
     {include-group = "test"},
     {include-group = "remote-tests"},
     {include-group = "docs"},
-    "rich",
     "universal-pathlib",
     "mypy",
 ]

src/zarr/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -78,7 +78,6 @@ def print_packages(packages: list[str]) -> None:
         "s3fs",
         "gcsfs",
         "universal-pathlib",
-        "rich",
         "obstore",
     ]
