Commit 178f0b2

docs: Add <npy@> and plugin codecs to type system docs
- Add <npy@> to built-in codecs with full documentation
- Add Plugin Codecs section with dj-zarr-codecs, dj-figpack-codecs, dj-photon-codecs
- Explain entry point discovery mechanism
- Update Choosing Types table with new recommendations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent b0b3ed7 commit 178f0b2

1 file changed: +82 -2 lines changed

src/explanation/type-system.md

Lines changed: 82 additions & 2 deletions
@@ -13,6 +13,7 @@ graph TB
     subgraph "Layer 3: Codecs"
         blob["‹blob›"]
         attach["‹attach›"]
+        npy["‹npy@›"]
         object["‹object@›"]
         hash["‹hash@›"]
         custom["‹custom›"]
@@ -34,6 +35,7 @@ graph TB
 
     blob --> bytes
     attach --> bytes
+    npy --> json
     object --> json
     hash --> json
     bytes --> BLOB
@@ -108,10 +110,51 @@ Codec types use angle bracket notation:
 |-------|----------|--------------|---------|
 | `<blob>` ||`<blob@>` | Python object |
 | `<attach>` ||`<attach@>` | Local file path |
+| `<npy@>` ||| NpyRef (lazy) |
 | `<object@>` ||| ObjectRef |
 | `<hash@>` ||| bytes |
 | `<filepath@>` ||| ObjectRef |
 
+### Plugin Codecs
+
+Additional codecs are available as separately installed packages. This ecosystem is actively expanding—new codecs are added as community needs arise.
+
+| Package | Codec | Description | Repository |
+|---------|-------|-------------|------------|
+| `dj-zarr-codecs` | `<zarr@>` | Zarr arrays with lazy chunked access | [datajoint/dj-zarr-codecs](https://github.com/datajoint/dj-zarr-codecs) |
+| `dj-figpack-codecs` | `<figpack@>` | Interactive browser visualizations | [datajoint/dj-figpack-codecs](https://github.com/datajoint/dj-figpack-codecs) |
+| `dj-photon-codecs` | `<photon@>` | Photon imaging data formats | [datajoint/dj-photon-codecs](https://github.com/datajoint/dj-photon-codecs) |
+
+**Installation and discovery:**
+
+Plugin codecs use Python's entry point mechanism for automatic registration. Install the package and DataJoint discovers the codec automatically:
+
+```bash
+pip install dj-zarr-codecs
+```
+
+```python
+import datajoint as dj
+
+# Codec is available immediately after install
+@schema
+class Analysis(dj.Computed):
+    definition = """
+    -> Recording
+    ---
+    data : <zarr@store>
+    """
+```
+
+Packages declare their codecs in `pyproject.toml` under the `datajoint.codecs` entry point group:
+
+```toml
+[project.entry-points."datajoint.codecs"]
+zarr = "dj_zarr_codecs:ZarrCodec"
+```
+
+DataJoint loads these entry points on first use, making third-party codecs indistinguishable from built-ins.
+
 ### `<blob>` — Serialized Python Objects
 
Stores NumPy arrays, dicts, lists, and other Python objects using DataJoint's custom binary serialization format.
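
The entry point discovery described in the Plugin Codecs section of this hunk can be sketched with the standard library alone. This is an illustration of the general mechanism, not DataJoint's actual loader: the helper names `load_codecs` and `installed_codecs` are hypothetical, and the `demo` entry point uses `json.JSONDecoder` as a stand-in target so the sketch runs without any plugin installed. Only the group name `datajoint.codecs` comes from the diff.

```python
from functools import lru_cache
from importlib.metadata import EntryPoint, entry_points

GROUP = "datajoint.codecs"  # group name taken from the pyproject.toml example


def load_codecs(eps):
    """Import each entry point's target and index it by entry point name."""
    registry = {}
    for ep in eps:
        # ep.value has the form "module:attr"; load() imports the module
        # and returns the named attribute.
        registry[ep.name] = ep.load()
    return registry


@lru_cache(maxsize=None)
def installed_codecs():
    """Scan installed packages once, on first call, then reuse the result."""
    eps = entry_points()
    if hasattr(eps, "select"):       # Python 3.10 and later
        selected = eps.select(group=GROUP)
    else:                            # older interpreters return a dict
        selected = eps.get(GROUP, [])
    return load_codecs(selected)


# Stand-in entry point (hypothetical): a real plugin would point at its
# codec class, e.g. "dj_zarr_codecs:ZarrCodec", instead of json.JSONDecoder.
demo = EntryPoint(name="demo", value="json:JSONDecoder", group=GROUP)
registry = load_codecs([demo])
print(sorted(registry))
```

Caching the scan is one plausible way to get the "loaded on first use" behavior the docs describe: nothing is imported until a codec is first requested, and later lookups reuse the same registry.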
@@ -163,6 +206,41 @@ class Config(dj.Manual):
     """
 ```
 
+### `<npy@>` — NumPy Arrays as .npy Files
+
+Stores NumPy arrays as standard `.npy` files with lazy loading. Returns `NpyRef` which provides metadata access (shape, dtype) without downloading.
+
+```python
+class Recording(dj.Computed):
+    definition = """
+    -> Session
+    ---
+    waveform : <npy@>              # Default store
+    spectrogram : <npy@archive>    # Named store
+    """
+```
+
+**Lazy access:**
+
+```python
+ref = (Recording & key).fetch1('waveform')
+ref.shape  # (1000, 32) — no download
+ref.dtype  # float64 — no download
+
+# Explicit load
+arr = ref.load()
+
+# Transparent numpy integration
+result = np.mean(ref)  # Downloads automatically
+```
+
+**Key features:**
+
+- **Portable format**: Standard `.npy` readable by NumPy, MATLAB, etc.
+- **Lazy loading**: Shape/dtype available without I/O
+- **Safe bulk fetch**: Fetching many rows doesn't download until needed
+- **Memory mapping**: `ref.load(mmap_mode='r')` for random access to large arrays
+
 ### `<object@>` — Path-Addressed Storage
 
For large/complex file structures (Zarr, HDF5). Path derived from primary key.
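
The lazy-loading and memory-mapping claims in the `<npy@>` hunk rest on properties of the `.npy` format itself: shape and dtype live in a small header at the start of the file, and the data that follows can be memory-mapped. The sketch below shows this with plain NumPy on a local temporary file. It does not use or reproduce `NpyRef`, whose internals are not shown in this diff, and the file name and array contents are made up for illustration.

```python
import os
import tempfile

import numpy as np
from numpy.lib import format as npy_format

# Write a sample array to a temporary .npy file (illustrative data only).
path = os.path.join(tempfile.mkdtemp(), "waveform.npy")
np.save(path, np.zeros((1000, 32), dtype=np.float64))

# Read only the header: the magic string and version, then the
# shape / memory order / dtype record. The array data is not read.
with open(path, "rb") as f:
    version = npy_format.read_magic(f)
    # This reader matches format version (1, 0), which np.save
    # uses for small headers like this one.
    shape, fortran_order, dtype = npy_format.read_array_header_1_0(f)

print(shape, dtype)

# Memory-map the file: data is paged in from disk only when indexed.
arr = np.load(path, mmap_mode="r")
first_row_mean = float(arr[0].mean())
```

For a remote store the same idea would presumably apply by fetching just the leading bytes of the object, though how `NpyRef` obtains its metadata is not stated in this commit.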
@@ -241,9 +319,11 @@ class Network(dj.Computed):
 | Small scalars | Core types (`int32`, `float64`) |
 | Short strings | `varchar(n)` |
 | NumPy arrays (small) | `<blob>` |
-| NumPy arrays (large) | `<blob@>` |
+| NumPy arrays (large) | `<npy@>` or `<blob@>` |
 | Files to attach | `<attach>` or `<attach@>` |
-| Zarr/HDF5 | `<object@>` |
+| Zarr arrays | `<zarr@>` (plugin) |
+| Complex file structures | `<object@>` |
+| Interactive visualizations | `<figpack@>` (plugin) |
 | File references (in-store) | `<filepath@store>` |
 | Custom objects | Custom codec |
 