Commit fa22cf7

Merge pull request #122 from datajoint/docs/plugin-codecs-guide
docs: Add plugin codecs guide and fix codec notation
2 parents df94bb7 + b341385 commit fa22cf7

6 files changed: +683 -5 lines changed

mkdocs.yaml

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@ nav:
 - Object Storage:
 - Use Object Storage: how-to/use-object-storage.md
 - Use NPY Codec: how-to/use-npy-codec.md
+- Use Plugin Codecs: how-to/use-plugin-codecs.md
 - Create Custom Codecs: how-to/create-custom-codec.md
 - Manage Large Data: how-to/manage-large-data.md
 - Clean Up Storage: how-to/garbage-collection.md

src/explanation/custom-codecs.md

Lines changed: 9 additions & 0 deletions
@@ -334,3 +334,12 @@ Custom codecs enable:
 
 The codec system makes DataJoint extensible to any scientific domain without
 modifying the core framework.
+
+## Before Creating Your Own
+
+Check for existing plugin codecs that may already solve your needs:
+
+- **[dj-zarr-codecs](https://github.com/datajoint/dj-zarr-codecs)** — General numpy arrays with Zarr storage
+- **[dj-photon-codecs](https://github.com/datajoint/dj-photon-codecs)** — Photon-limited movies with Anscombe transformation and compression
+
+See the [Use Plugin Codecs](../how-to/use-plugin-codecs.md) guide for installation and usage of existing codec packages. Creating a custom codec is straightforward, but reusing existing ones saves time and ensures compatibility.

src/explanation/type-system.md

Lines changed: 20 additions & 1 deletion
@@ -111,7 +111,22 @@ Codecs provide `encode()`/`decode()` semantics for complex Python objects.
 
 ### `<blob>` — Serialized Python Objects
 
-Stores NumPy arrays, dicts, lists, and other Python objects.
+Stores NumPy arrays, dicts, lists, and other Python objects using DataJoint's custom binary serialization format.
+
+**Serialization format:**
+- **Protocol headers**:
+- `mYm` — MATLAB-compatible format (see [mYm on MATLAB FileExchange](https://www.mathworks.com/matlabcentral/fileexchange/81208-mym) and [mym on GitHub](https://github.com/datajoint/mym))
+- `dj0` — Python-extended format supporting additional types
+- **Optional compression**: zlib compression for data > 1KB
+- **Type-specific encoding**: Each Python type has a specific serialization code
+- **Version detection**: Protocol header embedded in blob enables format detection
+
+**Supported types:**
+- NumPy arrays (numeric, structured, recarrays)
+- Collections (dict, list, tuple, set)
+- Scalars (int, float, bool, complex, str, bytes)
+- Date/time objects (datetime, date, time)
+- UUID, Decimal
 
 ```python
 class Results(dj.Computed):
@@ -124,6 +139,10 @@ class Results(dj.Computed):
 """
 ```
 
+**Storage modes:**
+- `<blob>` — Stored in database as LONGBLOB (up to ~1GB depending on MySQL config)
+- `<blob@>` — Stored externally via `<hash@>` with MD5 deduplication
+
 ### `<attach>` — File Attachments
 
 Stores files with filename preserved.
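
Note on the `<blob>` hunk above: the added bullets describe a protocol header (`mYm` or `dj0`) embedded in the blob plus optional zlib compression for data larger than 1KB. The sketch below illustrates only that general idea. It is not DataJoint's wire format: the one-byte `Z`/`U` flag, the helper names `wrap` and `unwrap`, and the exact threshold handling are assumptions made for illustration.

```python
import zlib

# Assumption: "data > 1KB" is read here as more than 1024 bytes.
COMPRESSION_THRESHOLD = 1024
# The two protocol header names mentioned in the diff.
KNOWN_HEADERS = (b"mYm", b"dj0")


def wrap(payload, header=b"dj0"):
    """Prefix payload with a protocol header; compress when it is large.

    Hypothetical layout: a one-byte flag (b"Z" compressed, b"U" not),
    followed by header + payload (zlib-compressed when the flag is b"Z").
    """
    if header not in KNOWN_HEADERS:
        raise ValueError("unknown protocol header: %r" % header)
    body = header + payload
    if len(body) > COMPRESSION_THRESHOLD:
        return b"Z" + zlib.compress(body)
    return b"U" + body


def unwrap(blob):
    """Reverse wrap(): return (header, payload), detecting the format."""
    flag, body = blob[:1], blob[1:]
    if flag == b"Z":
        body = zlib.decompress(body)
    elif flag != b"U":
        raise ValueError("unknown compression flag: %r" % flag)
    header = body[:3]
    if header not in KNOWN_HEADERS:
        raise ValueError("unknown protocol header: %r" % header)
    return header, body[3:]


small = wrap(b"abc")
large = wrap(b"x" * 5000, header=b"mYm")
```

Because the header travels inside the blob, `unwrap` can tell the two formats apart without external metadata, which is the role the "Version detection" bullet gives the protocol header.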

src/how-to/index.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ they assume you understand the basics and focus on getting things done.
 - [Object Storage Overview](object-storage-overview.md) — Navigation guide for all storage docs
 - [Choose a Storage Type](choose-storage-type.md) — Decision guide for codecs
 - [Use Object Storage](use-object-storage.md) — When and how
+- [Use Plugin Codecs](use-plugin-codecs.md) — Install codec packages via entry points
 - [Create Custom Codecs](create-custom-codec.md) — Domain-specific types
 - [Manage Large Data](manage-large-data.md) — Blobs, streaming, efficiency
 - [Clean Up External Storage](garbage-collection.md) — Garbage collection
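
Note on the "Install codec packages via entry points" line above: Python packages advertise plugins through entry-point groups, which the standard library can enumerate. The sketch below lists the entry points registered under one group. The group name `datajoint.codecs` and the helper `list_codec_plugins` are assumptions for illustration; the diff does not state which group name DataJoint actually uses.

```python
from importlib.metadata import entry_points


def list_codec_plugins(group="datajoint.codecs"):
    """Return sorted names of entry points registered under `group`.

    The default group name is a hypothetical placeholder, not taken
    from the DataJoint documentation.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry-point collection
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: mapping of group name to entry points
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


plugins = list_codec_plugins()
```

An empty list means no installed package registers anything under that group, which is expected until a codec package has been installed.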
