Commit 83e401a

docs: add canonical name column to registered filter plugins
Introduces the canonical filter name as a normative field in the registry, in preparation for HDF5 v3 (H5Z_class3_t) plugins where the H5Z_class_t::name field becomes a load-bearing string identifier rather than a free-form debug comment.

- Adds a Canonical Name column to the registry table, pre-filled with proposed names marked (proposed) pending maintainer confirmation.
- Adds a "Canonical Names (HDF5 v3 / H5Z_class3_t)" section documenting where the name appears, the syntactic rules ([A-Za-z0-9_.-], <=255 bytes, case-sensitive), the first-registered-wins allocation policy, and the self-namespacing convention for unregistered plugins.
- Extends "How to Register HDF5 Filter Plugin" to request a proposed canonical name as part of the submission.
- Adds "Updating an existing registration for v3" with the three-step task list for current plugin maintainers.

Existing plugin builds are unaffected; canonical names only become load-bearing when a plugin opts into H5Z_class3_t. Maintainer confirmations are tracked in #255. See RFC-HDFG-2026-001 for the v3 design.
1 parent 0e24d5c commit 83e401a

1 file changed

Lines changed: 79 additions & 38 deletions

File tree

docs/RegisteredFilterPlugins.md

@@ -11,6 +11,7 @@ Any member of the HDF5 community can register a plugin for their or a third-part
1111
* Maintainer's contact information. At minimum an email address, preferably with additional information such as a personal website, GitHub, or social network handles. The more ways there are to contact the responsible maintainer, the better.
1212
* Filter plugin's repository.
1313
* Description of the new plugin including the specifics of the filter parameters (`cd_nelmts` and `cd_values[]`) supported by the plugin.
14+
* A proposed **canonical name** for the filter. This is the short, stable string identifier that an HDF5 v3 (`H5Z_class3_t`) plugin will advertise in its `name` field and that the library will write into the on-disk filter-pipeline message. The canonical name must be non-empty, no more than 255 bytes, drawn from the character class `[A-Za-z0-9_.-]`, and is matched case-sensitively. See [Canonical Names](#canonical-names-hdf5-v3--h5z_class3_t) below for details. (Existing v1/v2 plugins without a canonical name continue to work unchanged; this only becomes load-bearing when a plugin opts into v3.)
1415
* Links to any relevant documentation, including the licensing information.
1516

1617
Upon receiving a request with the above information, HDF Group will register the new plugin by assigning it a filter plugin _identifier_. The current policy for assigning an identifier is explained below:
@@ -24,48 +25,88 @@ Upon receiving a request with the above information, HDF Group will register the
2425

2526
## List of Filter Plugins Registered with The HDF Group
2627

27-
| Plugin Identifier | For Filter | Short Description|
28-
|--------|----------------|---------------------|
29-
|`257` |<a href="#hzip">hzip</a> |hzip compression used in Silo|
30-
|`258` |<a href="#fpzip">fpzip</a> |Duplicate of plugin `32014`|
31-
|`305` |<a href="#lzo">LZO</a> |LZO lossless compression used by PyTables|
32-
|`307` |<a href="#bzip2">BZIP2</a> |BZIP2 lossless compression used by PyTables|
33-
|`32000` |<a href="#lzf">LZF</a> |LZF lossless compression used by the h5py package|
34-
|`32001` |<a href="#blosc">BLOSC</a> |Blosc lossless compression used by PyTables|
35-
|`32002` |<a href="#mafisc">MAFISC</a> |Modified LZMA compression filter, MAFISC (Multidimensional Adaptive Filtering Improved Scientific data Compression)|
36-
|`32003` |<a href="#snappy">Snappy</a> |Snappy lossless compression|
37-
|`32004` |<a href="#lz4">LZ4</a> |LZ4 fast lossless compression algorithm|
38-
|`32005` |<a href="#apax">APAX</a> |Samplify’s APAX Numerical Encoding Technology|
39-
|`32006` |<a href="#cbf">CBF</a> |All imgCIF/CBF compressions and decompressions, including Canonical, Packed, Packed Version 2, Byte Offset and Nibble Offset|
40-
|`32007` |<a href="#jpeg-xr">JPEG-XR</a> |Enables images to be compressed/decompressed with JPEG-XR compression|
41-
|`32008` |<a href="#bitshuffle">bitshuffle</a> |Extreme version of shuffle filter that shuffles data at bit level instead of byte level|
42-
|`32009` |<a href="#spdp">SPDP</a> |SPDP fast lossless compression algorithm for single- and double-precision floating-point data|
43-
|`32010` |<a href="#lpc-rice">LPC-Rice</a> |LPC-Rice multi-threaded lossless compression|
44-
|`32011` |<a href="#ccsds-123">CCSDS-123</a> |ESA CCSDS-123 multi-threaded compression filter|
45-
|`32012` |<a href="#jpeg-ls">JPEG-LS</a> |CharLS JPEG-LS multi-threaded compression filter|
46-
|`32013` |<a href="#zfp">zfp</a> |Lossy & lossless compression of floating point and integer datasets to meet rate, accuracy, and/or precision targets.|
47-
|`32014` |<a href="#fpzip">fpzip</a> |Fast and Efficient Lossy or Lossless Compressor for Floating-Point Data|
48-
|`32015` |<a href="#zstandard">Zstandard</a> |Real-time compression algorithm with wide range of compression / speed trade-off and fast decoder|
49-
|`32016` |<a href="#b3d">B³D</a> |GPU based image compression method developed for light-microscopy applications|
50-
|`32017` |<a href="#sz">SZ</a> |An error-bounded lossy compressor for scientific floating-point data|
51-
|`32018` |<a href="#fcidecomp">FCIDECOMP</a> |EUMETSAT CharLS compression filter for use with netCDF|
52-
|`32019` |<a href="#jpeg">JPEG</a> |Jpeg compression filter|
53-
|`32020` |<a href="#vbz">VBZ</a> |Compression filter for raw dna signal data used by Oxford Nanopore|
54-
|`32021` |<a href="#fapec">FAPEC</a> | Versatile and efficient data compressor supporting many kinds of data and using an outlier-resilient entropy coder|
55-
|`32022` |<a href="#bitgroom">BitGroom</a> |The BitGroom quantization algorithm|
56-
|`32023` |<a href="#gbr">Granular BitRound (GBR)</a> |The GBR quantization algorithm is a significant improvement to the BitGroom filter|
57-
|`32024` |<a href="#sz3">SZ3</a> |A modular error-bounded lossy compression framework for scientific datasets|
58-
|`32025` |<a href="#delta-rice">Delta-Rice</a> |Lossless compression algorithm optimized for digitized analog signals based on delta encoding and rice coding|
59-
|`32026` |<a href="#blosc2">BLOSC2</a> |The recent new-generation version of the Blosc compression library|
60-
|`32027` |<a href="#flac">FLAC</a> |FLAC audio compression filter in HDF5|
61-
|`32028` |<a href="#sperr">SPERR</a> |SPERR is a lossy scientific (floating-point) data compressor that produces one of the best rate-distortion curves|
62-
|`32029` |<a href="#trpx">TERSE/PROLIX</a> |A lossless and fast compression of the diffraction data|
63-
|`32030` |<a href="#ffmpeg">FFMPEG</a> |A lossy compression filter based on ffmpeg video library|
64-
|`32031` |<a href="#jpeg2000">JPEG2000</a> | A compression filter for lossy and lossless coding|
28+
The **Canonical Name** column is the string identifier used by HDF5 v3 (`H5Z_class3_t`) plugins. Entries marked _(proposed)_ are pending confirmation by the plugin maintainer; see [Canonical Names](#canonical-names-hdf5-v3--h5z_class3_t) and the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255).
29+
30+
| Plugin Identifier | Canonical Name | For Filter | Short Description|
31+
|--------|--------|----------------|---------------------|
32+
|`257` |`hzip` _(proposed)_ |<a href="#hzip">hzip</a> |hzip compression used in Silo|
33+
|`258` |`fpzip-legacy` _(proposed)_ |<a href="#fpzip">fpzip</a> |Duplicate of plugin `32014`|
34+
|`305` |`lzo` _(proposed)_ |<a href="#lzo">LZO</a> |LZO lossless compression used by PyTables|
35+
|`307` |`bzip2` _(proposed)_ |<a href="#bzip2">BZIP2</a> |BZIP2 lossless compression used by PyTables|
36+
|`32000` |`lzf` _(proposed)_ |<a href="#lzf">LZF</a> |LZF lossless compression used by the h5py package|
37+
|`32001` |`blosc` _(proposed)_ |<a href="#blosc">BLOSC</a> |Blosc lossless compression used by PyTables|
38+
|`32002` |`mafisc` _(proposed)_ |<a href="#mafisc">MAFISC</a> |Modified LZMA compression filter, MAFISC (Multidimensional Adaptive Filtering Improved Scientific data Compression)|
39+
|`32003` |`snappy` _(proposed)_ |<a href="#snappy">Snappy</a> |Snappy lossless compression|
40+
|`32004` |`lz4` _(proposed)_ |<a href="#lz4">LZ4</a> |LZ4 fast lossless compression algorithm|
41+
|`32005` |`apax` _(proposed)_ |<a href="#apax">APAX</a> |Samplify’s APAX Numerical Encoding Technology|
42+
|`32006` |`cbf` _(proposed)_ |<a href="#cbf">CBF</a> |All imgCIF/CBF compressions and decompressions, including Canonical, Packed, Packed Version 2, Byte Offset and Nibble Offset|
43+
|`32007` |`jpeg-xr` _(proposed)_ |<a href="#jpeg-xr">JPEG-XR</a> |Enables images to be compressed/decompressed with JPEG-XR compression|
44+
|`32008` |`bitshuffle` _(proposed)_ |<a href="#bitshuffle">bitshuffle</a> |Extreme version of shuffle filter that shuffles data at bit level instead of byte level|
45+
|`32009` |`spdp` _(proposed)_ |<a href="#spdp">SPDP</a> |SPDP fast lossless compression algorithm for single- and double-precision floating-point data|
46+
|`32010` |`lpc-rice` _(proposed)_ |<a href="#lpc-rice">LPC-Rice</a> |LPC-Rice multi-threaded lossless compression|
47+
|`32011` |`ccsds-123` _(proposed)_ |<a href="#ccsds-123">CCSDS-123</a> |ESA CCSDS-123 multi-threaded compression filter|
48+
|`32012` |`jpeg-ls` _(proposed)_ |<a href="#jpeg-ls">JPEG-LS</a> |CharLS JPEG-LS multi-threaded compression filter|
49+
|`32013` |`zfp` _(proposed)_ |<a href="#zfp">zfp</a> |Lossy & lossless compression of floating point and integer datasets to meet rate, accuracy, and/or precision targets.|
50+
|`32014` |`fpzip` _(proposed)_ |<a href="#fpzip">fpzip</a> |Fast and Efficient Lossy or Lossless Compressor for Floating-Point Data|
51+
|`32015` |`zstd` _(proposed)_ |<a href="#zstandard">Zstandard</a> |Real-time compression algorithm with wide range of compression / speed trade-off and fast decoder|
52+
|`32016` |`b3d` _(proposed)_ |<a href="#b3d">B³D</a> |GPU based image compression method developed for light-microscopy applications|
53+
|`32017` |`sz` _(proposed)_ |<a href="#sz">SZ</a> |An error-bounded lossy compressor for scientific floating-point data|
54+
|`32018` |`fcidecomp` _(proposed)_ |<a href="#fcidecomp">FCIDECOMP</a> |EUMETSAT CharLS compression filter for use with netCDF|
55+
|`32019` |`jpeg` _(proposed)_ |<a href="#jpeg">JPEG</a> |Jpeg compression filter|
56+
|`32020` |`vbz` _(proposed)_ |<a href="#vbz">VBZ</a> |Compression filter for raw dna signal data used by Oxford Nanopore|
57+
|`32021` |`fapec` _(proposed)_ |<a href="#fapec">FAPEC</a> | Versatile and efficient data compressor supporting many kinds of data and using an outlier-resilient entropy coder|
58+
|`32022` |`bitgroom` _(proposed)_ |<a href="#bitgroom">BitGroom</a> |The BitGroom quantization algorithm|
59+
|`32023` |`granular-bitround` _(proposed)_ |<a href="#gbr">Granular BitRound (GBR)</a> |The GBR quantization algorithm is a significant improvement to the BitGroom filter|
60+
|`32024` |`sz3` _(proposed)_ |<a href="#sz3">SZ3</a> |A modular error-bounded lossy compression framework for scientific datasets|
61+
|`32025` |`delta-rice` _(proposed)_ |<a href="#delta-rice">Delta-Rice</a> |Lossless compression algorithm optimized for digitized analog signals based on delta encoding and rice coding|
62+
|`32026` |`blosc2` _(proposed)_ |<a href="#blosc2">BLOSC2</a> |The recent new-generation version of the Blosc compression library|
63+
|`32027` |`flac` _(proposed)_ |<a href="#flac">FLAC</a> |FLAC audio compression filter in HDF5|
64+
|`32028` |`sperr` _(proposed)_ |<a href="#sperr">SPERR</a> |SPERR is a lossy scientific (floating-point) data compressor that produces one of the best rate-distortion curves|
65+
|`32029` |`terse-prolix` _(proposed)_ |<a href="#trpx">TERSE/PROLIX</a> |A lossless and fast compression of the diffraction data|
66+
|`32030` |`ffmpeg` _(proposed)_ |<a href="#ffmpeg">FFMPEG</a> |A lossy compression filter based on ffmpeg video library|
67+
|`32031` |`jpeg2000` _(proposed)_ |<a href="#jpeg2000">JPEG2000</a> | A compression filter for lossy and lossless coding|
6568

6669
> [!NOTE]
6770
> Please contact the maintainer of a filter plugin for help with the plugin or its filter in the HDF5 library.
6871
72+
## Canonical Names (HDF5 v3 / `H5Z_class3_t`)
73+
74+
Starting with the v3 plugin class (`H5Z_class3_t`) introduced in HDF5 2.x, every registered filter has a **canonical name**: a short, stable string identifier used by callers, tools, and the on-disk format.
75+
76+
### Where the canonical name appears
77+
78+
* **In the plugin source.** A v3 plugin sets `H5Z_class3_t::name` to its canonical name.
79+
* **In the file.** When a v3-aware library adds a filter to a dataset's pipeline, it writes the canonical name into the filter-pipeline message (`H5O_PLINE`), so tools like `h5dump` can identify the filter even when the plugin shared library is not installed.
80+
* **In the API.** The forthcoming `H5Pappend_filter("canonical_name", "params")` interface accepts the canonical name to identify the filter, and accepts a TOML inline table parameter string for typed configuration. (Numeric filter IDs continue to work via the existing `H5Pset_filter` interface.)
81+
* **In CLI tools.** `h5repack`, `h5dump`, and other tools display the canonical name when printing filter information.
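
To make the "TOML inline table parameter string" mentioned above concrete, here is a minimal sketch in C that formats integer parameters as a TOML inline table such as `{ level = 5, shuffle = 1 }`. Everything in it is an assumption for illustration: the helper `format_inline_table`, the `int_param` struct, and the key names `level` and `shuffle` are hypothetical and do not come from the HDF5 API or the RFC. It does not call `H5Pappend_filter`, whose final signature is not documented here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical key/value pair; keys are assumed to be bare TOML keys. */
struct int_param {
    const char *key;
    long        value;
};

/* Writes a TOML inline table, e.g. "{ level = 5, shuffle = 1 }", into `out`.
 * Returns 0 on success, or -1 if a key is missing or `out` is too small. */
static int format_inline_table(const struct int_param *params, size_t n,
                               char *out, size_t out_size)
{
    size_t used;
    int    w = snprintf(out, out_size, "{");

    if (w < 0 || (size_t)w >= out_size)
        return -1;
    used = (size_t)w;

    for (size_t i = 0; i < n; i++) {
        if (params[i].key == NULL || params[i].key[0] == '\0')
            return -1;
        /* Separate entries with a comma; the first entry gets none. */
        w = snprintf(out + used, out_size - used, "%s %s = %ld",
                     i ? "," : "", params[i].key, params[i].value);
        if (w < 0 || (size_t)w >= out_size - used)
            return -1;
        used += (size_t)w;
    }

    w = snprintf(out + used, out_size - used, " }");
    if (w < 0 || (size_t)w >= out_size - used)
        return -1;
    return 0;
}
```

A caller would pass the resulting string as the second argument of the string-configured interface once it exists; which keys a given filter accepts is defined by that filter, not by this sketch.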
82+
83+
### Syntactic rules
84+
85+
* Non-NULL, non-empty.
86+
* At most 255 bytes.
87+
* Drawn from the character class `[A-Za-z0-9_.-]`.
88+
* Matched case-sensitively.
89+
90+
Registrations that violate these rules are rejected by `H5Zregister()` with `H5E_BADVALUE`.
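
The rules above can be checked by a plugin author before a registration request is submitted. The following is a minimal sketch in C; the function `canonical_name_is_valid` and the macro `CANONICAL_NAME_MAX` are hypothetical names chosen for this example and are not part of the HDF5 API. The check the library performs inside `H5Zregister()` may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant mirroring the "at most 255 bytes" rule. */
#define CANONICAL_NAME_MAX 255

/* Returns true if `name` is non-NULL, non-empty, at most 255 bytes long,
 * and made up only of bytes from [A-Za-z0-9_.-].
 * Illustrative helper, not an HDF5 function. */
static bool canonical_name_is_valid(const char *name)
{
    static const char allowed[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789_.-";
    size_t len;

    if (name == NULL || name[0] == '\0')
        return false;

    len = strlen(name);
    if (len > CANONICAL_NAME_MAX)
        return false;

    /* strspn() gives the length of the leading run of allowed bytes;
     * the name is valid only if that run covers the whole string. */
    return strspn(name, allowed) == len;
}
```

Under these rules `jpeg-xr` and `org.example.fastlz` pass, while a name containing `/` or a space does not.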
91+
92+
### Allocation
93+
94+
The HDF Group assigns a canonical name alongside the numeric `id` at filter-registration time. First-registered wins, exactly as for numeric IDs: if two plugins claim the same canonical name in one process, the second `H5Zregister()` call is rejected with `H5E_CANTREGISTER`. Coordination through this registry is the only guarantee that, for example, `"zfp"` means the same plugin on every system.
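
The first-registered-wins behavior can be pictured with a small in-process table. This is a sketch under stated assumptions, not the library's implementation: `register_name`, `lookup_id`, and the fixed-size table are hypothetical, and the sketch returns a plain `bool` where the text says the real call fails with `H5E_CANTREGISTER`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 64

/* Hypothetical table entry; the library keeps its own internal table. */
struct name_entry {
    int         id;
    const char *name; /* caller keeps the string alive */
};

static struct name_entry g_table[MAX_FILTERS];
static size_t            g_count = 0;

/* Returns true if the name was recorded, false if the name is already
 * taken (the first registration wins) or the table is full. */
static bool register_name(int id, const char *name)
{
    for (size_t i = 0; i < g_count; i++) {
        /* Byte-wise, case-sensitive comparison: "zfp" and "ZFP" differ. */
        if (strcmp(g_table[i].name, name) == 0)
            return false;
    }
    if (g_count == MAX_FILTERS)
        return false;

    g_table[g_count].id   = id;
    g_table[g_count].name = name;
    g_count++;
    return true;
}

/* Returns the id that owns `name`, or -1 if no entry matches. */
static int lookup_id(const char *name)
{
    for (size_t i = 0; i < g_count; i++) {
        if (strcmp(g_table[i].name, name) == 0)
            return g_table[i].id;
    }
    return -1;
}
```

Registering `zfp` for ID `32013` and then attempting to register `zfp` again under a different ID leaves the name bound to `32013`. A table like this only arbitrates within one process, which is why the shared registry is needed for names to mean the same thing across systems.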
95+
96+
### Self-namespacing (non-normative)
97+
98+
Third-party plugins that have not gone through HDF Group filter registration should self-namespace their canonical names (for example, `org.example.fastlz`) to make accidental collisions vanishingly unlikely. Coordinated registration through The HDF Group remains the recommended path for plugins that want a short, bare name.
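
As an illustration of the convention, a build script or plugin could assemble its name from an organization prefix and a short filter name. The helper below is a hypothetical sketch in C; `make_namespaced_name` is not an HDF5 function, and the prefix `org.example` is taken from the example above rather than from any required scheme.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Joins a prefix and a short name with '.', for example
 * ("org.example", "fastlz") becomes "org.example.fastlz".
 * Returns 0 on success, or -1 if the result would exceed the
 * 255-byte limit or would not fit in `out` with its terminator. */
static int make_namespaced_name(const char *prefix, const char *short_name,
                                char *out, size_t out_size)
{
    size_t need = strlen(prefix) + 1 + strlen(short_name);

    if (need > 255 || need + 1 > out_size)
        return -1;

    snprintf(out, out_size, "%s.%s", prefix, short_name);
    return 0;
}
```

Because `.` is in the permitted character class, a name built this way still satisfies the syntactic rules as long as both parts do.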
99+
100+
### Updating an existing registration for v3
101+
102+
Existing filter plugins continue to work unchanged with HDF5 v3-aware libraries; the canonical name only becomes load-bearing when a plugin opts into the new `H5Z_class3_t` class. Maintainers of currently registered plugins should:
103+
104+
1. Review the proposed canonical name in the table above for your filter ID.
105+
2. Confirm or propose a correction by replying on the [tracking issue](https://github.com/HDFGroup/hdf5_plugins/issues/255) or contacting the HDF Group [Helpdesk](mailto:help@hdfgroup.org). Confirmations replace _(proposed)_ in the table.
106+
3. When shipping a v3 build of the plugin, set `H5Z_class3_t::name` to the registered canonical name byte-identically.
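
Step 3 asks for a byte-identical match, which a plugin's test suite can assert directly. The sketch below is an assumption-laden illustration in C: `filter_class_sketch` is a stand-in that only mirrors the `id` and `name` fields and is NOT the real `H5Z_class3_t` layout, and `name_matches_registry` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the fields of interest; not the real H5Z_class3_t. */
struct filter_class_sketch {
    int         id;
    const char *name;
};

/* Returns true only if the plugin's name equals the registered canonical
 * name byte for byte. strcmp() is case-sensitive, so "Zstd" does not
 * match "zstd". */
static bool name_matches_registry(const struct filter_class_sketch *cls,
                                  const char *registered)
{
    return cls != NULL && cls->name != NULL && registered != NULL &&
           strcmp(cls->name, registered) == 0;
}
```

A case-insensitive comparison would be the wrong tool here, since canonical names are matched case-sensitively.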
107+
108+
See the HDF5 RFC on string-configured filters (RFC-HDFG-2026-001) for the full v3 design.
109+
69110
## Information about Registered Filter Plugins
70111

71112
### hzip <a name="hzip"></a>
