
Commit 4514e3e

Fix 1849 (#1855)
* add new dev branch with audit file

* Switch WAMP ubjson serializer from py-ubjson to bjdata (#1849)

  Fixes #1849: py-ubjson is unmaintained and ships no wheels, so `pip install --only-binary :all:` for autobahn failed. The WAMP "ubjson" serializer is now backed by bjdata (Binary JData), a maintained, wheel-shipping successor.

  - serializer.py / message.py: import bjdata as the "ubjson" backend (serializer id unchanged for transport negotiation).
  - pyproject.toml: drop the unconditional py-ubjson dependency; bjdata is an OPTIONAL dependency in the `serialization` extra (it pulls in numpy, which we keep out of a minimal install). A minimal `pip install autobahn` now installs cleanly from wheels only.
  - PyPy: set PYBJDATA_NO_EXTENSION=1 in the test recipes and CI (serdes) to use bjdata's JIT-friendly pure-Python path (also avoids its numpy-ABI-fragile C extension).

  WIRE-LEVEL CHANGE: bjdata's octet encoding is NOT identical to the prior py-ubjson/UBJSON bytes (unsigned-int markers, little-endian). The wamp-proto UBJSON test vectors will be regenerated in a follow-up wamp-proto PR after the next autobahn-python release; until then the serdes byte-vector conformance suite excludes "ubjson" (round-trip + cross-serializer coverage is retained via test_wamp_serializer.py). See the changelog.

  Note: This work was completed with AI assistance (Claude Code).

* Docs: correct ubjson/bjdata packaging notes (sdist-only); drop py-ubjson refs (#1849)

  Follow-up wording/doc fixes for the py-ubjson -> bjdata switch:

  - README.md / docs/installation.rst: bjdata (like py-ubjson) is published sdist-only with no PyPI wheels and builds an optional C extension; document the optional `autobahn[serialization]` extra, PYBJDATA_NO_EXTENSION=1, and steer wheels-only/cross-arch users to cbor/msgpack. UBJSON is no longer described as "included by default".
  - docs/changelog.rst: reframe the #1849 fix as "the binary-JSON dependency is now optional, so a base autobahn install is wheel-clean" (bjdata itself ships no wheels).
  - serializer.py: drop the inaccurate "wheel-shipping" wording and state the sdist/C-extension reality.

  Note: This work was completed with AI assistance (Claude Code).

* Make bjdata (ubjson serializer) CPython-only; PyPy can't build it (#1849)

  bjdata's sdist build pulls numpy as an unconditional build dependency (`oldest-supported-numpy` in build-system.requires, plus a top-level `from numpy import get_include` in setup.py) even though it skips its C extension on PyPy. On PyPy that numpy pin (1.23.2) has no wheel and fails to compile, so `pip install autobahn[serialization]` cannot be installed on PyPy at all -- this broke the wstest CI jobs. Reported upstream as NeuroJSON/pybj#6.

  - pyproject.toml: restrict bjdata to CPython (`bjdata>=0.6.0; platform_python_implementation == 'CPython'`). On PyPy the UBJSON serializer is unavailable (graceful try/except ImportError); use cbor/msgpack instead.
  - Remove the now-moot PyPy PYBJDATA_NO_EXTENSION lines from justfile and CI: the flag can't help, because the numpy build dependency is installed under PEP 517 build isolation before bjdata's setup.py is ever run.
  - Update README, docs/installation.rst, docs/changelog.rst and the serializer.py comment to document bjdata as CPython-only, linking NeuroJSON/pybj#6.

  Note: This work was completed with AI assistance (Claude Code).

* test: bump .proto submodule to wamp-proto master (#557 merged)

  wamp-proto#557 (the canonical WAMP conformance test-vector suite) was squash-merged to wamp-proto master as d8389b8. Bump the .proto submodule to pick it up. This brings in the BJData (bjdata) UBJSON byte vectors, so the serdes conformance suite can be re-enabled for the "ubjson" serializer (follow-up commit). The serdes suite still passes (446) with "ubjson" currently excluded.

  Refs: #1849.

  Note: This work was completed with AI assistance (Claude Code).

* test: re-enable ubjson in serdes conformance suite against bjdata vectors

  With the wamp-proto vectors now carrying both the legacy py-ubjson and the new bjdata/BJData byte representations for "ubjson" (.proto bumped in the previous commit), re-enable the serializer in the byte-vector conformance suite:

  - conftest: _VECTOR_EXCLUDED_SERIALIZERS is now empty (no serializer excluded).
  - utils.require_decodable(): new helper implementing the "at least one must match" rule for the deserialize direction. A serializer's variant list may hold encodings produced by other implementations/versions that this backend cannot decode, or decodes to a valid-but-wrong message (py-ubjson bytes read as little-endian bjdata). It keeps variants whose decode re-serializes to a canonical variant (cosmetic variants + this backend's own encoding), falling back to decode-without-error for serializers whose re-serialization is not byte-canonical (flatbuffers). At least one variant must qualify.
  - The 25 per-message test files use require_decodable() at their decode sites (deserialize-from-bytes, roundtrip, cross-serializer source) instead of iterating/indexing the raw variant list.

  Serdes suite: 529 passed (up from 446 with ubjson excluded). On PyPy, bjdata is unavailable so the ubjson serializer is simply not registered and these tests do not run.

  Refs: #1849.

  Note: This work was completed with AI assistance (Claude Code).
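The selection rule described for `utils.require_decodable()` can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the helper from the commit: the real helper takes a serializer object and variant dicts, whereas the hypothetical `select_decodable` below takes plain `decode`/`encode` callables and raw byte strings, and the toy marker-plus-payload codec is invented purely for the example.

```python
def select_decodable(decode, encode, variants):
    """Return the byte variants this backend decodes meaningfully (hypothetical)."""
    canonical = set(variants)
    decodable = []    # decoded without raising
    reencodable = []  # decoded, and re-serialized to one of the listed variants
    for raw in variants:
        try:
            message = decode(raw)
        except Exception:
            # Encoding from another implementation that this backend rejects.
            continue
        decodable.append(raw)
        # A valid-but-wrong decode re-serializes to bytes outside the variant list.
        if encode(message) in canonical:
            reencodable.append(raw)
    # Fall back to "decodes without error" when re-serialization is not byte-canonical.
    kept = reencodable or decodable
    if not kept:
        raise AssertionError("no byte variant is decodable by this backend")
    return kept


# Toy little-endian codec standing in for the backend under test (assumption).
def toy_decode(raw: bytes) -> int:
    marker, payload = raw[:1], raw[1:]
    if marker == b"I" and len(payload) == 2:
        return int.from_bytes(payload, "little", signed=True)
    if marker == b"u" and len(payload) == 2:
        return int.from_bytes(payload, "little", signed=False)
    raise ValueError("unknown marker or payload length")


def toy_encode(value: int) -> bytes:
    if value >= 0:
        return b"u" + value.to_bytes(2, "little")
    return b"I" + value.to_bytes(2, "little", signed=True)


# The value 256 as a big-endian legacy variant and as the toy backend's own encoding.
variants = [b"I\x01\x00", b"u\x00\x01"]
kept = select_decodable(toy_decode, toy_encode, variants)
```

In this toy setup the legacy variant decodes without error but to the wrong value (1 instead of 256), so its re-serialization is not in the variant list and it is dropped; only the backend's own encoding survives.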
1 parent f1ac1c3 commit 4514e3e

36 files changed

Lines changed: 211 additions & 74 deletions

.audit/oberstet_fix_1849.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+- [ ] I did **not** use any AI-assistance tools to help create this pull request.
+- [x] I **did** use AI-assistance tools to *help* create this pull request.
+- [x] I have read, understood and followed the projects' [AI Policy](https://github.com/crossbario/autobahn-python/blob/main/AI_POLICY.md) when creating code, documentation etc. for this pull request.
+
+Submitted by: @oberstet
+Date: 2025-06-13
+Related issue(s): #1849
+Branch: oberstet:fix_1849

.proto

Submodule .proto updated 85 files

README.md

Lines changed: 7 additions & 7 deletions
@@ -343,14 +343,14 @@ masking) and UTF-8 validation.

 ### WAMP Serializers

-**As of v25.11.1, all WAMP serializers are included by default** - batteries included!
+**The JSON, MessagePack, CBOR and FlatBuffers serializers are included by default** - batteries included! UBJSON is available via the optional `autobahn[serialization]` extra.

-Autobahn|Python now ships with full support for all WAMP serializers out-of-the-box:
+Autobahn|Python ships with the following WAMP serializers:

 - **JSON** (standard library) - always available
 - **MessagePack** - high-performance binary serialization
 - **CBOR** - IETF standard binary serialization (RFC 8949)
-- **UBJSON** - Universal Binary JSON
+- **UBJSON** - Universal Binary JSON *(optional: `pip install autobahn[serialization]`)*
 - **Flatbuffers** - Google's zero-copy serialization (vendored)

 #### Architecture & Performance
@@ -363,12 +363,12 @@ The serializer dependencies are optimized for both **CPython** and **PyPy**:
 | **msgpack** | Binary wheel (C extension) | u-msgpack-python (pure Python) | Native + Universal | PyPy JIT makes pure Python faster than C |
 | **ujson** | Binary wheel | Binary wheel | Native | Available for both implementations |
 | **cbor2** | Binary wheel | Pure Python fallback | Native + Universal | Binary wheels + py3-none-any |
-| **ubjson** | Pure Python | Pure Python | Source | Set `PYUBJSON_NO_EXTENSION=1` to skip C build |
+| **ubjson** *(optional)* | C ext (from sdist) | ❌ n/a (CPython only) | Source only — no wheels | Optional `autobahn[serialization]` extra (bjdata, pulls numpy). CPython-only: bjdata can't install on PyPy ([NeuroJSON/pybj#6](https://github.com/NeuroJSON/pybj/issues/6)); on PyPy use cbor/msgpack. On CPython without a compiler set `PYBJDATA_NO_EXTENSION=1` |
 | **flatbuffers** | Vendored | Vendored | Included | Always available, no external dependency |

 **Key Design Principles:**

-1. **Batteries Included**: All serializers available without extra install steps
+1. **Batteries Included**: Core serializers (JSON, MessagePack, CBOR, FlatBuffers) available without extra install steps; UBJSON via the optional `autobahn[serialization]` extra
 2. **PyPy Optimization**: Pure Python implementations leverage PyPy's JIT for superior performance
 3. **Binary Wheels**: Native wheels for all major platforms (Linux x86_64/ARM64, macOS x86_64/ARM64, Windows x86_64)
 4. **Zero System Pollution**: All dependencies install cleanly via wheels or pure Python
@@ -429,15 +429,15 @@ All dependencies follow these design principles:

 ### WAMP Serializers (Batteries Included)

-All serializers are now **included by default** in the base installation:
+All serializers **except UBJSON** are included by default in the base installation; UBJSON is an optional extra (`pip install autobahn[serialization]`):

 | Serializer | Purpose | CPython | PyPy | Wheel Coverage | Notes |
 |------------|---------|---------|------|----------------|-------|
 | **json** | JSON serialization | stdlib | stdlib | ✅ Always available | Python standard library |
 | **msgpack** | MessagePack serialization | msgpack (binary wheel) | u-msgpack-python (pure Python) | ✅ Excellent | 50+ wheels for CPython; PyPy JIT optimized |
 | **ujson** | Fast JSON (optional) | Binary wheel | Binary wheel | ✅ Excellent | 30+ wheels; both implementations |
 | **cbor2** | CBOR serialization (RFC 8949) | Binary wheel | Pure Python fallback | ✅ Excellent | 30+ binary wheels + universal fallback |
-| **py-ubjson** | UBJSON serialization | Pure Python | Pure Python | ✅ Good | Optional C extension (can skip with `PYUBJSON_NO_EXTENSION=1`) |
+| **bjdata** | UBJSON serialization *(optional)* | C ext (from sdist) | ❌ n/a (CPython only) | ⚠️ sdist only — no wheels | `autobahn[serialization]` extra; pulls numpy. CPython-only: can't install on PyPy ([NeuroJSON/pybj#6](https://github.com/NeuroJSON/pybj/issues/6)). On CPython without a compiler set `PYBJDATA_NO_EXTENSION=1`. For wheels-only/cross-arch/PyPy installs prefer cbor/msgpack |
 | **flatbuffers** | Google Flatbuffers | **Vendored** | **Vendored** | ✅ Perfect | Included in our wheel, zero external dependency |

 ### Optional: Twisted Framework
### Optional: Twisted Framework

docs/changelog.rst

Lines changed: 7 additions & 0 deletions
@@ -8,6 +8,13 @@ Changelog
 26.6.1
 ------

+**WAMP Serialization**
+
+* ``py-ubjson`` (unmaintained, sdist-only) is no longer an unconditional dependency. A base ``pip install autobahn`` — and the wheels-only / cross-arch case from #1849 (``pip download --only-binary :all: --platform ...``) — now resolves entirely from binary wheels (#1849)
+* The WAMP ``ubjson`` serializer is now backed by the maintained ``bjdata`` (Binary JData) package, provided as the OPTIONAL ``autobahn[serialization]`` extra (it also pulls in numpy), keeping both out of a minimal install (#1849)
+* ``bjdata`` is published sdist-only (no PyPI wheels) and is currently **CPython-only**: on PyPy its sdist build pulls an unbuildable numpy (upstream ``NeuroJSON/pybj#6``), so the ``ubjson`` serializer is unavailable on PyPy - use ``cbor``/``msgpack`` there. On CPython without a compiler, set ``PYBJDATA_NO_EXTENSION=1`` for a pure-Python build. For wheels-only or cross-arch deployments, also prefer ``cbor``/``msgpack`` (#1849)
+* ⚠️ **Wire-level change to watch out for:** bjdata's octet-level encoding is NOT identical to the previous py-ubjson/UBJSON bytes (different integer markers, little-endian). The WAMP serializer id remains ``ubjson`` for transport negotiation. The ``wamp-proto`` UBJSON test vectors will be regenerated in a follow-up PR after this release; until then the ``ubjson`` serializer is excluded from the byte-vector conformance suite (round-trip and cross-serializer coverage retained) (#1849)
+
 **FlatBuffers**

 * Bump vendored FlatBuffers from v25.9.23 to v25.12.19, restoring the version-sync with zlmdb 26.6.1 (#1853)
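The byte-order half of the wire-level change noted in the changelog entry can be illustrated with the standard library alone. This is only a sketch of why the two byte streams are not mutually decodable; it does not call `bjdata` or `py-ubjson`, and it leaves out the type-marker differences that the changelog also mentions.

```python
import struct

value = 1000  # 0x03E8

# The same 16-bit unsigned integer payload in the two byte orders.
big_endian = struct.pack(">H", value)     # b"\x03\xe8"
little_endian = struct.pack("<H", value)  # b"\xe8\x03"

# Reading a big-endian payload with a little-endian decoder does not fail;
# it silently yields a different number (0xE803).
misread = struct.unpack("<H", big_endian)[0]
```

This "decodes fine, but to the wrong value" behaviour is why a successful decode alone is not enough to tell the two encodings apart.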

docs/installation.rst

Lines changed: 1 addition & 1 deletion
@@ -152,7 +152,7 @@ Install Variants
    * - ``scram``
      - Install WAMP-SCRAM authentication packages.
    * - ``serialization``
-     - Backwards-compatible no-op; WAMP serializers are included by default.
+     - Install ``bjdata`` to enable the optional UBJSON WAMP serializer. CPython-only - it cannot install on PyPy (upstream ``NeuroJSON/pybj#6``); on CPython without a compiler set ``PYBJDATA_NO_EXTENSION=1`` for a pure-Python build. JSON, MessagePack, CBOR and FlatBuffers are always available without this extra (use them on PyPy).
    * - ``nvx``
      - Backwards-compatible no-op; NVX acceleration is included in binary wheels where supported.
    * - ``all``
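The optional-dependency behaviour described for the ``serialization`` extra (a serializer that is simply absent when ``bjdata`` is not installed) follows the usual guarded-import pattern. The sketch below is a minimal illustration of that pattern, not autobahn's actual registration code; `load_ubjson_backend` and `serializer_ids` are hypothetical names.

```python
def load_ubjson_backend():
    """Return the bjdata module if it is importable, else None (hypothetical helper)."""
    try:
        import bjdata  # only present when the optional extra was installed
    except ImportError:
        return None
    return bjdata


backend = load_ubjson_backend()

# JSON comes from the standard library and is always available;
# "ubjson" is only offered when its backend could be imported.
serializer_ids = ["json"]
if backend is not None:
    serializer_ids.append("ubjson")
```

On PyPy, or on any install without the extra, the import fails and `serializer_ids` stays at the always-available set, which matches the advice to use CBOR or MessagePack there.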

examples/serdes/tests/conftest.py

Lines changed: 16 additions & 2 deletions
@@ -12,6 +12,20 @@
 from .utils import load_test_vector, get_serializer_ids


+# The WAMP "ubjson" serializer is backed by bjdata (autobahn #1849). The
+# wamp-proto canonical vectors (via the .proto submodule) now carry BOTH the
+# legacy py-ubjson bytes and the new bjdata/BJData bytes for "ubjson" (each
+# tagged with a "note"), matched with "at least one must match" semantics. The
+# deserialize-direction tests use utils.require_decodable() to skip byte
+# variants this backend cannot decode (the two encodings are not mutually
+# decodable), so no serializer needs to be excluded from the byte-vector suite.
+_VECTOR_EXCLUDED_SERIALIZERS = ()
+
+
+def _conformance_serializer_ids():
+    return [s for s in get_serializer_ids() if s not in _VECTOR_EXCLUDED_SERIALIZERS]
+
+
 @pytest.fixture(scope="session")
 def wamp_test_vector_publish():
     """Load PUBLISH test vector"""
@@ -48,12 +62,12 @@ def pytest_generate_tests(metafunc):
     This generates test parameters for serializer_id based on available serializers.
     """
     if "serializer_id" in metafunc.fixturenames:
-        serializer_ids = get_serializer_ids()
+        serializer_ids = _conformance_serializer_ids()
         metafunc.parametrize("serializer_id", serializer_ids)

     if "serializer_pair" in metafunc.fixturenames:
         # Generate all unique pairs of serializers for cross-serializer tests
-        serializer_ids = get_serializer_ids()
+        serializer_ids = _conformance_serializer_ids()
         pairs = []
         for i, ser1 in enumerate(serializer_ids):
             for ser2 in serializer_ids[i + 1 :]:

examples/serdes/tests/test_abort.py

Lines changed: 3 additions & 2 deletions
@@ -14,6 +14,7 @@
 from autobahn.wamp.serializer import create_transport_serializer

 from .utils import (
+    require_decodable,
     load_test_vector,
     bytes_from_hex,
     matches_any_byte_representation,
@@ -62,7 +63,7 @@ def test_abort_deserialize_from_bytes(serializer_id, abort_samples, create_seria
         byte_variants = sample["serializers"][serializer_id]

         # Try deserializing each byte variant
-        for variant in byte_variants:
+        for variant in require_decodable(serializer, byte_variants):
             # Get bytes
             if "bytes_hex" in variant:
                 test_bytes = bytes_from_hex(variant["bytes_hex"])
@@ -129,7 +130,7 @@ def test_abort_roundtrip(serializer_id, abort_samples, create_serializer):
         if not byte_variants:
             continue

-        variant = byte_variants[0]
+        variant = require_decodable(serializer, byte_variants)[0]
         if "bytes_hex" in variant:
             original_bytes = bytes_from_hex(variant["bytes_hex"])
         elif "bytes" in variant:

examples/serdes/tests/test_authenticate.py

Lines changed: 3 additions & 2 deletions
@@ -14,6 +14,7 @@
 from autobahn.wamp.serializer import create_transport_serializer

 from .utils import (
+    require_decodable,
     load_test_vector,
     bytes_from_hex,
     matches_any_byte_representation,
@@ -64,7 +65,7 @@ def test_authenticate_deserialize_from_bytes(
         byte_variants = sample["serializers"][serializer_id]

         # Try deserializing each byte variant
-        for variant in byte_variants:
+        for variant in require_decodable(serializer, byte_variants):
             # Get bytes
             if "bytes_hex" in variant:
                 test_bytes = bytes_from_hex(variant["bytes_hex"])
@@ -133,7 +134,7 @@ def test_authenticate_roundtrip(serializer_id, authenticate_samples, create_seri
         if not byte_variants:
             continue

-        variant = byte_variants[0]
+        variant = require_decodable(serializer, byte_variants)[0]
         if "bytes_hex" in variant:
             original_bytes = bytes_from_hex(variant["bytes_hex"])
         elif "bytes" in variant:

examples/serdes/tests/test_call.py

Lines changed: 3 additions & 2 deletions
@@ -14,6 +14,7 @@
 from autobahn.wamp.serializer import create_transport_serializer

 from .utils import (
+    require_decodable,
     load_test_vector,
     bytes_from_hex,
     matches_any_byte_representation,
@@ -62,7 +63,7 @@ def test_call_deserialize_from_bytes(serializer_id, call_samples, create_seriali
         byte_variants = sample["serializers"][serializer_id]

         # Try deserializing each byte variant
-        for variant in byte_variants:
+        for variant in require_decodable(serializer, byte_variants):
             # Get bytes
             if "bytes_hex" in variant:
                 test_bytes = bytes_from_hex(variant["bytes_hex"])
@@ -149,7 +150,7 @@ def test_call_roundtrip(serializer_id, call_samples, create_serializer):
         if not byte_variants:
             continue

-        variant = byte_variants[0]
+        variant = require_decodable(serializer, byte_variants)[0]
         if "bytes_hex" in variant:
             original_bytes = bytes_from_hex(variant["bytes_hex"])
         elif "bytes" in variant:

examples/serdes/tests/test_cancel.py

Lines changed: 3 additions & 2 deletions
@@ -14,6 +14,7 @@
 from autobahn.wamp.serializer import create_transport_serializer

 from .utils import (
+    require_decodable,
     load_test_vector,
     bytes_from_hex,
     matches_any_byte_representation,
@@ -64,7 +65,7 @@ def test_cancel_deserialize_from_bytes(
         byte_variants = sample["serializers"][serializer_id]

         # Try deserializing each byte variant
-        for variant in byte_variants:
+        for variant in require_decodable(serializer, byte_variants):
             # Get bytes
             if "bytes_hex" in variant:
                 test_bytes = bytes_from_hex(variant["bytes_hex"])
@@ -131,7 +132,7 @@ def test_cancel_roundtrip(serializer_id, cancel_samples, create_serializer):
         if not byte_variants:
             continue

-        variant = byte_variants[0]
+        variant = require_decodable(serializer, byte_variants)[0]
         if "bytes_hex" in variant:
             original_bytes = bytes_from_hex(variant["bytes_hex"])
         elif "bytes" in variant:
