perf: use pre-allocated constant for null sentinel in collection serialization (10's of ns if null elements are used) #763

Draft
mykaul wants to merge 1 commit into scylladb:master from mykaul:perf/collection-null-sentinel

mykaul commented Mar 25, 2026

Summary

  • Replace per-call int32_pack(-1) with a module-level _INT32_NULL constant at 5 call sites across 4 collection serialize_safe methods
  • Avoids a struct.pack() call on every null element during collection serialization

Details

The CQL protocol represents null collection elements as a 4-byte int32 with value -1. Previously, every null element triggered a fresh struct.pack('>i', -1) call. This PR pre-computes the result once at module load time and reuses the bytes object.

Affected sites

| Type | Method | Null path |
|---|---|---|
| ListType / SetType | _SimpleParameterizedType.serialize_safe | null elements |
| MapType | MapType.serialize_safe | null keys, null values (2 sites) |
| TupleType | TupleType.serialize_safe | null fields |
| UserType | UserType.serialize_safe | null fields |
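To illustrate what a null path looks like at one of these sites, here is a simplified, hypothetical list serializer. The function name serialize_list, the to_binary parameter, and the exact layout (int32 element count followed by int32 length-prefixed elements) are assumptions made for the sketch; the driver's real serialize_safe methods carry more logic.

```python
import io
import struct

# Assumption: stand-in for the driver's int32_pack helper.
int32_pack = struct.Struct('>i').pack
_INT32_NULL = int32_pack(-1)


def serialize_list(items, to_binary):
    """Hypothetical sketch: element count, then length-prefixed elements."""
    buf = io.BytesIO()
    buf.write(int32_pack(len(items)))
    for item in items:
        if item is None:
            # Null path: write the pre-computed sentinel instead of
            # calling int32_pack(-1) again.
            buf.write(_INT32_NULL)
        else:
            data = to_binary(item)
            buf.write(int32_pack(len(data)))
            buf.write(data)
    return buf.getvalue()


# Two elements: the int 1 (encoded as 4 bytes) and a null.
encoded = serialize_list([1, None], int32_pack)
assert encoded == (
    b'\x00\x00\x00\x02'   # element count
    b'\x00\x00\x00\x04'   # length of first element
    b'\x00\x00\x00\x01'   # first element
    b'\xff\xff\xff\xff'   # null sentinel
)
```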

Benchmark results

All benchmarks: CPython 3.14, median of 3-5 runs, results in nanoseconds.

Isolated operation (int32_pack(-1) call vs _INT32_NULL constant lookup, 10M iterations, median of 5 runs):

| Operation | Time (ns) | Speedup |
|---|---|---|
| int32_pack(-1) (before) | 35.9 | |
| _INT32_NULL lookup (after) | 4.8 | 7.4x |
| Savings per null element | 31.1 | |
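A comparison of this kind could be reproduced roughly as follows. This is a sketch, not the script used for the numbers above: the helper name median_ns_per_call is hypothetical, the iteration count is lower than the 10M used in the PR to keep it quick, and absolute timings will vary by machine and interpreter.

```python
import statistics
import struct
import timeit

# Assumption: stand-in for the driver's int32_pack helper.
int32_pack = struct.Struct('>i').pack
_INT32_NULL = int32_pack(-1)

# Names made visible to the timed statements.
_NAMESPACE = {'int32_pack': int32_pack, '_INT32_NULL': _INT32_NULL}


def median_ns_per_call(stmt, number=200_000, repeat=5):
    """Median over `repeat` runs of the per-iteration time, in nanoseconds."""
    runs = timeit.repeat(stmt, globals=_NAMESPACE, number=number, repeat=repeat)
    return statistics.median(runs) / number * 1e9


before = median_ns_per_call('int32_pack(-1)')   # pack on every call
after = median_ns_per_call('_INT32_NULL')       # global name lookup only
print(f'before: {before:.1f} ns, after: {after:.1f} ns')
```

Both figures include timeit's own loop overhead, so the difference between them is more meaningful than either value alone.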

End-to-end serialize() — before vs after (200K iterations, median of 3 runs):

Each collection contains ~50% None elements to exercise the null path.

| Type | Before (ns) | After (ns) | Change |
|---|---|---|---|
| ListType (10 elements, 5 nulls) | 1,560 | 1,523 | -2.4% |
| SetType (10 elements, 5 nulls) | 1,575 | 1,476 | -6.3% |
| MapType (5 entries, 3 null values) | 1,882 | 1,933 | +2.7% |
| TupleType (5 fields, 2 nulls) | 1,136 | 1,123 | -1.1% |
| UserType (5 fields, 2 nulls) | 1,220 | 1,223 | +0.3% |

End-to-end changes range from -6.3% to +2.7%, which is within measurement noise, because the null sentinel write is a small fraction of total serialization cost (which includes to_binary() calls, int32_pack(len(...)) for non-null elements, BytesIO writes, etc.). The benefit should scale with null density: workloads with many sparse/null columns would be expected to see a larger improvement.

Analysis

The isolated benchmark shows a clear 7.4x speedup for the null-write operation itself (31 ns saved per null element). In end-to-end serialization, the improvement is diluted by the much larger cost of serializing non-null elements. The optimization is:

  • Zero-risk: pure constant substitution, identical bytes produced
  • Free at runtime: the constant is allocated once at module load
  • Scales with null density: more null elements = more savings
  • Eliminates unnecessary function call overhead on every null element

Testing

  • Added CollectionNullSentinelTests with 5 round-trip tests covering List, Set, Map, Tuple, and UserType with None elements
  • All 668 unit tests pass (38 skipped, 2 pre-existing failures on master unrelated to this PR)
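The shape of such a round-trip check can be sketched without the driver. The example below uses a hypothetical tuple-style encoding (each field length-prefixed, None encoded as length -1) with made-up helper names serialize_fields and deserialize_fields; it is not the CollectionNullSentinelTests code itself.

```python
import struct

_int32 = struct.Struct('>i')
# Assumption: stand-in for the driver's int32_pack helper.
int32_pack = _int32.pack
_INT32_NULL = int32_pack(-1)


def serialize_fields(values):
    """Hypothetical encoder: length-prefixed UTF-8 fields, None as length -1."""
    parts = []
    for value in values:
        if value is None:
            parts.append(_INT32_NULL)
        else:
            data = value.encode('utf-8')
            parts.append(int32_pack(len(data)) + data)
    return b''.join(parts)


def deserialize_fields(buf):
    """Inverse of serialize_fields: a negative length decodes to None."""
    values = []
    offset = 0
    while offset < len(buf):
        (length,) = _int32.unpack_from(buf, offset)
        offset += 4
        if length < 0:
            values.append(None)
        else:
            values.append(buf[offset:offset + length].decode('utf-8'))
            offset += length
    return values


# None must survive the round trip and stay distinct from an empty string.
original = ['a', None, 'bc', None, '']
round_tripped = deserialize_fields(serialize_fields(original))
assert round_tripped == original
```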

@mykaul mykaul marked this pull request as draft March 25, 2026 20:31
mykaul added a commit to mykaul/python-driver that referenced this pull request Apr 3, 2026
Replace io.BytesIO() buffer pattern with list accumulation + b''.join()
in serialize_safe methods for ListType, SetType, MapType, TupleType,
and UserType. Also pre-compute _INT32_NULL = int32_pack(-1) as a
module-level constant to avoid repeated packing of the null sentinel.

Buffer assembly micro-benchmarks (isolating the BytesIO overhead from
per-element to_binary() cost):

  Scenario                  Before (us)   After (us)  Speedup
  List 100 elements              9.0          8.6      1.05x
  List 10 elements               1.1          0.9      1.18x
  List 10 all-null               0.8          0.4      2.23x
  Map 10 entries                 2.0          1.6      1.24x

The all-null case benefits most from the pre-computed _INT32_NULL
constant, which eliminates repeated int32_pack(-1) calls.

Note: PR scylladb#763 on this repo adds only the _INT32_NULL constant;
this commit is a superset that also replaces BytesIO with b''.join()
across all four collection/composite type serializers.
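The two buffer-assembly patterns mentioned in this commit message can be compared side by side. The sketch below is an illustration under assumptions: both function names are hypothetical, elements are taken as already-encoded bytes, and the layout (int32 count, then int32 length-prefixed elements) is simplified. It only checks that the two approaches produce identical output, not their relative speed.

```python
import io
import struct

# Assumption: stand-in for the driver's int32_pack helper.
int32_pack = struct.Struct('>i').pack
_INT32_NULL = int32_pack(-1)


def serialize_with_bytesio(chunks):
    """Buffer pattern: write each piece into an io.BytesIO."""
    buf = io.BytesIO()
    buf.write(int32_pack(len(chunks)))
    for chunk in chunks:
        if chunk is None:
            buf.write(_INT32_NULL)
        else:
            buf.write(int32_pack(len(chunk)))
            buf.write(chunk)
    return buf.getvalue()


def serialize_with_join(chunks):
    """List pattern: collect pieces, then concatenate once with b''.join()."""
    parts = [int32_pack(len(chunks))]
    for chunk in chunks:
        if chunk is None:
            parts.append(_INT32_NULL)
        else:
            parts.append(int32_pack(len(chunk)))
            parts.append(chunk)
    return b''.join(parts)


sample = [b'ab', None, b'']
assert serialize_with_bytesio(sample) == serialize_with_join(sample)
```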
@mykaul mykaul force-pushed the perf/collection-null-sentinel branch from a4ece54 to 144b6fc Compare April 7, 2026 11:43
…alization

Replace per-call int32_pack(-1) with a module-level _INT32_NULL constant
in serialize methods of ListType, SetType, MapType, TupleType, and
UserType. This avoids a struct.pack call on every null element during
collection serialization.

Affected sites:
  - _SimpleParameterizedType.serialize_safe (ListType/SetType)
  - MapType.serialize_safe (null key + null value)
  - TupleType.serialize_safe
  - UserType.serialize_safe
@mykaul mykaul force-pushed the perf/collection-null-sentinel branch from 144b6fc to 25a6198 Compare April 7, 2026 14:47
@mykaul mykaul changed the title perf: use pre-allocated constant for null sentinel in collection serialization perf: use pre-allocated constant for null sentinel in collection serialization (10's of ns if null elements are used) Apr 7, 2026