Commit ba3ff8b

(improvement) serializers: add Cython-optimized serialization for VectorType
Add cassandra/serializers.pyx and cassandra/serializers.pxd implementing Cython-optimized serialization that mirrors the deserializers.pyx architecture.

Implements type-specialized serializers for the three subtypes commonly used in vector columns:
- SerFloatType: 4-byte big-endian IEEE 754 float
- SerDoubleType: 8-byte big-endian double
- SerInt32Type: 4-byte big-endian signed int32

SerVectorType pre-allocates a contiguous buffer and uses C-level byte swapping for float/double/int32 vectors, with a generic fallback for other subtypes. GenericSerializer delegates to the Python-level cqltype.serialize() classmethod.

Factory functions find_serializer() and make_serializers() allow easy lookup and batch creation of serializers for column types.

Benchmarks show ~30x speedup over the current io.BytesIO baseline and ~3x speedup over Python struct.pack for Vector<float, 1536> serialization.

No setup.py changes needed - the existing cassandra/*.pyx glob already picks up new .pyx files.
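For reference, the wire layout described above can be sketched in pure Python with struct.pack, which is the slower baseline the benchmarks compare against. This is not the Cython code from this commit: the function names below are hypothetical, and the sketch assumes fixed-size elements are written back to back with no per-element length prefix.

```python
import struct


def serialize_float_vector(values):
    """Pack floats as contiguous 4-byte big-endian IEEE 754 values (hypothetical helper)."""
    return struct.pack(">%df" % len(values), *values)


def serialize_double_vector(values):
    """Pack floats as contiguous 8-byte big-endian doubles (hypothetical helper)."""
    return struct.pack(">%dd" % len(values), *values)


def serialize_int32_vector(values):
    """Pack ints as contiguous 4-byte big-endian signed int32 values (hypothetical helper)."""
    return struct.pack(">%di" % len(values), *values)


# A Vector<float, 1536> payload under this layout is 1536 * 4 bytes long.
payload = serialize_float_vector([0.0] * 1536)
assert len(payload) == 1536 * 4
```

The Cython path described in the commit message avoids the per-call format parsing and argument unpacking that struct.pack incurs, by writing byte-swapped values directly into a pre-allocated buffer.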
1 parent caa98b6 commit ba3ff8b

4 files changed

Lines changed: 1072 additions & 1 deletion


cassandra/deserializers.pyx

Lines changed: 9 additions & 1 deletion
@@ -481,7 +481,15 @@ cpdef Deserializer find_deserializer(cqltype):
 
 
 def obj_array(list objs):
-    """Create a (Cython) array of objects given a list of objects"""
+    """Create a (Cython) array of objects given a list of objects.
+
+    Returns the plain list for empty input since ``cython_array`` does
+    not support zero-length shapes. Callers that use
+    ``cdef Deserializer[::1]`` typed memoryviews must guard against
+    empty input before assignment.
+    """
+    if not objs:
+        return objs
     cdef object[:] arr
     cdef Py_ssize_t i
     arr = cython_array(shape=(len(objs),), itemsize=sizeof(void *), format="O")

cassandra/serializers.pxd

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
# Copyright ScyllaDB, Inc.
2+
#
3+
# Licensed under the Apache License, Version 2.0 (the "License");
4+
# you may not use this file except in compliance with the License.
5+
# You may obtain a copy of the License at
6+
#
7+
# http://www.apache.org/licenses/LICENSE-2.0
8+
#
9+
# Unless required by applicable law or agreed to in writing, software
10+
# distributed under the License is distributed on an "AS IS" BASIS,
11+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
# See the License for the specific language governing permissions and
13+
# limitations under the License.
14+
15+
16+
cdef class Serializer:
17+
# The cqltypes._CassandraType corresponding to this serializer
18+
cdef object cqltype
19+
20+
cpdef bytes serialize(self, object value, int protocol_version)
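
The declaration above only fixes the interface: a stored cqltype and a serialize(value, protocol_version) method returning bytes. A rough pure-Python analogue of that interface, together with the generic fallback the commit message describes, might look like the sketch below. The class bodies and FakeInt32Type are illustrative assumptions, not code from serializers.pyx, and the (value, protocol_version) call signature of cqltype.serialize() is assumed.

```python
import struct


class Serializer:
    """Pure-Python stand-in for the cdef class declared in serializers.pxd."""

    def __init__(self, cqltype):
        # The CQL type class this serializer handles
        self.cqltype = cqltype

    def serialize(self, value, protocol_version):
        raise NotImplementedError


class GenericSerializer(Serializer):
    """Fallback that delegates to the type's own Python-level serialize()."""

    def serialize(self, value, protocol_version):
        return self.cqltype.serialize(value, protocol_version)


class FakeInt32Type:
    """Hypothetical stand-in for a cqltypes class, for illustration only."""

    @classmethod
    def serialize(cls, value, protocol_version):
        # 4-byte big-endian signed int32
        return struct.pack(">i", value)


ser = GenericSerializer(FakeInt32Type)
encoded = ser.serialize(7, 4)
```

In the Cython version, declaring serialize as cpdef lets typed callers dispatch to it at C level while keeping it callable from Python.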
