Fastcall by ifduyue · Pull Request #146 · ifduyue/python-xxhash

ifduyue · 2026-04-27T03:42:32Z

No description provided.

Switch 12 module-level functions from METH_VARARGS|METH_KEYWORDS to METH_FASTCALL|METH_KEYWORDS, avoiding tuple/dict allocation per call. Adds _get_buffer_or_str() helper to support str input (UTF-8 encoded) in addition to bytes-like objects, matching PyArg_ParseTupleAndKeywords "s*" behavior. Uses PyLong_AsUnsignedLongLongMask/PyLong_AsUnsignedLong for seed overflow handling, matching PyArg_ParseTupleAndKeywords "K"/"I" format silent-wrapping semantics.

- Accept 'input' as keyword argument when no positional args - Use PyLong_AsUnsignedLongMask for XXH32 seed (matching 'I' format) - Use PyLong_AsUnsignedLongLongMask for XXH64/XXH3 seeds (matching 'K') - Add comprehensive vectorcall argument passing tests

gemini-code-assist

Code Review

This pull request refactors the module-level hashing functions to use the METH_FASTCALL calling convention for improved performance. It introduces a helper function, _get_buffer_or_str, to handle input processing and adds a comprehensive test suite for the new argument parsing logic. Feedback focuses on several critical issues: a bug in exception handling within the buffer helper, missing error checks for integer conversions of the seed argument, and potential null pointer dereferences in the 128-bit integer digest calculation. Additionally, the manual argument parsing logic lacks standard Python validation for duplicate or unknown keyword arguments.

gemini-code-assist · 2026-04-27T03:44:22Z

+    if (PyObject_GetBuffer(obj, buf, PyBUF_SIMPLE) == 0) {
+        *owner = NULL;
+        return 0;
+    }
+    PyErr_Clear();
+    if (!PyUnicode_Check(obj))
+        return -1;
+    *owner = PyUnicode_AsUTF8String(obj);
+    if (*owner == NULL)
+        return -1;
+    if (PyObject_GetBuffer(*owner, buf, PyBUF_SIMPLE) < 0) {
+        Py_DECREF(*owner);
+        return -1;
+    }
+    return 0;


The current implementation of _get_buffer_or_str has a bug where it clears the Python exception too early. If PyObject_GetBuffer fails and the object is not a string, the function returns -1 with no exception set (because it was cleared at line 59). This will cause the caller to return NULL to the interpreter without an exception, triggering a SystemError. The exception should only be cleared if we are certain we are dealing with a string that needs encoding.

if (PyObject_GetBuffer(obj, buf, PyBUF_SIMPLE) == 0) { *owner = NULL; return 0; } if (PyUnicode_Check(obj)) { PyErr_Clear(); *owner = PyUnicode_AsUTF8String(obj); if (*owner == NULL) return -1; if (PyObject_GetBuffer(*owner, buf, PyBUF_SIMPLE) < 0) { Py_DECREF(*owner); return -1; } return 0; } return -1;

gemini-code-assist · 2026-04-27T03:44:22Z

+    if (nargs > 1) {
+        seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[1]);
+    }
+    if (kwnames) {
+        for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(kwnames); i++) {
+            PyObject *key = PyTuple_GET_ITEM(kwnames, i);
+            if (PyUnicode_CompareWithASCIIString(key, "seed") == 0) {
+                seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[nargs + i]);
+            }
+        }
    }


PyLong_AsUnsignedLongMask (and PyLong_AsUnsignedLongLongMask in other functions) can fail and set a Python exception if the provided argument cannot be converted to an integer. Since -1 is a valid return value for these functions, you must check PyErr_Occurred() after each call and return NULL if an error is detected. Otherwise, the function will proceed with an invalid seed and eventually return a result while an exception is still set, which is incorrect.

if (nargs > 1) { seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[1]); if (PyErr_Occurred()) return NULL; } if (kwnames) { for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(kwnames); i++) { PyObject *key = PyTuple_GET_ITEM(kwnames, i); if (PyUnicode_CompareWithASCIIString(key, "seed") == 0) { seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[nargs + i]); if (PyErr_Occurred()) return NULL; } } }

gemini-code-assist · 2026-04-27T03:44:22Z

+    PyObject *sixtyfour = PyLong_FromLong(64);
+    PyObject *low = PyLong_FromUnsignedLongLong(intdigest.low64);
+    PyObject *high = PyLong_FromUnsignedLongLong(intdigest.high64);
+    PyObject *result = PyNumber_Lshift(high, sixtyfour);
    Py_DECREF(high);
    high = result;
    result = PyNumber_Add(high, low);
    Py_DECREF(high);
    Py_DECREF(low);
    Py_DECREF(sixtyfour);


This block performs multiple Python object allocations and operations without checking for NULL returns. If any of these fail (e.g., due to memory exhaustion), subsequent calls like PyNumber_Lshift or Py_DECREF will operate on NULL pointers, leading to a crash. Use a more robust error handling pattern with checks and proper cleanup.

PyObject *sixtyfour = PyLong_FromLong(64); PyObject *low = PyLong_FromUnsignedLongLong(intdigest.low64); PyObject *high = PyLong_FromUnsignedLongLong(intdigest.high64); PyObject *result = NULL; if (sixtyfour && low && high) { PyObject *shifted = PyNumber_Lshift(high, sixtyfour); if (shifted) { result = PyNumber_Add(shifted, low); Py_DECREF(shifted); } } Py_XDECREF(high); Py_XDECREF(low); Py_XDECREF(sixtyfour);

gemini-code-assist · 2026-04-27T03:44:22Z

+    if (nargs < 1) {
+        if (!kwnames) {
+            PyErr_SetString(PyExc_TypeError, "xxh32_digest() missing required argument 'input'");
+            return NULL;
+        }
+        int found = 0;
+        for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(kwnames); i++) {
+            if (PyUnicode_CompareWithASCIIString(PyTuple_GET_ITEM(kwnames, i), "input") == 0) {
+                if (nargs + i >= 0 && _get_buffer_or_str(args[nargs + i], &buf, &buf_owner) < 0)
+                    return NULL;
+                found = 1;
+                break;
+            }
+        }
+        if (!found) {
+            PyErr_SetString(PyExc_TypeError, "xxh32_digest() missing required argument 'input'");
+            return NULL;
+        }
+    } else {
+        if (_get_buffer_or_str(args[0], &buf, &buf_owner) < 0)
+            return NULL;
+    }

-    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s*|I:xxh32_digest", keywords, &buf, &seed)) {
-        return NULL;
+    if (nargs > 1) {
+        seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[1]);
+    }
+    if (kwnames) {
+        for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(kwnames); i++) {
+            PyObject *key = PyTuple_GET_ITEM(kwnames, i);
+            if (PyUnicode_CompareWithASCIIString(key, "seed") == 0) {
+                seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[nargs + i]);
+            }
+        }
    }


The manual argument parsing for METH_FASTCALL is missing several validation checks that were previously handled by PyArg_ParseTupleAndKeywords. This is a regression in API robustness:

Duplicate arguments: It doesn't detect if input or seed are provided both positionally and as keyword arguments (e.g., xxh32_digest(data, input=data)).

Unknown keyword arguments: It silently ignores any unexpected keys in kwnames.

Extra positional arguments: It doesn't check if nargs > 2.

While FASTCALL is intended for performance, it should still maintain standard Python argument validation behavior.

codspeed-hq · 2026-04-27T03:44:39Z

Merging this PR will not alter performance

✅ 96 untouched benchmarks

_{Comparing fastcall (0a22795) with master (ce1cb9f)}

PyErr_Clear() is now inside the PyUnicode_Check() branch, so if PyObject_GetBuffer fails on a non-string object, the original TypeError propagates correctly instead of being lost.

Checking str before attempting PyObject_GetBuffer avoids a guaranteed failure on str objects, saving a TypeError set/clear cycle. For non-str objects that fail GetBuffer, the original TypeError propagates correctly since no PyErr_Clear is ever called.

ifduyue · 2026-04-27T03:58:06Z

/gemini review

gemini-code-assist

Code Review

This pull request migrates the module-level digest functions to the METH_FASTCALL calling convention to improve performance. However, the manual argument parsing logic introduced is incomplete and deviates from standard Python behavior; it fails to validate unknown keywords, duplicate arguments, or extra positional parameters. Additionally, the feedback identifies missing error handling for Python API calls and inconsistencies in integer overflow behavior compared to the previous implementation. It is recommended to centralize the parsing logic and expand the test suite to cover these edge cases.

gemini-code-assist · 2026-04-27T03:59:58Z

+    if (nargs < 1) {
+        if (!kwnames) {
+            PyErr_SetString(PyExc_TypeError, "xxh32_digest() missing required argument 'input'");
+            return NULL;
+        }
+        int found = 0;
+        for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(kwnames); i++) {
+            if (PyUnicode_CompareWithASCIIString(PyTuple_GET_ITEM(kwnames, i), "input") == 0) {
+                if (nargs + i >= 0 && _get_buffer_or_str(args[nargs + i], &buf, &buf_owner) < 0)
+                    return NULL;
+                found = 1;
+                break;
+            }
+        }
+        if (!found) {
+            PyErr_SetString(PyExc_TypeError, "xxh32_digest() missing required argument 'input'");
+            return NULL;
+        }
+    } else {
+        if (_get_buffer_or_str(args[0], &buf, &buf_owner) < 0)
+            return NULL;
+    }

-    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s*|I:xxh32_digest", keywords, &buf, &seed)) {
-        return NULL;
+    if (nargs > 1) {
+        seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[1]);
+    }
+    if (kwnames) {
+        for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(kwnames); i++) {
+            PyObject *key = PyTuple_GET_ITEM(kwnames, i);
+            if (PyUnicode_CompareWithASCIIString(key, "seed") == 0) {
+                seed = (XXH32_hash_t)PyLong_AsUnsignedLongMask(args[nargs + i]);
+            }
+        }
    }


The manual argument parsing for METH_FASTCALL is incomplete and deviates from standard Python behavior in several ways:

Unknown Keywords: Keyword arguments other than input or seed are silently ignored. They should raise a TypeError.

Duplicate Arguments: If an argument is provided both positionally and via keyword (e.g., xxh32_digest(data, input=data)), it should raise a TypeError.

Positional Argument Limit: Extra positional arguments (e.g., xxh32_digest(a, b, c)) are ignored instead of raising a TypeError.

Overflow Behavior: PyLong_AsUnsignedLongMask silently masks values, whereas the previous I format in PyArg_ParseTupleAndKeywords raised an OverflowError for values exceeding 32 bits. This creates an inconsistency between the module-level functions and the object-based API.

Error Handling: PyLong_AsUnsignedLongMask can fail (e.g., if passed a non-integer). You must check PyErr_Occurred() and return NULL if an exception is set.

Since this logic is duplicated across 12 functions, I recommend implementing a shared internal helper function to handle argument parsing correctly and consistently.

gemini-code-assist · 2026-04-27T03:59:58Z

+    PyObject *sixtyfour = PyLong_FromLong(64);
+    PyObject *low = PyLong_FromUnsignedLongLong(intdigest.low64);
+    PyObject *high = PyLong_FromUnsignedLongLong(intdigest.high64);
+    PyObject *result = PyNumber_Lshift(high, sixtyfour);
    Py_DECREF(high);
    high = result;
    result = PyNumber_Add(high, low);
    Py_DECREF(high);
    Py_DECREF(low);
    Py_DECREF(sixtyfour);


This block lacks error handling for Python API calls. PyLong_FromLong, PyLong_FromUnsignedLongLong, PyNumber_Lshift, and PyNumber_Add can all return NULL on failure (e.g., out of memory). Passing NULL to subsequent calls or to Py_DECREF will cause a crash.

I suggest adding checks for NULL and using Py_XDECREF for safe cleanup.

PyObject *sixtyfour = PyLong_FromLong(64); PyObject *low = PyLong_FromUnsignedLongLong(intdigest.low64); PyObject *high = PyLong_FromUnsignedLongLong(intdigest.high64); PyObject *result = NULL; if (sixtyfour && low && high) { PyObject *shifted = PyNumber_Lshift(high, sixtyfour); if (shifted) { result = PyNumber_Add(shifted, low); Py_DECREF(shifted); } } Py_XDECREF(high); Py_XDECREF(low); Py_XDECREF(sixtyfour);

gemini-code-assist · 2026-04-27T03:59:58Z

+    def test_missing_input(self):
+        for a in ('xxh32', 'xxh64', 'xxh3_64', 'xxh3_128'):
+            for fn in self._funcs(a):
+                with self.assertRaises(TypeError):
+                    fn()


The tests should be expanded to cover edge cases in argument parsing. Specifically, verify that:

Unknown keyword arguments raise TypeError.

Duplicate arguments (positional + keyword) raise TypeError.

Too many positional arguments raise TypeError.

Invalid types for the seed argument raise an appropriate exception.

PyLong_AsUnsignedLongMask and PyLong_AsUnsignedLongLongMask return -1 (which is a valid seed value) when the argument cannot be converted to an integer. Without checking PyErr_Occurred(), the function would proceed with seed=UINT_MAX/ULLONG_MAX and return a result while an exception is still set, causing a SystemError.

Replaces 12 duplicated argument-parsing blocks with a single shared helper that correctly handles: - Unknown keywords (raises TypeError instead of silently ignoring) - Duplicate arguments (positional + keyword for same param) - Too many positional arguments - Proper error cleanup on failure (PyBuffer_Release + Py_XDECREF) Uses goto for clean error-path resource release.

PyLong_FromLong, PyLong_FromUnsignedLongLong, PyNumber_Lshift, and PyNumber_Add can all return NULL on failure. Passing NULL to subsequent calls or to Py_DECREF would crash. Use Py_XDECREF for safe cleanup and guard all intermediate results. Fixes both module-level xxh3_128_intdigest and PYXXH3_128_intdigest.

Cover unknown keywords, duplicate arguments, too many positional args, and invalid seed types across all four hash algorithms.

- Track seed_found in _parse_fastcall_args to detect duplicate seed passed both positionally and via keyword (TypeError) - Rewrite test_fastcall.py: 32 tests covering normal cases, error cases (unknown kw, duplicate args, missing input, invalid types), and seed overflow wrapping across all four algorithms

ifduyue · 2026-04-27T04:20:34Z

/gemini review

gemini-code-assist

Code Review

This pull request migrates module-level hash functions to the METH_FASTCALL calling convention and introduces a unified argument parsing helper. Feedback notes that using PyLong_AsUnsignedLongLongMask for seeds introduces a breaking change by silently masking overflows instead of raising an OverflowError. Additionally, an optimization was suggested to use a static reference for the constant integer 64 in 128-bit intdigest calculations to improve performance.

gemini-code-assist · 2026-04-27T04:25:28Z

+        input_found = 1;
+    }
+    if (nargs >= 2) {
+        *seed = PyLong_AsUnsignedLongLongMask(args[1]);


The use of PyLong_AsUnsignedLongLongMask changes the behavior for seed values that exceed the range of unsigned long long. Previously, PyArg_ParseTupleAndKeywords with the I or K format would raise an OverflowError. Now, these values will be silently masked. While the new tests in tests/test_fastcall.py suggest this wrapping behavior is intended, it is a breaking change from the previous implementation. If strict range checking is desired, consider using PyLong_AsUnsignedLongLong and handling the potential OverflowError.

gemini-code-assist · 2026-04-27T04:25:28Z

-
+    Py_XDECREF(buf_owner);
+
+    PyObject *sixtyfour = PyLong_FromLong(64);


Creating a new Python long object for the constant 64 on every call to intdigest is slightly inefficient. While PyLong_FromLong is fast for small integers in modern Python versions, for a performance-oriented 'Fastcall' implementation, you might consider using a static reference or a more direct way to construct the 128-bit integer if performance becomes a bottleneck.

static PyObject *sixtyfour = NULL; if (sixtyfour == NULL) { sixtyfour = PyLong_FromLong(64); } PyObject *low = PyLong_FromUnsignedLongLong(intdigest.low64); PyObject *high = PyLong_FromUnsignedLongLong(intdigest.high64);

Py_BuildValue('I'/'K', val) parses a format string on every call. PyLong_FromUnsignedLong/LongLong constructs the integer directly, saving the format parse overhead. 10-21% faster for intdigest functions.

ifduyue · 2026-04-27T04:47:13Z

/gemini review

Ensures inlining across all optimization levels and compilers, matching CPython's own convention for performance-critical local helpers.

gemini-code-assist

Code Review

This pull request implements METH_FASTCALL for module-level hash functions in _xxhash.c to improve performance by bypassing the overhead of PyArg_ParseTupleAndKeywords. It introduces internal helper functions _get_buffer_or_str and _parse_fastcall_args to handle positional and keyword arguments efficiently. The changes also include refactoring integer digest and seed retrieval to use direct PyLong creation functions and adding a comprehensive test suite in tests/test_fastcall.py to verify argument parsing, error handling, and seed wrapping behavior. I have no feedback to provide.

ifduyue · 2026-04-27T04:54:21Z

Call	Before (ns)	After (ns)	Δ
`xxh32_intdigest(b'hello')`	170	82	-52%
`xxh64_intdigest(b'hello')`	170	86	-49%
`xxh3_64_intdigest(b'hello')`	165	82	-50%
`xxh3_128_intdigest(b'hello')`	267	195	-27%
`xxh32_intdigest(b'hello', seed=42)`	338	145	-57%
`xxh64_intdigest(b'hello', seed=42)`	341	151	-56%
`xxh32_intdigest('hello')`	170	121	-29%
`xxh3_64_intdigest(b'x'*4096)`	425	338	-20%
`xxh32_hexdigest(b'hello')`	174	112	-36%
`xxh64_hexdigest(b'hello')`	161	106	-34%

Key contributors:

METH_FASTCALL → eliminates tuple/dict allocation per call (largest win)
Py_BuildValue → PyLong_FromUnsigned* → skips format string parse
Py_ALWAYS_INLINE on arg-parsing helpers → ensures no call overhead

Py_ALWAYS_INLINE was added in Python 3.12. On older versions (3.9-3.11) it is not defined, causing a compile error.

ifduyue added 2 commits April 27, 2026 04:14

gemini-code-assist Bot reviewed Apr 27, 2026

View reviewed changes

ifduyue added 2 commits April 27, 2026 11:47

fix: only clear error in _get_buffer_or_str after confirming str

fe8150b

PyErr_Clear() is now inside the PyUnicode_Check() branch, so if PyObject_GetBuffer fails on a non-string object, the original TypeError propagates correctly instead of being lost.

gemini-code-assist Bot reviewed Apr 27, 2026

View reviewed changes

ifduyue added 5 commits April 27, 2026 12:00

test: add fastcall edge case tests for argument parsing

37f3f68

Cover unknown keywords, duplicate arguments, too many positional args, and invalid seed types across all four hash algorithms.

ifduyue temporarily deployed to test-pypi April 27, 2026 04:18 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed Apr 27, 2026

View reviewed changes

perf: replace Py_BuildValue with PyLong_FromUnsignedLong/LongLong

14ff962

Py_BuildValue('I'/'K', val) parses a format string on every call. PyLong_FromUnsignedLong/LongLong constructs the integer directly, saving the format parse overhead. 10-21% faster for intdigest functions.

perf: mark _get_buffer_or_str and _parse_fastcall_args Py_ALWAYS_INLINE

13554d7

Ensures inlining across all optimization levels and compilers, matching CPython's own convention for performance-critical local helpers.

gemini-code-assist Bot reviewed Apr 27, 2026

View reviewed changes

fix: add Py_ALWAYS_INLINE fallback for Python < 3.12

0a22795

Py_ALWAYS_INLINE was added in Python 3.12. On older versions (3.9-3.11) it is not defined, causing a compile error.

ifduyue merged commit b3466b5 into master Apr 27, 2026
43 checks passed

ifduyue deleted the fastcall branch April 27, 2026 05:12

ifduyue temporarily deployed to test-pypi April 27, 2026 05:16 — with GitHub Actions Inactive


		Py_XDECREF(buf_owner);

		PyObject *sixtyfour = PyLong_FromLong(64);

Conversation

ifduyue commented Apr 27, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

codspeed-hq Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Uh oh!

ifduyue commented Apr 27, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

ifduyue commented Apr 27, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

ifduyue commented Apr 27, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

ifduyue commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codspeed-hq Bot commented Apr 27, 2026 •

edited

Loading