Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the _xxhash C extension to support sub-interpreters and PEP 384 heap types by implementing per-module state and converting static types, accompanied by new sub-interpreter tests. The review identifies a double-free vulnerability in the module initialization where PyModule_AddType steals references that are later cleared in _clear. Additionally, it is recommended to use Py_DECREF for idiomatic cleanup during allocation failures and to add preprocessor guards for Python 3.12 features to ensure backward compatibility.
| typedef struct { | ||
| PyTypeObject *xxh32_type; | ||
| PyTypeObject *xxh64_type; | ||
| PyTypeObject *xxh3_64_type; | ||
| PyTypeObject *xxh3_128_type; | ||
| } module_state; |
There was a problem hiding this comment.
The type objects stored in module_state are currently unused by any other C functions in this module. Storing them here is redundant and, as implemented in _exec, leads to a double-free bug because PyModule_AddType steals the reference while _clear also decrefs it.
If you don't need to access these types from other C functions (e.g., for PyObject_TypeCheck), it is cleaner to remove them from the state entirely. If you do wish to keep them, you must Py_INCREF them before calling PyModule_AddType to ensure the module state owns its own reference.
| state->xxh32_type = (PyTypeObject *)PyType_FromSpec(&xxh32_spec); | ||
| if (state->xxh32_type == NULL) { | ||
| return -1; | ||
| } | ||
| if (PyModule_AddType(module, state->xxh32_type) < 0) { | ||
| return -1; | ||
| } |
There was a problem hiding this comment.
As noted in the module_state definition, PyModule_AddType steals a reference to the type object on success. In your current implementation, state->xxh32_type becomes a borrowed reference. When _clear is called, it will attempt to decref this borrowed reference, leading to a double-free when the module's dictionary is also cleared.
If you decide to keep the types in the state, you should increment the reference count before adding it to the module.
state->xxh32_type = (PyTypeObject *)PyType_FromSpec(&xxh32_spec);
if (state->xxh32_type == NULL) {
return -1;
}
Py_INCREF(state->xxh32_type);
if (PyModule_AddType(module, state->xxh32_type) < 0) {
Py_DECREF(state->xxh32_type);
return -1;
}| Py_TYPE(self)->tp_free((PyObject *)self); | ||
| return PyErr_NoMemory(); |
There was a problem hiding this comment.
Instead of manually calling tp_free and returning PyErr_NoMemory(), it is more idiomatic to use Py_DECREF(self). This ensures that the object is destroyed through the standard dealloc path. Since PYXXH32_dealloc calls XXH32_freeState, and the xxhash library handles NULL pointers safely in that function, this is safe and cleaner. This pattern repeats in other _new and _copy methods in this file.
Py_DECREF(self);
return PyErr_NoMemory();|
|
||
| static PyModuleDef_Slot slots[] = { | ||
| {Py_mod_exec, _exec}, | ||
| {Py_mod_multiple_interpreters, Py_MOD_MULTIPLE_INTERPRETERS_SUPPORTED}, |
There was a problem hiding this comment.
Py_mod_multiple_interpreters and Py_MOD_MULTIPLE_INTERPRETERS_SUPPORTED were introduced in Python 3.12. To maintain compatibility with older Python versions during compilation, this slot should be guarded.
#ifdef Py_mod_multiple_interpreters
{Py_mod_multiple_interpreters, Py_MOD_MULTIPLE_INTERPRETERS_SUPPORTED},
#endif- Add Py_mod_multiple_interpreters slot to explicitly declare support - Fix resource leaks in _new/_copy when XXH*_createState() fails - Use tp_free instead of PyObject_Del for proper heap type dealloc - Add tests for sub-interpreter isolation and per-module state
12690e5 to
f50fa52
Compare
No description provided.