Skip to content

subinterpreter support#143

Closed
ifduyue wants to merge 2 commits intomasterfrom
duyue/multi-phase-init
Closed

subinterpreter support#143
ifduyue wants to merge 2 commits intomasterfrom
duyue/multi-phase-init

Conversation

@ifduyue
Copy link
Copy Markdown
Owner

@ifduyue ifduyue commented Apr 26, 2026

No description provided.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the _xxhash C extension to support sub-interpreters and PEP 384 heap types by implementing per-module state and converting static types, accompanied by new sub-interpreter tests. The review identifies a double-free vulnerability in the module initialization where PyModule_AddType steals references that are later cleared in _clear. Additionally, it is recommended to use Py_DECREF for idiomatic cleanup during allocation failures and to add preprocessor guards for Python 3.12 features to ensure backward compatibility.

Comment thread src/_xxhash.c
Comment on lines +52 to +57
typedef struct {
PyTypeObject *xxh32_type;
PyTypeObject *xxh64_type;
PyTypeObject *xxh3_64_type;
PyTypeObject *xxh3_128_type;
} module_state;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The type objects stored in module_state are currently unused by any other C functions in this module. Storing them here is redundant and, as implemented in _exec, leads to a double-free bug because PyModule_AddType steals the reference while _clear also decrefs it.

If you don't need to access these types from other C functions (e.g., for PyObject_TypeCheck), it is cleaner to remove them from the state entirely. If you do wish to keep them, you must Py_INCREF them before calling PyModule_AddType to ensure the module state owns its own reference.

Comment thread src/_xxhash.c
Comment on lines +1531 to +1537
state->xxh32_type = (PyTypeObject *)PyType_FromSpec(&xxh32_spec);
if (state->xxh32_type == NULL) {
return -1;
}
if (PyModule_AddType(module, state->xxh32_type) < 0) {
return -1;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

As noted in the module_state definition, PyModule_AddType steals a reference to the type object on success. In your current implementation, state->xxh32_type becomes a borrowed reference. When _clear is called, it will attempt to decref this borrowed reference, leading to a double-free when the module's dictionary is also cleared.

If you decide to keep the types in the state, you should increment the reference count before adding it to the module.

    state->xxh32_type = (PyTypeObject *)PyType_FromSpec(&xxh32_spec);
    if (state->xxh32_type == NULL) {
        return -1;
    }
    Py_INCREF(state->xxh32_type);
    if (PyModule_AddType(module, state->xxh32_type) < 0) {
        Py_DECREF(state->xxh32_type);
        return -1;
    }

Comment thread src/_xxhash.c
Comment on lines +422 to +423
Py_TYPE(self)->tp_free((PyObject *)self);
return PyErr_NoMemory();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Instead of manually calling tp_free and returning PyErr_NoMemory(), it is more idiomatic to use Py_DECREF(self). This ensures that the object is destroyed through the standard dealloc path. Since PYXXH32_dealloc calls XXH32_freeState, and the xxhash library handles NULL pointers safely in that function, this is safe and cleaner. This pattern repeats in other _new and _copy methods in this file.

        Py_DECREF(self);
        return PyErr_NoMemory();

Comment thread src/_xxhash.c

static PyModuleDef_Slot slots[] = {
{Py_mod_exec, _exec},
{Py_mod_multiple_interpreters, Py_MOD_MULTIPLE_INTERPRETERS_SUPPORTED},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Py_mod_multiple_interpreters and Py_MOD_MULTIPLE_INTERPRETERS_SUPPORTED were introduced in Python 3.12. To maintain compatibility with older Python versions during compilation, this slot should be guarded.

#ifdef Py_mod_multiple_interpreters
    {Py_mod_multiple_interpreters, Py_MOD_MULTIPLE_INTERPRETERS_SUPPORTED},
#endif

ifduyue added 2 commits April 26, 2026 18:52
- Add Py_mod_multiple_interpreters slot to explicitly declare support
- Fix resource leaks in _new/_copy when XXH*_createState() fails
- Use tp_free instead of PyObject_Del for proper heap type dealloc
- Add tests for sub-interpreter isolation and per-module state
@ifduyue ifduyue force-pushed the duyue/multi-phase-init branch from 12690e5 to f50fa52 Compare April 26, 2026 10:52
@ifduyue ifduyue closed this Apr 26, 2026
@ifduyue ifduyue deleted the duyue/multi-phase-init branch April 26, 2026 10:55
@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented Apr 26, 2026

Merging this PR will not alter performance

✅ 96 untouched benchmarks


Comparing duyue/multi-phase-init (f50fa52) with master (2bcd375)

Open in CodSpeed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant