Skip to content

gh-148241: Fix json serialization for str subclasses#148249

Open
vstinner wants to merge 3 commits intopython:mainfrom
vstinner:json_str_subclass
Open

gh-148241: Fix json serialization for str subclasses#148249
vstinner wants to merge 3 commits intopython:mainfrom
vstinner:json_str_subclass

Conversation

@vstinner
Copy link
Copy Markdown
Member

@vstinner vstinner commented Apr 8, 2026

Fix json serialization: no longer call str(obj) on str subclasses.

Replace PyUnicodeWriter_WriteStr() with PyUnicodeWriter_WriteASCII() and private _PyUnicodeWriter_WriteStr().

Fix json serialization: no longer call str(obj) on str subclasses.

Replace PyUnicodeWriter_WriteStr() with PyUnicodeWriter_WriteASCII()
and private _PyUnicodeWriter_WriteStr().
if (PyUnicodeWriter_WriteStr(writer, pystr) < 0) {
// gh-148241: Avoid PyUnicodeWriter_WriteStr() which calls str(obj)
// on str subclasses
if (_PyUnicodeWriter_WriteStr((_PyUnicodeWriter*)writer, pystr) < 0) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative to using the private API (which is bad!) is to call PyUnicode_FromObject(). But PyUnicode_FromObject() has to copy the string, it's less efficient.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also the issue gh-148250 that I just created for this issue.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the private API

It would raise a deprecated warning.

_Py_DEPRECATED_EXTERNALLY(3.14) PyAPI_FUNC(int) _PyUnicodeWriter_WriteStr(
    _PyUnicodeWriter *writer,
    PyObject *str); 

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to cast PyUnicodeWriter* to _PyUnicodeWriter*?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's safe. That's how PyUnicodeWriter_WriteStr() is implemented: by casting the first argument to _PyUnicodeWriter* and calling _PyUnicodeWriter_WriteStr().

Do you prefer calling PyUnicode_FromObject() to avoid the private API?

It would raise a deprecated warning.

_Py_DEPRECATED_EXTERNALLY() only emits a deprecation warning if the Py_BUILD_CORE macro is not defined. This macro is defined in Modules/_json.c, so no warning is emitted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to cast PyUnicodeWriter* to _PyUnicodeWriter*?

Yes. I checked, PyUnicodeWriter is an opaque type that is always cast to _PyUnicodeWriter before use.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of public API.

Is PyUnicodeWriter_WriteSubstring applicable for this?

PyUnicodeWriter_WriteSubstring the only PyUnicodeWriter public API that treats str subclasses as str.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fine. We already use private API in this file.

@vstinner
Copy link
Copy Markdown
Member Author

vstinner commented Apr 8, 2026

cc @nineteendo @methane @serhiy-storchaka (who worked on PR gh-133239 which introduced the regession)

@vstinner
Copy link
Copy Markdown
Member Author

vstinner commented Apr 8, 2026

"Tests / CIFuzz / python3-libraries (address)" failure was already fixed by google/oss-fuzz#15317 but the fix was not deployed yet.

@StanFromIreland
Copy link
Copy Markdown
Member

Re-running, it should be OK now.

Copy link
Copy Markdown
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is test_enum specially for such cases. I think it is better to add a test here.

@vstinner vstinner requested a review from ethanfurman as a code owner April 8, 2026 15:32
@vstinner
Copy link
Copy Markdown
Member Author

vstinner commented Apr 8, 2026

There is test_enum specially for such cases. I think it is better to add a test here.

Ok, I also added a test to test_json.test_enum. I would prefer to keep also other tests since they test different code paths.

Copy link
Copy Markdown
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 👍

I am not sure we need so many tests, but this is up to you.

if (PyUnicodeWriter_WriteStr(writer, pystr) < 0) {
// gh-148241: Avoid PyUnicodeWriter_WriteStr() which calls str(obj)
// on str subclasses
if (_PyUnicodeWriter_WriteStr((_PyUnicodeWriter*)writer, pystr) < 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fine. We already use private API in this file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants