fix(descriptors): silence SyntaxWarning '\d' in TextMatch docstring #1883

Open
jbbqqf wants to merge 1 commit into evidentlyai:main from jbbqqf:fix/text-match-docstring-syntax-warning
jbbqqf commented May 21, 2026

Why

Importing evidently on Python 3.12+ emits:

src/evidently/descriptors/text_match.py:140: SyntaxWarning: invalid escape sequence '\d'
  TextMatch(text_column="description", match_items=r"\b\d{3}-\d{3}-\d{4}\b", match_type="regex")

The line in question lives inside the TextMatch class docstring. The
r"..." prefix only marks the example literal as raw; the docstring
itself is a plain string, so Python processes the backslash escapes in
it. \d is not a recognized escape sequence, which triggers the
SyntaxWarning (\b is a valid escape, backspace, so it does not warn).
The warning is harmless at runtime but noisy in user notebooks, and it
shows up in the CI logs of any downstream project that imports evidently.
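
A minimal sketch of the mechanism, using a hypothetical Demo class rather than the real TextMatch source. It compiles two small source strings with the built-in compile() and records the warnings each one produces. The category check also accepts DeprecationWarning, on the assumption that the snippet may run on an interpreter older than 3.12, where invalid escapes are reported under that category.

```python
import warnings

# Hypothetical stand-ins for text_match.py: the same class, once with a plain
# docstring and once with a raw docstring. The doubled backslashes here produce
# a single backslash in the compiled source text.
PLAIN_SOURCE = 'class Demo:\n    """Example: Demo(pattern=r"\\d{3}-\\d{4}")"""\n'
RAW_SOURCE = 'class Demo:\n    r"""Example: Demo(pattern=r"\\d{3}-\\d{4}")"""\n'


def escape_warnings(source: str) -> list:
    """Compile `source` and return the escape-related warnings that were recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        compile(source, "<demo>", "exec")
    # SyntaxWarning on Python 3.12+, DeprecationWarning on older interpreters.
    return [
        w for w in caught
        if issubclass(w.category, (SyntaxWarning, DeprecationWarning))
    ]


plain_warnings = escape_warnings(PLAIN_SOURCE)
raw_warnings = escape_warnings(RAW_SOURCE)
print(len(plain_warnings) >= 1, raw_warnings == [])
```

The inner r"..." is just text inside the outer string, so only the prefix on the docstring itself changes how the compiler treats the backslashes.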

Fix

Mark the docstring itself raw (r""" ... """). The rendered docstring is
unchanged (verified via TextMatch.__doc__) and the regex example still
displays the same way.

A small regression test (test_text_match_module_compiles_without_syntax_warning)
compiles text_match.py under warnings.simplefilter("error", SyntaxWarning)
so this can't silently regress on a future docstring edit. The filter
promotes the SyntaxWarning to a SyntaxError at compile time, and
py_compile.compile(..., doraise=True) surfaces that as a PyCompileError.
Because the test recompiles the source file instead of importing it, it
works even when the module is already imported (and therefore cached)
elsewhere in the process, which is what happens when running with the
test suite's shared conftest import.
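
The test body is not reproduced in this description. A sketch of the same idea, assuming a hypothetical helper compiles_cleanly and throwaway demo files in place of text_match.py, could look like the following. The version check is an addition for illustration so the sketch also behaves sensibly before Python 3.12.

```python
import py_compile
import sys
import tempfile
import warnings
from pathlib import Path

# Invalid escapes are a SyntaxWarning on 3.12+ and a DeprecationWarning before that.
ESCAPE_CATEGORY = SyntaxWarning if sys.version_info >= (3, 12) else DeprecationWarning


def compiles_cleanly(path: Path) -> bool:
    """Return True if `path` byte-compiles without an invalid-escape warning."""
    with warnings.catch_warnings():
        # Promote the warning to an error so the compile step fails loudly.
        warnings.simplefilter("error", ESCAPE_CATEGORY)
        with tempfile.TemporaryDirectory() as out_dir:
            try:
                # Compiling from the file bypasses any already-imported module.
                py_compile.compile(
                    str(path), cfile=str(Path(out_dir) / "out.pyc"), doraise=True
                )
            except py_compile.PyCompileError:
                return False
    return True


# Hypothetical demo files: a plain module docstring versus a raw one.
with tempfile.TemporaryDirectory() as demo_dir:
    plain = Path(demo_dir) / "plain.py"
    raw = Path(demo_dir) / "raw.py"
    plain.write_text('"""Example: \\d{3}"""\n', encoding="utf-8")
    raw.write_text('r"""Example: \\d{3}"""\n', encoding="utf-8")
    plain_ok = compiles_cleanly(plain)
    raw_ok = compiles_cleanly(raw)

print(plain_ok, raw_ok)
```

Writing the .pyc to a temporary directory keeps the check from touching the project's __pycache__.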

Reproduce BEFORE/AFTER yourself (copy-paste)

git clone https://github.com/evidentlyai/evidently.git && cd evidently
uv venv .venv && . .venv/bin/activate
uv pip install -e . pytest pytest-timeout pytest-asyncio

# === BEFORE (origin/main) — expected: 1 failure ===
git checkout origin/main
git fetch https://github.com/jbbqqf/evidently.git fix/text-match-docstring-syntax-warning
git checkout FETCH_HEAD -- tests/future/descriptors/test_text_match.py
pytest tests/future/descriptors/test_text_match.py::test_text_match_module_compiles_without_syntax_warning -v
# Expected: 1 failed — SyntaxError: invalid escape sequence '\d'

# === AFTER (this PR) — expected: green ===
git checkout FETCH_HEAD -- src/evidently/descriptors/text_match.py
pytest tests/future/descriptors/test_text_match.py::test_text_match_module_compiles_without_syntax_warning -v
# Expected: 1 passed

What I ran locally

$ python -W error::SyntaxWarning -c "import py_compile; py_compile.compile('src/evidently/descriptors/text_match.py', doraise=True); print('ok')"
ok

$ pytest tests/future/descriptors/test_text_match.py tests/future/descriptors/test_text_match_compat.py --timeout=30 -q
87 passed, 6 skipped, 2 flaky (unrelated nltk fetch races on parallel runs; both pass in isolation)

Failing output before the fix (verbatim):

File "src/evidently/descriptors/text_match.py", line 140
    TextMatch(text_column="description", match_items=r"\b\d{3}-\d{3}-\d{4}\b", match_type="regex")
                                                         ^^
SyntaxError: invalid escape sequence '\d'

Edge cases

| Concern | Verified |
| --- | --- |
| Rendered docstring unchanged | TextMatch.__doc__ first 200 chars identical pre/post |
| Other backslash-sensitive docstrings in src/ | py_compile walk of src/evidently/**/*.py flags only this one site |
| Test cached-import edge case | Uses py_compile.compile(..., doraise=True) instead of relying on import |
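
The script used for the source-tree walk is not shown. One way such a scan could be written is sketched below; it uses the built-in compile() rather than py_compile, and the function name and demo files are hypothetical.

```python
import tempfile
import warnings
from pathlib import Path


def files_with_escape_warnings(root: Path) -> list:
    """Return the .py files under `root` that warn (or fail) when compiled."""
    flagged = []
    for path in sorted(root.rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            try:
                compile(source, str(path), "exec")
            except SyntaxError:
                # A file that does not compile at all is flagged as well.
                flagged.append(path)
                continue
        # SyntaxWarning on Python 3.12+, DeprecationWarning on older interpreters.
        if any(
            issubclass(w.category, (SyntaxWarning, DeprecationWarning))
            for w in caught
        ):
            flagged.append(path)
    return flagged


# Hypothetical demo tree standing in for src/evidently.
with tempfile.TemporaryDirectory() as demo_root:
    root = Path(demo_root)
    (root / "pkg").mkdir()
    (root / "pkg" / "plain.py").write_text('"""Example: \\d{3}"""\n', encoding="utf-8")
    (root / "pkg" / "raw.py").write_text('r"""Example: \\d{3}"""\n', encoding="utf-8")
    flagged_names = [p.name for p in files_with_escape_warnings(root)]

print(flagged_names)
```

Pointing the function at Path("src/evidently") would be the equivalent of the walk described in the table, under the same assumptions.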

Disclosure: I drafted this PR with help from Claude Code while triaging
older issues. The reproduction, fix, and test runs above were executed
locally; outputs are copied verbatim.

The TextMatch class docstring contains an inline regex example
`r"\b\d{3}-\d{3}-\d{4}\b"`. The `r"..."` prefix only marks the *example*
literal as raw — the docstring itself is a plain string, so Python 3.12+
parses `\b` and `\d` inside it and emits
`SyntaxWarning: invalid escape sequence '\d'` on every import.

Mark the docstring itself raw (`r""" ... """`). The rendered help text is
unchanged and the example regex still displays the same way.

Add a regression test that compiles the module under
`warnings.simplefilter("error", SyntaxWarning)` so this can't silently
regress on a future docstring edit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>