Commit 8040b20

[3.14] Regex HOWTO: invalid string literals result in SyntaxWarning (GH-148092) (#148097)
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
1 parent 64207c9 commit 8040b20
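
For context, the corrected wording can be checked with a short sketch. It assumes Python 3.12 or later, where an unrecognized escape sequence in a normal string literal triggers a `SyntaxWarning` (earlier versions raise `DeprecationWarning`). The helper `escape_warnings` is a hypothetical name used only for illustration and is not part of this patch.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return the categories of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of letting the default filters hide it.
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return [w.category for w in caught]


# A normal ("cooked") literal containing \d, which Python does not recognize.
cooked = escape_warnings('"\\d+"')
# The same pattern written as a raw string literal.
raw = escape_warnings('r"\\d+"')

print(cooked)  # warning categories for the cooked literal
print(raw)     # no warnings expected for the raw literal
```

The raw string literal compiles without a warning, which is why the HOWTO recommends raw string notation for patterns.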

1 file changed: +28 -28 lines changed

Doc/howto/regex.rst

Lines changed: 28 additions & 28 deletions
@@ -1,7 +1,7 @@
 .. _regex-howto:
 
 ****************************
-Regular Expression HOWTO
+Regular expression HOWTO
 ****************************
 
 :Author: A.M. Kuchling <amk@amk.ca>
@@ -47,7 +47,7 @@ Python code to do the processing; while Python code will be slower than an
 elaborate regular expression, it will also probably be more understandable.
 
 
-Simple Patterns
+Simple patterns
 ===============
 
 We'll start by learning about the simplest possible regular expressions. Since
@@ -59,7 +59,7 @@ expressions (deterministic and non-deterministic finite automata), you can refer
 to almost any textbook on writing compilers.
 
 
-Matching Characters
+Matching characters
 -------------------
 
 Most letters and characters will simply match themselves. For example, the
@@ -159,7 +159,7 @@ match even a newline. ``.`` is often used where you want to match "any
 character".
 
 
-Repeating Things
+Repeating things
 ----------------
 
 Being able to match varying sets of characters is the first thing regular
@@ -210,7 +210,7 @@ this RE against the string ``'abcbd'``.
 |      |           | ``[bcd]*`` is only matching     |
 |      |           | ``bc``.                         |
 +------+-----------+---------------------------------+
-| 6    | ``abcb``  | Try ``b`` again. This time      |
+| 7    | ``abcb``  | Try ``b`` again. This time      |
 |      |           | the character at the            |
 |      |           | current position is ``'b'``, so |
 |      |           | it succeeds.                    |
@@ -255,7 +255,7 @@ is equivalent to ``+``, and ``{0,1}`` is the same as ``?``. It's better to use
 to read.
 
 
-Using Regular Expressions
+Using regular expressions
 =========================
 
 Now that we've looked at some simple regular expressions, how do we actually use
@@ -264,7 +264,7 @@ expression engine, allowing you to compile REs into objects and then perform
 matches with them.
 
 
-Compiling Regular Expressions
+Compiling regular expressions
 -----------------------------
 
 Regular expressions are compiled into pattern objects, which have
@@ -295,7 +295,7 @@ disadvantage which is the topic of the next section.
 
 .. _the-backslash-plague:
 
-The Backslash Plague
+The backslash plague
 --------------------
 
 As stated earlier, regular expressions use the backslash character (``'\'``) to
@@ -335,7 +335,7 @@ expressions will often be written in Python code using this raw string notation.
 
 In addition, special escape sequences that are valid in regular expressions,
 but not valid as Python string literals, now result in a
-:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+:exc:`SyntaxWarning` and will eventually become a :exc:`SyntaxError`,
 which means the sequences will be invalid if raw string notation or escaping
 the backslashes isn't used.
 
@@ -351,7 +351,7 @@ the backslashes isn't used.
 +-------------------+------------------+
 
 
-Performing Matches
+Performing matches
 ------------------
 
 Once you have an object representing a compiled regular expression, what do you
@@ -369,10 +369,10 @@ for a complete listing.
 |                  | location where this RE matches.               |
 +------------------+-----------------------------------------------+
 | ``findall()``    | Find all substrings where the RE matches, and |
-|                  | returns them as a list.                       |
+|                  | return them as a list.                        |
 +------------------+-----------------------------------------------+
 | ``finditer()``   | Find all substrings where the RE matches, and |
-|                  | returns them as an :term:`iterator`.          |
+|                  | return them as an :term:`iterator`.           |
 +------------------+-----------------------------------------------+
 
 :meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If
@@ -473,7 +473,7 @@ Two pattern methods return all of the matches for a pattern.
 The ``r`` prefix, making the literal a raw string literal, is needed in this
 example because escape sequences in a normal "cooked" string literal that are
 not recognized by Python, as opposed to regular expressions, now result in a
-:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:exc:`SyntaxWarning` and will eventually become a :exc:`SyntaxError`. See
 :ref:`the-backslash-plague`.
 
 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
@@ -491,7 +491,7 @@ result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 (29, 31)
 
 
-Module-Level Functions
+Module-level functions
 ----------------------
 
 You don't have to create a pattern object and call its methods; the
@@ -518,7 +518,7 @@ Outside of loops, there's not much difference thanks to the internal
 cache.
 
 
-Compilation Flags
+Compilation flags
 -----------------
 
 .. currentmodule:: re
@@ -642,7 +642,7 @@ of each one.
 whitespace is in a character class or preceded by an unescaped backslash; this
 lets you organize and indent the RE more clearly. This flag also lets you put
 comments within a RE that will be ignored by the engine; comments are marked by
-a ``'#'`` that's neither in a character class or preceded by an unescaped
+a ``'#'`` that's neither in a character class nor preceded by an unescaped
 backslash.
 
 For example, here's a RE that uses :const:`re.VERBOSE`; see how much easier it
@@ -669,7 +669,7 @@ of each one.
 to understand than the version using :const:`re.VERBOSE`.
 
 
-More Pattern Power
+More pattern power
 ==================
 
 So far we've only covered a part of the features of regular expressions. In
@@ -679,7 +679,7 @@ retrieve portions of the text that was matched.
 
 .. _more-metacharacters:
 
-More Metacharacters
+More metacharacters
 -------------------
 
 There are some metacharacters that we haven't covered yet. Most of them will be
@@ -875,7 +875,7 @@ Backreferences like this aren't often useful for just searching through a string
 find out that they're *very* useful when performing string substitutions.
 
 
-Non-capturing and Named Groups
+Non-capturing and named groups
 ------------------------------
 
 Elaborate REs may use many groups, both to capture substrings of interest, and
@@ -979,7 +979,7 @@ current point. The regular expression for finding doubled words,
 'the the'
 
 
-Lookahead Assertions
+Lookahead assertions
 --------------------
 
 Another zero-width assertion is the lookahead assertion. Lookahead assertions
@@ -1061,7 +1061,7 @@ end in either ``bat`` or ``exe``:
 ``.*[.](?!bat$|exe$)[^.]*$``
 
 
-Modifying Strings
+Modifying strings
 =================
 
 Up to this point, we've simply performed searches against a static string.
@@ -1083,7 +1083,7 @@ using the following pattern methods:
 +------------------+-----------------------------------------------+
 
 
-Splitting Strings
+Splitting strings
 -----------------
 
 The :meth:`~re.Pattern.split` method of a pattern splits a string apart
@@ -1137,7 +1137,7 @@ argument, but is otherwise the same. ::
 ['Words', 'words, words.']
 
 
-Search and Replace
+Search and replace
 ------------------
 
 Another common task is to find all the matches for a pattern, and replace them
@@ -1236,15 +1236,15 @@ pattern object as the first parameter, or use embedded modifiers in the
 pattern string, e.g. ``sub("(?i)b+", "x", "bbbb BBBB")`` returns ``'x x'``.
 
 
-Common Problems
+Common problems
 ===============
 
 Regular expressions are a powerful tool for some applications, but in some ways
 their behaviour isn't intuitive and at times they don't behave the way you may
 expect them to. This section will point out some of the most common pitfalls.
 
 
-Use String Methods
+Use string methods
 ------------------
 
 Sometimes using the :mod:`re` module is a mistake. If you're matching a fixed
@@ -1310,7 +1310,7 @@ string and then backtracking to find a match for the rest of the RE. Use
 :func:`re.search` instead.
 
 
-Greedy versus Non-Greedy
+Greedy versus non-greedy
 ------------------------
 
 When repeating a regular expression, as in ``a*``, the resulting action is to
@@ -1388,9 +1388,9 @@ Feedback
 ========
 
 Regular expressions are a complicated topic. Did this document help you
-understand them? Were there parts that were unclear, or Problems you
+understand them? Were there parts that were unclear, or problems you
 encountered that weren't covered here? If so, please send suggestions for
-improvements to the author.
+improvements to the :ref:`issue tracker <using-the-tracker>`.
 
 The most complete book on regular expressions is almost certainly Jeffrey
 Friedl's Mastering Regular Expressions, published by O'Reilly. Unfortunately,
