 .. _regex-howto:

 ****************************
-  Regular Expression HOWTO
+  Regular expression HOWTO
 ****************************

 :Author: A.M. Kuchling <amk@amk.ca>
@@ -47,7 +47,7 @@ Python code to do the processing; while Python code will be slower than an
 elaborate regular expression, it will also probably be more understandable.


-Simple Patterns
+Simple patterns
 ===============

 We'll start by learning about the simplest possible regular expressions. Since
@@ -59,7 +59,7 @@ expressions (deterministic and non-deterministic finite automata), you can refer
 to almost any textbook on writing compilers.


-Matching Characters
+Matching characters
 -------------------

 Most letters and characters will simply match themselves. For example, the
@@ -159,7 +159,7 @@ match even a newline. ``.`` is often used where you want to match "any
 character".


-Repeating Things
+Repeating things
 ----------------

 Being able to match varying sets of characters is the first thing regular
@@ -210,7 +210,7 @@ this RE against the string ``'abcbd'``.
 |      |           | ``[bcd]*`` is only matching     |
 |      |           | ``bc``.                         |
 +------+-----------+---------------------------------+
-| 6    | ``abcb``  | Try ``b`` again. This time      |
+| 7    | ``abcb``  | Try ``b`` again. This time      |
 |      |           | the character at the            |
 |      |           | current position is ``'b'``, so |
 |      |           | it succeeds.                    |
@@ -255,7 +255,7 @@ is equivalent to ``+``, and ``{0,1}`` is the same as ``?``. It's better to use
 to read.


-Using Regular Expressions
+Using regular expressions
 =========================

 Now that we've looked at some simple regular expressions, how do we actually use
@@ -264,7 +264,7 @@ expression engine, allowing you to compile REs into objects and then perform
 matches with them.


-Compiling Regular Expressions
+Compiling regular expressions
 -----------------------------

 Regular expressions are compiled into pattern objects, which have
@@ -295,7 +295,7 @@ disadvantage which is the topic of the next section.

 .. _the-backslash-plague:

-The Backslash Plague
+The backslash plague
 --------------------

 As stated earlier, regular expressions use the backslash character (``'\'``) to
@@ -335,7 +335,7 @@ expressions will often be written in Python code using this raw string notation.

 In addition, special escape sequences that are valid in regular expressions,
 but not valid as Python string literals, now result in a
-:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+:exc:`SyntaxWarning` and will eventually become a :exc:`SyntaxError`,
 which means the sequences will be invalid if raw string notation or escaping
 the backslashes isn't used.

@@ -351,7 +351,7 @@ the backslashes isn't used.
 +-------------------+------------------+


-Performing Matches
+Performing matches
 ------------------

 Once you have an object representing a compiled regular expression, what do you
@@ -369,10 +369,10 @@ for a complete listing.
 |                  | location where this RE matches.               |
 +------------------+-----------------------------------------------+
 | ``findall()``    | Find all substrings where the RE matches, and |
-|                  | returns them as a list.                       |
+|                  | return them as a list.                        |
 +------------------+-----------------------------------------------+
 | ``finditer()``   | Find all substrings where the RE matches, and |
-|                  | returns them as an :term:`iterator`.          |
+|                  | return them as an :term:`iterator`.           |
 +------------------+-----------------------------------------------+

 :meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If
@@ -473,7 +473,7 @@ Two pattern methods return all of the matches for a pattern.
 The ``r`` prefix, making the literal a raw string literal, is needed in this
 example because escape sequences in a normal "cooked" string literal that are
 not recognized by Python, as opposed to regular expressions, now result in a
-:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:exc:`SyntaxWarning` and will eventually become a :exc:`SyntaxError`. See
 :ref:`the-backslash-plague`.

 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
@@ -491,7 +491,7 @@ result. The :meth:`~re.Pattern.finditer` method returns a sequence of
    (29, 31)


-Module-Level Functions
+Module-level functions
 ----------------------

 You don't have to create a pattern object and call its methods; the
@@ -518,7 +518,7 @@ Outside of loops, there's not much difference thanks to the internal
 cache.


-Compilation Flags
+Compilation flags
 -----------------

 .. currentmodule:: re
@@ -642,7 +642,7 @@ of each one.
    whitespace is in a character class or preceded by an unescaped backslash; this
    lets you organize and indent the RE more clearly. This flag also lets you put
    comments within a RE that will be ignored by the engine; comments are marked by
-   a ``'#'`` that's neither in a character class or preceded by an unescaped
+   a ``'#'`` that's neither in a character class nor preceded by an unescaped
    backslash.

    For example, here's a RE that uses :const:`re.VERBOSE`; see how much easier it
@@ -669,7 +669,7 @@ of each one.
    to understand than the version using :const:`re.VERBOSE`.


-More Pattern Power
+More pattern power
 ==================

 So far we've only covered a part of the features of regular expressions. In
@@ -679,7 +679,7 @@ retrieve portions of the text that was matched.

 .. _more-metacharacters:

-More Metacharacters
+More metacharacters
 -------------------

 There are some metacharacters that we haven't covered yet. Most of them will be
@@ -875,7 +875,7 @@ Backreferences like this aren't often useful for just searching through a string
 find out that they're *very* useful when performing string substitutions.


-Non-capturing and Named Groups
+Non-capturing and named groups
 ------------------------------

 Elaborate REs may use many groups, both to capture substrings of interest, and
@@ -979,7 +979,7 @@ current point. The regular expression for finding doubled words,
    'the the'


-Lookahead Assertions
+Lookahead assertions
 --------------------

 Another zero-width assertion is the lookahead assertion. Lookahead assertions
@@ -1061,7 +1061,7 @@ end in either ``bat`` or ``exe``:
 ``.*[.](?!bat$|exe$)[^.]*$``


-Modifying Strings
+Modifying strings
 =================

 Up to this point, we've simply performed searches against a static string.
@@ -1083,7 +1083,7 @@ using the following pattern methods:
 +------------------+-----------------------------------------------+


-Splitting Strings
+Splitting strings
 -----------------

 The :meth:`~re.Pattern.split` method of a pattern splits a string apart
@@ -1137,7 +1137,7 @@ argument, but is otherwise the same. ::
    ['Words', 'words, words.']


-Search and Replace
+Search and replace
 ------------------

 Another common task is to find all the matches for a pattern, and replace them
@@ -1236,15 +1236,15 @@ pattern object as the first parameter, or use embedded modifiers in the
 pattern string, e.g. ``sub("(?i)b+", "x", "bbbb BBBB")`` returns ``'x x'``.


-Common Problems
+Common problems
 ===============

 Regular expressions are a powerful tool for some applications, but in some ways
 their behaviour isn't intuitive and at times they don't behave the way you may
 expect them to. This section will point out some of the most common pitfalls.


-Use String Methods
+Use string methods
 ------------------

 Sometimes using the :mod:`re` module is a mistake. If you're matching a fixed
@@ -1310,7 +1310,7 @@ string and then backtracking to find a match for the rest of the RE. Use
 :func:`re.search` instead.


-Greedy versus Non-Greedy
+Greedy versus non-greedy
 ------------------------

 When repeating a regular expression, as in ``a*``, the resulting action is to
@@ -1388,9 +1388,9 @@ Feedback
 ========

 Regular expressions are a complicated topic. Did this document help you
-understand them? Were there parts that were unclear, or Problems you
+understand them? Were there parts that were unclear, or problems you
 encountered that weren't covered here? If so, please send suggestions for
-improvements to the author.
+improvements to the :ref:`issue tracker <using-the-tracker>`.

 The most complete book on regular expressions is almost certainly Jeffrey
 Friedl's Mastering Regular Expressions, published by O'Reilly. Unfortunately,