@@ -362,20 +362,21 @@ for a complete listing.
362362+------------------+-----------------------------------------------+
363363| Method/Attribute | Purpose |
364364+==================+===============================================+
365- | ``match()`` | Determine if the RE matches at the beginning |
366- | | of the string. |
367- +------------------+-----------------------------------------------+
368365| ``search()`` | Scan through a string, looking for any |
369366| | location where this RE matches. |
370367+------------------+-----------------------------------------------+
368+ | ``prefixmatch()``| Determine if the RE matches at the beginning |
369+ | | of the string. Previously named :ref:`match() |
370+ | | <prefixmatch-vs-match>`. |
371+ +------------------+-----------------------------------------------+
371372| ``findall()`` | Find all substrings where the RE matches, and |
372373| | return them as a list. |
373374+------------------+-----------------------------------------------+
374375| ``finditer()`` | Find all substrings where the RE matches, and |
375376| | return them as an :term:`iterator`. |
376377+------------------+-----------------------------------------------+
377378
378- :meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If
379+ :meth:`~re.Pattern.search` and :meth:`~re.Pattern.prefixmatch` return ``None`` if no match can be found. If
379380they're successful, a :ref:`match object <match-objects>` instance is returned,
380381containing information about the match: where it starts and ends, the substring
381382it matched, and more.
@@ -393,19 +394,19 @@ Python interpreter, import the :mod:`re` module, and compile a RE::
393394
394395Now, you can try matching various strings against the RE ``[a-z]+``. An empty
395396string shouldn't match at all, since ``+`` means 'one or more repetitions'.
396- :meth:`~re.Pattern.match` should return ``None`` in this case, which will cause the
397+ :meth:`~re.Pattern.search` should return ``None`` in this case, which will cause the
397398interpreter to print no output. You can explicitly print the result of
398- :meth:`!match` to make this clear. ::
399+ :meth:`!search` to make this clear. ::
399400
400- >>> p.match("")
401- >>> print(p.match(""))
401+ >>> p.search("")
402+ >>> print(p.search(""))
402403 None
403404
404405Now, let's try it on a string that it should match, such as ``tempo``. In this
405- case, :meth:`~re.Pattern.match` will return a :ref:`match object <match-objects>`, so you
406+ case, :meth:`~re.Pattern.search` will return a :ref:`match object <match-objects>`, so you
406407should store the result in a variable for later use. ::
407408
408- >>> m = p.match('tempo')
409+ >>> m = p.search('tempo')
409410 >>> m
410411 <re.Match object; span=(0, 5), match='tempo'>
411412
@@ -437,27 +438,28 @@ Trying these methods will soon clarify their meaning::
437438
438439:meth:`~re.Match.group` returns the substring that was matched by the RE. :meth:`~re.Match.start`
439440and :meth:`~re.Match.end` return the starting and ending index of the match. :meth:`~re.Match.span`
440- returns both start and end indexes in a single tuple. Since the :meth:`~re.Pattern.match`
441- method only checks if the RE matches at the start of a string, :meth:`!start`
442- will always be zero. However, the :meth:`~re.Pattern.search` method of patterns
443- scans through the string, so the match may not start at zero in that
444- case. ::
441+ returns both start and end indexes in a single tuple.
442+ The :meth:`~re.Pattern.search` method of patterns
443+ scans through the string, so the match may not start at zero.
444+ However, the :meth:`~re.Pattern.prefixmatch`
445+ method only checks if the RE matches at the start of a string, so :meth:`!start`
446+ will always be zero in that case. ::
445447
446- >>> print(p.match('::: message'))
447- None
448448 >>> m = p.search('::: message'); print(m)
449449 <re.Match object; span=(4, 11), match='message'>
450450 >>> m.group()
451451 'message'
452452 >>> m.span()
453453 (4, 11)
454+ >>> print(p.prefixmatch('::: message'))
455+ None
454456
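As a minimal sketch of the point about ``start()``, the snippet below compares the two methods on a few strings. It assumes ``prefixmatch()`` may be missing on interpreters older than Python 3.15 and falls back to ``match()`` in that case; the name ``prefix`` and the sample strings are introduced here purely for illustration.

```python
import re

p = re.compile(r"[a-z]+")

# Assumption: Pattern.prefixmatch exists on Python 3.15+ only, so fall back
# to the older Pattern.match name when it is not available.
prefix = getattr(p, "prefixmatch", p.match)

samples = ["tempo", "::: message", "42 is the answer"]

search_starts = []
prefix_starts = []
for text in samples:
    s = prefix(text)
    f = p.search(text)
    # search() may begin anywhere; an anchored match, if any, begins at 0.
    search_starts.append(f.start() if f else None)
    prefix_starts.append(s.start() if s else None)

print(search_starts)
print(prefix_starts)
```

Whenever the anchored call succeeds, its start index is zero; when the first lowercase run appears later in the string, only ``search()`` reports it.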
455457In actual programs, the most common style is to store the
456458:ref:`match object <match-objects>` in a variable, and then check if it was
457459``None``. This usually looks like::
458460
459461 p = re.compile( ... )
460- m = p.match( 'string goes here' )
462+ m = p.search( 'string goes here' )
461463 if m:
462464 print('Match found: ', m.group())
463465 else:
@@ -495,15 +497,15 @@ Module-level functions
495497----------------------
496498
497499You don't have to create a pattern object and call its methods; the
498- :mod:`re` module also provides top-level functions called :func:`~re.match`,
499- :func:`~re.search`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions
500+ :mod:`re` module also provides top-level functions called :func:`~re.search`,
501+ :func:`~re.prefixmatch`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions
500502take the same arguments as the corresponding pattern method with
501503the RE string added as the first argument, and still return either ``None`` or a
502504:ref:`match object <match-objects>` instance. ::
503505
504- >>> print(re.match(r'From\s+', 'Fromage amk'))
506+ >>> print(re.prefixmatch(r'From\s+', 'Fromage amk'))
505507 None
506- >>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS
508+ >>> re.prefixmatch(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS
507509 <re.Match object; span=(0, 5), match='From '>
508510
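The relationship between a module-level function and the corresponding pattern method can be sketched as follows. This is an illustration rather than a description of the module's internals; the helper names ``module_prefixmatch`` and ``method_prefixmatch`` are assumptions, and both fall back to the ``match`` spelling on Python versions that lack ``prefixmatch``.

```python
import re

pattern = r"From\s+"
line = "From amk Thu May 14 19:12:10 1998"

# Assumption: prefer the 3.15+ names, fall back to the older ones otherwise.
module_prefixmatch = getattr(re, "prefixmatch", re.match)

compiled = re.compile(pattern)
method_prefixmatch = getattr(compiled, "prefixmatch", compiled.match)

# The module-level call takes the RE string as an extra first argument.
via_module = module_prefixmatch(pattern, line)
# The method call uses the already compiled pattern object.
via_method = method_prefixmatch(line)

print(via_module.span(), via_method.span())
```

Both calls report the same span, so the choice between them is mostly about whether the pattern is reused.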
509511Under the hood, these functions simply create a pattern object for you
@@ -812,7 +814,7 @@ of a group with a quantifier, such as ``*``, ``+``, ``?``, or
812814``ab``. ::
813815
814816 >>> p = re.compile('(ab)*')
815- >>> print(p.match('ababababab').span())
817+ >>> print(p.search('ababababab').span())
816818 (0, 10)
817819
818820Groups indicated with ``'('``, ``')'`` also capture the starting and ending
@@ -825,7 +827,7 @@ argument. Later we'll see how to express groups that don't capture the span
825827of text that they match. ::
826828
827829 >>> p = re.compile('(a)b')
828- >>> m = p.match('ab')
830+ >>> m = p.search('ab')
829831 >>> m.group()
830832 'ab'
831833 >>> m.group(0)
@@ -836,7 +838,7 @@ to determine the number, just count the opening parenthesis characters, going
836838from left to right. ::
837839
838840 >>> p = re.compile('(a(b)c)d')
839- >>> m = p.match('abcd')
841+ >>> m = p.search('abcd')
840842 >>> m.group(0)
841843 'abcd'
842844 >>> m.group(1)
@@ -912,10 +914,10 @@ but aren't interested in retrieving the group's contents. You can make this fact
912914explicit by using a non-capturing group: ``(?:...)``, where you can replace the
913915``...`` with any other regular expression. ::
914916
915- >>> m = re.match("([abc])+", "abc")
917+ >>> m = re.search("([abc])+", "abc")
916918 >>> m.groups()
917919 ('c',)
918- >>> m = re.match("(?:[abc])+", "abc")
920+ >>> m = re.search("(?:[abc])+", "abc")
919921 >>> m.groups()
920922 ()
921923
@@ -949,7 +951,7 @@ given numbers, so you can retrieve information about a group in two ways::
949951Additionally, you can retrieve named groups as a dictionary with
950952:meth:`~re.Match.groupdict`::
951953
952- >>> m = re.match(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe')
954+ >>> m = re.search(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe')
953955 >>> m.groupdict()
954956 {'first': 'Jane', 'last': 'Doe'}
955957
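A small sketch of how ``groupdict()`` treats a named group that did not take part in the match. The optional ``middle`` group is a hypothetical extension of the example above; the ``default`` argument is the one accepted by ``Match.groupdict()``.

```python
import re

# Hypothetical pattern: an optional middle name between first and last.
name_re = re.compile(r"(?P<first>\w+)(?: (?P<middle>\w+))? (?P<last>\w+)$")

m = name_re.search("Jane Doe")

# Groups that did not participate are reported as None by default ...
plain = m.groupdict()
# ... or as the value passed via the default argument.
filled = m.groupdict(default="")

print(plain)
print(filled)
```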
@@ -1274,40 +1276,35 @@ In short, before turning to the :mod:`re` module, consider whether your problem
12741276can be solved with a faster and simpler string method.
12751277
12761278
1277- match() versus search()
1278- -----------------------
1279+ .. _match-versus-search:
12791280
1280- The :func:`~re.match` function only checks if the RE matches at the beginning of the
1281- string while :func:`~re.search` will scan forward through the string for a match.
1282- It's important to keep this distinction in mind. Remember, :func:`!match` will
1283- only report a successful match which will start at 0; if the match wouldn't
1284- start at zero, :func:`!match` will *not* report it. ::
1281+ prefixmatch() (aka match) versus search()
1282+ -----------------------------------------
12851283
1286- >>> print(re.match('super', 'superstition').span())
1284+ :func:`~re.prefixmatch` was added in Python 3.15 as the :ref:`preferred name
1285+ <prefixmatch-vs-match>` for :func:`~re.match`. Before this, it was only known
1286+ as :func:`!match`, and its distinction from :func:`~re.search` was often
1287+ misunderstood.
1288+
1289+ :func:`!prefixmatch` aka :func:`!match` only checks if the RE matches at the
1290+ beginning of the string, while :func:`!search` scans forward through the
1291+ string for a match. ::
1292+
1293+ >>> print(re.prefixmatch('super', 'superstition').span())
12871294 (0, 5)
1288- >>> print(re.match('super', 'insuperable'))
1295+ >>> print(re.prefixmatch('super', 'insuperable'))
12891296 None
12901297
1291- On the other hand, :func:`~re.search` will scan forward through the string,
1298+ On the other hand, :func:`~re.search` scans forward through the string,
12921299reporting the first match it finds. ::
12931300
12941301 >>> print(re.search('super', 'superstition').span())
12951302 (0, 5)
12961303 >>> print(re.search('super', 'insuperable').span())
12971304 (2, 7)
12981305
1299- Sometimes you'll be tempted to keep using :func:`re.match`, and just add ``.*``
1300- to the front of your RE. Resist this temptation and use :func:`re.search`
1301- instead. The regular expression compiler does some analysis of REs in order to
1302- speed up the process of looking for a match. One such analysis figures out what
1303- the first character of a match must be; for example, a pattern starting with
1304- ``Crow`` must match starting with a ``'C'``. The analysis lets the engine
1305- quickly scan through the string looking for the starting character, only trying
1306- the full match if a ``'C'`` is found.
1307-
1308- Adding ``.*`` defeats this optimization, requiring scanning to the end of the
1309- string and then backtracking to find a match for the rest of the RE. Use
1310- :func:`re.search` instead.
1306+ This distinction is important to remember when using the old :func:`~re.match`
1307+ name in code requiring compatibility with older Python versions.
1307+ name in code requiring compatibility with older Python versions.
13111308
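For code that has to run on both old and new Python versions, one possible approach is to pick whichever name is available at import time. This is a sketch under the assumption stated above, that ``re.prefixmatch`` exists from Python 3.15 onward; the helper ``starts_with`` is a hypothetical name used only for this illustration.

```python
import re

# Assumption: re.prefixmatch is available on Python 3.15+; older versions
# offer the same behaviour under the name re.match.
prefixmatch = getattr(re, "prefixmatch", re.match)


def starts_with(pattern, text):
    """Return True if pattern matches at the beginning of text."""
    return prefixmatch(pattern, text) is not None


at_start = starts_with("super", "superstition")
in_middle = starts_with("super", "insuperable")
# search() still finds the occurrence that the anchored call rejects.
found_span = re.search("super", "insuperable").span()

print(at_start, in_middle, found_span)
```

Binding the function once keeps call sites free of version checks, and the explicit name makes the anchoring behaviour visible to readers.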
13121309
13131310Greedy versus non-greedy
@@ -1322,9 +1319,9 @@ doesn't work because of the greedy nature of ``.*``. ::
13221319 >>> s = '<html><head><title>Title</title>'
13231320 >>> len(s)
13241321 32
1325- >>> print(re.match('<.*>', s).span())
1322+ >>> print(re.prefixmatch('<.*>', s).span())
13261323 (0, 32)
1327- >>> print(re.match('<.*>', s).group())
1324+ >>> print(re.prefixmatch('<.*>', s).group())
13281325 <html><head><title>Title</title>
13291326
13301327The RE matches the ``'<'`` in ``'<html>'``, and the ``.*`` consumes the rest of
@@ -1340,7 +1337,7 @@ example, the ``'>'`` is tried immediately after the first ``'<'`` matches, and
13401337when it fails, the engine advances a character at a time, retrying the ``'>'``
13411338at every step. This produces just the right result::
13421339
1343- >>> print(re.match('<.*?>', s).group())
1340+ >>> print(re.prefixmatch('<.*?>', s).group())
13441341 <html>
13451342
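The difference between the greedy and non-greedy quantifiers can be seen side by side in the following sketch, which reuses the string ``s`` from the example above. It uses ``re.search`` and ``re.findall`` so that it runs unchanged on Python versions without ``prefixmatch``.

```python
import re

s = "<html><head><title>Title</title>"

# Greedy: .* runs to the end of the string, then backtracks to the last '>'.
greedy = re.search(r"<.*>", s).group()

# Non-greedy: .*? stops at the first '>' it can reach.
lazy = re.search(r"<.*?>", s).group()

# Applied repeatedly, the non-greedy form picks out each tag in turn.
all_tags = re.findall(r"<.*?>", s)

print(greedy)
print(lazy)
print(all_tags)
```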
13461343(Note that parsing HTML or XML with regular expressions is painful.