@@ -362,20 +362,20 @@ for a complete listing.
 +------------------+-----------------------------------------------+
 | Method/Attribute | Purpose                                       |
 +==================+===============================================+
-| ``match()``      | Determine if the RE matches at the beginning  |
-|                  | of the string.                                |
-+------------------+-----------------------------------------------+
 | ``search()``     | Scan through a string, looking for any        |
 |                  | location where this RE matches.               |
 +------------------+-----------------------------------------------+
+| ``prefixmatch()``| Determine if the RE matches at the beginning  |
+|                  | of the string.                                |
++------------------+-----------------------------------------------+
 | ``findall()``    | Find all substrings where the RE matches, and |
 |                  | returns them as a list.                       |
 +------------------+-----------------------------------------------+
 | ``finditer()``   | Find all substrings where the RE matches, and |
 |                  | returns them as an :term:`iterator`.          |
 +------------------+-----------------------------------------------+
 
-:meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If
+:meth:`~re.Pattern.search` and :meth:`~re.Pattern.prefixmatch` return ``None`` if no match can be found. If
 they're successful, a :ref:`match object <match-objects>` instance is returned,
 containing information about the match: where it starts and ends, the substring
 it matched, and more.
@@ -391,21 +391,21 @@ Python interpreter, import the :mod:`re` module, and compile a RE::
    >>> p
    re.compile('[a-z]+')
 
-Now, you can try matching various strings against the RE ``[a-z]+``. An empty
+Now, you can try searching various strings against the RE ``[a-z]+``. An empty
 string shouldn't match at all, since ``+`` means 'one or more repetitions'.
-:meth:`~re.Pattern.match` should return ``None`` in this case, which will cause the
+:meth:`~re.Pattern.search` should return ``None`` in this case, which will cause the
 interpreter to print no output. You can explicitly print the result of
-:meth:`!match` to make this clear. ::
+:meth:`!search` to make this clear. ::
 
-   >>> p.match("")
-   >>> print(p.match(""))
+   >>> p.search("")
+   >>> print(p.search(""))
    None
 
 Now, let's try it on a string that it should match, such as ``tempo``. In this
-case, :meth:`~re.Pattern.match` will return a :ref:`match object <match-objects>`, so you
+case, :meth:`~re.Pattern.search` will return a :ref:`match object <match-objects>`, so you
 should store the result in a variable for later use. ::
 
-   >>> m = p.match('tempo')
+   >>> m = p.search('tempo')
    >>> m
    <re.Match object; span=(0, 5), match='tempo'>
 
@@ -437,27 +437,28 @@ Trying these methods will soon clarify their meaning::
 
 :meth:`~re.Match.group` returns the substring that was matched by the RE. :meth:`~re.Match.start`
 and :meth:`~re.Match.end` return the starting and ending index of the match. :meth:`~re.Match.span`
-returns both start and end indexes in a single tuple. Since the :meth:`~re.Pattern.match`
-method only checks if the RE matches at the start of a string, :meth:`!start`
-will always be zero. However, the :meth:`~re.Pattern.search` method of patterns
-scans through the string, so the match may not start at zero in that
-case. ::
+returns both start and end indexes in a single tuple.
+The :meth:`~re.Pattern.search` method of patterns
+scans through the string, so the match may not start at zero.
+However, the :meth:`~re.Pattern.prefixmatch`
+method only checks if the RE matches at the start of a string, so :meth:`!start`
+will always be zero in that case. ::
 
-   >>> print(p.match('::: message'))
-   None
    >>> m = p.search('::: message'); print(m)
    <re.Match object; span=(4, 11), match='message'>
    >>> m.group()
    'message'
    >>> m.span()
    (4, 11)
+   >>> print(p.prefixmatch('::: message'))
+   None
 
 In actual programs, the most common style is to store the
 :ref:`match object <match-objects>` in a variable, and then check if it was
 ``None``. This usually looks like::
 
    p = re.compile( ... )
-   m = p.match( 'string goes here' )
+   m = p.search( 'string goes here' )
    if m:
        print('Match found: ', m.group())
    else:
@@ -495,15 +496,15 @@ Module-Level Functions
 ----------------------
 
 You don't have to create a pattern object and call its methods; the
-:mod:`re` module also provides top-level functions called :func:`~re.match`,
-:func:`~re.search`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions
+:mod:`re` module also provides top-level functions called :func:`~re.search`,
+:func:`~re.prefixmatch`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions
 take the same arguments as the corresponding pattern method with
 the RE string added as the first argument, and still return either ``None`` or a
 :ref:`match object <match-objects>` instance. ::
 
-   >>> print(re.match(r'From\s+', 'Fromage amk'))
+   >>> print(re.prefixmatch(r'From\s+', 'Fromage amk'))
    None
-   >>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS
+   >>> re.prefixmatch(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS
    <re.Match object; span=(0, 5), match='From '>
 
 Under the hood, these functions simply create a pattern object for you
509510Under the hood, these functions simply create a pattern object for you
@@ -812,7 +813,7 @@ of a group with a quantifier, such as ``*``, ``+``, ``?``, or
 ``ab``. ::
 
    >>> p = re.compile('(ab)*')
-   >>> print(p.match('ababababab').span())
+   >>> print(p.search('ababababab').span())
    (0, 10)
 
 Groups indicated with ``'('``, ``')'`` also capture the starting and ending
@@ -825,7 +826,7 @@ argument. Later we'll see how to express groups that don't capture the span
 of text that they match. ::
 
    >>> p = re.compile('(a)b')
-   >>> m = p.match('ab')
+   >>> m = p.search('ab')
    >>> m.group()
    'ab'
    >>> m.group(0)
@@ -836,7 +837,7 @@ to determine the number, just count the opening parenthesis characters, going
 from left to right. ::
 
    >>> p = re.compile('(a(b)c)d')
-   >>> m = p.match('abcd')
+   >>> m = p.search('abcd')
    >>> m.group(0)
    'abcd'
    >>> m.group(1)
@@ -912,10 +913,10 @@ but aren't interested in retrieving the group's contents. You can make this fact
 explicit by using a non-capturing group: ``(?:...)``, where you can replace the
 ``...`` with any other regular expression. ::
 
-   >>> m = re.match("([abc])+", "abc")
+   >>> m = re.search("([abc])+", "abc")
    >>> m.groups()
    ('c',)
-   >>> m = re.match("(?:[abc])+", "abc")
+   >>> m = re.search("(?:[abc])+", "abc")
    >>> m.groups()
    ()
 
@@ -949,7 +950,7 @@ given numbers, so you can retrieve information about a group in two ways::
 Additionally, you can retrieve named groups as a dictionary with
 :meth:`~re.Match.groupdict`::
 
-   >>> m = re.match(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe')
+   >>> m = re.search(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe')
    >>> m.groupdict()
    {'first': 'Jane', 'last': 'Doe'}
 
@@ -1274,18 +1275,18 @@ In short, before turning to the :mod:`re` module, consider whether your problem
 can be solved with a faster and simpler string method.
 
 
-match() versus search()
------------------------
+prefixmatch() versus search()
+-----------------------------
 
-The :func:`~re.match` function only checks if the RE matches at the beginning of the
+The :func:`~re.prefixmatch` function only checks if the RE matches at the beginning of the
 string while :func:`~re.search` will scan forward through the string for a match.
-It's important to keep this distinction in mind. Remember, :func:`!match` will
+It's important to keep this distinction in mind. Remember, :func:`!prefixmatch` will
 only report a successful match which will start at 0; if the match wouldn't
-start at zero, :func:`!match` will *not* report it. ::
+start at zero, :func:`!prefixmatch` will *not* report it. ::
 
-   >>> print(re.match('super', 'superstition').span())
+   >>> print(re.prefixmatch('super', 'superstition').span())
    (0, 5)
-   >>> print(re.match('super', 'insuperable'))
+   >>> print(re.prefixmatch('super', 'insuperable'))
    None
 
 On the other hand, :func:`~re.search` will scan forward through the string,
@@ -1296,7 +1297,7 @@ reporting the first match it finds. ::
    >>> print(re.search('super', 'insuperable').span())
    (2, 7)
 
-Sometimes you'll be tempted to keep using :func:`re.match`, and just add ``.*``
+Sometimes you'll be tempted to keep using :func:`re.prefixmatch`, and just add ``.*``
 to the front of your RE. Resist this temptation and use :func:`re.search`
 instead. The regular expression compiler does some analysis of REs in order to
 speed up the process of looking for a match. One such analysis figures out what
@@ -1322,9 +1323,9 @@ doesn't work because of the greedy nature of ``.*``. ::
    >>> s = '<html><head><title>Title</title>'
    >>> len(s)
    32
-   >>> print(re.match('<.*>', s).span())
+   >>> print(re.prefixmatch('<.*>', s).span())
    (0, 32)
-   >>> print(re.match('<.*>', s).group())
+   >>> print(re.prefixmatch('<.*>', s).group())
    <html><head><title>Title</title>
 
 The RE matches the ``'<'`` in ``'<html>'``, and the ``.*`` consumes the rest of
@@ -1340,7 +1341,7 @@ example, the ``'>'`` is tried immediately after the first ``'<'`` matches, and
 when it fails, the engine advances a character at a time, retrying the ``'>'``
 at every step. This produces just the right result::
 
-   >>> print(re.match('<.*?>', s).group())
+   >>> print(re.prefixmatch('<.*?>', s).group())
    <html>
 
 (Note that parsing HTML or XML with regular expressions is painful.
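
The anchored-versus-scanning distinction that the renamed examples rely on can be checked by hand with a minimal sketch like the one below. It assumes that ``re.prefixmatch`` may be missing from the interpreter being used and falls back to ``re.match``, which this diff describes with the same wording ("matches at the beginning of the string"). The local name ``prefixmatch`` is an alias introduced only for this illustration.

```python
import re

# Assumption: prefer re.prefixmatch where the interpreter provides it,
# otherwise fall back to re.match, which anchors at position 0 as well.
prefixmatch = getattr(re, "prefixmatch", re.match)

# Anchored at the start: 'super' is not at index 0, so no match is reported.
anchored = prefixmatch("super", "insuperable")

# search() scans forward and reports the first match it finds.
scanned = re.search("super", "insuperable")

print(anchored)        # None
print(scanned.span())  # (2, 7)

# Greedy versus non-greedy, using the same string as the hunk above.
s = '<html><head><title>Title</title>'
greedy = prefixmatch('<.*>', s).group()   # consumes up to the last '>'
lazy = prefixmatch('<.*?>', s).group()    # stops at the first '>'

print(greedy == s)     # True
print(lazy)            # <html>
```

Examples switched to ``search()`` in this diff give the same results as before because each of those patterns already matches at index 0 of its input.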