2 changes: 1 addition & 1 deletion Doc/howto/logging-cookbook.rst
@@ -3877,7 +3877,7 @@ subclassed handler which looks something like this::
def format(self, record):
version = 1
asctime = dt.datetime.fromtimestamp(record.created).isoformat()
- m = self.tz_offset.match(time.strftime('%z'))
+ m = self.tz_offset.prefixmatch(time.strftime('%z'))
has_offset = False
if m and time.timezone:
hrs, mins = m.groups()
83 changes: 43 additions & 40 deletions Doc/howto/regex.rst
@@ -362,20 +362,20 @@ for a complete listing.
  +------------------+-----------------------------------------------+
  | Method/Attribute | Purpose                                       |
  +==================+===============================================+
- | ``match()``      | Determine if the RE matches at the beginning  |
- |                  | of the string.                                |
- +------------------+-----------------------------------------------+
  | ``search()``     | Scan through a string, looking for any        |
  |                  | location where this RE matches.               |
  +------------------+-----------------------------------------------+
+ | ``prefixmatch()``| Determine if the RE matches at the beginning  |
+ |                  | of the string.                                |
Member

I'd at least leave a little mention of "Previously named match()" with a link to the canonical doc section explaining the renaming and soft deprecation in here.

Member Author

Added.

+ +------------------+-----------------------------------------------+
  | ``findall()``    | Find all substrings where the RE matches, and |
  |                  | returns them as a list.                       |
  +------------------+-----------------------------------------------+
  | ``finditer()``   | Find all substrings where the RE matches, and |
  |                  | returns them as an :term:`iterator`.          |
  +------------------+-----------------------------------------------+

- :meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If
+ :meth:`~re.Pattern.search` and :meth:`~re.Pattern.prefixmatch` return ``None`` if no match can be found. If
they're successful, a :ref:`match object <match-objects>` instance is returned,
containing information about the match: where it starts and ends, the substring
it matched, and more.
@@ -391,21 +391,21 @@ Python interpreter, import the :mod:`re` module, and compile a RE::
>>> p
re.compile('[a-z]+')

- Now, you can try matching various strings against the RE ``[a-z]+``. An empty
+ Now, you can try searching various strings against the RE ``[a-z]+``. An empty
string shouldn't match at all, since ``+`` means 'one or more repetitions'.
- :meth:`~re.Pattern.match` should return ``None`` in this case, which will cause the
+ :meth:`~re.Pattern.search` should return ``None`` in this case, which will cause the
interpreter to print no output. You can explicitly print the result of
- :meth:`!match` to make this clear. ::
+ :meth:`!search` to make this clear. ::

- >>> p.match("")
- >>> print(p.match(""))
+ >>> p.search("")
+ >>> print(p.search(""))
None

Now, let's try it on a string that it should match, such as ``tempo``. In this
- case, :meth:`~re.Pattern.match` will return a :ref:`match object <match-objects>`, so you
+ case, :meth:`~re.Pattern.search` will return a :ref:`match object <match-objects>`, so you
should store the result in a variable for later use. ::

- >>> m = p.match('tempo')
+ >>> m = p.search('tempo')
>>> m
<re.Match object; span=(0, 5), match='tempo'>

@@ -437,27 +437,28 @@ Trying these methods will soon clarify their meaning::

:meth:`~re.Match.group` returns the substring that was matched by the RE. :meth:`~re.Match.start`
and :meth:`~re.Match.end` return the starting and ending index of the match. :meth:`~re.Match.span`
- returns both start and end indexes in a single tuple. Since the :meth:`~re.Pattern.match`
- method only checks if the RE matches at the start of a string, :meth:`!start`
- will always be zero. However, the :meth:`~re.Pattern.search` method of patterns
- scans through the string, so the match may not start at zero in that
- case. ::
+ returns both start and end indexes in a single tuple.
+ The :meth:`~re.Pattern.search` method of patterns
+ scans through the string, so the match may not start at zero.
+ However, the :meth:`~re.Pattern.prefixmatch`
+ method only checks if the RE matches at the start of a string, so :meth:`!start`
+ will always be zero in that case. ::

- >>> print(p.match('::: message'))
- None
>>> m = p.search('::: message'); print(m)
<re.Match object; span=(4, 11), match='message'>
>>> m.group()
'message'
>>> m.span()
(4, 11)
+ >>> print(p.prefixmatch('::: message'))
+ None

In actual programs, the most common style is to store the
:ref:`match object <match-objects>` in a variable, and then check if it was
``None``. This usually looks like::

p = re.compile( ... )
- m = p.match( 'string goes here' )
+ m = p.search( 'string goes here' )
if m:
print('Match found: ', m.group())
else:
@@ -495,15 +496,15 @@ Module-Level Functions
----------------------

You don't have to create a pattern object and call its methods; the
- :mod:`re` module also provides top-level functions called :func:`~re.match`,
- :func:`~re.search`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions
+ :mod:`re` module also provides top-level functions called :func:`~re.search`,
+ :func:`~re.prefixmatch`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions
take the same arguments as the corresponding pattern method with
the RE string added as the first argument, and still return either ``None`` or a
:ref:`match object <match-objects>` instance. ::

- >>> print(re.match(r'From\s+', 'Fromage amk'))
+ >>> print(re.prefixmatch(r'From\s+', 'Fromage amk'))
None
- >>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS
+ >>> re.prefixmatch(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS
<re.Match object; span=(0, 5), match='From '>

Under the hood, these functions simply create a pattern object for you
@@ -812,7 +813,7 @@ of a group with a quantifier, such as ``*``, ``+``, ``?``, or
``ab``. ::

>>> p = re.compile('(ab)*')
- >>> print(p.match('ababababab').span())
+ >>> print(p.search('ababababab').span())
(0, 10)

Groups indicated with ``'('``, ``')'`` also capture the starting and ending
@@ -825,7 +826,7 @@ argument. Later we'll see how to express groups that don't capture the span
of text that they match. ::

>>> p = re.compile('(a)b')
- >>> m = p.match('ab')
+ >>> m = p.search('ab')
>>> m.group()
'ab'
>>> m.group(0)
@@ -836,7 +837,7 @@ to determine the number, just count the opening parenthesis characters, going
from left to right. ::

>>> p = re.compile('(a(b)c)d')
- >>> m = p.match('abcd')
+ >>> m = p.search('abcd')
>>> m.group(0)
'abcd'
>>> m.group(1)
@@ -912,10 +913,10 @@ but aren't interested in retrieving the group's contents. You can make this fact
explicit by using a non-capturing group: ``(?:...)``, where you can replace the
``...`` with any other regular expression. ::

- >>> m = re.match("([abc])+", "abc")
+ >>> m = re.search("([abc])+", "abc")
>>> m.groups()
('c',)
- >>> m = re.match("(?:[abc])+", "abc")
+ >>> m = re.search("(?:[abc])+", "abc")
>>> m.groups()
()

@@ -949,7 +950,7 @@ given numbers, so you can retrieve information about a group in two ways::
Additionally, you can retrieve named groups as a dictionary with
:meth:`~re.Match.groupdict`::

- >>> m = re.match(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe')
+ >>> m = re.search(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe')
>>> m.groupdict()
{'first': 'Jane', 'last': 'Doe'}

@@ -1274,18 +1275,20 @@ In short, before turning to the :mod:`re` module, consider whether your problem
can be solved with a faster and simpler string method.


- match() versus search()
- -----------------------
+ .. _match-versus-search:
+
+ prefixmatch() versus search()
+ -----------------------------

- The :func:`~re.match` function only checks if the RE matches at the beginning of the
+ The :func:`~re.prefixmatch` function only checks if the RE matches at the beginning of the
string while :func:`~re.search` will scan forward through the string for a match.
- It's important to keep this distinction in mind. Remember, :func:`!match` will
+ It's important to keep this distinction in mind. Remember, :func:`!prefixmatch` will
only report a successful match which will start at 0; if the match wouldn't
- start at zero, :func:`!match` will *not* report it. ::
+ start at zero, :func:`!prefixmatch` will *not* report it. ::

- >>> print(re.match('super', 'superstition').span())
+ >>> print(re.prefixmatch('super', 'superstition').span())
(0, 5)
- >>> print(re.match('super', 'insuperable'))
+ >>> print(re.prefixmatch('super', 'insuperable'))
None

On the other hand, :func:`~re.search` will scan forward through the string,
@@ -1296,7 +1299,7 @@ reporting the first match it finds. ::
>>> print(re.search('super', 'insuperable').span())
(2, 7)

- Sometimes you'll be tempted to keep using :func:`re.match`, and just add ``.*``
+ Sometimes you'll be tempted to keep using :func:`re.prefixmatch`, and just add ``.*``
Member

is this temptation true any longer?

Member Author

Hopefully the new name makes things clearer. I've removed this bit.

to the front of your RE. Resist this temptation and use :func:`re.search`
instead. The regular expression compiler does some analysis of REs in order to
speed up the process of looking for a match. One such analysis figures out what
@@ -1322,9 +1325,9 @@ doesn't work because of the greedy nature of ``.*``. ::
>>> s = '<html><head><title>Title</title>'
>>> len(s)
32
- >>> print(re.match('<.*>', s).span())
+ >>> print(re.prefixmatch('<.*>', s).span())
(0, 32)
- >>> print(re.match('<.*>', s).group())
+ >>> print(re.prefixmatch('<.*>', s).group())
<html><head><title>Title</title>

The RE matches the ``'<'`` in ``'<html>'``, and the ``.*`` consumes the rest of
@@ -1340,7 +1343,7 @@ example, the ``'>'`` is tried immediately after the first ``'<'`` matches, and
when it fails, the engine advances a character at a time, retrying the ``'>'``
at every step. This produces just the right result::

- >>> print(re.prefixmatch('<.*?>', s).group())
+ >>> print(re.prefixmatch('<.*?>', s).group())
<html>

(Note that parsing HTML or XML with regular expressions is painful.
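
Reader's note on the regex.rst hunks above: the following is a minimal sketch of the behaviour the renamed examples document, not part of the patch. It assumes that ``prefixmatch`` behaves like the function previously named ``match`` (as the review discussion of the rename suggests), and it falls back to ``re.match`` on interpreters that do not provide the new name. The local name ``prefixmatch`` and the result variables are illustrative only.

```python
import re

# Assumption: re.prefixmatch is the new name documented by this PR. When the
# running interpreter lacks it, fall back to re.match, its previous name.
prefixmatch = getattr(re, "prefixmatch", re.match)

# Anchored at position 0: succeeds only if the pattern matches at the start.
at_start = prefixmatch("super", "superstition")
not_at_start = prefixmatch("super", "insuperable")
print(at_start.span())   # (0, 5)
print(not_at_start)      # None

# search() scans forward and reports the first match it finds.
found = re.search("super", "insuperable")
print(found.span())      # (2, 7)

# Greedy ``.*`` runs to the last '>'; non-greedy ``.*?`` stops at the first.
s = "<html><head><title>Title</title>"
greedy = prefixmatch("<.*>", s).group()
lazy = prefixmatch("<.*?>", s).group()
print(greedy)            # <html><head><title>Title</title>
print(lazy)              # <html>
```

The expected outputs in the comments are the ones shown in the documentation hunks themselves.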
5 changes: 3 additions & 2 deletions Doc/library/fnmatch.rst
@@ -103,7 +103,8 @@ functions: :func:`fnmatch`, :func:`fnmatchcase`, :func:`.filter`, :func:`.filter
.. function:: translate(pat)

Return the shell-style pattern *pat* converted to a regular expression for
- using with :func:`re.match`. The pattern is expected to be a :class:`str`.
+ using with :func:`re.prefixmatch`. The pattern is expected to be a
+ :class:`str`.

Example:

@@ -113,7 +114,7 @@ functions: :func:`fnmatch`, :func:`fnmatchcase`, :func:`.filter`, :func:`.filter
>>> regex
'(?s:.*\\.txt)\\z'
>>> reobj = re.compile(regex)
- >>> reobj.match('foobar.txt')
+ >>> reobj.prefixmatch('foobar.txt')
<re.Match object; span=(0, 10), match='foobar.txt'>
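
Reader's note on the fnmatch.rst hunk: a small sketch, under the same assumption as before, of why an anchored-at-start match is enough for translated patterns. The regular expression produced by ``fnmatch.translate`` already ends with an end-of-string anchor, so matching from position 0 effectively tests the whole name. The ``anchored`` helper is hypothetical and only picks whichever method name the running interpreter provides.

```python
import fnmatch
import re

# Translate a shell-style pattern; the result ends with an end-of-string
# anchor (its exact spelling varies between Python versions).
regex = fnmatch.translate("*.txt")
reobj = re.compile(regex)

# Assumption: Pattern.prefixmatch is the new name for Pattern.match; use
# whichever one exists on this interpreter.
anchored = getattr(reobj, "prefixmatch", reobj.match)

hit = anchored("foobar.txt")
miss = anchored("foobar.txt.bak")
print(hit.span())   # (0, 10)
print(miss)         # None
```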


5 changes: 3 additions & 2 deletions Doc/library/glob.rst
@@ -130,7 +130,8 @@ The :mod:`!glob` module defines the following functions:
.. function:: translate(pathname, *, recursive=False, include_hidden=False, seps=None)

Convert the given path specification to a regular expression for use with
- :func:`re.match`. The path specification can contain shell-style wildcards.
+ :func:`re.prefixmatch`. The path specification can contain shell-style
+ wildcards.

For example:

@@ -140,7 +141,7 @@ The :mod:`!glob` module defines the following functions:
>>> regex
'(?s:(?:.+/)?[^/]*\\.txt)\\z'
>>> reobj = re.compile(regex)
- >>> reobj.match('foo/bar/baz.txt')
+ >>> reobj.prefixmatch('foo/bar/baz.txt')
<re.Match object; span=(0, 15), match='foo/bar/baz.txt'>

Path separators and segments are meaningful to this function, unlike
2 changes: 1 addition & 1 deletion Doc/library/typing.rst
@@ -3797,7 +3797,7 @@ Aliases to other concrete types
Match

Deprecated aliases corresponding to the return types from
- :func:`re.compile` and :func:`re.match`.
+ :func:`re.compile` and :func:`re.search`.

These types (and the corresponding functions) are generic over
:data:`AnyStr`. ``Pattern`` can be specialised as ``Pattern[str]`` or