Lines Matching +full:- +full:match

2 :mod:`re` --- Regular expression operations
13 Unicode strings as well as 8-bit strings.
18 character for the same purpose in string literals; for example, to match
26 prefixed with ``'r'``. So ``r"\n"`` is a two-character string containing
27 ``'\'`` and ``'n'``, while ``"\n"`` is a one-character string containing a
32 module-level functions and :class:`RegexObject` methods. The functions are
34 fine-tuning parameters.
38 The third-party `regex <https://pypi.org/project/regex/>`_ module,
43 .. _re-syntax:
46 -------------------------
56 string *pq* will match AB. This holds unless *A* or *B* contain low precedence
64 information and a gentler presentation, consult the :ref:`regex-howto`.
68 expressions; they simply match themselves. You can concatenate ordinary
80 directly nested. This avoids ambiguity with the non-greedy modifier suffix
107 Causes the resulting RE to match 0 or more repetitions of the preceding RE, as
108 many repetitions as are possible. ``ab*`` will match 'a', 'ab', or 'a' followed
112 Causes the resulting RE to match 1 or more repetitions of the preceding RE.
113 ``ab+`` will match 'a' followed by any non-zero number of 'b's; it will not
114 match just 'a'.
117 Causes the resulting RE to match 0 or 1 repetitions of the preceding RE.
118 ``ab?`` will match either 'a' or 'ab'.
121 The ``'*'``, ``'+'``, and ``'?'`` qualifiers are all :dfn:`greedy`; they match
123 ``<.*>`` is matched against ``<a> b <c>``, it will match the entire
125 perform the match in :dfn:`non-greedy` or :dfn:`minimal` fashion; as *few*
126 characters as possible will be matched. Using the RE ``<.*?>`` will match
131 matches cause the entire RE not to match. For example, ``a{6}`` will match
135 Causes the resulting RE to match from *m* to *n* repetitions of the preceding
136 RE, attempting to match as many repetitions as possible. For example,
137 ``a{3,5}`` will match from 3 to 5 ``'a'`` characters. Omitting *m* specifies a
139 example, ``a{4,}b`` will match ``aaaab`` or a thousand ``'a'`` characters
144 Causes the resulting RE to match from *m* to *n* repetitions of the preceding
145 RE, attempting to match as *few* repetitions as possible. This is the
146 non-greedy version of the previous qualifier. For example, on the
147 6-character string ``'aaaaaa'``, ``a{3,5}`` will match 5 ``'a'`` characters,
148 while ``a{3,5}?`` will only match 3 characters.
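The greedy and non-greedy behaviour described in these fragments can be checked directly. The following is a minimal sketch using only the patterns and sample strings quoted above; the variable names are illustrative and not part of the documentation.

```python
import re

# Greedy vs. non-greedy '*': the sample string from the text above.
tagged = "<a> b <c>"
greedy_tag = re.match(r"<.*>", tagged).group(0)   # '<a> b <c>'
lazy_tag = re.match(r"<.*?>", tagged).group(0)    # '<a>'

# Greedy vs. non-greedy {m,n} on the 6-character string 'aaaaaa'.
six_a = "aaaaaa"
greedy_count = len(re.match(r"a{3,5}", six_a).group(0))   # 5
lazy_count = len(re.match(r"a{3,5}?", six_a).group(0))    # 3
```

Appending ``?`` changes only how much each qualifier consumes; whether the overall pattern can match is unaffected.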
151 Either escapes special characters (permitting you to match characters like
166 * Characters can be listed individually, e.g. ``[amk]`` will match ``'a'``,
170 them by a ``'-'``, for example ``[a-z]`` will match any lowercase ASCII letter,
171 ``[0-5][0-9]`` will match all the two-digit numbers from ``00`` to ``59``, and
172 ``[0-9A-Fa-f]`` will match any hexadecimal digit. If ``-`` is escaped (e.g.
173 ``[a\-z]``) or if it's placed as the first or last character (e.g. ``[a-]``),
174 it will match a literal ``'-'``.
177 ``[(+*)]`` will match any of the literal characters ``'('``, ``'+'``,
181 inside a set, although the characters they match depend on whether
186 that are *not* in the set will be matched. For example, ``[^5]`` will match
187 any character except ``'5'``, and ``[^^]`` will match any character except
191 * To match a literal ``']'`` inside a set, precede it with a backslash, or
193 ``[]()[{}]`` will both match a parenthesis.
197 will match either A or B. An arbitrary number of REs can be separated by the
202 produce a longer overall match. In other words, the ``'|'`` operator is never
203 greedy. To match a literal ``'|'``, use ``\|``, or enclose it inside a
208 start and end of a group; the contents of a group can be retrieved after a match
210 special sequence, described below. To match the literals ``'('`` or ``')'``,
224 :const:`re.L` (locale dependent), :const:`re.M` (multi-line),
227 flags are described in :ref:`contents-of-module-re`.) This
234 If there are non-whitespace characters before the flag, the results are
238 A non-capturing version of regular parentheses. Matches whatever regular
240 *cannot* be retrieved after performing a match or referenced later in the
254 +---------------------------------------+----------------------------------+
259 +---------------------------------------+----------------------------------+
260 | when processing match object ``m`` | * ``m.group('quote')`` |
262 +---------------------------------------+----------------------------------+
266 +---------------------------------------+----------------------------------+
277 called a lookahead assertion. For example, ``Isaac (?=Asimov)`` will match
281 Matches if ``...`` doesn't match next. This is a negative lookahead assertion.
282 For example, ``Isaac (?!Asimov)`` will match ``'Isaac '`` only if it's *not*
286 Matches if the current position in the string is preceded by a match for ``...``
288 assertion`. ``(?<=abc)def`` will find a match in ``abcdef``, since the
290 The contained pattern must only match strings of some fixed length, meaning that
292 references are not supported even if they match strings of some fixed length.
294 patterns which start with positive lookbehind assertions will not match at the
296 :func:`search` function rather than the :func:`match` function:
305 >>> m = re.search('(?<=-)\w+', 'spam-egg')
310 Matches if the current position in the string is not preceded by a match for
312 positive lookbehind assertions, the contained pattern must only match strings of
315 match at the beginning of the string being searched.
317 ``(?(id/name)yes-pattern|no-pattern)``
318 Will try to match with ``yes-pattern`` if the group with given *id* or *name*
319 exists, and with ``no-pattern`` if it doesn't. ``no-pattern`` is optional and
321 matching pattern, which will match with ``'<user@host.com>'`` as well as
327 If the ordinary character is not on the list, then the resulting RE will match
334 can only be used to match one of the first 99 groups. If the first digit of
336 a group match, but as the character with octal value *number*. Inside the
346 word is indicated by whitespace or a non-alphanumeric, non-underscore character.
365 is equivalent to the set ``[0-9]``. With :const:`UNICODE`, it will match
370 When the :const:`UNICODE` flag is not specified, matches any non-digit
371 character; this is equivalent to the set ``[^0-9]``. With :const:`UNICODE`, it
372 will match anything other than characters marked as digits in the Unicode
379 If :const:`UNICODE` is set, this will match the characters ``[ \t\n\r\f\v]``
384 When the :const:`UNICODE` flag is not specified, matches any non-whitespace
386 :const:`LOCALE` flag has no extra effect on non-whitespace match. If
394 ``[a-zA-Z0-9_]``. With :const:`LOCALE`, it will match the set ``[0-9_]`` plus
396 :const:`UNICODE` is set, this will match the characters ``[0-9_]`` plus whatever
401 any non-alphanumeric character; this is equivalent to the set ``[^a-zA-Z0-9_]``.
402 With :const:`LOCALE`, it will match any character not in the set ``[0-9_]``, and
404 this will match anything other than ``[0-9_]`` plus characters classified as
438 .. _contents-of-module-re:
441 ---------------
445 regular expressions. Most non-trivial applications always use the compiled
452 can be used for matching using its :func:`~RegexObject.match` and
462 result = prog.match(string)
466 result = re.match(pattern, string)
475 :func:`re.match`, :func:`re.search` or :func:`re.compile` are cached, so
488 Perform case-insensitive matching; expressions like ``[A-Z]`` will match
490 get this effect on non-ASCII Unicode characters such as ``ü`` and ``Ü``,
515 Make the ``'.'`` special character match any character at all, including a
516 newline; without this flag, ``'.'`` will match anything *except* a newline.
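As a small illustration of the ``'.'`` behaviour described above, the sketch below matches the same pattern with and without :const:`DOTALL`. The sample text and variable names are made up for this example.

```python
import re

text = "first\nsecond"

# Without DOTALL, '.' stops at the newline.
default_match = re.match(r".*", text).group(0)            # 'first'

# With DOTALL, '.' also matches the newline, so '.*' consumes everything.
dotall_match = re.match(r".*", text, re.DOTALL).group(0)  # 'first\nsecond'
```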
524 enables non-ASCII matching for :const:`IGNORECASE`.
541 This means that the two following regular expression objects that match a
553 *pattern* produces a match, and return a corresponding :class:`MatchObject`
555 that this is different from finding a zero-length match at some point in the
559 .. function:: match(pattern, string, flags=0)
561 If zero or more characters at the beginning of *string* match the regular
563 Return ``None`` if the string does not match the pattern; note that this is
564 different from a zero-length match.
566 Note that even in :const:`MULTILINE` mode, :func:`re.match` will only match
569 If you want to locate a match anywhere in *string*, use :func:`search`
570 instead (see also :ref:`search-vs-match`).
588 >>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
602 Note that *split* will never split a string on an empty pattern match.
617 Return all non-overlapping matches of *pattern* in *string*, as a list of
618 strings. The *string* is scanned left-to-right, and matches are returned in
626 following an empty match is not included in a next match, so
639 non-overlapping matches for the RE *pattern* in *string*. The *string* is
640 scanned left-to-right, and matches are returned in the order found. Empty
651 Return the string obtained by replacing the leftmost non-overlapping occurrences
660 >>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
665 If *repl* is a function, it is called for every non-overlapping occurrence of
666 *pattern*. The function takes a single match object argument, and returns the
670 ... if matchobj.group(0) == '-': return ' '
671 ... else: return '-'
672 >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
673 'pro--gram files'
680 replaced; *count* must be a non-negative integer. If omitted or zero, all
682 when not adjacent to a previous match, so ``sub('x*', '-', 'abc')`` returns
683 ``'-a-b-c-'``.
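The empty-match rule and the *count* argument can be seen side by side in a short sketch. It reuses the ``'x*'`` example and the ``'pro----gram-files'`` string from the fragments above; the result names are illustrative only.

```python
import re

# Empty matches are replaced when not adjacent to a previous match,
# so a dash is inserted at every position in 'abc'.
empty_result = re.sub('x*', '-', 'abc')                               # '-a-b-c-'

# count=1 limits the substitution to the leftmost occurrence; the greedy
# '-{1,2}' consumes the first two dashes and leaves the rest untouched.
first_only = re.sub('-{1,2}', ' ', 'pro----gram-files', count=1)      # 'pro --gram-files'
```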
685 In string-type *repl* arguments, in addition to the character escapes and
711 This is useful if you want to match an arbitrary literal string that may
717 >>> legal_chars = string.ascii_lowercase + string.digits + "!#$%&'*+-.^_`|~:"
719 [abcdefghijklmnopqrstuvwxyz0123456789\!\#\$\%\&\'\*\+\-\.\^\_\`\|\~\:]+
721 >>> operators = ['+', '-', '*', '/', '**']
723 \/|\-|\+|\*\*|\*
736 error if a string contains no match for a pattern.
739 .. _re-objects:
742 --------------------------
751 produces a match, and return a corresponding :class:`MatchObject` instance.
753 is different from finding a zero-length match at some point in the string.
763 from *pos* to ``endpos - 1`` will be searched for a match. If *endpos* is less
764 than *pos*, no match will be found; otherwise, if *rx* is a compiled regular
769 >>> pattern.search("dog") # Match at index 0
771 >>> pattern.search("dog", 1) # No match; search doesn't include the "d"
774 .. method:: RegexObject.match(string[, pos[, endpos]])
776 If zero or more characters at the *beginning* of *string* match this regular
778 ``None`` if the string does not match the pattern; note that this is different
779 from a zero-length match.
785 >>> pattern.match("dog") # No match as "o" is not at the start of "dog".
786 >>> pattern.match("dog", 1) # Match as "o" is the 2nd character of "dog".
789 If you want to locate a match anywhere in *string*, use
790 :meth:`~RegexObject.search` instead (see also :ref:`search-vs-match`).
802 region like for :meth:`match`.
809 region like for :meth:`match`.
845 .. _match-objects:
847 Match Objects
848 -------------
852 Match objects always have a boolean value of ``True``.
853 Since :meth:`~RegexObject.match` and :meth:`~RegexObject.search` return ``None``
854 when there is no match, you can test whether there was a match with a simple
857 match = re.search(pattern, string)
858 if match:
859 process(match)
861 Match objects support the following methods and attributes:
875 Returns one or more subgroups of the match. If there is a single argument, the
878 (the whole match is returned). If a *groupN* argument is zero, the corresponding
883 part of the pattern that did not match, the corresponding result is ``None``.
885 the last match is returned.
887 >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
888 >>> m.group(0) # The entire match
904 >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
917 If a group matches multiple times, only the last match is accessible:
919 >>> m = re.match(r"(..)+", "a1b2c3") # Matches 3 times.
920 >>> m.group(1) # Returns only the last match.
926 Return a tuple containing all the subgroups of the match, from 1 up to however
928 did not participate in the match; it defaults to ``None``. (Incompatibility
935 >>> m = re.match(r"(\d+)\.(\d+)", "24.1632")
940 might participate in the match. These groups will default to ``None`` unless
943 >>> m = re.match(r"(\d+)\.?(\d+)?", "24")
952 Return a dictionary containing all the *named* subgroups of the match, keyed by
954 participate in the match; it defaults to ``None``. For example:
956 >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
965 *group* defaults to zero (meaning the whole matched substring). Return ``-1`` if
966 *group* exists but did not contribute to the match. For a match object *m*, and
967 a group *g* that did contribute to the match, the substring matched by group *g*
987 For :class:`MatchObject` *m*, return the 2-tuple ``(m.start(group),
988 m.end(group))``. Note that if *group* did not contribute to the match, this is
989 ``(-1, -1)``. *group* defaults to zero, the entire match.
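To make the ``(-1, -1)`` case concrete, here is a minimal sketch with an alternation in which only one group can take part in the match. The pattern and input are chosen for illustration and do not come from the documentation.

```python
import re

# Only one side of the alternation can participate in any given match.
m = re.match(r"(a)|(b)", "b")

whole_span = m.span()      # (0, 1): group 0, the entire match
unused_span = m.span(1)    # (-1, -1): group 1 exists but did not contribute
used_span = m.span(2)      # (0, 1): group 2 matched 'b'
unused_text = m.group(1)   # None for a group that did not participate
```

Checking ``m.start(g) != -1`` (or ``m.group(g) is not None``) before slicing with the returned indices avoids silently slicing from the end of the string.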
995 :meth:`~RegexObject.match` method of the :class:`RegexObject`. This is the
996 index into the string at which the RE engine started looking for a match.
1002 :meth:`~RegexObject.match` method of the :class:`RegexObject`. This is the
1023 The regular expression object whose :meth:`~RegexObject.match` or
1030 The string passed to :meth:`~RegexObject.match` or
1035 --------
1041 In this example, we'll use the following helper function to display match
1046 def displaymatch(match):
1047 if match is None:
1049 return '<Match: %r, groups=%r>' % (match.group(), match.groups())
1052 a 5-character string with each character representing a card, "a" for ace, "k"
1058 >>> valid = re.compile(r"^[a2-9tjqk]{5}$")
1059 >>> displaymatch(valid.match("akt5q")) # Valid.
1060 "<Match: 'akt5q', groups=()>"
1061 >>> displaymatch(valid.match("akt5e")) # Invalid.
1062 >>> displaymatch(valid.match("akt")) # Invalid.
1063 >>> displaymatch(valid.match("727ak")) # Valid.
1064 "<Match: '727ak', groups=()>"
1067 To match this with a regular expression, one could use backreferences as such:
1070 >>> displaymatch(pair.match("717ak")) # Pair of 7s.
1071 "<Match: '717', groups=('7',)>"
1072 >>> displaymatch(pair.match("718ak")) # No pairs.
1073 >>> displaymatch(pair.match("354aa")) # Pair of aces.
1074 "<Match: '354aa', groups=('a',)>"
1082 >>> pair.match("717ak").group(1)
1085 # Error because re.match() returns None, which doesn't have a group() method:
1086 >>> pair.match("718ak").group(1)
1089 re.match(r".*(.).*\1", "718ak").group(1)
1092 >>> pair.match("354aa").group(1)
1103 :c:func:`scanf` format strings. The table below offers some more-or-less
1107 +--------------------------------+---------------------------------------------+
1111 +--------------------------------+---------------------------------------------+
1113 +--------------------------------+---------------------------------------------+
1114 | ``%d`` | ``[-+]?\d+`` |
1115 +--------------------------------+---------------------------------------------+
1116 | ``%e``, ``%E``, ``%f``, ``%g`` | ``[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?`` |
1117 +--------------------------------+---------------------------------------------+
1118 | ``%i`` | ``[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)`` |
1119 +--------------------------------+---------------------------------------------+
1120 | ``%o`` | ``[-+]?[0-7]+`` |
1121 +--------------------------------+---------------------------------------------+
1123 +--------------------------------+---------------------------------------------+
1125 +--------------------------------+---------------------------------------------+
1126 | ``%x``, ``%X`` | ``[-+]?(0[xX])?[\dA-Fa-f]+`` |
1127 +--------------------------------+---------------------------------------------+
1131 /usr/sbin/sendmail - 0 errors, 4 warnings
1135 %s - %d errors, %d warnings
1139 (\S+) - (\d+) errors, (\d+) warnings
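The equivalent regular expression can be applied to the sample line shown above. This is a minimal sketch: the variable names are illustrative, and the conversion to ``int`` is an extra step that :c:func:`scanf` performs implicitly for ``%d``.

```python
import re

line = "/usr/sbin/sendmail - 0 errors, 4 warnings"

# Regex counterpart of the scanf format '%s - %d errors, %d warnings'.
m = re.match(r"(\S+) - (\d+) errors, (\d+) warnings", line)

program = m.group(1)         # '/usr/sbin/sendmail'
errors = int(m.group(2))     # 0
warnings = int(m.group(3))   # 4
```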
1142 .. _search-vs-match:
1144 search() vs. match()
1150 :func:`re.match` checks for a match only at the beginning of the string, while
1151 :func:`re.search` checks for a match anywhere in the string (this is what Perl
1156 >>> re.match("c", "abcdef") # No match
1157 >>> re.search("c", "abcdef") # Match
1161 restrict the match at the beginning of the string::
1163 >>> re.match("c", "abcdef") # No match
1164 >>> re.search("^c", "abcdef") # No match
1165 >>> re.search("^a", "abcdef") # Match
1168 Note however that in :const:`MULTILINE` mode :func:`match` only matches at the
1170 beginning with ``'^'`` will match at the beginning of each line.
1172 >>> re.match('X', 'A\nB\nX', re.MULTILINE) # No match
1173 >>> re.search('^X', 'A\nB\nX', re.MULTILINE) # Match
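The difference can also be checked programmatically. The sketch below reuses the ``'A\nB\nX'`` string from the example above; the variable names are illustrative.

```python
import re

text = 'A\nB\nX'

# match() is anchored at the start of the string, even in MULTILINE mode.
anchored = re.match('X', text, re.MULTILINE)               # None

# search() with '^' and MULTILINE may match at the start of any line.
multiline_hit = re.search('^X', text, re.MULTILINE)
hit_position = multiline_hit.start()                       # 4

# Without MULTILINE, '^' only matches at the very start of the string.
plain_hit = re.search('^X', text)                          # None
```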
1186 triple-quoted string syntax:
1279 ... print '%02d-%02d: %s' % (m.start(), m.end(), m.group(0))
1280 07-16: carefully
1281 40-47: quickly
1292 >>> re.match(r"\W(.)\1\W", " ff ")
1294 >>> re.match("\\W(.)\\1\\W", " ff ")
1297 When one wants to match a literal backslash, it must be escaped in the regular
1302 >>> re.match(r"\\", r"\\")
1304 >>> re.match("\\\\", r"\\")
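A short sketch of why the raw and non-raw spellings above are interchangeable: both produce the same pattern string, and that pattern matches a single literal backslash. The variable names are illustrative.

```python
import re

# The raw literal and the doubly escaped literal are the same two-character string.
same_pattern = (r"\\" == "\\\\")                  # True

# Raw strings keep the backslash: r"\n" is two characters, "\n" is one.
raw_len, plain_len = len(r"\n"), len("\n")        # 2, 1

# The two-character pattern matches exactly one literal backslash.
matched = re.match(r"\\", "\\").group(0)          # a single backslash
matched_len = len(matched)                        # 1
```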