:mod:`string` --- Common string operations
==========================================

.. module:: string
   :synopsis: Common string operations.


.. index:: module: re

**Source code:** :source:`Lib/string.py`

--------------

The :mod:`string` module contains a number of useful constants and
classes, as well as some deprecated legacy functions that are also
available as methods on strings. In addition, Python's built-in string
classes support the sequence type methods described in the
:ref:`typesseq` section, and also the string-specific methods described
in the :ref:`string-methods` section. To output formatted strings use
template strings or the ``%`` operator described in the
:ref:`string-formatting` section. Also, see the :mod:`re` module for
string functions based on regular expressions.

String constants
----------------

The constants defined in this module are:


.. data:: ascii_letters

   The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
   constants described below.  This value is not locale-dependent.


.. data:: ascii_lowercase

   The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``.  This value is not
   locale-dependent and will not change.


.. data:: ascii_uppercase

   The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.  This value is not
   locale-dependent and will not change.


.. data:: digits

   The string ``'0123456789'``.


.. data:: hexdigits

   The string ``'0123456789abcdefABCDEF'``.


.. data:: letters

   The concatenation of the strings :const:`lowercase` and :const:`uppercase`
   described below.  The specific value is locale-dependent, and will be updated
   when :func:`locale.setlocale` is called.


.. data:: lowercase

   A string containing all the characters that are considered lowercase letters.
   On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``.  The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: octdigits

   The string ``'01234567'``.


.. data:: punctuation

   String of ASCII characters which are considered punctuation characters in the
   ``C`` locale.


.. data:: printable

   String of characters which are considered printable.  This is a combination of
   :const:`digits`, :const:`letters`, :const:`punctuation`, and
   :const:`whitespace`.


.. data:: uppercase

   A string containing all the characters that are considered uppercase letters.
   On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.  The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: whitespace

   A string containing all characters that are considered whitespace. On most
   systems this includes the characters space, tab, linefeed, return, formfeed, and
   vertical tab.
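As a small illustration of how the constants are typically used, the sketch below checks membership against :const:`hexdigits` and confirms how :const:`ascii_letters` is built. The helper name ``is_hex_literal`` is made up for this example; only the locale-independent constants are used.

```python
import string


def is_hex_literal(text):
    """Return True if text is non-empty and contains only hex digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)


# ascii_letters is the lowercase constant followed by the uppercase one.
letters_match = (
    string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase
)
```

Because the constants are plain strings, ordinary ``in`` tests and concatenation are all that is needed.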


.. _new-string-formatting:

Custom String Formatting
------------------------

.. versionadded:: 2.6

The built-in str and unicode classes provide the ability
to do complex variable substitutions and value formatting via the
:meth:`str.format` method described in :pep:`3101`.  The :class:`Formatter`
class in the :mod:`string` module allows you to create and customize your own
string formatting behaviors using the same implementation as the built-in
:meth:`~str.format` method.

.. class:: Formatter

   The :class:`Formatter` class has the following public methods:

   .. method:: format(format_string, *args, **kwargs)

      The primary API method.  It takes a format string and
      an arbitrary set of positional and keyword arguments.
      It is just a wrapper that calls :meth:`vformat`.

   .. method:: vformat(format_string, args, kwargs)

      This function does the actual work of formatting.  It is exposed as a
      separate function for cases where you want to pass in a predefined
      dictionary of arguments, rather than unpacking and repacking the
      dictionary as individual arguments using the ``*args`` and ``**kwargs``
      syntax.  :meth:`vformat` does the work of breaking up the format string
      into character data and replacement fields.  It calls the various
      methods described below.

   In addition, the :class:`Formatter` defines a number of methods that are
   intended to be replaced by subclasses:

   .. method:: parse(format_string)

      Loop over the format_string and return an iterable of tuples
      (*literal_text*, *field_name*, *format_spec*, *conversion*).  This is used
      by :meth:`vformat` to break the string into either literal text, or
      replacement fields.

      The values in the tuple conceptually represent a span of literal text
      followed by a single replacement field.  If there is no literal text
      (which can happen if two replacement fields occur consecutively), then
      *literal_text* will be a zero-length string.  If there is no replacement
      field, then the values of *field_name*, *format_spec* and *conversion*
      will be ``None``.

   .. method:: get_field(field_name, args, kwargs)

      Given *field_name* as returned by :meth:`parse` (see above), convert it to
      an object to be formatted.  Returns a tuple (obj, used_key).  The default
      version takes strings of the form defined in :pep:`3101`, such as
      "0[name]" or "label.title".  *args* and *kwargs* are as passed in to
      :meth:`vformat`.  The return value *used_key* has the same meaning as the
      *key* parameter to :meth:`get_value`.

   .. method:: get_value(key, args, kwargs)

      Retrieve a given field value.  The *key* argument will be either an
      integer or a string.  If it is an integer, it represents the index of the
      positional argument in *args*; if it is a string, then it represents a
      named argument in *kwargs*.

      The *args* parameter is set to the list of positional arguments to
      :meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
      keyword arguments.

      For compound field names, these functions are only called for the first
      component of the field name; subsequent components are handled through
      normal attribute and indexing operations.

      So for example, the field expression '0.name' would cause
      :meth:`get_value` to be called with a *key* argument of 0.  The ``name``
      attribute will be looked up after :meth:`get_value` returns by calling the
      built-in :func:`getattr` function.

      If the index or keyword refers to an item that does not exist, then an
      :exc:`IndexError` or :exc:`KeyError` should be raised.

   .. method:: check_unused_args(used_args, args, kwargs)

      Implement checking for unused arguments if desired.  The arguments to this
      function are the set of all argument keys that were actually referred to in
      the format string (integers for positional arguments, and strings for
      named arguments), and a reference to the *args* and *kwargs* that were
      passed to :meth:`vformat`.  The set of unused args can be calculated from
      these parameters.  :meth:`check_unused_args` is assumed to raise an
      exception if the check fails.

   .. method:: format_field(value, format_spec)

      :meth:`format_field` simply calls the global :func:`format` built-in.  The
      method is provided so that subclasses can override it.

   .. method:: convert_field(value, conversion)

      Converts the value (returned by :meth:`get_field`) given a conversion type
      (as in the tuple returned by the :meth:`parse` method).  The default
      version understands 's' (str) and 'r' (repr) conversion types.
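The overridable methods can be combined in a subclass. The following is a minimal sketch, not a recommended design: the class name ``StrictFormatter`` and the ``'<missing>'`` marker are invented for illustration. It overrides :meth:`get_value` to supply a default for absent keyword arguments and :meth:`check_unused_args` to reject arguments the format string never uses.

```python
import string


class StrictFormatter(string.Formatter):
    """Hypothetical subclass used only for this illustration."""

    def get_value(self, key, args, kwargs):
        # String keys name keyword arguments; use a marker if one is absent.
        if isinstance(key, str):
            return kwargs.get(key, '<missing>')
        # Integer keys index into the positional arguments.
        return args[key]

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds every key the format string actually referred to.
        unused = [name for name in kwargs if name not in used_args]
        unused += [i for i in range(len(args)) if i not in used_args]
        if unused:
            raise ValueError('unused arguments: %r' % (unused,))


fmt = StrictFormatter()
greeting = fmt.format('{0} meets {name}', 'Arthur', name='Tim')
with_default = fmt.format('{0} meets {name}', 'Arthur')
```

With this subclass, ``fmt.format('{0}', 'a', 'b')`` raises :exc:`ValueError` because the second positional argument is never referenced.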


.. _formatstrings:

Format String Syntax
--------------------

The :meth:`str.format` method and the :class:`Formatter` class share the same
syntax for format strings (although in the case of :class:`Formatter`,
subclasses can define their own format string syntax).

Format strings contain "replacement fields" surrounded by curly braces ``{}``.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output.  If you need to include a brace character in the
literal text, it can be escaped by doubling: ``{{`` and ``}}``.

The grammar for a replacement field is as follows:

   .. productionlist:: sf
      replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
      field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
      arg_name: [`identifier` | `integer`]
      attribute_name: `identifier`
      element_index: `integer` | `index_string`
      index_string: <any source character except "]"> +
      conversion: "r" | "s"
      format_spec: <described in the next section>

In less formal terms, the replacement field can start with a *field_name* that specifies
the object whose value is to be formatted and inserted
into the output instead of the replacement field.
The *field_name* is optionally followed by a *conversion* field, which is
preceded by an exclamation point ``'!'``, and a *format_spec*, which is preceded
by a colon ``':'``.  These specify a non-default format for the replacement value.
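One way to see how a replacement field breaks down into these parts is to run a format string through :meth:`Formatter.parse`. The format string below is made up for the example; the tuple order is (*literal_text*, *field_name*, *format_spec*, *conversion*).

```python
import string

# Split a format string into literal text and replacement-field parts.
fields = list(string.Formatter().parse('Weight: {0.weight!r:>10} tons'))

# The first tuple pairs the leading literal text with the replacement field;
# the second carries the trailing literal text and no field at all.
first, last = fields
```

Here ``first`` is ``('Weight: ', '0.weight', '>10', 'r')`` and ``last`` is ``(' tons', None, None, None)``.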

See also the :ref:`formatspec` section.

The *field_name* itself begins with an *arg_name* that is either a number or a
keyword.  If it's a number, it refers to a positional argument, and if it's a keyword,
it refers to a named keyword argument.  If the numerical arg_names in a format string
are 0, 1, 2, ... in sequence, they can all be omitted (not just some)
and the numbers 0, 1, 2, ... will be automatically inserted in that order.
Because *arg_name* is not quote-delimited, it is not possible to specify arbitrary
dictionary keys (e.g., the strings ``'10'`` or ``':-]'``) within a format string.
The *arg_name* can be followed by any number of index or
attribute expressions. An expression of the form ``'.name'`` selects the named
attribute using :func:`getattr`, while an expression of the form ``'[index]'``
does an index lookup using :func:`__getitem__`.

.. versionchanged:: 2.7
   The positional argument specifiers can be omitted for :meth:`str.format` and
   :meth:`unicode.format`, so ``'{} {}'`` is equivalent to ``'{0} {1}'``,
   ``u'{} {}'`` is equivalent to ``u'{0} {1}'``.

Some simple format string examples::

   "First, thou shalt count to {0}"  # References first positional argument
   "Bring me a {}"                   # Implicitly references the first positional argument
   "From {} to {}"                   # Same as "From {0} to {1}"
   "My quest is {name}"              # References keyword argument 'name'
   "Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
   "Units destroyed: {players[0]}"   # First element of keyword argument 'players'.
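The index lookup rules, including the limitation on dictionary keys that look like numbers, can be checked with a short sketch. The data values are arbitrary; the point is that an all-digit index is treated as an integer, so a string key such as ``'10'`` cannot be reached.

```python
# Unquoted text inside [] is used as a string key.
by_name = '{0[name]}'.format({'name': 'Lancelot'})

# An all-digit index is converted to an integer before the lookup.
by_index = '{0[1]}'.format(['first', 'second'])
int_key = '{0[10]}'.format({10: 'ten'})

# The same field cannot find the *string* key '10'.
try:
    '{0[10]}'.format({'10': 'ten'})
    string_key_found = True
except KeyError:
    string_key_found = False
```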

The *conversion* field causes a type coercion before formatting.  Normally, the
job of formatting a value is done by the :meth:`__format__` method of the value
itself.  However, in some cases it is desirable to force a type to be formatted
as a string, overriding its own definition of formatting.  By converting the
value to a string before calling :meth:`__format__`, the normal formatting logic
is bypassed.

Two conversion flags are currently supported: ``'!s'`` which calls :func:`str`
on the value, and ``'!r'`` which calls :func:`repr`.

Some examples::

   "Harold's a clever {0!s}"        # Calls str() on the argument first
   "Bring out the holy {name!r}"    # Calls repr() on the argument first
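To make the effect of a conversion concrete, the sketch below defines a small class with its own :meth:`__format__`. The class ``Temperature`` is invented for this example. Without a conversion the object's own formatting is used; with ``!s`` the value becomes a string first, so the format specification is applied to that string instead.

```python
class Temperature(object):
    """Hypothetical value type with custom formatting."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return '%s degrees' % self.degrees

    def __format__(self, format_spec):
        # Apply the specification to the number, then append a unit.
        return format(self.degrees, format_spec) + ' C'


t = Temperature(21.5)
custom = '{0:.2f}'.format(t)      # handled by Temperature.__format__
as_text = '{0!s:>14}'.format(t)   # str(t) first, then string alignment
as_repr = '{0!r}'.format('spam')  # repr() keeps the quotes
```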

The *format_spec* field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on.  Each value type can define its own "formatting
mini-language" or interpretation of the *format_spec*.

Most built-in types support a common formatting mini-language, which is
described in the next section.

A *format_spec* field can also include nested replacement fields within it.
These nested replacement fields may contain a field name, conversion flag
and format specification, but deeper nesting is
not allowed.  The replacement fields within the
format_spec are substituted before the *format_spec* string is interpreted.
This allows the formatting of a value to be dynamically specified.
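A brief sketch of nested replacement fields: the width and precision are passed as keyword arguments (the names ``width`` and ``precision`` are chosen for this example) and substituted into the format specification before it is interpreted.

```python
value = 3.14159

# {width} and {precision} are filled in first, giving the spec '10.2f'.
dynamic = '{0:{width}.{precision}f}'.format(value, width=10, precision=2)

# The equivalent format string with the specification written out.
static = '{0:10.2f}'.format(value)
```

Both expressions produce the same ten-character result.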

See the :ref:`formatexamples` section for some examples.


.. _formatspec:

Format Specification Mini-Language
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see
:ref:`formatstrings`).  They can also be passed directly to the built-in
:func:`format` function.  Each formattable type may define how the format
specification is to be interpreted.

Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.

A general convention is that an empty format string (``""``) produces
the same result as if you had called :func:`str` on the value. A
non-empty format string typically modifies the result.

The general form of a *standard format specifier* is:

.. productionlist:: sf
   format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`]
   fill: <any character>
   align: "<" | ">" | "=" | "^"
   sign: "+" | "-" | " "
   width: `integer`
   precision: `integer`
   type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

If a valid *align* value is specified, it can be preceded by a *fill*
character that can be any character and defaults to a space if omitted.
It is not possible to use a literal curly brace ("``{``" or "``}``") as
the *fill* character when using the :meth:`str.format`
method.  However, it is possible to insert a curly brace
with a nested replacement field.  This limitation doesn't
affect the :func:`format` function.
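The two ways around the curly-brace restriction can be sketched as follows. The keyword name ``fill`` is chosen for this example only.

```python
# format() takes the specification directly, so '{' is an ordinary fill.
via_builtin = format(42, '{>6')

# With str.format, supply the brace through a nested replacement field.
via_nested = '{0:{fill}>6}'.format(42, fill='{')
```

Both calls pad the number on the left with braces to a width of six.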

The meaning of the various alignment options is as follows:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'<'`` | Forces the field to be left-aligned within the available |
   |         | space (this is the default for most objects).            |
   +---------+----------------------------------------------------------+
   | ``'>'`` | Forces the field to be right-aligned within the          |
   |         | available space (this is the default for numbers).       |
   +---------+----------------------------------------------------------+
   | ``'='`` | Forces the padding to be placed after the sign (if any)  |
   |         | but before the digits.  This is used for printing fields |
   |         | in the form '+000000120'. This alignment option is only  |
   |         | valid for numeric types.  It becomes the default when '0'|
   |         | immediately precedes the field width.                    |
   +---------+----------------------------------------------------------+
   | ``'^'`` | Forces the field to be centered within the available     |
   |         | space.                                                   |
   +---------+----------------------------------------------------------+

Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.
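The default alignments, and the note about a missing width, can be seen in a few calls to the built-in :func:`format` function. The values are arbitrary.

```python
text_default = format('ab', '6')    # strings are left-aligned by default
number_default = format(42, '6')    # numbers are right-aligned by default
centered = format('ab', '^6')       # explicit centering
no_width = format('ab', '>')        # no width given, so nothing to align
```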

The *sign* option is only valid for number types, and can be one of the
following:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'+'`` | indicates that a sign should be used for both            |
   |         | positive as well as negative numbers.                    |
   +---------+----------------------------------------------------------+
   | ``'-'`` | indicates that a sign should be used only for negative   |
   |         | numbers (this is the default behavior).                  |
   +---------+----------------------------------------------------------+
   | space   | indicates that a leading space should be used on         |
   |         | positive numbers, and a minus sign on negative numbers.  |
   +---------+----------------------------------------------------------+

The ``'#'`` option is only valid for integers, and only for binary, octal, or
hexadecimal output.  If present, it specifies that the output will be prefixed
by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.

The ``','`` option signals the use of a comma for a thousands separator.
For a locale aware separator, use the ``'n'`` integer presentation type
instead.

.. versionchanged:: 2.7
   Added the ``','`` option (see also :pep:`378`).

*width* is a decimal integer defining the minimum field width.  If not
specified, then the field width will be determined by the content.

When no explicit alignment is given, preceding the *width* field by a zero
(``'0'``) character enables
sign-aware zero-padding for numeric types.  This is equivalent to a *fill*
character of ``'0'`` with an *alignment* type of ``'='``.
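The equivalence between the ``'0'`` prefix and an explicit ``'0='`` fill and alignment can be checked directly. This sketch uses the value 120 so the result matches the ``'+000000120'`` form mentioned in the alignment table.

```python
# A leading '0' before the width turns on sign-aware zero padding.
zero_flag = format(120, '+010')

# The same result spelled out as fill '0' with alignment '='.
explicit = format(120, '0=+10')

# For negative numbers the zeros go between the sign and the digits.
negative = format(-120, '010')
```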

The *precision* is a decimal number indicating how many digits should be
displayed after the decimal point for a floating point value formatted with
``'f'`` and ``'F'``, or before and after the decimal point for a floating point
value formatted with ``'g'`` or ``'G'``.  For non-number types the field
indicates the maximum field size - in other words, how many characters will be
used from the field content. The *precision* is not allowed for integer values.
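The three readings of *precision* described above can be compared side by side. The sample values are arbitrary, and the final check only records whether an integer presentation type rejects a precision.

```python
after_point = format(3.14159, '.2f')   # digits after the decimal point
significant = format(1234.5678, '.3g')  # total significant digits
truncated = format('truncate', '.4')    # maximum characters for a string

# A precision combined with an integer presentation type is an error.
try:
    format(42, '.2d')
    integer_allowed = True
except ValueError:
    integer_allowed = False
```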

Finally, the *type* determines how the data should be presented.

The available string presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'s'`` | String format. This is the default type for strings and  |
   |         | may be omitted.                                          |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'s'``.                                     |
   +---------+----------------------------------------------------------+

The available integer presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'b'`` | Binary format. Outputs the number in base 2.             |
   +---------+----------------------------------------------------------+
   | ``'c'`` | Character. Converts the integer to the corresponding     |
   |         | unicode character before printing.                       |
   +---------+----------------------------------------------------------+
   | ``'d'`` | Decimal Integer. Outputs the number in base 10.          |
   +---------+----------------------------------------------------------+
   | ``'o'`` | Octal format. Outputs the number in base 8.              |
   +---------+----------------------------------------------------------+
   | ``'x'`` | Hex format. Outputs the number in base 16, using lower-  |
   |         | case letters for the digits above 9.                     |
   +---------+----------------------------------------------------------+
   | ``'X'`` | Hex format. Outputs the number in base 16, using upper-  |
   |         | case letters for the digits above 9.                     |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'d'``.                                     |
   +---------+----------------------------------------------------------+

In addition to the above presentation types, integers can be formatted
with the floating point presentation types listed below (except
``'n'`` and ``None``). When doing so, :func:`float` is used to convert the
integer to a floating point number before formatting.

The available presentation types for floating point and decimal values are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'e'`` | Exponent notation. Prints the number in scientific       |
   |         | notation using the letter 'e' to indicate the exponent.  |
   |         | The default precision is ``6``.                          |
   +---------+----------------------------------------------------------+
   | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an     |
   |         | upper case 'E' as the separator character.               |
   +---------+----------------------------------------------------------+
   | ``'f'`` | Fixed-point notation. Displays the number as a           |
   |         | fixed-point number.  The default precision is ``6``.     |
   +---------+----------------------------------------------------------+
   | ``'F'`` | Fixed point notation. Same as ``'f'``.                   |
   +---------+----------------------------------------------------------+
   | ``'g'`` | General format.  For a given precision ``p >= 1``,       |
   |         | this rounds the number to ``p`` significant digits and   |
   |         | then formats the result in either fixed-point format     |
   |         | or in scientific notation, depending on its magnitude.   |
   |         |                                                          |
   |         | The precise rules are as follows: suppose that the       |
   |         | result formatted with presentation type ``'e'`` and      |
   |         | precision ``p-1`` would have exponent ``exp``.  Then     |
   |         | if ``-4 <= exp < p``, the number is formatted            |
   |         | with presentation type ``'f'`` and precision             |
   |         | ``p-1-exp``.  Otherwise, the number is formatted         |
   |         | with presentation type ``'e'`` and precision ``p-1``.    |
   |         | In both cases insignificant trailing zeros are removed   |
   |         | from the significand, and the decimal point is also      |
   |         | removed if there are no remaining digits following it.   |
   |         |                                                          |
   |         | Positive and negative infinity, positive and negative    |
   |         | zero, and nans, are formatted as ``inf``, ``-inf``,      |
   |         | ``0``, ``-0`` and ``nan`` respectively, regardless of    |
   |         | the precision.                                           |
   |         |                                                          |
   |         | A precision of ``0`` is treated as equivalent to a       |
   |         | precision of ``1``.  The default precision is ``6``.     |
   +---------+----------------------------------------------------------+
   | ``'G'`` | General format. Same as ``'g'`` except switches to       |
   |         | ``'E'`` if the number gets too large. The                |
   |         | representations of infinity and NaN are uppercased, too. |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | ``'%'`` | Percentage. Multiplies the number by 100 and displays    |
   |         | in fixed (``'f'``) format, followed by a percent sign.   |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'g'``.                                     |
   +---------+----------------------------------------------------------+
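The switch between fixed-point and scientific notation under ``'g'`` follows from the ``-4 <= exp < p`` rule in the table. The sketch below works through the default precision ``p = 6`` with a few arbitrary values, and adds one ``'%'`` example.

```python
# exp = -4 satisfies -4 <= exp < 6, so fixed-point notation is used.
small_fixed = format(0.0001, 'g')

# exp = -5 falls outside the range, so scientific notation is used.
small_sci = format(0.00001, 'g')

# exp = 5 is still below p; exp = 6 is not.
large_fixed = format(123456.0, 'g')
large_sci = format(1234567.0, 'g')
large_sci_upper = format(1234567.0, 'G')

# '%' multiplies by 100 and appends a percent sign.
percent = format(0.25, '.1%')
```

In each ``'g'`` result the trailing zeros have been removed, as the table describes.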



.. _formatexamples:

Format examples
^^^^^^^^^^^^^^^

This section contains examples of the :meth:`str.format` syntax and
comparison with the old ``%``-formatting.

In most of the cases the syntax is similar to the old ``%``-formatting, with the
addition of the ``{}`` and with ``:`` used instead of ``%``.
For example, ``'%03.2f'`` can be translated to ``'{:03.2f}'``.
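As a quick check of this correspondence, the sketch below formats the same values with both styles. The specifications are arbitrary; note that left alignment is written ``-`` in ``%``-formatting but ``<`` in the new syntax, so not every specification carries over character for character.

```python
number = 3.14159

old_number = '%08.3f' % number
new_number = '{0:08.3f}'.format(number)

# Left alignment: '-' in the old style, '<' in the new one.
old_text = '%-6s|' % 'ab'
new_text = '{0:<6}|'.format('ab')
```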

The new format syntax also supports new and different options, shown in the
following examples.

Accessing arguments by position::

   >>> '{0}, {1}, {2}'.format('a', 'b', 'c')
   'a, b, c'
   >>> '{}, {}, {}'.format('a', 'b', 'c')  # 2.7+ only
   'a, b, c'
   >>> '{2}, {1}, {0}'.format('a', 'b', 'c')
   'c, b, a'
   >>> '{2}, {1}, {0}'.format(*'abc')      # unpacking argument sequence
   'c, b, a'
   >>> '{0}{1}{0}'.format('abra', 'cad')   # arguments' indices can be repeated
   'abracadabra'

Accessing arguments by name::

   >>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
   'Coordinates: 37.24N, -115.81W'
   >>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
   >>> 'Coordinates: {latitude}, {longitude}'.format(**coord)
   'Coordinates: 37.24N, -115.81W'

Accessing arguments' attributes::

   >>> c = 3-5j
   >>> ('The complex number {0} is formed from the real part {0.real} '
   ...  'and the imaginary part {0.imag}.').format(c)
   'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'
   >>> class Point(object):
   ...     def __init__(self, x, y):
   ...         self.x, self.y = x, y
   ...     def __str__(self):
   ...         return 'Point({self.x}, {self.y})'.format(self=self)
   ...
   >>> str(Point(4, 2))
   'Point(4, 2)'


Accessing arguments' items::

   >>> coord = (3, 5)
   >>> 'X: {0[0]};  Y: {0[1]}'.format(coord)
   'X: 3;  Y: 5'

Replacing ``%s`` and ``%r``::

   >>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
   "repr() shows quotes: 'test1'; str() doesn't: test2"

Aligning the text and specifying a width::

   >>> '{:<30}'.format('left aligned')
   'left aligned                  '
   >>> '{:>30}'.format('right aligned')
   '                 right aligned'
   >>> '{:^30}'.format('centered')
   '           centered           '
   >>> '{:*^30}'.format('centered')  # use '*' as a fill char
   '***********centered***********'

Replacing ``%+f``, ``%-f``, and ``% f`` and specifying a sign::

   >>> '{:+f}; {:+f}'.format(3.14, -3.14)  # show it always
   '+3.140000; -3.140000'
   >>> '{: f}; {: f}'.format(3.14, -3.14)  # show a space for positive numbers
   ' 3.140000; -3.140000'
   >>> '{:-f}; {:-f}'.format(3.14, -3.14)  # show only the minus -- same as '{:f}; {:f}'
   '3.140000; -3.140000'

Replacing ``%x`` and ``%o`` and converting the value to different bases::

   >>> # format also supports binary numbers
   >>> "int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42)
   'int: 42;  hex: 2a;  oct: 52;  bin: 101010'
   >>> # with 0x, 0o, or 0b as prefix:
   >>> "int: {0:d};  hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)
   'int: 42;  hex: 0x2a;  oct: 0o52;  bin: 0b101010'

Using the comma as a thousands separator::

   >>> '{:,}'.format(1234567890)
   '1,234,567,890'

Expressing a percentage::

   >>> points = 19.5
   >>> total = 22
   >>> 'Correct answers: {:.2%}'.format(points/total)
   'Correct answers: 88.64%'

Using type-specific formatting::

   >>> import datetime
   >>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
   >>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
   '2010-07-04 12:15:58'

Nesting arguments and more complex examples::

   >>> for align, text in zip('<^>', ['left', 'center', 'right']):
   ...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
   ...
   'left<<<<<<<<<<<<'
   '^^^^^center^^^^^'
   '>>>>>>>>>>>right'
   >>>
   >>> octets = [192, 168, 0, 1]
   >>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)
   'C0A80001'
   >>> int(_, 16)
   3232235521
   >>>
   >>> width = 5
   >>> for num in range(5,12):
   ...     for base in 'dXob':
   ...         print '{0:{width}{base}}'.format(num, base=base, width=width),
   ...     print
   ...
       5     5     5   101
       6     6     6   110
       7     7     7   111
       8     8    10  1000
       9     9    11  1001
      10     A    12  1010
      11     B    13  1011



Template strings
----------------

.. versionadded:: 2.4

Templates provide simpler string substitutions as described in :pep:`292`.
Instead of the normal ``%``\ -based substitutions, Templates support ``$``\
-based substitutions, using the following rules:

* ``$$`` is an escape; it is replaced with a single ``$``.

* ``$identifier`` names a substitution placeholder matching a mapping key of
  ``"identifier"``.  By default, ``"identifier"`` must spell a Python
  identifier.  The first non-identifier character after the ``$`` character
  terminates this placeholder specification.

* ``${identifier}`` is equivalent to ``$identifier``.  It is required when valid
  identifier characters follow the placeholder but are not part of the
  placeholder, such as ``"${noun}ification"``.

Any other appearance of ``$`` in the string will result in a :exc:`ValueError`
being raised.
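The three rules can be exercised with a short sketch. The template text and the value ``'class'`` are made up for the example. Without the braces, the whole run of identifier characters is taken as the placeholder name, so the lookup fails.

```python
from string import Template

# '$$' yields a literal dollar sign; the braces end the placeholder name.
braced = Template('${noun}ification costs $$5').substitute(noun='class')

# Without braces the placeholder is read as 'nounification'.
try:
    Template('$nounification').substitute(noun='class')
    unbraced_found = True
except KeyError:
    unbraced_found = False
```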
676
677The :mod:`string` module provides a :class:`Template` class that implements
678these rules.  The methods of :class:`Template` are:
679
680
681.. class:: Template(template)
682
683   The constructor takes a single argument which is the template string.
684
685
686   .. method:: substitute(mapping[, **kws])
687
688      Performs the template substitution, returning a new string.  *mapping* is
689      any dictionary-like object with keys that match the placeholders in the
690      template.  Alternatively, you can provide keyword arguments, where the
691      keywords are the placeholders.  When both *mapping* and *kws* are given
692      and there are duplicates, the placeholders from *kws* take precedence.
693
694
695   .. method:: safe_substitute(mapping[, **kws])
696
      Like :meth:`substitute`, except that if placeholders are missing from
      *mapping* and *kws*, instead of raising a :exc:`KeyError` exception, the
      original placeholder will appear in the resulting string intact.  Also,
      unlike with :meth:`substitute`, any other appearance of ``$`` is simply
      left in the result as ``$`` instead of raising :exc:`ValueError`.

      While other exceptions may still occur, this method is called "safe"
      because it always tries to return a usable string instead of
      raising an exception.  In another sense, :meth:`safe_substitute` may be
      anything other than safe, since it will silently ignore malformed
      templates containing dangling delimiters, unmatched braces, or
      placeholders that are not valid Python identifiers.
709
710   :class:`Template` instances also provide one public data attribute:
711
712   .. attribute:: template
713
714      This is the object passed to the constructor's *template* argument.  In
715      general, you shouldn't change it, but read-only access is not enforced.
716
717Here is an example of how to use a Template::
718
719   >>> from string import Template
720   >>> s = Template('$who likes $what')
721   >>> s.substitute(who='tim', what='kung pao')
722   'tim likes kung pao'
723   >>> d = dict(who='tim')
724   >>> Template('Give $who $100').substitute(d)
725   Traceback (most recent call last):
726   ...
727   ValueError: Invalid placeholder in string: line 1, col 11
728   >>> Template('$who likes $what').substitute(d)
729   Traceback (most recent call last):
730   ...
731   KeyError: 'what'
732   >>> Template('$who likes $what').safe_substitute(d)
733   'tim likes $what'
734
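The following sketch illustrates two behaviours described above that the session does not cover: keyword arguments taking precedence over the mapping, and :meth:`safe_substitute` leaving malformed or unknown placeholders in place.  The template text and names are illustrative assumptions.

```python
from string import Template

t = Template('$who likes $what')
defaults = {'who': 'tim', 'what': 'kung pao'}

# Keyword arguments override duplicate keys from the mapping.
overridden = t.substitute(defaults, what='noodles')

# safe_substitute() keeps the stray "$100" and the unknown "$other" as-is.
partial = Template('$who owes $100 to $other').safe_substitute(who='tim')
```

With these inputs, ``overridden`` should be ``'tim likes noodles'`` and ``partial`` should be ``'tim owes $100 to $other'``.
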
735Advanced usage: you can derive subclasses of :class:`Template` to customize the
736placeholder syntax, delimiter character, or the entire regular expression used
737to parse template strings.  To do this, you can override these class attributes:
738
* *delimiter* -- This is the literal string that introduces a placeholder.
  The default value is ``$``.  Note that this should *not* be a
  regular expression, as the implementation will call :func:`re.escape` on this
  string as needed.
743
744* *idpattern* -- This is the regular expression describing the pattern for
745  non-braced placeholders (the braces will be added automatically as
746  appropriate).  The default value is the regular expression
747  ``[_a-z][_a-z0-9]*``.
748
749Alternatively, you can provide the entire regular expression pattern by
750overriding the class attribute *pattern*.  If you do this, the value must be a
751regular expression object with four named capturing groups.  The capturing
752groups correspond to the rules given above, along with the invalid placeholder
753rule:
754
755* *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the
756  default pattern.
757
* *named* -- This group matches the unbraced placeholder name; it should not
  include the delimiter in the capturing group.

* *braced* -- This group matches the brace-enclosed placeholder name; it should
  not include either the delimiter or braces in the capturing group.
763
764* *invalid* -- This group matches any other delimiter pattern (usually a single
765  delimiter), and it should appear last in the regular expression.
766
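As a small sketch of the simplest customization, overriding *delimiter*, consider the subclass below.  ``PercentTemplate`` is a hypothetical name chosen for illustration; it is not provided by the module.

```python
from string import Template

class PercentTemplate(Template):
    # Hypothetical subclass: "%" introduces placeholders instead of "$".
    delimiter = '%'

# "%who" is a placeholder; "%%" is now the escape for a literal "%".
result = PercentTemplate('%who scored 100%%').substitute(who='tim')
```

Because the escape sequence is built from the delimiter, ``%%`` yields a literal ``%`` here and ``result`` should be ``'tim scored 100%'``.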
767
768String functions
769----------------
770
771The following functions are available to operate on string and Unicode objects.
772They are not available as string methods.
773
774
775.. function:: capwords(s[, sep])
776
777   Split the argument into words using :meth:`str.split`, capitalize each word
778   using :meth:`str.capitalize`, and join the capitalized words using
   :meth:`str.join`.  If the optional second argument *sep* is absent
   or ``None``, runs of whitespace characters are replaced by a single space
   and leading and trailing whitespace is removed; otherwise *sep* is used to
   split and join the words.
783
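A short sketch of both modes of :func:`capwords`; the input strings are made up for illustration.

```python
import string

# Without sep: whitespace runs collapse and the ends are trimmed.
collapsed = string.capwords('  hello   wide world ')

# With sep: the separator is used for both splitting and joining.
kept = string.capwords('alpha-beta-gamma', '-')
```

The first call should give ``'Hello Wide World'`` and the second ``'Alpha-Beta-Gamma'``.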
784
785.. function:: maketrans(from, to)
786
   Return a translation table, suitable for passing to :func:`translate`, that
   maps each character in *from* to the character at the same position in *to*;
   *from* and *to* must have the same length.
790
791   .. note::
792
793      Don't use strings derived from :const:`lowercase` and :const:`uppercase` as
794      arguments; in some locales, these don't have the same length.  For case
795      conversions, always use :meth:`str.lower` and :meth:`str.upper`.
796
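A hedged sketch of building and applying a translation table.  The ``try``/``except`` fallback to ``str.maketrans`` is an assumption added only so the snippet also runs on Python 3, where ``string.maketrans`` no longer exists; it is not part of the module documented here.

```python
try:
    from string import maketrans   # the function documented here
except ImportError:
    maketrans = str.maketrans      # assumption: Python 3 fallback

# Map 'a' -> 'x', 'b' -> 'y', 'c' -> 'z'; both arguments have equal length.
table = maketrans('abc', 'xyz')
translated = 'aabbcc'.translate(table)
```

With this table, ``translated`` should be ``'xxyyzz'``.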
797
798Deprecated string functions
799---------------------------
800
The following functions are also defined as methods of string and
Unicode objects; see section :ref:`string-methods` for more information on
those.  You should consider these functions as deprecated, although they will
not be removed until Python 3.  The functions defined in this module are:
805
806
807.. function:: atof(s)
808
809   .. deprecated:: 2.0
810      Use the :func:`float` built-in function.
811
812   .. index:: builtin: float
813
   Convert a string to a floating point number.  The string must have the standard
   syntax for a floating point literal in Python, optionally preceded by a sign
   (``+`` or ``-``).  Note that this behaves identically to the built-in function
   :func:`float` when passed a string.
818
819   .. note::
820
821      .. index::
822         single: NaN
823         single: Infinity
824
825      When passing in a string, values for NaN and Infinity may be returned, depending
826      on the underlying C library.  The specific set of strings accepted which cause
827      these values to be returned depends entirely on the C library and is known to
828      vary.
829
830
831.. function:: atoi(s[, base])
832
833   .. deprecated:: 2.0
834      Use the :func:`int` built-in function.
835
836   .. index:: builtin: eval
837
838   Convert string *s* to an integer in the given *base*.  The string must consist
839   of one or more digits, optionally preceded by a sign (``+`` or ``-``).  The
840   *base* defaults to 10.  If it is 0, a default base is chosen depending on the
841   leading characters of the string (after stripping the sign): ``0x`` or ``0X``
842   means 16, ``0`` means 8, anything else means 10.  If *base* is 16, a leading
843   ``0x`` or ``0X`` is always accepted, though not required.  This behaves
844   identically to the built-in function :func:`int` when passed a string.  (Also
845   note: for a more flexible interpretation of numeric literals, use the built-in
846   function :func:`eval`.)
847
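Since these conversion functions are deprecated in favour of the built-ins, the sketch below uses :func:`float` and :func:`int` directly.  The literals are illustrative, and only base handling that is described above is shown.

```python
as_float = float('+1.5e3')       # sign followed by a float literal
decimal = int('-42')             # base defaults to 10
hexadecimal = int('0x1A', 16)    # the "0x" prefix is accepted for base 16
guessed = int('0x1A', 0)         # base 0: the prefix selects base 16
```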
848
849.. function:: atol(s[, base])
850
851   .. deprecated:: 2.0
852      Use the :func:`long` built-in function.
853
854   .. index:: builtin: long
855
   Convert string *s* to a long integer in the given *base*.  The string must
   consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
   The *base* argument has the same meaning as for :func:`atoi`.  A trailing ``l``
   or ``L`` is not allowed, except if the base is 0.  Note that when invoked
   without *base* or with *base* set to 10, this behaves identically to the
   built-in function :func:`long` when passed a string.
862
863
864.. function:: capitalize(word)
865
866   Return a copy of *word* with only its first character capitalized.
867
868
869.. function:: expandtabs(s[, tabsize])
870
871   Expand tabs in a string replacing them by one or more spaces, depending on the
872   current column and the given tab size.  The column number is reset to zero after
873   each newline occurring in the string. This doesn't understand other non-printing
874   characters or escape sequences.  The tab size defaults to 8.
875
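A brief sketch of tab expansion using the equivalent string method; the sample strings are illustrative.

```python
# 'a' occupies column 0, so the tab pads up to column 4.
simple = 'a\tb'.expandtabs(4)

# The column counter resets after the newline.
reset = 'ab\ncd\te'.expandtabs(4)

# With no argument the tab size is 8.
default = 'x\ty'.expandtabs()
```

Here ``simple`` should be ``'a   b'``, ``reset`` should be ``'ab\ncd  e'``, and ``default`` should contain seven spaces between ``x`` and ``y``.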
876
.. function:: find(s, sub[, start[, end]])

   Return the lowest index in *s* where the substring *sub* is found such that
   *sub* is wholly contained in ``s[start:end]``.  Return ``-1`` on failure.
   Defaults for *start* and *end* and interpretation of negative values are the
   same as for slices.
883
884
885.. function:: rfind(s, sub[, start[, end]])
886
887   Like :func:`find` but find the highest index.
888
889
890.. function:: index(s, sub[, start[, end]])
891
892   Like :func:`find` but raise :exc:`ValueError` when the substring is not found.
893
894
895.. function:: rindex(s, sub[, start[, end]])
896
897   Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found.
898
899
900.. function:: count(s, sub[, start[, end]])
901
902   Return the number of (non-overlapping) occurrences of substring *sub* in string
903   ``s[start:end]``. Defaults for *start* and *end* and interpretation of negative
904   values are the same as for slices.
905
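The search functions above can be sketched with the corresponding string methods; the sample string is an illustrative assumption.

```python
s = 'hello world'

first = s.find('o')          # lowest index of 'o'
last = s.rfind('o')          # highest index of 'o'
missing = s.find('z')        # -1 when the substring is absent
windowed = s.find('o', 5)    # search restricted to s[5:]
total = s.count('o')         # non-overlapping occurrences

# index() behaves like find() but raises ValueError on failure.
try:
    s.index('z')
    index_raised = False
except ValueError:
    index_raised = True
```

For this string the values should be ``4``, ``7``, ``-1``, ``7`` and ``2`` respectively.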
906
907.. function:: lower(s)
908
909   Return a copy of *s*, but with upper case letters converted to lower case.
910
911
912.. function:: split(s[, sep[, maxsplit]])
913
914   Return a list of the words of the string *s*.  If the optional second argument
915   *sep* is absent or ``None``, the words are separated by arbitrary strings of
916   whitespace characters (space, tab, newline, return, formfeed).  If the second
917   argument *sep* is present and not ``None``, it specifies a string to be used as
   the word separator.  The returned list will then have one more item than the
919   number of non-overlapping occurrences of the separator in the string.
920   If *maxsplit* is given, at most *maxsplit* number of splits occur, and the
921   remainder of the string is returned as the final element of the list (thus,
922   the list will have at most ``maxsplit+1`` elements).  If *maxsplit* is not
923   specified or ``-1``, then there is no limit on the number of splits (all
924   possible splits are made).
925
926   The behavior of split on an empty string depends on the value of *sep*. If *sep*
927   is not specified, or specified as ``None``, the result will be an empty list.
928   If *sep* is specified as any string, the result will be a list containing one
929   element which is an empty string.
930
931
932.. function:: rsplit(s[, sep[, maxsplit]])
933
934   Return a list of the words of the string *s*, scanning *s* from the end.  To all
935   intents and purposes, the resulting list of words is the same as returned by
936   :func:`split`, except when the optional third argument *maxsplit* is explicitly
937   specified and nonzero.  If *maxsplit* is given, at most *maxsplit* number of
938   splits -- the *rightmost* ones -- occur, and the remainder of the string is
939   returned as the first element of the list (thus, the list will have at most
940   ``maxsplit+1`` elements).
941
942   .. versionadded:: 2.4
943
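A sketch of the splitting rules for :func:`split` and :func:`rsplit`, written with the equivalent string methods; the inputs are illustrative.

```python
# No separator: split on runs of whitespace.
words = 'a b  c'.split()

# Explicit separator: adjacent separators yield empty strings.
fields = 'a,b,,c'.split(',')

# maxsplit limits the splits; rsplit counts them from the right.
left = 'a,b,c'.split(',', 1)
right = 'a,b,c'.rsplit(',', 1)

# Splitting an empty string depends on whether sep is given.
empty_default = ''.split()
empty_sep = ''.split(',')
```

Here ``left`` should be ``['a', 'b,c']`` while ``right`` should be ``['a,b', 'c']``; the two empty-string cases should give ``[]`` and ``['']``.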
944
945.. function:: splitfields(s[, sep[, maxsplit]])
946
947   This function behaves identically to :func:`split`.  (In the past, :func:`split`
948   was only used with one argument, while :func:`splitfields` was only used with
949   two arguments.)
950
951
952.. function:: join(words[, sep])
953
   Concatenate a list or tuple of words with intervening occurrences of *sep*.
   The default value for *sep* is a single space character.  When *sep* is a
   non-empty string, ``string.join(string.split(s, sep), sep)`` equals *s*.
957
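A sketch of the split/join round trip using string methods.  It also shows, as a contrast, that splitting on whitespace without an explicit separator does not round-trip; the sample strings are illustrative.

```python
s = 'a,b,,c'

# Splitting and joining on the same explicit separator restores s.
round_trip = ','.join(s.split(','))

# Whitespace splitting collapses runs, so the original is not restored.
spaced = 'a  b'
collapsed = ' '.join(spaced.split())
```

Here ``round_trip`` should equal ``s``, whereas ``collapsed`` should be ``'a b'`` rather than ``'a  b'``.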
958
959.. function:: joinfields(words[, sep])
960
   This function behaves identically to :func:`join`.  (In the past, :func:`join`
962   was only used with one argument, while :func:`joinfields` was only used with two
963   arguments.) Note that there is no :meth:`joinfields` method on string objects;
964   use the :meth:`join` method instead.
965
966
967.. function:: lstrip(s[, chars])
968
   Return a copy of the string with leading characters removed.  If *chars* is
   omitted or ``None``, whitespace characters are removed.  If given and not
   ``None``, *chars* must be a string; the characters in the string will be
   stripped from the beginning of *s*.
973
974   .. versionchanged:: 2.2.3
975      The *chars* parameter was added.  The *chars* parameter cannot be passed in
976      earlier 2.2 versions.
977
978
979.. function:: rstrip(s[, chars])
980
   Return a copy of the string with trailing characters removed.  If *chars* is
   omitted or ``None``, whitespace characters are removed.  If given and not
   ``None``, *chars* must be a string; the characters in the string will be
   stripped from the end of *s*.
985
986   .. versionchanged:: 2.2.3
987      The *chars* parameter was added.  The *chars* parameter cannot be passed in
988      earlier 2.2 versions.
989
990
991.. function:: strip(s[, chars])
992
   Return a copy of the string with leading and trailing characters removed.  If
   *chars* is omitted or ``None``, whitespace characters are removed.  If given and
   not ``None``, *chars* must be a string; the characters in the string will be
   stripped from both ends of *s*.
997
998   .. versionchanged:: 2.2.3
999      The *chars* parameter was added.  The *chars* parameter cannot be passed in
1000      earlier 2.2 versions.
1001
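A sketch of the three stripping variants using the equivalent string methods.  Note that *chars* is treated as a set of characters, not as a prefix or suffix; the inputs are illustrative.

```python
both = '  hi  '.strip()              # whitespace removed from both ends
leading = 'xxhixx'.lstrip('x')       # only leading 'x' characters removed
trailing = 'xxhixx'.rstrip('x')      # only trailing 'x' characters removed

# Any of 'a', 'b', 'c' is stripped, in whatever order they appear.
char_set = 'abcXcba'.strip('abc')
```

The results should be ``'hi'``, ``'hixx'``, ``'xxhi'`` and ``'X'``.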
1002
1003.. function:: swapcase(s)
1004
1005   Return a copy of *s*, but with lower case letters converted to upper case and
1006   vice versa.
1007
1008
1009.. function:: translate(s, table[, deletechars])
1010
   Delete all characters from *s* that are in *deletechars* (if present), and then
   translate the characters using *table*, which must be a 256-character string
   giving the translation for each character value, indexed by its ordinal.  If
   *table* is ``None``, then only the character deletion step is performed.
1015
1016
1017.. function:: upper(s)
1018
1019   Return a copy of *s*, but with lower case letters converted to upper case.
1020
1021
1022.. function:: ljust(s, width[, fillchar])
1023              rjust(s, width[, fillchar])
1024              center(s, width[, fillchar])
1025
   These functions respectively left-justify, right-justify and center a string in
   a field of given width.  They return a string that is at least *width*
   characters wide, created by padding the string *s* with the character *fillchar*
   (default is a space) on the right, left or both sides until the given width is
   reached.  The string is never truncated.
1031
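A sketch of the justification functions via the equivalent string methods, with an explicit *fillchar* so the padding is visible; the inputs are illustrative.

```python
left = 'hi'.ljust(5, '*')        # padding added on the right
right = 'hi'.rjust(5, '*')       # padding added on the left
middle = 'hi'.center(6, '*')     # padding split across both sides

# A string longer than width is returned unchanged, never truncated.
unchanged = 'hello'.ljust(3)
```

These should give ``'hi***'``, ``'***hi'``, ``'**hi**'`` and ``'hello'``.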
1032
1033.. function:: zfill(s, width)
1034
1035   Pad a numeric string *s* on the left with zero digits until the
1036   given *width* is reached.  Strings starting with a sign are handled
1037   correctly.
1038
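A short sketch of zero padding with the equivalent string method, including the sign handling mentioned above; the inputs are illustrative.

```python
plain = '42'.zfill(5)      # zeros are added on the left
signed = '-42'.zfill(5)    # the sign stays in front of the zeros
long_enough = '12345'.zfill(3)   # already wide enough, returned as-is
```

The expected values are ``'00042'``, ``'-0042'`` and ``'12345'``.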
1039
1040.. function:: replace(s, old, new[, maxreplace])
1041
   Return a copy of string *s* with all occurrences of substring *old* replaced
   by *new*.  If the optional argument *maxreplace* is given, only the first
   *maxreplace* occurrences are replaced.
1045
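A sketch of replacement with and without a limit, using the equivalent string method; the sample string is illustrative.

```python
everything = 'aaaa'.replace('a', 'b')      # all occurrences replaced
limited = 'aaaa'.replace('a', 'b', 2)      # only the first two replaced
```

Here ``everything`` should be ``'bbbb'`` and ``limited`` should be ``'bbaa'``.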
1046