:mod:`string` --- Common string operations
==========================================

.. module:: string
   :synopsis: Common string operations.


.. index:: module: re

**Source code:** :source:`Lib/string.py`

--------------

The :mod:`string` module contains a number of useful constants and
classes, as well as some deprecated legacy functions that are also
available as methods on strings. In addition, Python's built-in string
classes support the sequence type methods described in the
:ref:`typesseq` section, and also the string-specific methods described
in the :ref:`string-methods` section. To output formatted strings use
template strings or the ``%`` operator described in the
:ref:`string-formatting` section. Also, see the :mod:`re` module for
string functions based on regular expressions.

String constants
----------------

The constants defined in this module are:


.. data:: ascii_letters

   The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
   constants described below. This value is not locale-dependent.


.. data:: ascii_lowercase

   The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not
   locale-dependent and will not change.


.. data:: ascii_uppercase

   The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not
   locale-dependent and will not change.


.. data:: digits

   The string ``'0123456789'``.


.. data:: hexdigits

   The string ``'0123456789abcdefABCDEF'``.


.. data:: letters

   The concatenation of the strings :const:`lowercase` and :const:`uppercase`
   described below. The specific value is locale-dependent, and will be updated
   when :func:`locale.setlocale` is called.


.. data:: lowercase

   A string containing all the characters that are considered lowercase letters.
   On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``.
   The specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: octdigits

   The string ``'01234567'``.


.. data:: punctuation

   String of ASCII characters which are considered punctuation characters in the
   ``C`` locale.


.. data:: printable

   String of characters which are considered printable. This is a combination of
   :const:`digits`, :const:`letters`, :const:`punctuation`, and
   :const:`whitespace`.


.. data:: uppercase

   A string containing all the characters that are considered uppercase letters.
   On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: whitespace

   A string containing all characters that are considered whitespace. On most
   systems this includes the characters space, tab, linefeed, return, formfeed, and
   vertical tab.


.. _new-string-formatting:

Custom String Formatting
------------------------

.. versionadded:: 2.6

The built-in str and unicode classes provide the ability
to do complex variable substitutions and value formatting via the
:meth:`str.format` method described in :pep:`3101`. The :class:`Formatter`
class in the :mod:`string` module allows you to create and customize your own
string formatting behaviors using the same implementation as the built-in
:meth:`~str.format` method.

.. class:: Formatter

   The :class:`Formatter` class has the following public methods:

   .. method:: format(format_string, *args, **kwargs)

      The primary API method. It takes a format string and
      an arbitrary set of positional and keyword arguments.
      It is just a wrapper that calls :meth:`vformat`.

   .. method:: vformat(format_string, args, kwargs)

      This function does the actual work of formatting.
      It is exposed as a
      separate function for cases where you want to pass in a predefined
      dictionary of arguments, rather than unpacking and repacking the
      dictionary as individual arguments using the ``*args`` and ``**kwargs``
      syntax. :meth:`vformat` does the work of breaking up the format string
      into character data and replacement fields. It calls the various
      methods described below.

   In addition, the :class:`Formatter` defines a number of methods that are
   intended to be replaced by subclasses:

   .. method:: parse(format_string)

      Loop over the format_string and return an iterable of tuples
      (*literal_text*, *field_name*, *format_spec*, *conversion*). This is used
      by :meth:`vformat` to break the string into either literal text, or
      replacement fields.

      The values in the tuple conceptually represent a span of literal text
      followed by a single replacement field. If there is no literal text
      (which can happen if two replacement fields occur consecutively), then
      *literal_text* will be a zero-length string. If there is no replacement
      field, then the values of *field_name*, *format_spec* and *conversion*
      will be ``None``.

   .. method:: get_field(field_name, args, kwargs)

      Given *field_name* as returned by :meth:`parse` (see above), convert it to
      an object to be formatted. Returns a tuple (obj, used_key). The default
      version takes strings of the form defined in :pep:`3101`, such as
      "0[name]" or "label.title". *args* and *kwargs* are as passed in to
      :meth:`vformat`. The return value *used_key* has the same meaning as the
      *key* parameter to :meth:`get_value`.

   .. method:: get_value(key, args, kwargs)

      Retrieve a given field value. The *key* argument will be either an
      integer or a string. If it is an integer, it represents the index of the
      positional argument in *args*; if it is a string, then it represents a
      named argument in *kwargs*.

      The *args* parameter is set to the list of positional arguments to
      :meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
      keyword arguments.

      For compound field names, these functions are only called for the first
      component of the field name; subsequent components are handled through
      normal attribute and indexing operations.

      So for example, the field expression '0.name' would cause
      :meth:`get_value` to be called with a *key* argument of 0. The ``name``
      attribute will be looked up after :meth:`get_value` returns by calling the
      built-in :func:`getattr` function.

      If the index or keyword refers to an item that does not exist, then an
      :exc:`IndexError` or :exc:`KeyError` should be raised.

   .. method:: check_unused_args(used_args, args, kwargs)

      Implement checking for unused arguments if desired. The arguments to this
      function are the set of all argument keys that were actually referred to in
      the format string (integers for positional arguments, and strings for
      named arguments), and a reference to the *args* and *kwargs* that were
      passed to :meth:`vformat`. The set of unused args can be calculated from
      these parameters. :meth:`check_unused_args` is assumed to raise an
      exception if the check fails.

   .. method:: format_field(value, format_spec)

      :meth:`format_field` simply calls the global :func:`format` built-in. The
      method is provided so that subclasses can override it.

   .. method:: convert_field(value, conversion)

      Converts the value (returned by :meth:`get_field`) given a conversion type
      (as in the tuple returned by the :meth:`parse` method). The default
      version understands 's' (str) and 'r' (repr) conversion types.


.. _formatstrings:

Format String Syntax
--------------------

The :meth:`str.format` method and the :class:`Formatter` class share the same
syntax for format strings (although in the case of :class:`Formatter`,
subclasses can define their own format string syntax).

Format strings contain "replacement fields" surrounded by curly braces ``{}``.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output. If you need to include a brace character in the
literal text, it can be escaped by doubling: ``{{`` and ``}}``.

The grammar for a replacement field is as follows:

   .. productionlist:: sf
      replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
      field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
      arg_name: [`identifier` | `integer`]
      attribute_name: `identifier`
      element_index: `integer` | `index_string`
      index_string: <any source character except "]"> +
      conversion: "r" | "s"
      format_spec: <described in the next section>

In less formal terms, the replacement field can start with a *field_name* that specifies
the object whose value is to be formatted and inserted
into the output instead of the replacement field.
The *field_name* is optionally followed by a *conversion* field, which is
preceded by an exclamation point ``'!'``, and a *format_spec*, which is preceded
by a colon ``':'``. These specify a non-default format for the replacement value.

See also the :ref:`formatspec` section.

The *field_name* itself begins with an *arg_name* that is either a number or a
keyword. If it's a number, it refers to a positional argument, and if it's a keyword,
it refers to a named keyword argument. If the numerical arg_names in a format string
are 0, 1, 2, ... in sequence, they can all be omitted (not just some)
and the numbers 0, 1, 2, ...
will be automatically inserted in that order.
Because *arg_name* is not quote-delimited, it is not possible to specify arbitrary
dictionary keys (e.g., the strings ``'10'`` or ``':-]'``) within a format string.
The *arg_name* can be followed by any number of index or
attribute expressions. An expression of the form ``'.name'`` selects the named
attribute using :func:`getattr`, while an expression of the form ``'[index]'``
does an index lookup using :func:`__getitem__`.

.. versionchanged:: 2.7
   The positional argument specifiers can be omitted for :meth:`str.format` and
   :meth:`unicode.format`, so ``'{} {}'`` is equivalent to ``'{0} {1}'``,
   ``u'{} {}'`` is equivalent to ``u'{0} {1}'``.

Some simple format string examples::

   "First, thou shalt count to {0}"  # References first positional argument
   "Bring me a {}"                   # Implicitly references the first positional argument
   "From {} to {}"                   # Same as "From {0} to {1}"
   "My quest is {name}"              # References keyword argument 'name'
   "Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
   "Units destroyed: {players[0]}"   # First element of keyword argument 'players'.

The *conversion* field causes a type coercion before formatting. Normally, the
job of formatting a value is done by the :meth:`__format__` method of the value
itself. However, in some cases it is desirable to force a type to be formatted
as a string, overriding its own definition of formatting. By converting the
value to a string before calling :meth:`__format__`, the normal formatting logic
is bypassed.

Two conversion flags are currently supported: ``'!s'`` which calls :func:`str`
on the value, and ``'!r'`` which calls :func:`repr`.
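
The effect of bypassing :meth:`__format__` can be seen with a value whose type
defines its own formatting. The following is a minimal sketch, not part of the
module; the sample date and the variable names are chosen only for
illustration:

```python
import datetime

d = datetime.date(2010, 7, 4)

# No conversion: date.__format__ interprets the spec as a strftime pattern.
own = '{0:%Y}'.format(d)

# '!s' converts to str first, so the spec is applied to the string
# '2010-07-04' using the ordinary string alignment rules.
forced = '{0!s:>12}'.format(d)

# '!r' calls repr() on the value before formatting.
shown = '{0!r}'.format('holy grail')

print(own)     # 2010
print(forced)  # '  2010-07-04' without the quotes
print(shown)   # 'holy grail' (quotes included)
```
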

Some examples::

   "Harold's a clever {0!s}"        # Calls str() on the argument first
   "Bring out the holy {name!r}"    # Calls repr() on the argument first

The *format_spec* field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on. Each value type can define its own "formatting
mini-language" or interpretation of the *format_spec*.

Most built-in types support a common formatting mini-language, which is
described in the next section.

A *format_spec* field can also include nested replacement fields within it.
These nested replacement fields may contain a field name, conversion flag
and format specification, but deeper nesting is
not allowed. The replacement fields within the
format_spec are substituted before the *format_spec* string is interpreted.
This allows the formatting of a value to be dynamically specified.

See the :ref:`formatexamples` section for some examples.


.. _formatspec:

Format Specification Mini-Language
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see
:ref:`formatstrings`). They can also be passed directly to the built-in
:func:`format` function. Each formattable type may define how the format
specification is to be interpreted.

Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.

A general convention is that an empty format string (``""``) produces
the same result as if you had called :func:`str` on the value. A
non-empty format string typically modifies the result.

The general form of a *standard format specifier* is:

.. productionlist:: sf
   format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`]
   fill: <any character>
   align: "<" | ">" | "=" | "^"
   sign: "+" | "-" | " "
   width: `integer`
   precision: `integer`
   type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

If a valid *align* value is specified, it can be preceded by a *fill*
character that can be any character and defaults to a space if omitted.
It is not possible to use a literal curly brace ("``{``" or "``}``") as
the *fill* character when using the :meth:`str.format`
method. However, it is possible to insert a curly brace
with a nested replacement field. This limitation doesn't
affect the :func:`format` function.

The meaning of the various alignment options is as follows:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'<'`` | Forces the field to be left-aligned within the available |
   |         | space (this is the default for most objects).            |
   +---------+----------------------------------------------------------+
   | ``'>'`` | Forces the field to be right-aligned within the          |
   |         | available space (this is the default for numbers).       |
   +---------+----------------------------------------------------------+
   | ``'='`` | Forces the padding to be placed after the sign (if any)  |
   |         | but before the digits. This is used for printing fields  |
   |         | in the form '+000000120'. This alignment option is only  |
   |         | valid for numeric types. It becomes the default when '0' |
   |         | immediately precedes the field width.                    |
   +---------+----------------------------------------------------------+
   | ``'^'`` | Forces the field to be centered within the available     |
   |         | space.                                                   |
   +---------+----------------------------------------------------------+

Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.

The *sign* option is only valid for number types, and can be one of the
following:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'+'`` | indicates that a sign should be used for both            |
   |         | positive as well as negative numbers.                    |
   +---------+----------------------------------------------------------+
   | ``'-'`` | indicates that a sign should be used only for negative   |
   |         | numbers (this is the default behavior).                  |
   +---------+----------------------------------------------------------+
   | space   | indicates that a leading space should be used on         |
   |         | positive numbers, and a minus sign on negative numbers.  |
   +---------+----------------------------------------------------------+

The ``'#'`` option is only valid for integers, and only for binary, octal, or
hexadecimal output. If present, it specifies that the output will be prefixed
by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.

The ``','`` option signals the use of a comma for a thousands separator.
For a locale aware separator, use the ``'n'`` integer presentation type
instead.

.. versionchanged:: 2.7
   Added the ``','`` option (see also :pep:`378`).

*width* is a decimal integer defining the minimum field width. If not
specified, then the field width will be determined by the content.

When no explicit alignment is given, preceding the *width* field by a zero
(``'0'``) character enables
sign-aware zero-padding for numeric types. This is equivalent to a *fill*
character of ``'0'`` with an *alignment* type of ``'='``.
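
The interaction of *fill*, *align*, *sign*, ``'#'``, ``','`` and *width* can be
checked with a few one-liners. This is only an illustrative sketch; the sample
numbers and variable names are arbitrary:

```python
n = -42

# A leading '0' before the width pads with zeros after the sign ...
zero_padded = '{0:05d}'.format(n)

# ... which is the same as spelling out fill '0' and alignment '='.
explicit = '{0:0=5d}'.format(n)

# With '<' the fill characters go to the right of the whole value instead.
left_filled = '{0:0<5d}'.format(n)

# '+' forces a sign on positive numbers; ',' adds thousands separators.
with_sign_and_commas = '{0:+,}'.format(1234567)

# '#' adds the base prefix for binary, octal and hexadecimal output.
prefixed_hex = '{0:#x}'.format(255)

print(zero_padded)           # -0042
print(explicit)              # -0042
print(left_filled)           # -4200
print(with_sign_and_commas)  # +1,234,567
print(prefixed_hex)          # 0xff
```
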

The *precision* is a decimal number indicating how many digits should be
displayed after the decimal point for a floating point value formatted with
``'f'`` and ``'F'``, or before and after the decimal point for a floating point
value formatted with ``'g'`` or ``'G'``. For non-number types the field
indicates the maximum field size - in other words, how many characters will be
used from the field content. The *precision* is not allowed for integer values.

Finally, the *type* determines how the data should be presented.

The available string presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'s'`` | String format. This is the default type for strings and  |
   |         | may be omitted.                                          |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'s'``.                                     |
   +---------+----------------------------------------------------------+

The available integer presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'b'`` | Binary format. Outputs the number in base 2.             |
   +---------+----------------------------------------------------------+
   | ``'c'`` | Character. Converts the integer to the corresponding     |
   |         | unicode character before printing.                       |
   +---------+----------------------------------------------------------+
   | ``'d'`` | Decimal Integer. Outputs the number in base 10.          |
   +---------+----------------------------------------------------------+
   | ``'o'`` | Octal format. Outputs the number in base 8.              |
   +---------+----------------------------------------------------------+
   | ``'x'`` | Hex format. Outputs the number in base 16, using         |
   |         | lowercase letters for the digits above 9.                |
   +---------+----------------------------------------------------------+
   | ``'X'`` | Hex format. Outputs the number in base 16, using         |
   |         | uppercase letters for the digits above 9.                |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'d'``.                                     |
   +---------+----------------------------------------------------------+

In addition to the above presentation types, integers can be formatted
with the floating point presentation types listed below (except
``'n'`` and ``None``). When doing so, :func:`float` is used to convert the
integer to a floating point number before formatting.

The available presentation types for floating point and decimal values are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'e'`` | Exponent notation. Prints the number in scientific       |
   |         | notation using the letter 'e' to indicate the exponent.  |
   |         | The default precision is ``6``.                          |
   +---------+----------------------------------------------------------+
   | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an     |
   |         | upper case 'E' as the separator character.               |
   +---------+----------------------------------------------------------+
   | ``'f'`` | Fixed-point notation. Displays the number as a           |
   |         | fixed-point number. The default precision is ``6``.      |
   +---------+----------------------------------------------------------+
   | ``'F'`` | Fixed-point notation. Same as ``'f'``.                   |
   +---------+----------------------------------------------------------+
   | ``'g'`` | General format. For a given precision ``p >= 1``,        |
   |         | this rounds the number to ``p`` significant digits and   |
   |         | then formats the result in either fixed-point format     |
   |         | or in scientific notation, depending on its magnitude.   |
   |         |                                                          |
   |         | The precise rules are as follows: suppose that the       |
   |         | result formatted with presentation type ``'e'`` and      |
   |         | precision ``p-1`` would have exponent ``exp``. Then      |
   |         | if ``-4 <= exp < p``, the number is formatted            |
   |         | with presentation type ``'f'`` and precision             |
   |         | ``p-1-exp``. Otherwise, the number is formatted          |
   |         | with presentation type ``'e'`` and precision ``p-1``.    |
   |         | In both cases insignificant trailing zeros are removed   |
   |         | from the significand, and the decimal point is also      |
   |         | removed if there are no remaining digits following it.   |
   |         |                                                          |
   |         | Positive and negative infinity, positive and negative    |
   |         | zero, and nans, are formatted as ``inf``, ``-inf``,      |
   |         | ``0``, ``-0`` and ``nan`` respectively, regardless of    |
   |         | the precision.                                           |
   |         |                                                          |
   |         | A precision of ``0`` is treated as equivalent to a       |
   |         | precision of ``1``. The default precision is ``6``.      |
   +---------+----------------------------------------------------------+
   | ``'G'`` | General format. Same as ``'g'`` except switches to       |
   |         | ``'E'`` if the number gets too large. The                |
   |         | representations of infinity and NaN are uppercased, too. |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | ``'%'`` | Percentage. Multiplies the number by 100 and displays    |
   |         | in fixed (``'f'``) format, followed by a percent sign.   |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'g'``.                                     |
   +---------+----------------------------------------------------------+



.. _formatexamples:

Format examples
^^^^^^^^^^^^^^^

This section contains examples of the :meth:`str.format` syntax and
comparison with the old ``%``-formatting.

In most of the cases the syntax is similar to the old ``%``-formatting, with the
addition of the ``{}`` and with ``:`` used instead of ``%``.
For example, ``'%03.2f'`` can be translated to ``'{:03.2f}'``.

The new format syntax also supports new and different options, shown in the
following examples.

Accessing arguments by position::

   >>> '{0}, {1}, {2}'.format('a', 'b', 'c')
   'a, b, c'
   >>> '{}, {}, {}'.format('a', 'b', 'c')  # 2.7+ only
   'a, b, c'
   >>> '{2}, {1}, {0}'.format('a', 'b', 'c')
   'c, b, a'
   >>> '{2}, {1}, {0}'.format(*'abc')      # unpacking argument sequence
   'c, b, a'
   >>> '{0}{1}{0}'.format('abra', 'cad')   # arguments' indices can be repeated
   'abracadabra'

Accessing arguments by name::

   >>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
   'Coordinates: 37.24N, -115.81W'
   >>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
   >>> 'Coordinates: {latitude}, {longitude}'.format(**coord)
   'Coordinates: 37.24N, -115.81W'

Accessing arguments' attributes::

   >>> c = 3-5j
   >>> ('The complex number {0} is formed from the real part {0.real} '
   ...  'and the imaginary part {0.imag}.').format(c)
   'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'
   >>> class Point(object):
   ...     def __init__(self, x, y):
   ...         self.x, self.y = x, y
   ...     def __str__(self):
   ...         return 'Point({self.x}, {self.y})'.format(self=self)
   ...
   >>> str(Point(4, 2))
   'Point(4, 2)'


Accessing arguments' items::

   >>> coord = (3, 5)
   >>> 'X: {0[0]};  Y: {0[1]}'.format(coord)
   'X: 3;  Y: 5'

Replacing ``%s`` and ``%r``::

   >>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
   "repr() shows quotes: 'test1'; str() doesn't: test2"

Aligning the text and specifying a width::

   >>> '{:<30}'.format('left aligned')
   'left aligned                  '
   >>> '{:>30}'.format('right aligned')
   '                 right aligned'
   >>> '{:^30}'.format('centered')
   '           centered           '
   >>> '{:*^30}'.format('centered')  # use '*' as a fill char
   '***********centered***********'

Replacing ``%+f``, ``%-f``, and ``% f`` and specifying a sign::

   >>> '{:+f}; {:+f}'.format(3.14, -3.14)  # show it always
   '+3.140000; -3.140000'
   >>> '{: f}; {: f}'.format(3.14, -3.14)  # show a space for positive numbers
   ' 3.140000; -3.140000'
   >>> '{:-f}; {:-f}'.format(3.14, -3.14)  # show only the minus -- same as '{:f}; {:f}'
   '3.140000; -3.140000'

Replacing ``%x`` and ``%o`` and converting the value to different bases::

   >>> # format also supports binary numbers
   >>> "int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42)
   'int: 42;  hex: 2a;  oct: 52;  bin: 101010'
   >>> # with 0x, 0o, or 0b as prefix:
   >>> "int: {0:d};  hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)
   'int: 42;  hex: 0x2a;  oct: 0o52;  bin: 0b101010'

Using the comma as a thousands separator::

   >>> '{:,}'.format(1234567890)
   '1,234,567,890'

Expressing a percentage::

   >>> points = 19.5
   >>> total = 22
   >>> 'Correct answers: {:.2%}'.format(points/total)
   'Correct answers: 88.64%'

Using type-specific formatting::

   >>> import datetime
   >>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
   >>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
   '2010-07-04 12:15:58'

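Type-specific formatting works because :meth:`str.format` hands the
*format_spec* to the value's :meth:`__format__` method. A user-defined class
can take part in the same way. The ``Vector`` class below is hypothetical and
exists only for this sketch; its convention of applying the spec to each
component is an assumption, not something the module prescribes:

```python
class Vector(object):
    """Hypothetical 2-D value that forwards the format spec to its parts."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __format__(self, format_spec):
        # Delegate to the built-in format() for each component, so any
        # float format spec (or an empty one) is accepted.
        return '({0}, {1})'.format(format(self.x, format_spec),
                                   format(self.y, format_spec))


v = Vector(1.5, -2.0)
two_places = '{0:.2f}'.format(v)
signed = '{0:+.1f}'.format(v)
default = '{0}'.format(v)

print(two_places)  # (1.50, -2.00)
print(signed)      # (+1.5, -2.0)
print(default)     # (1.5, -2.0)
```
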
Nesting arguments and more complex examples::

   >>> for align, text in zip('<^>', ['left', 'center', 'right']):
   ...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
   ...
   'left<<<<<<<<<<<<'
   '^^^^^center^^^^^'
   '>>>>>>>>>>>right'
   >>>
   >>> octets = [192, 168, 0, 1]
   >>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)
   'C0A80001'
   >>> int(_, 16)
   3232235521
   >>>
   >>> width = 5
   >>> for num in range(5,12):
   ...     for base in 'dXob':
   ...         print '{0:{width}{base}}'.format(num, base=base, width=width),
   ...     print
   ...
       5     5     5   101
       6     6     6   110
       7     7     7   111
       8     8    10  1000
       9     9    11  1001
      10     A    12  1010
      11     B    13  1011



Template strings
----------------

.. versionadded:: 2.4

Templates provide simpler string substitutions as described in :pep:`292`.
Instead of the normal ``%``\ -based substitutions, Templates support ``$``\
-based substitutions, using the following rules:

* ``$$`` is an escape; it is replaced with a single ``$``.

* ``$identifier`` names a substitution placeholder matching a mapping key of
  ``"identifier"``. By default, ``"identifier"`` must spell a Python
  identifier. The first non-identifier character after the ``$`` character
  terminates this placeholder specification.

* ``${identifier}`` is equivalent to ``$identifier``. It is required when valid
  identifier characters follow the placeholder but are not part of the
  placeholder, such as ``"${noun}ification"``.

Any other appearance of ``$`` in the string will result in a :exc:`ValueError`
being raised.

The :mod:`string` module provides a :class:`Template` class that implements
these rules. The methods of :class:`Template` are:


.. class:: Template(template)

   The constructor takes a single argument which is the template string.


   .. method:: substitute(mapping[, **kws])

      Performs the template substitution, returning a new string. *mapping* is
      any dictionary-like object with keys that match the placeholders in the
      template. Alternatively, you can provide keyword arguments, where the
      keywords are the placeholders. When both *mapping* and *kws* are given
      and there are duplicates, the placeholders from *kws* take precedence.


   .. method:: safe_substitute(mapping[, **kws])

      Like :meth:`substitute`, except that if placeholders are missing from
      *mapping* and *kws*, instead of raising a :exc:`KeyError` exception, the
      original placeholder will appear in the resulting string intact. Also,
      unlike with :meth:`substitute`, any other appearances of the ``$`` will
      simply return ``$`` instead of raising :exc:`ValueError`.

      While other exceptions may still occur, this method is called "safe"
      because substitution always tries to return a usable string instead of
      raising an exception. In another sense, :meth:`safe_substitute` may be
      anything other than safe, since it will silently ignore malformed
      templates containing dangling delimiters, unmatched braces, or
      placeholders that are not valid Python identifiers.

   :class:`Template` instances also provide one public data attribute:

   .. attribute:: template

      This is the object passed to the constructor's *template* argument. In
      general, you shouldn't change it, but read-only access is not enforced.

Here is an example of how to use a Template::

   >>> from string import Template
   >>> s = Template('$who likes $what')
   >>> s.substitute(who='tim', what='kung pao')
   'tim likes kung pao'
   >>> d = dict(who='tim')
   >>> Template('Give $who $100').substitute(d)
   Traceback (most recent call last):
   ...
   ValueError: Invalid placeholder in string: line 1, col 11
   >>> Template('$who likes $what').substitute(d)
   Traceback (most recent call last):
   ...
   KeyError: 'what'
   >>> Template('$who likes $what').safe_substitute(d)
   'tim likes $what'

Advanced usage: you can derive subclasses of :class:`Template` to customize the
placeholder syntax, delimiter character, or the entire regular expression used
to parse template strings. To do this, you can override these class attributes:

* *delimiter* -- This is the literal string describing a placeholder introducing
  delimiter. The default value is ``$``. Note that this should *not* be a
  regular expression, as the implementation will call :meth:`re.escape` on this
  string as needed.

* *idpattern* -- This is the regular expression describing the pattern for
  non-braced placeholders (the braces will be added automatically as
  appropriate). The default value is the regular expression
  ``[_a-z][_a-z0-9]*``.

Alternatively, you can provide the entire regular expression pattern by
overriding the class attribute *pattern*. If you do this, the value must be a
regular expression object with four named capturing groups. The capturing
groups correspond to the rules given above, along with the invalid placeholder
rule:

* *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the
  default pattern.

* *named* -- This group matches the unbraced placeholder name; it should not
  include the delimiter in capturing group.

* *braced* -- This group matches the brace enclosed placeholder name; it should
  not include either the delimiter or braces in the capturing group.

* *invalid* -- This group matches any other delimiter pattern (usually a single
  delimiter), and it should appear last in the regular expression.
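
The simplest of these customizations is a different *delimiter*. The sketch
below is illustrative only; the ``PercentTemplate`` name and the sample
template text are made up for the example:

```python
from string import Template


class PercentTemplate(Template):
    """Hypothetical subclass that uses '%' instead of '$'."""
    delimiter = '%'


t = PercentTemplate('%who owes %{amount}USD; 100%% sure')

# Named, braced and escaped forms all follow the new delimiter.
result = t.substitute(who='tim', amount=5)

# safe_substitute() leaves placeholders without a value untouched.
partial = t.safe_substitute(who='tim')

print(result)   # tim owes 5USD; 100% sure
print(partial)  # tim owes %{amount}USD; 100% sure
```

Overriding only *delimiter* keeps the default *idpattern*, so the placeholder
names themselves still have to look like Python identifiers.
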

String functions
----------------

The following functions are available to operate on string and Unicode objects.
They are not available as string methods.


.. function:: capwords(s[, sep])

   Split the argument into words using :meth:`str.split`, capitalize each word
   using :meth:`str.capitalize`, and join the capitalized words using
   :meth:`str.join`. If the optional second argument *sep* is absent
   or ``None``, runs of whitespace characters are replaced by a single space
   and leading and trailing whitespace is removed; otherwise *sep* is used to
   split and join the words.


.. function:: maketrans(from, to)

   Return a translation table suitable for passing to :func:`translate`, that will
   map each character in *from* into the character at the same position in *to*;
   *from* and *to* must have the same length.

   .. note::

      Don't use strings derived from :const:`lowercase` and :const:`uppercase` as
      arguments; in some locales, these don't have the same length. For case
      conversions, always use :meth:`str.lower` and :meth:`str.upper`.


Deprecated string functions
---------------------------

The following functions are also defined as methods of string and
Unicode objects; see section :ref:`string-methods` for more information on
those. You should consider these functions as deprecated, although they will
not be removed until Python 3. The functions defined in this module are:


.. function:: atof(s)

   .. deprecated:: 2.0
      Use the :func:`float` built-in function.

   .. index:: builtin: float

   Convert a string to a floating point number. The string must have the standard
   syntax for a floating point literal in Python, optionally preceded by a sign
   (``+`` or ``-``). Note that this behaves identically to the built-in function
   :func:`float` when passed a string.

   .. note::

      .. index::
         single: NaN
         single: Infinity

      When passing in a string, values for NaN and Infinity may be returned, depending
      on the underlying C library. The specific set of strings accepted which cause
      these values to be returned depends entirely on the C library and is known to
      vary.


.. function:: atoi(s[, base])

   .. deprecated:: 2.0
      Use the :func:`int` built-in function.

   .. index:: builtin: eval

   Convert string *s* to an integer in the given *base*. The string must consist
   of one or more digits, optionally preceded by a sign (``+`` or ``-``). The
   *base* defaults to 10. If it is 0, a default base is chosen depending on the
   leading characters of the string (after stripping the sign): ``0x`` or ``0X``
   means 16, ``0`` means 8, anything else means 10. If *base* is 16, a leading
   ``0x`` or ``0X`` is always accepted, though not required. This behaves
   identically to the built-in function :func:`int` when passed a string. (Also
   note: for a more flexible interpretation of numeric literals, use the built-in
   function :func:`eval`.)


.. function:: atol(s[, base])

   .. deprecated:: 2.0
      Use the :func:`long` built-in function.

   .. index:: builtin: long

   Convert string *s* to a long integer in the given *base*. The string must
   consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
   The *base* argument has the same meaning as for :func:`atoi`. A trailing ``l``
   or ``L`` is not allowed, except if the base is 0. Note that when invoked
   without *base* or with *base* set to 10, this behaves identically to the built-in
   function :func:`long` when passed a string.


.. function:: capitalize(word)

   Return a copy of *word* with only its first character capitalized.


.. function:: expandtabs(s[, tabsize])

   Expand tabs in a string replacing them by one or more spaces, depending on the
   current column and the given tab size. The column number is reset to zero after
   each newline occurring in the string. This doesn't understand other non-printing
   characters or escape sequences. The tab size defaults to 8.


.. function:: find(s, sub[, start[, end]])

   Return the lowest index in *s* where the substring *sub* is found such that
   *sub* is wholly contained in ``s[start:end]``. Return ``-1`` on failure.
   Defaults for *start* and *end* and interpretation of negative values are the same
   as for slices.


.. function:: rfind(s, sub[, start[, end]])

   Like :func:`find` but find the highest index.


.. function:: index(s, sub[, start[, end]])

   Like :func:`find` but raise :exc:`ValueError` when the substring is not found.


.. function:: rindex(s, sub[, start[, end]])

   Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found.


.. function:: count(s, sub[, start[, end]])

   Return the number of (non-overlapping) occurrences of substring *sub* in string
   ``s[start:end]``. Defaults for *start* and *end* and interpretation of negative
   values are the same as for slices.


.. function:: lower(s)

   Return a copy of *s*, but with upper case letters converted to lower case.


.. function:: split(s[, sep[, maxsplit]])

   Return a list of the words of the string *s*. If the optional second argument
   *sep* is absent or ``None``, the words are separated by arbitrary strings of
   whitespace characters (space, tab, newline, return, formfeed). If the second
   argument *sep* is present and not ``None``, it specifies a string to be used as
   the word separator. The returned list will then have one more item than the
   number of non-overlapping occurrences of the separator in the string.
   If *maxsplit* is given, at most *maxsplit* number of splits occur, and the
   remainder of the string is returned as the final element of the list (thus,
   the list will have at most ``maxsplit+1`` elements). If *maxsplit* is not
   specified or ``-1``, then there is no limit on the number of splits (all
   possible splits are made).

   The behavior of split on an empty string depends on the value of *sep*. If *sep*
   is not specified, or specified as ``None``, the result will be an empty list.
   If *sep* is specified as any string, the result will be a list containing one
   element which is an empty string.


.. function:: rsplit(s[, sep[, maxsplit]])

   Return a list of the words of the string *s*, scanning *s* from the end. To all
   intents and purposes, the resulting list of words is the same as returned by
   :func:`split`, except when the optional third argument *maxsplit* is explicitly
   specified and nonzero. If *maxsplit* is given, at most *maxsplit* number of
   splits -- the *rightmost* ones -- occur, and the remainder of the string is
   returned as the first element of the list (thus, the list will have at most
   ``maxsplit+1`` elements).

   .. versionadded:: 2.4


.. function:: splitfields(s[, sep[, maxsplit]])

   This function behaves identically to :func:`split`. (In the past, :func:`split`
   was only used with one argument, while :func:`splitfields` was only used with
   two arguments.)


.. function:: join(words[, sep])

   Concatenate a list or tuple of words with intervening occurrences of *sep*.
   The default value for *sep* is a single space character. It is always true that
   ``string.join(string.split(s, sep), sep)`` equals *s*.


.. function:: joinfields(words[, sep])

   This function behaves identically to :func:`join`. (In the past, :func:`join`
   was only used with one argument, while :func:`joinfields` was only used with two
   arguments.)
   Note that there is no :meth:`joinfields` method on string objects;
   use the :meth:`join` method instead.


.. function:: lstrip(s[, chars])

   Return a copy of the string with leading characters removed. If *chars* is
   omitted or ``None``, whitespace characters are removed. If given and not
   ``None``, *chars* must be a string; the characters in the string will be
   stripped from the beginning of the string this method is called on.

   .. versionchanged:: 2.2.3
      The *chars* parameter was added. The *chars* parameter cannot be passed in
      earlier 2.2 versions.


.. function:: rstrip(s[, chars])

   Return a copy of the string with trailing characters removed. If *chars* is
   omitted or ``None``, whitespace characters are removed. If given and not
   ``None``, *chars* must be a string; the characters in the string will be
   stripped from the end of the string this method is called on.

   .. versionchanged:: 2.2.3
      The *chars* parameter was added. The *chars* parameter cannot be passed in
      earlier 2.2 versions.


.. function:: strip(s[, chars])

   Return a copy of the string with leading and trailing characters removed. If
   *chars* is omitted or ``None``, whitespace characters are removed. If given and
   not ``None``, *chars* must be a string; the characters in the string will be
   stripped from both ends of the string this method is called on.

   .. versionchanged:: 2.2.3
      The *chars* parameter was added. The *chars* parameter cannot be passed in
      earlier 2.2 versions.


.. function:: swapcase(s)

   Return a copy of *s*, but with lower case letters converted to upper case and
   vice versa.


.. function:: translate(s, table[, deletechars])

   Delete all characters from *s* that are in *deletechars* (if present), and then
   translate the characters using *table*, which must be a 256-character string
   giving the translation for each character value, indexed by its ordinal. If
   *table* is ``None``, then only the character deletion step is performed.


.. function:: upper(s)

   Return a copy of *s*, but with lower case letters converted to upper case.


.. function:: ljust(s, width[, fillchar])
              rjust(s, width[, fillchar])
              center(s, width[, fillchar])

   These functions respectively left-justify, right-justify and center a string in
   a field of given width. They return a string that is at least *width*
   characters wide, created by padding the string *s* with the character *fillchar*
   (default is a space) until the given width on the right, left or both sides.
   The string is never truncated.


.. function:: zfill(s, width)

   Pad a numeric string *s* on the left with zero digits until the
   given *width* is reached. Strings starting with a sign are handled
   correctly.


.. function:: replace(s, old, new[, maxreplace])

   Return a copy of string *s* with all occurrences of substring *old* replaced
   by *new*. If the optional argument *maxreplace* is given, the first
   *maxreplace* occurrences are replaced.

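
Since the functions in this section are deprecated in favor of string methods,
the sketch below shows the method form of a few of them, with the corresponding
module-level call noted in a comment. The sample strings and variable names are
made up for illustration.

```python
s = '  spam, eggs and spam  '

# Method equivalents of some deprecated string-module functions.
stripped = s.strip()                            # string.strip(s)
words = stripped.split(', ')                    # string.split(stripped, ', ')
position = stripped.find('eggs')                # string.find(stripped, 'eggs')
missing = stripped.find('bacon')                # -1 when the substring is absent
replaced = stripped.replace('spam', 'ham', 1)   # string.replace(stripped, 'spam', 'ham', 1)
padded = '-42'.zfill(6)                         # string.zfill('-42', 6)
centered = 'ab'.center(6, '*')                  # string.center('ab', 6, '*')
number = int('0x1f', 16)                        # string.atoi('0x1f', 16)

print(words)
print(replaced)
```

The method forms take the string as the object they are called on rather than
as the first argument; otherwise the arguments keep the same order.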