:mod:`urllib.parse` --- Parse URLs into components
==================================================

.. module:: urllib.parse
   :synopsis: Parse URLs into or assemble them from components.

**Source code:** :source:`Lib/urllib/parse.py`

.. index::
   single: WWW
   single: World Wide Web
   single: URL
   pair: URL; parsing
   pair: relative; URL

--------------

This module defines a standard interface to break Uniform Resource Locator (URL)
strings up in components (addressing scheme, network location, path etc.), to
combine the components back into a URL string, and to convert a "relative URL"
to an absolute URL given a "base URL."

The module has been designed to match the internet RFC on Relative Uniform
Resource Locators. It supports the following URL schemes: ``file``, ``ftp``,
``gopher``, ``hdl``, ``http``, ``https``, ``imap``, ``mailto``, ``mms``,
``news``, ``nntp``, ``prospero``, ``rsync``, ``rtsp``, ``rtspu``, ``sftp``,
``shttp``, ``sip``, ``sips``, ``snews``, ``svn``, ``svn+ssh``, ``telnet``,
``wais``, ``ws``, ``wss``.

The :mod:`urllib.parse` module defines functions that fall into two broad
categories: URL parsing and URL quoting. These are covered in detail in
the following sections.

URL Parsing
-----------

The URL parsing functions focus on splitting a URL string into its components,
or on combining URL components into a URL string.

.. function:: urlparse(urlstring, scheme='', allow_fragments=True)

   Parse a URL into six components, returning a 6-item :term:`named tuple`. This
   corresponds to the general structure of a URL:
   ``scheme://netloc/path;parameters?query#fragment``.
   Each tuple item is a string, possibly empty. The components are not broken up
   into smaller parts (for example, the network location is a single string), and %
   escapes are not expanded.
   The delimiters as shown above are not part of the
   result, except for a leading slash in the *path* component, which is retained if
   present. For example:

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> urlparse("scheme://netloc/path;parameters?query#fragment")
      ParseResult(scheme='scheme', netloc='netloc', path='/path;parameters', params='',
                  query='query', fragment='fragment')
      >>> o = urlparse("http://docs.python.org:80/3/library/urllib.parse.html?"
      ...              "highlight=params#url-parsing")
      >>> o
      ParseResult(scheme='http', netloc='docs.python.org:80',
                  path='/3/library/urllib.parse.html', params='',
                  query='highlight=params', fragment='url-parsing')
      >>> o.scheme
      'http'
      >>> o.netloc
      'docs.python.org:80'
      >>> o.hostname
      'docs.python.org'
      >>> o.port
      80
      >>> o._replace(fragment="").geturl()
      'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params'

   Following the syntax specifications in :rfc:`1808`, urlparse recognizes
   a netloc only if it is properly introduced by '//'. Otherwise the
   input is presumed to be a relative URL and thus to start with
   a path component.

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
      ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> urlparse('www.cwi.nl/%7Eguido/Python.html')
      ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> urlparse('help/Python.html')
      ParseResult(scheme='', netloc='', path='help/Python.html', params='',
                  query='', fragment='')

   The *scheme* argument gives the default addressing scheme, to be
   used only if the URL does not specify one.
   It should be the same type
   (text or bytes) as *urlstring*, except that the default value ``''`` is
   always allowed, and is automatically converted to ``b''`` if appropriate.

   If the *allow_fragments* argument is false, fragment identifiers are not
   recognized. Instead, they are parsed as part of the path, parameters
   or query component, and :attr:`fragment` is set to the empty string in
   the return value.

   The return value is a :term:`named tuple`, which means that its items can
   be accessed by index or as named attributes, which are:

   +------------------+-------+-------------------------+------------------------+
   | Attribute        | Index | Value                   | Value if not present   |
   +==================+=======+=========================+========================+
   | :attr:`scheme`   | 0     | URL scheme specifier    | *scheme* parameter     |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`netloc`   | 1     | Network location part   | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`path`     | 2     | Hierarchical path       | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`params`   | 3     | Parameters for last     | empty string           |
   |                  |       | path element            |                        |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`query`    | 4     | Query component         | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`fragment` | 5     | Fragment identifier     | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`username` |       | User name               | :const:`None`          |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`password` |       | Password                | :const:`None`          |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`hostname` |       | Host name (lower case)  | :const:`None`          |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`port`     |       | Port number as integer, | :const:`None`          |
   |                  |       | if present              |                        |
   +------------------+-------+-------------------------+------------------------+

   Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
   an invalid port is specified in the URL. See section
   :ref:`urlparse-result-object` for more information on the result object.

   Unmatched square brackets in the :attr:`netloc` attribute will raise a
   :exc:`ValueError`.

   Characters in the :attr:`netloc` attribute that decompose under NFKC
   normalization (as used by the IDNA encoding) into any of ``/``, ``?``,
   ``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
   decomposed before parsing, no error will be raised.

   As is the case with all named tuples, the subclass has a few additional methods
   and attributes that are particularly useful. One such method is :meth:`_replace`.
   The :meth:`_replace` method will return a new ParseResult object replacing specified
   fields with new values.

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> u = urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
      >>> u
      ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> u._replace(scheme='http')
      ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')


   .. versionchanged:: 3.2
      Added IPv6 URL parsing capabilities.

   .. versionchanged:: 3.3
      The fragment is now parsed for all URL schemes (unless *allow_fragments* is
      false), in accordance with :rfc:`3986`. Previously, an allowlist of
      schemes that support fragments existed.

   .. versionchanged:: 3.6
      Out-of-range port numbers now raise :exc:`ValueError`, instead of
      returning :const:`None`.

   .. versionchanged:: 3.8
      Characters that affect netloc parsing under NFKC normalization will
      now raise :exc:`ValueError`.


.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`). Data are returned as a
   dictionary. The dictionary keys are the unique query variable names and the
   values are lists of values for each name.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in percent-encoded queries should be treated as blank strings. A true value
   indicates that blanks should be retained as blank strings. The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors. If false (the default), errors are silently ignored. If true,
   errors raise a :exc:`ValueError` exception.

   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   The optional argument *max_num_fields* is the maximum number of fields to
   read. If set, a :exc:`ValueError` is raised when more than
   *max_num_fields* fields are read.

   The optional argument *separator* is the symbol to use for separating the
   query arguments. It defaults to ``&``.

   Use the :func:`urllib.parse.urlencode` function (with the ``doseq``
   parameter set to ``True``) to convert such dictionaries into query
   strings.


   .. versionchanged:: 3.2
      Added *encoding* and *errors* parameters.
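
   The parameters described for :func:`parse_qs` can be illustrated with a
   short sketch. The query string and field names are made up for this
   example; the last step uses :func:`urlencode` with *doseq* set to true to
   turn the dictionary back into a query string.

   .. code-block:: python

      from urllib.parse import parse_qs, urlencode

      # Hypothetical query string: one repeated key and one blank value.
      query = "name=Ada&lang=en&lang=fr&empty="

      # By default, blank values are dropped from the result.
      fields = parse_qs(query)

      # keep_blank_values=True retains them as empty strings.
      fields_with_blanks = parse_qs(query, keep_blank_values=True)

      # doseq=True expands each list into repeated key=value pairs.
      round_trip = urlencode(fields_with_blanks, doseq=True)

   Here ``fields`` has no ``'empty'`` key, while ``fields_with_blanks`` maps it
   to ``['']``, and ``round_trip`` reproduces the input string.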

   .. versionchanged:: 3.8
      Added *max_num_fields* parameter.

   .. versionchanged:: 3.10
      Added *separator* parameter with the default value of ``&``. Python
      versions earlier than Python 3.10 allowed using both ``;`` and ``&`` as
      query parameter separator. This has been changed to allow only a single
      separator key, with ``&`` as the default separator.


.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of
   name, value pairs.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in percent-encoded queries should be treated as blank strings. A true value
   indicates that blanks should be retained as blank strings. The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors. If false (the default), errors are silently ignored. If true,
   errors raise a :exc:`ValueError` exception.

   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   The optional argument *max_num_fields* is the maximum number of fields to
   read. If set, a :exc:`ValueError` is raised when more than
   *max_num_fields* fields are read.

   The optional argument *separator* is the symbol to use for separating the
   query arguments. It defaults to ``&``.

   Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into
   query strings.

   .. versionchanged:: 3.2
      Added *encoding* and *errors* parameters.

   .. versionchanged:: 3.8
      Added *max_num_fields* parameter.

   .. versionchanged:: 3.10
      Added *separator* parameter with the default value of ``&``. Python
      versions earlier than Python 3.10 allowed using both ``;`` and ``&`` as
      query parameter separator. This has been changed to allow only a single
      separator key, with ``&`` as the default separator.


.. function:: urlunparse(parts)

   Construct a URL from a tuple as returned by ``urlparse()``. The *parts*
   argument can be any six-item iterable. This may result in a slightly
   different, but equivalent URL, if the URL that was parsed originally had
   unnecessary delimiters (for example, a ``?`` with an empty query; the RFC
   states that these are equivalent).


.. function:: urlsplit(urlstring, scheme='', allow_fragments=True)

   This is similar to :func:`urlparse`, but does not split the params from the URL.
   This should generally be used instead of :func:`urlparse` if the more recent URL
   syntax allowing parameters to be applied to each segment of the *path* portion
   of the URL (see :rfc:`2396`) is wanted. A separate function is needed to
   separate the path segments and parameters. This function returns a 5-item
   :term:`named tuple`::

      (addressing scheme, network location, path, query, fragment identifier).
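
   The difference from :func:`urlparse` can be seen in a minimal sketch. The
   URL is a made-up example with parameters on two path segments.

   .. code-block:: python

      from urllib.parse import urlparse, urlsplit

      # Hypothetical URL with ";parameters" on two different path segments.
      url = "http://example.com/a;x=1/b;y=2?q=1#top"

      split = urlsplit(url)
      parsed = urlparse(url)

      # urlsplit leaves every ";parameter" inside the path.
      split_path = split.path

      # urlparse separates only the parameters of the last path segment.
      parsed_path = parsed.path
      parsed_params = parsed.params

   With this input, ``split_path`` is ``'/a;x=1/b;y=2'``, whereas
   ``parsed_path`` is ``'/a;x=1/b'`` and ``parsed_params`` is ``'y=2'``.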

   The return value is a :term:`named tuple`; its items can be accessed by index
   or as named attributes:

   +------------------+-------+-------------------------+----------------------+
   | Attribute        | Index | Value                   | Value if not present |
   +==================+=======+=========================+======================+
   | :attr:`scheme`   | 0     | URL scheme specifier    | *scheme* parameter   |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`netloc`   | 1     | Network location part   | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`path`     | 2     | Hierarchical path       | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`query`    | 3     | Query component         | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`fragment` | 4     | Fragment identifier     | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`username` |       | User name               | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`password` |       | Password                | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`hostname` |       | Host name (lower case)  | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`port`     |       | Port number as integer, | :const:`None`        |
   |                  |       | if present              |                      |
   +------------------+-------+-------------------------+----------------------+

   Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
   an invalid port is specified in the URL. See section
   :ref:`urlparse-result-object` for more information on the result object.

   Unmatched square brackets in the :attr:`netloc` attribute will raise a
   :exc:`ValueError`.
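
   Because these errors can surface either while parsing or only when
   :attr:`port` is read, callers handling untrusted input may want to guard
   both steps. The following is a sketch only; ``safe_port`` is a hypothetical
   helper, not part of this module.

   .. code-block:: python

      from urllib.parse import urlsplit

      def safe_port(url):
          """Return the port of *url*, or None if it is missing or invalid.

          Hypothetical helper for illustration; not part of urllib.parse.
          """
          try:
              # urlsplit() itself raises for unmatched square brackets;
              # reading .port raises for an invalid port.
              return urlsplit(url).port
          except ValueError:
              return None

   For example, ``safe_port('http://example.com:8080/')`` returns ``8080``,
   while a URL with an out-of-range port or an unmatched ``[`` in the network
   location yields ``None``.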

   Characters in the :attr:`netloc` attribute that decompose under NFKC
   normalization (as used by the IDNA encoding) into any of ``/``, ``?``,
   ``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
   decomposed before parsing, no error will be raised.

   Following the `WHATWG spec`_ that updates RFC 3986, ASCII newline
   ``\n``, ``\r`` and tab ``\t`` characters are stripped from the URL.

   .. versionchanged:: 3.6
      Out-of-range port numbers now raise :exc:`ValueError`, instead of
      returning :const:`None`.

   .. versionchanged:: 3.8
      Characters that affect netloc parsing under NFKC normalization will
      now raise :exc:`ValueError`.

   .. versionchanged:: 3.10
      ASCII newline and tab characters are stripped from the URL.

.. _WHATWG spec: https://url.spec.whatwg.org/#concept-basic-url-parser

.. function:: urlunsplit(parts)

   Combine the elements of a tuple as returned by :func:`urlsplit` into a
   complete URL as a string. The *parts* argument can be any five-item
   iterable. This may result in a slightly different, but equivalent URL, if the
   URL that was parsed originally had unnecessary delimiters (for example, a ``?``
   with an empty query; the RFC states that these are equivalent).


.. function:: urljoin(base, url, allow_fragments=True)

   Construct a full ("absolute") URL by combining a "base URL" (*base*) with
   another URL (*url*). Informally, this uses components of the base URL, in
   particular the addressing scheme, the network location and (part of) the
   path, to provide missing components in the relative URL. For example:

      >>> from urllib.parse import urljoin
      >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
      'http://www.cwi.nl/%7Eguido/FAQ.html'

   The *allow_fragments* argument has the same meaning and default as for
   :func:`urlparse`.

   .. note::

      If *url* is an absolute URL (that is, it starts with ``//`` or ``scheme://``),
      the *url*'s hostname and/or scheme will be present in the result. For example:

      .. doctest::

         >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',
         ...         '//www.python.org/%7Eguido')
         'http://www.python.org/%7Eguido'

      If you do not want that behavior, preprocess the *url* with :func:`urlsplit` and
      :func:`urlunsplit`, removing possible *scheme* and *netloc* parts.


   .. versionchanged:: 3.5

      Behavior updated to match the semantics defined in :rfc:`3986`.


.. function:: urldefrag(url)

   If *url* contains a fragment identifier, return a modified version of *url*
   with no fragment identifier, and the fragment identifier as a separate
   string. If there is no fragment identifier in *url*, return *url* unmodified
   and an empty string.

   The return value is a :term:`named tuple`; its items can be accessed by index
   or as named attributes:

   +------------------+-------+-------------------------+----------------------+
   | Attribute        | Index | Value                   | Value if not present |
   +==================+=======+=========================+======================+
   | :attr:`url`      | 0     | URL with no fragment    | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`fragment` | 1     | Fragment identifier     | empty string         |
   +------------------+-------+-------------------------+----------------------+

   See section :ref:`urlparse-result-object` for more information on the result
   object.

   .. versionchanged:: 3.2
      Result is a structured object rather than a simple 2-tuple.

.. function:: unwrap(url)

   Extract the URL from a wrapped URL (that is, a string formatted as
   ``<URL:scheme://host/path>``, ``<scheme://host/path>``, ``URL:scheme://host/path``
   or ``scheme://host/path``).
   If *url* is not a wrapped URL, it is returned
   without changes.

.. _parsing-ascii-encoded-bytes:

Parsing ASCII Encoded Bytes
---------------------------

The URL parsing functions were originally designed to operate on character
strings only. In practice, it is useful to be able to manipulate properly
quoted and encoded URLs as sequences of ASCII bytes. Accordingly, the
URL parsing functions in this module all operate on :class:`bytes` and
:class:`bytearray` objects in addition to :class:`str` objects.

If :class:`str` data is passed in, the result will also contain only
:class:`str` data. If :class:`bytes` or :class:`bytearray` data is
passed in, the result will contain only :class:`bytes` data.

Attempting to mix :class:`str` data with :class:`bytes` or
:class:`bytearray` in a single function call will result in a
:exc:`TypeError` being raised, while attempting to pass in non-ASCII
byte values will trigger :exc:`UnicodeDecodeError`.

To support easier conversion of result objects between :class:`str` and
:class:`bytes`, all return values from URL parsing functions provide
either an :meth:`encode` method (when the result contains :class:`str`
data) or a :meth:`decode` method (when the result contains :class:`bytes`
data). The signatures of these methods match those of the corresponding
:class:`str` and :class:`bytes` methods (except that the default encoding
is ``'ascii'`` rather than ``'utf-8'``). Each produces a value of a
corresponding type that contains either :class:`bytes` data (for
:meth:`encode` methods) or :class:`str` data (for
:meth:`decode` methods).

Applications that need to operate on potentially improperly quoted URLs
that may contain non-ASCII data will need to do their own decoding from
bytes to characters before invoking the URL parsing methods.

The behaviour described in this section applies only to the URL parsing
functions.
The URL quoting functions use their own rules when producing
or consuming byte sequences as detailed in the documentation of the
individual URL quoting functions.

.. versionchanged:: 3.2
   URL parsing functions now accept ASCII encoded byte sequences.


.. _urlparse-result-object:

Structured Parse Results
------------------------

The result objects from the :func:`urlparse`, :func:`urlsplit` and
:func:`urldefrag` functions are subclasses of the :class:`tuple` type.
These subclasses add the attributes listed in the documentation for
those functions, the encoding and decoding support described in the
previous section, as well as an additional method:

.. method:: urllib.parse.SplitResult.geturl()

   Return the re-combined version of the original URL as a string. This may
   differ from the original URL in that the scheme may be normalized to lower
   case and empty components may be dropped. Specifically, empty parameters,
   queries, and fragment identifiers will be removed.

   For :func:`urldefrag` results, only empty fragment identifiers will be removed.
   For :func:`urlsplit` and :func:`urlparse` results, all noted changes will be
   made to the URL returned by this method.

   The result of this method remains unchanged if passed back through the original
   parsing function:

      >>> from urllib.parse import urlsplit
      >>> url = 'HTTP://www.Python.org/doc/#'
      >>> r1 = urlsplit(url)
      >>> r1.geturl()
      'http://www.Python.org/doc/'
      >>> r2 = urlsplit(r1.geturl())
      >>> r2.geturl()
      'http://www.Python.org/doc/'


The following classes provide the implementations of the structured parse
results when operating on :class:`str` objects:

.. class:: DefragResult(url, fragment)

   Concrete class for :func:`urldefrag` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`DefragResultBytes`
   instance.

   .. versionadded:: 3.2

.. class:: ParseResult(scheme, netloc, path, params, query, fragment)

   Concrete class for :func:`urlparse` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`ParseResultBytes`
   instance.

.. class:: SplitResult(scheme, netloc, path, query, fragment)

   Concrete class for :func:`urlsplit` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`SplitResultBytes`
   instance.


The following classes provide the implementations of the parse results when
operating on :class:`bytes` or :class:`bytearray` objects:

.. class:: DefragResultBytes(url, fragment)

   Concrete class for :func:`urldefrag` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`DefragResult`
   instance.

   .. versionadded:: 3.2

.. class:: ParseResultBytes(scheme, netloc, path, params, query, fragment)

   Concrete class for :func:`urlparse` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`ParseResult`
   instance.

   .. versionadded:: 3.2

.. class:: SplitResultBytes(scheme, netloc, path, query, fragment)

   Concrete class for :func:`urlsplit` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`SplitResult`
   instance.

   .. versionadded:: 3.2


URL Quoting
-----------

The URL quoting functions focus on taking program data and making it safe
for use as URL components by quoting special characters and appropriately
encoding non-ASCII text. They also support reversing these operations to
recreate the original data from the contents of a URL component if that
task isn't already covered by the URL parsing functions above.

.. function:: quote(string, safe='/', encoding=None, errors=None)

   Replace special characters in *string* using the ``%xx`` escape.
   Letters,
   digits, and the characters ``'_.-~'`` are never quoted. By default, this
   function is intended for quoting the path section of a URL. The optional
   *safe* parameter specifies additional ASCII characters that should not be
   quoted --- its default value is ``'/'``.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   .. versionchanged:: 3.7
      Moved from :rfc:`2396` to :rfc:`3986` for quoting URL strings. "~" is now
      included in the set of unreserved characters.

   The optional *encoding* and *errors* parameters specify how to deal with
   non-ASCII characters, as accepted by the :meth:`str.encode` method.
   *encoding* defaults to ``'utf-8'``.
   *errors* defaults to ``'strict'``, meaning unsupported characters raise a
   :class:`UnicodeEncodeError`.
   *encoding* and *errors* must not be supplied if *string* is a
   :class:`bytes`, or a :class:`TypeError` is raised.

   Note that ``quote(string, safe, encoding, errors)`` is equivalent to
   ``quote_from_bytes(string.encode(encoding, errors), safe)``.

   Example: ``quote('/El Niño/')`` yields ``'/El%20Ni%C3%B1o/'``.


.. function:: quote_plus(string, safe='', encoding=None, errors=None)

   Like :func:`quote`, but also replace spaces with plus signs, as required for
   quoting HTML form values when building up a query string to go into a URL.
   Plus signs in the original string are escaped unless they are included in
   *safe*. It also does not have *safe* default to ``'/'``.

   Example: ``quote_plus('/El Niño/')`` yields ``'%2FEl+Ni%C3%B1o%2F'``.


.. function:: quote_from_bytes(bytes, safe='/')

   Like :func:`quote`, but accepts a :class:`bytes` object rather than a
   :class:`str`, and does not perform string-to-bytes encoding.

   Example: ``quote_from_bytes(b'a&\xef')`` yields
   ``'a%26%EF'``.


.. function:: unquote(string, encoding='utf-8', errors='replace')

   Replace ``%xx`` escapes with their single-character equivalent.
   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   *encoding* defaults to ``'utf-8'``.
   *errors* defaults to ``'replace'``, meaning invalid sequences are replaced
   by a placeholder character.

   Example: ``unquote('/El%20Ni%C3%B1o/')`` yields ``'/El Niño/'``.

   .. versionchanged:: 3.9
      *string* parameter supports bytes and str objects (previously only str).


.. function:: unquote_plus(string, encoding='utf-8', errors='replace')

   Like :func:`unquote`, but also replace plus signs with spaces, as required
   for unquoting HTML form values.

   *string* must be a :class:`str`.

   Example: ``unquote_plus('/El+Ni%C3%B1o/')`` yields ``'/El Niño/'``.


.. function:: unquote_to_bytes(string)

   Replace ``%xx`` escapes with their single-octet equivalent, and return a
   :class:`bytes` object.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   If it is a :class:`str`, unescaped non-ASCII characters in *string*
   are encoded into UTF-8 bytes.

   Example: ``unquote_to_bytes('a%26%EF')`` yields ``b'a&\xef'``.


.. function:: urlencode(query, doseq=False, safe='', encoding=None, \
                        errors=None, quote_via=quote_plus)

   Convert a mapping object or a sequence of two-element tuples, which may
   contain :class:`str` or :class:`bytes` objects, to a percent-encoded ASCII
   text string. If the resultant string is to be used as *data* for a POST
   operation with the :func:`~urllib.request.urlopen` function, then
   it should be encoded to bytes, otherwise it would result in a
   :exc:`TypeError`.
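
   A minimal sketch of preparing POST data this way follows. The field names
   and values are hypothetical, and no request is actually sent.

   .. code-block:: python

      from urllib.parse import urlencode

      # Hypothetical form fields; non-str values are converted with str().
      fields = {"q": "El Niño", "page": 2}

      # Percent-encode the fields into an ASCII text string.
      query = urlencode(fields)

      # The string contains only ASCII characters, so encoding it yields
      # the bytes object expected as POST data.
      post_data = query.encode("ascii")

   Here ``query`` is ``'q=El+Ni%C3%B1o&page=2'`` and ``post_data`` is the same
   content as :class:`bytes`.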

   The resulting string is a series of ``key=value`` pairs separated by ``'&'``
   characters, where both *key* and *value* are quoted using the *quote_via*
   function. By default, :func:`quote_plus` is used to quote the values, which
   means spaces are quoted as a ``'+'`` character and ``'/'`` characters are
   encoded as ``%2F``, which follows the standard for GET requests
   (``application/x-www-form-urlencoded``). An alternate function that can be
   passed as *quote_via* is :func:`quote`, which will encode spaces as ``%20``
   and not encode ``'/'`` characters. For maximum control of what is quoted, use
   ``quote`` and specify a value for *safe*.

   When a sequence of two-element tuples is used as the *query*
   argument, the first element of each tuple is a key and the second is a
   value. The value element in itself can be a sequence and in that case, if
   the optional parameter *doseq* evaluates to ``True``, individual
   ``key=value`` pairs separated by ``'&'`` are generated for each element of
   the value sequence for the key. The order of parameters in the encoded
   string will match the order of parameter tuples in the sequence.

   The *safe*, *encoding*, and *errors* parameters are passed down to
   *quote_via* (the *encoding* and *errors* parameters are only passed
   when a query element is a :class:`str`).

   To reverse this encoding process, :func:`parse_qs` and :func:`parse_qsl` are
   provided in this module to parse query strings into Python data structures.

   Refer to :ref:`urllib examples <urllib-examples>` to find out how the
   :func:`urllib.parse.urlencode` function can be used for generating the query
   string of a URL or data for a POST request.

   .. versionchanged:: 3.2
      *query* supports bytes and string objects.

   .. versionadded:: 3.5
      *quote_via* parameter.


.. seealso::

   `WHATWG`_ - URL Living standard
      Working Group for the URL Standard that defines URLs, domains, IP addresses, the
      application/x-www-form-urlencoded format, and their API.

   :rfc:`3986` - Uniform Resource Identifiers
      This is the current standard (STD66). Any changes to urllib.parse module
      should conform to this. Certain deviations could be observed, which are
      mostly for backward compatibility purposes and for certain de-facto
      parsing requirements as commonly observed in major browsers.

   :rfc:`2732` - Format for Literal IPv6 Addresses in URL's.
      This specifies the parsing requirements of IPv6 URLs.

   :rfc:`2396` - Uniform Resource Identifiers (URI): Generic Syntax
      Document describing the generic syntactic requirements for both Uniform Resource
      Names (URNs) and Uniform Resource Locators (URLs).

   :rfc:`2368` - The mailto URL scheme.
      Parsing requirements for mailto URL schemes.

   :rfc:`1808` - Relative Uniform Resource Locators
      This Request For Comments includes the rules for joining an absolute and a
      relative URL, including a fair number of "Abnormal Examples" which govern the
      treatment of border cases.

   :rfc:`1738` - Uniform Resource Locators (URL)
      This specifies the formal syntax and semantics of absolute URLs.

.. _WHATWG: https://url.spec.whatwg.org/