:mod:`urllib.parse` --- Parse URLs into components
==================================================

.. module:: urllib.parse
   :synopsis: Parse URLs into or assemble them from components.

**Source code:** :source:`Lib/urllib/parse.py`

.. index::
   single: WWW
   single: World Wide Web
   single: URL
   pair: URL; parsing
   pair: relative; URL

--------------

This module defines a standard interface to break Uniform Resource Locator (URL)
strings up into components (addressing scheme, network location, path etc.), to
combine the components back into a URL string, and to convert a "relative URL"
to an absolute URL given a "base URL."

The module has been designed to match the Internet RFC on Relative Uniform
Resource Locators. It supports the following URL schemes: ``file``, ``ftp``,
``gopher``, ``hdl``, ``http``, ``https``, ``imap``, ``mailto``, ``mms``,
``news``, ``nntp``, ``prospero``, ``rsync``, ``rtsp``, ``rtspu``, ``sftp``,
``shttp``, ``sip``, ``sips``, ``snews``, ``svn``, ``svn+ssh``, ``telnet``,
``wais``, ``ws``, ``wss``.

The :mod:`urllib.parse` module defines functions that fall into two broad
categories: URL parsing and URL quoting. These are covered in detail in
the following sections.

URL Parsing
-----------

The URL parsing functions focus on splitting a URL string into its components,
or on combining URL components into a URL string.

.. function:: urlparse(urlstring, scheme='', allow_fragments=True)

   Parse a URL into six components, returning a 6-item :term:`named tuple`.  This
   corresponds to the general structure of a URL:
   ``scheme://netloc/path;parameters?query#fragment``.
   Each tuple item is a string, possibly empty. The components are not broken up
   into smaller parts (for example, the network location is a single string), and %
   escapes are not expanded.
   The delimiters as shown above are not part of the
   result, except for a leading slash in the *path* component, which is retained if
   present.  For example:

      >>> from urllib.parse import urlparse
      >>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
      >>> o   # doctest: +NORMALIZE_WHITESPACE
      ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> o.scheme
      'http'
      >>> o.port
      80
      >>> o.geturl()
      'http://www.cwi.nl:80/%7Eguido/Python.html'

   Following the syntax specifications in :rfc:`1808`, :func:`urlparse` recognizes
   a netloc only if it is properly introduced by ``//``.  Otherwise the
   input is presumed to be a relative URL and thus to start with
   a path component.

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
      ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> urlparse('www.cwi.nl/%7Eguido/Python.html')
      ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> urlparse('help/Python.html')
      ParseResult(scheme='', netloc='', path='help/Python.html', params='',
                  query='', fragment='')

   The *scheme* argument gives the default addressing scheme, to be
   used only if the URL does not specify one.  It should be the same type
   (text or bytes) as *urlstring*, except that the default value ``''`` is
   always allowed, and is automatically converted to ``b''`` if appropriate.

   If the *allow_fragments* argument is false, fragment identifiers are not
   recognized.  Instead, they are parsed as part of the path, parameters
   or query component, and :attr:`fragment` is set to the empty string in
   the return value.
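
   The effect of the *scheme* and *allow_fragments* arguments can be sketched
   as follows. This is a minimal illustration only; the URLs are arbitrary
   examples chosen for this sketch.

   ```python
   from urllib.parse import urlparse

   # The URL has no scheme, so the *scheme* argument supplies the default.
   with_default = urlparse('//www.cwi.nl/%7Eguido/Python.html', scheme='https')

   # A scheme that is present in the URL takes precedence over the default.
   explicit = urlparse('http://www.cwi.nl/', scheme='https')

   # With allow_fragments=False the '#intro' text is left in the path.
   no_frag = urlparse('http://www.cwi.nl/doc/index.html#intro',
                      allow_fragments=False)
   ```

   Here ``with_default.scheme`` is ``'https'``, ``explicit.scheme`` is
   ``'http'``, and ``no_frag.fragment`` is the empty string.
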

   The return value is a :term:`named tuple`, which means that its items can
   be accessed by index or as named attributes, which are:

   +------------------+-------+--------------------------+----------------------+
   | Attribute        | Index | Value                    | Value if not present |
   +==================+=======+==========================+======================+
   | :attr:`scheme`   | 0     | URL scheme specifier     | *scheme* parameter   |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`netloc`   | 1     | Network location part    | empty string         |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`path`     | 2     | Hierarchical path        | empty string         |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`params`   | 3     | Parameters for last path | empty string         |
   |                  |       | element                  |                      |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`query`    | 4     | Query component          | empty string         |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`fragment` | 5     | Fragment identifier      | empty string         |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`username` |       | User name                | :const:`None`        |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`password` |       | Password                 | :const:`None`        |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`hostname` |       | Host name (lower case)   | :const:`None`        |
   +------------------+-------+--------------------------+----------------------+
   | :attr:`port`     |       | Port number as integer,  | :const:`None`        |
   |                  |       | if present               |                      |
   +------------------+-------+--------------------------+----------------------+

   Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
   an invalid port is specified in the URL.
   See section
   :ref:`urlparse-result-object` for more information on the result object.

   Unmatched square brackets in the :attr:`netloc` attribute will raise a
   :exc:`ValueError`.

   Characters in the :attr:`netloc` attribute that decompose under NFKC
   normalization (as used by the IDNA encoding) into any of ``/``, ``?``,
   ``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
   decomposed before parsing, no error will be raised.

   As is the case with all named tuples, the subclass has a few additional methods
   and attributes that are particularly useful. One such method is :meth:`_replace`.
   The :meth:`_replace` method returns a new :class:`ParseResult` object with the
   specified fields replaced by new values.

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> u = urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
      >>> u
      ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> u._replace(scheme='http')
      ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')


   .. versionchanged:: 3.2
      Added IPv6 URL parsing capabilities.

   .. versionchanged:: 3.3
      The fragment is now parsed for all URL schemes (unless *allow_fragments* is
      false), in accordance with :rfc:`3986`.  Previously, a whitelist of
      schemes that support fragments existed.

   .. versionchanged:: 3.6
      Out-of-range port numbers now raise :exc:`ValueError`, instead of
      returning :const:`None`.

   .. versionchanged:: 3.8
      Characters that affect netloc parsing under NFKC normalization will
      now raise :exc:`ValueError`.


.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None)

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a
   dictionary.  The dictionary keys are the unique query variable names and the
   values are lists of values for each name.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in percent-encoded queries should be treated as blank strings. A true value
   indicates that blanks should be retained as blank strings.  The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors.  If false (the default), errors are silently ignored.  If true,
   errors raise a :exc:`ValueError` exception.

   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   The optional argument *max_num_fields* is the maximum number of fields to
   read. If set, a :exc:`ValueError` is raised if more than
   *max_num_fields* fields are read.

   Use the :func:`urllib.parse.urlencode` function (with the ``doseq``
   parameter set to ``True``) to convert such dictionaries into query
   strings.


   .. versionchanged:: 3.2
      Added *encoding* and *errors* parameters.

   .. versionchanged:: 3.8
      Added *max_num_fields* parameter.


.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None)

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a list of
   name, value pairs.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in percent-encoded queries should be treated as blank strings. A true value
   indicates that blanks should be retained as blank strings.  The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors.  If false (the default), errors are silently ignored.  If true,
   errors raise a :exc:`ValueError` exception.

   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   The optional argument *max_num_fields* is the maximum number of fields to
   read. If set, a :exc:`ValueError` is raised if more than
   *max_num_fields* fields are read.

   Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into
   query strings.

   .. versionchanged:: 3.2
      Added *encoding* and *errors* parameters.

   .. versionchanged:: 3.8
      Added *max_num_fields* parameter.


.. function:: urlunparse(parts)

   Construct a URL from a tuple as returned by :func:`urlparse`. The *parts*
   argument can be any six-item iterable. This may result in a slightly
   different, but equivalent URL, if the URL that was parsed originally had
   unnecessary delimiters (for example, a ``?`` with an empty query; the RFC
   states that these are equivalent).


.. function:: urlsplit(urlstring, scheme='', allow_fragments=True)

   This is similar to :func:`urlparse`, but does not split the params from the URL.
   This should generally be used instead of :func:`urlparse` if the more recent URL
   syntax allowing parameters to be applied to each segment of the *path* portion
   of the URL (see :rfc:`2396`) is wanted.
   A separate function is needed to
   separate the path segments and parameters.  This function returns a 5-item
   :term:`named tuple`::

      (addressing scheme, network location, path, query, fragment identifier).

   The return value is a :term:`named tuple`; its items can be accessed by index
   or as named attributes:

   +------------------+-------+-------------------------+----------------------+
   | Attribute        | Index | Value                   | Value if not present |
   +==================+=======+=========================+======================+
   | :attr:`scheme`   | 0     | URL scheme specifier    | *scheme* parameter   |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`netloc`   | 1     | Network location part   | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`path`     | 2     | Hierarchical path       | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`query`    | 3     | Query component         | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`fragment` | 4     | Fragment identifier     | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`username` |       | User name               | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`password` |       | Password                | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`hostname` |       | Host name (lower case)  | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`port`     |       | Port number as integer, | :const:`None`        |
   |                  |       | if present              |                      |
   +------------------+-------+-------------------------+----------------------+

   Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
   an invalid port is specified in
   the URL.  See section
   :ref:`urlparse-result-object` for more information on the result object.

   Unmatched square brackets in the :attr:`netloc` attribute will raise a
   :exc:`ValueError`.

   Characters in the :attr:`netloc` attribute that decompose under NFKC
   normalization (as used by the IDNA encoding) into any of ``/``, ``?``,
   ``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
   decomposed before parsing, no error will be raised.

   .. versionchanged:: 3.6
      Out-of-range port numbers now raise :exc:`ValueError`, instead of
      returning :const:`None`.

   .. versionchanged:: 3.8
      Characters that affect netloc parsing under NFKC normalization will
      now raise :exc:`ValueError`.


.. function:: urlunsplit(parts)

   Combine the elements of a tuple as returned by :func:`urlsplit` into a
   complete URL as a string. The *parts* argument can be any five-item
   iterable. This may result in a slightly different, but equivalent URL, if the
   URL that was parsed originally had unnecessary delimiters (for example, a ``?``
   with an empty query; the RFC states that these are equivalent).


.. function:: urljoin(base, url, allow_fragments=True)

   Construct a full ("absolute") URL by combining a "base URL" (*base*) with
   another URL (*url*).  Informally, this uses components of the base URL, in
   particular the addressing scheme, the network location and (part of) the
   path, to provide missing components in the relative URL.  For example:

      >>> from urllib.parse import urljoin
      >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
      'http://www.cwi.nl/%7Eguido/FAQ.html'

   The *allow_fragments* argument has the same meaning and default as for
   :func:`urlparse`.

   .. note::

      If *url* is an absolute URL (that is, it starts with ``//`` or ``scheme://``),
      the *url*'s hostname and/or scheme will be present in the result.
      For example:

      .. doctest::

         >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',
         ...         '//www.python.org/%7Eguido')
         'http://www.python.org/%7Eguido'

      If you do not want that behavior, preprocess the *url* with :func:`urlsplit` and
      :func:`urlunsplit`, removing possible *scheme* and *netloc* parts.


   .. versionchanged:: 3.5

      Behavior updated to match the semantics defined in :rfc:`3986`.


.. function:: urldefrag(url)

   If *url* contains a fragment identifier, return a modified version of *url*
   with no fragment identifier, and the fragment identifier as a separate
   string.  If there is no fragment identifier in *url*, return *url* unmodified
   and an empty string.

   The return value is a :term:`named tuple`; its items can be accessed by index
   or as named attributes:

   +------------------+-------+-------------------------+----------------------+
   | Attribute        | Index | Value                   | Value if not present |
   +==================+=======+=========================+======================+
   | :attr:`url`      | 0     | URL with no fragment    | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`fragment` | 1     | Fragment identifier     | empty string         |
   +------------------+-------+-------------------------+----------------------+

   See section :ref:`urlparse-result-object` for more information on the result
   object.

   .. versionchanged:: 3.2
      Result is a structured object rather than a simple 2-tuple.

.. function:: unwrap(url)

   Extract the URL from a wrapped URL (that is, a string formatted as
   ``<URL:scheme://host/path>``, ``<scheme://host/path>``, ``URL:scheme://host/path``
   or ``scheme://host/path``). If *url* is not a wrapped URL, it is returned
   without changes.

.. _parsing-ascii-encoded-bytes:

Parsing ASCII Encoded Bytes
---------------------------

The URL parsing functions were originally designed to operate on character
strings only. In practice, it is useful to be able to manipulate properly
quoted and encoded URLs as sequences of ASCII bytes. Accordingly, the
URL parsing functions in this module all operate on :class:`bytes` and
:class:`bytearray` objects in addition to :class:`str` objects.

If :class:`str` data is passed in, the result will also contain only
:class:`str` data. If :class:`bytes` or :class:`bytearray` data is
passed in, the result will contain only :class:`bytes` data.

Attempting to mix :class:`str` data with :class:`bytes` or
:class:`bytearray` in a single function call will result in a
:exc:`TypeError` being raised, while attempting to pass in non-ASCII
byte values will trigger :exc:`UnicodeDecodeError`.

To support easier conversion of result objects between :class:`str` and
:class:`bytes`, all return values from URL parsing functions provide
either an :meth:`encode` method (when the result contains :class:`str`
data) or a :meth:`decode` method (when the result contains :class:`bytes`
data). The signatures of these methods match those of the corresponding
:class:`str` and :class:`bytes` methods (except that the default encoding
is ``'ascii'`` rather than ``'utf-8'``). Each produces a value of a
corresponding type that contains either :class:`bytes` data (for
:meth:`encode` methods) or :class:`str` data (for
:meth:`decode` methods).

Applications that need to operate on potentially improperly quoted URLs
that may contain non-ASCII data will need to do their own decoding from
bytes to characters before invoking the URL parsing methods.

The behaviour described in this section applies only to the URL parsing
functions.
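
The :class:`str` and :class:`bytes` handling of the parsing functions can be
sketched as follows. This is only an illustration; the URL is an arbitrary
example and ``mixed_ok`` is a name introduced for this sketch.

```python
from urllib.parse import urlsplit, SplitResult, SplitResultBytes

text_result = urlsplit('http://www.python.org/doc/?q=1')
bytes_result = urlsplit(b'http://www.python.org/doc/?q=1')

# str in, str out; bytes in, bytes out.
assert isinstance(text_result, SplitResult)
assert isinstance(bytes_result, SplitResultBytes)
assert bytes_result.netloc == b'www.python.org'

# encode() and decode() convert between the two result types,
# using 'ascii' as the default encoding.
assert text_result.encode() == bytes_result
assert bytes_result.decode() == text_result

# Mixing str and bytes arguments in a single call raises TypeError.
try:
    urlsplit('http://www.python.org/', scheme=b'http')
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
```

After this runs, ``mixed_ok`` is ``False`` because the mixed call was rejected.
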
The URL quoting functions use their own rules when producing
or consuming byte sequences as detailed in the documentation of the
individual URL quoting functions.

.. versionchanged:: 3.2
   URL parsing functions now accept ASCII encoded byte sequences.


.. _urlparse-result-object:

Structured Parse Results
------------------------

The result objects from the :func:`urlparse`, :func:`urlsplit` and
:func:`urldefrag` functions are subclasses of the :class:`tuple` type.
These subclasses add the attributes listed in the documentation for
those functions, the encoding and decoding support described in the
previous section, as well as an additional method:

.. method:: urllib.parse.SplitResult.geturl()

   Return the re-combined version of the original URL as a string. This may
   differ from the original URL in that the scheme may be normalized to lower
   case and empty components may be dropped. Specifically, empty parameters,
   queries, and fragment identifiers will be removed.

   For :func:`urldefrag` results, only empty fragment identifiers will be removed.
   For :func:`urlsplit` and :func:`urlparse` results, all noted changes will be
   made to the URL returned by this method.

   The result of this method remains unchanged if passed back through the original
   parsing function:

      >>> from urllib.parse import urlsplit
      >>> url = 'HTTP://www.Python.org/doc/#'
      >>> r1 = urlsplit(url)
      >>> r1.geturl()
      'http://www.Python.org/doc/'
      >>> r2 = urlsplit(r1.geturl())
      >>> r2.geturl()
      'http://www.Python.org/doc/'


The following classes provide the implementations of the structured parse
results when operating on :class:`str` objects:

.. class:: DefragResult(url, fragment)

   Concrete class for :func:`urldefrag` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`DefragResultBytes`
   instance.

   .. versionadded:: 3.2

.. class:: ParseResult(scheme, netloc, path, params, query, fragment)

   Concrete class for :func:`urlparse` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`ParseResultBytes`
   instance.

.. class:: SplitResult(scheme, netloc, path, query, fragment)

   Concrete class for :func:`urlsplit` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`SplitResultBytes`
   instance.


The following classes provide the implementations of the parse results when
operating on :class:`bytes` or :class:`bytearray` objects:

.. class:: DefragResultBytes(url, fragment)

   Concrete class for :func:`urldefrag` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`DefragResult`
   instance.

   .. versionadded:: 3.2

.. class:: ParseResultBytes(scheme, netloc, path, params, query, fragment)

   Concrete class for :func:`urlparse` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`ParseResult`
   instance.

   .. versionadded:: 3.2

.. class:: SplitResultBytes(scheme, netloc, path, query, fragment)

   Concrete class for :func:`urlsplit` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`SplitResult`
   instance.

   .. versionadded:: 3.2


URL Quoting
-----------

The URL quoting functions focus on taking program data and making it safe
for use as URL components by quoting special characters and appropriately
encoding non-ASCII text. They also support reversing these operations to
recreate the original data from the contents of a URL component if that
task isn't already covered by the URL parsing functions above.

.. function:: quote(string, safe='/', encoding=None, errors=None)

   Replace special characters in *string* using the ``%xx`` escape.
   Letters,
   digits, and the characters ``'_.-~'`` are never quoted. By default, this
   function is intended for quoting the path section of a URL. The optional
   *safe* parameter specifies additional ASCII characters that should not be
   quoted; its default value is ``'/'``.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   .. versionchanged:: 3.7
      Moved from :rfc:`2396` to :rfc:`3986` for quoting URL strings. "~" is now
      included in the set of unreserved characters.

   The optional *encoding* and *errors* parameters specify how to deal with
   non-ASCII characters, as accepted by the :meth:`str.encode` method.
   *encoding* defaults to ``'utf-8'``.
   *errors* defaults to ``'strict'``, meaning unsupported characters raise a
   :class:`UnicodeEncodeError`.
   *encoding* and *errors* must not be supplied if *string* is a
   :class:`bytes`, or a :class:`TypeError` is raised.

   Note that ``quote(string, safe, encoding, errors)`` is equivalent to
   ``quote_from_bytes(string.encode(encoding, errors), safe)``.

   Example: ``quote('/El Niño/')`` yields ``'/El%20Ni%C3%B1o/'``.


.. function:: quote_plus(string, safe='', encoding=None, errors=None)

   Like :func:`quote`, but also replace spaces with plus signs, as required for
   quoting HTML form values when building up a query string to go into a URL.
   Plus signs in the original string are escaped unless they are included in
   *safe*.  It also does not have *safe* default to ``'/'``.

   Example: ``quote_plus('/El Niño/')`` yields ``'%2FEl+Ni%C3%B1o%2F'``.


.. function:: quote_from_bytes(bytes, safe='/')

   Like :func:`quote`, but accepts a :class:`bytes` object rather than a
   :class:`str`, and does not perform string-to-bytes encoding.

   Example: ``quote_from_bytes(b'a&\xef')`` yields
   ``'a%26%EF'``.


.. function:: unquote(string, encoding='utf-8', errors='replace')

   Replace ``%xx`` escapes with their single-character equivalent.
   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   *encoding* defaults to ``'utf-8'``.
   *errors* defaults to ``'replace'``, meaning invalid sequences are replaced
   by a placeholder character.

   Example: ``unquote('/El%20Ni%C3%B1o/')`` yields ``'/El Niño/'``.

   .. versionchanged:: 3.9
      *string* parameter supports bytes and str objects (previously only str).


.. function:: unquote_plus(string, encoding='utf-8', errors='replace')

   Like :func:`unquote`, but also replace plus signs with spaces, as required
   for unquoting HTML form values.

   *string* must be a :class:`str`.

   Example: ``unquote_plus('/El+Ni%C3%B1o/')`` yields ``'/El Niño/'``.


.. function:: unquote_to_bytes(string)

   Replace ``%xx`` escapes with their single-octet equivalent, and return a
   :class:`bytes` object.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   If it is a :class:`str`, unescaped non-ASCII characters in *string*
   are encoded into UTF-8 bytes.

   Example: ``unquote_to_bytes('a%26%EF')`` yields ``b'a&\xef'``.


.. function:: urlencode(query, doseq=False, safe='', encoding=None, \
                        errors=None, quote_via=quote_plus)

   Convert a mapping object or a sequence of two-element tuples, which may
   contain :class:`str` or :class:`bytes` objects, to a percent-encoded ASCII
   text string.  If the resultant string is to be used as the *data* for a POST
   operation with the :func:`~urllib.request.urlopen` function, then
   it should be encoded to bytes, otherwise it would result in a
   :exc:`TypeError`.
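
   As a minimal sketch of that last point, the string returned by
   :func:`urlencode` can be converted to bytes before it is used as POST data.
   The ``fields`` mapping below is a hypothetical example introduced only for
   this illustration.

   ```python
   from urllib.parse import urlencode

   # Hypothetical form fields, used only for illustration.
   fields = {'name': 'El Niño', 'path': '/a b'}

   # urlencode() returns a str of key=value pairs joined by '&'.
   query = urlencode(fields)

   # The result is pure ASCII, so it can be encoded to bytes for POST data.
   data = query.encode('ascii')
   ```

   Here ``query`` is ``'name=El+Ni%C3%B1o&path=%2Fa+b'`` and ``data`` holds
   the same text as a :class:`bytes` object.
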

   The resulting string is a series of ``key=value`` pairs separated by ``'&'``
   characters, where both *key* and *value* are quoted using the *quote_via*
   function.  By default, :func:`quote_plus` is used to quote the values, which
   means spaces are quoted as a ``'+'`` character and ``/`` characters are
   encoded as ``%2F``, which follows the standard for GET requests
   (``application/x-www-form-urlencoded``).  An alternate function that can be
   passed as *quote_via* is :func:`quote`, which will encode spaces as ``%20``
   and not encode ``/`` characters.  For maximum control of what is quoted, use
   ``quote`` and specify a value for *safe*.

   When a sequence of two-element tuples is used as the *query*
   argument, the first element of each tuple is a key and the second is a
   value. The value element in itself can be a sequence and in that case, if
   the optional parameter *doseq* evaluates to ``True``, individual
   ``key=value`` pairs separated by ``'&'`` are generated for each element of
   the value sequence for the key.  The order of parameters in the encoded
   string will match the order of parameter tuples in the sequence.

   The *safe*, *encoding*, and *errors* parameters are passed down to
   *quote_via* (the *encoding* and *errors* parameters are only passed
   when a query element is a :class:`str`).

   To reverse this encoding process, :func:`parse_qs` and :func:`parse_qsl` are
   provided in this module to parse query strings into Python data structures.

   Refer to :ref:`urllib examples <urllib-examples>` to find out how the
   :func:`urllib.parse.urlencode` function can be used for generating the query
   string of a URL or data for a POST request.

   .. versionchanged:: 3.2
      *query* supports bytes and string objects.

   .. versionadded:: 3.5
      *quote_via* parameter.


.. seealso::

   :rfc:`3986` - Uniform Resource Identifiers
      This is the current standard (STD66).
      Any changes to the :mod:`urllib.parse` module
      should conform to this. Certain deviations could be observed, which are
      mostly for backward compatibility purposes and for certain de-facto
      parsing requirements as commonly observed in major browsers.

   :rfc:`2732` - Format for Literal IPv6 Addresses in URLs.
      This specifies the parsing requirements of IPv6 URLs.

   :rfc:`2396` - Uniform Resource Identifiers (URI): Generic Syntax
      Document describing the generic syntactic requirements for both Uniform Resource
      Names (URNs) and Uniform Resource Locators (URLs).

   :rfc:`2368` - The mailto URL scheme.
      Parsing requirements for mailto URL schemes.

   :rfc:`1808` - Relative Uniform Resource Locators
      This Request For Comments includes the rules for joining an absolute and a
      relative URL, including a fair number of "Abnormal Examples" which govern the
      treatment of border cases.

   :rfc:`1738` - Uniform Resource Locators (URL)
      This specifies the formal syntax and semantics of absolute URLs.
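
The parsing and quoting functions described above can be combined. The
following sketch replaces one query parameter of a URL using
:func:`urlsplit`, :func:`parse_qs`, :func:`urlencode` and :func:`urlunsplit`.
The helper ``set_query_param`` is a hypothetical name introduced for this
illustration and is not part of :mod:`urllib.parse`; the URL is an arbitrary
example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def set_query_param(url, name, value):
    """Return *url* with the query parameter *name* set to *value*.

    Hypothetical helper for illustration; not part of urllib.parse.
    """
    parts = urlsplit(url)
    # parse_qs maps each parameter name to a list of decoded values.
    params = parse_qs(parts.query, keep_blank_values=True)
    params[name] = [value]
    # doseq=True expands each list back into separate key=value pairs.
    new_query = urlencode(params, doseq=True)
    # _replace returns a new SplitResult with only the query changed.
    return urlunsplit(parts._replace(query=new_query))


result = set_query_param(
    'http://www.cwi.nl/search?q=el+ni%C3%B1o&page=1', 'page', '2')
```

With this input, ``result`` is
``'http://www.cwi.nl/search?q=el+ni%C3%B1o&page=2'``. Note that the query is
re-quoted by :func:`urlencode`, so a query that was quoted differently in the
original URL may come back in an equivalent but not identical form.
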