:mod:`urllib.parse` --- Parse URLs into components
==================================================

.. module:: urllib.parse
   :synopsis: Parse URLs into or assemble them from components.

**Source code:** :source:`Lib/urllib/parse.py`

.. index::
   single: WWW
   single: World Wide Web
   single: URL
   pair: URL; parsing
   pair: relative; URL

--------------

This module defines a standard interface to break Uniform Resource Locator (URL)
strings up in components (addressing scheme, network location, path etc.), to
combine the components back into a URL string, and to convert a "relative URL"
to an absolute URL given a "base URL."

The module has been designed to match the internet RFC on Relative Uniform
Resource Locators. It supports the following URL schemes: ``file``, ``ftp``,
``gopher``, ``hdl``, ``http``, ``https``, ``imap``, ``mailto``, ``mms``,
``news``, ``nntp``, ``prospero``, ``rsync``, ``rtsp``, ``rtspu``, ``sftp``,
``shttp``, ``sip``, ``sips``, ``snews``, ``svn``, ``svn+ssh``, ``telnet``,
``wais``, ``ws``, ``wss``.

The :mod:`urllib.parse` module defines functions that fall into two broad
categories: URL parsing and URL quoting. These are covered in detail in
the following sections.

URL Parsing
-----------

The URL parsing functions focus on splitting a URL string into its components,
or on combining URL components into a URL string.

.. function:: urlparse(urlstring, scheme='', allow_fragments=True)

   Parse a URL into six components, returning a 6-item :term:`named tuple`.  This
   corresponds to the general structure of a URL:
   ``scheme://netloc/path;parameters?query#fragment``.
   Each tuple item is a string, possibly empty. The components are not broken up
   into smaller parts (for example, the network location is a single string), and %
   escapes are not expanded. The delimiters as shown above are not part of the
   result, except for a leading slash in the *path* component, which is retained if
   present.  For example:

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> urlparse("scheme://netloc/path;parameters?query#fragment")
      ParseResult(scheme='scheme', netloc='netloc', path='/path;parameters', params='',
                  query='query', fragment='fragment')
      >>> o = urlparse("http://docs.python.org:80/3/library/urllib.parse.html?"
      ...              "highlight=params#url-parsing")
      >>> o
      ParseResult(scheme='http', netloc='docs.python.org:80',
                  path='/3/library/urllib.parse.html', params='',
                  query='highlight=params', fragment='url-parsing')
      >>> o.scheme
      'http'
      >>> o.netloc
      'docs.python.org:80'
      >>> o.hostname
      'docs.python.org'
      >>> o.port
      80
      >>> o._replace(fragment="").geturl()
      'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params'

   Following the syntax specifications in :rfc:`1808`, urlparse recognizes
   a netloc only if it is properly introduced by '//'.  Otherwise the
   input is presumed to be a relative URL and thus to start with
   a path component.

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
      ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> urlparse('www.cwi.nl/%7Eguido/Python.html')
      ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> urlparse('help/Python.html')
      ParseResult(scheme='', netloc='', path='help/Python.html', params='',
                  query='', fragment='')

   The *scheme* argument gives the default addressing scheme, to be
   used only if the URL does not specify one.  It should be the same type
   (text or bytes) as *urlstring*, except that the default value ``''`` is
   always allowed, and is automatically converted to ``b''`` if appropriate.

   If the *allow_fragments* argument is false, fragment identifiers are not
   recognized.  Instead, they are parsed as part of the path, parameters
   or query component, and :attr:`fragment` is set to the empty string in
   the return value.

   The return value is a :term:`named tuple`, which means that its items can
   be accessed by index or as named attributes, which are:

   +------------------+-------+-------------------------+------------------------+
   | Attribute        | Index | Value                   | Value if not present   |
   +==================+=======+=========================+========================+
   | :attr:`scheme`   | 0     | URL scheme specifier    | *scheme* parameter     |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`netloc`   | 1     | Network location part   | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`path`     | 2     | Hierarchical path       | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`params`   | 3     | Parameters for last     | empty string           |
   |                  |       | path element            |                        |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`query`    | 4     | Query component         | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`fragment` | 5     | Fragment identifier     | empty string           |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`username` |       | User name               | :const:`None`          |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`password` |       | Password                | :const:`None`          |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`hostname` |       | Host name (lower case)  | :const:`None`          |
   +------------------+-------+-------------------------+------------------------+
   | :attr:`port`     |       | Port number as integer, | :const:`None`          |
   |                  |       | if present              |                        |
   +------------------+-------+-------------------------+------------------------+

   Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
   an invalid port is specified in the URL.  See section
   :ref:`urlparse-result-object` for more information on the result object.

   Unmatched square brackets in the :attr:`netloc` attribute will raise a
   :exc:`ValueError`.

   Characters in the :attr:`netloc` attribute that decompose under NFKC
   normalization (as used by the IDNA encoding) into any of ``/``, ``?``,
   ``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
   decomposed before parsing, no error will be raised.

   As is the case with all named tuples, the subclass has a few additional methods
   and attributes that are particularly useful. One such method is :meth:`_replace`.
   The :meth:`_replace` method will return a new ParseResult object replacing specified
   fields with new values.

   .. doctest::
      :options: +NORMALIZE_WHITESPACE

      >>> from urllib.parse import urlparse
      >>> u = urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
      >>> u
      ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')
      >>> u._replace(scheme='http')
      ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
                  params='', query='', fragment='')

   .. versionchanged:: 3.2
      Added IPv6 URL parsing capabilities.

   .. versionchanged:: 3.3
      The fragment is now parsed for all URL schemes (unless *allow_fragments* is
      false), in accordance with :rfc:`3986`.  Previously, an allowlist of
      schemes that support fragments existed.

   .. versionchanged:: 3.6
      Out-of-range port numbers now raise :exc:`ValueError`, instead of
      returning :const:`None`.

   .. versionchanged:: 3.8
      Characters that affect netloc parsing under NFKC normalization will
      now raise :exc:`ValueError`.

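As a rough illustration of the :func:`urlparse` arguments and attributes
described above, the sketch below exercises the *scheme* default,
``allow_fragments=False``, parameter splitting, and the error raised by an
out-of-range port. The ``www.example.com`` URLs are placeholders chosen for
this example, and the commented values assume a current CPython.

```python
from urllib.parse import urlparse

# The *scheme* argument only fills in a scheme the URL itself lacks.
with_default = urlparse("//www.example.com/index.html", scheme="https")
print(with_default.scheme, with_default.netloc)  # https www.example.com

# With allow_fragments=False, '#intro' stays in the path component.
no_frag = urlparse("http://www.example.com/doc#intro", allow_fragments=False)
print(no_frag.path, repr(no_frag.fragment))  # /doc#intro ''

# For schemes such as http, parameters of the last path segment are split off.
with_params = urlparse("http://www.example.com/a/b;type=x?q=1")
print(with_params.path, with_params.params)  # /a/b type=x

# An out-of-range port is only reported when the port attribute is read.
bad_port = urlparse("http://www.example.com:99999/")
try:
    bad_port.port
    port_error = None
except ValueError as exc:
    port_error = exc
print(type(port_error).__name__)  # ValueError
```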

.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a
   dictionary.  The dictionary keys are the unique query variable names and the
   values are lists of values for each name.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in percent-encoded queries should be treated as blank strings. A true value
   indicates that blanks should be retained as blank strings.  The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors.  If false (the default), errors are silently ignored.  If true,
   errors raise a :exc:`ValueError` exception.

   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   The optional argument *max_num_fields* is the maximum number of fields to
   read. If set, a :exc:`ValueError` is raised if more than
   *max_num_fields* fields are read.

   The optional argument *separator* is the symbol to use for separating the
   query arguments. It defaults to ``&``.

   Use the :func:`urllib.parse.urlencode` function (with the ``doseq``
   parameter set to ``True``) to convert such dictionaries into query
   strings.

   .. versionchanged:: 3.2
      Added the *encoding* and *errors* parameters.

   .. versionchanged:: 3.8
      Added the *max_num_fields* parameter.

   .. versionchanged:: 3.10
      Added the *separator* parameter with the default value of ``&``. Python
      versions earlier than Python 3.10 allowed using both ``;`` and ``&`` as
      query parameter separators. This has been changed to allow only a single
      separator key, with ``&`` as the default separator.

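The effect of the :func:`parse_qs` flags can be sketched as follows. The query
string is made up for illustration, and the commented results assume Python
3.8 or later, where *max_num_fields* is available.

```python
from urllib.parse import parse_qs, urlencode

qs = "name=Ada&lang=py&lang=c&empty="

# Blank values are dropped by default ...
parsed = parse_qs(qs)
print(parsed)  # {'name': ['Ada'], 'lang': ['py', 'c']}

# ... and kept as empty strings with keep_blank_values=True.
kept = parse_qs(qs, keep_blank_values=True)
print(kept)  # {'name': ['Ada'], 'lang': ['py', 'c'], 'empty': ['']}

# urlencode() with doseq=True turns the dictionary back into a query string.
round_trip = urlencode(kept, doseq=True)
print(round_trip)  # name=Ada&lang=py&lang=c&empty=

# strict_parsing=True turns a malformed field into a ValueError.
try:
    parse_qs("no-equals-sign", strict_parsing=True)
    strict_failed = False
except ValueError:
    strict_failed = True

# max_num_fields caps the number of fields; this string has four.
try:
    parse_qs(qs, max_num_fields=2)
    limit_hit = False
except ValueError:
    limit_hit = True
print(strict_failed, limit_hit)  # True True
```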

.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a list of
   name, value pairs.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in percent-encoded queries should be treated as blank strings. A true value
   indicates that blanks should be retained as blank strings.  The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors.  If false (the default), errors are silently ignored.  If true,
   errors raise a :exc:`ValueError` exception.

   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   The optional argument *max_num_fields* is the maximum number of fields to
   read. If set, a :exc:`ValueError` is raised if more than
   *max_num_fields* fields are read.

   The optional argument *separator* is the symbol to use for separating the
   query arguments. It defaults to ``&``.

   Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into
   query strings.

   .. versionchanged:: 3.2
      Added the *encoding* and *errors* parameters.

   .. versionchanged:: 3.8
      Added the *max_num_fields* parameter.

   .. versionchanged:: 3.10
      Added the *separator* parameter with the default value of ``&``. Python
      versions earlier than Python 3.10 allowed using both ``;`` and ``&`` as
      query parameter separators. This has been changed to allow only a single
      separator key, with ``&`` as the default separator.

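A small sketch of :func:`parse_qsl` and its *separator* argument follows. The
query strings are invented for the example, and the result shown for the
default separator assumes Python 3.10 or later, where ``;`` is no longer
treated as a separator.

```python
from urllib.parse import parse_qsl

# Order and duplicate names are preserved in the list of pairs.
pairs = parse_qsl("a=1&b=2&a=3")
print(pairs)  # [('a', '1'), ('b', '2'), ('a', '3')]

# Percent-escapes and '+' are decoded in both names and values.
decoded = parse_qsl("q=a+b%21")
print(decoded)  # [('q', 'a b!')]

# A different single separator can be requested explicitly.
semicolons = parse_qsl("a=1;b=2", separator=";")
print(semicolons)  # [('a', '1'), ('b', '2')]

# With the default '&', the ';' is just part of the value.
default_sep = parse_qsl("a=1;b=2")
print(default_sep)  # [('a', '1;b=2')]
```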

.. function:: urlunparse(parts)

   Construct a URL from a tuple as returned by :func:`urlparse`. The *parts*
   argument can be any six-item iterable. This may result in a slightly
   different, but equivalent URL, if the URL that was parsed originally had
   unnecessary delimiters (for example, a ``?`` with an empty query; the RFC
   states that these are equivalent).

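For instance, :func:`urlunparse` can assemble a URL from six hand-written
components, and a parse/unparse round trip drops a trailing ``?`` that
introduced an empty query. The host name below is a placeholder.

```python
from urllib.parse import urlparse, urlunparse

# Any six-item iterable works: scheme, netloc, path, params, query, fragment.
built = urlunparse(("https", "www.example.com", "/a/b", "", "q=1", "top"))
print(built)  # https://www.example.com/a/b?q=1#top

# The '?' with an empty query is an unnecessary delimiter and is not restored.
round_trip = urlunparse(urlparse("http://www.example.com/path?"))
print(round_trip)  # http://www.example.com/path
```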

.. function:: urlsplit(urlstring, scheme='', allow_fragments=True)

   This is similar to :func:`urlparse`, but does not split the params from the URL.
   This should generally be used instead of :func:`urlparse` if the more recent URL
   syntax allowing parameters to be applied to each segment of the *path* portion
   of the URL (see :rfc:`2396`) is wanted.  A separate function is needed to
   separate the path segments and parameters.  This function returns a 5-item
   :term:`named tuple`::

      (addressing scheme, network location, path, query, fragment identifier).

   The return value is a :term:`named tuple` whose items can be accessed by index
   or as named attributes:

   +------------------+-------+-------------------------+----------------------+
   | Attribute        | Index | Value                   | Value if not present |
   +==================+=======+=========================+======================+
   | :attr:`scheme`   | 0     | URL scheme specifier    | *scheme* parameter   |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`netloc`   | 1     | Network location part   | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`path`     | 2     | Hierarchical path       | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`query`    | 3     | Query component         | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`fragment` | 4     | Fragment identifier     | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`username` |       | User name               | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`password` |       | Password                | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`hostname` |       | Host name (lower case)  | :const:`None`        |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`port`     |       | Port number as integer, | :const:`None`        |
   |                  |       | if present              |                      |
   +------------------+-------+-------------------------+----------------------+

   Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
   an invalid port is specified in the URL.  See section
   :ref:`urlparse-result-object` for more information on the result object.

   Unmatched square brackets in the :attr:`netloc` attribute will raise a
   :exc:`ValueError`.

   Characters in the :attr:`netloc` attribute that decompose under NFKC
   normalization (as used by the IDNA encoding) into any of ``/``, ``?``,
   ``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
   decomposed before parsing, no error will be raised.

   Following the `WHATWG spec`_ that updates RFC 3986, ASCII newline
   ``\n``, ``\r`` and tab ``\t`` characters are stripped from the URL.

   .. versionchanged:: 3.6
      Out-of-range port numbers now raise :exc:`ValueError`, instead of
      returning :const:`None`.

   .. versionchanged:: 3.8
      Characters that affect netloc parsing under NFKC normalization will
      now raise :exc:`ValueError`.

   .. versionchanged:: 3.10
      ASCII newline and tab characters are stripped from the URL.

.. _WHATWG spec: https://url.spec.whatwg.org/#concept-basic-url-parser

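The sketch below shows :func:`urlsplit` leaving per-segment parameters in the
path, together with the derived attributes from the table above. The URL and
credentials are placeholders, and the list comprehension is only one possible
way to separate segments from their parameters, not a helper provided by the
module.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://user:pw@Example.COM:8080/a;x=1/b;y=2?q=1#frag")

# Parameters stay attached to their path segments.
print(parts.path)  # /a;x=1/b;y=2
print(parts.query, parts.fragment)  # q=1 frag

# Attributes derived from the netloc; the host name is lower-cased.
print(parts.username, parts.password)  # user pw
print(parts.hostname, parts.port)  # example.com 8080

# Hand-rolled split of each segment into (name, parameters).
segments = [seg.partition(";")[::2] for seg in parts.path.split("/")[1:]]
print(segments)  # [('a', 'x=1'), ('b', 'y=2')]
```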

.. function:: urlunsplit(parts)

   Combine the elements of a tuple as returned by :func:`urlsplit` into a
   complete URL as a string. The *parts* argument can be any five-item
   iterable. This may result in a slightly different, but equivalent URL, if the
   URL that was parsed originally had unnecessary delimiters (for example, a ``?``
   with an empty query; the RFC states that these are equivalent).


.. function:: urljoin(base, url, allow_fragments=True)

   Construct a full ("absolute") URL by combining a "base URL" (*base*) with
   another URL (*url*).  Informally, this uses components of the base URL, in
   particular the addressing scheme, the network location and (part of) the
   path, to provide missing components in the relative URL.  For example:

      >>> from urllib.parse import urljoin
      >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
      'http://www.cwi.nl/%7Eguido/FAQ.html'

   The *allow_fragments* argument has the same meaning and default as for
   :func:`urlparse`.

   .. note::

      If *url* is an absolute URL (that is, it starts with ``//`` or ``scheme://``),
      the *url*'s hostname and/or scheme will be present in the result.  For example:

      .. doctest::

         >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',
         ...         '//www.python.org/%7Eguido')
         'http://www.python.org/%7Eguido'

      If you do not want that behavior, preprocess the *url* with :func:`urlsplit` and
      :func:`urlunsplit`, removing possible *scheme* and *netloc* parts.

   .. versionchanged:: 3.5
      Behavior updated to match the semantics defined in :rfc:`3986`.

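The preprocessing suggested in the note above might look like the following
sketch. ``join_path_only`` is a hypothetical helper name introduced here for
illustration; it is not part of :mod:`urllib.parse`.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def join_path_only(base, url):
    """Join *url* onto *base*, ignoring any scheme or netloc in *url*."""
    parts = urlsplit(url)
    # Rebuild the URL with empty scheme and netloc so only the path,
    # query and fragment are taken from *url*.
    relative = urlunsplit(("", "", parts.path, parts.query, parts.fragment))
    return urljoin(base, relative)


base = "http://www.cwi.nl/%7Eguido/Python.html"

# Plain urljoin() lets the second argument replace the host.
replaced_host = urljoin(base, "//www.python.org/%7Eguido")
print(replaced_host)  # http://www.python.org/%7Eguido

# After stripping scheme and netloc, the base host is kept.
kept_host = join_path_only(base, "//www.python.org/%7Eguido")
print(kept_host)  # http://www.cwi.nl/%7Eguido
```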

.. function:: urldefrag(url)

   If *url* contains a fragment identifier, return a modified version of *url*
   with no fragment identifier, and the fragment identifier as a separate
   string.  If there is no fragment identifier in *url*, return *url* unmodified
   and an empty string.

   The return value is a :term:`named tuple` whose items can be accessed by index
   or as named attributes:

   +------------------+-------+-------------------------+----------------------+
   | Attribute        | Index | Value                   | Value if not present |
   +==================+=======+=========================+======================+
   | :attr:`url`      | 0     | URL with no fragment    | empty string         |
   +------------------+-------+-------------------------+----------------------+
   | :attr:`fragment` | 1     | Fragment identifier     | empty string         |
   +------------------+-------+-------------------------+----------------------+

   See section :ref:`urlparse-result-object` for more information on the result
   object.

   .. versionchanged:: 3.2
      Result is a structured object rather than a simple 2-tuple.

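A short sketch of :func:`urldefrag`, using a placeholder URL, shows both the
attribute access and the tuple unpacking described above.

```python
from urllib.parse import urldefrag

# Access by attribute name ...
result = urldefrag("http://www.example.com/doc.html#section-2")
print(result.url)  # http://www.example.com/doc.html
print(result.fragment)  # section-2

# ... or unpack like an ordinary 2-tuple; no fragment gives an empty string.
url, fragment = urldefrag("http://www.example.com/doc.html")
print(url, repr(fragment))  # http://www.example.com/doc.html ''
```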

.. function:: unwrap(url)

   Extract the URL from a wrapped URL (that is, a string formatted as
   ``<URL:scheme://host/path>``, ``<scheme://host/path>``, ``URL:scheme://host/path``
   or ``scheme://host/path``). If *url* is not a wrapped URL, it is returned
   without changes.

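The four forms listed for :func:`unwrap` can be checked with a sketch like
this one; the wrapped URL is a placeholder.

```python
from urllib.parse import unwrap

plain = "http://www.example.com/path"

# Each wrapped spelling is reduced to the same plain URL.
forms = [
    "<URL:http://www.example.com/path>",
    "<http://www.example.com/path>",
    "URL:http://www.example.com/path",
    plain,
]
unwrapped = [unwrap(form) for form in forms]
print(all(item == plain for item in unwrapped))  # True
```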

.. _parsing-ascii-encoded-bytes:

Parsing ASCII Encoded Bytes
---------------------------

The URL parsing functions were originally designed to operate on character
strings only. In practice, it is useful to be able to manipulate properly
quoted and encoded URLs as sequences of ASCII bytes. Accordingly, the
URL parsing functions in this module all operate on :class:`bytes` and
:class:`bytearray` objects in addition to :class:`str` objects.

If :class:`str` data is passed in, the result will also contain only
:class:`str` data. If :class:`bytes` or :class:`bytearray` data is
passed in, the result will contain only :class:`bytes` data.

Attempting to mix :class:`str` data with :class:`bytes` or
:class:`bytearray` in a single function call will result in a
:exc:`TypeError` being raised, while attempting to pass in non-ASCII
byte values will trigger :exc:`UnicodeDecodeError`.

To support easier conversion of result objects between :class:`str` and
:class:`bytes`, all return values from URL parsing functions provide
either an :meth:`encode` method (when the result contains :class:`str`
data) or a :meth:`decode` method (when the result contains :class:`bytes`
data). The signatures of these methods match those of the corresponding
:class:`str` and :class:`bytes` methods (except that the default encoding
is ``'ascii'`` rather than ``'utf-8'``). Each produces a value of a
corresponding type that contains either :class:`bytes` data (for
:meth:`encode` methods) or :class:`str` data (for
:meth:`decode` methods).

Applications that need to operate on potentially improperly quoted URLs
that may contain non-ASCII data will need to do their own decoding from
bytes to characters before invoking the URL parsing methods.

The behavior described in this section applies only to the URL parsing
functions. The URL quoting functions use their own rules when producing
or consuming byte sequences as detailed in the documentation of the
individual URL quoting functions.

.. versionchanged:: 3.2
   URL parsing functions now accept ASCII encoded byte sequences.

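The rules in this section can be illustrated with :func:`urlsplit`. This is a
minimal sketch with a placeholder URL; the other parsing functions are
expected to behave the same way.

```python
from urllib.parse import urlsplit

text = urlsplit("http://www.example.com/a%20b?x=1")
raw = urlsplit(b"http://www.example.com/a%20b?x=1")

# bytes in, bytes out; str in, str out.
print(type(text).__name__, type(raw).__name__)  # SplitResult SplitResultBytes
print(raw.netloc)  # b'www.example.com'

# decode() and encode() convert between the two result types.
print(raw.decode() == text, text.encode() == raw)  # True True

# Mixing str and bytes arguments in one call raises TypeError.
try:
    urlsplit("http://www.example.com/", scheme=b"http")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

# Non-ASCII byte values raise UnicodeDecodeError.
try:
    urlsplit(b"http://www.example.com/\xe9")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc
print(type(mixed_error).__name__, type(ascii_error).__name__)
```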

.. _urlparse-result-object:

Structured Parse Results
------------------------

The result objects from the :func:`urlparse`, :func:`urlsplit` and
:func:`urldefrag` functions are subclasses of the :class:`tuple` type.
These subclasses add the attributes listed in the documentation for
those functions, the encoding and decoding support described in the
previous section, as well as an additional method:

.. method:: urllib.parse.SplitResult.geturl()

   Return the re-combined version of the original URL as a string. This may
   differ from the original URL in that the scheme may be normalized to lower
   case and empty components may be dropped. Specifically, empty parameters,
   queries, and fragment identifiers will be removed.

   For :func:`urldefrag` results, only empty fragment identifiers will be removed.
   For :func:`urlsplit` and :func:`urlparse` results, all noted changes will be
   made to the URL returned by this method.

   The result of this method remains unchanged if passed back through the original
   parsing function:

      >>> from urllib.parse import urlsplit
      >>> url = 'HTTP://www.Python.org/doc/#'
      >>> r1 = urlsplit(url)
      >>> r1.geturl()
      'http://www.Python.org/doc/'
      >>> r2 = urlsplit(r1.geturl())
      >>> r2.geturl()
      'http://www.Python.org/doc/'


The following classes provide the implementations of the structured parse
results when operating on :class:`str` objects:

.. class:: DefragResult(url, fragment)

   Concrete class for :func:`urldefrag` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`DefragResultBytes`
   instance.

   .. versionadded:: 3.2

.. class:: ParseResult(scheme, netloc, path, params, query, fragment)

   Concrete class for :func:`urlparse` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`ParseResultBytes`
   instance.

.. class:: SplitResult(scheme, netloc, path, query, fragment)

   Concrete class for :func:`urlsplit` results containing :class:`str`
   data. The :meth:`encode` method returns a :class:`SplitResultBytes`
   instance.


The following classes provide the implementations of the parse results when
operating on :class:`bytes` or :class:`bytearray` objects:

.. class:: DefragResultBytes(url, fragment)

   Concrete class for :func:`urldefrag` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`DefragResult`
   instance.

   .. versionadded:: 3.2

.. class:: ParseResultBytes(scheme, netloc, path, params, query, fragment)

   Concrete class for :func:`urlparse` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`ParseResult`
   instance.

   .. versionadded:: 3.2

.. class:: SplitResultBytes(scheme, netloc, path, query, fragment)

   Concrete class for :func:`urlsplit` results containing :class:`bytes`
   data. The :meth:`decode` method returns a :class:`SplitResult`
   instance.

   .. versionadded:: 3.2


URL Quoting
-----------

The URL quoting functions focus on taking program data and making it safe
for use as URL components by quoting special characters and appropriately
encoding non-ASCII text. They also support reversing these operations to
recreate the original data from the contents of a URL component if that
task isn't already covered by the URL parsing functions above.

.. function:: quote(string, safe='/', encoding=None, errors=None)

   Replace special characters in *string* using the ``%xx`` escape. Letters,
   digits, and the characters ``'_.-~'`` are never quoted. By default, this
   function is intended for quoting the path section of a URL. The optional
   *safe* parameter specifies additional ASCII characters that should not be
   quoted --- its default value is ``'/'``.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   .. versionchanged:: 3.7
      Moved from :rfc:`2396` to :rfc:`3986` for quoting URL strings. "~" is now
      included in the set of unreserved characters.

   The optional *encoding* and *errors* parameters specify how to deal with
   non-ASCII characters, as accepted by the :meth:`str.encode` method.
   *encoding* defaults to ``'utf-8'``.
   *errors* defaults to ``'strict'``, meaning unsupported characters raise a
   :class:`UnicodeEncodeError`.
   *encoding* and *errors* must not be supplied if *string* is a
   :class:`bytes`, or a :class:`TypeError` is raised.

   Note that ``quote(string, safe, encoding, errors)`` is equivalent to
   ``quote_from_bytes(string.encode(encoding, errors), safe)``.

   Example: ``quote('/El Niño/')`` yields ``'/El%20Ni%C3%B1o/'``.

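To make the *safe* and *encoding* parameters of :func:`quote` more concrete,
here is a small sketch; the input strings are arbitrary examples.

```python
from urllib.parse import quote, quote_from_bytes

# Default: '/' is safe, the space and the non-ASCII letter are escaped.
default = quote("/El Niño/")
print(default)  # /El%20Ni%C3%B1o/

# An empty *safe* also escapes the slashes.
no_safe = quote("/El Niño/", safe="")
print(no_safe)  # %2FEl%20Ni%C3%B1o%2F

# Extra safe characters are left alone.
extra_safe = quote("a b&c=d/e", safe="/&=")
print(extra_safe)  # a%20b&c=d/e

# *encoding* selects the bytes that get percent-escaped.
latin1 = quote("ñ", encoding="latin-1")
print(latin1)  # %F1

# The documented equivalence with quote_from_bytes().
same = default == quote_from_bytes("/El Niño/".encode("utf-8", "strict"), "/")
print(same)  # True

# Supplying *encoding* together with bytes input raises TypeError.
try:
    quote(b"a b", encoding="utf-8")
    bytes_error = None
except TypeError as exc:
    bytes_error = exc
print(type(bytes_error).__name__)  # TypeError
```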

.. function:: quote_plus(string, safe='', encoding=None, errors=None)

   Like :func:`quote`, but also replace spaces with plus signs, as required for
   quoting HTML form values when building up a query string to go into a URL.
   Plus signs in the original string are escaped unless they are included in
   *safe*.  It also does not have *safe* default to ``'/'``.

   Example: ``quote_plus('/El Niño/')`` yields ``'%2FEl+Ni%C3%B1o%2F'``.


.. function:: quote_from_bytes(bytes, safe='/')

   Like :func:`quote`, but accepts a :class:`bytes` object rather than a
   :class:`str`, and does not perform string-to-bytes encoding.

   Example: ``quote_from_bytes(b'a&\xef')`` yields
   ``'a%26%EF'``.


.. function:: unquote(string, encoding='utf-8', errors='replace')

   Replace ``%xx`` escapes with their single-character equivalent.
   The optional *encoding* and *errors* parameters specify how to decode
   percent-encoded sequences into Unicode characters, as accepted by the
   :meth:`bytes.decode` method.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   *encoding* defaults to ``'utf-8'``.
   *errors* defaults to ``'replace'``, meaning invalid sequences are replaced
   by a placeholder character.

   Example: ``unquote('/El%20Ni%C3%B1o/')`` yields ``'/El Niño/'``.

   .. versionchanged:: 3.9
      *string* parameter supports bytes and str objects (previously only str).

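The *encoding* and *errors* parameters of :func:`unquote`, and the difference
from :func:`unquote_plus`, can be sketched as follows; the input strings are
arbitrary examples.

```python
from urllib.parse import unquote, unquote_plus

# %E9 alone is not valid UTF-8, so the default errors='replace'
# substitutes the Unicode replacement character U+FFFD.
replaced = unquote("caf%E9")
print(replaced == "caf\ufffd")  # True

# Decoding the same escape as Latin-1 yields the intended letter.
latin1 = unquote("caf%E9", encoding="latin-1")
print(latin1)  # café

# errors='strict' raises instead of substituting.
try:
    unquote("caf%E9", errors="strict")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc
print(type(strict_error).__name__)  # UnicodeDecodeError

# Only unquote_plus() turns '+' into a space.
print(unquote("a+b%2Bc"))  # a+b+c
print(unquote_plus("a+b%2Bc"))  # a b+c
```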

.. function:: unquote_plus(string, encoding='utf-8', errors='replace')

   Like :func:`unquote`, but also replace plus signs with spaces, as required
   for unquoting HTML form values.

   *string* must be a :class:`str`.

   Example: ``unquote_plus('/El+Ni%C3%B1o/')`` yields ``'/El Niño/'``.


.. function:: unquote_to_bytes(string)

   Replace ``%xx`` escapes with their single-octet equivalent, and return a
   :class:`bytes` object.

   *string* may be either a :class:`str` or a :class:`bytes` object.

   If it is a :class:`str`, unescaped non-ASCII characters in *string*
   are encoded into UTF-8 bytes.

   Example: ``unquote_to_bytes('a%26%EF')`` yields ``b'a&\xef'``.

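A brief sketch of :func:`unquote_to_bytes` with both input types follows; the
inputs are arbitrary examples.

```python
from urllib.parse import unquote_to_bytes

# Escapes become single octets, whatever encoding they came from.
from_str = unquote_to_bytes("a%26%EF")
print(from_str)  # b'a&\xef'

# An unescaped non-ASCII character in a str is encoded as UTF-8.
mixed = unquote_to_bytes("é%41")
print(mixed)  # b'\xc3\xa9A'

# bytes input is accepted as well.
from_bytes = unquote_to_bytes(b"%7Euser")
print(from_bytes)  # b'~user'
```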

.. function:: urlencode(query, doseq=False, safe='', encoding=None, \
                        errors=None, quote_via=quote_plus)

   Convert a mapping object or a sequence of two-element tuples, which may
   contain :class:`str` or :class:`bytes` objects, to a percent-encoded ASCII
   text string.  If the resultant string is to be used as the *data* for a POST
   operation with the :func:`~urllib.request.urlopen` function, then
   it should be encoded to bytes, otherwise it would result in a
   :exc:`TypeError`.

   The resulting string is a series of ``key=value`` pairs separated by ``'&'``
   characters, where both *key* and *value* are quoted using the *quote_via*
   function.  By default, :func:`quote_plus` is used to quote the values, which
   means spaces are quoted as a ``'+'`` character and ``'/'`` characters are
   encoded as ``%2F``, which follows the standard for GET requests
   (``application/x-www-form-urlencoded``).  An alternate function that can be
   passed as *quote_via* is :func:`quote`, which will encode spaces as ``%20``.
   Because *safe* (default ``''``) is passed down to *quote_via*, ``'/'``
   characters are left unencoded only if *safe* includes them.  For maximum
   control of what is quoted, use :func:`quote` and specify a value for *safe*.

   When a sequence of two-element tuples is used as the *query*
   argument, the first element of each tuple is a key and the second is a
   value. The value element in itself can be a sequence and in that case, if
   the optional parameter *doseq* evaluates to ``True``, individual
   ``key=value`` pairs separated by ``'&'`` are generated for each element of
   the value sequence for the key.  The order of parameters in the encoded
   string will match the order of parameter tuples in the sequence.

   The *safe*, *encoding*, and *errors* parameters are passed down to
   *quote_via* (the *encoding* and *errors* parameters are only passed
   when a query element is a :class:`str`).

   To reverse this encoding process, :func:`parse_qs` and :func:`parse_qsl` are
   provided in this module to parse query strings into Python data structures.

   Refer to :ref:`urllib examples <urllib-examples>` to find out how the
   :func:`urllib.parse.urlencode` function can be used for generating the query
   string of a URL or data for a POST request.

   .. versionchanged:: 3.2
      *query* supports bytes and string objects.

   .. versionadded:: 3.5
      *quote_via* parameter.

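The interaction of *doseq*, *quote_via* and *safe* in :func:`urlencode` can be
sketched like this. The key names and values are invented for the example.

```python
from urllib.parse import parse_qs, quote, urlencode

pairs = [("q", "El Niño/a b"), ("tag", ["x", "y"])]

# Default quote_plus: spaces become '+', '/' becomes %2F, and doseq=True
# expands the list into one key=value pair per element.
default = urlencode(pairs, doseq=True)
print(default)  # q=El+Ni%C3%B1o%2Fa+b&tag=x&tag=y

# quote_via=quote encodes spaces as %20; '/' stays literal only because
# safe='/' is passed explicitly here.
percent = urlencode(pairs, doseq=True, quote_via=quote, safe="/")
print(percent)  # q=El%20Ni%C3%B1o/a%20b&tag=x&tag=y

# The result is ASCII text; encode it before using it as POST data.
body = default.encode("ascii")
print(type(body).__name__)  # bytes

# parse_qs() reverses the encoding.
decoded = parse_qs(default)
print(decoded)  # {'q': ['El Niño/a b'], 'tag': ['x', 'y']}
```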

.. seealso::

   `WHATWG`_ -  URL Living standard
      Working Group for the URL Standard that defines URLs, domains, IP addresses, the
      application/x-www-form-urlencoded format, and their API.

   :rfc:`3986` - Uniform Resource Identifiers
      This is the current standard (STD66). Any changes to urllib.parse module
      should conform to this. Certain deviations could be observed, which are
      mostly for backward compatibility purposes and for certain de-facto
      parsing requirements as commonly observed in major browsers.

   :rfc:`2732` - Format for Literal IPv6 Addresses in URL's.
      This specifies the parsing requirements of IPv6 URLs.

   :rfc:`2396` - Uniform Resource Identifiers (URI): Generic Syntax
      Document describing the generic syntactic requirements for both Uniform Resource
      Names (URNs) and Uniform Resource Locators (URLs).

   :rfc:`2368` - The mailto URL scheme.
      Parsing requirements for mailto URL schemes.

   :rfc:`1808` - Relative Uniform Resource Locators
      This Request For Comments includes the rules for joining an absolute and a
      relative URL, including a fair number of "Abnormal Examples" which govern the
      treatment of border cases.

   :rfc:`1738` - Uniform Resource Locators (URL)
      This specifies the formal syntax and semantics of absolute URLs.

.. _WHATWG: https://url.spec.whatwg.org/