:mod:`!urllib.request` --- Extensible library for opening URLs
==============================================================
3
4.. module:: urllib.request
5   :synopsis: Extensible library for opening URLs.
6
7.. moduleauthor:: Jeremy Hylton <jeremy@alum.mit.edu>
8.. sectionauthor:: Moshe Zadka <moshez@users.sourceforge.net>
9.. sectionauthor:: Senthil Kumaran <senthil@uthcode.com>
10
11**Source code:** :source:`Lib/urllib/request.py`
12
13--------------
14
15The :mod:`urllib.request` module defines functions and classes which help in
16opening URLs (mostly HTTP) in a complex world --- basic and digest
17authentication, redirections, cookies and more.
18
19.. seealso::
20
21    The `Requests package <https://requests.readthedocs.io/en/master/>`_
22    is recommended for a higher-level HTTP client interface.
23
24.. warning::
25
26   On macOS it is unsafe to use this module in programs using
27   :func:`os.fork` because the :func:`getproxies` implementation for
28   macOS uses a higher-level system API. Set the environment variable
29   ``no_proxy`` to ``*`` to avoid this problem
30   (e.g. ``os.environ["no_proxy"] = "*"``).
31
32.. include:: ../includes/wasm-notavail.rst
33
34The :mod:`urllib.request` module defines the following functions:
35
36
37.. function:: urlopen(url, data=None[, timeout], *, context=None)
38
39   Open *url*, which can be either a string containing a valid, properly
40   encoded URL, or a :class:`Request` object.
41
42   *data* must be an object specifying additional data to be sent to the
43   server, or ``None`` if no such data is needed.  See :class:`Request`
44   for details.
45
   The :mod:`urllib.request` module uses HTTP/1.1 and includes a
   ``Connection: close`` header in its HTTP requests.
48
49   The optional *timeout* parameter specifies a timeout in seconds for
50   blocking operations like the connection attempt (if not specified,
51   the global default timeout setting will be used).  This actually
52   only works for HTTP, HTTPS and FTP connections.
53
54   If *context* is specified, it must be a :class:`ssl.SSLContext` instance
55   describing the various SSL options. See :class:`~http.client.HTTPSConnection`
56   for more details.
57
58   This function always returns an object which can work as a
59   :term:`context manager` and has the properties *url*, *headers*, and *status*.
60   See :class:`urllib.response.addinfourl` for more detail on these properties.
61
   For HTTP and HTTPS URLs, this function returns a slightly modified
   :class:`http.client.HTTPResponse` object. In addition to the three
   properties above, the :attr:`!msg` attribute contains the same
   information as the :attr:`~http.client.HTTPResponse.reason` attribute
   (the reason phrase returned by the server) instead of the response
   headers as specified in the documentation for
   :class:`~http.client.HTTPResponse`.
69
70   For FTP, file, and data URLs and requests explicitly handled by legacy
71   :class:`URLopener` and :class:`FancyURLopener` classes, this function
72   returns a :class:`urllib.response.addinfourl` object.
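
   As a minimal sketch of the return value described above, the following
   opens a ``data:`` URL, which needs no network access. The URL content is
   made up for illustration::

      import urllib.request

      # A data: URL carries its payload inline, so no server is contacted.
      url = "data:text/plain;charset=utf-8,hello%20world"

      # The returned object works as a context manager and is closed on exit.
      with urllib.request.urlopen(url) as response:
          body = response.read()
          content_type = response.headers["Content-Type"]

      print(body)          # b'hello world'
      print(content_type)  # text/plain;charset=utf-8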
73
74   Raises :exc:`~urllib.error.URLError` on protocol errors.
75
76   Note that ``None`` may be returned if no handler handles the request (though
77   the default installed global :class:`OpenerDirector` uses
78   :class:`UnknownHandler` to ensure this never happens).
79
80   In addition, if proxy settings are detected (for example, when a ``*_proxy``
81   environment variable like :envvar:`!http_proxy` is set),
82   :class:`ProxyHandler` is default installed and makes sure the requests are
83   handled through the proxy.
84
85   The legacy ``urllib.urlopen`` function from Python 2.6 and earlier has been
86   discontinued; :func:`urllib.request.urlopen` corresponds to the old
87   ``urllib2.urlopen``.  Proxy handling, which was done by passing a dictionary
88   parameter to ``urllib.urlopen``, can be obtained by using
89   :class:`ProxyHandler` objects.
90
91   .. audit-event:: urllib.Request fullurl,data,headers,method urllib.request.urlopen
92
93      The default opener raises an :ref:`auditing event <auditing>`
94      ``urllib.Request`` with arguments ``fullurl``, ``data``, ``headers``,
95      ``method`` taken from the request object.
96
97   .. versionchanged:: 3.2
98      *cafile* and *capath* were added.
99
100      HTTPS virtual hosts are now supported if possible (that is, if
101      :const:`ssl.HAS_SNI` is true).
102
103      *data* can be an iterable object.
104
105   .. versionchanged:: 3.3
106      *cadefault* was added.
107
108   .. versionchanged:: 3.4.3
109      *context* was added.
110
111   .. versionchanged:: 3.10
      HTTPS connections now send an ALPN extension with protocol indicator
      ``http/1.1`` when no *context* is given. A custom *context* should set
      ALPN protocols with :meth:`~ssl.SSLContext.set_alpn_protocols`.
115
116   .. versionchanged:: 3.13
117      Remove *cafile*, *capath* and *cadefault* parameters: use the *context*
118      parameter instead.
119
120
121.. function:: install_opener(opener)
122
123   Install an :class:`OpenerDirector` instance as the default global opener.
124   Installing an opener is only necessary if you want urlopen to use that
125   opener; otherwise, simply call :meth:`OpenerDirector.open` instead of
126   :func:`~urllib.request.urlopen`.  The code does not check for a real
127   :class:`OpenerDirector`, and any class with the appropriate interface will
128   work.
129
130
131.. function:: build_opener([handler, ...])
132
133   Return an :class:`OpenerDirector` instance, which chains the handlers in the
134   order given. *handler*\s can be either instances of :class:`BaseHandler`, or
135   subclasses of :class:`BaseHandler` (in which case it must be possible to call
136   the constructor without any parameters).  Instances of the following classes
137   will be in front of the *handler*\s, unless the *handler*\s contain them,
138   instances of them or subclasses of them: :class:`ProxyHandler` (if proxy
139   settings are detected), :class:`UnknownHandler`, :class:`HTTPHandler`,
140   :class:`HTTPDefaultErrorHandler`, :class:`HTTPRedirectHandler`,
141   :class:`FTPHandler`, :class:`FileHandler`, :class:`HTTPErrorProcessor`.
142
143   If the Python installation has SSL support (i.e., if the :mod:`ssl` module
144   can be imported), :class:`HTTPSHandler` will also be added.
145
146   A :class:`BaseHandler` subclass may also change its :attr:`handler_order`
147   attribute to modify its position in the handlers list.
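
   The following sketch illustrates passing a handler class to
   :func:`build_opener`. ``EchoHandler`` and the ``echo:`` scheme are
   invented for this example so that it runs without network access; they
   are not part of the module::

      import io
      import urllib.request
      import urllib.response
      from email.message import Message


      class EchoHandler(urllib.request.BaseHandler):
          """Hypothetical handler for an invented ``echo:`` scheme."""

          def echo_open(self, req):
              # Return the part of the URL after "echo:" as the response body.
              body = req.selector.encode("utf-8")
              return urllib.response.addinfourl(
                  io.BytesIO(body), Message(), req.full_url)


      # A class (not an instance) is accepted because its constructor
      # can be called without arguments.
      opener = urllib.request.build_opener(EchoHandler)
      with opener.open("echo:hello") as response:
          echoed = response.read()

      print(echoed)  # b'hello'

   Calling :func:`install_opener` with ``opener`` would make
   :func:`urlopen` use it as well; the sketch calls
   :meth:`OpenerDirector.open` directly instead.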
148
149
150.. function:: pathname2url(path)
151
152   Convert the given local path to a ``file:`` URL. This function uses
153   :func:`~urllib.parse.quote` function to encode the path. For historical
154   reasons, the return value omits the ``file:`` scheme prefix. This example
155   shows the function being used on Windows::
156
157      >>> from urllib.request import pathname2url
158      >>> path = 'C:\\Program Files'
159      >>> 'file:' + pathname2url(path)
160      'file:///C:/Program%20Files'
161
162
163.. function:: url2pathname(url)
164
165   Convert the given ``file:`` URL to a local path. This function uses
166   :func:`~urllib.parse.unquote` to decode the URL. For historical reasons,
167   the given value *must* omit the ``file:`` scheme prefix. This example shows
168   the function being used on Windows::
169
170      >>> from urllib.request import url2pathname
171      >>> url = 'file:///C:/Program%20Files'
172      >>> url2pathname(url.removeprefix('file:'))
173      'C:\\Program Files'
174
175.. function:: getproxies()
176
   This helper function returns a dictionary of scheme to proxy server URL
   mappings. On all operating systems it first scans the environment,
   case-insensitively, for variables named ``<scheme>_proxy``. If it finds
   none, it looks up proxy information in System Configuration on macOS
   or in the Windows Systems Registry on Windows.
   If both lowercase and uppercase environment variables exist (and disagree),
   lowercase is preferred.
184
185   .. note::
186
187      If the environment variable ``REQUEST_METHOD`` is set, which usually
188      indicates your script is running in a CGI environment, the environment
189      variable ``HTTP_PROXY`` (uppercase ``_PROXY``) will be ignored. This is
190      because that variable can be injected by a client using the "Proxy:" HTTP
191      header. If you need to use an HTTP proxy in a CGI environment, either use
192      ``ProxyHandler`` explicitly, or make sure the variable name is in
193      lowercase (or at least the ``_proxy`` suffix).
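
   A small sketch of the environment lookup, assuming ``REQUEST_METHOD`` is
   not set. The proxy address is a placeholder, and
   :func:`unittest.mock.patch.dict` is used only to keep the change to
   :data:`os.environ` temporary::

      import os
      import urllib.request
      from unittest import mock

      # Hypothetical proxy address, set only for the duration of the block.
      with mock.patch.dict(os.environ, {"http_proxy": "http://proxy.example:3128"}):
          proxies = urllib.request.getproxies()

      print(proxies["http"])  # http://proxy.example:3128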
194
195
196The following classes are provided:
197
198.. class:: Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None)
199
200   This class is an abstraction of a URL request.
201
202   *url* should be a string containing a valid, properly encoded URL.
203
204   *data* must be an object specifying additional data to send to the
205   server, or ``None`` if no such data is needed.  Currently HTTP
206   requests are the only ones that use *data*.  The supported object
207   types include bytes, file-like objects, and iterables of bytes-like objects.
   If neither a ``Content-Length`` nor a ``Transfer-Encoding`` header field
   has been provided, :class:`HTTPHandler` will set these headers according
   to the type of *data*.  ``Content-Length`` will be used to send
   bytes objects, while ``Transfer-Encoding: chunked`` as specified in
   :rfc:`7230`, Section 3.3.1 will be used to send files and other iterables.
213
214   For an HTTP POST request method, *data* should be a buffer in the
215   standard :mimetype:`application/x-www-form-urlencoded` format.  The
216   :func:`urllib.parse.urlencode` function takes a mapping or sequence
217   of 2-tuples and returns an ASCII string in this format. It should
218   be encoded to bytes before being used as the *data* parameter.
219
220   *headers* should be a dictionary, and will be treated as if
221   :meth:`add_header` was called with each key and value as arguments.
222   This is often used to "spoof" the ``User-Agent`` header value, which is
223   used by a browser to identify itself -- some HTTP servers only
224   allow requests coming from common browsers as opposed to scripts.
225   For example, Mozilla Firefox may identify itself as ``"Mozilla/5.0
226   (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"``, while
227   :mod:`urllib`'s default user agent string is
228   ``"Python-urllib/2.6"`` (on Python 2.6).
229   All header keys are sent in camel case.
230
231   An appropriate ``Content-Type`` header should be included if the *data*
232   argument is present.  If this header has not been provided and *data*
233   is not ``None``, ``Content-Type: application/x-www-form-urlencoded`` will
234   be added as a default.
235
236   The next two arguments are only of interest for correct handling
237   of third-party HTTP cookies:
238
239   *origin_req_host* should be the request-host of the origin
240   transaction, as defined by :rfc:`2965`.  It defaults to
241   ``http.cookiejar.request_host(self)``.  This is the host name or IP
242   address of the original request that was initiated by the user.
243   For example, if the request is for an image in an HTML document,
244   this should be the request-host of the request for the page
245   containing the image.
246
247   *unverifiable* should indicate whether the request is unverifiable,
248   as defined by :rfc:`2965`.  It defaults to ``False``.  An unverifiable
249   request is one whose URL the user did not have the option to
250   approve.  For example, if the request is for an image in an HTML
251   document, and the user had no option to approve the automatic
252   fetching of the image, this should be true.
253
254   *method* should be a string that indicates the HTTP request method that
255   will be used (e.g. ``'HEAD'``).  If provided, its value is stored in the
256   :attr:`~Request.method` attribute and is used by :meth:`get_method`.
257   The default is ``'GET'`` if *data* is ``None`` or ``'POST'`` otherwise.
258   Subclasses may indicate a different default method by setting the
259   :attr:`~Request.method` attribute in the class itself.
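
   The arguments described above can be combined as in the following
   sketch. The URL, form fields and user agent string are placeholders, and
   nothing is sent because the request is never opened::

      import urllib.parse
      import urllib.request

      # urlencode() returns a str, so encode it to bytes before use as data.
      form = urllib.parse.urlencode({"name": "spam", "count": 3}).encode("ascii")
      request = urllib.request.Request(
          "http://www.example.com/submit",
          data=form,
          headers={"User-Agent": "example-client/0.1"},
      )
      print(request.data)          # b'name=spam&count=3'
      print(request.get_method())  # POST, because data is not None

      # An explicit method overrides the GET/POST default.
      head = urllib.request.Request("http://www.example.com/", method="HEAD")
      print(head.get_method())     # HEAD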
260
261   .. note::
262      The request will not work as expected if the data object is unable
263      to deliver its content more than once (e.g. a file or an iterable
264      that can produce the content only once) and the request is retried
265      for HTTP redirects or authentication.  The *data* is sent to the
266      HTTP server right away after the headers.  There is no support for
267      a 100-continue expectation in the library.
268
269   .. versionchanged:: 3.3
270      :attr:`Request.method` argument is added to the Request class.
271
272   .. versionchanged:: 3.4
273      Default :attr:`Request.method` may be indicated at the class level.
274
275   .. versionchanged:: 3.6
276      Do not raise an error if the ``Content-Length`` has not been
277      provided and *data* is neither ``None`` nor a bytes object.
278      Fall back to use chunked transfer encoding instead.
279
280.. class:: OpenerDirector()
281
282   The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained
283   together. It manages the chaining of handlers, and recovery from errors.
284
285
286.. class:: BaseHandler()
287
288   This is the base class for all registered handlers --- and handles only the
289   simple mechanics of registration.
290
291
292.. class:: HTTPDefaultErrorHandler()
293
294   A class which defines a default handler for HTTP error responses; all responses
295   are turned into :exc:`~urllib.error.HTTPError` exceptions.
296
297
298.. class:: HTTPRedirectHandler()
299
300   A class to handle redirections.
301
302
303.. class:: HTTPCookieProcessor(cookiejar=None)
304
305   A class to handle HTTP Cookies.
306
307
308.. class:: ProxyHandler(proxies=None)
309
310   Cause requests to go through a proxy. If *proxies* is given, it must be a
311   dictionary mapping protocol names to URLs of proxies. The default is to read
312   the list of proxies from the environment variables
313   ``<protocol>_proxy``.  If no proxy environment variables are set, then
314   in a Windows environment proxy settings are obtained from the registry's
315   Internet Settings section, and in a macOS environment proxy information
316   is retrieved from the System Configuration Framework.
317
   To disable autodetected proxies, pass an empty dictionary.
319
320   The :envvar:`no_proxy` environment variable can be used to specify hosts
321   which shouldn't be reached via proxy; if set, it should be a comma-separated
322   list of hostname suffixes, optionally with ``:port`` appended, for example
323   ``cern.ch,ncsa.uiuc.edu,some.host:8080``.
324
325   .. note::
326
327      ``HTTP_PROXY`` will be ignored if a variable ``REQUEST_METHOD`` is set;
328      see the documentation on :func:`~urllib.request.getproxies`.
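
   Both uses of *proxies* can be sketched as follows. The proxy address is
   a placeholder, and no connection is made until a URL is opened::

      import urllib.request

      # Route http requests through an explicitly named (hypothetical) proxy.
      explicit = urllib.request.ProxyHandler({"http": "http://proxy.example:3128"})
      via_proxy = urllib.request.build_opener(explicit)

      # An empty dictionary disables proxy autodetection for this opener.
      direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))

      print(explicit.proxies)  # {'http': 'http://proxy.example:3128'}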
329
330
331.. class:: HTTPPasswordMgr()
332
333   Keep a database of  ``(realm, uri) -> (user, password)`` mappings.
334
335
336.. class:: HTTPPasswordMgrWithDefaultRealm()
337
338   Keep a database of  ``(realm, uri) -> (user, password)`` mappings. A realm of
339   ``None`` is considered a catch-all realm, which is searched if no other realm
340   fits.
341
342
343.. class:: HTTPPasswordMgrWithPriorAuth()
344
345   A variant of :class:`HTTPPasswordMgrWithDefaultRealm` that also has a
346   database of ``uri -> is_authenticated`` mappings.  Can be used by a
347   BasicAuth handler to determine when to send authentication credentials
348   immediately instead of waiting for a ``401`` response first.
349
350   .. versionadded:: 3.5
351
352
353.. class:: AbstractBasicAuthHandler(password_mgr=None)
354
355   This is a mixin class that helps with HTTP authentication, both to the remote
356   host and to a proxy. *password_mgr*, if given, should be something that is
357   compatible with :class:`HTTPPasswordMgr`; refer to section
358   :ref:`http-password-mgr` for information on the interface that must be
   supported.  If *password_mgr* also provides ``is_authenticated`` and
360   ``update_authenticated`` methods (see
361   :ref:`http-password-mgr-with-prior-auth`), then the handler will use the
362   ``is_authenticated`` result for a given URI to determine whether or not to
363   send authentication credentials with the request.  If ``is_authenticated``
364   returns ``True`` for the URI, credentials are sent.  If ``is_authenticated``
365   is ``False``, credentials are not sent, and then if a ``401`` response is
366   received the request is re-sent with the authentication credentials.  If
367   authentication succeeds, ``update_authenticated`` is called to set
368   ``is_authenticated`` ``True`` for the URI, so that subsequent requests to
369   the URI or any of its super-URIs will automatically include the
370   authentication credentials.
371
372   .. versionadded:: 3.5
373      Added ``is_authenticated`` support.
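
   As an illustrative sketch of the ``is_authenticated`` behaviour, the
   password manager below is told up front that credentials should be sent
   without waiting for a ``401`` response. The URI and credentials are
   placeholders::

      import urllib.request

      manager = urllib.request.HTTPPasswordMgrWithPriorAuth()
      # is_authenticated=True marks the URI (and URIs below it) as already
      # authenticated, so a handler using this manager sends credentials
      # with the first request.
      manager.add_password(None, "http://www.example.com/api/", "alice", "secret",
                           is_authenticated=True)
      handler = urllib.request.HTTPBasicAuthHandler(manager)

      print(manager.is_authenticated("http://www.example.com/api/items"))  # True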
374
375
376.. class:: HTTPBasicAuthHandler(password_mgr=None)
377
378   Handle authentication with the remote host. *password_mgr*, if given, should
379   be something that is compatible with :class:`HTTPPasswordMgr`; refer to
380   section :ref:`http-password-mgr` for information on the interface that must
   be supported. :class:`HTTPBasicAuthHandler` raises a :exc:`ValueError` when
   presented with an unsupported authentication scheme.
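
   A minimal sketch of wiring a password manager into an opener. The URI
   and credentials are placeholders, and no request is sent::

      import urllib.request

      manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
      # A realm of None acts as a catch-all for any realm the server names.
      manager.add_password(None, "http://www.example.com/private/", "alice", "secret")

      handler = urllib.request.HTTPBasicAuthHandler(manager)
      opener = urllib.request.build_opener(handler)

      # Credentials are looked up by realm and by URI prefix.
      found = manager.find_user_password(
          "Some Realm", "http://www.example.com/private/page")
      print(found)  # ('alice', 'secret')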
383
384
385.. class:: ProxyBasicAuthHandler(password_mgr=None)
386
387   Handle authentication with the proxy. *password_mgr*, if given, should be
388   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
389   :ref:`http-password-mgr` for information on the interface that must be
390   supported.
391
392
393.. class:: AbstractDigestAuthHandler(password_mgr=None)
394
395   This is a mixin class that helps with HTTP authentication, both to the remote
396   host and to a proxy. *password_mgr*, if given, should be something that is
397   compatible with :class:`HTTPPasswordMgr`; refer to section
398   :ref:`http-password-mgr` for information on the interface that must be
399   supported.
400
401
402.. class:: HTTPDigestAuthHandler(password_mgr=None)
403
404   Handle authentication with the remote host. *password_mgr*, if given, should
405   be something that is compatible with :class:`HTTPPasswordMgr`; refer to
406   section :ref:`http-password-mgr` for information on the interface that must
   be supported. When both the Digest Authentication handler and the Basic
   Authentication handler are added, Digest Authentication is always tried
   first. If Digest Authentication returns a 40x response again, the response
   is passed to the Basic Authentication handler.  This handler raises a
   :exc:`ValueError` when presented with an authentication scheme other than
   Digest or Basic.
413
414   .. versionchanged:: 3.3
415      Raise :exc:`ValueError` on unsupported Authentication Scheme.
416
417
418
419.. class:: ProxyDigestAuthHandler(password_mgr=None)
420
421   Handle authentication with the proxy. *password_mgr*, if given, should be
422   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
423   :ref:`http-password-mgr` for information on the interface that must be
424   supported.
425
426
427.. class:: HTTPHandler()
428
429   A class to handle opening of HTTP URLs.
430
431
432.. class:: HTTPSHandler(debuglevel=0, context=None, check_hostname=None)
433
434   A class to handle opening of HTTPS URLs.  *context* and *check_hostname*
435   have the same meaning as in :class:`http.client.HTTPSConnection`.
436
437   .. versionchanged:: 3.2
438      *context* and *check_hostname* were added.
439
440
441.. class:: FileHandler()
442
443   Open local files.
444
445.. class:: DataHandler()
446
447   Open data URLs.
448
449   .. versionadded:: 3.4
450
451.. class:: FTPHandler()
452
453   Open FTP URLs.
454
455
456.. class:: CacheFTPHandler()
457
458   Open FTP URLs, keeping a cache of open FTP connections to minimize delays.
459
460
461.. class:: UnknownHandler()
462
463   A catch-all class to handle unknown URLs.
464
465
466.. class:: HTTPErrorProcessor()
467
468   Process HTTP error responses.
469
470
471.. _request-objects:
472
473Request Objects
474---------------
475
476The following methods describe :class:`Request`'s public interface,
477and so all may be overridden in subclasses.  It also defines several
478public attributes that can be used by clients to inspect the parsed
479request.
480
481.. attribute:: Request.full_url
482
483   The original URL passed to the constructor.
484
   .. versionchanged:: 3.4
      ``Request.full_url`` is a property with a setter, a getter and a
      deleter. Getting :attr:`~Request.full_url` returns the original
      request URL with the fragment, if it was present.
490
491.. attribute:: Request.type
492
493   The URI scheme.
494
495.. attribute:: Request.host
496
497   The URI authority, typically a host, but may also contain a port
498   separated by a colon.
499
500.. attribute:: Request.origin_req_host
501
502   The original host for the request, without port.
503
504.. attribute:: Request.selector
505
506   The URI path.  If the :class:`Request` uses a proxy, then selector
507   will be the full URL that is passed to the proxy.
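
The attributes described so far can be inspected as in this short sketch,
which uses a placeholder URL chosen to exercise each component::

   import urllib.request

   request = urllib.request.Request("http://www.example.com:8080/path?q=1#frag")

   print(request.full_url)         # http://www.example.com:8080/path?q=1#frag
   print(request.type)             # http
   print(request.host)             # www.example.com:8080
   print(request.origin_req_host)  # www.example.com
   print(request.selector)         # /path?q=1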
508
509.. attribute:: Request.data
510
511   The entity body for the request, or ``None`` if not specified.
512
513   .. versionchanged:: 3.4
514      Changing value of :attr:`Request.data` now deletes "Content-Length"
515      header if it was previously set or calculated.
516
517.. attribute:: Request.unverifiable
518
   A boolean that indicates whether the request is unverifiable as defined
   by :rfc:`2965`.
521
522.. attribute:: Request.method
523
524   The HTTP request method to use.  By default its value is :const:`None`,
525   which means that :meth:`~Request.get_method` will do its normal computation
526   of the method to be used.  Its value can be set (thus overriding the default
527   computation in :meth:`~Request.get_method`) either by providing a default
528   value by setting it at the class level in a :class:`Request` subclass, or by
529   passing a value in to the :class:`Request` constructor via the *method*
530   argument.
531
532   .. versionadded:: 3.3
533
534   .. versionchanged:: 3.4
535      A default value can now be set in subclasses; previously it could only
536      be set via the constructor argument.
537
538
539.. method:: Request.get_method()
540
541   Return a string indicating the HTTP request method.  If
542   :attr:`Request.method` is not ``None``, return its value, otherwise return
543   ``'GET'`` if :attr:`Request.data` is ``None``, or ``'POST'`` if it's not.
544   This is only meaningful for HTTP requests.
545
546   .. versionchanged:: 3.3
547      get_method now looks at the value of :attr:`Request.method`.
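
   The resolution order can be sketched as follows. ``PutRequest`` is a
   hypothetical subclass written for this example, and the URL is a
   placeholder::

      import urllib.request


      class PutRequest(urllib.request.Request):
          """Hypothetical subclass whose default method is PUT."""
          method = "PUT"


      url = "http://www.example.com/resource"
      print(urllib.request.Request(url).get_method())             # GET
      print(urllib.request.Request(url, data=b"x").get_method())  # POST
      print(PutRequest(url).get_method())                         # PUT
      print(PutRequest(url, method="DELETE").get_method())        # DELETE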
548
549
550.. method:: Request.add_header(key, val)
551
552   Add another header to the request.  Headers are currently ignored by all
553   handlers except HTTP handlers, where they are added to the list of headers sent
554   to the server.  Note that there cannot be more than one header with the same
555   name, and later calls will overwrite previous calls in case the *key* collides.
556   Currently, this is no loss of HTTP functionality, since all headers which have
557   meaning when used more than once have a (header-specific) way of gaining the
558   same functionality using only one header.  Note that headers added using
559   this method are also added to redirected requests.
560
561
562.. method:: Request.add_unredirected_header(key, header)
563
564   Add a header that will not be added to a redirected request.
565
566
567.. method:: Request.has_header(header)
568
569   Return whether the instance has the named header (checks both regular and
570   unredirected).
571
572
573.. method:: Request.remove_header(header)
574
575   Remove named header from the request instance (both from regular and
576   unredirected headers).
577
578   .. versionadded:: 3.4
579
580
581.. method:: Request.get_full_url()
582
583   Return the URL given in the constructor.
584
   .. versionchanged:: 3.4
      Returns :attr:`Request.full_url`.
588
589
590.. method:: Request.set_proxy(host, type)
591
592   Prepare the request by connecting to a proxy server. The *host* and *type* will
593   replace those of the instance, and the instance's selector will be the original
594   URL given in the constructor.
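
   A brief sketch of the effect on a plain ``http`` request; the origin URL
   and the proxy address are placeholders::

      import urllib.request

      request = urllib.request.Request("http://www.example.com/index.html")
      request.set_proxy("proxy.example:3128", "http")

      # The host now names the proxy, and the selector is the original URL.
      print(request.host)      # proxy.example:3128
      print(request.selector)  # http://www.example.com/index.html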
595
596
597.. method:: Request.get_header(header_name, default=None)
598
599   Return the value of the given header. If the header is not present, return
600   the default value.
601
602
603.. method:: Request.header_items()
604
605   Return a list of tuples (header_name, header_value) of the Request headers.
606
607.. versionchanged:: 3.4
608   The request methods add_data, has_data, get_data, get_type, get_host,
609   get_selector, get_origin_req_host and is_unverifiable that were deprecated
610   since 3.3 have been removed.
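
The header methods above can be exercised without sending anything, as in
this sketch. The URL and header values are placeholders::

   import urllib.request

   request = urllib.request.Request("http://www.example.com/")
   request.add_header("Accept", "text/html")
   request.add_header("Accept", "application/json")  # overwrites the first value
   request.add_unredirected_header("Authorization", "Bearer example-token")

   print(request.has_header("Accept"))         # True
   print(request.get_header("Accept"))         # application/json
   print(request.has_header("Authorization"))  # True, unredirected headers count

   request.remove_header("Accept")
   print(request.get_header("Accept", "n/a"))  # n/a
   print(request.header_items())  # [('Authorization', 'Bearer example-token')]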
611
612
613.. _opener-director-objects:
614
615OpenerDirector Objects
616----------------------
617
618:class:`OpenerDirector` instances have the following methods:
619
620
621.. method:: OpenerDirector.add_handler(handler)
622
623   *handler* should be an instance of :class:`BaseHandler`.  The following methods
624   are searched, and added to the possible chains (note that HTTP errors are a
625   special case).  Note that, in the following, *protocol* should be replaced
626   with the actual protocol to handle, for example :meth:`http_response` would
627   be the HTTP protocol response handler.  Also *type* should be replaced with
628   the actual HTTP code, for example :meth:`http_error_404` would handle HTTP
629   404 errors.
630
631   * :meth:`!<protocol>_open` --- signal that the handler knows how to open *protocol*
632     URLs.
633
634     See |protocol_open|_ for more information.
635
636   * :meth:`!http_error_\<type\>` --- signal that the handler knows how to handle HTTP
637     errors with HTTP error code *type*.
638
639     See |http_error_nnn|_ for more information.
640
641   * :meth:`!<protocol>_error` --- signal that the handler knows how to handle errors
642     from (non-\ ``http``) *protocol*.
643
644   * :meth:`!<protocol>_request` --- signal that the handler knows how to pre-process
645     *protocol* requests.
646
647     See |protocol_request|_ for more information.
648
649   * :meth:`!<protocol>_response` --- signal that the handler knows how to
650     post-process *protocol* responses.
651
652     See |protocol_response|_ for more information.
653
654.. |protocol_open| replace:: :meth:`BaseHandler.<protocol>_open`
655.. |http_error_nnn| replace:: :meth:`BaseHandler.http_error_\<nnn\>`
656.. |protocol_request| replace:: :meth:`BaseHandler.<protocol>_request`
657.. |protocol_response| replace:: :meth:`BaseHandler.<protocol>_response`
658
659.. method:: OpenerDirector.open(url, data=None[, timeout])
660
661   Open the given *url* (which can be a request object or a string), optionally
662   passing the given *data*. Arguments, return values and exceptions raised are
663   the same as those of :func:`urlopen` (which simply calls the :meth:`open`
664   method on the currently installed global :class:`OpenerDirector`).  The
665   optional *timeout* parameter specifies a timeout in seconds for blocking
666   operations like the connection attempt (if not specified, the global default
667   timeout setting will be used). The timeout feature actually works only for
668   HTTP, HTTPS and FTP connections.
669
670
671.. method:: OpenerDirector.error(proto, *args)
672
673   Handle an error of the given protocol.  This will call the registered error
674   handlers for the given protocol with the given arguments (which are protocol
675   specific).  The HTTP protocol is a special case which uses the HTTP response
676   code to determine the specific error handler; refer to the :meth:`!http_error_\<type\>`
677   methods of the handler classes.
678
679   Return values and exceptions raised are the same as those of :func:`urlopen`.
680
681OpenerDirector objects open URLs in three stages:
682
683The order in which these methods are called within each stage is determined by
684sorting the handler instances.
685
686#. Every handler with a method named like :meth:`!<protocol>_request` has that
687   method called to pre-process the request.
688
689#. Handlers with a method named like :meth:`!<protocol>_open` are called to handle
690   the request. This stage ends when a handler either returns a non-\ :const:`None`
   value (i.e. a response), or raises an exception (usually
692   :exc:`~urllib.error.URLError`).  Exceptions are allowed to propagate.
693
694   In fact, the above algorithm is first tried for methods named
695   :meth:`~BaseHandler.default_open`.  If all such methods return :const:`None`, the algorithm
696   is repeated for methods named like :meth:`!<protocol>_open`.  If all such methods
697   return :const:`None`, the algorithm is repeated for methods named
698   :meth:`~BaseHandler.unknown_open`.
699
700   Note that the implementation of these methods may involve calls of the parent
701   :class:`OpenerDirector` instance's :meth:`~OpenerDirector.open` and
702   :meth:`~OpenerDirector.error` methods.
703
704#. Every handler with a method named like :meth:`!<protocol>_response` has that
705   method called to post-process the response.
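
The three stages can be observed with a small sketch. ``TraceHandler`` and
the ``trace:`` scheme are invented for this example so that it runs without
network access::

   import io
   import urllib.request
   import urllib.response
   from email.message import Message

   calls = []


   class TraceHandler(urllib.request.BaseHandler):
       """Hypothetical handler for an invented ``trace:`` scheme."""

       def trace_request(self, req):
           calls.append("request")   # stage 1: pre-process the request
           return req

       def trace_open(self, req):
           calls.append("open")      # stage 2: produce a response
           return urllib.response.addinfourl(
               io.BytesIO(b"ok"), Message(), req.full_url)

       def trace_response(self, req, response):
           calls.append("response")  # stage 3: post-process the response
           return response


   opener = urllib.request.OpenerDirector()
   opener.add_handler(TraceHandler())
   opener.open("trace:demo").close()

   print(calls)  # ['request', 'open', 'response']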
706
707
708.. _base-handler-objects:
709
710BaseHandler Objects
711-------------------
712
713:class:`BaseHandler` objects provide a couple of methods that are directly
714useful, and others that are meant to be used by derived classes.  These are
715intended for direct use:
716
717
718.. method:: BaseHandler.add_parent(director)
719
720   Add a director as parent.
721
722
723.. method:: BaseHandler.close()
724
725   Remove any parents.
726
727The following attribute and methods should only be used by classes derived from
728:class:`BaseHandler`.
729
730.. note::
731
732   The convention has been adopted that subclasses defining
733   :meth:`!<protocol>_request` or :meth:`!<protocol>_response` methods are named
734   :class:`!\*Processor`; all others are named :class:`!\*Handler`.
735
736
737.. attribute:: BaseHandler.parent
738
739   A valid :class:`OpenerDirector`, which can be used to open using a different
740   protocol, or handle errors.
741
742
743.. method:: BaseHandler.default_open(req)
744
745   This method is *not* defined in :class:`BaseHandler`, but subclasses should
746   define it if they want to catch all URLs.
747
748   This method, if implemented, will be called by the parent
749   :class:`OpenerDirector`.  It should return a file-like object as described in
750   the return value of the :meth:`~OpenerDirector.open` method of :class:`OpenerDirector`, or ``None``.
751   It should raise :exc:`~urllib.error.URLError`, unless a truly exceptional
752   thing happens (for example, :exc:`MemoryError` should not be mapped to
753   :exc:`~urllib.error.URLError`).
754
755   This method will be called before any protocol-specific open method.
756
757
758.. _protocol_open:
759.. method:: BaseHandler.<protocol>_open(req)
760   :noindex:
761
762   This method is *not* defined in :class:`BaseHandler`, but subclasses should
763   define it if they want to handle URLs with the given protocol.
764
765   This method, if defined, will be called by the parent :class:`OpenerDirector`.
766   Return values should be the same as for  :meth:`~BaseHandler.default_open`.
767
768
769.. method:: BaseHandler.unknown_open(req)
770
771   This method is *not* defined in :class:`BaseHandler`, but subclasses should
772   define it if they want to catch all URLs with no specific registered handler to
773   open it.
774
775   This method, if implemented, will be called by the :attr:`parent`
776   :class:`OpenerDirector`.  Return values should be the same as for
777   :meth:`default_open`.
778
779
780.. method:: BaseHandler.http_error_default(req, fp, code, msg, hdrs)
781
782   This method is *not* defined in :class:`BaseHandler`, but subclasses should
783   override it if they intend to provide a catch-all for otherwise unhandled HTTP
784   errors.  It will be called automatically by the  :class:`OpenerDirector` getting
785   the error, and should not normally be called in other circumstances.
786
787   *req* will be a :class:`Request` object, *fp* will be a file-like object with
788   the HTTP error body, *code* will be the three-digit code of the error, *msg*
789   will be the user-visible explanation of the code and *hdrs* will be a mapping
790   object with the headers of the error.
791
792   Return values and exceptions raised should be the same as those of
793   :func:`urlopen`.
794
795
796.. _http_error_nnn:
797.. method:: BaseHandler.http_error_<nnn>(req, fp, code, msg, hdrs)
798
799   *nnn* should be a three-digit HTTP error code.  This method is also not defined
800   in :class:`BaseHandler`, but will be called, if it exists, on an instance of a
801   subclass, when an HTTP error with code *nnn* occurs.
802
803   Subclasses should override this method to handle specific HTTP errors.
804
805   Arguments, return values and exceptions raised should be the same as for
806   :meth:`~BaseHandler.http_error_default`.
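
   A minimal sketch, assuming an application that would rather inspect a 404
   response than catch an exception: the handler below (``NotFoundAsResponse``
   is a name invented for this example) returns the error body object *fp*
   instead of letting :exc:`~urllib.error.HTTPError` propagate::

      import urllib.request

      class NotFoundAsResponse(urllib.request.BaseHandler):
          """Hypothetical handler: hand 404 responses back to the caller."""

          def http_error_404(self, req, fp, code, msg, hdrs):
              # Returning a response object stops the error chain here;
              # returning None would let other handlers try.
              return fp

      opener = urllib.request.build_opener(NotFoundAsResponse)
      # opener.open(url) now returns the response for a 404 instead of raising.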
807
808
809.. _protocol_request:
810.. method:: BaseHandler.<protocol>_request(req)
811   :noindex:
812
813   This method is *not* defined in :class:`BaseHandler`, but subclasses should
814   define it if they want to pre-process requests of the given protocol.
815
816   This method, if defined, will be called by the parent :class:`OpenerDirector`.
817   *req* will be a :class:`Request` object. The return value should be a
818   :class:`Request` object.
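
   For illustration, a processor that stamps every outgoing HTTP request with
   an extra header might look like this (the class name and the
   ``X-Request-Source`` header are assumptions made up for the sketch)::

      import urllib.request

      class SourceHeaderProcessor(urllib.request.BaseHandler):
          """Hypothetical processor: add a header to each HTTP request."""

          def http_request(self, req):
              # Request.add_header() stores the name via str.capitalize().
              req.add_header('X-Request-Source', 'docs-example')
              return req

          # Reuse the same pre-processing for HTTPS requests.
          https_request = http_request

      opener = urllib.request.build_opener(SourceHeaderProcessor)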
819
820
821.. _protocol_response:
822.. method:: BaseHandler.<protocol>_response(req, response)
823   :noindex:
824
825   This method is *not* defined in :class:`BaseHandler`, but subclasses should
826   define it if they want to post-process responses of the given protocol.
827
828   This method, if defined, will be called by the parent :class:`OpenerDirector`.
829   *req* will be a :class:`Request` object. *response* will be an object
830   implementing the same interface as the return value of :func:`urlopen`.  The
831   return value should implement the same interface as the return value of
832   :func:`urlopen`.
833
834
835.. _http-redirect-handler:
836
837HTTPRedirectHandler Objects
838---------------------------
839
840.. note::
841
842   Some HTTP redirections require action from this module's client code.  If this
843   is the case, :exc:`~urllib.error.HTTPError` is raised.  See :rfc:`2616` for
844   details of the precise meanings of the various redirection codes.
845
   An :exc:`~urllib.error.HTTPError` exception is raised as a security
   consideration if the :class:`HTTPRedirectHandler` is presented with a
   redirected URL which is not an HTTP, HTTPS or FTP URL.
849
850
851.. method:: HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs, newurl)
852
853   Return a :class:`Request` or ``None`` in response to a redirect. This is called
854   by the default implementations of the :meth:`!http_error_30\*` methods when a
855   redirection is received from the server.  If a redirection should take place,
856   return a new :class:`Request` to allow :meth:`!http_error_30\*` to perform the
857   redirect to *newurl*.  Otherwise, raise :exc:`~urllib.error.HTTPError` if
858   no other handler should try to handle this URL, or return ``None`` if you
859   can't but another handler might.
860
861   .. note::
862
863      The default implementation of this method does not strictly follow :rfc:`2616`,
864      which says that 301 and 302 responses to ``POST`` requests must not be
865      automatically redirected without confirmation by the user.  In reality, browsers
866      do allow automatic redirection of these responses, changing the POST to a
867      ``GET``, and the default implementation reproduces this behavior.
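
   The default behavior can be observed by calling the default implementation
   directly.  This is only a sketch: the URLs are placeholders, and ``None``
   is passed for *fp* because no real response is involved::

      import urllib.error
      import urllib.request

      handler = urllib.request.HTTPRedirectHandler()
      post = urllib.request.Request('http://www.example.com/form', data=b'x=1')

      # A 302 answer to a POST is followed with a GET request.
      new_req = handler.redirect_request(
          post, None, 302, 'Found', {}, 'http://www.example.com/done')
      print(new_req.get_method())     # GET

      # A 307 answer to a POST is not followed automatically.
      try:
          handler.redirect_request(
              post, None, 307, 'Temporary Redirect', {},
              'http://www.example.com/done')
      except urllib.error.HTTPError as err:
          print(err.code)             # 307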
868
869
870.. method:: HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs)
871
872   Redirect to the ``Location:`` or ``URI:`` URL.  This method is called by the
873   parent :class:`OpenerDirector` when getting an HTTP 'moved permanently' response.
874
875
876.. method:: HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs)
877
878   The same as :meth:`http_error_301`, but called for the 'found' response.
879
880
881.. method:: HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs)
882
883   The same as :meth:`http_error_301`, but called for the 'see other' response.
884
885
886.. method:: HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs)
887
888   The same as :meth:`http_error_301`, but called for the 'temporary redirect'
889   response. It does not allow changing the request method from ``POST``
890   to ``GET``.
891
892
893.. method:: HTTPRedirectHandler.http_error_308(req, fp, code, msg, hdrs)
894
895   The same as :meth:`http_error_301`, but called for the 'permanent redirect'
896   response. It does not allow changing the request method from ``POST``
897   to ``GET``.
898
899   .. versionadded:: 3.11
900
901
902.. _http-cookie-processor:
903
904HTTPCookieProcessor Objects
905---------------------------
906
907:class:`HTTPCookieProcessor` instances have one attribute:
908
909.. attribute:: HTTPCookieProcessor.cookiejar
910
911   The :class:`http.cookiejar.CookieJar` in which cookies are stored.
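
   As a small sketch, sharing one jar between an opener and the calling code
   lets the application inspect cookies after requests have been made (the
   variable names are arbitrary)::

      import http.cookiejar
      import urllib.request

      jar = http.cookiejar.CookieJar()
      processor = urllib.request.HTTPCookieProcessor(jar)
      opener = urllib.request.build_opener(processor)

      # The processor exposes the jar it was given; it starts out empty.
      print(processor.cookiejar is jar)   # True
      print(len(jar))                     # 0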
912
913
914.. _proxy-handler:
915
916ProxyHandler Objects
917--------------------
918
919
920.. method:: ProxyHandler.<protocol>_open(request)
921   :noindex:
922
923   The :class:`ProxyHandler` will have a method :meth:`!<protocol>_open` for every
924   *protocol* which has a proxy in the *proxies* dictionary given in the
925   constructor.  The method will modify requests to go through the proxy, by
926   calling ``request.set_proxy()``, and call the next handler in the chain to
927   actually execute the protocol.
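
   A quick way to see this, sketched with a placeholder proxy URL::

      import urllib.request

      handler = urllib.request.ProxyHandler(
          {'http': 'http://proxy.example.com:3128/'})

      # Only protocols named in the *proxies* dictionary get an open method.
      print(hasattr(handler, 'http_open'))   # True
      print(hasattr(handler, 'ftp_open'))    # False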
928
929
930.. _http-password-mgr:
931
932HTTPPasswordMgr Objects
933-----------------------
934
935These methods are available on :class:`HTTPPasswordMgr` and
936:class:`HTTPPasswordMgrWithDefaultRealm` objects.
937
938
939.. method:: HTTPPasswordMgr.add_password(realm, uri, user, passwd)
940
941   *uri* can be either a single URI, or a sequence of URIs. *realm*, *user* and
942   *passwd* must be strings. This causes ``(user, passwd)`` to be used as
943   authentication tokens when authentication for *realm* and a super-URI of any of
944   the given URIs is given.
945
946
947.. method:: HTTPPasswordMgr.find_user_password(realm, authuri)
948
949   Get user/password for given realm and URI, if any.  This method will return
950   ``(None, None)`` if there is no matching user/password.
951
952   For :class:`HTTPPasswordMgrWithDefaultRealm` objects, the realm ``None`` will be
953   searched if the given *realm* has no matching user/password.
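
   A short sketch of both lookups, using made-up credentials and URLs::

      import urllib.request

      mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
      # Passing None as the realm registers a catch-all entry.
      mgr.add_password(None, 'https://www.example.com/api/', 'klem', 'secret')

      # Any realm matches as long as the URI lies below the registered one.
      found = mgr.find_user_password('Some Realm',
                                     'https://www.example.com/api/items')
      # A URI on another host has no credentials.
      missing = mgr.find_user_password('Some Realm',
                                       'https://other.example.com/')
      print(found)     # ('klem', 'secret')
      print(missing)   # (None, None)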
954
955
956.. _http-password-mgr-with-prior-auth:
957
958HTTPPasswordMgrWithPriorAuth Objects
959------------------------------------
960
961This password manager extends :class:`HTTPPasswordMgrWithDefaultRealm` to support
962tracking URIs for which authentication credentials should always be sent.
963
964
965.. method:: HTTPPasswordMgrWithPriorAuth.add_password(realm, uri, user, \
966            passwd, is_authenticated=False)
967
968   *realm*, *uri*, *user*, *passwd* are as for
969   :meth:`HTTPPasswordMgr.add_password`.  *is_authenticated* sets the initial
970   value of the ``is_authenticated`` flag for the given URI or list of URIs.
971   If *is_authenticated* is specified as ``True``, *realm* is ignored.
972
973
974.. method:: HTTPPasswordMgrWithPriorAuth.find_user_password(realm, authuri)
975
   Same as for :class:`HTTPPasswordMgrWithDefaultRealm` objects.
977
978
.. method:: HTTPPasswordMgrWithPriorAuth.update_authenticated(uri, \
            is_authenticated=False)
981
982   Update the ``is_authenticated`` flag for the given *uri* or list
983   of URIs.
984
985
.. method:: HTTPPasswordMgrWithPriorAuth.is_authenticated(authuri)
987
988   Returns the current state of the ``is_authenticated`` flag for
989   the given URI.
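
   The flag handling can be sketched as follows (credentials and URLs are
   placeholders)::

      import urllib.request

      mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
      mgr.add_password(None, 'https://www.example.com/', 'klem', 'secret',
                       is_authenticated=True)
      # URIs below the registered one report the stored flag.
      before = mgr.is_authenticated('https://www.example.com/private/')

      mgr.update_authenticated('https://www.example.com/', False)
      after = mgr.is_authenticated('https://www.example.com/private/')
      print(before, after)   # True False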
990
991
992.. _abstract-basic-auth-handler:
993
994AbstractBasicAuthHandler Objects
995--------------------------------
996
997
998.. method:: AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
999
1000   Handle an authentication request by getting a user/password pair, and re-trying
1001   the request.  *authreq* should be the name of the header where the information
1002   about the realm is included in the request, *host* specifies the URL and path to
1003   authenticate for, *req* should be the (failed) :class:`Request` object, and
1004   *headers* should be the error headers.
1005
1006   *host* is either an authority (e.g. ``"python.org"``) or a URL containing an
1007   authority component (e.g. ``"http://python.org/"``). In either case, the
1008   authority must not contain a userinfo component (so, ``"python.org"`` and
1009   ``"python.org:80"`` are fine, ``"joe:password@python.org"`` is not).
1010
1011
1012.. _http-basic-auth-handler:
1013
1014HTTPBasicAuthHandler Objects
1015----------------------------
1016
1017
1018.. method:: HTTPBasicAuthHandler.http_error_401(req, fp, code,  msg, hdrs)
1019
1020   Retry the request with authentication information, if available.
1021
1022
1023.. _proxy-basic-auth-handler:
1024
1025ProxyBasicAuthHandler Objects
1026-----------------------------
1027
1028
1029.. method:: ProxyBasicAuthHandler.http_error_407(req, fp, code,  msg, hdrs)
1030
1031   Retry the request with authentication information, if available.
1032
1033
1034.. _abstract-digest-auth-handler:
1035
1036AbstractDigestAuthHandler Objects
1037---------------------------------
1038
1039
1040.. method:: AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
1041
1042   *authreq* should be the name of the header where the information about the realm
1043   is included in the request, *host* should be the host to authenticate to, *req*
1044   should be the (failed) :class:`Request` object, and *headers* should be the
1045   error headers.
1046
1047
1048.. _http-digest-auth-handler:
1049
1050HTTPDigestAuthHandler Objects
1051-----------------------------
1052
1053
1054.. method:: HTTPDigestAuthHandler.http_error_401(req, fp, code,  msg, hdrs)
1055
1056   Retry the request with authentication information, if available.
1057
1058
1059.. _proxy-digest-auth-handler:
1060
1061ProxyDigestAuthHandler Objects
1062------------------------------
1063
1064
1065.. method:: ProxyDigestAuthHandler.http_error_407(req, fp, code,  msg, hdrs)
1066
1067   Retry the request with authentication information, if available.
1068
1069
1070.. _http-handler-objects:
1071
1072HTTPHandler Objects
1073-------------------
1074
1075
1076.. method:: HTTPHandler.http_open(req)
1077
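   Send an HTTP request.  The method is the one returned by
   ``req.get_method()``: ``POST`` if ``req.data`` is not ``None`` and ``GET``
   otherwise, unless the :class:`Request` was created with an explicit *method*.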
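
   The method that will be sent can be checked without any network access;
   a sketch with a placeholder URL::

      import urllib.request

      url = 'http://www.example.com/'
      print(urllib.request.Request(url).get_method())                # GET
      print(urllib.request.Request(url, data=b'a=1').get_method())   # POST
      print(urllib.request.Request(url, data=b'a=1',
                                   method='PUT').get_method())       # PUT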
1080
1081
1082.. _https-handler-objects:
1083
1084HTTPSHandler Objects
1085--------------------
1086
1087
1088.. method:: HTTPSHandler.https_open(req)
1089
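   Send an HTTPS request.  The method is the one returned by
   ``req.get_method()``: ``POST`` if ``req.data`` is not ``None`` and ``GET``
   otherwise, unless the :class:`Request` was created with an explicit *method*.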
1092
1093
1094.. _file-handler-objects:
1095
1096FileHandler Objects
1097-------------------
1098
1099
1100.. method:: FileHandler.file_open(req)
1101
1102   Open the file locally, if there is no host name, or the host name is
1103   ``'localhost'``.
1104
1105   .. versionchanged:: 3.2
1106      This method is applicable only for local hostnames.  When a remote
1107      hostname is given, a :exc:`~urllib.error.URLError` is raised.
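
   A sketch of opening a local file through a ``file:`` URL; the temporary
   file exists only so that the example is self-contained::

      import pathlib
      import tempfile
      import urllib.request

      with tempfile.TemporaryDirectory() as tmp:
          path = pathlib.Path(tmp) / 'hello.txt'
          path.write_bytes(b'hello from disk')
          # Path.as_uri() builds a file: URL without a host name.
          with urllib.request.urlopen(path.as_uri()) as f:
              content = f.read()
      print(content)   # b'hello from disk'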
1108
1109
1110.. _data-handler-objects:
1111
1112DataHandler Objects
1113-------------------
1114
1115.. method:: DataHandler.data_open(req)
1116
   Read a data URL. This kind of URL contains the content encoded in the URL
   itself. The data URL syntax is specified in :rfc:`2397`. This implementation
   ignores whitespace in base64-encoded data URLs, so the URL may be wrapped
   in whatever source file it comes from. Although some browsers tolerate
   missing padding at the end of a base64-encoded data URL, this
   implementation raises a :exc:`ValueError` in that case.
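
   For example, the following sketch decodes a small base64 data URL and shows
   the error raised when the padding is missing::

      import urllib.request

      with urllib.request.urlopen('data:text/plain;base64,SGVsbG8=') as f:
          print(f.read())                        # b'Hello'
          print(f.headers.get_content_type())    # text/plain

      try:
          # Same payload with the trailing '=' padding removed.
          urllib.request.urlopen('data:text/plain;base64,SGVsbG8')
      except ValueError as exc:
          print('rejected:', exc)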
1123
1124
1125.. _ftp-handler-objects:
1126
1127FTPHandler Objects
1128------------------
1129
1130
1131.. method:: FTPHandler.ftp_open(req)
1132
1133   Open the FTP file indicated by *req*. The login is always done with empty
1134   username and password.
1135
1136
1137.. _cacheftp-handler-objects:
1138
1139CacheFTPHandler Objects
1140-----------------------
1141
1142:class:`CacheFTPHandler` objects are :class:`FTPHandler` objects with the
1143following additional methods:
1144
1145
1146.. method:: CacheFTPHandler.setTimeout(t)
1147
1148   Set timeout of connections to *t* seconds.
1149
1150
1151.. method:: CacheFTPHandler.setMaxConns(m)
1152
1153   Set maximum number of cached connections to *m*.
1154
1155
1156.. _unknown-handler-objects:
1157
1158UnknownHandler Objects
1159----------------------
1160
1161
.. method:: UnknownHandler.unknown_open(req)
1163
1164   Raise a :exc:`~urllib.error.URLError` exception.
1165
1166
1167.. _http-error-processor-objects:
1168
1169HTTPErrorProcessor Objects
1170--------------------------
1171
1172.. method:: HTTPErrorProcessor.http_response(request, response)
1173
1174   Process HTTP error responses.
1175
   For status codes in the 2xx range, the response object is returned
   immediately.

   For all other status codes, this simply passes the job on to the
   :meth:`!http_error_\<type\>` handler methods, via :meth:`OpenerDirector.error`.
1180   Eventually, :class:`HTTPDefaultErrorHandler` will raise an
1181   :exc:`~urllib.error.HTTPError` if no other handler handles the error.
1182
1183
1184.. method:: HTTPErrorProcessor.https_response(request, response)
1185
1186   Process HTTPS error responses.
1187
   The behavior is the same as :meth:`http_response`.
1189
1190
1191.. _urllib-request-examples:
1192
1193Examples
1194--------
1195
1196In addition to the examples below, more examples are given in
1197:ref:`urllib-howto`.
1198
1199This example gets the python.org main page and displays the first 300 bytes of
1200it. ::
1201
1202   >>> import urllib.request
1203   >>> with urllib.request.urlopen('http://www.python.org/') as f:
1204   ...     print(f.read(300))
1205   ...
1206   b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1207   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
1208   xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
1209   <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
1210   <title>Python Programming '
1211
Note that the :meth:`!read` method of the object returned by :func:`urlopen`
returns a bytes object.  This is because there is no way for :func:`urlopen`
to automatically determine the encoding of the byte stream it receives from
the HTTP server.  In general, a program will decode the returned bytes object
to a string once it determines or guesses the appropriate encoding.
1217
1218The following W3C document, https://www.w3.org/International/O-charset\ , lists
1219the various ways in which an (X)HTML or an XML document could have specified its
1220encoding information.
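
When the server declares a charset in its :mailheader:`Content-Type` header,
it can be read from the response headers.  The sketch below uses a ``data:``
URL so that it runs without network access; falling back to UTF-8 when no
charset is declared is an assumption of this example, not a rule::

   import urllib.request

   url = 'data:text/plain;charset=utf-8,caf%C3%A9'
   with urllib.request.urlopen(url) as f:
       # get_content_charset() returns None if no charset parameter is present.
       charset = f.headers.get_content_charset() or 'utf-8'
       text = f.read().decode(charset)
   print(charset, text)   # utf-8 café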
1221
1222As the python.org website uses *utf-8* encoding as specified in its meta tag, we
1223will use the same for decoding the bytes object. ::
1224
1225   >>> with urllib.request.urlopen('http://www.python.org/') as f:
1226   ...     print(f.read(100).decode('utf-8'))
1227   ...
1228   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1229   "http://www.w3.org/TR/xhtml1/DTD/xhtm
1230
1231It is also possible to achieve the same result without using the
1232:term:`context manager` approach. ::
1233
1234   >>> import urllib.request
1235   >>> f = urllib.request.urlopen('http://www.python.org/')
1236   >>> print(f.read(100).decode('utf-8'))
1237   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1238   "http://www.w3.org/TR/xhtml1/DTD/xhtm
1239
1240In the following example, we are sending a data-stream to the stdin of a CGI
1241and reading the data it returns to us. Note that this example will only work
1242when the Python installation supports SSL. ::
1243
1244   >>> import urllib.request
1245   >>> req = urllib.request.Request(url='https://localhost/cgi-bin/test.cgi',
1246   ...                       data=b'This data is passed to stdin of the CGI')
1247   >>> with urllib.request.urlopen(req) as f:
1248   ...     print(f.read().decode('utf-8'))
1249   ...
1250   Got Data: "This data is passed to stdin of the CGI"
1251
1252The code for the sample CGI used in the above example is::
1253
1254   #!/usr/bin/env python
1255   import sys
1256   data = sys.stdin.read()
1257   print('Content-type: text/plain\n\nGot Data: "%s"' % data)
1258
1259Here is an example of doing a ``PUT`` request using :class:`Request`::
1260
1261    import urllib.request
1262    DATA = b'some data'
1263    req = urllib.request.Request(url='http://localhost:8080', data=DATA, method='PUT')
1264    with urllib.request.urlopen(req) as f:
1265        pass
1266    print(f.status)
1267    print(f.reason)
1268
1269Use of Basic HTTP Authentication::
1270
1271   import urllib.request
1272   # Create an OpenerDirector with support for Basic HTTP Authentication...
1273   auth_handler = urllib.request.HTTPBasicAuthHandler()
1274   auth_handler.add_password(realm='PDQ Application',
1275                             uri='https://mahler:8092/site-updates.py',
1276                             user='klem',
1277                             passwd='kadidd!ehopper')
1278   opener = urllib.request.build_opener(auth_handler)
1279   # ...and install it globally so it can be used with urlopen.
1280   urllib.request.install_opener(opener)
1281   urllib.request.urlopen('http://www.example.com/login.html')
1282
1283:func:`build_opener` provides many handlers by default, including a
1284:class:`ProxyHandler`.  By default, :class:`ProxyHandler` uses the environment
1285variables named ``<scheme>_proxy``, where ``<scheme>`` is the URL scheme
1286involved.  For example, the :envvar:`!http_proxy` environment variable is read to
1287obtain the HTTP proxy's URL.
1288
1289This example replaces the default :class:`ProxyHandler` with one that uses
1290programmatically supplied proxy URLs, and adds proxy authorization support with
1291:class:`ProxyBasicAuthHandler`. ::
1292
1293   proxy_handler = urllib.request.ProxyHandler({'http': 'http://www.example.com:3128/'})
1294   proxy_auth_handler = urllib.request.ProxyBasicAuthHandler()
1295   proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
1296
1297   opener = urllib.request.build_opener(proxy_handler, proxy_auth_handler)
1298   # This time, rather than install the OpenerDirector, we use it directly:
1299   opener.open('http://www.example.com/login.html')
1300
1301Adding HTTP headers:
1302
1303Use the *headers* argument to the :class:`Request` constructor, or::
1304
1305   import urllib.request
1306   req = urllib.request.Request('http://www.example.com/')
1307   req.add_header('Referer', 'http://www.python.org/')
1308   # Customize the default User-Agent header value:
1309   req.add_header('User-Agent', 'urllib-example/0.1 (Contact: . . .)')
1310   r = urllib.request.urlopen(req)
1311
1312:class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to
1313every :class:`Request`.  To change this::
1314
1315   import urllib.request
1316   opener = urllib.request.build_opener()
1317   opener.addheaders = [('User-agent', 'Mozilla/5.0')]
1318   opener.open('http://www.example.com/')
1319
1320Also, remember that a few standard headers (:mailheader:`Content-Length`,
1321:mailheader:`Content-Type` and :mailheader:`Host`)
1322are added when the :class:`Request` is passed to :func:`urlopen` (or
1323:meth:`OpenerDirector.open`).
1324
1325.. _urllib-examples:
1326
1327Here is an example session that uses the ``GET`` method to retrieve a URL
1328containing parameters::
1329
1330   >>> import urllib.request
1331   >>> import urllib.parse
1332   >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
1333   >>> url = "http://www.musi-cal.com/cgi-bin/query?%s" % params
1334   >>> with urllib.request.urlopen(url) as f:
1335   ...     print(f.read().decode('utf-8'))
1336   ...
1337
1338The following example uses the ``POST`` method instead. Note that params output
1339from urlencode is encoded to bytes before it is sent to urlopen as data::
1340
1341   >>> import urllib.request
1342   >>> import urllib.parse
1343   >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
1344   >>> data = data.encode('ascii')
1345   >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
1346   ...     print(f.read().decode('utf-8'))
1347   ...
1348
1349The following example uses an explicitly specified HTTP proxy, overriding
1350environment settings::
1351
1352   >>> import urllib.request
1353   >>> proxies = {'http': 'http://proxy.example.com:8080/'}
1354   >>> opener = urllib.request.FancyURLopener(proxies)
1355   >>> with opener.open("http://www.python.org") as f:
1356   ...     f.read().decode('utf-8')
1357   ...
1358
1359The following example uses no proxies at all, overriding environment settings::
1360
1361   >>> import urllib.request
1362   >>> opener = urllib.request.FancyURLopener({})
1363   >>> with opener.open("http://www.python.org/") as f:
1364   ...     f.read().decode('utf-8')
1365   ...
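
:class:`!FancyURLopener` is deprecated, so the same effect is sketched here
with :class:`ProxyHandler`; an empty dictionary disables proxies for the
resulting opener regardless of the environment::

   import urllib.request

   # An empty mapping means "use no proxies at all".
   no_proxy_opener = urllib.request.build_opener(
       urllib.request.ProxyHandler({}))
   # no_proxy_opener.open('http://www.python.org/') would connect directly.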
1366
1367
1368Legacy interface
1369----------------
1370
1371The following functions and classes are ported from the Python 2 module
1372``urllib`` (as opposed to ``urllib2``).  They might become deprecated at
1373some point in the future.
1374
1375.. function:: urlretrieve(url, filename=None, reporthook=None, data=None)
1376
1377   Copy a network object denoted by a URL to a local file. If the URL
1378   points to a local file, the object will not be copied unless filename is supplied.
1379   Return a tuple ``(filename, headers)`` where *filename* is the
1380   local file name under which the object can be found, and *headers* is whatever
1381   the :meth:`!info` method of the object returned by :func:`urlopen` returned (for
1382   a remote object). Exceptions are the same as for :func:`urlopen`.
1383
1384   The second argument, if present, specifies the file location to copy to (if
1385   absent, the location will be a tempfile with a generated name). The third
1386   argument, if present, is a callable that will be called once on
1387   establishment of the network connection and once after each block read
1388   thereafter.  The callable will be passed three arguments; a count of blocks
1389   transferred so far, a block size in bytes, and the total size of the file.  The
1390   third argument may be ``-1`` on older FTP servers which do not return a file
1391   size in response to a retrieval request.
1392
1393   The following example illustrates the most common usage scenario::
1394
1395      >>> import urllib.request
1396      >>> local_filename, headers = urllib.request.urlretrieve('http://python.org/')
1397      >>> html = open(local_filename)
1398      >>> html.close()
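
   A *reporthook* can be sketched as follows.  To stay runnable offline, the
   example copies a temporary local file through a ``file:`` URL; the file
   names and the ``show_progress`` function are made up for the illustration::

      import pathlib
      import tempfile
      import urllib.request

      calls = []

      def show_progress(block_count, block_size, total_size):
          # Record each call; total_size is -1 when the size is unknown.
          calls.append((block_count, block_size, total_size))

      with tempfile.TemporaryDirectory() as tmp:
          source = pathlib.Path(tmp) / 'source.bin'
          target = pathlib.Path(tmp) / 'copy.bin'
          source.write_bytes(b'x' * 1000)
          urllib.request.urlretrieve(source.as_uri(), str(target),
                                     reporthook=show_progress)
          copied = target.read_bytes()

      print(calls[0][0], calls[-1][2])   # 0 1000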
1399
1400   If the *url* uses the :file:`http:` scheme identifier, the optional *data*
1401   argument may be given to specify a ``POST`` request (normally the request
1402   type is ``GET``).  The *data* argument must be a bytes object in standard
1403   :mimetype:`application/x-www-form-urlencoded` format; see the
1404   :func:`urllib.parse.urlencode` function.
1405
   :func:`urlretrieve` will raise :exc:`~urllib.error.ContentTooShortError` when it detects that
   the amount of data available was less than the expected amount (which is the
   size reported by a *Content-Length* header). This can occur, for example, when
   the download is interrupted.

   The *Content-Length* is treated as a lower bound: if there's more data to read,
   :func:`urlretrieve` reads more data, but if less data is available, it raises the
   exception.

   You can still retrieve the downloaded data in this case; it is stored in the
   :attr:`!content` attribute of the exception instance.

   If no *Content-Length* header was supplied, :func:`urlretrieve` cannot check the size
   of the data it has downloaded, and just returns it.  In this case you just have
   to assume that the download was successful.
1421
1422.. function:: urlcleanup()
1423
1424   Cleans up temporary files that may have been left behind by previous
1425   calls to :func:`urlretrieve`.
1426
1427.. class:: URLopener(proxies=None, **x509)
1428
1429   .. deprecated:: 3.3
1430
1431   Base class for opening and reading URLs.  Unless you need to support opening
1432   objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`,
1433   you probably want to use :class:`FancyURLopener`.
1434
1435   By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header
1436   of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number.
1437   Applications can define their own :mailheader:`User-Agent` header by subclassing
1438   :class:`URLopener` or :class:`FancyURLopener` and setting the class attribute
1439   :attr:`version` to an appropriate string value in the subclass definition.
1440
1441   The optional *proxies* parameter should be a dictionary mapping scheme names to
1442   proxy URLs, where an empty dictionary turns proxies off completely.  Its default
1443   value is ``None``, in which case environmental proxy settings will be used if
1444   present, as discussed in the definition of :func:`urlopen`, above.
1445
1446   Additional keyword parameters, collected in *x509*, may be used for
1447   authentication of the client when using the :file:`https:` scheme.  The keywords
1448   *key_file* and *cert_file* are supported to provide an  SSL key and certificate;
1449   both are needed to support client authentication.
1450
1451   :class:`URLopener` objects will raise an :exc:`OSError` exception if the server
1452   returns an error code.
1453
1454   .. method:: open(fullurl, data=None)
1455
1456      Open *fullurl* using the appropriate protocol.  This method sets up cache and
1457      proxy information, then calls the appropriate open method with its input
1458      arguments.  If the scheme is not recognized, :meth:`open_unknown` is called.
1459      The *data* argument has the same meaning as the *data* argument of
1460      :func:`urlopen`.
1461
1462      This method always quotes *fullurl* using :func:`~urllib.parse.quote`.
1463
1464   .. method:: open_unknown(fullurl, data=None)
1465
1466      Overridable interface to open unknown URL types.
1467
1468
1469   .. method:: retrieve(url, filename=None, reporthook=None, data=None)
1470
1471      Retrieves the contents of *url* and places it in *filename*.  The return value
1472      is a tuple consisting of a local filename and either an
1473      :class:`email.message.Message` object containing the response headers (for remote
1474      URLs) or ``None`` (for local URLs).  The caller must then open and read the
1475      contents of *filename*.  If *filename* is not given and the URL refers to a
1476      local file, the input filename is returned.  If the URL is non-local and
1477      *filename* is not given, the filename is the output of :func:`tempfile.mktemp`
1478      with a suffix that matches the suffix of the last path component of the input
1479      URL.  If *reporthook* is given, it must be a function accepting three numeric
1480      parameters: A chunk number, the maximum size chunks are read in and the total size of the download
1481      (-1 if unknown).  It will be called once at the start and after each chunk of data is read from the
1482      network.  *reporthook* is ignored for local URLs.
1483
1484      If the *url* uses the :file:`http:` scheme identifier, the optional *data*
1485      argument may be given to specify a ``POST`` request (normally the request type
      is ``GET``).  The *data* argument must be in standard
1487      :mimetype:`application/x-www-form-urlencoded` format; see the
1488      :func:`urllib.parse.urlencode` function.
1489
1490
1491   .. attribute:: version
1492
1493      Variable that specifies the user agent of the opener object.  To get
1494      :mod:`urllib` to tell servers that it is a particular user agent, set this in a
1495      subclass as a class variable or in the constructor before calling the base
1496      constructor.
1497
1498
1499.. class:: FancyURLopener(...)
1500
1501   .. deprecated:: 3.3
1502
1503   :class:`FancyURLopener` subclasses :class:`URLopener` providing default handling
1504   for the following HTTP response codes: 301, 302, 303, 307 and 401.  For the 30x
1505   response codes listed above, the :mailheader:`Location` header is used to fetch
1506   the actual URL.  For 401 response codes (authentication required), basic HTTP
1507   authentication is performed.  For the 30x response codes, recursion is bounded
1508   by the value of the *maxtries* attribute, which defaults to 10.
1509
1510   For all other response codes, the method :meth:`~BaseHandler.http_error_default` is called
1511   which you can override in subclasses to handle the error appropriately.
1512
1513   .. note::
1514
1515      According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests
1516      must not be automatically redirected without confirmation by the user.  In
1517      reality, browsers do allow automatic redirection of these responses, changing
      the POST to a GET, and :mod:`urllib` reproduces this behavior.
1519
1520   The parameters to the constructor are the same as those for :class:`URLopener`.
1521
1522   .. note::
1523
1524      When performing basic authentication, a :class:`FancyURLopener` instance calls
1525      its :meth:`prompt_user_passwd` method.  The default implementation asks the
1526      users for the required information on the controlling terminal.  A subclass may
1527      override this method to support more appropriate behavior if needed.
1528
1529   The :class:`FancyURLopener` class offers one additional method that should be
1530   overloaded to provide the appropriate behavior:
1531
1532   .. method:: prompt_user_passwd(host, realm)
1533
1534      Return information needed to authenticate the user at the given host in the
1535      specified security realm.  The return value should be a tuple, ``(user,
1536      password)``, which can be used for basic authentication.
1537
1538      The implementation prompts for this information on the terminal; an application
1539      should override this method to use an appropriate interaction model in the local
1540      environment.
1541
1542
1543:mod:`urllib.request` Restrictions
1544----------------------------------
1545
1546.. index::
1547   pair: HTTP; protocol
1548   pair: FTP; protocol
1549
1550* Currently, only the following protocols are supported: HTTP (versions 0.9 and
1551  1.0), FTP, local files, and data URLs.
1552
1553  .. versionchanged:: 3.4 Added support for data URLs.
1554
1555* The caching feature of :func:`urlretrieve` has been disabled until someone
1556  finds the time to hack proper processing of Expiration time headers.
1557
1558* There should be a function to query whether a particular URL is in the cache.
1559
1560* For backward compatibility, if a URL appears to point to a local file but the
  file can't be opened, the URL is re-interpreted using the FTP protocol.  This
  can sometimes cause confusing error messages.

* The :func:`urlopen` and :func:`urlretrieve` functions can cause arbitrarily
  long delays while waiting for a network connection to be set up.  This means
  that it is difficult to build an interactive web client using these functions
  without using threads.

  .. index::
     single: HTML
     pair: HTTP; protocol

* The data returned by :func:`urlopen` or :func:`urlretrieve` is the raw data
  returned by the server.  This may be binary data (such as an image), plain text
  or (for example) HTML.  The HTTP protocol provides type information in the reply
  header, which can be inspected by looking at the :mailheader:`Content-Type`
  header.  If the returned data is HTML, you can use the module
  :mod:`html.parser` to parse it.

  .. index:: single: FTP

* The code handling the FTP protocol cannot differentiate between a file and a
  directory.  This can lead to unexpected behavior when attempting to read a URL
  that points to a file that is not accessible.  If the URL ends in a ``/``, it is
  assumed to refer to a directory and will be handled accordingly.  But if an
  attempt to read a file leads to a 550 error (meaning the URL cannot be found or
  is not accessible, often for permission reasons), then the path is treated as a
  directory in order to handle the case when a directory is specified by a URL but
  the trailing ``/`` has been left off.  This can cause misleading results when
  you try to fetch a file whose read permissions make it inaccessible; the FTP
  code will try to read it, fail with a 550 error, and then perform a directory
  listing for the unreadable file. If fine-grained control is needed, consider
  using the :mod:`ftplib` module, subclassing :class:`FancyURLopener`, or changing
  *_urlopener* to meet your needs.

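The :mailheader:`Content-Type` check mentioned in the list above can be sketched
as follows.  This is a minimal illustration, not a recommended implementation:
``TitleParser`` and ``describe()`` are hypothetical names introduced here, the
10-second default timeout is arbitrary, and falling back to UTF-8 when no
charset is declared is an assumption.

.. code-block:: python

   from html.parser import HTMLParser
   from urllib.request import urlopen


   class TitleParser(HTMLParser):
       """Hypothetical helper: collect the text inside <title> elements."""

       def __init__(self):
           super().__init__()
           self._in_title = False
           self.title = ""

       def handle_starttag(self, tag, attrs):
           if tag == "title":
               self._in_title = True

       def handle_endtag(self, tag):
           if tag == "title":
               self._in_title = False

       def handle_data(self, data):
           if self._in_title:
               self.title += data


   def describe(url, timeout=10):
       """Return (content_type, title); title is None for non-HTML data."""
       with urlopen(url, timeout=timeout) as response:
           raw = response.read()  # always bytes, whatever the content
           content_type = response.headers.get_content_type()
           # Assumption: treat the body as UTF-8 if no charset is declared.
           charset = response.headers.get_content_charset() or "utf-8"
       if content_type != "text/html":
           return content_type, None
       parser = TitleParser()
       parser.feed(raw.decode(charset, errors="replace"))
       parser.close()
       return content_type, parser.title

A ``data:`` URL exercises the same code without network access; for example,
``describe("data:text/html;charset=utf-8,%3Ctitle%3EHi%3C/title%3E")`` should
return ``("text/html", "Hi")``.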


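To keep an interactive program responsive despite the connection delays noted
above, the blocking call can be moved to a worker thread and bounded with the
*timeout* parameter of :func:`urlopen`.  The sketch below uses
:class:`concurrent.futures.ThreadPoolExecutor`; ``fetch()`` and the 10-second
timeout are illustrative assumptions, and the ``data:`` URL merely stands in
for a real network address so that the example runs offline.

.. code-block:: python

   from concurrent.futures import ThreadPoolExecutor
   from urllib.request import urlopen


   def fetch(url, timeout=10):
       """Hypothetical helper: return the raw response body as bytes."""
       # The timeout bounds blocking operations such as the connection attempt.
       with urlopen(url, timeout=timeout) as response:
           return response.read()


   with ThreadPoolExecutor(max_workers=1) as pool:
       # submit() returns immediately; the request runs in a worker thread.
       future = pool.submit(fetch, "data:text/plain,hello")
       # ... the caller is free to do other work here ...
       body = future.result()  # blocks only when the result is needed

Exceptions raised inside ``fetch()`` are re-raised by ``future.result()``, so
error handling can stay in the calling thread.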
:mod:`urllib.response` --- Response classes used by urllib
==========================================================

.. module:: urllib.response
   :synopsis: Response classes used by urllib.

The :mod:`urllib.response` module defines functions and classes that provide a
minimal file-like interface, including ``read()`` and ``readline()``.
Functions defined by this module are used internally by the :mod:`urllib.request` module.
The typical response object is a :class:`urllib.response.addinfourl` instance:

.. class:: addinfourl

   .. attribute:: url

      URL of the resource retrieved, commonly used to determine if a redirect was followed.

   .. attribute:: headers

      The headers of the response, as an :class:`~email.message.EmailMessage` instance.

   .. attribute:: status

      .. versionadded:: 3.9

      Status code returned by server.

   .. method:: geturl()

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.url`.

   .. method:: info()

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.headers`.

   .. attribute:: code

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.status`.

   .. method:: getcode()

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.status`.

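The non-deprecated attributes can be read directly from the object returned by
:func:`urllib.request.urlopen`.  The following is a brief sketch that uses a
``data:`` URL so that it runs without network access.  Because no HTTP exchange
takes place for such a URL, ``status`` is expected to be ``None`` here; that is
an observation about current CPython behaviour, not a documented guarantee.

.. code-block:: python

   from urllib.request import urlopen

   url = "data:text/plain;charset=utf-8,hello"

   with urlopen(url) as response:
       final_url = response.url          # preferred over geturl()
       headers = response.headers        # preferred over info()
       status = response.status          # preferred over code / getcode()
       body = response.read()

   content_type = headers.get_content_type()
   print(final_url, content_type, status, body)

For an HTTP or HTTPS URL, ``status`` holds the numeric status code and ``url``
reflects any redirect that was followed.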