:mod:`urllib.request` --- Extensible library for opening URLs
=============================================================
3
4.. module:: urllib.request
5   :synopsis: Extensible library for opening URLs.
6
7.. moduleauthor:: Jeremy Hylton <jeremy@alum.mit.edu>
8.. sectionauthor:: Moshe Zadka <moshez@users.sourceforge.net>
9.. sectionauthor:: Senthil Kumaran <senthil@uthcode.com>
10
11**Source code:** :source:`Lib/urllib/request.py`
12
13--------------
14
15The :mod:`urllib.request` module defines functions and classes which help in
16opening URLs (mostly HTTP) in a complex world --- basic and digest
17authentication, redirections, cookies and more.
18
19.. seealso::
20
21    The `Requests package <https://requests.readthedocs.io/en/master/>`_
22    is recommended for a higher-level HTTP client interface.
23
24
25The :mod:`urllib.request` module defines the following functions:
26
27
28.. function:: urlopen(url, data=None[, timeout], *, cafile=None, capath=None, cadefault=False, context=None)
29
30   Open the URL *url*, which can be either a string or a
31   :class:`Request` object.
32
33   *data* must be an object specifying additional data to be sent to the
34   server, or ``None`` if no such data is needed.  See :class:`Request`
35   for details.
36
   The :mod:`urllib.request` module uses HTTP/1.1 and includes a
   ``Connection: close`` header in its HTTP requests.
39
   The optional *timeout* parameter specifies a timeout in seconds for
   blocking operations like the connection attempt (if not specified,
   the global default timeout setting will be used).  The timeout
   applies only to HTTP, HTTPS and FTP connections.
44
45   If *context* is specified, it must be a :class:`ssl.SSLContext` instance
46   describing the various SSL options. See :class:`~http.client.HTTPSConnection`
47   for more details.
48
49   The optional *cafile* and *capath* parameters specify a set of trusted
50   CA certificates for HTTPS requests.  *cafile* should point to a single
51   file containing a bundle of CA certificates, whereas *capath* should
52   point to a directory of hashed certificate files.  More information can
53   be found in :meth:`ssl.SSLContext.load_verify_locations`.
54
55   The *cadefault* parameter is ignored.
56
57   This function always returns an object which can work as a
58   :term:`context manager` and has the properties *url*, *headers*, and *status*.
59   See :class:`urllib.response.addinfourl` for more detail on these properties.
60
   For HTTP and HTTPS URLs, this function returns a slightly modified
   :class:`http.client.HTTPResponse` object. In addition to the three
   properties above, the *msg* attribute contains the same information
   as the :attr:`~http.client.HTTPResponse.reason` attribute (the reason
   phrase returned by the server), rather than the response headers that
   the documentation for :class:`~http.client.HTTPResponse` specifies.
68
69   For FTP, file, and data URLs and requests explicitly handled by legacy
70   :class:`URLopener` and :class:`FancyURLopener` classes, this function
71   returns a :class:`urllib.response.addinfourl` object.
72
73   Raises :exc:`~urllib.error.URLError` on protocol errors.
74
75   Note that ``None`` may be returned if no handler handles the request (though
76   the default installed global :class:`OpenerDirector` uses
77   :class:`UnknownHandler` to ensure this never happens).
78
   In addition, if proxy settings are detected (for example, when a ``*_proxy``
   environment variable like :envvar:`http_proxy` is set),
   :class:`ProxyHandler` is installed by default and ensures that requests are
   handled through the proxy.
83
84   The legacy ``urllib.urlopen`` function from Python 2.6 and earlier has been
85   discontinued; :func:`urllib.request.urlopen` corresponds to the old
86   ``urllib2.urlopen``.  Proxy handling, which was done by passing a dictionary
87   parameter to ``urllib.urlopen``, can be obtained by using
88   :class:`ProxyHandler` objects.
89
90   .. audit-event:: urllib.Request fullurl,data,headers,method urllib.request.urlopen
91
92      The default opener raises an :ref:`auditing event <auditing>`
93      ``urllib.Request`` with arguments ``fullurl``, ``data``, ``headers``,
94      ``method`` taken from the request object.
95
96   .. versionchanged:: 3.2
97      *cafile* and *capath* were added.
98
99   .. versionchanged:: 3.2
100      HTTPS virtual hosts are now supported if possible (that is, if
101      :data:`ssl.HAS_SNI` is true).
102
103   .. versionadded:: 3.2
104      *data* can be an iterable object.
105
106   .. versionchanged:: 3.3
107      *cadefault* was added.
108
109   .. versionchanged:: 3.4.3
110      *context* was added.
111
   .. versionchanged:: 3.10
      HTTPS connections now send an ALPN extension with protocol indicator
      ``http/1.1`` when no *context* is given. A custom *context* should set
      ALPN protocols with :meth:`~ssl.SSLContext.set_alpn_protocols`.
116
   .. deprecated:: 3.6
      *cafile*, *capath* and *cadefault* are deprecated in favor of *context*.
      Please use :meth:`ssl.SSLContext.load_verify_locations` instead, or let
      :func:`ssl.create_default_context` select the system's trusted CA
      certificates for you.
123
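As a minimal sketch of the call described above, the following opens a URL
with a timeout and reads the body inside a ``with`` block.  The ``data:`` URL
is an assumption chosen only so that the snippet runs without network access;
an ``http://`` or ``https://`` URL would be passed the same way.

```python
import urllib.request

# Assumption: a data: URL stands in for a real HTTP(S) URL so that the
# sketch runs offline.
URL = "data:text/plain;charset=utf-8,hello%20world"

with urllib.request.urlopen(URL, timeout=10) as response:
    body = response.read()                              # raw bytes
    content_type = response.headers.get_content_type()  # parsed from the headers
    final_url = response.url                            # URL that was retrieved

print(body.decode("utf-8"), content_type, final_url)
```

For HTTP and HTTPS URLs the same object also carries a numeric ``status``.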
124
125.. function:: install_opener(opener)
126
127   Install an :class:`OpenerDirector` instance as the default global opener.
   Installing an opener is only necessary if you want :func:`urlopen` to use that
129   opener; otherwise, simply call :meth:`OpenerDirector.open` instead of
130   :func:`~urllib.request.urlopen`.  The code does not check for a real
131   :class:`OpenerDirector`, and any class with the appropriate interface will
132   work.
133
134
135.. function:: build_opener([handler, ...])
136
137   Return an :class:`OpenerDirector` instance, which chains the handlers in the
138   order given. *handler*\s can be either instances of :class:`BaseHandler`, or
139   subclasses of :class:`BaseHandler` (in which case it must be possible to call
140   the constructor without any parameters).  Instances of the following classes
141   will be in front of the *handler*\s, unless the *handler*\s contain them,
142   instances of them or subclasses of them: :class:`ProxyHandler` (if proxy
143   settings are detected), :class:`UnknownHandler`, :class:`HTTPHandler`,
144   :class:`HTTPDefaultErrorHandler`, :class:`HTTPRedirectHandler`,
145   :class:`FTPHandler`, :class:`FileHandler`, :class:`HTTPErrorProcessor`.
146
147   If the Python installation has SSL support (i.e., if the :mod:`ssl` module
148   can be imported), :class:`HTTPSHandler` will also be added.
149
150   A :class:`BaseHandler` subclass may also change its :attr:`handler_order`
151   attribute to modify its position in the handlers list.
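
The following sketch shows one way to combine :func:`build_opener` and
:func:`install_opener`.  The ``User-Agent`` value is made up, and reading the
opener's ``handlers`` list is an implementation detail used here only to show
which classes ended up in the chain.

```python
import urllib.request

# Passing a handler instance adds it alongside the default handlers.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor())

# Headers listed in addheaders accompany requests made through this opener.
opener.addheaders = [("User-Agent", "example-client/0.1")]

# Inspect which handler classes ended up in the chain.
handler_names = [type(handler).__name__ for handler in opener.handlers]
print(handler_names)

# To make urlopen() use this opener as well:
# urllib.request.install_opener(opener)
```

Calling ``opener.open(url)`` directly avoids changing global state, which is
usually preferable in library code.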
152
153
154.. function:: pathname2url(path)
155
156   Convert the pathname *path* from the local syntax for a path to the form used in
157   the path component of a URL.  This does not produce a complete URL.  The return
158   value will already be quoted using the :func:`~urllib.parse.quote` function.
159
160
161.. function:: url2pathname(path)
162
163   Convert the path component *path* from a percent-encoded URL to the local syntax for a
164   path.  This does not accept a complete URL.  This function uses
165   :func:`~urllib.parse.unquote` to decode *path*.
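
A small round trip illustrates the two conversions.  The relative path is a
hypothetical example; the exact result depends on the platform's path syntax.

```python
import os.path
from urllib.request import pathname2url, url2pathname

local_path = os.path.join("docs", "my file.txt")  # hypothetical relative path
url_path = pathname2url(local_path)               # quoted URL path component
round_trip = url2pathname(url_path)               # back to the local syntax

print(url_path, round_trip)
```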
166
167.. function:: getproxies()
168
   This helper function returns a dictionary mapping each scheme to a proxy
   server URL. On all operating systems it first scans the environment,
   case-insensitively, for variables named ``<scheme>_proxy``. If it finds
   none, it looks for proxy information in System Configuration on macOS
   and in the Windows Systems Registry on Windows.
   If both lowercase and uppercase environment variables exist (and disagree),
   lowercase is preferred.
176
177   .. note::
178
179      If the environment variable ``REQUEST_METHOD`` is set, which usually
180      indicates your script is running in a CGI environment, the environment
181      variable ``HTTP_PROXY`` (uppercase ``_PROXY``) will be ignored. This is
182      because that variable can be injected by a client using the "Proxy:" HTTP
183      header. If you need to use an HTTP proxy in a CGI environment, either use
184      ``ProxyHandler`` explicitly, or make sure the variable name is in
185      lowercase (or at least the ``_proxy`` suffix).
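
The environment lookup can be observed with a temporary variable.  This is
only a sketch: the proxy address is invented, and
:func:`unittest.mock.patch.dict` is used so that the real environment is
restored afterwards.

```python
import os
from unittest import mock
import urllib.request

# Hypothetical proxy address, set only for the duration of the with block.
fake_env = {"http_proxy": "http://proxy.example:3128"}

with mock.patch.dict(os.environ, fake_env):
    proxies = urllib.request.getproxies()

print(proxies.get("http"))
```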
186
187
188The following classes are provided:
189
190.. class:: Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None)
191
192   This class is an abstraction of a URL request.
193
194   *url* should be a string containing a valid URL.
195
196   *data* must be an object specifying additional data to send to the
197   server, or ``None`` if no such data is needed.  Currently HTTP
198   requests are the only ones that use *data*.  The supported object
199   types include bytes, file-like objects, and iterables of bytes-like objects.
200   If no ``Content-Length`` nor ``Transfer-Encoding`` header field
201   has been provided, :class:`HTTPHandler` will set these headers according
202   to the type of *data*.  ``Content-Length`` will be used to send
203   bytes objects, while ``Transfer-Encoding: chunked`` as specified in
204   :rfc:`7230`, Section 3.3.1 will be used to send files and other iterables.
205
206   For an HTTP POST request method, *data* should be a buffer in the
207   standard :mimetype:`application/x-www-form-urlencoded` format.  The
208   :func:`urllib.parse.urlencode` function takes a mapping or sequence
209   of 2-tuples and returns an ASCII string in this format. It should
210   be encoded to bytes before being used as the *data* parameter.
211
212   *headers* should be a dictionary, and will be treated as if
213   :meth:`add_header` was called with each key and value as arguments.
214   This is often used to "spoof" the ``User-Agent`` header value, which is
215   used by a browser to identify itself -- some HTTP servers only
216   allow requests coming from common browsers as opposed to scripts.
217   For example, Mozilla Firefox may identify itself as ``"Mozilla/5.0
218   (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"``, while
219   :mod:`urllib`'s default user agent string is
220   ``"Python-urllib/2.6"`` (on Python 2.6).
221
   An appropriate ``Content-Type`` header should be included if the *data*
   argument is present.  If this header has not been provided and *data*
   is not ``None``, ``Content-Type: application/x-www-form-urlencoded`` will
   be added as a default.
226
227   The next two arguments are only of interest for correct handling
228   of third-party HTTP cookies:
229
230   *origin_req_host* should be the request-host of the origin
231   transaction, as defined by :rfc:`2965`.  It defaults to
232   ``http.cookiejar.request_host(self)``.  This is the host name or IP
233   address of the original request that was initiated by the user.
234   For example, if the request is for an image in an HTML document,
235   this should be the request-host of the request for the page
236   containing the image.
237
238   *unverifiable* should indicate whether the request is unverifiable,
239   as defined by :rfc:`2965`.  It defaults to ``False``.  An unverifiable
240   request is one whose URL the user did not have the option to
241   approve.  For example, if the request is for an image in an HTML
242   document, and the user had no option to approve the automatic
243   fetching of the image, this should be true.
244
245   *method* should be a string that indicates the HTTP request method that
246   will be used (e.g. ``'HEAD'``).  If provided, its value is stored in the
247   :attr:`~Request.method` attribute and is used by :meth:`get_method()`.
248   The default is ``'GET'`` if *data* is ``None`` or ``'POST'`` otherwise.
249   Subclasses may indicate a different default method by setting the
250   :attr:`~Request.method` attribute in the class itself.
251
252   .. note::
253      The request will not work as expected if the data object is unable
254      to deliver its content more than once (e.g. a file or an iterable
255      that can produce the content only once) and the request is retried
256      for HTTP redirects or authentication.  The *data* is sent to the
257      HTTP server right away after the headers.  There is no support for
258      a 100-continue expectation in the library.
259
260   .. versionchanged:: 3.3
      The *method* argument was added to the :class:`Request` class.
262
263   .. versionchanged:: 3.4
264      Default :attr:`Request.method` may be indicated at the class level.
265
266   .. versionchanged:: 3.6
267      Do not raise an error if the ``Content-Length`` has not been
268      provided and *data* is neither ``None`` nor a bytes object.
269      Fall back to use chunked transfer encoding instead.
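
A sketch of building requests without sending them may help here.  The URL,
form fields and ``User-Agent`` value are placeholders, not part of the module.

```python
import urllib.parse
import urllib.request

# Hypothetical form fields, encoded to bytes as the data argument requires.
form = urllib.parse.urlencode({"q": "urllib", "page": 1}).encode("ascii")

post_req = urllib.request.Request(
    "http://www.example.com/search",
    data=form,                                     # data present: POST by default
    headers={"User-Agent": "example-client/0.1"},
)
head_req = urllib.request.Request("http://www.example.com/", method="HEAD")

print(post_req.get_method(), head_req.get_method(), post_req.data)
```

Either object can then be passed to :func:`urlopen` in place of a URL string.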
270
271.. class:: OpenerDirector()
272
273   The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained
274   together. It manages the chaining of handlers, and recovery from errors.
275
276
277.. class:: BaseHandler()
278
279   This is the base class for all registered handlers --- and handles only the
280   simple mechanics of registration.
281
282
283.. class:: HTTPDefaultErrorHandler()
284
285   A class which defines a default handler for HTTP error responses; all responses
286   are turned into :exc:`~urllib.error.HTTPError` exceptions.
287
288
289.. class:: HTTPRedirectHandler()
290
291   A class to handle redirections.
292
293
294.. class:: HTTPCookieProcessor(cookiejar=None)
295
296   A class to handle HTTP Cookies.
297
298
299.. class:: ProxyHandler(proxies=None)
300
301   Cause requests to go through a proxy. If *proxies* is given, it must be a
302   dictionary mapping protocol names to URLs of proxies. The default is to read
303   the list of proxies from the environment variables
304   ``<protocol>_proxy``.  If no proxy environment variables are set, then
305   in a Windows environment proxy settings are obtained from the registry's
306   Internet Settings section, and in a macOS environment proxy information
307   is retrieved from the System Configuration Framework.
308
   To disable an autodetected proxy, pass an empty dictionary.
310
311   The :envvar:`no_proxy` environment variable can be used to specify hosts
312   which shouldn't be reached via proxy; if set, it should be a comma-separated
313   list of hostname suffixes, optionally with ``:port`` appended, for example
314   ``cern.ch,ncsa.uiuc.edu,some.host:8080``.
315
   .. note::

      ``HTTP_PROXY`` will be ignored if a variable ``REQUEST_METHOD`` is set;
      see the documentation on :func:`~urllib.request.getproxies`.
320
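The two common configurations, an explicit proxy mapping and no proxy at all,
might look like the sketch below.  The proxy address is hypothetical.

```python
import urllib.request

# Hypothetical proxy address, used for plain HTTP requests only.
proxy_handler = urllib.request.ProxyHandler({"http": "http://proxy.example:3128"})
proxied_opener = urllib.request.build_opener(proxy_handler)

# An empty dictionary turns off proxy autodetection for this opener.
direct_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
```

Because a :class:`ProxyHandler` instance is passed explicitly,
:func:`build_opener` does not add its own autodetecting one.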
321
322.. class:: HTTPPasswordMgr()
323
324   Keep a database of  ``(realm, uri) -> (user, password)`` mappings.
325
326
327.. class:: HTTPPasswordMgrWithDefaultRealm()
328
329   Keep a database of  ``(realm, uri) -> (user, password)`` mappings. A realm of
330   ``None`` is considered a catch-all realm, which is searched if no other realm
331   fits.
332
333
334.. class:: HTTPPasswordMgrWithPriorAuth()
335
336   A variant of :class:`HTTPPasswordMgrWithDefaultRealm` that also has a
337   database of ``uri -> is_authenticated`` mappings.  Can be used by a
338   BasicAuth handler to determine when to send authentication credentials
339   immediately instead of waiting for a ``401`` response first.
340
341   .. versionadded:: 3.5
342
343
344.. class:: AbstractBasicAuthHandler(password_mgr=None)
345
346   This is a mixin class that helps with HTTP authentication, both to the remote
347   host and to a proxy. *password_mgr*, if given, should be something that is
348   compatible with :class:`HTTPPasswordMgr`; refer to section
349   :ref:`http-password-mgr` for information on the interface that must be
   supported.  If *password_mgr* also provides ``is_authenticated`` and
   ``update_authenticated`` methods (see
   :ref:`http-password-mgr-with-prior-auth`), then the handler will use the
   ``is_authenticated`` result for a given URI to determine whether or not to
   send authentication credentials with the request.  If ``is_authenticated``
   returns ``True`` for the URI, credentials are sent.  If ``is_authenticated``
   returns ``False``, credentials are not sent, and then if a ``401`` response is
   received the request is re-sent with the authentication credentials.  If
   authentication succeeds, ``update_authenticated`` is called to set
   ``is_authenticated`` to ``True`` for the URI, so that subsequent requests to
360   the URI or any of its super-URIs will automatically include the
361   authentication credentials.
362
363   .. versionadded:: 3.5
364      Added ``is_authenticated`` support.
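
The prior-authentication behaviour described above can be sketched as
follows.  The URI and credentials are invented for illustration.

```python
import urllib.request

# Hypothetical URI and credentials.
password_mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
password_mgr.add_password(
    None, "http://www.example.com/api/", "alice", "s3cret",
    is_authenticated=True,  # send credentials without waiting for a 401
)

auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(auth_handler)

# URIs below the registered one are treated as already authenticated.
preemptive = password_mgr.is_authenticated("http://www.example.com/api/items")
print(preemptive)
```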
365
366
367.. class:: HTTPBasicAuthHandler(password_mgr=None)
368
369   Handle authentication with the remote host. *password_mgr*, if given, should
370   be something that is compatible with :class:`HTTPPasswordMgr`; refer to
371   section :ref:`http-password-mgr` for information on the interface that must
   be supported. :class:`HTTPBasicAuthHandler` raises a :exc:`ValueError` when
   presented with an unsupported authentication scheme.
374
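A typical setup pairs this handler with a password manager that has a
catch-all realm.  The sketch below uses made-up credentials and only queries
the manager, so it does not contact any server.

```python
import urllib.request

# Hypothetical URI and credentials; a realm of None acts as the catch-all.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "http://www.example.com/private/", "alice", "s3cret")

opener = urllib.request.build_opener(
    urllib.request.HTTPBasicAuthHandler(password_mgr)
)

# Any realm falls back to the catch-all entry for URIs below the stored one.
credentials = password_mgr.find_user_password(
    "Some Realm", "http://www.example.com/private/page.html"
)
print(credentials)
```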
375
376.. class:: ProxyBasicAuthHandler(password_mgr=None)
377
378   Handle authentication with the proxy. *password_mgr*, if given, should be
379   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
380   :ref:`http-password-mgr` for information on the interface that must be
381   supported.
382
383
384.. class:: AbstractDigestAuthHandler(password_mgr=None)
385
386   This is a mixin class that helps with HTTP authentication, both to the remote
387   host and to a proxy. *password_mgr*, if given, should be something that is
388   compatible with :class:`HTTPPasswordMgr`; refer to section
389   :ref:`http-password-mgr` for information on the interface that must be
390   supported.
391
392
393.. class:: HTTPDigestAuthHandler(password_mgr=None)
394
395   Handle authentication with the remote host. *password_mgr*, if given, should
396   be something that is compatible with :class:`HTTPPasswordMgr`; refer to
397   section :ref:`http-password-mgr` for information on the interface that must
   be supported. When both a Digest Authentication handler and a Basic
   Authentication handler are added, Digest Authentication is always tried
   first. If Digest Authentication returns a 40x response again, the response
   is passed to the Basic Authentication handler.  This handler raises a
   :exc:`ValueError` when presented with an authentication scheme other than
   Digest or Basic.
404
405   .. versionchanged:: 3.3
406      Raise :exc:`ValueError` on unsupported Authentication Scheme.
407
408
409
410.. class:: ProxyDigestAuthHandler(password_mgr=None)
411
412   Handle authentication with the proxy. *password_mgr*, if given, should be
413   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
414   :ref:`http-password-mgr` for information on the interface that must be
415   supported.
416
417
418.. class:: HTTPHandler()
419
420   A class to handle opening of HTTP URLs.
421
422
423.. class:: HTTPSHandler(debuglevel=0, context=None, check_hostname=None)
424
425   A class to handle opening of HTTPS URLs.  *context* and *check_hostname*
426   have the same meaning as in :class:`http.client.HTTPSConnection`.
427
428   .. versionchanged:: 3.2
429      *context* and *check_hostname* were added.
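
One way to supply a custom *context* is sketched below.  The ``fetch`` helper
is a hypothetical name introduced for this example; it is defined but not
called, since calling it performs a network request.

```python
import ssl
import urllib.request

# Default context: system CA certificates, hostname checking enabled.
context = ssl.create_default_context()

https_handler = urllib.request.HTTPSHandler(context=context)
opener = urllib.request.build_opener(https_handler)


def fetch(url, timeout=10):
    """Return the body of *url* as bytes (performs a network request)."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```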
430
431
432.. class:: FileHandler()
433
434   Open local files.
435
436.. class:: DataHandler()
437
438   Open data URLs.
439
440   .. versionadded:: 3.4
441
442.. class:: FTPHandler()
443
444   Open FTP URLs.
445
446
447.. class:: CacheFTPHandler()
448
449   Open FTP URLs, keeping a cache of open FTP connections to minimize delays.
450
451
452.. class:: UnknownHandler()
453
454   A catch-all class to handle unknown URLs.
455
456
457.. class:: HTTPErrorProcessor()
458
459   Process HTTP error responses.
460
461
462.. _request-objects:
463
464Request Objects
465---------------
466
467The following methods describe :class:`Request`'s public interface,
468and so all may be overridden in subclasses.  It also defines several
469public attributes that can be used by clients to inspect the parsed
470request.
471
472.. attribute:: Request.full_url
473
474   The original URL passed to the constructor.
475
   .. versionchanged:: 3.4
      :attr:`Request.full_url` is a property with a setter, a getter and a
      deleter. Getting :attr:`~Request.full_url` returns the original request
      URL with the fragment, if it was present.
481
482.. attribute:: Request.type
483
484   The URI scheme.
485
486.. attribute:: Request.host
487
488   The URI authority, typically a host, but may also contain a port
489   separated by a colon.
490
491.. attribute:: Request.origin_req_host
492
493   The original host for the request, without port.
494
495.. attribute:: Request.selector
496
497   The URI path.  If the :class:`Request` uses a proxy, then selector
498   will be the full URL that is passed to the proxy.
499
500.. attribute:: Request.data
501
502   The entity body for the request, or ``None`` if not specified.
503
504   .. versionchanged:: 3.4
505      Changing value of :attr:`Request.data` now deletes "Content-Length"
506      header if it was previously set or calculated.
507
508.. attribute:: Request.unverifiable
509
   A boolean indicating whether the request is unverifiable as defined
   by :rfc:`2965`.
512
513.. attribute:: Request.method
514
515   The HTTP request method to use.  By default its value is :const:`None`,
516   which means that :meth:`~Request.get_method` will do its normal computation
517   of the method to be used.  Its value can be set (thus overriding the default
518   computation in :meth:`~Request.get_method`) either by providing a default
519   value by setting it at the class level in a :class:`Request` subclass, or by
520   passing a value in to the :class:`Request` constructor via the *method*
521   argument.
522
523   .. versionadded:: 3.3
524
525   .. versionchanged:: 3.4
526      A default value can now be set in subclasses; previously it could only
527      be set via the constructor argument.
528
529
530.. method:: Request.get_method()
531
532   Return a string indicating the HTTP request method.  If
533   :attr:`Request.method` is not ``None``, return its value, otherwise return
534   ``'GET'`` if :attr:`Request.data` is ``None``, or ``'POST'`` if it's not.
535   This is only meaningful for HTTP requests.
536
537   .. versionchanged:: 3.3
538      get_method now looks at the value of :attr:`Request.method`.
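
The attributes above can be inspected on a request that is never sent.  In
this sketch the URL and the ``HeadRequest`` subclass are illustrative names,
not part of the module.

```python
import urllib.request


class HeadRequest(urllib.request.Request):
    # Class-level default used by get_method() when no method argument is given.
    method = "HEAD"


req = urllib.request.Request("http://www.example.com:8080/path/page.html?x=1#frag")
head = HeadRequest("http://www.example.com/")

print(req.type, req.host, req.selector, req.origin_req_host)
print(req.full_url)
print(req.get_method(), head.get_method())
```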
539
540
541.. method:: Request.add_header(key, val)
542
543   Add another header to the request.  Headers are currently ignored by all
544   handlers except HTTP handlers, where they are added to the list of headers sent
545   to the server.  Note that there cannot be more than one header with the same
546   name, and later calls will overwrite previous calls in case the *key* collides.
547   Currently, this is no loss of HTTP functionality, since all headers which have
548   meaning when used more than once have a (header-specific) way of gaining the
549   same functionality using only one header.
550
551
552.. method:: Request.add_unredirected_header(key, header)
553
554   Add a header that will not be added to a redirected request.
555
556
557.. method:: Request.has_header(header)
558
559   Return whether the instance has the named header (checks both regular and
560   unredirected).
561
562
563.. method:: Request.remove_header(header)
564
565   Remove named header from the request instance (both from regular and
566   unredirected headers).
567
568   .. versionadded:: 3.4
569
570
571.. method:: Request.get_full_url()
572
573   Return the URL given in the constructor.
574
   .. versionchanged:: 3.4
      Returns :attr:`Request.full_url`.
578
579
580.. method:: Request.set_proxy(host, type)
581
582   Prepare the request by connecting to a proxy server. The *host* and *type* will
583   replace those of the instance, and the instance's selector will be the original
584   URL given in the constructor.
585
586
587.. method:: Request.get_header(header_name, default=None)
588
589   Return the value of the given header. If the header is not present, return
590   the default value.
591
592
593.. method:: Request.header_items()
594
595   Return a list of tuples (header_name, header_value) of the Request headers.
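
The header methods can be exercised together, as in this sketch.  The header
values are placeholders.  Note that header names are stored in capitalized
form, so the names used here are already written that way.

```python
import urllib.request

req = urllib.request.Request("http://www.example.com/")

req.add_header("Accept", "text/html")
req.add_header("Accept", "application/json")  # replaces the earlier value
req.add_unredirected_header("Authorization", "Bearer example-token")

accept = req.get_header("Accept")
has_auth = req.has_header("Authorization")    # unredirected headers count too
items = sorted(req.header_items())

req.remove_header("Accept")
accept_after = req.get_header("Accept", "missing")  # falls back to the default
```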
596
597.. versionchanged:: 3.4
598   The request methods add_data, has_data, get_data, get_type, get_host,
599   get_selector, get_origin_req_host and is_unverifiable that were deprecated
600   since 3.3 have been removed.
601
602
603.. _opener-director-objects:
604
605OpenerDirector Objects
606----------------------
607
608:class:`OpenerDirector` instances have the following methods:
609
610
611.. method:: OpenerDirector.add_handler(handler)
612
613   *handler* should be an instance of :class:`BaseHandler`.  The following methods
614   are searched, and added to the possible chains (note that HTTP errors are a
615   special case).  Note that, in the following, *protocol* should be replaced
616   with the actual protocol to handle, for example :meth:`http_response` would
617   be the HTTP protocol response handler.  Also *type* should be replaced with
618   the actual HTTP code, for example :meth:`http_error_404` would handle HTTP
619   404 errors.
620
621   * :meth:`<protocol>_open` --- signal that the handler knows how to open *protocol*
622     URLs.
623
624     See |protocol_open|_ for more information.
625
626   * :meth:`http_error_\<type\>` --- signal that the handler knows how to handle HTTP
627     errors with HTTP error code *type*.
628
629     See |http_error_nnn|_ for more information.
630
631   * :meth:`<protocol>_error` --- signal that the handler knows how to handle errors
632     from (non-\ ``http``) *protocol*.
633
634   * :meth:`<protocol>_request` --- signal that the handler knows how to pre-process
635     *protocol* requests.
636
637     See |protocol_request|_ for more information.
638
639   * :meth:`<protocol>_response` --- signal that the handler knows how to
640     post-process *protocol* responses.
641
642     See |protocol_response|_ for more information.
643
644.. |protocol_open| replace:: :meth:`BaseHandler.<protocol>_open`
645.. |http_error_nnn| replace:: :meth:`BaseHandler.http_error_\<nnn\>`
646.. |protocol_request| replace:: :meth:`BaseHandler.<protocol>_request`
647.. |protocol_response| replace:: :meth:`BaseHandler.<protocol>_response`
648
649.. method:: OpenerDirector.open(url, data=None[, timeout])
650
651   Open the given *url* (which can be a request object or a string), optionally
652   passing the given *data*. Arguments, return values and exceptions raised are
653   the same as those of :func:`urlopen` (which simply calls the :meth:`open`
654   method on the currently installed global :class:`OpenerDirector`).  The
655   optional *timeout* parameter specifies a timeout in seconds for blocking
656   operations like the connection attempt (if not specified, the global default
   timeout setting will be used). The timeout applies only to
   HTTP, HTTPS and FTP connections.
659
660
661.. method:: OpenerDirector.error(proto, *args)
662
663   Handle an error of the given protocol.  This will call the registered error
664   handlers for the given protocol with the given arguments (which are protocol
665   specific).  The HTTP protocol is a special case which uses the HTTP response
666   code to determine the specific error handler; refer to the :meth:`http_error_\<type\>`
667   methods of the handler classes.
668
669   Return values and exceptions raised are the same as those of :func:`urlopen`.
670
671OpenerDirector objects open URLs in three stages:
672
673The order in which these methods are called within each stage is determined by
674sorting the handler instances.
675
676#. Every handler with a method named like :meth:`<protocol>_request` has that
677   method called to pre-process the request.
678
679#. Handlers with a method named like :meth:`<protocol>_open` are called to handle
680   the request. This stage ends when a handler either returns a non-\ :const:`None`
   value (i.e. a response), or raises an exception (usually
682   :exc:`~urllib.error.URLError`).  Exceptions are allowed to propagate.
683
684   In fact, the above algorithm is first tried for methods named
685   :meth:`default_open`.  If all such methods return :const:`None`, the algorithm
686   is repeated for methods named like :meth:`<protocol>_open`.  If all such methods
687   return :const:`None`, the algorithm is repeated for methods named
688   :meth:`unknown_open`.
689
690   Note that the implementation of these methods may involve calls of the parent
691   :class:`OpenerDirector` instance's :meth:`~OpenerDirector.open` and
692   :meth:`~OpenerDirector.error` methods.
693
694#. Every handler with a method named like :meth:`<protocol>_response` has that
695   method called to post-process the response.
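
The three stages can be observed with a toy protocol.  In this sketch the
``echo`` scheme, ``TraceProcessor`` and ``EchoHandler`` are all invented for
illustration; the handler answers from memory, so no network access is
involved.

```python
import email
import io
import urllib.request
import urllib.response

calls = []  # records the order in which the stages run


class TraceProcessor(urllib.request.BaseHandler):
    """Pre- and post-processes requests for the made-up "echo" scheme."""

    def echo_request(self, req):
        calls.append("request")
        return req

    def echo_response(self, req, response):
        calls.append("response")
        return response


class EchoHandler(urllib.request.BaseHandler):
    """Opens echo: URLs by returning the rest of the URL as the body."""

    def echo_open(self, req):
        calls.append("open")
        headers = email.message_from_string("Content-Type: text/plain\n")
        return urllib.response.addinfourl(
            io.BytesIO(req.selector.encode("utf-8")), headers, req.full_url
        )


opener = urllib.request.build_opener(TraceProcessor(), EchoHandler())
with opener.open("echo:hello") as response:
    body = response.read()

print(calls, body)
```

The class names follow the convention that request and response processors
are named ``*Processor`` while other handlers are named ``*Handler``.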
696
697
698.. _base-handler-objects:
699
700BaseHandler Objects
701-------------------
702
703:class:`BaseHandler` objects provide a couple of methods that are directly
704useful, and others that are meant to be used by derived classes.  These are
705intended for direct use:
706
707
708.. method:: BaseHandler.add_parent(director)
709
710   Add a director as parent.
711
712
713.. method:: BaseHandler.close()
714
715   Remove any parents.
716
717The following attribute and methods should only be used by classes derived from
718:class:`BaseHandler`.
719
720.. note::
721
722   The convention has been adopted that subclasses defining
723   :meth:`<protocol>_request` or :meth:`<protocol>_response` methods are named
724   :class:`\*Processor`; all others are named :class:`\*Handler`.
725
726
727.. attribute:: BaseHandler.parent
728
729   A valid :class:`OpenerDirector`, which can be used to open using a different
730   protocol, or handle errors.
731
732
733.. method:: BaseHandler.default_open(req)
734
735   This method is *not* defined in :class:`BaseHandler`, but subclasses should
736   define it if they want to catch all URLs.
737
738   This method, if implemented, will be called by the parent
739   :class:`OpenerDirector`.  It should return a file-like object as described in
740   the return value of the :meth:`open` of :class:`OpenerDirector`, or ``None``.
741   It should raise :exc:`~urllib.error.URLError`, unless a truly exceptional
742   thing happens (for example, :exc:`MemoryError` should not be mapped to
743   :exc:`URLError`).
744
745   This method will be called before any protocol-specific open method.
746
747
748.. _protocol_open:
749.. method:: BaseHandler.<protocol>_open(req)
750   :noindex:
751
752   This method is *not* defined in :class:`BaseHandler`, but subclasses should
753   define it if they want to handle URLs with the given protocol.
754
755   This method, if defined, will be called by the parent :class:`OpenerDirector`.
756   Return values should be the same as for  :meth:`default_open`.
757
758
759.. method:: BaseHandler.unknown_open(req)
760
761   This method is *not* defined in :class:`BaseHandler`, but subclasses should
762   define it if they want to catch all URLs with no specific registered handler to
763   open it.
764
765   This method, if implemented, will be called by the :attr:`parent`
766   :class:`OpenerDirector`.  Return values should be the same as for
767   :meth:`default_open`.
768
769
770.. method:: BaseHandler.http_error_default(req, fp, code, msg, hdrs)
771
772   This method is *not* defined in :class:`BaseHandler`, but subclasses should
773   override it if they intend to provide a catch-all for otherwise unhandled HTTP
774   errors.  It will be called automatically by the :class:`OpenerDirector` getting
775   the error, and should not normally be called in other circumstances.
776
777   *req* will be a :class:`Request` object, *fp* will be a file-like object with
778   the HTTP error body, *code* will be the three-digit code of the error, *msg*
779   will be the user-visible explanation of the code and *hdrs* will be a mapping
780   object with the headers of the error.
781
782   Return values and exceptions raised should be the same as those of
783   :func:`urlopen`.
784
785
786.. _http_error_nnn:
787.. method:: BaseHandler.http_error_<nnn>(req, fp, code, msg, hdrs)
788
789   *nnn* should be a three-digit HTTP error code.  This method is also not defined
790   in :class:`BaseHandler`, but will be called, if it exists, on an instance of a
791   subclass, when an HTTP error with code *nnn* occurs.
792
793   Subclasses should override this method to handle specific HTTP errors.
794
795   Arguments, return values and exceptions raised should be the same as for
796   :meth:`http_error_default`.
797
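The sketch below illustrates one way a subclass might define
``http_error_404``. The ``EmptyOn404Handler`` class is hypothetical, and the
error chain is driven by hand through :meth:`OpenerDirector.error` so that the
example runs without a network connection; in normal use the call is made for
you when a response with that status arrives.

```python
import io
import urllib.request
import urllib.response
from email.message import Message


class EmptyOn404Handler(urllib.request.BaseHandler):
    """Hypothetical handler: turn a 404 into an empty response."""

    def http_error_404(self, req, fp, code, msg, hdrs):
        # Returning a response object stops the error chain here.
        return urllib.response.addinfourl(io.BytesIO(b""), hdrs, req.full_url, code)


director = urllib.request.OpenerDirector()
director.add_handler(EmptyOn404Handler())

# Simulate what happens after the server answers 404 for this request.
req = urllib.request.Request("http://www.example.com/missing")
result = director.error("http", req, io.BytesIO(b"not found"), 404,
                        "Not Found", Message())
status = result.status
body = result.read()
```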
798
799.. _protocol_request:
800.. method:: BaseHandler.<protocol>_request(req)
801   :noindex:
802
803   This method is *not* defined in :class:`BaseHandler`, but subclasses should
804   define it if they want to pre-process requests of the given protocol.
805
806   This method, if defined, will be called by the parent :class:`OpenerDirector`.
807   *req* will be a :class:`Request` object. The return value should be a
808   :class:`Request` object.
809
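A pre-processing hook might look like the following sketch. The
``AcceptLanguageProcessor`` name and the choice of header are assumptions made
for illustration. The method is called directly here so the example does not
need a network connection.

```python
import urllib.request


class AcceptLanguageProcessor(urllib.request.BaseHandler):
    """Hypothetical processor: add a default Accept-Language header."""

    def __init__(self, language):
        self.language = language

    def http_request(self, req):
        # Request stores header names capitalized, e.g. 'Accept-language'.
        if not req.has_header("Accept-language"):
            req.add_header("Accept-Language", self.language)
        return req

    # Reuse the same hook for HTTPS requests.
    https_request = http_request


processor = AcceptLanguageProcessor("en")
req = processor.http_request(urllib.request.Request("http://www.example.com/"))
language = req.get_header("Accept-language")
```

Passing ``processor`` to :func:`build_opener` would let the
:class:`OpenerDirector` call the hook for every HTTP and HTTPS request it opens.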
810
811.. _protocol_response:
812.. method:: BaseHandler.<protocol>_response(req, response)
813   :noindex:
814
815   This method is *not* defined in :class:`BaseHandler`, but subclasses should
816   define it if they want to post-process responses of the given protocol.
817
818   This method, if defined, will be called by the parent :class:`OpenerDirector`.
819   *req* will be a :class:`Request` object. *response* will be an object
820   implementing the same interface as the return value of :func:`urlopen`.  The
821   return value should implement the same interface as the return value of
822   :func:`urlopen`.
823
824
825.. _http-redirect-handler:
826
827HTTPRedirectHandler Objects
828---------------------------
829
830.. note::
831
832   Some HTTP redirections require action from this module's client code.  If this
833   is the case, :exc:`~urllib.error.HTTPError` is raised.  See :rfc:`2616` for
834   details of the precise meanings of the various redirection codes.
835
836   An :exc:`~urllib.error.HTTPError` exception is raised as a security
837   consideration if the HTTPRedirectHandler is presented with a redirected URL
838   which is not an HTTP, HTTPS or FTP URL.
839
840
841.. method:: HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs, newurl)
842
843   Return a :class:`Request` or ``None`` in response to a redirect. This is called
844   by the default implementations of the :meth:`http_error_30\*` methods when a
845   redirection is received from the server.  If a redirection should take place,
846   return a new :class:`Request` to allow :meth:`http_error_30\*` to perform the
847   redirect to *newurl*.  Otherwise, raise :exc:`~urllib.error.HTTPError` if
848   no other handler should try to handle this URL, or return ``None`` if you
849   can't but another handler might.
850
851   .. note::
852
853      The default implementation of this method does not strictly follow :rfc:`2616`,
854      which says that 301 and 302 responses to ``POST`` requests must not be
855      automatically redirected without confirmation by the user.  In reality, browsers
856      do allow automatic redirection of these responses, changing the POST to a
857      ``GET``, and the default implementation reproduces this behavior.
858
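As an illustration of overriding :meth:`redirect_request`, the sketch below
declines redirects that leave the original host by returning ``None`` and
defers to the default implementation otherwise. The ``SameHostRedirectHandler``
name and the same-host policy are assumptions for this example only.

```python
import urllib.request
from urllib.parse import urlsplit


class SameHostRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical policy: only follow redirects that stay on the same host."""

    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        if urlsplit(newurl).hostname != urlsplit(req.full_url).hostname:
            # Decline; another handler may still deal with the response.
            return None
        return super().redirect_request(req, fp, code, msg, hdrs, newurl)


handler = SameHostRedirectHandler()
original = urllib.request.Request("http://www.example.com/old")
same_host = handler.redirect_request(
    original, None, 302, "Found", {}, "http://www.example.com/new")
other_host = handler.redirect_request(
    original, None, 302, "Found", {}, "http://other.example.net/")
```

Passing the class to :func:`build_opener` replaces the default
:class:`HTTPRedirectHandler` in the resulting opener.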
859
860.. method:: HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs)
861
862   Redirect to the ``Location:`` or ``URI:`` URL.  This method is called by the
863   parent :class:`OpenerDirector` when getting an HTTP 'moved permanently' response.
864
865
866.. method:: HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs)
867
868   The same as :meth:`http_error_301`, but called for the 'found' response.
869
870
871.. method:: HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs)
872
873   The same as :meth:`http_error_301`, but called for the 'see other' response.
874
875
876.. method:: HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs)
877
878   The same as :meth:`http_error_301`, but called for the 'temporary redirect'
879   response.
880
881
882.. _http-cookie-processor:
883
884HTTPCookieProcessor Objects
885---------------------------
886
887:class:`HTTPCookieProcessor` instances have one attribute:
888
889.. attribute:: HTTPCookieProcessor.cookiejar
890
891   The :class:`http.cookiejar.CookieJar` in which cookies are stored.
892
893
894.. _proxy-handler:
895
896ProxyHandler Objects
897--------------------
898
899
900.. method:: ProxyHandler.<protocol>_open(request)
901   :noindex:
902
903   The :class:`ProxyHandler` will have a method :meth:`<protocol>_open` for every
904   *protocol* which has a proxy in the *proxies* dictionary given in the
905   constructor.  The method will modify requests to go through the proxy, by
906   calling ``request.set_proxy()``, and call the next handler in the chain to
907   actually execute the protocol.
908
909
910.. _http-password-mgr:
911
912HTTPPasswordMgr Objects
913-----------------------
914
915These methods are available on :class:`HTTPPasswordMgr` and
916:class:`HTTPPasswordMgrWithDefaultRealm` objects.
917
918
919.. method:: HTTPPasswordMgr.add_password(realm, uri, user, passwd)
920
921   *uri* can be either a single URI, or a sequence of URIs. *realm*, *user* and
922   *passwd* must be strings. This causes ``(user, passwd)`` to be used as
923   authentication tokens when authentication for *realm* and a super-URI of any of
924   the given URIs is given.
925
926
927.. method:: HTTPPasswordMgr.find_user_password(realm, authuri)
928
929   Get user/password for given realm and URI, if any.  This method will return
930   ``(None, None)`` if there is no matching user/password.
931
932   For :class:`HTTPPasswordMgrWithDefaultRealm` objects, the realm ``None`` will be
933   searched if the given *realm* has no matching user/password.
934
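The lookup rules can be tried without any network access. In this sketch the
host names and credentials are placeholders; the password is registered under
the realm ``None``, which :class:`HTTPPasswordMgrWithDefaultRealm` falls back to
when the requested realm has no entry.

```python
import urllib.request

mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# Realm None acts as the catch-all realm for this URI prefix.
mgr.add_password(None, "https://api.example.com/v1/", "klem", "secret")

# A URI below the registered prefix matches, whatever realm is asked for.
found = mgr.find_user_password("Any Realm", "https://api.example.com/v1/items")
# A different host does not match.
missing = mgr.find_user_password("Any Realm", "https://other.example.com/")
```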
935
936.. _http-password-mgr-with-prior-auth:
937
938HTTPPasswordMgrWithPriorAuth Objects
939------------------------------------
940
941This password manager extends :class:`HTTPPasswordMgrWithDefaultRealm` to support
942tracking URIs for which authentication credentials should always be sent.
943
944
945.. method:: HTTPPasswordMgrWithPriorAuth.add_password(realm, uri, user, \
946            passwd, is_authenticated=False)
947
948   *realm*, *uri*, *user*, *passwd* are as for
949   :meth:`HTTPPasswordMgr.add_password`.  *is_authenticated* sets the initial
950   value of the ``is_authenticated`` flag for the given URI or list of URIs.
951   If *is_authenticated* is specified as ``True``, *realm* is ignored.
952
953
954.. method:: HTTPPasswordMgrWithPriorAuth.find_user_password(realm, authuri)
955
956   Same as for :class:`HTTPPasswordMgrWithDefaultRealm` objects.
957
958
959.. method:: HTTPPasswordMgrWithPriorAuth.update_authenticated(uri, \
960            is_authenticated=False)
961
962   Update the ``is_authenticated`` flag for the given *uri* or list
963   of URIs.
964
965
966.. method:: HTTPPasswordMgrWithPriorAuth.is_authenticated(authuri)
967
968   Returns the current state of the ``is_authenticated`` flag for
969   the given URI.
970
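The following sketch shows how the ``is_authenticated`` flag might be set and
queried. The URI and credentials are placeholders, and nothing is sent over the
network; the opener is only constructed.

```python
import urllib.request

mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
# Mark the URI so that credentials are sent without waiting for a 401.
mgr.add_password(None, "https://api.example.com/", "klem", "secret",
                 is_authenticated=True)
before = mgr.is_authenticated("https://api.example.com/data")

# The flag can be changed later for the same URI.
mgr.update_authenticated("https://api.example.com/", False)
after = mgr.is_authenticated("https://api.example.com/data")

# The manager is then handed to an authentication handler as usual.
auth_handler = urllib.request.HTTPBasicAuthHandler(mgr)
opener = urllib.request.build_opener(auth_handler)
```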
971
972.. _abstract-basic-auth-handler:
973
974AbstractBasicAuthHandler Objects
975--------------------------------
976
977
978.. method:: AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
979
980   Handle an authentication request by getting a user/password pair, and re-trying
981   the request.  *authreq* should be the name of the header where the information
982   about the realm is included in the request, *host* specifies the URL and path to
983   authenticate for, *req* should be the (failed) :class:`Request` object, and
984   *headers* should be the error headers.
985
986   *host* is either an authority (e.g. ``"python.org"``) or a URL containing an
987   authority component (e.g. ``"http://python.org/"``). In either case, the
988   authority must not contain a userinfo component (so, ``"python.org"`` and
989   ``"python.org:80"`` are fine, ``"joe:password@python.org"`` is not).
990
991
992.. _http-basic-auth-handler:
993
994HTTPBasicAuthHandler Objects
995----------------------------
996
997
998.. method:: HTTPBasicAuthHandler.http_error_401(req, fp, code, msg, hdrs)
999
1000   Retry the request with authentication information, if available.
1001
1002
1003.. _proxy-basic-auth-handler:
1004
1005ProxyBasicAuthHandler Objects
1006-----------------------------
1007
1008
1009.. method:: ProxyBasicAuthHandler.http_error_407(req, fp, code, msg, hdrs)
1010
1011   Retry the request with authentication information, if available.
1012
1013
1014.. _abstract-digest-auth-handler:
1015
1016AbstractDigestAuthHandler Objects
1017---------------------------------
1018
1019
1020.. method:: AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
1021
1022   *authreq* should be the name of the header where the information about the realm
1023   is included in the request, *host* should be the host to authenticate to, *req*
1024   should be the (failed) :class:`Request` object, and *headers* should be the
1025   error headers.
1026
1027
1028.. _http-digest-auth-handler:
1029
1030HTTPDigestAuthHandler Objects
1031-----------------------------
1032
1033
1034.. method:: HTTPDigestAuthHandler.http_error_401(req, fp, code, msg, hdrs)
1035
1036   Retry the request with authentication information, if available.
1037
1038
1039.. _proxy-digest-auth-handler:
1040
1041ProxyDigestAuthHandler Objects
1042------------------------------
1043
1044
1045.. method:: ProxyDigestAuthHandler.http_error_407(req, fp, code, msg, hdrs)
1046
1047   Retry the request with authentication information, if available.
1048
1049
1050.. _http-handler-objects:
1051
1052HTTPHandler Objects
1053-------------------
1054
1055
1056.. method:: HTTPHandler.http_open(req)
1057
1058   Send an HTTP request, which can be either GET or POST, depending on
1059   whether ``req.data`` is set.
1060
1061
1062.. _https-handler-objects:
1063
1064HTTPSHandler Objects
1065--------------------
1066
1067
1068.. method:: HTTPSHandler.https_open(req)
1069
1070   Send an HTTPS request, which can be either GET or POST, depending on
1071   whether ``req.data`` is set.
1072
1073
1074.. _file-handler-objects:
1075
1076FileHandler Objects
1077-------------------
1078
1079
1080.. method:: FileHandler.file_open(req)
1081
1082   Open the file locally, if there is no host name, or the host name is
1083   ``'localhost'``.
1084
1085   .. versionchanged:: 3.2
1086      This method is applicable only for local hostnames.  When a remote
1087      hostname is given, an :exc:`~urllib.error.URLError` is raised.
1088
1089
1090.. _data-handler-objects:
1091
1092DataHandler Objects
1093-------------------
1094
1095.. method:: DataHandler.data_open(req)
1096
1097   Read a data URL. This kind of URL contains the content encoded in the URL
1098   itself. The data URL syntax is specified in :rfc:`2397`. This implementation
1099   ignores white spaces in base64 encoded data URLs so the URL may be wrapped
1100   in whatever source file it comes from. But even though some browsers don't
1101   mind about a missing padding at the end of a base64 encoded data URL, this
1102   implementation will raise a :exc:`ValueError` in that case.
1103
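Because a data URL carries its own content, it can be opened without a network
connection. The sketch below builds a base64-encoded data URL from a sample
payload (the payload itself is an arbitrary choice) and reads it back through
:func:`urlopen`.

```python
import base64
import urllib.request

# Encode a small payload and embed it in a data URL (RFC 2397).
encoded = base64.b64encode(b"Hello, world!").decode("ascii")
url = "data:text/plain;base64," + encoded

with urllib.request.urlopen(url) as resp:
    body = resp.read()
    content_type = resp.headers["Content-Type"]
```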
1104
1105.. _ftp-handler-objects:
1106
1107FTPHandler Objects
1108------------------
1109
1110
1111.. method:: FTPHandler.ftp_open(req)
1112
1113   Open the FTP file indicated by *req*. The login is always done with empty
1114   username and password.
1115
1116
1117.. _cacheftp-handler-objects:
1118
1119CacheFTPHandler Objects
1120-----------------------
1121
1122:class:`CacheFTPHandler` objects are :class:`FTPHandler` objects with the
1123following additional methods:
1124
1125
1126.. method:: CacheFTPHandler.setTimeout(t)
1127
1128   Set timeout of connections to *t* seconds.
1129
1130
1131.. method:: CacheFTPHandler.setMaxConns(m)
1132
1133   Set maximum number of cached connections to *m*.
1134
1135
1136.. _unknown-handler-objects:
1137
1138UnknownHandler Objects
1139----------------------
1140
1141
1142.. method:: UnknownHandler.unknown_open()
1143
1144   Raise a :exc:`~urllib.error.URLError` exception.
1145
1146
1147.. _http-error-processor-objects:
1148
1149HTTPErrorProcessor Objects
1150--------------------------
1151
1152.. method:: HTTPErrorProcessor.http_response(request, response)
1153
1154   Process HTTP error responses.
1155
1156   For successful (2xx) status codes, the response object is returned immediately.
1157
1158   For all other status codes, this simply passes the job on to the
1159   :meth:`http_error_\<type\>` handler methods, via :meth:`OpenerDirector.error`.
1160   Eventually, :class:`HTTPDefaultErrorHandler` will raise an
1161   :exc:`~urllib.error.HTTPError` if no other handler handles the error.
1162
1163
1164.. method:: HTTPErrorProcessor.https_response(request, response)
1165
1166   Process HTTPS error responses.
1167
1168   The behavior is the same as :meth:`http_response`.
1169
1170
1171.. _urllib-request-examples:
1172
1173Examples
1174--------
1175
1176In addition to the examples below, more examples are given in
1177:ref:`urllib-howto`.
1178
1179This example gets the python.org main page and displays the first 300 bytes of
1180it. ::
1181
1182   >>> import urllib.request
1183   >>> with urllib.request.urlopen('http://www.python.org/') as f:
1184   ...     print(f.read(300))
1185   ...
1186   b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1187   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
1188   xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
1189   <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
1190   <title>Python Programming '
1191
1192Note that reading the response returned by urlopen yields a bytes object.  This
1193is because there is no way for urlopen to automatically determine the encoding
1194of the byte stream it receives from the HTTP server. In general, a program will
1195decode the returned bytes object to string once it determines or guesses
1196the appropriate encoding.
1197
1198The following W3C document, https://www.w3.org/International/O-charset\ , lists
1199the various ways in which an (X)HTML or an XML document could have specified its
1200encoding information.
1201
1202As the python.org website uses *utf-8* encoding as specified in its meta tag, we
1203will use the same for decoding the bytes object. ::
1204
1205   >>> with urllib.request.urlopen('http://www.python.org/') as f:
1206   ...     print(f.read(100).decode('utf-8'))
1207   ...
1208   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1209   "http://www.w3.org/TR/xhtml1/DTD/xhtm
1210
1211It is also possible to achieve the same result without using the
1212:term:`context manager` approach. ::
1213
1214   >>> import urllib.request
1215   >>> f = urllib.request.urlopen('http://www.python.org/')
1216   >>> print(f.read(100).decode('utf-8'))
1217   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1218   "http://www.w3.org/TR/xhtml1/DTD/xhtm
1219
1220In the following example, we are sending a data-stream to the stdin of a CGI
1221and reading the data it returns to us. Note that this example will only work
1222when the Python installation supports SSL. ::
1223
1224   >>> import urllib.request
1225   >>> req = urllib.request.Request(url='https://localhost/cgi-bin/test.cgi',
1226   ...                       data=b'This data is passed to stdin of the CGI')
1227   >>> with urllib.request.urlopen(req) as f:
1228   ...     print(f.read().decode('utf-8'))
1229   ...
1230   Got Data: "This data is passed to stdin of the CGI"
1231
1232The code for the sample CGI used in the above example is::
1233
1234   #!/usr/bin/env python
1235   import sys
1236   data = sys.stdin.read()
1237   print('Content-type: text/plain\n\nGot Data: "%s"' % data)
1238
1239Here is an example of doing a ``PUT`` request using :class:`Request`::
1240
1241    import urllib.request
1242    DATA = b'some data'
1243    req = urllib.request.Request(url='http://localhost:8080', data=DATA, method='PUT')
1244    with urllib.request.urlopen(req) as f:
1245        pass
1246    print(f.status)
1247    print(f.reason)
1248
1249Use of Basic HTTP Authentication::
1250
1251   import urllib.request
1252   # Create an OpenerDirector with support for Basic HTTP Authentication...
1253   auth_handler = urllib.request.HTTPBasicAuthHandler()
1254   auth_handler.add_password(realm='PDQ Application',
1255                             uri='https://mahler:8092/site-updates.py',
1256                             user='klem',
1257                             passwd='kadidd!ehopper')
1258   opener = urllib.request.build_opener(auth_handler)
1259   # ...and install it globally so it can be used with urlopen.
1260   urllib.request.install_opener(opener)
1261   urllib.request.urlopen('http://www.example.com/login.html')
1262
1263:func:`build_opener` provides many handlers by default, including a
1264:class:`ProxyHandler`.  By default, :class:`ProxyHandler` uses the environment
1265variables named ``<scheme>_proxy``, where ``<scheme>`` is the URL scheme
1266involved.  For example, the :envvar:`http_proxy` environment variable is read to
1267obtain the HTTP proxy's URL.
1268
1269This example replaces the default :class:`ProxyHandler` with one that uses
1270programmatically-supplied proxy URLs, and adds proxy authorization support with
1271:class:`ProxyBasicAuthHandler`. ::
1272
1273   proxy_handler = urllib.request.ProxyHandler({'http': 'http://www.example.com:3128/'})
1274   proxy_auth_handler = urllib.request.ProxyBasicAuthHandler()
1275   proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
1276
1277   opener = urllib.request.build_opener(proxy_handler, proxy_auth_handler)
1278   # This time, rather than install the OpenerDirector, we use it directly:
1279   opener.open('http://www.example.com/login.html')
1280
1281Adding HTTP headers:
1282
1283Use the *headers* argument to the :class:`Request` constructor, or::
1284
1285   import urllib.request
1286   req = urllib.request.Request('http://www.example.com/')
1287   req.add_header('Referer', 'http://www.python.org/')
1288   # Customize the default User-Agent header value:
1289   req.add_header('User-Agent', 'urllib-example/0.1 (Contact: . . .)')
1290   r = urllib.request.urlopen(req)
1291
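The same headers can instead be supplied through the *headers* argument of the
:class:`Request` constructor. This sketch only builds the request and inspects
it; the header values are placeholders.

```python
import urllib.request

req = urllib.request.Request(
    'http://www.example.com/',
    headers={
        'Referer': 'http://www.python.org/',
        'User-Agent': 'urllib-example/0.1',
    },
)
# Header names are stored capitalized, so look them up in that form.
referer = req.get_header('Referer')
user_agent = req.get_header('User-agent')
```
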
1292:class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to
1293every :class:`Request`.  To change this::
1294
1295   import urllib.request
1296   opener = urllib.request.build_opener()
1297   opener.addheaders = [('User-agent', 'Mozilla/5.0')]
1298   opener.open('http://www.example.com/')
1299
1300Also, remember that a few standard headers (:mailheader:`Content-Length`,
1301:mailheader:`Content-Type` and :mailheader:`Host`)
1302are added when the :class:`Request` is passed to :func:`urlopen` (or
1303:meth:`OpenerDirector.open`).
1304
1305.. _urllib-examples:
1306
1307Here is an example session that uses the ``GET`` method to retrieve a URL
1308containing parameters::
1309
1310   >>> import urllib.request
1311   >>> import urllib.parse
1312   >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
1313   >>> url = "http://www.musi-cal.com/cgi-bin/query?%s" % params
1314   >>> with urllib.request.urlopen(url) as f:
1315   ...     print(f.read().decode('utf-8'))
1316   ...
1317
1318The following example uses the ``POST`` method instead. Note that params output
1319from urlencode is encoded to bytes before it is sent to urlopen as data::
1320
1321   >>> import urllib.request
1322   >>> import urllib.parse
1323   >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
1324   >>> data = data.encode('ascii')
1325   >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
1326   ...     print(f.read().decode('utf-8'))
1327   ...
1328
1329The following example uses an explicitly specified HTTP proxy, overriding
1330environment settings::
1331
1332   >>> import urllib.request
1333   >>> proxies = {'http': 'http://proxy.example.com:8080/'}
1334   >>> opener = urllib.request.FancyURLopener(proxies)
1335   >>> with opener.open("http://www.python.org") as f:
1336   ...     f.read().decode('utf-8')
1337   ...
1338
1339The following example uses no proxies at all, overriding environment settings::
1340
1341   >>> import urllib.request
1342   >>> opener = urllib.request.FancyURLopener({})
1343   >>> with opener.open("http://www.python.org/") as f:
1344   ...     f.read().decode('utf-8')
1345   ...
1346
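:class:`FancyURLopener` is deprecated, so the two proxy configurations above
may be easier to maintain when expressed with :class:`ProxyHandler` and
:func:`build_opener`. The following is a sketch under that assumption; the
proxy URL is a placeholder and no request is actually sent.

```python
import urllib.request

# Explicit HTTP proxy, overriding environment settings.
explicit = urllib.request.ProxyHandler({'http': 'http://proxy.example.com:8080/'})
opener_with_proxy = urllib.request.build_opener(explicit)

# An empty mapping disables proxies, overriding environment settings.
no_proxy = urllib.request.ProxyHandler({})
opener_without_proxy = urllib.request.build_opener(no_proxy)

# Either opener would then be used as: opener.open('http://www.python.org/')
```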
1347
1348Legacy interface
1349----------------
1350
1351The following functions and classes are ported from the Python 2 module
1352``urllib`` (as opposed to ``urllib2``).  They might become deprecated at
1353some point in the future.
1354
1355.. function:: urlretrieve(url, filename=None, reporthook=None, data=None)
1356
1357   Copy a network object denoted by a URL to a local file. If the URL
1358   points to a local file, the object will not be copied unless *filename* is supplied.
1359   Return a tuple ``(filename, headers)`` where *filename* is the
1360   local file name under which the object can be found, and *headers* is whatever
1361   the :meth:`info` method of the object returned by :func:`urlopen` returned (for
1362   a remote object). Exceptions are the same as for :func:`urlopen`.
1363
1364   The second argument, if present, specifies the file location to copy to (if
1365   absent, the location will be a tempfile with a generated name). The third
1366   argument, if present, is a callable that will be called once on
1367   establishment of the network connection and once after each block read
1368   thereafter.  The callable will be passed three arguments: a count of blocks
1369   transferred so far, a block size in bytes, and the total size of the file.  The
1370   total size may be ``-1`` on older FTP servers which do not return a file
1371   size in response to a retrieval request.
1372
1373   The following example illustrates the most common usage scenario::
1374
1375      >>> import urllib.request
1376      >>> local_filename, headers = urllib.request.urlretrieve('http://python.org/')
1377      >>> html = open(local_filename)
1378      >>> html.close()
1379
1380   If the *url* uses the :file:`http:` scheme identifier, the optional *data*
1381   argument may be given to specify a ``POST`` request (normally the request
1382   type is ``GET``).  The *data* argument must be a bytes object in standard
1383   :mimetype:`application/x-www-form-urlencoded` format; see the
1384   :func:`urllib.parse.urlencode` function.
1385
1386   :func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that
1387   the amount of data available was less than the expected amount (which is the
1388   size reported by a *Content-Length* header). This can occur, for example, when
1389   the download is interrupted.
1390
1391   The *Content-Length* is treated as a lower bound: if there's more data to read,
1392   urlretrieve reads more data, but if less data is available, it raises the
1393   exception.
1394
1395   You can still retrieve the downloaded data in this case; it is stored in the
1396   :attr:`content` attribute of the exception instance.
1397
1398   If no *Content-Length* header was supplied, urlretrieve cannot check the size
1399   of the data it has downloaded, and just returns it.  In this case you just have
1400   to assume that the download was successful.
1401
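The *reporthook* callback of :func:`urlretrieve` can be observed without a
network connection by retrieving a data URL into a temporary file. In this
sketch the ``report`` function and the sample URL are illustrative choices.

```python
import os
import tempfile
import urllib.request

calls = []


def report(block_count, block_size, total_size):
    # Called once before the first read and once after each block.
    calls.append((block_count, block_size, total_size))


url = "data:text/plain,hello"
fd, target = tempfile.mkstemp()
os.close(fd)
try:
    filename, headers = urllib.request.urlretrieve(url, target, reporthook=report)
    with open(filename, "rb") as fh:
        content = fh.read()
finally:
    os.remove(target)
```
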
1402.. function:: urlcleanup()
1403
1404   Cleans up temporary files that may have been left behind by previous
1405   calls to :func:`urlretrieve`.
1406
1407.. class:: URLopener(proxies=None, **x509)
1408
1409   .. deprecated:: 3.3
1410
1411   Base class for opening and reading URLs.  Unless you need to support opening
1412   objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`,
1413   you probably want to use :class:`FancyURLopener`.
1414
1415   By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header
1416   of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number.
1417   Applications can define their own :mailheader:`User-Agent` header by subclassing
1418   :class:`URLopener` or :class:`FancyURLopener` and setting the class attribute
1419   :attr:`version` to an appropriate string value in the subclass definition.
1420
1421   The optional *proxies* parameter should be a dictionary mapping scheme names to
1422   proxy URLs, where an empty dictionary turns proxies off completely.  Its default
1423   value is ``None``, in which case environmental proxy settings will be used if
1424   present, as discussed in the definition of :func:`urlopen`, above.
1425
1426   Additional keyword parameters, collected in *x509*, may be used for
1427   authentication of the client when using the :file:`https:` scheme.  The keywords
1428   *key_file* and *cert_file* are supported to provide an SSL key and certificate;
1429   both are needed to support client authentication.
1430
1431   :class:`URLopener` objects will raise an :exc:`OSError` exception if the server
1432   returns an error code.
1433
1434   .. method:: open(fullurl, data=None)
1435
1436      Open *fullurl* using the appropriate protocol.  This method sets up cache and
1437      proxy information, then calls the appropriate open method with its input
1438      arguments.  If the scheme is not recognized, :meth:`open_unknown` is called.
1439      The *data* argument has the same meaning as the *data* argument of
1440      :func:`urlopen`.
1441
1442      This method always quotes *fullurl* using :func:`~urllib.parse.quote`.
1443
1444   .. method:: open_unknown(fullurl, data=None)
1445
1446      Overridable interface to open unknown URL types.
1447
1448
1449   .. method:: retrieve(url, filename=None, reporthook=None, data=None)
1450
1451      Retrieves the contents of *url* and places it in *filename*.  The return value
1452      is a tuple consisting of a local filename and either an
1453      :class:`email.message.Message` object containing the response headers (for remote
1454      URLs) or ``None`` (for local URLs).  The caller must then open and read the
1455      contents of *filename*.  If *filename* is not given and the URL refers to a
1456      local file, the input filename is returned.  If the URL is non-local and
1457      *filename* is not given, the filename is the output of :func:`tempfile.mktemp`
1458      with a suffix that matches the suffix of the last path component of the input
1459      URL.  If *reporthook* is given, it must be a function accepting three numeric
1460      parameters: a chunk number, the maximum size chunks are read in, and the total size of the download
1461      (-1 if unknown).  It will be called once at the start and after each chunk of data is read from the
1462      network.  *reporthook* is ignored for local URLs.
1463
1464      If the *url* uses the :file:`http:` scheme identifier, the optional *data*
1465      argument may be given to specify a ``POST`` request (normally the request type
1466      is ``GET``).  The *data* argument must be in standard
1467      :mimetype:`application/x-www-form-urlencoded` format; see the
1468      :func:`urllib.parse.urlencode` function.
1469
1470
1471   .. attribute:: version
1472
1473      Variable that specifies the user agent of the opener object.  To get
1474      :mod:`urllib` to tell servers that it is a particular user agent, set this in a
1475      subclass as a class variable or in the constructor before calling the base
1476      constructor.
1477
1478
1479.. class:: FancyURLopener(...)
1480
1481   .. deprecated:: 3.3
1482
1483   :class:`FancyURLopener` subclasses :class:`URLopener` providing default handling
1484   for the following HTTP response codes: 301, 302, 303, 307 and 401.  For the 30x
1485   response codes listed above, the :mailheader:`Location` header is used to fetch
1486   the actual URL.  For 401 response codes (authentication required), basic HTTP
1487   authentication is performed.  For the 30x response codes, recursion is bounded
1488   by the value of the *maxtries* attribute, which defaults to 10.
1489
1490   For all other response codes, the method :meth:`http_error_default` is called
1491   which you can override in subclasses to handle the error appropriately.
1492
1493   .. note::
1494
1495      According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests
1496      must not be automatically redirected without confirmation by the user.  In
1497      reality, browsers do allow automatic redirection of these responses, changing
1498      the POST to a GET, and :mod:`urllib` reproduces this behavior.
1499
1500   The parameters to the constructor are the same as those for :class:`URLopener`.
1501
1502   .. note::
1503
1504      When performing basic authentication, a :class:`FancyURLopener` instance calls
1505      its :meth:`prompt_user_passwd` method.  The default implementation asks the
1506      users for the required information on the controlling terminal.  A subclass may
1507      override this method to support more appropriate behavior if needed.
1508
1509   The :class:`FancyURLopener` class offers one additional method that should be
1510   overridden to provide the appropriate behavior:
1511
1512   .. method:: prompt_user_passwd(host, realm)
1513
1514      Return information needed to authenticate the user at the given host in the
1515      specified security realm.  The return value should be a tuple, ``(user,
1516      password)``, which can be used for basic authentication.
1517
1518      The implementation prompts for this information on the terminal; an application
1519      should override this method to use an appropriate interaction model in the local
1520      environment.
1521
1522
1523:mod:`urllib.request` Restrictions
1524----------------------------------
1525
1526  .. index::
1527     pair: HTTP; protocol
1528     pair: FTP; protocol
1529
1530* Currently, only the following protocols are supported: HTTP (versions 0.9 and
1531  1.0), FTP, local files, and data URLs.
1532
1533  .. versionchanged:: 3.4 Added support for data URLs.
1534
1535* The caching feature of :func:`urlretrieve` has been disabled until someone
1536  finds the time to hack proper processing of Expiration time headers.
1537
1538* There should be a function to query whether a particular URL is in the cache.
1539
1540* For backward compatibility, if a URL appears to point to a local file but the
1541  file can't be opened, the URL is re-interpreted using the FTP protocol.  This
1542  can sometimes cause confusing error messages.
1543
1544* The :func:`urlopen` and :func:`urlretrieve` functions can cause arbitrarily
1545  long delays while waiting for a network connection to be set up.  This means
1546  that it is difficult to build an interactive web client using these functions
1547  without using threads.
1548
1549  .. index::
1550     single: HTML
1551     pair: HTTP; protocol
1552
1553* The data returned by :func:`urlopen` or :func:`urlretrieve` is the raw data
1554  returned by the server.  This may be binary data (such as an image), plain text
1555  or (for example) HTML.  The HTTP protocol provides type information in the reply
1556  header, which can be inspected by looking at the :mailheader:`Content-Type`
1557  header.  If the returned data is HTML, you can use the module
1558  :mod:`html.parser` to parse it.

  .. index:: single: FTP

* The code handling the FTP protocol cannot differentiate between a file and a
  directory.  This can lead to unexpected behavior when attempting to read a URL
  that points to a file that is not accessible.  If the URL ends in a ``/``, it is
  assumed to refer to a directory and is handled accordingly.  However, if an
  attempt to read a file leads to a 550 error (meaning the URL cannot be found or
  is not accessible, often for permission reasons), the path is then treated as a
  directory, to handle the case where a URL specifies a directory but the
  trailing ``/`` has been left off.  This can cause misleading results when
  you try to fetch a file whose read permissions make it inaccessible: the FTP
  code tries to read it, fails with a 550 error, and then performs a directory
  listing for the unreadable file.  If fine-grained control is needed, consider
  using the :mod:`ftplib` module, subclassing :class:`FancyURLopener`, or changing
  *_urlopener* to meet your needs.
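
As a rough illustration of the points above about connection delays and raw
response data, the following sketch passes an explicit *timeout* to
:func:`urlopen` and uses the :mailheader:`Content-Type` header to decide
whether to decode the body.  The helper name ``fetch_decoded``, its
parameters, and the UTF-8 fallback are assumptions made for this example;
they are not part of the module.  A data URL is used so that the example runs
without network access.

```python
import urllib.request


def fetch_decoded(url, timeout=10.0, fallback_charset="utf-8"):
    """Return (content_type, body); body is str for text/* and bytes otherwise.

    Hypothetical helper for illustration only; not part of urllib.request.
    """
    # An explicit timeout bounds blocking operations such as the connection
    # attempt, instead of relying on the global default timeout.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        headers = response.headers
        content_type = headers.get_content_type()   # media type without parameters
        charset = headers.get_content_charset()     # None if no charset is declared
        raw = response.read()                       # always bytes

    if content_type.startswith("text/"):
        # Assumption: fall back to UTF-8 when the server declares no charset.
        return content_type, raw.decode(charset or fallback_charset)
    return content_type, raw


# Data URLs are resolved locally, so this call needs no network connection.
kind, body = fetch_decoded("data:text/html;charset=utf-8,%3Cp%3Ehello%3C%2Fp%3E")
print(kind, body)
```

Text bodies returned this way can be passed on to :mod:`html.parser`; anything
that is not ``text/*`` is left as bytes for the caller to interpret.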



:mod:`urllib.response` --- Response classes used by urllib
==========================================================

.. module:: urllib.response
   :synopsis: Response classes used by urllib.

The :mod:`urllib.response` module defines functions and classes that provide a
minimal file-like interface, including ``read()`` and ``readline()``.
Functions defined by this module are used internally by the :mod:`urllib.request` module.
The typical response object is a :class:`urllib.response.addinfourl` instance:

.. class:: addinfourl

   .. attribute:: url

      URL of the resource retrieved, commonly used to determine if a redirect was followed.

   .. attribute:: headers

      The headers of the response, as an :class:`~email.message.EmailMessage` instance.

   .. attribute:: status

      .. versionadded:: 3.9

      Status code returned by the server.

   .. method:: geturl()

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.url`.

   .. method:: info()

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.headers`.

   .. attribute:: code

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.status`.

   .. method:: getcode()

      .. deprecated:: 3.9
         Deprecated in favor of :attr:`~addinfourl.status`.
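
A minimal sketch of reading the non-deprecated attributes listed above,
assuming Python 3.9 or later.  The helper ``summarize`` is a hypothetical name
used only for this illustration.  The example opens a data URL so that it runs
offline; because no HTTP exchange takes place for a data URL, ``status`` is
``None`` in this case.

```python
import urllib.request


def summarize(response):
    """Collect the non-deprecated attributes of a response object.

    Hypothetical helper for illustration only.
    """
    return {
        "url": response.url,                                   # instead of geturl()
        "content_type": response.headers.get_content_type(),  # headers instead of info()
        "status": response.status,                             # instead of code / getcode()
    }


requested = "data:text/plain;charset=utf-8,hello"
with urllib.request.urlopen(requested) as response:
    summary = summarize(response)

# Comparing the final URL with the requested one shows whether a redirect
# was followed; a data URL is never redirected.
redirected = summary["url"] != requested
print(summary, redirected)
```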
1624