# (c) 2005 Ian Bicking and contributors; written for Paste (http://pythonpaste.org)
# Licensed under the MIT license: http://www.opensource.org/licenses/mit-license.php
# (c) 2005 Ian Bicking, Clark C. Evans and contributors
# This module is part of the Python Paste Project and is released under
# the MIT License: http://www.opensource.org/licenses/mit-license.php
# Some of this code was funded by: http://prometheusresearch.com
"""
HTTP Message Header Fields (see RFC 4229)

This contains general support for HTTP/1.1 message headers [1]_ in a
manner that supports WSGI ``environ`` [2]_ and ``response_headers``
[3]_. Specifically, this module defines a ``HTTPHeader`` class whose
instances correspond to field-name items.  The actual field-content for
the message-header is stored in the appropriate WSGI collection (either
the ``environ`` for requests, or ``response_headers`` for responses).

Each ``HTTPHeader`` instance is a callable (defining ``__call__``)
that takes one of the following:

  - an ``environ`` dictionary, returning the corresponding header
    value according to WSGI's ``HTTP_`` prefix mechanism, e.g.,
    ``USER_AGENT(environ)`` returns ``environ.get('HTTP_USER_AGENT')``

  - a ``response_headers`` list, returning a comma-delimited string
    built from each matching ``header_value`` tuple entry (see below).

  - a sequence of string ``*args`` that are comma-delimited into
    a single string value (multi-valued headers only):
    ``ACCEPT_LANGUAGE("en-us", "en")`` returns ``"en-us, en"``

  - a set of ``**kwargs`` keyword arguments that are used to create
    a header value, in a manner dependent upon the particular header in
    question (to make value construction easier and error-free):
    ``CACHE_CONTROL(max_age=CACHE_CONTROL.ONE_WEEK)``
    returns ``"public, max-age=604800"``

Each ``HTTPHeader`` instance also provides several methods to act on
a WSGI collection, for removing and setting header values.

  ``delete(collection)``

    This method removes all entries of the corresponding header from
    the given collection (``environ`` or ``response_headers``), e.g.,
    ``USER_AGENT.delete(environ)`` deletes the 'HTTP_USER_AGENT' entry
    from the ``environ``.

  ``update(collection, *args, **kwargs)``

    This method does an in-place replacement of the given header entry,
    for example: ``CONTENT_LENGTH.update(response_headers, len(body))``

    The first argument is a valid ``environ`` dictionary or
    ``response_headers`` list; remaining arguments are passed on to
    ``__call__(*args, **kwargs)`` for value construction.

  ``apply(collection, **kwargs)``

    This method is similar to update, except that it may affect other
    headers.  For example, according to recommendations in RFC 2616,
    certain Cache-Control configurations should also set the
    ``Expires`` header for HTTP/1.0 clients. By default, ``apply()``
    is simply ``update()`` but limited to keyword arguments.

This particular approach to managing headers within a WSGI collection
has several advantages:

  1. Typos in the header name are easily detected since they become a
     ``NameError`` when executed.  The approach of using header strings
     directly can be problematic; for example, the following silently
     returns ``None`` because the key is misspelled:
     ``environ.get("HTTP_ACCEPT_LANGUAGES")``

  2. For specific headers with validation, using ``__call__`` will
     result in an automatic header value check.  For example, the
     ``_CacheControl`` header rejects an unknown keyword such as
     ``maxage`` (the keyword is ``max_age``, emitted as ``max-age``).

  3. When appending/replacing headers, the field-name has the suggested
     RFC capitalization (e.g. ``Content-Type`` or ``ETag``) for
     user-agents that incorrectly use case-sensitive matches.

  4. Some headers (such as ``Content-Type``) are singular, that is,
     only one entry of this type may occur in a given set of
     ``response_headers``.
     This module knows about those cases and
     enforces this cardinality constraint.

  5. The exact details of WSGI header management are abstracted so
     the programmer need not worry about operational differences
     between an ``environ`` dictionary and a ``response_headers`` list.

  6. Sorting of ``HTTPHeaders`` is done following the RFC suggestion
     that general-headers come first, followed by request and response
     headers, and finishing with entity-headers.

  7. Special care is given to exceptional cases such as Set-Cookie
     which violates the RFC's recommendation about combining header
     content into a single entry using comma separation.

A particular difficulty with HTTP message headers is a categorization
of sorts as described in section 4.2 of RFC 2616:

    Multiple message-header fields with the same field-name MAY be
    present in a message if and only if the entire field-value for
    that header field is defined as a comma-separated list [i.e.,
    #(values)]. It MUST be possible to combine the multiple header
    fields into one "field-name: field-value" pair, without changing
    the semantics of the message, by appending each subsequent
    field-value to the first, each separated by a comma.

This creates three fundamentally different kinds of headers:

  - Those that do not have a #(values) production, and hence are
    singular and may only occur once in a set of response fields;
    this case is handled by the ``_SingleValueHeader`` subclass.

  - Those which have the #(values) production and follow the
    combining rule outlined above; our ``_MultiValueHeader`` case.

  - Those which are multi-valued, but cannot be combined (such as the
    ``Set-Cookie`` header due to its ``Expires`` parameter); or where
    combining them into a single header entry would cause common
    user-agents to fail (``WWW-Authenticate``, ``Warning``) since
    they fail to handle dates even when properly quoted.
    This case is handled by ``_MultiEntryHeader``.

Since this project does not have time to provide rigorous support
and validation for all headers, it does a basic construction of
headers listed in RFC 2616 (plus a few others) so that they can
be obtained by simply doing ``from paste.httpheaders import *``;
the name of the header instance is the "common name" in upper case
with dashes replaced by underscores (e.g. ``CACHE_CONTROL``), while
the private header classes use CamelCase names (``_CacheControl``).

.. [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
.. [2] http://www.python.org/peps/pep-0333.html#environ-variables
.. [3] http://www.python.org/peps/pep-0333.html#the-start-response-callable

"""
import mimetypes
import six
from time import time as now
try:
    # Python 3
    from email.utils import formatdate, parsedate_tz, mktime_tz
    from urllib.request import AbstractDigestAuthHandler, parse_keqv_list, parse_http_list
except ImportError:
    # Python 2
    from rfc822 import formatdate, parsedate_tz, mktime_tz
    from urllib2 import AbstractDigestAuthHandler, parse_keqv_list, parse_http_list

from .httpexceptions import HTTPBadRequest

__all__ = ['get_header', 'list_headers', 'normalize_headers',
           'HTTPHeader', 'EnvironVariable' ]

class EnvironVariable(str):
    """
    a CGI ``environ`` variable as described by WSGI

    This is a helper object so that standard WSGI ``environ`` variables
    can be extracted without the risk of misspelling the key.
    """
    def __call__(self, environ):
        return environ.get(self,'')
    def __repr__(self):
        return '<EnvironVariable %s>' % self
    def update(self, environ, value):
        environ[self] = value
REMOTE_USER    = EnvironVariable("REMOTE_USER")
REMOTE_SESSION = EnvironVariable("REMOTE_SESSION")
AUTH_TYPE      = EnvironVariable("AUTH_TYPE")
REQUEST_METHOD = EnvironVariable("REQUEST_METHOD")
SCRIPT_NAME    = EnvironVariable("SCRIPT_NAME")
PATH_INFO      = EnvironVariable("PATH_INFO")

for _name, _obj in six.iteritems(dict(globals())):
    if isinstance(_obj, EnvironVariable):
        __all__.append(_name)

_headers = {}

class HTTPHeader(object):
    """
    an HTTP header

    HTTPHeader instances represent a particular ``field-name`` of an
    HTTP message header. They do not hold a field-value, but instead
    provide operations that work on its corresponding values.  Storage
    of the actual field values is done with WSGI ``environ`` or
    ``response_headers`` as appropriate.  Typically, sub-classes that
    represent a specific HTTP header, such as _ContentDisposition, are
    singletons.  Once constructed the HTTPHeader instances themselves
    are immutable and stateless.

    For purposes of documentation a "container" refers to either a
    WSGI ``environ`` dictionary, or a ``response_headers`` list.

    Member variables (and correspondingly constructor arguments).

      ``name``

          the ``field-name`` of the header, in "common form"
          as presented in RFC 2616; e.g.
          'Content-Type'

      ``category``

          one of 'general', 'request', 'response', or 'entity'

      ``version``

          version of HTTP (informational) with which the header should
          be recognized

      ``sort_order``

          sorting order to be applied before sorting on
          field-name when ordering headers in a response

    Special Methods:

      ``__call__``

          The primary purpose of the HTTPHeader instance is to be
          a callable; it takes either a collection, a string value,
          or keyword arguments and attempts to find/construct a valid
          field-value

      ``__lt__``

          This method is used so that HTTPHeader objects can be
          sorted in a manner suggested by RFC 2616.

      ``__str__``

          The string-value for instances of this class is
          the ``field-name``.

    Primary Methods:

      ``delete()``

          remove all occurrences (if any) of the given
          header in the collection provided

      ``update()``

          replaces (if they exist) all field-value items
          in the given collection with the value provided

      ``tuples()``

          returns a set of (field-name, field-value) tuples
          suitable for extending ``response_headers``

    Custom Methods (these may not be implemented):

      ``apply()``

          similar to ``update``, but with two differences; first,
          only keyword arguments can be used, and second, specific
          sub-classes may introduce side-effects

      ``parse()``

          converts a string value of the header into a more usable
          form, such as time in seconds for a date header, etc.

    The collected versions of initialized header instances are immediately
    registered and accessible through the ``get_header`` function.  Do not
    inherit from this directly, use one of ``_SingleValueHeader``,
    ``_MultiValueHeader``, or ``_MultiEntryHeader`` as appropriate.
    """

    #
    # Things which can be customized
    #
    version = '1.1'
    category = 'general'
    reference = ''
    extensions = {}

    def compose(self, **kwargs):
        """
        build header value from keyword arguments

        This method is used to build the corresponding header value when
        keyword arguments (or no arguments) were provided.  The result
        should be a sequence of values.  For example, the ``Expires``
        header takes a keyword argument ``time`` (e.g. time.time()) from
        which it returns the corresponding date.
        """
        raise NotImplementedError()

    def parse(self, *args, **kwargs):
        """
        convert raw header value into more usable form

        This method invokes ``values()`` with the arguments provided,
        parses the header results, and then returns a header-specific
        data structure corresponding to the header.  For example, the
        ``Expires`` header returns seconds (as returned by time.time())
        """
        raise NotImplementedError()

    def apply(self, collection, **kwargs):
        """
        update the collection with header value (may have side effects)

        This method is similar to ``update`` except that usage may result
        in other headers being changed as recommended by the corresponding
        specification.  The return value is defined by the particular
        sub-class. For example, the ``_CacheControl.apply()`` sets the
        ``Expires`` header in addition to its normal behavior.
        """
        self.update(collection, **kwargs)

    #
    # Things which are standardized (mostly)
    #
    def __new__(cls, name, category=None, reference=None, version=None):
        """
        construct a new ``HTTPHeader`` instance

        We use the ``__new__`` operator to ensure that only one
        ``HTTPHeader`` instance exists for each field-name, and to
        register the header so that it can be found/enumerated.
        """
        self = get_header(name, raiseError=False)
        if self:
            # Allow the registration to happen again, but assert
            # that everything is identical.
            assert self.name == name, \
                "duplicate registration with different capitalization"
            assert self.category == category, \
                "duplicate registration with different category"
            assert cls == self.__class__, \
                "duplicate registration with different class"
            return self

        self = object.__new__(cls)
        self.name = name
        assert isinstance(self.name, str)
        self.category = category or self.category
        self.version  = version or self.version
        self.reference = reference or self.reference
        _headers[self.name.lower()] = self
        self.sort_order = {'general': 1, 'request': 2,
                           'response': 3, 'entity': 4 }[self.category]
        self._environ_name = getattr(self, '_environ_name',
                                'HTTP_'+ self.name.upper().replace("-","_"))
        self._headers_name = getattr(self, '_headers_name',
                                 self.name.lower())
        assert self.version in ('1.1', '1.0', '0.9')
        return self

    def __str__(self):
        return self.name

    def __lt__(self, other):
        """
        sort header instances as specified by RFC 2616

        Re-define sorting so that general headers are first, followed
        by request/response headers, and then entity headers.  The
        list.sort() methods use the less-than operator for this purpose.
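
A standalone sketch of the ordering described above, written as a sort
key instead of ``__lt__``.  The ``ORDER`` table and the sample header
list are assumptions made for illustration only.

```python
# Category rank first, then field-name, as in the description above.
ORDER = {'general': 1, 'request': 2, 'response': 3, 'entity': 4}

headers = [('Content-Type', 'entity'), ('Host', 'request'),
           ('Cache-Control', 'general'), ('Accept', 'request')]

ordered = sorted(headers, key=lambda item: (ORDER[item[1]], item[0]))
names = [name for name, _category in ordered]
```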
        """
        if isinstance(other, HTTPHeader):
            if self.sort_order != other.sort_order:
                return self.sort_order < other.sort_order
            return self.name < other.name
        return False

    def __repr__(self):
        ref = self.reference and (' (%s)' % self.reference) or ''
        return '<%s %s%s>' % (self.__class__.__name__, self.name, ref)

    def values(self, *args, **kwargs):
        """
        find/construct field-value(s) for the given header

        Resolution is done according to the following arguments:

        - If only keyword arguments are given, then this is equivalent
          to ``compose(**kwargs)``.

        - If the first (and only) argument is a dict, it is assumed
          to be a WSGI ``environ`` and the result of the corresponding
          ``HTTP_`` entry is returned.

        - If the first (and only) argument is a list, it is assumed
          to be a WSGI ``response_headers`` and the field-value(s)
          for this header are collected and returned.

        - In all other cases, the arguments are collected, checked that
          they are string values, possibly verified by the header's
          logic, and returned.

        At this time it is an error to provide keyword arguments if args
        is present (this might change).  It is an error to provide both
        a WSGI object and also string arguments.  If no arguments are
        provided, then ``compose()`` is called to provide a default
        value for the header; if there is no default, it is an error.
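
A standalone sketch of the resolution order described above.  It does
not call this module; ``lookup`` is a hypothetical helper that mirrors
the list, dict, and plain-argument cases only.

```python
def lookup(name, *args):
    # response_headers list: collect every matching field-value
    if args and isinstance(args[0], list):
        wanted = name.lower()
        return [value for header, value in args[0]
                if header.lower() == wanted]
    # environ dict: read the HTTP_ prefixed entry, if any
    if args and isinstance(args[0], dict):
        value = args[0].get('HTTP_' + name.upper().replace('-', '_'))
        return (value,) if value else ()
    # anything else: the arguments are the values themselves
    return args

environ = {'wsgi.version': (1, 0), 'HTTP_USER_AGENT': 'curl/8.0'}
response_headers = [('Set-Cookie', 'a=1'), ('set-cookie', 'b=2')]
```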
        """
        if not args:
            return self.compose(**kwargs)
        if list == type(args[0]):
            assert 1 == len(args)
            result = []
            name = self.name.lower()
            for value in [value for header, value in args[0]
                         if header.lower() == name]:
                result.append(value)
            return result
        if dict == type(args[0]):
            assert 1 == len(args) and 'wsgi.version' in args[0]
            value = args[0].get(self._environ_name)
            if not value:
                return ()
            return (value,)
        for item in args:
            assert not type(item) in (dict, list)
        return args

    def __call__(self, *args, **kwargs):
        """
        converts ``values()`` into a string value

        This method converts the results of ``values()`` into a string
        value for common usage.  By default, it is asserted that only
        one value exists; if you need to access all values then either
        call ``values()`` directly, or inherit ``_MultiValueHeader``
        which overrides this method to return a comma separated list of
        values as described by section 4.2 of RFC 2616.
        """
        values = self.values(*args, **kwargs)
        assert isinstance(values, (tuple, list))
        if not values:
            return ''
        assert len(values) == 1, "more than one value: %s" % repr(values)
        return str(values[0]).strip()

    def delete(self, collection):
        """
        removes all occurrences of the header from the collection provided
        """
        if type(collection) == dict:
            if self._environ_name in collection:
                del collection[self._environ_name]
            return self
        assert list == type(collection)
        i = 0
        while i < len(collection):
            if collection[i][0].lower() == self._headers_name:
                del collection[i]
                continue
            i += 1

    def update(self, collection, *args, **kwargs):
        """
        updates the collection with the provided header value

        This method replaces (in-place when possible) all occurrences of
        the given header with the provided value.
        If no value is provided, this is the same as ``delete`` (note
        that this case can only occur if the target is a collection w/o
        a corresponding header value).  This method returns ``None``.
        """
        value = self.__call__(*args, **kwargs)
        if not value:
            self.delete(collection)
            return
        if type(collection) == dict:
            collection[self._environ_name] = value
            return
        assert list == type(collection)
        i = 0
        found = False
        while i < len(collection):
            if collection[i][0].lower() == self._headers_name:
                if found:
                    del collection[i]
                    continue
                collection[i] = (self.name, value)
                found = True
            i += 1
        if not found:
            collection.append((self.name, value))

    def tuples(self, *args, **kwargs):
        value = self.__call__(*args, **kwargs)
        if not value:
            return ()
        return [(self.name, value)]

class _SingleValueHeader(HTTPHeader):
    """
    a ``HTTPHeader`` with exactly a single value

    This is the default behavior of ``HTTPHeader`` where returning
    the string-value of headers via ``__call__`` assumes that only
    a single value exists.
    """
    pass

class _MultiValueHeader(HTTPHeader):
    """
    a ``HTTPHeader`` with one or more values

    The field-value for these header instances is allowed to be more
    than one value; whereby the ``__call__`` method returns a comma
    separated list as described by section 4.2 of RFC 2616.
    """

    def __call__(self, *args, **kwargs):
        results = self.values(*args, **kwargs)
        if not results:
            return ''
        return ", ".join([str(v).strip() for v in results])

    def parse(self, *args, **kwargs):
        value = self.__call__(*args, **kwargs)
        values = value.split(',')
        return [
            v.strip() for v in values
            if v.strip()]

class _MultiEntryHeader(HTTPHeader):
    """
    a multi-value ``HTTPHeader`` where items cannot be combined with a comma

    This header is multi-valued, but the values should not be combined
    with a comma, either because the header is not in compliance with
    RFC 2616 (Set-Cookie due to Expires parameter) or because common
    user-agents do not behave well when the header values are combined.
    """

    def update(self, collection, *args, **kwargs):
        assert list == type(collection), "``environ`` may not be updated"
        self.delete(collection)
        collection.extend(self.tuples(*args, **kwargs))

    def tuples(self, *args, **kwargs):
        values = self.values(*args, **kwargs)
        if not values:
            return ()
        return [(self.name, value.strip()) for value in values]

def get_header(name, raiseError=True):
    """
    find the given ``HTTPHeader`` instance

    This function finds the corresponding ``HTTPHeader`` for the
    ``name`` provided.  So that python-style names can be used,
    underscores are converted to dashes before the lookup.
    """
    retval = _headers.get(str(name).strip().lower().replace("_","-"))
    if not retval and raiseError:
        raise AssertionError("'%s' is an unknown header" % name)
    return retval

def list_headers(general=None, request=None, response=None, entity=None):
    " list all headers for a given category "
    if not (general or request or response or entity):
        general = request = response = entity = True
    search = []
    # ``wanted`` avoids shadowing the ``bool`` builtin
    for (wanted, strval) in ((general, 'general'), (request, 'request'),
                             (response, 'response'), (entity, 'entity')):
        if wanted:
            search.append(strval)
    return [head for head in _headers.values() if head.category in search]

def normalize_headers(response_headers, strict=True):
    """
    sort headers as suggested by RFC 2616

    This alters the underlying response_headers to use the common
    name for each header; as well as sorting them with general
    headers first, followed by request/response headers, then
    entity headers, and unknown headers last.
    """
    category = {}
    for idx in range(len(response_headers)):
        (key, val) = response_headers[idx]
        head = get_header(key, strict)
        if not head:
            newhead = '-'.join([x.capitalize() for x in
                                key.replace("_","-").split("-")])
            response_headers[idx] = (newhead, val)
            category[newhead] = 4
            continue
        response_headers[idx] = (str(head), val)
        category[str(head)] = head.sort_order
    def key_func(item):
        value = item[0]
        return (category[value], value)
    response_headers.sort(key=key_func)

class _DateHeader(_SingleValueHeader):
    """
    handle date-based headers

    This extends the ``_SingleValueHeader`` object with specific
    treatment of time values:

    - It overrides ``compose`` to accept the keyword arguments
      ``time`` (seconds since the epoch, defaulting to the current
      time) and ``delta`` (an integer offset in seconds added to it).

    - A ``parse`` method is provided which parses the given value
      and returns the time in seconds since the epoch.
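
A standalone sketch of the date round trip that ``compose`` and
``parse`` perform, using only ``email.utils``.  Passing ``usegmt=True``
is a choice made for this sketch so the result does not depend on the
local timezone; the fixed timestamp is likewise illustrative.

```python
from email.utils import formatdate, parsedate_tz, mktime_tz

base = 0                                  # epoch seconds (illustrative)
delta = 3600                              # offset in seconds
text = formatdate(base + delta, usegmt=True)

# Parse the string back into seconds since the epoch.
roundtrip = mktime_tz(parsedate_tz(text))
```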
    """

    def compose(self, time=None, delta=None):
        time = time or now()
        if delta:
            assert type(delta) == int
            time += delta
        return (formatdate(time),)

    def parse(self, *args, **kwargs):
        """ return the time value (in seconds since 1970) """
        value = self.__call__(*args, **kwargs)
        if value:
            try:
                return mktime_tz(parsedate_tz(value))
            except (TypeError, OverflowError):
                raise HTTPBadRequest((
                    "Received an ill-formed timestamp for %s: %s\r\n") %
                    (self.name, value))

#
# Following are specific HTTP headers. Since these classes are mostly
# singletons, there is no point in keeping the class around once it has
# been instantiated, so we use the same name.
#

class _CacheControl(_MultiValueHeader):
    """
    Cache-Control, RFC 2616 14.9  (use ``CACHE_CONTROL``)

    This header can be constructed (using keyword arguments), by
    first specifying one of the following mechanisms:

      ``public``

          if True, this argument specifies that the
          response, as a whole, may be cached.

      ``private``

          if True, this argument specifies that the response, as a
          whole, may be cached only by a private (non-shared) cache;
          this implementation does not support the enumeration of
          private fields

      ``no_cache``

          if True, this argument specifies that the response, as a
          whole, may not be cached; this implementation does not
          support the enumeration of private fields

    In general, only one of the above three may be True, the other 2
    must then be False or None.  If all three are None, then the cache
    is assumed to be ``public``.
    Following one of these mechanism
    specifiers are various modifiers:

      ``no_store``

          if True, the content must not be stored on disk;
          the cache is limited to memory (note:
          users can still save the data, this applies
          to intermediate caches)

      ``max_age``

          the maximum duration (in seconds) for which
          the content should be cached; if ``no-cache``
          is specified, this defaults to 0 seconds

      ``s_maxage``

          the maximum duration (in seconds) for which the
          content should be allowed in a shared cache.

      ``no_transform``

          specifies that an intermediate cache should
          not convert the content from one type to
          another (e.g. transform a BMP to a PNG).

      ``extensions``

          gives additional cache-control extensions,
          such as community="UCI" (14.9.6)

    The usage of ``apply()`` on this header has side-effects. As
    recommended by RFC 2616, if ``max_age`` is provided, then the
    ``Expires`` header is also calculated for HTTP/1.0 clients and
    proxies (this is done at the time ``apply()`` is called).  For
    ``no-cache`` and for ``private`` cases, we either do not want the
    response cached or do not want any response accidentally returned to
    other users; so to prevent this case, we set the ``Expires`` header
    to the time of the request, signifying to HTTP/1.0 transports that
    the content isn't to be cached.  If you are using SSL, your
    communication is already "private", so to work with HTTP/1.0
    browsers over SSL, consider specifying your cache as ``public`` as
    the distinction between public and private is moot.
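
A simplified standalone sketch of how the mechanism and ``max_age``
combine into a field-value.  ``cache_control`` is a hypothetical helper
that covers only a subset of the keywords listed above; it is not this
class's ``compose``.

```python
ONE_WEEK = 60 * 60 * 24 * 7               # seconds

def cache_control(public=None, private=None, no_cache=None, max_age=None):
    # pick exactly one mechanism; ``public`` is the default
    if private:
        parts = ['private']
    elif no_cache:
        parts = ['no-cache']
    else:
        parts = ['public']
    # modifiers follow the mechanism, comma separated
    if max_age is not None:
        parts.append('max-age=%d' % max_age)
    return ', '.join(parts)

value = cache_control(max_age=ONE_WEEK)
```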
    """

    # common values for max-age; "good enough" approximates
    ONE_HOUR  = 60*60
    ONE_DAY   = ONE_HOUR * 24
    ONE_WEEK  = ONE_DAY * 7
    ONE_MONTH = ONE_DAY * 30
    ONE_YEAR  = ONE_WEEK * 52

    def _compose(self, public=None, private=None, no_cache=None,
                 no_store=False, max_age=None, s_maxage=None,
                 no_transform=False, **extensions):
        assert isinstance(max_age, (type(None), int))
        assert isinstance(s_maxage, (type(None), int))
        expires = 0
        result = []
        if private is True:
            assert not public and not no_cache and not s_maxage
            result.append('private')
        elif no_cache is True:
            assert not public and not private and not max_age
            result.append('no-cache')
        else:
            assert public is None or public is True
            assert not private and not no_cache
            expires = max_age
            result.append('public')
        if no_store:
            result.append('no-store')
        if no_transform:
            result.append('no-transform')
        if max_age is not None:
            result.append('max-age=%d' % max_age)
        if s_maxage is not None:
            result.append('s-maxage=%d' % s_maxage)
        for (k, v) in six.iteritems(extensions):
            if k not in self.extensions:
                raise AssertionError("unexpected extension used: '%s'" % k)
            result.append('%s="%s"' % (k.replace("_", "-"), v))
        return (result, expires)

    def compose(self, **kwargs):
        (result, expires) = self._compose(**kwargs)
        return result

    def apply(self, collection, **kwargs):
        """ returns the offset expiration in seconds """
        (result, expires) = self._compose(**kwargs)
        if expires is not None:
            EXPIRES.update(collection, delta=expires)
        self.update(collection, *result)
        return expires

_CacheControl('Cache-Control', 'general', 'RFC 2616, 14.9')

class _ContentType(_SingleValueHeader):
    """
    Content-Type, RFC 2616 section 14.17

    Unlike other headers, use the CGI variable instead.
    """
    version = '1.0'
    _environ_name = 'CONTENT_TYPE'

    # common mimetype constants
    UNKNOWN    = 'application/octet-stream'
    TEXT_PLAIN = 'text/plain'
    TEXT_HTML  = 'text/html'
    TEXT_XML   = 'text/xml'

    def compose(self, major=None, minor=None, charset=None):
        if not major:
            if minor in ('plain', 'html', 'xml'):
                major = 'text'
            else:
                assert not minor and not charset
                return (self.UNKNOWN,)
        if not minor:
            minor = "*"
        result = "%s/%s" % (major, minor)
        if charset:
            result += "; charset=%s" % charset
        return (result,)

_ContentType('Content-Type', 'entity', 'RFC 2616, 14.17')

class _ContentLength(_SingleValueHeader):
    """
    Content-Length, RFC 2616 section 14.13

    Unlike other headers, use the CGI variable instead.
    """
    version = "1.0"
    _environ_name = 'CONTENT_LENGTH'

_ContentLength('Content-Length', 'entity', 'RFC 2616, 14.13')

class _ContentDisposition(_SingleValueHeader):
    """
    Content-Disposition, RFC 2183 (use ``CONTENT_DISPOSITION``)

    This header can be constructed (using keyword arguments),
    by first specifying one of the following mechanisms:

      ``attachment``

          if True, this specifies that the content should not be
          shown in the browser and should be handled externally,
          even if the browser could render the content

      ``inline``

          exclusive with attachment; indicates that the content
          should be rendered in the browser if possible, but
          otherwise it should be handled externally

    Only one of the above 2 may be True.  If both are None, then
    the disposition is assumed to be an ``attachment``. These are
    distinct fields since support for field enumeration may be
    added in the future.

      ``filename``

          the filename parameter, if any, to be reported; if
          this is None, then the current object's filename
          attribute is used

    The usage of ``apply()`` on this header has side-effects. If
    filename is provided, and Content-Type is not set or is
    'application/octet-stream', then ``mimetypes.guess_type`` is used
    to upgrade the Content-Type setting.
    """

    def _compose(self, attachment=None, inline=None, filename=None):
        result = []
        if inline is True:
            assert not attachment
            result.append('inline')
        else:
            assert not inline
            result.append('attachment')
        if filename:
            assert '"' not in filename
            # keep only the final path component (POSIX or Windows style)
            filename = filename.split("/")[-1]
            filename = filename.split("\\")[-1]
            result.append('filename="%s"' % filename)
        return (("; ".join(result),), filename)

    def compose(self, **kwargs):
        (result, filename) = self._compose(**kwargs)
        return result

    def apply(self, collection, **kwargs):
        """ return the new Content-Type side-effect value """
        (result, filename) = self._compose(**kwargs)
        mimetype = CONTENT_TYPE(collection)
        if filename and (not mimetype or CONTENT_TYPE.UNKNOWN == mimetype):
            mimetype, _ = mimetypes.guess_type(filename)
            if mimetype and CONTENT_TYPE.UNKNOWN != mimetype:
                CONTENT_TYPE.update(collection, mimetype)
        self.update(collection, *result)
        return mimetype

_ContentDisposition('Content-Disposition', 'entity', 'RFC 2183')

class _IfModifiedSince(_DateHeader):
    """
    If-Modified-Since, RFC 2616 section 14.25
    """
    version = '1.0'

    def __call__(self, *args, **kwargs):
        """
        Split the value on ';' in case the header includes extra attributes. E.g.
        IE 6 is known to send:
        If-Modified-Since: Sun, 25 Jun 2006 20:36:35 GMT; length=1506
        """
        return _DateHeader.__call__(self, *args, **kwargs).split(';', 1)[0]

    def parse(self, *args, **kwargs):
        value = _DateHeader.parse(self, *args, **kwargs)
        if value and value > now():
            raise HTTPBadRequest((
              "Please check your system clock.\r\n"
              "According to this server, the time provided in the\r\n"
              "%s header is in the future.\r\n") % self.name)
        return value
_IfModifiedSince('If-Modified-Since', 'request', 'RFC 2616, 14.25')

class _Range(_MultiValueHeader):
    """
    Range, RFC 2616 14.35 (use ``RANGE``)

    According to section 14.16, the response to this message should be a
    206 Partial Content and, if multiple non-overlapping byte ranges
    are requested (it is an error to request multiple overlapping
    ranges), the result should be sent as multipart/byteranges mimetype.

    The server should respond with '416 Requested Range Not Satisfiable'
    if the requested ranges are out-of-bounds.  The specification also
    indicates that a syntax error in the Range request should result in
    the header being ignored rather than a '400 Bad Request'.
    """

    def parse(self, *args, **kwargs):
        """
        Returns a tuple (units, list), where list is a sequence of
        (begin, end) tuples; and end is None if it was not provided.
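
A simplified standalone sketch of the (units, list) result described
above.  ``parse_range`` is a hypothetical helper; it omits the overlap
check and, like the description, returns None for a malformed value.

```python
def parse_range(value):
    try:
        units, spec = value.split('=', 1)
        ranges = []
        for item in spec.split(','):
            begin, end = item.split('-')
            begin = int(begin) if begin.strip() else 0
            end = int(end) if end.strip() else None   # open-ended range
            ranges.append((begin, end))
    except ValueError:
        # malformed header: behave as if it was not present
        return None
    return (units.strip().lower(), ranges)

parsed = parse_range('bytes=0-499,500-')
```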
        """
        value = self.__call__(*args, **kwargs)
        if not value:
            return None
        ranges = []
        last_end = -1
        try:
            (units, range_specs) = value.split("=", 1)
            units = units.strip().lower()
            for item in range_specs.split(","):
                (begin, end) = item.split("-")
                if not begin.strip():
                    begin = 0
                else:
                    begin = int(begin)
                # last_end is None after an open-ended range; skip the
                # comparison then, since ``int <= None`` raises a
                # TypeError on Python 3
                if last_end is not None and begin <= last_end:
                    raise ValueError()
                if not end.strip():
                    end = None
                else:
                    end = int(end)
                last_end = end
                ranges.append((begin, end))
        except ValueError:
            # In this case where the Range header is malformed,
            # section 14.16 says to treat the request as if the
            # Range header was not present.  How do I log this?
            return None
        return (units, ranges)
_Range('Range', 'request', 'RFC 2616, 14.35')

class _AcceptLanguage(_MultiValueHeader):
    """
    Accept-Language, RFC 2616 section 14.4
    """

    def parse(self, *args, **kwargs):
        """
        Return a list of language tags sorted by their "q" values.  For example,
        "en-us,en;q=0.5" should return ``["en-us", "en"]``.  If there is no
        ``Accept-Language`` header present, default to ``[]``.
        """
        header = self.__call__(*args, **kwargs)
        if header is None:
            return []
        langs = [v for v in header.split(",") if v]
        qs = []
        for lang in langs:
            pieces = lang.split(";")
            lang, params = pieces[0].strip().lower(), pieces[1:]
            q = 1
            for param in params:
                if '=' not in param:
                    # Malformed request; probably a bot, we'll ignore
                    continue
                # split once so a stray '=' in the value cannot break
                # the two-name unpacking
                lvalue, rvalue = param.split("=", 1)
                lvalue = lvalue.strip().lower()
                rvalue = rvalue.strip()
                if lvalue == "q":
                    q = float(rvalue)
            qs.append((lang, q))
        qs.sort(key=lambda query: query[1], reverse=True)
        return [lang for (lang, q) in qs]
_AcceptLanguage('Accept-Language', 'request', 'RFC 2616, 14.4')

class _AcceptRanges(_MultiValueHeader):
    """
    Accept-Ranges, RFC 2616 section 14.5
    """
    def compose(self, none=None, bytes=None):
        if bytes:
            return ('bytes',)
        return ('none',)
_AcceptRanges('Accept-Ranges', 'response', 'RFC 2616, 14.5')

class _ContentRange(_SingleValueHeader):
    """
    Content-Range, RFC 2616 section 14.16
    """
    def compose(self, first_byte=None, last_byte=None, total_length=None):
        retval = "bytes %d-%d/%d" % (first_byte, last_byte, total_length)
        assert last_byte == -1 or first_byte <= last_byte
        assert last_byte < total_length
        return (retval,)
_ContentRange('Content-Range', 'entity', 'RFC 2616, 14.16')

class _Authorization(_SingleValueHeader):
    """
    Authorization, RFC 2617 (RFC 2616, 14.8)
    """
    def compose(self, digest=None, basic=None, username=None, password=None,
                challenge=None, path=None, method=None):
        assert username and password
        if basic or not challenge:
            assert not digest
            userpass = "%s:%s" % (username.strip(), password.strip())
            # str.encode('base64') exists only on Python 2; use the
            # base64 module instead (imported locally so this method
            # does not depend on the module-level imports)
            import base64
            if not isinstance(userpass, bytes):
                userpass = userpass.encode('utf-8')
            token = base64.b64encode(userpass).decode('ascii')
            # return a tuple, like the digest branch below
            return ("Basic %s" % token,)
        assert challenge and not basic
        path = path or "/"
        (_, realm) = challenge.split('realm="')
        (realm, _) = realm.split('"', 1)
        auth = AbstractDigestAuthHandler()
        auth.add_password(realm, path, username, password)
        (token, challenge) = challenge.split(' ', 1)
        chal = parse_keqv_list(parse_http_list(challenge))
        class FakeRequest(object):
            if six.PY3:
                @property
                def full_url(self):
                    return path

                selector = full_url

                @property
                def data(self):
                    return None
            else:
                def get_full_url(self):
                    return path

                get_selector = get_full_url

                def has_data(self):
                    return False

            def get_method(self):
                return method or "GET"

        retval = "Digest %s" % auth.get_authorization(FakeRequest(), chal)
        return (retval,)
_Authorization('Authorization', 'request', 'RFC 2617')

#
# For now, construct a minimalistic version of the field-names; at a
# later date more complicated headers may sprout content constructors.
# The items commented out have concrete variants.
#
for (name,              category, version, style,      comment) in \
(("Accept"             ,'request' ,'1.1','multi-value','RFC 2616, 14.1' )
,("Accept-Charset"     ,'request' ,'1.1','multi-value','RFC 2616, 14.2' )
,("Accept-Encoding"    ,'request' ,'1.1','multi-value','RFC 2616, 14.3' )
#,("Accept-Language"    ,'request' ,'1.1','multi-value','RFC 2616, 14.4' )
#,("Accept-Ranges"      ,'response','1.1','multi-value','RFC 2616, 14.5' )
,("Age"                ,'response','1.1','singular'   ,'RFC 2616, 14.6' )
,("Allow"              ,'entity'  ,'1.0','multi-value','RFC 2616, 14.7' )
#,("Authorization"      ,'request' ,'1.0','singular'   ,'RFC 2616, 14.8' )
#,("Cache-Control"      ,'general' ,'1.1','multi-value','RFC 2616, 14.9' )
,("Cookie"             ,'request' ,'1.0','multi-value','RFC 2109/Netscape')
,("Connection"         ,'general' ,'1.1','multi-value','RFC 2616, 14.10')
,("Content-Encoding"   ,'entity'  ,'1.0','multi-value','RFC 2616, 14.11')
#,("Content-Disposition",'entity'  ,'1.1','multi-value','RFC 2616, 15.5' )
,("Content-Language"   ,'entity'  ,'1.1','multi-value','RFC 2616, 14.12')
#,("Content-Length"     ,'entity'  ,'1.0','singular'   ,'RFC 2616, 14.13')
,("Content-Location"   ,'entity'  ,'1.1','singular'   ,'RFC 2616, 14.14')
,("Content-MD5"        ,'entity'  ,'1.1','singular'   ,'RFC 2616, 14.15')
#,("Content-Range"      ,'entity'  ,'1.1','singular'   ,'RFC 2616, 14.16')
#,("Content-Type"       ,'entity'  ,'1.0','singular'   ,'RFC 2616, 14.17')
,("Date"               ,'general' ,'1.0','date-header','RFC 2616, 14.18')
,("ETag"               ,'response','1.1','singular'   ,'RFC 2616, 14.19')
,("Expect"             ,'request' ,'1.1','multi-value','RFC 2616, 14.20')
,("Expires"            ,'entity'  ,'1.0','date-header','RFC 2616, 14.21')
,("From"               ,'request' ,'1.0','singular'   ,'RFC 2616, 14.22')
,("Host"               ,'request' ,'1.1','singular'   ,'RFC 2616, 14.23')
,("If-Match"           ,'request' ,'1.1','multi-value','RFC 2616, 14.24')
#,("If-Modified-Since"  ,'request' ,'1.0','date-header','RFC 2616, 14.25')
,("If-None-Match"      ,'request' ,'1.1','multi-value','RFC 2616, 14.26')
,("If-Range"           ,'request' ,'1.1','singular'   ,'RFC 2616, 14.27')
,("If-Unmodified-Since",'request' ,'1.1','date-header','RFC 2616, 14.28')
,("Last-Modified"      ,'entity'  ,'1.0','date-header','RFC 2616, 14.29')
,("Location"           ,'response','1.0','singular'   ,'RFC 2616, 14.30')
,("Max-Forwards"       ,'request' ,'1.1','singular'   ,'RFC 2616, 14.31')
,("Pragma"             ,'general' ,'1.0','multi-value','RFC 2616, 14.32')
,("Proxy-Authenticate" ,'response','1.1','multi-value','RFC 2616, 14.33')
,("Proxy-Authorization",'request' ,'1.1','singular'   ,'RFC 2616, 14.34')
#,("Range"              ,'request' ,'1.1','multi-value','RFC 2616, 14.35')
,("Referer"            ,'request' ,'1.0','singular'   ,'RFC 2616, 14.36')
,("Retry-After"        ,'response','1.1','singular'   ,'RFC 2616, 14.37')
,("Server"             ,'response','1.0','singular'   ,'RFC 2616, 14.38')
,("Set-Cookie"         ,'response','1.0','multi-entry','RFC 2109/Netscape')
,("TE"                 ,'request' ,'1.1','multi-value','RFC 2616, 14.39')
,("Trailer"            ,'general' ,'1.1','multi-value','RFC 2616, 14.40')
,("Transfer-Encoding"  ,'general' ,'1.1','multi-value','RFC 2616, 14.41')
,("Upgrade"            ,'general' ,'1.1','multi-value','RFC 2616, 14.42')
,("User-Agent"         ,'request' ,'1.0','singular'   ,'RFC 2616, 14.43')
,("Vary"               ,'response','1.1','multi-value','RFC 2616, 14.44')
,("Via"                ,'general' ,'1.1','multi-value','RFC 2616, 14.45')
,("Warning"            ,'general' ,'1.1','multi-entry','RFC 2616, 14.46')
,("WWW-Authenticate"   ,'response','1.0','multi-entry','RFC 2616, 14.47')):
    klass = {'multi-value': _MultiValueHeader,
             'multi-entry': _MultiEntryHeader,
             'date-header': _DateHeader,
             'singular'   : _SingleValueHeader}[style]
    klass(name, category, comment, version).__doc__ = comment
    del klass

for head in _headers.values():
    headname = head.name.replace("-","_").upper()
    locals()[headname] = head
    __all__.append(headname)

__pudge_all__ = __all__[:]
for _name, _obj in six.iteritems(dict(globals())):
    if isinstance(_obj, type) and issubclass(_obj, HTTPHeader):
        __pudge_all__.append(_name)