:mod:`cgi` --- Common Gateway Interface support
===============================================

.. module:: cgi
   :synopsis: Helpers for running Python scripts via the Common Gateway Interface.


.. index::
   pair: WWW; server
   pair: CGI; protocol
   pair: HTTP; protocol
   pair: MIME; headers
   single: URL
   single: Common Gateway Interface

**Source code:** :source:`Lib/cgi.py`

--------------

Support module for Common Gateway Interface (CGI) scripts.

This module defines a number of utilities for use by CGI scripts written in
Python.


.. _cgi-intro:

Introduction
------------

A CGI script is invoked by an HTTP server, usually to process user input
submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.

Most often, CGI scripts live in the server's special :file:`cgi-bin` directory.
The HTTP server places all sorts of information about the request (such as the
client's hostname, the requested URL, the query string, and lots of other
goodies) in the script's shell environment, executes the script, and sends the
script's output back to the client.

The script's input is connected to the client too, and sometimes the form data
is read this way; at other times the form data is passed via the "query string"
part of the URL. This module is intended to take care of the different cases
and provide a simpler interface to the Python script. It also provides a number
of utilities that help in debugging scripts, and the latest addition is support
for file uploads from a form (if your browser supports it).

The output of a CGI script should consist of two sections, separated by a blank
line. The first section contains a number of headers, telling the client what
kind of data is following.
Python code to generate a minimal header section
looks like this::

   print "Content-Type: text/html"     # HTML is following
   print                               # blank line, end of headers

The second section is usually HTML, which allows the client software to display
nicely formatted text with headers, in-line images, etc. Here's Python code that
prints a simple piece of HTML::

   print "<TITLE>CGI script output</TITLE>"
   print "<H1>This is my first CGI script</H1>"
   print "Hello, world!"


.. _using-the-cgi-module:

Using the cgi module
--------------------

Begin by writing ``import cgi``. Do not use ``from cgi import *`` --- the
module defines all sorts of names for its own use or for backward compatibility
that you don't want in your namespace.

When you write a new script, consider adding these lines::

   import cgitb
   cgitb.enable()

This activates a special exception handler that will display detailed reports in
the Web browser if any errors occur. If you'd rather not show the guts of your
program to users of your script, you can have the reports saved to files
instead, with code like this::

   import cgitb
   cgitb.enable(display=0, logdir="/path/to/logdir")

It's very helpful to use this feature during script development. The reports
produced by :mod:`cgitb` provide information that can save you a lot of time in
tracking down bugs. You can always remove the ``cgitb`` line later when you
have tested your script and are confident that it works correctly.

To get at submitted form data, it's best to use the :class:`FieldStorage` class.
The other classes defined in this module are provided mostly for backward
compatibility. Instantiate :class:`FieldStorage` exactly once, without
arguments. This reads the form contents from standard input or the environment
(depending on the value of various environment variables set according to the
CGI standard). Since it may consume standard input, it should be instantiated
only once.
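
The choice between standard input and the environment can be pictured with a
small sketch. This is a simplified illustration and not the module's actual
implementation: the helper name ``read_raw_form_data`` is hypothetical, and it
assumes only the standard CGI variables ``REQUEST_METHOD``, ``QUERY_STRING``
and ``CONTENT_LENGTH``. It also hints at why the form should be read only once:
a POST body that has been consumed from the input stream cannot be read again.

```python
import io

def read_raw_form_data(environ, stdin):
    """Return the undecoded form data for one request (simplified sketch)."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST bodies arrive on standard input; CONTENT_LENGTH says how
        # much to read, so the script does not wait for more data.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    # Other requests carry the form data in the URL's query string.
    return environ.get("QUERY_STRING", "")

# io.StringIO objects stand in for sys.stdin so the sketch runs on its own.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe+Blow"}
get_data = read_raw_form_data(get_env, io.StringIO(u""))

post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "13"}
post_data = read_raw_form_data(post_env, io.StringIO(u"name=Joe+Blow&ignored"))
```

In a real script, :class:`FieldStorage` performs this work (and the decoding)
for you; the sketch only shows where the raw data comes from.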

The :class:`FieldStorage` instance can be indexed like a Python dictionary.
It allows membership testing with the :keyword:`in` operator, and also supports
the standard dictionary method :meth:`~dict.keys` and the built-in function
:func:`len`. Form fields containing empty strings are ignored and do not appear
in the dictionary; to keep such values, provide a true value for the optional
*keep_blank_values* keyword parameter when creating the :class:`FieldStorage`
instance.

For instance, the following code (which assumes that the
:mailheader:`Content-Type` header and blank line have already been printed)
checks that the fields ``name`` and ``addr`` are both set to a non-empty
string::

   form = cgi.FieldStorage()
   if "name" not in form or "addr" not in form:
       print "<H1>Error</H1>"
       print "Please fill in the name and addr fields."
       return
   print "<p>name:", form["name"].value
   print "<p>addr:", form["addr"].value
   ...further form processing here...

Here the fields, accessed through ``form[key]``, are themselves instances of
:class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form
encoding). The :attr:`~FieldStorage.value` attribute of the instance yields
the string value of the field. The :meth:`~FieldStorage.getvalue` method
returns this string value directly; it also accepts an optional second argument
as a default to return if the requested key is not present.

If the submitted form data contains more than one field with the same name, the
object retrieved by ``form[key]`` is not a :class:`FieldStorage` or
:class:`MiniFieldStorage` instance but a list of such instances. Similarly, in
this situation, ``form.getvalue(key)`` would return a list of strings.
If you
expect this possibility (when your HTML form contains multiple fields with the
same name), use the :meth:`~FieldStorage.getlist` method, which always returns
a list of values (so that you do not need to special-case the single item
case). For example, this code concatenates any number of username fields,
separated by commas::

   value = form.getlist("username")
   usernames = ",".join(value)

If a field represents an uploaded file, accessing the value via the
:attr:`~FieldStorage.value` attribute or the :meth:`~FieldStorage.getvalue`
method reads the entire file into memory as a string. This may not be what you
want. You can test for an uploaded file by testing either the
:attr:`~FieldStorage.filename` attribute or the :attr:`~FieldStorage.file`
attribute. You can then read the data at leisure from the :attr:`!file`
attribute::

   fileitem = form["userfile"]
   if fileitem.file:
       # It's an uploaded file; count lines
       linecount = 0
       while True:
           line = fileitem.file.readline()
           if not line:
               break
           linecount += 1

If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button) the :attr:`~FieldStorage.done` attribute of the
object for the field will be set to the value -1.

The file upload draft standard entertains the possibility of uploading multiple
files from one field (using a recursive :mimetype:`multipart/\*` encoding).
When this occurs, the item will be a dictionary-like :class:`FieldStorage` item.
This can be determined by testing its :attr:`!type` attribute, which should be
:mimetype:`multipart/form-data` (or perhaps another MIME type matching
:mimetype:`multipart/\*`). In this case, it can be iterated over recursively
just like the top-level form object.
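
Returning to the single uploaded file case: reading from the :attr:`!file`
attribute in pieces keeps memory use bounded however large the upload is. The
following is a minimal sketch under the assumption that the attribute behaves
like an ordinary file object opened in binary mode. The helper name
``copy_in_chunks`` and the chunk size are illustrative and not part of this
module, and an ``io.BytesIO`` object stands in for ``fileitem.file`` so that
the example runs on its own.

```python
import io

def copy_in_chunks(source, destination, chunk_size=8192):
    """Copy a file-like object piece by piece; return the bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:              # an empty read means end of file
            break
        destination.write(chunk)
        total += len(chunk)
    return total

# Stand-ins for fileitem.file and for a destination file opened with "wb".
upload = io.BytesIO(b"line one\nline two\n")
saved = io.BytesIO()
copied = copy_in_chunks(upload, saved, chunk_size=4)
```

In a script, ``destination`` would typically be a file opened for binary
writing in a directory the server's user is allowed to write to.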

When a form is submitted in the "old" format (as the query string or as a single
data part of type :mimetype:`application/x-www-form-urlencoded`), the items will
actually be instances of the class :class:`MiniFieldStorage`. In this case, the
:attr:`!list`, :attr:`!file`, and :attr:`!filename` attributes are always ``None``.

A form submitted via POST that also has a query string will contain both
:class:`FieldStorage` and :class:`MiniFieldStorage` items.

Higher Level Interface
----------------------

.. versionadded:: 2.2

The previous section explains how to read CGI form data using the
:class:`FieldStorage` class. This section describes a higher level interface
which was added to this class to allow one to do it in a more readable and
intuitive way. The interface doesn't make the techniques described in previous
sections obsolete --- they are still useful to process file uploads efficiently,
for example.

.. XXX: Is this true ?

The interface consists of two simple methods. Using the methods you can process
form data in a generic way, without the need to worry whether only one or more
values were posted under one name.

In the previous section, you learned to write the following code any time you
expected a user to post more than one value under one name::

   item = form.getvalue("item")
   if isinstance(item, list):
       # The user is requesting more than one item.
   else:
       # The user is requesting only one item.

This situation is common, for example, when a form contains a group of multiple
checkboxes with the same name::

   <input type="checkbox" name="item" value="1" />
   <input type="checkbox" name="item" value="2" />

In most situations, however, there's only one form control with a particular
name in a form and then you expect and need only one value associated with this
name.
So you write a script containing for example this code::

   user = form.getvalue("user").upper()

The problem with the code is that you should never expect that a client will
provide valid input to your scripts. For example, if a curious user appends
another ``user=foo`` pair to the query string, then the script would crash,
because in this situation the ``getvalue("user")`` method call returns a list
instead of a string. Calling the :meth:`~str.upper` method on a list is not valid
(since lists do not have a method of this name) and results in an
:exc:`AttributeError` exception.

Therefore, the appropriate way to read form data values was to always use the
code which checks whether the obtained value is a single value or a list of
values. That's annoying and leads to less readable scripts.

A more convenient approach is to use the methods :meth:`~FieldStorage.getfirst`
and :meth:`~FieldStorage.getlist` provided by this higher level interface.


.. method:: FieldStorage.getfirst(name[, default])

   This method always returns only one value associated with form field *name*.
   If more than one value was posted under that name, the method returns only
   the first. Please note that the order in which the values are received
   may vary from browser to browser and should not be counted on. [#]_ If no such
   form field or value exists then the method returns the value specified by the
   optional parameter *default*. This parameter defaults to ``None`` if not
   specified.


.. method:: FieldStorage.getlist(name)

   This method always returns a list of values associated with form field *name*.
   The method returns an empty list if no such form field or value exists for
   *name*. It returns a list consisting of one item if only one such value exists.

Using these methods you can write nice compact code::

   import cgi
   form = cgi.FieldStorage()
   user = form.getfirst("user", "").upper()    # This way it's safe.
   for item in form.getlist("item"):
       do_something(item)


Old classes
-----------

.. deprecated:: 2.6

   These classes, present in earlier versions of the :mod:`cgi` module, are
   still supported for backward compatibility. New applications should use the
   :class:`FieldStorage` class.

:class:`SvFormContentDict` stores single value form content as a dictionary; it
assumes each field name occurs in the form only once.

:class:`FormContentDict` stores multiple value form content as a dictionary (the
form items are lists of values). Useful if your form contains multiple fields
with the same name.

Other classes (:class:`FormContent`, :class:`InterpFormContentDict`) are present
for backwards compatibility with really old applications only.


.. _functions-in-cgi-module:

Functions
---------

These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.


.. function:: parse(fp[, environ[, keep_blank_values[, strict_parsing]]])

   Parse a query in the environment or from a file (the file defaults to
   ``sys.stdin`` and environment defaults to ``os.environ``). The
   *keep_blank_values* and *strict_parsing* parameters are passed to
   :func:`urlparse.parse_qs` unchanged.


.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])

   This function is deprecated in this module. Use :func:`urlparse.parse_qs`
   instead. It is maintained here only for backward compatibility.

.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])

   This function is deprecated in this module. Use :func:`urlparse.parse_qsl`
   instead. It is maintained here only for backward compatibility.
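
As a brief illustration of the recommended replacements, the sketch below
parses one query string with both functions. The query string is made up for
the example, and the ``try``/``except`` import is an assumption added only so
that the same code also runs on Python 3, where these functions live in
``urllib.parse``.

```python
try:
    from urlparse import parse_qs, parse_qsl        # Python 2
except ImportError:
    from urllib.parse import parse_qs, parse_qsl    # Python 3

# "item" is repeated and "note" is present but empty.
query = "name=Joe+Blow&item=1&item=2&note="

as_dict = parse_qs(query)       # field name -> list of values
as_pairs = parse_qsl(query)     # (name, value) pairs in query order
with_blanks = parse_qs(query, keep_blank_values=True)
```

Here ``as_dict["item"]`` holds both submitted values, and the empty ``note``
field appears only in ``with_blanks``, since blank values are dropped unless
*keep_blank_values* is true.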

.. function:: parse_multipart(fp, pdict)

   Parse input of type :mimetype:`multipart/form-data` (for file uploads).
   Arguments are *fp* for the input file and *pdict* for a dictionary containing
   other parameters in the :mailheader:`Content-Type` header.

   Returns a dictionary just like :func:`urlparse.parse_qs`: keys are the field
   names, each value is a list of values for that field. This is easy to use but
   not much good if you are expecting megabytes to be uploaded --- in that case,
   use the :class:`FieldStorage` class instead, which is much more flexible.

   Note that this does not parse nested multipart parts --- use
   :class:`FieldStorage` for that.


.. function:: parse_header(string)

   Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
   dictionary of parameters.


.. function:: test()

   Robust test CGI script, usable as main program. Writes minimal HTTP headers and
   formats all information provided to the script in HTML form.


.. function:: print_environ()

   Format the shell environment in HTML.


.. function:: print_form(form)

   Format a form in HTML.


.. function:: print_directory()

   Format the current directory in HTML.


.. function:: print_environ_usage()

   Print a list of useful (used by CGI) environment variables in HTML.


.. function:: escape(s[, quote])

   Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string *s* to HTML-safe
   sequences. Use this if you need to display text that might contain such
   characters in HTML. If the optional flag *quote* is true, the quotation mark
   character (``"``) is also translated; this helps for inclusion in an HTML
   attribute value delimited by double quotes, as in ``<a href="...">``. Note
   that single quotes are never translated.

   If the value to be quoted might include single- or double-quote characters,
   or both, consider using the :func:`~xml.sax.saxutils.quoteattr` function in the
   :mod:`xml.sax.saxutils` module instead.


.. _cgi-security:

Caring about security
---------------------

.. index:: pair: CGI; security

There's one important rule: if you invoke an external program (via the
:func:`os.system` or :func:`os.popen` functions, or others with similar
functionality), make very sure you don't pass arbitrary strings received from
the client to the shell. This is a well-known security hole whereby clever
hackers anywhere on the Web can exploit a gullible CGI script to invoke
arbitrary shell commands. Even parts of the URL or field names cannot be
trusted, since the request doesn't have to come from your form!

To be on the safe side, if you must pass a string obtained from a form to a shell
command, you should make sure the string contains only alphanumeric characters,
dashes, underscores, and periods.


Installing your CGI script on a Unix system
-------------------------------------------

Read the documentation for your HTTP server and check with your local system
administrator to find the directory where CGI scripts should be installed;
usually this is in a directory :file:`cgi-bin` in the server tree.

Make sure that your script is readable and executable by "others"; the Unix file
mode should be ``0755`` octal (use ``chmod 0755 filename``). Make sure that the
first line of the script contains ``#!`` starting in column 1 followed by the
pathname of the Python interpreter, for instance::

   #!/usr/local/bin/python

Make sure the Python interpreter exists and is executable by "others".
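
The permission requirement can also be checked from Python. The following is a
rough sketch that assumes a Unix system; the helper name ``usable_by_others``
is hypothetical, and a temporary file stands in for a real script so that the
example is self-contained.

```python
import os
import stat
import tempfile

def usable_by_others(path):
    """Return True if "others" may both read and execute *path*."""
    mode = os.stat(path).st_mode
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed

# Demonstrate on a throwaway file instead of a real CGI script.
handle, script_path = tempfile.mkstemp(suffix=".py")
os.close(handle)

os.chmod(script_path, 0o600)        # private to the owner
private_ok = usable_by_others(script_path)

os.chmod(script_path, 0o755)        # same effect as "chmod 0755 filename"
public_ok = usable_by_others(script_path)
public_mode = stat.S_IMODE(os.stat(script_path).st_mode)

os.remove(script_path)
```

The same check applied to the interpreter named on the ``#!`` line confirms
that it, too, is usable by "others".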

Make sure that any files your script needs to read or write are readable or
writable, respectively, by "others" --- their mode should be ``0644`` for
readable and ``0666`` for writable. This is because, for security reasons, the
HTTP server executes your script as user "nobody", without any special
privileges. It can only read (write, execute) files that everybody can read
(write, execute). The current directory at execution time is also different (it
is usually the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in. In particular, don't count
on the shell's search path for executables (:envvar:`PATH`) or the Python module
search path (:envvar:`PYTHONPATH`) to be set to anything interesting.

If you need to load modules from a directory which is not on Python's default
module search path, you can change the path in your script, before importing
other modules. For example::

   import sys
   sys.path.insert(0, "/usr/home/joe/lib/python")
   sys.path.insert(0, "/usr/local/lib/python")

(This way, the directory inserted last will be searched first!)

Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


Testing your CGI script
-----------------------

Unfortunately, a CGI script will generally not run when you try it from the
command line, and a script that works perfectly from the command line may fail
mysteriously when run from the server. There's one reason why you should still
test your script from the command line: if it contains a syntax error, the
Python interpreter won't execute it at all, and the HTTP server will most likely
send a cryptic error to the client.

Assuming your script has no syntax errors, yet it does not work, you have no
choice but to read the next section.
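
The syntax check described in this section can also be made without running
the script at all, by handing its source text to the built-in :func:`compile`
function. This is an alternative sketched here for illustration, not a
procedure this module provides; the helper name ``find_syntax_error`` and the
sample sources are hypothetical.

```python
def find_syntax_error(source, filename="<script>"):
    """Return a short description of the first syntax error, or None."""
    try:
        compile(source, filename, "exec")   # compiles only; nothing is run
    except SyntaxError as exc:
        return "%s, line %s: %s" % (filename, exc.lineno, exc.msg)
    return None

good = find_syntax_error("x = 1\n")
# The colon after "if True" is missing, so compilation fails.
bad = find_syntax_error("if True\n    x = 1\n", "broken.py")
```

For a real script, pass in the text read from the script file together with
its file name, so that the message points at the right place.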


Debugging CGI scripts
---------------------

.. index:: pair: CGI; debugging

First of all, check for trivial installation errors --- reading the section
above on installing your CGI script carefully can save you a lot of time. If
you wonder whether you have understood the installation procedure correctly, try
installing a copy of this module file (:file:`cgi.py`) as a CGI script. When
invoked as a script, the file will dump its environment and the contents of the
form in HTML form. Give it the right mode etc., and send it a request. If it's
installed in the standard :file:`cgi-bin` directory, it should be possible to
send it a request by entering a URL into your browser of the form:

.. code-block:: none

   http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home

If this gives an error of type 404, the server cannot find the script -- perhaps
you need to install it in a different directory. If it gives another error,
there's an installation problem that you should fix before trying to go any
further. If you get a nicely formatted listing of the environment and form
content (in this example, the fields should be listed as "addr" with value "At
Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been
installed correctly. If you follow the same procedure for your own script, you
should now be able to debug it.

The next step could be to call the :mod:`cgi` module's :func:`test` function
from your script: replace its main code with the single statement ::

   cgi.test()

This should produce the same results as those obtained by installing the
:file:`cgi.py` file itself.

When an ordinary Python script raises an unhandled exception (for whatever
reason: a typo in a module name, a file that can't be opened, etc.), the
Python interpreter prints a nice traceback and exits.
While the Python
interpreter will still do this when your CGI script raises an exception, most
likely the traceback will end up in one of the HTTP server's log files, or be
discarded altogether.

Fortunately, once you have managed to get your script to execute *some* code,
you can easily send tracebacks to the Web browser using the :mod:`cgitb` module.
If you haven't done so already, just add the lines::

   import cgitb
   cgitb.enable()

to the top of your script. Then try running it again; when a problem occurs,
you should see a detailed report that will likely make apparent the cause of the
crash.

If you suspect that there may be a problem in importing the :mod:`cgitb` module,
you can use an even more robust approach (which only uses built-in modules)::

   import sys
   sys.stderr = sys.stdout
   print "Content-Type: text/plain"
   print
   ...your code here...

This relies on the Python interpreter to print the traceback. The content type
of the output is set to plain text, which disables all HTML processing. If your
script works, the raw HTML will be displayed by your client. If it raises an
exception, most likely after the first two lines have been printed, a traceback
will be displayed. Because no HTML interpretation is going on, the traceback
will be readable.


Common problems and solutions
-----------------------------

* Most HTTP servers buffer the output from CGI scripts until the script is
  completed. This means that it is not possible to display a progress report on
  the client's display while the script is running.

* Check the installation instructions above.

* Check the HTTP server's log files. (``tail -f logfile`` in a separate window
  may be useful!)

* Always check a script for syntax errors first, by doing something like
  ``python script.py``.

* If your script does not have any syntax errors, try adding ``import cgitb;
  cgitb.enable()`` to the top of the script.

* When invoking external programs, make sure they can be found. Usually, this
  means using absolute path names --- :envvar:`PATH` is usually not set to a very
  useful value in a CGI script.

* When reading or writing external files, make sure they can be read or written
  by the userid under which your CGI script will be running: this is typically the
  userid under which the web server is running, or some explicitly specified
  userid for a web server's ``suexec`` feature.

* Don't try to give a CGI script a set-uid mode. This doesn't work on most
  systems, and is a security liability as well.

.. rubric:: Footnotes

.. [#] Note that some recent versions of the HTML specification do state what order the
   field values should be supplied in, but knowing whether a request was
   received from a conforming browser, or even from a browser at all, is tedious
   and error-prone.