Wiki Example
============

:author: Ian Bicking <ianb@colorstudy.com>

.. contents::

Introduction
------------

This is an example of how to write a WSGI application using WebOb.
WebOb isn't itself intended for writing applications -- it is not a web
framework on its own -- but it is *possible* to write applications
using just WebOb.

The `file serving example <file-example.html>`_ is a better example of
advanced HTTP usage.  The `comment middleware example
<comment-example.html>`_ is a better example of using middleware.
This example provides some completeness by showing an
application-focused end point.

This example implements a very simple wiki.

Code
----

The finished code for this is available in
`docs/wiki-example-code/example.py
<https://github.com/Pylons/webob/tree/master/docs/wiki-example-code/example.py>`_
-- you can run that file as a script to try it out.

Creating an Application
-----------------------

A common pattern for creating small WSGI applications is to have a
class which is instantiated with the configuration.  For our
application we'll be storing the pages under a directory.

.. code-block:: python

    import os

    class WikiApp(object):

        def __init__(self, storage_dir):
            self.storage_dir = os.path.abspath(os.path.normpath(storage_dir))

WSGI applications are callables like ``wsgi_app(environ,
start_response)``.  *Instances* of ``WikiApp`` are WSGI applications, so
we'll implement a ``__call__`` method:

.. code-block:: python

    class WikiApp(object):
        ...
        def __call__(self, environ, start_response):
            # what we'll fill in
            pass

To make the script runnable we'll create a simple command-line
interface:

.. code-block:: python

    if __name__ == '__main__':
        import optparse
        parser = optparse.OptionParser(
            usage='%prog --port=PORT'
            )
        parser.add_option(
            '-p', '--port',
            default='8080',
            dest='port',
            type='int',
            help='Port to serve on (default 8080)')
        parser.add_option(
            '--wiki-data',
            default='./wiki',
            dest='wiki_data',
            help='Place to put wiki data into (default ./wiki/)')
        options, args = parser.parse_args()
        print('Writing wiki pages to %s' % options.wiki_data)
        app = WikiApp(options.wiki_data)
        from wsgiref.simple_server import make_server
        httpd = make_server('localhost', options.port, app)
        print('Serving on http://localhost:%s' % options.port)
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print('^C')

There's not much to talk about in this code block.  The application is
instantiated and served with the built-in module
`wsgiref.simple_server
<http://www.python.org/doc/current/lib/module-wsgiref.simple_server.html>`_.

The WSGI Application
--------------------

Of course all the interesting stuff is in that ``__call__`` method.
WebOb lets you ignore some of the details of WSGI, like what
``start_response`` really is.  ``environ`` is a CGI-like dictionary,
but ``webob.Request`` gives an object interface to it.
``webob.Response`` represents a response, and is itself a WSGI
application.  Here's the hello world of WSGI applications using these
objects:

.. code-block:: python

    from webob import Request, Response

    class WikiApp(object):
        ...

        def __call__(self, environ, start_response):
            req = Request(environ)
            resp = Response(
                'Hello %s!' % req.params.get('name', 'World'))
            return resp(environ, start_response)

``req.params.get('name', 'World')`` gets any query string parameter
(like ``?name=Bob``), or if it's a POST form request it will look for
a form parameter ``name``.
We instantiate the response with the body
of the response.  You could also give keyword arguments like
``content_type='text/plain'`` (``text/html`` is the default content
type and ``200 OK`` is the default status).

For the wiki application we'll support a couple of different kinds of
screens, and we'll make our ``__call__`` method dispatch to different
methods depending on the request.  We'll support an ``action``
parameter like ``?action=edit``, and also dispatch on the method (GET,
POST, etc., in ``req.method``).  We'll pass in the request and expect a
response object back.

Also, WebOb has a series of exceptions in ``webob.exc``, like
``webob.exc.HTTPNotFound``, ``webob.exc.HTTPTemporaryRedirect``, etc.
We'll let the method raise one of these exceptions and turn it
into a response.

One last thing we'll do in our ``__call__`` method is create our
``Page`` object, which represents a wiki page.

All this together makes:

.. code-block:: python

    from webob import Request, Response
    from webob import exc

    class WikiApp(object):
        ...

        def __call__(self, environ, start_response):
            req = Request(environ)
            action = req.params.get('action', 'view')
            # Here's where we get the Page domain object:
            page = self.get_page(req.path_info)
            try:
                try:
                    # The method name is action_{action_param}_{request_method}:
                    meth = getattr(self, 'action_%s_%s' % (action, req.method))
                except AttributeError:
                    # If the method wasn't found there must be
                    # something wrong with the request:
                    raise exc.HTTPBadRequest('No such action %r' % action)
                resp = meth(req, page)
            except exc.HTTPException as e:
                # The exception object itself is a WSGI application/response:
                resp = e
            return resp(environ, start_response)

The Domain Object
-----------------

The ``Page`` domain object isn't really related to the web, but it is
important to implementing this.  Each ``Page`` is just a file on the
filesystem.  Our ``get_page`` method figures out the filename given
the path (the path is in ``req.path_info``, which is all the path
after the base path).  The ``Page`` class handles getting and setting
the title and content.

Here's the method to figure out the filename:

.. code-block:: python

    import os

    class WikiApp(object):
        ...

        def get_page(self, path):
            path = path.lstrip('/')
            if not path:
                # The path was '/', the home page
                path = 'index'
            if path.endswith('/'):
                # A directory-style URL maps to that directory's index page;
                # check before normpath, which strips the trailing slash
                path += 'index'
            path = os.path.join(self.storage_dir, path)
            path = os.path.normpath(path)
            if not path.startswith(self.storage_dir):
                raise exc.HTTPBadRequest("Bad path")
            path += '.html'
            return Page(path)

Mostly this is just the kind of careful path construction you have to
do when mapping a URL to a filename.  While the server *may* normalize
the path (so that a path like ``/../../`` can't be requested), you can
never really be sure.
By using ``os.path.normpath`` we collapse any ``..`` segments,
and then we make absolutely sure that the resulting path is
under our ``self.storage_dir`` with
``if not path.startswith(self.storage_dir): raise exc.HTTPBadRequest("Bad path")``.

Here's the actual domain object:

.. code-block:: python

    import os
    import re

    class Page(object):
        def __init__(self, filename):
            self.filename = filename

        @property
        def exists(self):
            return os.path.exists(self.filename)

        @property
        def title(self):
            if not self.exists:
                # we need to guess the title
                basename = os.path.splitext(os.path.basename(self.filename))[0]
                basename = re.sub(r'[_-]', ' ', basename)
                return basename.capitalize()
            content = self.full_content
            match = re.search(r'<title>(.*?)</title>', content, re.I|re.S)
            return match.group(1)

        @property
        def full_content(self):
            # Read as text so the str patterns above can match;
            # UTF-8 is an assumption about how pages are stored
            with open(self.filename, encoding='utf-8') as f:
                return f.read()

        @property
        def content(self):
            if not self.exists:
                return ''
            content = self.full_content
            match = re.search(r'<body[^>]*>(.*?)</body>', content, re.I|re.S)
            return match.group(1)

        @property
        def mtime(self):
            if not self.exists:
                return None
            else:
                return int(os.stat(self.filename).st_mtime)

        def set(self, title, content):
            dir = os.path.dirname(self.filename)
            if not os.path.exists(dir):
                os.makedirs(dir)
            new_content = """<html><head><title>%s</title></head><body>%s</body></html>""" % (
                title, content)
            with open(self.filename, 'w', encoding='utf-8') as f:
                f.write(new_content)

Basically it provides a ``.title`` attribute, a ``.content``
attribute, the ``.mtime`` (last modified time), and the page can exist
or not (giving appropriate guesses for title and content when the page
does not exist).  It encodes these on the filesystem as a simple HTML
page that is parsed by some regular expressions.
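
The round trip between writing a page and parsing it back can be
sketched in isolation.  This is only an illustration, not part of the
wiki code: ``render_page`` and ``parse_page`` are hypothetical helper
names, and falling back to an empty string when a tag is missing is an
assumption (the ``Page`` properties above would raise instead).

```python
import re

# The same patterns the Page class uses to pull the title and body back out.
TITLE_RE = re.compile(r'<title>(.*?)</title>', re.I | re.S)
BODY_RE = re.compile(r'<body[^>]*>(.*?)</body>', re.I | re.S)


def render_page(title, content):
    """Build the stored HTML string (hypothetical helper mirroring Page.set)."""
    return '<html><head><title>%s</title></head><body>%s</body></html>' % (
        title, content)


def parse_page(html):
    """Recover (title, content) from stored HTML (hypothetical helper)."""
    title = TITLE_RE.search(html)
    body = BODY_RE.search(html)
    # Assumption: a hand-edited file may lack either tag, so fall back to ''.
    return (title.group(1) if title else '',
            body.group(1) if body else '')


stored = render_page('Front page', '<p>Hello, <b>wiki</b>!</p>')
title, content = parse_page(stored)
```

The non-greedy ``(.*?)`` groups stop at the first closing tag, which is
why page content that itself contains ``</body>`` would be truncated by
this storage scheme.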

None of this really applies much to the web or WebOb, so I'll leave it
to you to figure out the details of this.

URLs, PATH_INFO, and SCRIPT_NAME
--------------------------------

This is an aside for the tutorial, but an important concept.  In WSGI,
and accordingly with WebOb, the URL is split up into several pieces.
Some of these are obvious and some not.

An example::

    http://example.com:8080/wiki/article/12?version=10

There are several components here:

* req.scheme: ``http``
* req.host: ``example.com:8080``
* req.server_name: ``example.com``
* req.server_port: 8080
* req.script_name: ``/wiki``
* req.path_info: ``/article/12``
* req.query_string: ``version=10``

One non-obvious part is ``req.script_name`` and ``req.path_info``.
These correspond to the CGI environment variables ``SCRIPT_NAME``
and ``PATH_INFO``.  ``req.script_name`` points to the *application*.
You might have several applications in your site at different paths:
one at ``/wiki``, one at ``/blog``, one at ``/``.  Each application
doesn't necessarily know about the others, but it has to construct its
URLs properly -- so any internal links to the wiki application should
start with ``/wiki``.

Just as there are pieces to the URL, there are several properties in
WebOb to construct URLs based on these:

* req.host_url: ``http://example.com:8080``
* req.application_url: ``http://example.com:8080/wiki``
* req.path_url: ``http://example.com:8080/wiki/article/12``
* req.path: ``/wiki/article/12``
* req.path_qs: ``/wiki/article/12?version=10``
* req.url: ``http://example.com:8080/wiki/article/12?version=10``

You can also create URLs with
``req.relative_url('some/other/page')``.  In this example that would
resolve to ``http://example.com:8080/wiki/article/some/other/page``.
You can also create a relative URL to the application URL
(SCRIPT_NAME) like ``req.relative_url('some/other/page', True)`` which
would be ``http://example.com:8080/wiki/some/other/page``.

Back to the Application
-----------------------

We have a dispatching function with ``__call__`` and we have a domain
object with ``Page``, but we aren't actually doing anything.

The dispatching goes to ``action_ACTION_METHOD``, where ACTION
defaults to ``view``.  So a simple page view will be
``action_view_GET``.  Let's implement that:

.. code-block:: python

    class WikiApp(object):
        ...

        def action_view_GET(self, req, page):
            if not page.exists:
                return exc.HTTPTemporaryRedirect(
                    location=req.url + '?action=edit')
            text = self.view_template.substitute(
                page=page, req=req)
            resp = Response(text)
            resp.last_modified = page.mtime
            resp.conditional_response = True
            return resp

The first thing we do is redirect the user to the edit screen if the
page doesn't exist.  ``exc.HTTPTemporaryRedirect`` is a response that
gives a ``307 Temporary Redirect`` response with the given location.

Otherwise we fill in a template.  The template language we're going to
use in this example is `Tempita <http://pythonpaste.org/tempita/>`_, a
very simple template language with a similar interface to
`string.Template <http://python.org/doc/current/lib/node40.html>`_.

The template actually looks like this:

.. code-block:: python

    from tempita import HTMLTemplate

    VIEW_TEMPLATE = HTMLTemplate("""\
    <html>
     <head>
      <title>{{page.title}}</title>
     </head>
     <body>
    <h1>{{page.title}}</h1>

    <div>{{page.content|html}}</div>

    <hr>
    <a href="{{req.url}}?action=edit">Edit</a>
     </body>
    </html>
    """)

    class WikiApp(object):
        view_template = VIEW_TEMPLATE
        ...

As you can see it's a simple template using the title and the body,
and a link to the edit screen.  We copy the template object into a
class attribute (``view_template = VIEW_TEMPLATE``) so that potentially a
subclass could override these templates.

``tempita.HTMLTemplate`` is a template that does automatic HTML
escaping.  Our wiki will just be written in plain HTML, so we disable
escaping of the content with ``{{page.content|html}}``.

So let's look at the ``action_view_GET`` method again:

.. code-block:: python

        def action_view_GET(self, req, page):
            if not page.exists:
                return exc.HTTPTemporaryRedirect(
                    location=req.url + '?action=edit')
            text = self.view_template.substitute(
                page=page, req=req)
            resp = Response(text)
            resp.last_modified = page.mtime
            resp.conditional_response = True
            return resp

The template should be pretty obvious now.  We create a response with
``Response(text)``, which already has a default Content-Type of
``text/html``.

To allow conditional responses we set ``resp.last_modified``.  You can
set this attribute to a date, None (effectively removing the header),
a time tuple (like that produced by ``time.localtime()``), or as in this
case to an integer timestamp.  If you get the value back it will
always be a `datetime
<http://python.org/doc/current/lib/datetime-datetime.html>`_ object
(or None).  With this header we can process requests with
If-Modified-Since headers, and return ``304 Not Modified`` if
appropriate.  It won't actually do that unless you set
``resp.conditional_response`` to True.

.. note::

    If you subclass ``webob.Response`` you can set the class attribute
    ``default_conditional_response = True`` and this setting will be
    on by default.  You can also set other defaults, like the
    ``default_charset`` (``"UTF-8"``), or ``default_content_type``
    (``"text/html"``).
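
The decision behind a ``304 Not Modified`` can be sketched with the
standard library alone.  This is not WebOb's implementation, only the
comparison that ``If-Modified-Since`` implies; ``is_not_modified`` is a
hypothetical helper name, and ignoring an unparseable header is an
assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_not_modified(if_modified_since, last_modified):
    """Return True when a 304 Not Modified would be appropriate.

    if_modified_since: raw header value sent by the client, or None.
    last_modified: timezone-aware datetime of the resource, or None.
    """
    if not if_modified_since or last_modified is None:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Assumption: an unparseable date is ignored and the page is sent.
        return False
    if since.tzinfo is None:
        # Treat a date without a usable zone as UTC.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    return last_modified.replace(microsecond=0) <= since


# A page last saved at timestamp 100000, and the header a browser would
# echo back after having cached that version.
page_mtime = datetime.fromtimestamp(100000, tz=timezone.utc)
header = format_datetime(page_mtime, usegmt=True)

unchanged = is_not_modified(header, page_mtime)
changed = is_not_modified(
    header, datetime.fromtimestamp(100010, tz=timezone.utc))
```

Here ``unchanged`` is true (the cached copy is still current) and
``changed`` is false (the page was saved ten seconds later, so the full
response must be sent).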

The Edit Screen
---------------

The edit screen will be implemented in the method
``action_edit_GET``.  There's a template and a very simple method:

.. code-block:: python

    EDIT_TEMPLATE = HTMLTemplate("""\
    <html>
     <head>
      <title>Edit: {{page.title}}</title>
     </head>
     <body>
    {{if page.exists}}
    <h1>Edit: {{page.title}}</h1>
    {{else}}
    <h1>Create: {{page.title}}</h1>
    {{endif}}

    <form action="{{req.path_url}}" method="POST">
     <input type="hidden" name="mtime" value="{{page.mtime}}">
     Title: <input type="text" name="title" style="width: 70%" value="{{page.title}}"><br>
     Content: <input type="submit" value="Save">
     <a href="{{req.path_url}}">Cancel</a>
       <br>
     <textarea name="content" style="width: 100%; height: 75%" rows="40">{{page.content}}</textarea>
       <br>
     <input type="submit" value="Save">
     <a href="{{req.path_url}}">Cancel</a>
    </form>
    </body></html>
    """)

    class WikiApp(object):
        ...

        edit_template = EDIT_TEMPLATE

        def action_edit_GET(self, req, page):
            text = self.edit_template.substitute(
                page=page, req=req)
            return Response(text)

As you can see, all the action here is in the template.

In ``<form action="{{req.path_url}}" method="POST">`` we submit to
``req.path_url``; that's everything *but* ``?action=edit``.  So we are
POSTing right over the view page.  This has the nice side effect of
automatically invalidating any caches of the original page.  It also
is vaguely `RESTful
<http://en.wikipedia.org/wiki/Representational_State_Transfer>`_.

We save the last modified time in a hidden ``mtime`` field.  This way
we can detect concurrent updates.  If you start editing a page whose
mtime is 100000, and someone else edits and saves a revision changing
the mtime to 100010, we can use this hidden field to detect that
conflict.
Actually resolving the conflict is a little tricky and
outside the scope of this particular tutorial; we'll just report the
conflict to the user as an error.

From there we just have a very straightforward HTML form.  Note that
we don't quote the values because that is done automatically by
``HTMLTemplate``; if you are using something like ``string.Template``
or a templating language that doesn't do automatic quoting, you have
to be careful to quote all the field values.

We don't have any form validation errors in our application, but if
we did we might have to re-display this form with the
input values the user already gave.  In that case we'd do something
like::

    <input type="text" name="title"
     value="{{req.params.get('title', page.title)}}">

This way we use the value in the request (``req.params`` is both the
query string parameters and any variables in a POST request body), but if
there is no value (e.g., first request) then we use the page values.

Processing the Form
-------------------

The form submits to ``action_view_POST`` (``view`` is the default
action).  So we have to implement that method:

.. code-block:: python

    class WikiApp(object):
        ...

        def action_view_POST(self, req, page):
            submit_mtime = int(req.params.get('mtime') or '0') or None
            if page.mtime != submit_mtime:
                return exc.HTTPPreconditionFailed(
                    "The page has been updated since you started editing it")
            page.set(
                title=req.params['title'],
                content=req.params['content'])
            resp = exc.HTTPSeeOther(
                location=req.path_url)
            return resp

The first thing we do is check the mtime value.  It can be an empty
string (when there's no mtime, like when you are creating a page) or
an integer.
``int(req.params.get('mtime') or '0') or None`` basically
makes sure we don't pass ``""`` to ``int()`` (which is an error) then
turns 0 into None (``0 or None`` will evaluate to None in Python --
``false_value or other_value`` in Python resolves to ``other_value``).
If the check fails we just give a not-very-helpful error message, using ``412
Precondition Failed`` (typically preconditions are HTTP headers like
``If-Unmodified-Since``, but we can't really get the browser to send
requests like that, so we use the hidden field instead).

.. note::

    Error statuses in HTTP are often under-used because people think
    they need to either return an error (useful for machines) or an
    error message or interface (useful for humans).  In fact you can
    do both: you can give any human readable error message with your
    error response.

    One problem is that Internet Explorer will replace error messages
    with its own incredibly unhelpful error messages.  However, it
    will only do this if the error message is short.  If it's fairly
    large (4KB is large enough) it will show the error message it was
    given.  You can load your error with a big HTML comment to
    accomplish this, like ``"<!-- %s -->" % ('x'*4000)``.

    You can change the status of any response with ``resp.status_int =
    412``, or you can change the body of an ``exc.HTTPSomething`` with
    ``resp.body = new_body``.  The primary advantage of using the
    classes in ``webob.exc`` is giving the response a clear name and a
    boilerplate error message.

After we check the mtime we get the form parameters from
``req.params`` and issue a redirect back to the original view page.
``303 See Other`` is a good response to give after accepting a POST
form submission, as it gets rid of the POST (no warning messages for the
user if they try to go back).

In this example we've used ``req.params`` for all the form values.
If we wanted to be specific about where we get the values from, they
could come from ``req.GET`` (the query string, a misnomer since the
query string is present even in POST requests) or ``req.POST`` (a POST
form body).  While sometimes it's nice to distinguish between these
two locations, for the most part it doesn't matter.  If you want to
check the request method (e.g., make sure you can't change a page with
a GET request) there's no reason to do it by accessing these
method-specific getters.  It's better to just handle the method
specifically.  We do it here by including the request method in our
dispatcher (dispatching to ``action_view_GET`` or
``action_view_POST``).


Cookies
-------

One last little improvement we can make is to show the user a message when
they update the page, so it's not quite so mysteriously just another
page view.

A simple way to do this is to set a cookie after the save, then
display it in the page view.  To set it on save, we add a little to
``action_view_POST``:

.. code-block:: python

        def action_view_POST(self, req, page):
            ...
            resp = exc.HTTPSeeOther(
                location=req.path_url)
            resp.set_cookie('message', 'Page updated')
            return resp

And then in ``action_view_GET``:

.. code-block:: python


    VIEW_TEMPLATE = HTMLTemplate("""\
    ...
    {{if message}}
    <div style="background-color: #99f">{{message}}</div>
    {{endif}}
    ...""")

    class WikiApp(object):
        ...

        def action_view_GET(self, req, page):
            ...
            if req.cookies.get('message'):
                message = req.cookies['message']
            else:
                message = None
            text = self.view_template.substitute(
                page=page, req=req, message=message)
            resp = Response(text)
            if message:
                resp.delete_cookie('message')
            else:
                resp.last_modified = page.mtime
                resp.conditional_response = True
            return resp

``req.cookies`` is just a dictionary, and we also delete the cookie if
it is present (so the message doesn't keep getting displayed).  The
conditional response stuff only applies when there isn't any
message, as messages are private.  Another alternative would be to
display the message with JavaScript, like::

    <script type="text/javascript">
    function readCookie(name) {
        var nameEQ = name + "=";
        var ca = document.cookie.split(';');
        for (var i=0; i < ca.length; i++) {
            var c = ca[i];
            while (c.charAt(0) == ' ') c = c.substring(1,c.length);
            if (c.indexOf(nameEQ) == 0) return c.substring(nameEQ.length,c.length);
        }
        return null;
    }

    function createCookie(name, value, days) {
        if (days) {
            var date = new Date();
            date.setTime(date.getTime()+(days*24*60*60*1000));
            var expires = "; expires="+date.toUTCString();
        } else {
            var expires = "";
        }
        document.cookie = name+"="+value+expires+"; path=/";
    }

    function eraseCookie(name) {
        createCookie(name, "", -1);
    }

    function showMessage() {
        var message = readCookie('message');
        if (message) {
            var el = document.getElementById('message');
            el.innerHTML = message;
            el.style.display = '';
            eraseCookie('message');
        }
    }
    </script>

Then put ``<div id="message" style="display: none"></div>`` in the
page somewhere.  This has the advantage of being very cacheable and
simple on the server side.

Conclusion
----------

We're done, hurrah!