Wiki Example
============

:author: Ian Bicking <ianb@colorstudy.com>

.. contents::

Introduction
------------

This is an example of how to write a WSGI application using WebOb.
WebOb isn't itself intended to write applications -- it is not a web
framework on its own -- but it is *possible* to write applications
using just WebOb.

The `file serving example <file-example.html>`_ is a better example of
advanced HTTP usage.  The `comment middleware example
<comment-example.html>`_ is a better example of using middleware.
This example provides some completeness by showing an
application-focused end point.

This example implements a very simple wiki.

Code
----

The finished code for this is available in
`docs/wiki-example-code/example.py
<https://github.com/Pylons/webob/tree/master/docs/wiki-example-code/example.py>`_
-- you can run that file as a script to try it out.

Creating an Application
-----------------------

A common pattern for creating small WSGI applications is to have a
class which is instantiated with the configuration.  For our
application we'll be storing the pages under a directory.

.. code-block:: python

    import os

    class WikiApp(object):

        def __init__(self, storage_dir):
            self.storage_dir = os.path.abspath(os.path.normpath(storage_dir))

WSGI applications are callables like ``wsgi_app(environ,
start_response)``.  *Instances* of ``WikiApp`` are WSGI applications, so
we'll implement a ``__call__`` method:

.. code-block:: python

    class WikiApp(object):
        ...
        def __call__(self, environ, start_response):
            pass  # what we'll fill in

To make the script runnable we'll create a simple command-line
interface:

.. code-block:: python

    if __name__ == '__main__':
        import optparse
        parser = optparse.OptionParser(
            usage='%prog --port=PORT'
            )
        parser.add_option(
            '-p', '--port',
            default=8080,
            dest='port',
            type='int',
            help='Port to serve on (default 8080)')
        parser.add_option(
            '--wiki-data',
            default='./wiki',
            dest='wiki_data',
            help='Place to put wiki data into (default ./wiki/)')
        options, args = parser.parse_args()
        print('Writing wiki pages to %s' % options.wiki_data)
        app = WikiApp(options.wiki_data)
        from wsgiref.simple_server import make_server
        httpd = make_server('localhost', options.port, app)
        print('Serving on http://localhost:%s' % options.port)
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print('^C')

There's not much to talk about in this code block.  The application is
instantiated and served with the built-in module
`wsgiref.simple_server
<http://www.python.org/doc/current/lib/module-wsgiref.simple_server.html>`_.

The WSGI Application
--------------------

Of course all the interesting stuff is in that ``__call__`` method.
WebOb lets you ignore some of the details of WSGI, like what
``start_response`` really is.  ``environ`` is a CGI-like dictionary,
but ``webob.Request`` gives an object interface to it.
``webob.Response`` represents a response, and is itself a WSGI
application.  Here's kind of the hello world of WSGI applications
using these objects:

.. code-block:: python

    from webob import Request, Response

    class WikiApp(object):
        ...

        def __call__(self, environ, start_response):
            req = Request(environ)
            resp = Response(
                'Hello %s!' % req.params.get('name', 'World'))
            return resp(environ, start_response)

``req.params.get('name', 'World')`` gets any query string parameter
(like ``?name=Bob``), or if it's a POST form request it will look for
a form parameter ``name``.  We instantiate the response with the body
of the response.  You could also give keyword arguments like
``content_type='text/plain'`` (``text/html`` is the default content
type and ``200 OK`` is the default status).
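
To see what WebOb is taking care of here, it can help to compare
against the same hello world written as a bare WSGI callable.  The
following is only a sketch using the standard library (``wsgiref`` and
``urllib.parse``); the names ``raw_hello_app`` and ``call_app`` are
made up for this illustration and are not part of WebOb:

.. code-block:: python

    from urllib.parse import parse_qs
    from wsgiref.util import setup_testing_defaults


    def raw_hello_app(environ, start_response):
        # Without WebOb we parse the query string by hand.
        params = parse_qs(environ.get('QUERY_STRING', ''))
        name = params.get('name', ['World'])[0]
        body = ('Hello %s!' % name).encode('utf-8')
        # start_response takes the status line and a list of header tuples.
        start_response('200 OK', [
            ('Content-Type', 'text/html; charset=UTF-8'),
            ('Content-Length', str(len(body))),
        ])
        # The return value is an iterable of byte strings.
        return [body]


    def call_app(app, query_string=''):
        # Hypothetical test harness: fake an environ and capture the response.
        environ = {'QUERY_STRING': query_string}
        setup_testing_defaults(environ)
        captured = {}

        def start_response(status, headers, exc_info=None):
            captured['status'] = status
            captured['headers'] = dict(headers)

        body = b''.join(app(environ, start_response))
        return captured['status'], captured['headers'], body

Calling ``call_app(raw_hello_app, 'name=Bob')`` gives back the status,
the headers, and the body.  Unlike ``req.params``, this sketch only
looks at the query string and ignores any POST form body.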

For the wiki application we'll support a couple of different kinds of
screens, and we'll make our ``__call__`` method dispatch to different
methods depending on the request.  We'll support an ``action``
parameter like ``?action=edit``, and also dispatch on the method (GET,
POST, etc., in ``req.method``).  We'll pass in the request and expect a
response object back.

Also, WebOb has a series of exceptions in ``webob.exc``, like
``webob.exc.HTTPNotFound``, ``webob.exc.HTTPTemporaryRedirect``, etc.
We'll also let the method raise one of these exceptions and turn it
into a response.

One last thing we'll do in our ``__call__`` method is create our
``Page`` object, which represents a wiki page.

All this together makes:

.. code-block:: python

    from webob import Request, Response
    from webob import exc

    class WikiApp(object):
        ...

        def __call__(self, environ, start_response):
            req = Request(environ)
            action = req.params.get('action', 'view')
            # Here's where we get the Page domain object:
            page = self.get_page(req.path_info)
            try:
                try:
                    # The method name is action_{action_param}_{request_method}:
                    meth = getattr(self, 'action_%s_%s' % (action, req.method))
                except AttributeError:
                    # If the method wasn't found there must be
                    # something wrong with the request:
                    raise exc.HTTPBadRequest('No such action %r' % action)
                resp = meth(req, page)
            except exc.HTTPException as e:
                # The exception object itself is a WSGI application/response:
                resp = e
            return resp(environ, start_response)
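
The name-based dispatch can be tried out on its own, without any web
machinery.  In this sketch ``Dispatcher`` and ``BadRequest`` are
hypothetical stand-ins for ``WikiApp`` and ``exc.HTTPBadRequest``, and
the handlers just return strings instead of responses:

.. code-block:: python

    class BadRequest(Exception):
        """Stand-in for webob.exc.HTTPBadRequest in this sketch."""


    class Dispatcher(object):

        def dispatch(self, action, method):
            # Build the handler name the same way __call__ does.
            try:
                meth = getattr(self, 'action_%s_%s' % (action, method))
            except AttributeError:
                raise BadRequest('No such action %r' % action)
            return meth()

        def action_view_GET(self):
            return 'viewing'

        def action_edit_GET(self):
            return 'editing'

One consequence of this design is that every attribute whose name
starts with ``action_`` and follows this pattern is reachable from a
URL, so that prefix is best reserved for real handlers.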

The Domain Object
-----------------

The ``Page`` domain object isn't really related to the web, but it is
important to implementing this.  Each ``Page`` is just a file on the
filesystem.  Our ``get_page`` method figures out the filename given
the path (the path is in ``req.path_info``, which is all the path
after the base path).  The ``Page`` class handles getting and setting
the title and content.

Here's the method to figure out the filename:

.. code-block:: python

    import os

    class WikiApp(object):
        ...

        def get_page(self, path):
            path = path.lstrip('/')
            if not path:
                # The path was '/', the home page
                path = 'index'
            # Resolve the requested path relative to the storage directory
            path = os.path.join(self.storage_dir, path)
            path = os.path.normpath(path)
            if path.endswith('/'):
                path += 'index'
            if not path.startswith(self.storage_dir):
                raise exc.HTTPBadRequest("Bad path")
            path += '.html'
            return Page(path)

Mostly this is just the kind of careful path construction you have to
do when mapping a URL to a filename.  While the server *may* normalize
the path (so that a path like ``/../../`` can't be requested), you can
never really be sure.  By using ``os.path.normpath`` we eliminate
these, and then we make absolutely sure that the resulting path is
under our ``self.storage_dir`` with ``if not
path.startswith(self.storage_dir): raise exc.HTTPBadRequest("Bad
path")``.
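
One caveat with a plain ``startswith`` check is that it compares
characters, not path components, so with a storage directory of
``/data/wiki`` it would also accept ``/data/wiki-old/page``.  A
stricter variant, sketched here with the hypothetical helper
``safe_join`` and raising ``ValueError`` instead of an HTTP exception,
could use ``os.path.commonpath`` (Python 3.5 and later):

.. code-block:: python

    import os


    def safe_join(storage_dir, url_path):
        """Map a URL path to a filename under storage_dir, or raise ValueError."""
        storage_dir = os.path.abspath(storage_dir)
        relative = url_path.lstrip('/') or 'index'
        candidate = os.path.normpath(os.path.join(storage_dir, relative))
        # commonpath compares whole components, so a sibling directory
        # that merely shares a name prefix is rejected.
        inside = os.path.commonpath([storage_dir, candidate]) == storage_dir
        if not inside or candidate == storage_dir:
            raise ValueError('Bad path: %r' % url_path)
        return candidate + '.html'

With this helper ``safe_join('wiki', '/a/b')`` resolves to
``a/b.html`` under the storage directory, while ``/../secret`` and
``/../wiki-old/x`` are both rejected.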

Here's the actual domain object:

.. code-block:: python

    import os
    import re

    class Page(object):
        def __init__(self, filename):
            self.filename = filename

        @property
        def exists(self):
            return os.path.exists(self.filename)

        @property
        def title(self):
            if not self.exists:
                # we need to guess the title
                basename = os.path.splitext(os.path.basename(self.filename))[0]
                basename = re.sub(r'[_-]', ' ', basename)
                return basename.capitalize()
            content = self.full_content
            match = re.search(r'<title>(.*?)</title>', content, re.I|re.S)
            return match.group(1)

        @property
        def full_content(self):
            # Read as text so the regular expressions work on str
            with open(self.filename, 'r', encoding='utf-8') as f:
                return f.read()

        @property
        def content(self):
            if not self.exists:
                return ''
            content = self.full_content
            match = re.search(r'<body[^>]*>(.*?)</body>', content, re.I|re.S)
            return match.group(1)

        @property
        def mtime(self):
            if not self.exists:
                return None
            else:
                return int(os.stat(self.filename).st_mtime)

        def set(self, title, content):
            dir = os.path.dirname(self.filename)
            if not os.path.exists(dir):
                os.makedirs(dir)
            new_content = """<html><head><title>%s</title></head><body>%s</body></html>""" % (
                title, content)
            with open(self.filename, 'w', encoding='utf-8') as f:
                f.write(new_content)

Basically it provides a ``.title`` attribute, a ``.content``
attribute, the ``.mtime`` (last modified time), and the page can exist
or not (giving appropriate guesses for title and content when the page
does not exist).  It encodes these on the filesystem as a simple HTML
page that is parsed by some regular expressions.

None of this really applies much to the web or WebOb, so I'll leave it
to you to figure out the details of this.
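
If you do want to poke at the storage format, the round trip between
``Page.set`` and the two regular expressions can be reproduced in a
few lines.  The helper names ``render_page`` and ``parse_page`` are
invented for this sketch; the format string and patterns are the ones
used by ``Page``:

.. code-block:: python

    import re

    PAGE_FORMAT = '<html><head><title>%s</title></head><body>%s</body></html>'


    def render_page(title, content):
        # Same layout that Page.set() writes to disk.
        return PAGE_FORMAT % (title, content)


    def parse_page(text):
        # Same patterns that Page.title and Page.content use.
        title = re.search(r'<title>(.*?)</title>', text, re.I | re.S)
        body = re.search(r'<body[^>]*>(.*?)</body>', text, re.I | re.S)
        return title.group(1), body.group(1)

Note that the non-greedy match stops at the first ``</body>``, so page
content that itself contains that closing tag would be truncated when
read back.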

URLs, PATH_INFO, and SCRIPT_NAME
--------------------------------

This is an aside for the tutorial, but an important concept.  In WSGI,
and accordingly with WebOb, the URL is split up into several pieces.
Some of these are obvious and some not.

An example::

  http://example.com:8080/wiki/article/12?version=10

There are several components here:

* req.scheme: ``http``
* req.host: ``example.com:8080``
* req.server_name: ``example.com``
* req.server_port: 8080
* req.script_name: ``/wiki``
* req.path_info: ``/article/12``
* req.query_string: ``version=10``

One non-obvious part is ``req.script_name`` and ``req.path_info``.
These correspond to the CGI environment variables ``SCRIPT_NAME``
and ``PATH_INFO``.  ``req.script_name`` points to the *application*.
You might have several applications in your site at different paths:
one at ``/wiki``, one at ``/blog``, one at ``/``.  Each application
doesn't necessarily know about the others, but it has to construct its
URLs properly -- so any internal links to the wiki application should
start with ``/wiki``.

Just as there are pieces to the URL, there are several properties in
WebOb to construct URLs based on these:

* req.host_url: ``http://example.com:8080``
* req.application_url: ``http://example.com:8080/wiki``
* req.path_url: ``http://example.com:8080/wiki/article/12``
* req.path: ``/wiki/article/12``
* req.path_qs: ``/wiki/article/12?version=10``
* req.url: ``http://example.com:8080/wiki/article/12?version=10``

You can also create URLs with
``req.relative_url('some/other/page')``.  In this example that would
resolve to ``http://example.com:8080/wiki/article/some/other/page``.
You can also create a relative URL to the application URL
(SCRIPT_NAME) like ``req.relative_url('some/other/page', True)`` which
would be ``http://example.com:8080/wiki/some/other/page``.
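
The same decomposition can be reproduced with the standard library,
which may make the relationship between the pieces easier to see.
This is only an illustration with ``urllib.parse``, not how WebOb
computes these values, and it assumes the application is mounted at
``/wiki`` (the URL alone cannot tell you where ``SCRIPT_NAME`` ends):

.. code-block:: python

    from urllib.parse import urljoin, urlsplit

    url = 'http://example.com:8080/wiki/article/12?version=10'
    parts = urlsplit(url)

    # Assumption: the server mounted the application at /wiki.
    script_name = '/wiki'
    path_info = parts.path[len(script_name):]

    host_url = '%s://%s' % (parts.scheme, parts.netloc)
    application_url = host_url + script_name
    path_url = application_url + path_info

    # Relative to the current page, and relative to the application root.
    relative_to_page = urljoin(path_url, 'some/other/page')
    relative_to_app = urljoin(application_url + '/', 'some/other/page')

Here ``relative_to_page`` and ``relative_to_app`` come out as the two
URLs described for ``req.relative_url`` above.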

Back to the Application
-----------------------

We have a dispatching function with ``__call__`` and we have a domain
object with ``Page``, but we aren't actually doing anything.

The dispatching goes to ``action_ACTION_METHOD``, where ACTION
defaults to ``view``.  So a simple page view will be
``action_view_GET``.  Let's implement that:

.. code-block:: python

    class WikiApp(object):
        ...

        def action_view_GET(self, req, page):
            if not page.exists:
                return exc.HTTPTemporaryRedirect(
                    location=req.url + '?action=edit')
            text = self.view_template.substitute(
                page=page, req=req)
            resp = Response(text)
            resp.last_modified = page.mtime
            resp.conditional_response = True
            return resp

The first thing we do is redirect the user to the edit screen if the
page doesn't exist.  ``exc.HTTPTemporaryRedirect`` is a response that
gives a ``307 Temporary Redirect`` response with the given location.

Otherwise we fill in a template.  The template language we're going to
use in this example is `Tempita <http://pythonpaste.org/tempita/>`_, a
very simple template language with a similar interface to
`string.Template <http://python.org/doc/current/lib/node40.html>`_.

The template actually looks like this:

.. code-block:: python

    from tempita import HTMLTemplate

    VIEW_TEMPLATE = HTMLTemplate("""\
    <html>
     <head>
      <title>{{page.title}}</title>
     </head>
     <body>
    <h1>{{page.title}}</h1>

    <div>{{page.content|html}}</div>

    <hr>
    <a href="{{req.url}}?action=edit">Edit</a>
     </body>
    </html>
    """)

    class WikiApp(object):
        view_template = VIEW_TEMPLATE
        ...

As you can see it's a simple template using the title and the body,
and a link to the edit screen.  We copy the template object into a
class attribute (``view_template = VIEW_TEMPLATE``) so that potentially
a subclass could override these templates.

``tempita.HTMLTemplate`` is a template that does automatic HTML
escaping.  Our wiki will just be written in plain HTML, so we disable
escaping of the content with ``{{page.content|html}}``.

So let's look at the ``action_view_GET`` method again:

.. code-block:: python

        def action_view_GET(self, req, page):
            if not page.exists:
                return exc.HTTPTemporaryRedirect(
                    location=req.url + '?action=edit')
            text = self.view_template.substitute(
                page=page, req=req)
            resp = Response(text)
            resp.last_modified = page.mtime
            resp.conditional_response = True
            return resp

The template should be pretty obvious now.  We create a response with
``Response(text)``, which already has a default Content-Type of
``text/html``.

To allow conditional responses we set ``resp.last_modified``.  You can
set this attribute to a date, None (effectively removing the header),
a time tuple (like produced by ``time.localtime()``), or as in this
case to an integer timestamp.  If you get the value back it will
always be a `datetime
<http://python.org/doc/current/lib/datetime-datetime.html>`_ object
(or None).  With this header we can process requests with
If-Modified-Since headers, and return ``304 Not Modified`` if
appropriate.  It won't actually do that unless you set
``resp.conditional_response`` to True.

.. note::

    If you subclass ``webob.Response`` you can set the class attribute
    ``default_conditional_response = True`` and this setting will be
    on by default.  You can also set other defaults, like the
    ``default_charset`` (``"utf8"``), or ``default_content_type``
    (``"text/html"``).
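
To make the conditional response less magical, here is roughly the
comparison that has to happen between ``Last-Modified`` and
``If-Modified-Since``.  This is a simplified sketch built on the
standard library, not WebOb's actual implementation, and the function
names are invented for the example:

.. code-block:: python

    from datetime import datetime, timezone
    from email.utils import format_datetime, parsedate_to_datetime


    def last_modified_header(mtime):
        """Format an integer timestamp as an HTTP date string."""
        dt = datetime.fromtimestamp(mtime, tz=timezone.utc)
        return format_datetime(dt, usegmt=True)


    def is_not_modified(if_modified_since, mtime):
        """Return True if a 304 Not Modified response would be appropriate."""
        if not if_modified_since or mtime is None:
            return False
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            # Unparseable header: ignore it and send the full response.
            return False
        if since.tzinfo is None:
            # Treat dates without a usable zone as UTC.
            since = since.replace(tzinfo=timezone.utc)
        modified = datetime.fromtimestamp(mtime, tz=timezone.utc)
        return modified <= since

A browser that cached the page at mtime 100000 would send that date
back, and the check returns True until the page is saved again with a
later mtime.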

The Edit Screen
---------------

The edit screen will be implemented in the method
``action_edit_GET``.  There's a template and a very simple method:

.. code-block:: python

    EDIT_TEMPLATE = HTMLTemplate("""\
    <html>
     <head>
      <title>Edit: {{page.title}}</title>
     </head>
     <body>
    {{if page.exists}}
    <h1>Edit: {{page.title}}</h1>
    {{else}}
    <h1>Create: {{page.title}}</h1>
    {{endif}}

    <form action="{{req.path_url}}" method="POST">
     <input type="hidden" name="mtime" value="{{page.mtime}}">
     Title: <input type="text" name="title" style="width: 70%" value="{{page.title}}"><br>
     Content: <input type="submit" value="Save">
     <a href="{{req.path_url}}">Cancel</a>
       <br>
     <textarea name="content" style="width: 100%; height: 75%" rows="40">{{page.content}}</textarea>
       <br>
     <input type="submit" value="Save">
     <a href="{{req.path_url}}">Cancel</a>
    </form>
    </body></html>
    """)

    class WikiApp(object):
        ...

        edit_template = EDIT_TEMPLATE

        def action_edit_GET(self, req, page):
            text = self.edit_template.substitute(
                page=page, req=req)
            return Response(text)

As you can see, all the action here is in the template.

In ``<form action="{{req.path_url}}" method="POST">`` we submit to
``req.path_url``; that's everything *but* ``?action=edit``.  So we are
POSTing right over the view page.  This has the nice side effect of
automatically invalidating any caches of the original page.  It also
is vaguely `RESTful
<http://en.wikipedia.org/wiki/Representational_State_Transfer>`_.

We save the last modified time in a hidden ``mtime`` field.  This way
we can detect concurrent updates.  If you start editing a page whose
mtime is 100000, and someone else edits and saves a revision changing
the mtime to 100010, we can use this hidden field to detect that
conflict.  Actually resolving the conflict is a little tricky and
outside the scope of this particular tutorial; we'll just note the
conflict to the user in an error.

From there we just have a very straightforward HTML form.  Note that
we don't quote the values because that is done automatically by
``HTMLTemplate``; if you are using something like ``string.Template``
or a templating language that doesn't do automatic quoting, you have
to be careful to quote all the field values.
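
As an illustration of what quoting by hand looks like, here is a
sketch that renders the title field with ``string.Template`` and
``html.escape`` from the standard library.  The names
``FIELD_TEMPLATE`` and ``render_title_field`` are hypothetical and not
part of the wiki code:

.. code-block:: python

    from html import escape
    from string import Template

    FIELD_TEMPLATE = Template(
        '<input type="text" name="title" value="$title">')


    def render_title_field(title):
        # quote=True also escapes double quotes, which matters because
        # the value is placed inside an HTML attribute.
        return FIELD_TEMPLATE.substitute(title=escape(title, quote=True))

Without the ``escape`` call, a title containing ``">`` would close the
attribute early and let the rest of the title be interpreted as
markup.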

We don't have any error conditions in our application, but if there
were error conditions we might have to re-display this form with the
input values the user already gave.  In that case we'd do something
like::

    <input type="text" name="title"
     value="{{req.params.get('title', page.title)}}">

This way we use the value in the request (``req.params`` is both the
query string parameters and any variables in a POST request body), but
if there is no value (e.g., first request) then we use the page values.

Processing the Form
-------------------

The form submits to ``action_view_POST`` (``view`` is the default
action).  So we have to implement that method:

.. code-block:: python

    class WikiApp(object):
        ...

        def action_view_POST(self, req, page):
            submit_mtime = int(req.params.get('mtime') or '0') or None
            if page.mtime != submit_mtime:
                return exc.HTTPPreconditionFailed(
                    "The page has been updated since you started editing it")
            page.set(
                title=req.params['title'],
                content=req.params['content'])
            resp = exc.HTTPSeeOther(
                location=req.path_url)
            return resp

The first thing we do is check the mtime value.  It can be an empty
string (when there's no mtime, like when you are creating a page) or
an integer.  ``int(req.params.get('mtime') or '0') or None`` basically
makes sure we don't pass ``""`` to ``int()`` (which is an error) then
turns 0 into None (``0 or None`` will evaluate to None in Python --
``false_value or other_value`` in Python resolves to ``other_value``).
If the submitted mtime doesn't match the page's current mtime we just
give a not-very-helpful error message, using ``412 Precondition
Failed`` (typically preconditions are HTTP headers like
``If-Unmodified-Since``, but we can't really get the browser to send
requests like that, so we use the hidden field instead).
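
The ``or`` chain is compact enough that it is worth trying in
isolation.  In this sketch ``parse_mtime`` and ``has_conflict`` are
hypothetical helpers that wrap the same expression and comparison used
in ``action_view_POST``:

.. code-block:: python

    def parse_mtime(value):
        """Turn the hidden form field into an int timestamp or None."""
        # '' and None fall back to '0'; a resulting 0 then becomes None.
        return int(value or '0') or None


    def has_conflict(page_mtime, submitted_value):
        """True if the page changed since the edit form was rendered."""
        return page_mtime != parse_mtime(submitted_value)

A new page has ``page.mtime`` of None and submits an empty field, so
the two sides agree and the save goes through.  A non-numeric value
would still raise ``ValueError`` here, which the tutorial code does
not guard against.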

.. note::

    Error statuses in HTTP are often under-used because people think
    they need to either return an error (useful for machines) or an
    error message or interface (useful for humans).  In fact you can
    do both: you can give any human readable error message with your
    error response.

    One problem is that Internet Explorer will replace error messages
    with its own incredibly unhelpful error messages.  However, it
    will only do this if the error message is short.  If it's fairly
    large (4Kb is large enough) it will show the error message it was
    given.  You can load your error with a big HTML comment to
    accomplish this, like ``"<!-- %s -->" % ('x'*4000)``.

    You can change the status of any response with ``resp.status_int =
    412``, or you can change the body of an ``exc.HTTPSomething`` with
    ``resp.body = new_body``.  The primary advantage of using the
    classes in ``webob.exc`` is giving the response a clear name and a
    boilerplate error message.

After we check the mtime we get the form parameters from
``req.params`` and issue a redirect back to the original view page.
``303 See Other`` is a good response to give after accepting a POST
form submission, as it gets rid of the POST (no warning messages for the
user if they try to go back).

In this example we've used ``req.params`` for all the form values.  If
we wanted to be specific about where we get the values from, they
could come from ``req.GET`` (the query string, a misnomer since the
query string is present even in POST requests) or ``req.POST`` (a POST
form body).  While sometimes it's nice to distinguish between these
two locations, for the most part it doesn't matter.  If you want to
check the request method (e.g., make sure you can't change a page with
a GET request) there's no reason to do it by accessing these
method-specific getters.  It's better to just handle the method
specifically.  We do it here by including the request method in our
dispatcher (dispatching to ``action_view_GET`` or
``action_view_POST``).

Cookies
-------

One last little improvement we can do is show the user a message when
they update the page, so it's not quite so mysteriously just another
page view.

A simple way to do this is to set a cookie after the save, then
display it in the page view.  To set it on save, we add a little to
``action_view_POST``:

.. code-block:: python

    def action_view_POST(self, req, page):
        ...
        resp = exc.HTTPSeeOther(
            location=req.path_url)
        resp.set_cookie('message', 'Page updated')
        return resp

And then in ``action_view_GET``:

.. code-block:: python

    VIEW_TEMPLATE = HTMLTemplate("""\
    ...
    {{if message}}
    <div style="background-color: #99f">{{message}}</div>
    {{endif}}
    ...""")

    class WikiApp(object):
        ...

        def action_view_GET(self, req, page):
            ...
            if req.cookies.get('message'):
                message = req.cookies['message']
            else:
                message = None
            text = self.view_template.substitute(
                page=page, req=req, message=message)
            resp = Response(text)
            if message:
                resp.delete_cookie('message')
            else:
                resp.last_modified = page.mtime
                resp.conditional_response = True
            return resp

``req.cookies`` is just a dictionary, and we also delete the cookie if
it is present (so the message doesn't keep getting set).  The
conditional response stuff only applies when there isn't any
message, as messages are private.  Another alternative would be to
display the message with Javascript, like::

    <script type="text/javascript">
    function readCookie(name) {
        var nameEQ = name + "=";
        var ca = document.cookie.split(';');
        for (var i=0; i < ca.length; i++) {
            var c = ca[i];
            while (c.charAt(0) == ' ') c = c.substring(1,c.length);
            if (c.indexOf(nameEQ) == 0) return c.substring(nameEQ.length,c.length);
        }
        return null;
    }

    function createCookie(name, value, days) {
        if (days) {
            var date = new Date();
            date.setTime(date.getTime()+(days*24*60*60*1000));
            var expires = "; expires="+date.toGMTString();
        } else {
            var expires = "";
        }
        document.cookie = name+"="+value+expires+"; path=/";
    }

    function eraseCookie(name) {
        createCookie(name, "", -1);
    }

    function showMessage() {
        var message = readCookie('message');
        if (message) {
            var el = document.getElementById('message');
            el.innerHTML = message;
            el.style.display = '';
            eraseCookie('message');
        }
    }
    </script>

Then put ``<div id="message" style="display: none"></div>`` in the
page somewhere.  This has the advantage of being very cacheable and
simple on the server side.
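
Whichever side displays the message, what travels over the wire is a
``Set-Cookie`` header on the redirect and a ``Cookie`` header on the
next request.  The following sketch uses ``http.cookies`` from the
standard library to show that round trip; it is not what WebOb does
internally, and the three function names are invented for the example:

.. code-block:: python

    from http.cookies import SimpleCookie


    def set_message_header(message):
        """Value for a Set-Cookie header that stores the message."""
        cookie = SimpleCookie()
        cookie['message'] = message
        cookie['message']['path'] = '/'
        return cookie['message'].OutputString()


    def read_message(cookie_header):
        """Extract the message from a request's Cookie header, or None."""
        cookie = SimpleCookie()
        cookie.load(cookie_header or '')
        morsel = cookie.get('message')
        return morsel.value if morsel is not None else None


    def delete_message_header():
        """Value for a Set-Cookie header that expires the message."""
        cookie = SimpleCookie()
        cookie['message'] = ''
        cookie['message']['path'] = '/'
        cookie['message']['max-age'] = 0
        return cookie['message'].OutputString()

The value ``'Page updated'`` contains a space, so ``SimpleCookie``
quotes it on the way out and unquotes it again in ``read_message``.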

Conclusion
----------

We're done, hurrah!