• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1<html>
2<head>
3<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
4<title>Rationale</title>
5<link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css">
6<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
7<link rel="home" href="../index.html" title="Spirit 2.5.8">
8<link rel="up" href="../index.html" title="Spirit 2.5.8">
9<link rel="prev" href="notes/style_guide.html" title="Style Guide">
10<link rel="next" href="repository.html" title="Spirit Repository">
11</head>
12<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
13<table cellpadding="2" width="100%"><tr>
14<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td>
15<td align="center"><a href="../../../../../index.html">Home</a></td>
16<td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td>
17<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
18<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
19<td align="center"><a href="../../../../../more/index.htm">More</a></td>
20</tr></table>
21<hr>
22<div class="spirit-nav">
23<a accesskey="p" href="notes/style_guide.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="repository.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
24</div>
25<div class="section">
26<div class="titlepage"><div><div><h2 class="title" style="clear: both">
27<a name="spirit.rationale"></a><a class="link" href="rationale.html" title="Rationale">Rationale</a>
28</h2></div></div></div>
29<h4>
30<a name="spirit.rationale.h0"></a>
31      <span class="phrase"><a name="spirit.rationale.naming"></a></span><a class="link" href="rationale.html#spirit.rationale.naming">Naming</a>
32    </h4>
33<p>
34      Why use the name "Spirit", "Qi" and "Karma"?
35      Because xpressive names have a better spirit, brings qi to your software and
36      will enhance your karma so they can heal your (con)fusion and make you wave
37      like a phoenix from the ashes. (Joachim Faulhaber)
38    </p>
39<h4>
40<a name="spirit.rationale.h1"></a>
41      <span class="phrase"><a name="spirit.rationale.type_erasure__from_static_to_dynamic_c__"></a></span><a class="link" href="rationale.html#spirit.rationale.type_erasure__from_static_to_dynamic_c__">Type Erasure:
42      From static to dynamic C++</a>
43    </h4>
44<p>
45      Rules straddle the border between static and dynamic C++. In effect, a rule
46      transforms compile-time polymorphism (using templates) into run-time polymorphism
47      (using virtual functions). This is necessary due to C++'s inability to automatically
48      declare a variable of a type deduced from an arbitrarily complex expression
49      in the right-hand side (rhs) of an assignment. Basically, we want to do something
50      like:
51    </p>
52<pre class="programlisting"><span class="identifier">T</span> <span class="identifier">rule</span> <span class="special">=</span> <span class="identifier">an_arbitrarily_complex_expression</span><span class="special">;</span>
53</pre>
54<p>
55      without having to know or care about the resulting type of the right-hand side
56      (rhs) of the assignment expression. This can be done with modern C++ 0x compilers
57      using <code class="computeroutput"><span class="keyword">auto</span></code>:
58    </p>
59<pre class="programlisting"><span class="keyword">auto</span> <span class="identifier">rule</span> <span class="special">=</span> <span class="identifier">an_arbitrarily_complex_expression</span><span class="special">;</span>
60</pre>
61<p>
62      Apart from this, we also need a facility to forward declare an unknown type:
63    </p>
64<pre class="programlisting"><span class="identifier">T</span> <span class="identifier">rule</span><span class="special">;</span>
65<span class="special">...</span>
66<span class="identifier">rule</span> <span class="special">=</span> <span class="identifier">a</span> <span class="special">|</span> <span class="identifier">b</span><span class="special">;</span>
67</pre>
68<p>
69      These limitations lead us to this implementation of rules. This comes at the
70      expense of the overhead of a type-erased call which is an indirect function
71      call that connot be inlined, once through each invocation of a rule.
72    </p>
73<h4>
74<a name="spirit.rationale.h2"></a>
75      <span class="phrase"><a name="spirit.rationale.multiple_declaration"></a></span><a class="link" href="rationale.html#spirit.rationale.multiple_declaration">Multiple
76      declaration</a>
77    </h4>
78<p>
79      Some BNF variants allow multiple declarations of a rule. The declarations are
80      taken as alternatives. Example:
81    </p>
82<pre class="programlisting"><span class="identifier">r</span> <span class="special">=</span> <span class="identifier">a</span><span class="special">;</span>
83<span class="identifier">r</span> <span class="special">=</span> <span class="identifier">b</span><span class="special">;</span>
84</pre>
85<p>
86      is equivalent to:
87    </p>
88<pre class="programlisting"><span class="identifier">r</span> <span class="special">=</span> <span class="identifier">a</span> <span class="special">|</span> <span class="identifier">b</span><span class="special">;</span>
89</pre>
90<p>
91      Spirit v1.3 allowed this behavior. However, the current version of Spirit no
92      longer allows this because experience shows that this behavior leads to unwanted
93      gotchas (for instance, it does not allow rules to be held in containers). In
94      the current release of Spirit, a second assignment to a rule will simply redefine
95      it. The old definition is destructed. This follows more closely C++ semantics
96      and is more in line with what the user expects the rule to behave.
97    </p>
98<h4>
99<a name="spirit.rationale.h3"></a>
100      <span class="phrase"><a name="spirit.rationale.sequencing_syntax"></a></span><a class="link" href="rationale.html#spirit.rationale.sequencing_syntax">Sequencing
101      Syntax</a>
102    </h4>
103<p>
104      The comma operator as in <code class="computeroutput"><span class="identifier">a</span><span class="special">,</span> <span class="identifier">b</span></code> seems
105      to be a better candidate, syntax-wise. But then the problem is with its precedence.
106      It has the lowest precedence in C/C++, which makes it virtually useless.
107    </p>
108<p>
109      Bjarne Stroustrup, in his article "Generalizing Overloading for C++2000"
110      talks about overloading whitespace. Such a feature would allow juxtapositioning
111      of parser objects exactly as we do in (E)BNF (e.g. a b | c instead of a &gt;&gt;
112      b | c). Unfortunately, the article was dated April 1, 1998. Oh well.
113    </p>
114<h4>
115<a name="spirit.rationale.h4"></a>
116      <span class="phrase"><a name="spirit.rationale.forward_iterators"></a></span><a class="link" href="rationale.html#spirit.rationale.forward_iterators">Forward
117      iterators</a>
118    </h4>
119<p>
120      In general, the expected iterator is at least a standard conforming forward
121      iterator. Forward iterators are needed for backtracking where the iterator
122      needs to be saved and restored later. Generally speaking, Spirit is a backtracking
123      parser. The implication of this is that at some point, the iterator position
124      will have to be saved to allow the parser to backtrack to a previous point.
125      Thus, for backtracking to work, the framework requires at least a forward iterator.
126    </p>
127<p>
128      There are remedies of course. In cases where we need to use input iterators,
129      you can use the <a class="link" href="support/multi_pass.html" title="The multi pass iterator"><code class="computeroutput"><span class="identifier">multi_pass</span></code></a>
130      iterator to make the forward iterators.
131    </p>
132<p>
133      Some parsers might require more specialized iterators (bi-directional or even
134      random access). Perhaps in the future, deterministic parsers when added to
135      the framework, will perform no backtracking and will need just a single token
136      lookahead, hence will require input iterators only.
137    </p>
138<h4>
139<a name="spirit.rationale.h5"></a>
140      <span class="phrase"><a name="spirit.rationale.exhaustive_backtracking_and_greedy_rd"></a></span><a class="link" href="rationale.html#spirit.rationale.exhaustive_backtracking_and_greedy_rd">Exhaustive
141      backtracking and greedy RD</a>
142    </h4>
143<p>
144      Spirit doesn't do exhaustive backtracking like regular expressions are expected
145      to. For example:
146    </p>
147<pre class="programlisting"><span class="special">*</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">);</span>
148</pre>
149<p>
150      will always fail to match because Spirit's Kleene star does not back off when
151      the rest of the rule fails to match.
152    </p>
153<p>
154      Actually, there's a solution to this greedy RD problem. Such a scheme is discussed
155      in section 6.6.2 of Parsing Techniques: A Practical Guide. The trick involves
156      passing a tail parser (in addition to the scanner) to each parser. The start
157      parser will then simply be:
158    </p>
159<pre class="programlisting"><span class="identifier">start</span> <span class="special">&gt;&gt;</span> <span class="identifier">end</span><span class="special">;</span>
160</pre>
161<p>
162      (end is the start's tail).
163    </p>
164<p>
165      Spirit is greedy --using straight forward, naive RD. It is certainly possible
166      to implement the fully backtracking scheme presented above, but there will
167      be also certainly be a performance hit. The scheme will always try to match
168      all possible parser paths (full parser hierarchy traversal) until it reaches
169      a point of certainty, that the whole thing matches or fails to match.
170    </p>
171<div class="blockquote"><blockquote class="blockquote">
172<p>
173        Backtracking and Greedy RD
174      </p>
175<p>
176        Spirit is quite consistent and intuitive about when it backtracks and to
177        where, although it may not be obvious to those coming from different backgrounds.
178        In general, any (sub)parser will, given the same input, always match the
179        same portion of the input (or fail to match the input at all). This means
180        that Spirit is inherently greedy. Spirit will only backtrack when a (sub)parser
181        fails to match the input, and it will always backtrack to the next choice
182        point upward (not backward) in the parser structure. In other words abb|ab
183        will match <code class="computeroutput"><span class="string">"ab"</span></code>, as
184        will <code class="computeroutput"><span class="identifier">a</span><span class="special">(</span><span class="identifier">bb</span><span class="special">|</span><span class="identifier">b</span><span class="special">)</span></code>, but <code class="computeroutput"><span class="special">(</span><span class="identifier">ab</span><span class="special">|</span><span class="identifier">a</span><span class="special">)</span><span class="identifier">b</span></code> won't
185        because the <code class="computeroutput"><span class="special">(</span><span class="identifier">ab</span><span class="special">|</span><span class="identifier">a</span><span class="special">)</span></code>
186        subparser will always match the <code class="computeroutput"><span class="char">'b'</span></code>
187        after the <code class="computeroutput"><span class="char">'a'</span></code> if it is available.
188      </p>
189<p>
190        --Rainer Deyke
191      </p>
192</blockquote></div>
193<p>
194      This is the very nature of <a class="link" href="abstracts/parsing_expression_grammar.html" title="Parsing Expression Grammar">Parsing
195      Expression Grammar</a>.
196    </p>
197<p>
198      There's a strong preference on "simplicity with all the knobs when you
199      need them" approach, right now. On the other hand, the flexibility of
200      Spirit makes it possible to have different optional schemes available. It might
201      be possible to implement an exhaustive backtracking RD scheme as an optional
202      feature in the future.
203    </p>
204</div>
205<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
206<td align="left"></td>
207<td align="right"><div class="copyright-footer">Copyright © 2001-2011 Joel de Guzman, Hartmut Kaiser<p>
208        Distributed under the Boost Software License, Version 1.0. (See accompanying
209        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
210      </p>
211</div></td>
212</tr></table>
213<hr>
214<div class="spirit-nav">
215<a accesskey="p" href="notes/style_guide.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="repository.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
216</div>
217</body>
218</html>
219