1<html> 2<head> 3<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> 4<title>Rationale</title> 5<link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css"> 6<meta name="generator" content="DocBook XSL Stylesheets V1.79.1"> 7<link rel="home" href="../index.html" title="Spirit 2.5.8"> 8<link rel="up" href="../index.html" title="Spirit 2.5.8"> 9<link rel="prev" href="notes/style_guide.html" title="Style Guide"> 10<link rel="next" href="repository.html" title="Spirit Repository"> 11</head> 12<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> 13<table cellpadding="2" width="100%"><tr> 14<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td> 15<td align="center"><a href="../../../../../index.html">Home</a></td> 16<td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td> 17<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> 18<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> 19<td align="center"><a href="../../../../../more/index.htm">More</a></td> 20</tr></table> 21<hr> 22<div class="spirit-nav"> 23<a accesskey="p" href="notes/style_guide.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="repository.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a> 24</div> 25<div class="section"> 26<div class="titlepage"><div><div><h2 class="title" style="clear: both"> 27<a name="spirit.rationale"></a><a class="link" href="rationale.html" title="Rationale">Rationale</a> 28</h2></div></div></div> 29<h4> 30<a name="spirit.rationale.h0"></a> 31 <span class="phrase"><a name="spirit.rationale.naming"></a></span><a class="link" href="rationale.html#spirit.rationale.naming">Naming</a> 32 </h4> 33<p> 34 Why use the name "Spirit", "Qi" and "Karma"? 35 Because xpressive names have a better spirit, brings qi to your software and 36 will enhance your karma so they can heal your (con)fusion and make you wave 37 like a phoenix from the ashes. (Joachim Faulhaber) 38 </p> 39<h4> 40<a name="spirit.rationale.h1"></a> 41 <span class="phrase"><a name="spirit.rationale.type_erasure__from_static_to_dynamic_c__"></a></span><a class="link" href="rationale.html#spirit.rationale.type_erasure__from_static_to_dynamic_c__">Type Erasure: 42 From static to dynamic C++</a> 43 </h4> 44<p> 45 Rules straddle the border between static and dynamic C++. In effect, a rule 46 transforms compile-time polymorphism (using templates) into run-time polymorphism 47 (using virtual functions). This is necessary due to C++'s inability to automatically 48 declare a variable of a type deduced from an arbitrarily complex expression 49 in the right-hand side (rhs) of an assignment. Basically, we want to do something 50 like: 51 </p> 52<pre class="programlisting"><span class="identifier">T</span> <span class="identifier">rule</span> <span class="special">=</span> <span class="identifier">an_arbitrarily_complex_expression</span><span class="special">;</span> 53</pre> 54<p> 55 without having to know or care about the resulting type of the right-hand side 56 (rhs) of the assignment expression. This can be done with modern C++ 0x compilers 57 using <code class="computeroutput"><span class="keyword">auto</span></code>: 58 </p> 59<pre class="programlisting"><span class="keyword">auto</span> <span class="identifier">rule</span> <span class="special">=</span> <span class="identifier">an_arbitrarily_complex_expression</span><span class="special">;</span> 60</pre> 61<p> 62 Apart from this, we also need a facility to forward declare an unknown type: 63 </p> 64<pre class="programlisting"><span class="identifier">T</span> <span class="identifier">rule</span><span class="special">;</span> 65<span class="special">...</span> 66<span class="identifier">rule</span> <span class="special">=</span> <span class="identifier">a</span> <span class="special">|</span> <span class="identifier">b</span><span class="special">;</span> 67</pre> 68<p> 69 These limitations lead us to this implementation of rules. This comes at the 70 expense of the overhead of a type-erased call which is an indirect function 71 call that connot be inlined, once through each invocation of a rule. 72 </p> 73<h4> 74<a name="spirit.rationale.h2"></a> 75 <span class="phrase"><a name="spirit.rationale.multiple_declaration"></a></span><a class="link" href="rationale.html#spirit.rationale.multiple_declaration">Multiple 76 declaration</a> 77 </h4> 78<p> 79 Some BNF variants allow multiple declarations of a rule. The declarations are 80 taken as alternatives. Example: 81 </p> 82<pre class="programlisting"><span class="identifier">r</span> <span class="special">=</span> <span class="identifier">a</span><span class="special">;</span> 83<span class="identifier">r</span> <span class="special">=</span> <span class="identifier">b</span><span class="special">;</span> 84</pre> 85<p> 86 is equivalent to: 87 </p> 88<pre class="programlisting"><span class="identifier">r</span> <span class="special">=</span> <span class="identifier">a</span> <span class="special">|</span> <span class="identifier">b</span><span class="special">;</span> 89</pre> 90<p> 91 Spirit v1.3 allowed this behavior. However, the current version of Spirit no 92 longer allows this because experience shows that this behavior leads to unwanted 93 gotchas (for instance, it does not allow rules to be held in containers). In 94 the current release of Spirit, a second assignment to a rule will simply redefine 95 it. The old definition is destructed. This follows more closely C++ semantics 96 and is more in line with what the user expects the rule to behave. 97 </p> 98<h4> 99<a name="spirit.rationale.h3"></a> 100 <span class="phrase"><a name="spirit.rationale.sequencing_syntax"></a></span><a class="link" href="rationale.html#spirit.rationale.sequencing_syntax">Sequencing 101 Syntax</a> 102 </h4> 103<p> 104 The comma operator as in <code class="computeroutput"><span class="identifier">a</span><span class="special">,</span> <span class="identifier">b</span></code> seems 105 to be a better candidate, syntax-wise. But then the problem is with its precedence. 106 It has the lowest precedence in C/C++, which makes it virtually useless. 107 </p> 108<p> 109 Bjarne Stroustrup, in his article "Generalizing Overloading for C++2000" 110 talks about overloading whitespace. Such a feature would allow juxtapositioning 111 of parser objects exactly as we do in (E)BNF (e.g. a b | c instead of a >> 112 b | c). Unfortunately, the article was dated April 1, 1998. Oh well. 113 </p> 114<h4> 115<a name="spirit.rationale.h4"></a> 116 <span class="phrase"><a name="spirit.rationale.forward_iterators"></a></span><a class="link" href="rationale.html#spirit.rationale.forward_iterators">Forward 117 iterators</a> 118 </h4> 119<p> 120 In general, the expected iterator is at least a standard conforming forward 121 iterator. Forward iterators are needed for backtracking where the iterator 122 needs to be saved and restored later. Generally speaking, Spirit is a backtracking 123 parser. The implication of this is that at some point, the iterator position 124 will have to be saved to allow the parser to backtrack to a previous point. 125 Thus, for backtracking to work, the framework requires at least a forward iterator. 126 </p> 127<p> 128 There are remedies of course. In cases where we need to use input iterators, 129 you can use the <a class="link" href="support/multi_pass.html" title="The multi pass iterator"><code class="computeroutput"><span class="identifier">multi_pass</span></code></a> 130 iterator to make the forward iterators. 131 </p> 132<p> 133 Some parsers might require more specialized iterators (bi-directional or even 134 random access). Perhaps in the future, deterministic parsers when added to 135 the framework, will perform no backtracking and will need just a single token 136 lookahead, hence will require input iterators only. 137 </p> 138<h4> 139<a name="spirit.rationale.h5"></a> 140 <span class="phrase"><a name="spirit.rationale.exhaustive_backtracking_and_greedy_rd"></a></span><a class="link" href="rationale.html#spirit.rationale.exhaustive_backtracking_and_greedy_rd">Exhaustive 141 backtracking and greedy RD</a> 142 </h4> 143<p> 144 Spirit doesn't do exhaustive backtracking like regular expressions are expected 145 to. For example: 146 </p> 147<pre class="programlisting"><span class="special">*</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">);</span> 148</pre> 149<p> 150 will always fail to match because Spirit's Kleene star does not back off when 151 the rest of the rule fails to match. 152 </p> 153<p> 154 Actually, there's a solution to this greedy RD problem. Such a scheme is discussed 155 in section 6.6.2 of Parsing Techniques: A Practical Guide. The trick involves 156 passing a tail parser (in addition to the scanner) to each parser. The start 157 parser will then simply be: 158 </p> 159<pre class="programlisting"><span class="identifier">start</span> <span class="special">>></span> <span class="identifier">end</span><span class="special">;</span> 160</pre> 161<p> 162 (end is the start's tail). 163 </p> 164<p> 165 Spirit is greedy --using straight forward, naive RD. It is certainly possible 166 to implement the fully backtracking scheme presented above, but there will 167 be also certainly be a performance hit. The scheme will always try to match 168 all possible parser paths (full parser hierarchy traversal) until it reaches 169 a point of certainty, that the whole thing matches or fails to match. 170 </p> 171<div class="blockquote"><blockquote class="blockquote"> 172<p> 173 Backtracking and Greedy RD 174 </p> 175<p> 176 Spirit is quite consistent and intuitive about when it backtracks and to 177 where, although it may not be obvious to those coming from different backgrounds. 178 In general, any (sub)parser will, given the same input, always match the 179 same portion of the input (or fail to match the input at all). This means 180 that Spirit is inherently greedy. Spirit will only backtrack when a (sub)parser 181 fails to match the input, and it will always backtrack to the next choice 182 point upward (not backward) in the parser structure. In other words abb|ab 183 will match <code class="computeroutput"><span class="string">"ab"</span></code>, as 184 will <code class="computeroutput"><span class="identifier">a</span><span class="special">(</span><span class="identifier">bb</span><span class="special">|</span><span class="identifier">b</span><span class="special">)</span></code>, but <code class="computeroutput"><span class="special">(</span><span class="identifier">ab</span><span class="special">|</span><span class="identifier">a</span><span class="special">)</span><span class="identifier">b</span></code> won't 185 because the <code class="computeroutput"><span class="special">(</span><span class="identifier">ab</span><span class="special">|</span><span class="identifier">a</span><span class="special">)</span></code> 186 subparser will always match the <code class="computeroutput"><span class="char">'b'</span></code> 187 after the <code class="computeroutput"><span class="char">'a'</span></code> if it is available. 188 </p> 189<p> 190 --Rainer Deyke 191 </p> 192</blockquote></div> 193<p> 194 This is the very nature of <a class="link" href="abstracts/parsing_expression_grammar.html" title="Parsing Expression Grammar">Parsing 195 Expression Grammar</a>. 196 </p> 197<p> 198 There's a strong preference on "simplicity with all the knobs when you 199 need them" approach, right now. On the other hand, the flexibility of 200 Spirit makes it possible to have different optional schemes available. It might 201 be possible to implement an exhaustive backtracking RD scheme as an optional 202 feature in the future. 203 </p> 204</div> 205<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> 206<td align="left"></td> 207<td align="right"><div class="copyright-footer">Copyright © 2001-2011 Joel de Guzman, Hartmut Kaiser<p> 208 Distributed under the Boost Software License, Version 1.0. (See accompanying 209 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) 210 </p> 211</div></td> 212</tr></table> 213<hr> 214<div class="spirit-nav"> 215<a accesskey="p" href="notes/style_guide.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="repository.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a> 216</div> 217</body> 218</html> 219