<html>
<head>
<title>FAQ</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>

<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  <tr>
    <td width="10">
    </td>
    <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>FAQ</b></font></td>
    <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  </tr>
</table>
<br>
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="techniques.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="30"><a href="rationale.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<ul>
  <li><a href="#scanner_business">The Scanner Business</a></li>
  <li><a href="#left_recursion">Eliminating Left Recursion</a> </li>
  <li><a href="#right_associativity">Implementing Right Associativity</a></li>
  <li><a href="#lexeme_and_rules">The lexeme_d directive and rules</a></li>
  <li><a href="#kleene_star">Kleene Star infinite loop</a></li>
  <li><a href="#CVS">Boost CVS and Spirit CVS</a></li>
  <li><a href="#compilation_times">How to reduce compilation times with complex
    Spirit grammars</a></li>
  <li><strong><a href="#frame_assertion">Closure frame assertion</a></strong></li>
  <li><strong><a href="#greedy_rd">Greedy RD</a></strong></li>
  <li><strong><a href="#referencing_a_rule_at_construction">Referencing a rule
    at construction time</a></strong></li>
  <li><strong><a href="#storing_rules">Storing Rules</a></strong></li>
  <li><strong><a href="#parsing_ints_and_reals">Parsing ints and reals</a> </strong></li>
  <li><strong><a href="#output_operator">BOOST_SPIRIT_DEBUG and missing <tt>operator&lt;&lt;</tt></a></strong></li>
  <li><strong><a href="#repository">Applications that used to be part of Spirit</a></strong></li>
</ul>
<p><b> <a name="scanner_business" id="scanner_business"></a> The Scanner Business</b></p>
<p><font color="#FF0000">Question:</font> Why doesn't this compile?</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>r </span><span class=special>= /*...*/;
</span>    <span class=identifier>parse</span><span class=special>(</span><span class=string>"hello world"</span><span class=special>, </span><span class=identifier>r</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>); </span><span class=comment>// BAD [attempts phrase level parsing]</span></font></code></pre>
<p>But if I <font color="#000000">remove the skip-parser, everything goes back
  to normal again:<code></code></font></p>
<pre><code><font color="#000000">    <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>r </span><span class=special>= *</span><span class=identifier>anychar_p</span><span class=special>;
    </span><span class=identifier>parse</span><span class=special>(</span><span class=string>"hello world"</span><span class=special>, </span><span class=identifier>r</span><span class=special>); </span><span class=comment>// OK [character level parsing]</span></font></code></pre>
<p>Sometimes you'll want to pass in a rule to one of the parse functions
  that Spirit provides. The problem is that the rule is a template class that
  is parameterized by the scanner type. This is rather awkward but unavoidable:
  <strong>the rule is tied to a scanner</strong>. What's not obvious is that this
  scanner must be compatible with the scanner that is ultimately passed to the
  rule's parse member function. Otherwise, the compiler will complain. </p>
<p>Why does the first call to parse not compile?
Because of scanner incompatibility.
  Behind the scenes, the free parse function creates a scanner from the iterators
  passed in. In the second call to parse (the one without a skip parser), the scanner
  created is a plain vanilla <tt>scanner&lt;&gt;</tt>. This is compatible with the default scanner type of
  <tt>rule&lt;&gt;</tt> [see default template parameters of <a href="rule.html">the
  rule</a>]. The first call (the one with <tt>space_p</tt>) creates a scanner of type <tt><a href="scanner.html#phrase_scanner_t">phrase_scanner_t</a></tt>.
  Thus, in order for the first call to succeed, the rule must be parameterized
  as <tt>rule&lt;phrase_scanner_t&gt;</tt>:</p>
<pre><code><font color="#000000"><span class=comment>    </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>phrase_scanner_t</span><span class=special>&gt; </span><span class=identifier>r </span><span class=special>= </span><span class=special>*</span><span class=identifier>anychar_p</span><span class=special>;
    </span><span class=identifier>parse</span><span class=special>(</span><span class=string>"hello world"</span><span class=special>, </span><span class=identifier>r</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>); </span><span class=comment>// OK [phrase level parsing]</span></font></code></pre>
<p>Take note however that <tt>phrase_scanner_t</tt> is compatible only when you
  are using <tt>char const*</tt> iterators and <tt>space_p</tt> as the skip parser.
  Other than that, you'll have to find the right type of scanner. This is tedious
  to do correctly. In light of this issue, <strong>it is best to avoid rules as
  arguments to the parse functions</strong>. Keep in mind that this happens only
  with rules. The rule is the only parser that has to be tied to a particular
  scanner type.
For instance:</p>
<pre><span class=comment>    </span><span class=identifier>parse</span><span class=special>(</span><span class=string>"hello world"</span><span class=special>, *</span><span class=identifier>anychar_p</span><span class=special>); </span><span class=comment><code><font color="#000000"><span class=comment>// OK [character level parsing]</span></font></code>
    </span><span class=identifier>parse</span><span class=special>(</span><span class=string>"hello world"</span><span class=special>, *</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>); </span><span class="comment">// OK [phrase level parsing]</span></pre>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"> <strong><img src="theme/note.gif" width="16" height="16">
      Multiple Scanner Support</strong><br>
      <br>
      As of v1.8.0, rules can use one or more scanner types. There are cases,
      for instance, where we need a rule that can work on the phrase and character
      levels. Rule/scanner mismatch has been a source of confusion and is the
      no. 1 <a href="faq.html#scanner_business">FAQ</a>. To address this issue,
      we now have <a href="rule.html#multiple_scanner_support">multiple scanner
      support</a>. <br>
      <br>
      <img src="theme/bulb.gif" width="13" height="18"> See the techniques section
      for an <a href="techniques.html#multiple_scanner_support">example</a> of
      a <a href="grammar.html">grammar</a> using a multiple scanner enabled rule,
      <a href="scanner.html#lexeme_scanner">lexeme_scanner</a> and <a href="scanner.html#as_lower_scanner">as_lower_scanner.</a></td>
  </tr>
</table>
<p><b> <a name="left_recursion"></a> Eliminating Left Recursion </b></p>
<p><font color="#FF0000">Question:</font> I ported a grammar from YACC. It's "kinda"
  working - the parser itself compiles with no errors. But when I try to parse,
  it gives me an "invalid page fault".
I tracked down the problem to
  this grammar snippet:</p>
<pre>    <span class=identifier>or_expr </span><span class=special>= </span><span class=identifier>xor_expr </span><span class=special>| (</span><span class=identifier>or_expr </span><span class=special>&gt;&gt; </span><span class=identifier>VBAR </span><span class=special>&gt;&gt; </span><span class=identifier>xor_expr</span><span class=special>);</span></pre>
<p>What you should do is eliminate direct and indirect left recursion. Left
  recursion causes the invalid page fault because the rule keeps calling itself
  without consuming any input, so the program recurses without end. The
  code above is good for bottom up parsers such as YACC but not for LL parsers
  such as Spirit.</p>
<p>This is similar to a rule in Hartmut Kaiser's C
  parser (this should be available for download from <a href="http://spirit.sf.net">Spirit's site</a> as soon as you read this).</p>
<pre>
    <span class=identifier>inclusive_or_expression
        </span><span class=special>= </span><span class=identifier>exclusive_or_expression
        </span><span class=special>| </span><span class=identifier>inclusive_or_expression </span><span class=special>&gt;&gt; </span><span class=identifier>OR </span><span class=special>&gt;&gt; </span><span class=identifier>exclusive_or_expression
        </span><span class=special>;</span></pre>
<p><span class=special></span>Transforming left recursion to right recursion,
  we have:</p>
<pre>    <span class=identifier>inclusive_or_expression
        </span><span class=special>= </span><span class=identifier>exclusive_or_expression </span><span class=special>&gt;&gt; </span><span class=identifier>inclusive_or_expression_helper
        </span><span class=special>;

    </span><span class=identifier>inclusive_or_expression_helper
        </span><span class=special>= </span><span class=identifier>OR </span><span class=special>&gt;&gt; </span><span class=identifier>exclusive_or_expression </span><span class=special>&gt;&gt; </span><span class=identifier>inclusive_or_expression_helper
        </span><span
class=special>| </span><span class=identifier>epsilon_p
        </span><span class=special>;</span></pre>
<p><span class=special></span>I'd go further. Since:</p>
<pre>    <span class=identifier>r </span><span class=special>= </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>epsilon_p</span><span class=special>;</span></pre>
<p><span class=special></span>is equivalent to:<span class=special><br>
  </span></p>
<pre>    <span class=identifier>r </span><span class=special>= !</span><span class=identifier>a</span><span class=special>;</span></pre>
<p>we can simplify <tt>inclusive_or_expression_helper</tt> thus:</p>
<pre>    <span class=identifier>inclusive_or_expression_helper
        </span><span class=special>= !(</span><span class=identifier>OR </span><span class=special>&gt;&gt; </span><span class=identifier>exclusive_or_expression </span><span class=special>&gt;&gt; </span><span class=identifier>inclusive_or_expression_helper</span><span class=special>)
        ;</span></pre>
<p><span class=special></span>Now, since:</p>
<pre>    <span class=identifier>r </span><span class=special>= !(</span><span class=identifier>a </span><span class=special>&gt;&gt; </span><span class=identifier>r</span><span class=special>);</span></pre>
<p><span class=special></span>is equivalent to:</p>
<pre>    <span class=identifier>r </span><span class=special>= *</span><span class=identifier>a</span><span class=special>;</span></pre>
<p><span class=special></span>we have:</p>
<pre>    <span class=identifier>inclusive_or_expression_helper
        </span><span class=special>= *(</span><span class=identifier>OR </span><span class=special>&gt;&gt; </span><span class=identifier>exclusive_or_expression</span><span class=special>)
        ;</span></pre>
<p><span class=special></span>Now simplifying <tt>inclusive_or_expression</tt>
  fully, we have:</p>
<pre>    <span class=identifier>inclusive_or_expression
        </span><span class=special>= </span><span
class=identifier>exclusive_or_expression </span><span class=special>&gt;&gt; *(</span><span class=identifier>OR </span><span class=special>&gt;&gt; </span><span class=identifier>exclusive_or_expression</span><span class=special>)
        ;</span></pre>
<p><span class=special></span>Reminds me of the calculators. So in short:</p>
<pre>    <span class=identifier>a </span><span class=special>= </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>a </span><span class=special>&gt;&gt; </span><span class=identifier>op </span><span class=special>&gt;&gt; </span><span class=identifier>b</span><span class=special>;</span></pre>
<p>in pseudo-YACC is:</p>
<pre>    <span class=identifier>a </span><span class=special>= </span><span class=identifier>b </span><span class=special>&gt;&gt; *(</span><span class=identifier>op </span><span class=special>&gt;&gt; </span><span class=identifier>b</span><span class=special>);</span></pre>
<p><span class=special></span>in Spirit. What could be simpler? Look Ma, no recursion,
  just iteration.</p>
<p><b> <a name="right_associativity" id="right_associativity"></a> Implementing Right Associativity </b></p>
<p> <font color="#FF0000">Question:</font> I tried adding <tt>'^'</tt> as an operator to compute the power to a calculator grammar.
The following code
</p>
<pre>    <span class=identifier>pow_expression
        </span><span class=special>= </span><span class=identifier>pow_operand </span><span class=special>&gt;&gt; </span><span class=special>*( </span><span class=literal>'^' </span><span class=special>&gt;&gt; </span><span class=identifier>pow_operand </span><span class=special>[ </span><span class=special>&amp; </span><span class=identifier>do_pow </span><span class=special>]
                          </span><span class=special>)
        </span><span class=special>;</span>
</pre>
<p>parses the input correctly, but I want the operator to be evaluated from right to left. In other words, the expression <tt>2^3^4</tt> is supposed to have the same semantics as <tt>2^(3^4)</tt> instead of <tt>(2^3)^4</tt>. How do I do it?
</p>
<p> The "textbook recipe" for Right Associativity is Right Recursion. In BNF that means:</p>
<pre>    &lt;pow_expression&gt; ::= &lt;pow_operand&gt; '^' &lt;pow_expression&gt; | &lt;pow_operand&gt;
</pre>
<p>But we had better not take the theory too literally here, because if the first alternative fails, the semantic actions within <tt>pow_operand</tt> might have been executed already and will then be executed again when trying the second alternative. So let's apply Left Factorization to factor out <tt>pow_operand</tt>:</p>
<pre>    &lt;pow_expression&gt; ::= &lt;pow_operand&gt; &lt;pow_expression_helper&gt;
    &lt;pow_expression_helper&gt; ::= '^' &lt;pow_expression&gt; | <i>ε</i>
</pre>
<p>The production <tt>pow_expression_helper</tt> matches the empty string <i>ε</i>, so we can replace the alternative with the optional operator in Spirit code.
</p>
<pre>    <span class=identifier>pow_expression
        </span><span class=special>= </span><span class=identifier>pow_operand </span><span class=special>&gt;&gt; </span><span class=special>!( </span><span class=literal>'^' </span><span class=special>&gt;&gt; </span><span class=identifier>pow_expression </span><span class=special>[ </span><span class=special>&amp; </span><span class=identifier>do_pow </span><span class=special>]
                          </span><span class=special>)
        </span><span class=special>;</span>
</pre>
<p>Now any semantic actions within <tt>pow_operand</tt> can safely be executed. For stack-based evaluation that means that each match of <tt>pow_operand</tt> will leave one value on the stack and the recursion makes sure there are (at least) two values on the stack when <tt>do_pow</tt> is fired to reduce these two values to their power.
</p>
<p>In cases where this technique isn't applicable, such as C-style assignment</p>
<pre>    <span class=identifier>assignment
        </span><span class=special>= </span><span class=identifier>lvalue </span><span class=special>&gt;&gt; </span><span class=literal>'=' </span><span class=special>&gt;&gt; </span><span class=identifier>assignment
        </span><span class=special>| </span><span class=identifier>ternary_conditional
        </span><span class=special>;</span>
</pre>
<p>you can append <tt>| epsilon_p [ <i>action</i> ] &gt;&gt; nothing_p</tt> to a parser to correct the semantic context when backtracking occurs (in the example case that would be dropping the address pushed by <tt>lvalue</tt> off the evaluation stack):
</p>
<pre>    <span class=identifier>assignment
        </span><span class=special>= </span><span class=identifier>lvalue </span><span class=special>&gt;&gt; </span><span class=special>( </span><span class=literal>'=' </span><span class=special>&gt;&gt; </span><span class=identifier>assignment </span><span class=special>[ </span><span class=special>&amp; </span><span class=identifier>do_store </span><span class=special>]
                    </span><span
class=special>| </span><span class=identifier>epsilon_p </span><span class=special>[ </span><span class=special>&amp; </span><span class=identifier>do_drop </span><span class=special>]
                      </span><span class=special>&gt;&gt; </span><span class=identifier>nothing_p
                    </span><span class=special>)
        </span><span class=special>| </span><span class=identifier>ternary_conditional
        </span><span class=special>;</span>
</pre>
<p>However, this trick compromises the clear separation of syntax and semantics, so you also might want to consider using an <a href="trees.html">AST</a> instead of semantic actions so you can just go with the first definition of <tt>assignment</tt>.
</p>
<p><b> <a name="lexeme_and_rules" id="lexeme_and_rules"></a> The lexeme_d directive
  and rules</b></p>
<p> <font color="#FF0000">Question:</font> Does lexeme_d not support expressions
  which include rules? In the example below, the definition of atomicRule compiles,
</p>
<pre>    <span class=identifier></span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>phrase_scanner_t</span><span class=special>&gt; </span><span class=identifier>atomicRule</span>
        <span class=special>= </span><span class=identifier>lexeme_d</span><span class=special>[(</span><span class=identifier>alpha_p </span><span class=special>| </span><span class=literal>'_'</span><span class=special>) &gt;&gt; *(</span><span class=identifier>alnum_p </span><span class=special>| </span><span class=literal>'.' </span><span class=special>| </span><span class=literal>'-' </span><span class=special>| </span><span class=literal>'_'</span><span class=special>)];</span></pre>
<p>but if I move <tt>alnum_p | '.' | '-' | '_'</tt> into its own rule, the compiler
  complains about conversion from <tt>const scanner&lt;...&gt;</tt> to <tt>const
  phrase_scanner_t&amp;</tt>.
</p>
<pre>    <span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>phrase_scanner_t</span><span class=special>&gt; </span><span class=identifier>ch </span><span class=special>
        = </span><span class=identifier>alnum_p </span><span class=special>| </span><span class=literal>'.' </span><span class=special>| </span><span class=literal>'-' </span><span class=special>| </span><span class=literal>'_'</span><span class=special>;</span>

<span class=identifier>    rule</span><span class=special>&lt;</span><span class=identifier>phrase_scanner_t</span><span class=special>&gt; </span><span class=identifier>compositeRule</span>
        <span class=special>= </span><span class=identifier>lexeme_d</span><span class=special>[(</span><span class=identifier>alpha_p </span><span class=special>| </span><span class=literal>'_'</span><span class=special>) &gt;&gt; *(</span><span class=identifier>ch</span><span class=special>)]; </span><span class="comment">// &lt;- error source</span></pre>
<p>You might get the impression that the <tt>lexeme_d</tt> directive and rules
  do not mix. Actually, this problem is related to the first FAQ entry: The Scanner
  Business. More precisely, the <tt>lexeme_d</tt> directive and rules with incompatible
  scanner types do not mix. This problem is more subtle. What's causing the scanner
  incompatibility is the directive itself. The <tt>lexeme_d</tt> directive transforms
  the scanner it receives into something that disables the skip parser. This non-skipping
  scanner, unfortunately, is incompatible with the original scanner before transformation
  took place.</p>
<p>The simplest solution is not to use rules in the <tt>lexeme_d</tt>. Instead,
  you can definitely apply <tt>lexeme_d</tt> to subrules and grammars if you really
  need more complex parsers inside the <tt>lexeme_d</tt>. If you really must use
  a rule, you need to know the exact scanner used by the directive.
The <tt>lexeme_scanner</tt>
  metafunction is your friend here. The example above will work as expected once
  we give the <tt>ch</tt> rule a correct scanner type:</p>
<pre>    <span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>lexeme_scanner</span><span class="special">&lt;</span><span class=identifier>phrase_scanner_t</span><span class=special>&gt;::</span><span class="identifier">type</span><span class=special>&gt; </span><span class=identifier>ch </span><span class=special>
        = </span><span class=identifier>alnum_p </span><span class=special>| </span><span class=literal>'.' </span><span class=special>| </span><span class=literal>'-' </span><span class=special>| </span><span class=literal>'_'</span><span class=special>;</span></pre>
<p>Note: make sure to add "<tt>typename</tt>" before <tt>lexeme_scanner</tt>
  when this is used inside a template class or function.</p>
<p>The same thing happens when rules are used inside the <tt>as_lower_d</tt> directive.
  In such cases, you can use the <tt>as_lower_scanner</tt>.
See the <span class=identifier><tt><a href="scanner.html#lexeme_scanner">lexeme_scanner</a></tt></span>
  and <tt><a href="scanner.html#as_lower_scanner">as_lower_scanner</a></tt>.</p>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"><img src="theme/bulb.gif" width="13" height="18"> See
      the techniques section for an <a href="techniques.html#multiple_scanner_support">example</a>
      of a <a href="grammar.html">grammar</a> using a <a href="rule.html#multiple_scanner_support">multiple
      scanner enabled rule,</a> <a href="scanner.html#lexeme_scanner">lexeme_scanner</a>
      and <a href="scanner.html#as_lower_scanner">as_lower_scanner.</a></td>
  </tr>
</table>
<p><strong><a name="kleene_star"></a>Kleene Star infinite loop</strong></p>
<p><font color="#FF0000">Question</font>: Why does this loop forever?</p>
<pre>    <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>optional </span><span class=special>= !(</span>str_p<span class="special">(</span><span class="string">"optional"</span><span class="special">));
    </span><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class="identifier">list_of_optional </span><span class=special>= *</span><span class=identifier>optional</span><span class="special">;</span></pre>
<p>The problem with this is that the Kleene star will continue looping until it
  gets a no-match from its enclosed parser. Because the <tt>optional</tt> rule
  is optional, it will always return a match. Even if the input doesn't match
  "optional" it will return a zero length match. <tt>list_of_optional</tt>
  will keep calling <tt>optional</tt> forever since <tt>optional</tt> will never return a no-match.
  So in general, any rule that can be "nullable" (meaning it can return
  a zero length match) must not be put inside a Kleene star.</p>
<p><strong><a name="CVS"></a>Boost CVS and Spirit CVS</strong></p>
<p><font color="#FF0000">Question:</font> There is Boost CVS and Spirit CVS. Which
  is used for further development of Spirit?</p>
<p> Generally, development takes place in Spirit's CVS. However, from time to
  time a new version of Spirit will be integrated in Boost. When this happens
  development takes place in the Boost CVS. There will be announcements on the
  Spirit mailing lists whenever the status of the Spirit CVS changes.<br>
</p>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"><img src="theme/alert.gif" width="16" height="16">
      During development of Spirit v1.8.1 (released as part of boost-1.32.0) and
      v1.6.2, Spirit's developers decided to stop maintaining Spirit CVS for
      BRANCH_1_8 and BRANCH_1_6. This was necessary to reduce the added work of
      maintaining and synch'ing two repositories. The maintenance of these branches
      will take place on Boost CVS. At this time, new developments towards Spirit
      v2 and other experimental developments are expected to happen in Spirit
      CVS.</td>
  </tr>
</table>
<p><strong><a name="compilation_times"></a>How to reduce compilation times with
  complex Spirit grammars </strong></p>
<p><font color="#FF0000">Question:</font> Are there any techniques to minimize
  compile times using Spirit? For simple parsers compile time doesn't seem to
  be a big issue, but recently I created a parser with about 78 rules
  and it took about 2 hours to compile. I would like to break the grammar up into
  smaller chunks, but it is not as easy as I thought it would be because rules
  in two grammar capsules are defined in terms of each other.
Any thoughts?</p>
<p> The only way to reduce compile times is </p>
<ul>
  <li> to split up your grammars into smaller chunks</li>
  <li> to prevent the compiler from seeing all grammar definitions at the same time
    (in the same compilation unit)</li>
</ul>
<p>The first task is merely logistical, the second is rather a technical one. </p>
<p>A good example of solving the first task is given in the Spirit cpp_lexer example
  written by JCAB (you may find it on the <a href="http://spirit.sourceforge.net/repository/applications/show_contents.php">applications' repository</a>).
</p>
<p>The cross referencing problems may be solved by some kind of forward declaration,
  or, if this does not work, by introducing some dummy template argument to the
  non-templated grammars. This allows the instantiation time to be deferred until the
  compiler has seen all the definitions:</p>
<pre>    <span class="keyword">template</span> &lt;<span class="keyword">typename</span> T = <span class="keyword">int</span>&gt;
    <span class="keyword">struct</span> grammar2; <span class="comment">// forward declaration carrying the dummy default argument</span>

    <span class="keyword">template</span> &lt;<span class="keyword">typename</span> T = <span class="keyword">int</span>&gt;
    <span class="keyword">struct</span> grammar1 : <span class="keyword">public</span> grammar&lt;grammar1&lt;T&gt; &gt;
    {
        <span class="comment">// refers to grammar2&lt;&gt;</span>
    };

    <span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt;
    <span class="keyword">struct</span> grammar2 : <span class="keyword">public</span> grammar&lt;grammar2&lt;T&gt; &gt;
    {
        <span class="comment">// refers to grammar1&lt;&gt;</span>
    };

    //...
    grammar1&lt;&gt; g; <span class="comment">// both grammars instantiated here</span>
</pre>
<p>The second task is slightly more complex. You must ensure that in the first
  compilation unit the compiler sees only some function/template <strong>declaration</strong>
  and in the second compilation unit the function/template <strong>definition</strong>.
  This poses no problem if no templates are involved. If templates are involved,
  you need to manually (explicitly) instantiate these templates with the correct
  template parameters inside a separate compilation unit. This way the compilation
  time is split between several compilation units, which also reduces the overall
  required time drastically. </p>
<p>For a sample showing how to achieve this, you may want to look at the <tt>Wave</tt>
  preprocessor library, where this technique is used extensively. (This should be available for download from <a href="http://spirit.sf.net">Spirit's site</a> as soon as you read this.)</p>
<p><strong><a name="frame_assertion" id="frame_assertion"></a>Closure frame assertion</strong></p>
<p><font color="#FF0000">Question:</font> When I run the parser I get an assertion
  <span class="string">"frame.get() != 0 in file closures.hpp"</span>.
  What am I doing wrong?</p>
<p>Basically, the assertion fires when you are accessing a closure variable that
  is not constructed yet. Here's an example. We have three rules <tt>a</tt>, <tt>b</tt>
  and <tt>c</tt>. Consider that the rule <tt>a</tt> has a closure member <tt>m</tt>.
  Now:</p>
<pre>    <span class="identifier">a</span> <span class="special">=</span> <span class="identifier">b</span><span class="special">;</span>
    <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">int_p</span><span class="special">[</span><span class="identifier">a</span><span class="special">.</span><span class="identifier">m</span> <span class="special">=</span> 123<span class="special">];</span>
    <span class="identifier">c</span> <span class="special">=</span> <span class="identifier">b</span><span class="special">;</span></pre>
<p>When the rule <tt>a</tt> is invoked, its frame is set, along with its member
  <tt>m</tt>.
So, when <tt>b</tt> is called from <tt>a</tt>, the semantic action
  <tt>[a.m = 123]</tt> will store <tt>123</tt> into <tt>a</tt>'s closure member
  <tt>m</tt>. On the other hand, when <tt>c</tt> is invoked, and <tt>c</tt> attempts
  to call <tt>b</tt>, no frame for <tt>a</tt> is set. Thus, when <tt>b</tt> is
  called from <tt>c</tt>, the semantic action <tt>[a.m = 123]</tt> will fire the
  <span class="string">"frame.get() != 0 in file closures.hpp"</span>
  assertion.</p>
<p><strong><a name="greedy_rd" id="greedy_rd"></a>Greedy RD</strong></p>
<p><font color="#FF0000">Question:</font> I'm wondering why this won't work
  when parsed:</p>
<pre>
<span class="identifier">    a</span> <span class="special">= +</span><span class="identifier">anychar_p</span><span class="special">;</span>
    <span class="identifier">b</span> = <span class="string">'('</span> <span class="special">&gt;&gt;</span> <span class="identifier">a</span> <span class="special">&gt;&gt;</span> <span class="string">')'</span><span class="special">;</span></pre>
<p>Try this:</p>
<pre>
<span class="identifier">    a</span> <span class="special">= +(</span><span class="identifier">anychar_p - </span><span class="string">')'</span><span class="special">);</span>
    <span class="identifier">b</span> <span class="special">=</span> <span class="string">'('</span> <span class="special">&gt;&gt;</span> <span class="identifier">a</span> <span class="special">&gt;&gt;</span> <span class="string">')'</span><span class="special">;</span></pre>
<p>David Held writes: That's because it's like the langoliers--it eats everything
  up. You usually want to say what it shouldn't eat up by subtracting the terminating
  character from the parser.
The moral being: Using <tt>*anychar_p</tt> or <tt>+anychar_p</tt>
  all by itself is usually a <em>Bad Thing</em>™.</p>
<p>In other words: Recursive Descent is inherently greedy (however, see <a href="rationale.html#exhaustive_rd">Exhaustive
  backtracking and greedy RD</a>).</p>
<p><span class="special"></span><strong><a name="referencing_a_rule_at_construction" id="referencing_a_rule_at_construction"></a>Referencing
  a rule at construction time</strong></p>
<p><font color="#FF0000">Question:</font> The code below terminates with a segmentation
  fault, but I'm (obviously) confused about what I'm doing wrong.</p>
<pre>    rule<span class="special">&lt;</span>ScannerT<span class="special">,</span> clos<span class="special">::</span>context_t<span class="special">&gt;</span> id <span class="special">=</span> int_p<span class="special">[</span>id<span class="special">.</span>i <span class="special">=</span> arg1<span class="special">];</span></pre>
<p>You have a rule <tt>id</tt> being constructed. Before it is constructed, you
  reference <tt>id.i</tt> in the RHS of the constructor. It's a chicken and egg
  thing. The closure member <tt>id.i</tt> is not yet constructed at that point.
  Using assignment will solve the problem. Try this instead:</p>
<pre>    rule<span class="special">&lt;</span>ScannerT<span class="special">,</span> clos<span class="special">::</span>context_t<span class="special">&gt;</span> id<span class="special">;</span>
    id <span class="special">=</span> int_p<span class="special">[</span>id<span class="special">.</span>i <span class="special">=</span> arg1<span class="special">];</span></pre>
<p><span class="special"></span><strong><a name="storing_rules" id="storing_rules"></a>Storing
  Rules </strong></p>
<p><font color="#FF0000">Question:</font> Why can't I store rules in STL containers
  for later use and why can't I pass and return rules to and from functions by
  value? </p>
<p>EBNF is primarily declarative.
As in functional programming, it's a static
  recipe and there's no notion of "do this, then that". However, in Spirit, we managed
  to coax imperative C++ to take in declarative EBNF. Hah! Fun!... We did that
  by making the C++ assignment operator masquerade as EBNF's <tt>::=</tt>, among
  other things (e.g. <tt>&gt;&gt;</tt>, <tt>|</tt>, <tt>&amp;</tt> etc.). We used
  the rule class to let us do that by giving its assignment operator (and copy
  constructor) a different meaning and semantics. Doing so made the rule unlike
  any other C++ object. You can't copy it. You can't assign it. You can't place
  it in a container (vector, stack, etc.). Heck, you can't even return it from a
  function <em>by value</em>.</p>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> The
      rule is a weird object, unlike any other C++ object. It does not have the
      proper copy and assignment semantics and cannot be stored and passed around
      by value.</td>
  </tr>
</table>
<p>However nice declarative EBNF is, the dynamic nature of C++ can be an advantage.
  We've seen this in action here and there. There are indeed some interesting
  applications of dynamic parsers using Spirit. Yet we haven't fully utilized
  the power of dynamic parsing unless(!) we have a rule that's not so alien
  to C++ (i.e. one that behaves as a good C++ object). With such a beast, we can write
  parsers that are defined at run time, as opposed to at compile time.</p>
<p>Now that I've started focusing on rules (hey, check out the hunky new rule features),
  it might be a good time to implement the rule-holder. It is basically just a
  rule, but with C++ object semantics. Yet it's not that simple. Without true garbage
  collection, the implementation will be a bit tricky. We can't simply use reference
  counting because a rule-holder (hey, does anyone here have a better name?)
<em>is-a</em>
  rule, and rules are typically recursive and thus cyclic. The problem is which
  will own which.</p>
<p>Ok... this will do for now. You'll definitely see more of the rule-holder in
  the coming days.</p>
<p><strong><a name="parsing_ints_and_reals"></a>Parsing Ints and Reals</strong></p>
<p> <font color="#FF0000">Question:</font> I was trying to parse an int or float value with the <tt>longest_d</tt> directive and put some actors on the alternatives to visualize the results. When I parse "123.456", the output reports:</p>
<ol>
  <li>(int) has been matched: full match = false</li>
  <li> (double) has been matched: full match = true</li>
</ol>
<p>That is not what I expected. What am I missing? </p>
<p> Actually, the problem is that the semantic actions of both the int and the real branch will be triggered, because both branches will be tried. This doesn't buy us much. What actually wins in the end is what you expected. But there's no easy way to know which one wins. The problem stems from the ambiguity. </p>
<blockquote>
  <p>Case 1: Consider this input: "2". Is it an int or a real? It is both (strictly following the grammar of a real). </p>
  <p>Case 2: Now how about "1.0"? Is it an int or a real? It is both, although the int only gets a partial match: "1". That is why you are getting a (partial) match for your <em>int</em> rule (full match = false). </p>
</blockquote>
<p> Instead of using <tt>longest_d</tt> to parse ints and reals, what I suggest is to remove the ambiguity and use the plain short-circuiting alternatives. The first step is to use <tt><a href="numerics.html#strict_reals">strict_real_p</a> </tt>to make the first case unambiguous. Unlike
  <tt>real_p</tt>, <tt>strict_real_p</tt> requires a dot to be present for a number to be considered a successful match.
Your grammar can be written unambiguously as:</p>
<pre>    strict_real_p<span class="special"> | </span>int_p</pre>
<p> Note that because the ambiguity is resolved, attaching actions to both branches is safe. Only one will be triggered:</p>
<pre>    strict_real_p<span class="special">[</span>R<span class="special">] | </span>int_p<span class="special">[</span>I<span class="special">]</span></pre>
<blockquote>
  <p> "1.0" ---&gt; triggers R<br>
"2" ---&gt; triggers I</p>
</blockquote>
<p> Again, as a rule of thumb, it is always best to resolve as much ambiguity as possible. The best grammars are those that involve no backtracking at all: LL(1) grammars. Backtracking and semantic actions do not mix well.</p>
<p><b><a name="output_operator" id="output_operator"></a>BOOST_SPIRIT_DEBUG and missing <tt>operator&lt;&lt;</tt></b></p>
<p><font color="#FF0000">Question:</font> My code compiles fine in release mode, but when I try to define <tt>BOOST_SPIRIT_DEBUG</tt> the compiler complains about a missing <tt><span class="keyword">operator</span><span class="special">&lt;&lt;</span></tt>.</p>
<p>When <tt>BOOST_SPIRIT_DEBUG</tt> is defined, debug output is generated for
  spirit parsers. To this end it is expected that each closure member has the
  default output operator defined.</p>
<p>You may provide the operator overload either in the namespace where the
  class is declared (where it will be found through argument-dependent lookup) or make it visible where it is
  used, that is <tt><span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">spirit</span></tt>.
Here's an example for <tt><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span></tt>:</p>
<pre><code>
    <span class="preprocessor">#include</span> <span class="string">&lt;iosfwd&gt;</span>
    <span class="preprocessor">#include</span> <span class="string">&lt;utility&gt;</span>

    <span class="keyword">namespace</span> <span class="identifier">std</span> <span class="special">{</span>

      <span class="keyword">template</span> <span class="special">&lt;</span>
          <span class="keyword">typename</span> <span class="identifier">C</span><span class="special">,</span>
          <span class="keyword">typename</span> <span class="identifier">E</span><span class="special">,</span>
          <span class="keyword">typename</span> <span class="identifier">T1</span><span class="special">,</span>
          <span class="keyword">typename</span> <span class="identifier">T2</span>
      <span class="special">&gt;</span>
      <span class="identifier">basic_ostream</span><span class="special">&lt;</span><span class="identifier">C</span><span class="special">,</span> <span class="identifier">E</span><span class="special">&gt;</span> <span class="special">&amp;</span> <span class="keyword">operator</span><span class="special">&lt;&lt;(</span>
          <span class="identifier">basic_ostream</span><span class="special">&lt;</span><span class="identifier">C</span><span class="special">,</span> <span class="identifier">E</span><span class="special">&gt;</span> <span class="special">&amp;</span> <span class="identifier">out</span><span class="special">,</span>
          <span class="identifier">pair</span><span class="special">&lt;</span><span class="identifier">T1</span><span class="special">,</span> <span class="identifier">T2</span><span class="special">&gt;</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">what</span><span class="special">)</span>
      <span class="special">{</span>
        <span class="keyword">return</span> <span
class="identifier">out</span> <span class="special">&lt;&lt;</span> <span class="string">'('</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">first</span> <span class="special">&lt;&lt;</span> <span class="string">", "</span>
            <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">second</span> <span class="special">&lt;&lt;</span> <span class="string">')'</span><span class="special">;</span>
      <span class="special">}</span>

    <span class="special">}</span>

</code></pre>
<p><b><a name="repository" id="repository"></a>Applications that used to be part of spirit</b></p>
<p><font color="#FF0000">Question:</font> Where can I find <i>&lt;insert great application&gt;</i>, which used to be part of the Spirit distribution?</p>
<p>Old versions of Spirit used to include applications built with it.
  In order to streamline the distribution, they were moved to a separate
  <a href="http://spirit.sourceforge.net/repository/applications/show_contents.php">applications repository</a>.
  On that page you'll find links to full applications that use the Spirit
  parser framework. We encourage you to send in your own applications for
  inclusion (see the page for instructions).</p>
  <p>You may also check out the <a href="http://spirit.sourceforge.net/repository/grammars/show_contents.php">grammars' repository</a>.</p>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box">
        <img src="theme/note.gif" width="16" height="16"> You'll still find the
        example applications that complement (actually are part of) the
        documentation in the usual place: <code>libs/spirit/example</code>.<br>
        <br>
        <img src="theme/alert.gif" width="16" height="16"> The applications and
        grammars listed in the repositories are the work of their respective authors.
        It is the author's responsibility to provide support and maintenance.
        Should you have any questions, please send the author an email.
    </td>
  </tr>
</table>
<br>
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="techniques.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="30"><a href="rationale.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<br>
<hr size="1">
<p class="copyright">Copyright © 1998-2003 Joel de Guzman<br>
<span class="copyright">Copyright © 2002-2003 Hartmut Kaiser </span><br>
<span class="copyright">Copyright © 2006-2007 Tobias Schwinger </span><br>
  <br>
  <font size="2">Use, modification and distribution is subject to the Boost Software
  License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)</font></p>
<p class="copyright"> </p>
</body>
</html>