1<html> 2<head> 3<title>Operators</title> 4<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> 5<link rel="stylesheet" href="theme/style.css" type="text/css"> 6</head> 7 8<body> 9<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> 10 <tr> 11 <td width="10"> 12 </td> 13 <td width="85%"> 14 <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Operators</b></font> 15 </td> 16 <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> 17 </tr> 18</table> 19<br> 20<table border="0"> 21 <tr> 22 <td width="10"></td> 23 <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> 24 <td width="30"><a href="primitives.html"><img src="theme/l_arr.gif" border="0"></a></td> 25 <td width="30"><a href="numerics.html"><img src="theme/r_arr.gif" border="0"></a></td> 26 </tr> 27</table> 28<p>Operators are used as a means for object composition and embedding. Simple 29 parsers may be composed to form composites through operator overloading, crafted 30 to approximate the syntax of an Extended Backus-Normal Form (EBNF) variant. 31 An expression such as:</p> 32<pre><code><font color="#000000"> <span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span></font></code></pre> 33<p>actually yields a new parser type which is a composite of its operands, a and 34 b. Taking this example further, if a and b were of type <tt>chlit</tt><>, 35 the result would have the composite type:</p> 36<pre><code><font color="#000000"> <span class=identifier>alternative</span><span class=special><</span><span class=identifier>chlit</span><span class=special><>, </span><span class=identifier>chlit</span><span class=special><> </span><span class=special>></span></font></code></pre> 37<p> In general, for any binary operator, it will take its two arguments, parser1 38 and parser2, and create a new composed parser of the form</p> 39<pre><code><font color="#000000"> <span class=identifier>op</span><span class=special><</span><span class=identifier>parser1</span><span class=special>, </span><span class=identifier>parser2</span><span class=special>></span></font></code></pre> 40<p>where parser1 and parser2 can be arbitrarily complex parsers themselves, with 41 the only limitations being what your compiler imposes. </p> 42<h3>Set Operators</h3> 43<table width="90%" border="0" align="center"> 44 <tr> 45 <td class="table_title" colspan="3">Set operators</td> 46 </tr> 47 <tr> 48 <td class="table_cells" width="20%"><code><span class=identifier>a </span><span class=special>| 49 </span><span class=identifier>b</span></code></td> 50 <td class="table_cells" width="24%">Union</td> 51 <td class="table_cells" width="56%">Match a or b. Also referred to as alternative</td> 52 </tr> 53 <tr> 54 <td class="table_cells" width="20%"><code><span class=identifier>a </span><span class=special>& 55 </span><span class=identifier>b</span></code></td> 56 <td class="table_cells" width="24%">Intersection</td> 57 <td class="table_cells" width="56%">Match a and b</td> 58 </tr> 59 <tr> 60 <td class="table_cells" width="20%"><code><span class=identifier>a </span><span class=special>- 61 </span><span class=identifier>b</span></code></td> 62 <td class="table_cells" width="24%">Difference</td> 63 <td class="table_cells" width="56%">Match a but not b. If both match and b's 64 matched text is shorter than a's matched text, a successful match is made</td> 65 </tr> 66 <tr> 67 <td class="table_cells" width="20%"><code><span class=identifier>a </span><span class=special>^ 68 </span><span class=identifier>b</span></code></td> 69 <td class="table_cells" width="24%">XOR</td> 70 <td class="table_cells" width="56%">Match a or b, but not both</td> 71 </tr> 72</table> 73<p><b>Short-circuiting</b></p> 74<p>Alternative operands are tried one by one on a first come first served basis 75 starting from the leftmost operand. After a successfully matched alternative 76 is found, the parser concludes its search, essentially short-circuiting the 77 search for other potentially viable candidates. This short-circuiting implicitly 78 gives the highest priority to the leftmost alternative.</p> 79<p>Short-circuiting is done in the same manner as C or C++'s logical expressions; 80 e.g. <tt>if</tt> <tt><span class="operators">(</span>x <span class="operators"><</span> 81 3 <span class="operators">||</span> y <span class="operators"><</span> 2<span class="operators">)</span></tt> 82 where, if <tt>x</tt> evaluates to be less than 3, the <tt>y <span class="operators"><</span> 83 2</tt> test is not done at all. In addition to providing an implicit priority 84 rule for alternatives which is necessary, given the non-deterministic nature 85 of the Spirit parser compiler, short-circuiting improves the execution time. 86 If the order of your alternatives is logically irrelevant, strive to put the 87 (expected) most common choice first for maximum efficiency.</p> 88<table width="80%" border="0" align="center"> 89 <tr> 90 <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>Intersections</b><br> 91 <br> 92 Some researchers assert that the intersections (e.g. <tt>a & b</tt>) 93 let us define context sensitive languages (<a href="references.html#intersections">"XBNF"</a> 94 [citing Leu-Weiner, 1973]). "The theory of defining a language as the 95 intersection of a finite number of context free languages was developed 96 by Leu and Weiner in 1973".<br> 97 <br> 98 <b><img src="theme/lens.gif" width="15" height="16"> <b></b>~ Operator</b><br> 99 <br> 100 The complement operator <tt>~</tt> was originally put into consideration. 101 Further understanding of its value and meaning leads us to uncertainty. 102 The basic problem stems from the fact that <tt>~a</tt> will yield <tt>U-a</tt>, 103 where <tt>U</tt> is the universal set of all strings. However, where it 104 makes sense, some parsers can be complemented (see the <a href="primitives.html#negation">primitive 105 character parsers</a> for examples).</td> 106 </tr> 107</table> 108<h3>Sequencing Operators</h3> 109<table width="90%" border="0" align="center"> 110 <tr> 111 <td class="table_title" colspan="3">Sequencing operators</td> 112 </tr> 113 <tr> 114 <td class="table_cells" width="21%"><code><span class=identifier>a </span><span class=special>>> 115 </span><span class=identifier>b</span></code></td> 116 <td class="table_cells" width="23%">Sequence</td> 117 <td class="table_cells" width="56%">Match a and b in sequence</td> 118 </tr> 119 <tr> 120 <td class="table_cells" width="21%"><code><span class=identifier>a </span><span class=special>&& 121 </span><span class=identifier>b</span></code></td> 122 <td class="table_cells" width="23%">Sequential-and</td> 123 <td class="table_cells" width="56%">Sequential-and. Same as above, match a 124 and b in sequence</td> 125 </tr> 126 <tr> 127 <td class="table_cells" width="21%"><code><span class=identifier>a </span><span class=special>|| 128 </span><span class=identifier>b</span></code></td> 129 <td class="table_cells" width="23%">Sequential-or</td> 130 <td class="table_cells" width="56%">Match a or b in sequence</td> 131 </tr> 132</table> 133<p>The sequencing operator <tt class="operators">>></tt> can alternatively 134 be thought of as the sequential-and operator. The expression <tt>a <span class="operators">&&</span> 135 b</tt> reads as match a and b in sequence. Continuing this logic, we can also 136 have a sequential-or operator where the expression <tt>a <span class="operators">||</span> 137 b</tt> reads as match a or b and in sequence. That is, if both a and b match, 138 it must be in sequence; this is equivalent to <tt>a >> !b | b</tt>. </p> 139<h3>Optional and Loops</h3> 140<table width="90%" border="0" align="center"> 141 <tr> 142 <td class="table_title" colspan="3">Optional and Loops</td> 143 </tr> 144 <tr> 145 <td class="table_cells" width="21%"><code><span class=special>*</span><span class=identifier>a</span></code></td> 146 <td class="table_cells" width="23%">Kleene star</td> 147 <td class="table_cells" width="56%">Match a zero (0) or more times</td> 148 </tr> 149 <tr> 150 <td class="table_cells" width="21%"><code><span class=special>+</span><span class=identifier>a</span></code></td> 151 <td class="table_cells" width="23%">Positive</td> 152 <td class="table_cells" width="56%">Match a one (1) or more times</td> 153 </tr> 154 <tr> 155 <td class="table_cells" width="21%"><code><span class=special>!</span><span class=identifier>a</span></code></td> 156 <td class="table_cells" width="23%">Optional</td> 157 <td class="table_cells" width="56%">Match a zero (0) or one (1) time</td> 158 </tr> 159 <tr> 160 <td class="table_cells" width="21%"><code><span class=identifier>a </span><span class=special>% 161 </span><span class=identifier>b</span></code></td> 162 <td class="table_cells" width="23%">List</td> 163 <td class="table_cells" width="56%">Match a list of one or more repetitions 164 of a separated by occurrences of b. This is the same as <tt>a >> *(b 165 >> a)</tt>. Note that <tt>a</tt> must not also match <tt>b</tt></td> 166 </tr> 167</table> 168<p><img src="theme/note.gif" width="16" height="16"> If we look more closely, 169 take note that we generalized the optional expression of the form <tt>!a</tt> 170 in the same category as loops. This is logical, considering that the optional 171 matches the expression following it zero (0) or one (1) time. </p> 172<p><b>Primitive type operands</b></p> 173<p> For binary operators, one of the operands but not both may be a <tt>char</tt>, 174 <tt> wchar_t</tt>, <tt>char const<span class="operators">*</span></tt> or <tt>wchar_t 175 const<span class="operators">*</span></tt>. Where P is a parser object, here 176 are some examples:</p> 177<pre><code><span class=identifier> </span><span class=identifier>P </span><span class=special>| </span><span class=literal>'x' 178 </span><span class=identifier>P </span><span class=special>- </span><span class=identifier>L</span><span class=string>"Hello World" 179 </span><span class=literal>'x' </span><span class=special>>> </span><span class=identifier>P 180 </span><span class=string>"bebop" </span><span class=special>>> </span><span class=identifier>P</span></code></pre> 181<p>It is important to emphasize that C++ mandates that operators may only be overloaded 182 if at least one argument is a user-defined type. Typically, in an expression 183 involving multiple operators, explicitly typing the leftmost operand as a parser 184 is enough to cause propagation to all the rest of the operands to its right 185 to be regarded as parsers. Examples:</p> 186<pre><code><font color="#000000"><span class=identifier> </span><span class=identifier>r </span><span class=special>= </span><span class=literal>'a' </span><span class=special>| </span><span class=literal>'b' </span><span class=special>| </span><span class=literal>'c' </span><span class=special>| </span><span class=literal>'d'</span><span class=special>; </span><span class=comment>// ill formed 187 </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>) </span><span class=special>| </span><span class=literal>'b' </span><span class=special>| </span><span class=literal>'c' </span><span class=special>| </span><span class=literal>'d'</span><span class=special>; </span><span class=comment>// OK</span></font></code></pre> 188<p>The second case is parsed as follows:</p> 189<pre><code><font color="#000000"> r <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(((</span><span class=identifier>chlit</span><span class=special><</span><span class=keyword>char</span><span class=special>> </span><span class=special>| </span><span class=keyword>char</span><span class=special>) </span><span class=special>| </span><span class=keyword>char</span><span class=special>) </span><span class=special>| </span><span class=keyword>char</span><span class=special>)</span></font> 190 191 a <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(</span><span class=identifier>chlit</span><span class=special><</span><span class=keyword>char</span><span class=special>> </span><span class=special>| </span><span class=keyword>char</span><span class=special>)</span></font> 192 r <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(((</span><span class=identifier>a</span><span class=special>) </span><span class=special>| </span><span class=keyword>char</span><span class=special>) </span><span class=special>| </span><span class=keyword>char</span><span class=special>)</span></font> 193 194 b <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(</span><span class=identifier>a </span><span class=special>| </span><span class=keyword>char</span><span class=special>)</span></font> 195 r <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(((</span><span class=identifier>b</span><span class=special>)) </span><span class=special>| </span><span class=keyword>char</span><span class=special>)</span></font> 196 197 c <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(</span><span class=identifier>b </span><span class=special>| </span><span class=keyword>char</span><span class=special>)</span></font> 198 r <font color="#0000ff"><img src="theme/arrow.gif"> <span class=special>(((</span><span class=identifier>c</span><span class=special>)))</span></font></font></code></pre> 199<p><b>Operator precedence and grouping</b></p> 200<p>Since we are defining our meta-language in C++, we follow C/C++'s operator 201 precedence rules. Grouping expressions inside the parentheses override this 202 (e.g., <tt><span class="operators">*(</span>a <span class="operators">|</span> 203 b<span class="operators">)</span></tt> reads: match a or b zero (0) or more 204 times). </p> 205<table border="0"> 206 <tr> 207 <td width="10"></td> 208 <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> 209 <td width="30"><a href="primitives.html"><img src="theme/l_arr.gif" border="0"></a></td> 210 <td width="30"><a href="numerics.html"><img src="theme/r_arr.gif" border="0"></a></td> 211 </tr> 212</table> 213<br> 214<hr size="1"> 215<p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> 216 <br> 217 <font size="2">Use, modification and distribution is subject to the Boost Software 218 License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at 219 http://www.boost.org/LICENSE_1_0.txt) </font> </p> 220<p> </p> 221</body> 222</html> 223