<html>
<head>
<title>In-depth The Scanner</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>

<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  <tr>
    <td width="10">
    </td>
    <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>In-depth:
      The Scanner</b></font> </td>
    <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  </tr>
</table>
<br>
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser_context.html"><img src="theme/r_arr.gif" border="0"></a></td>
   </tr>
</table>
<h2>Basic Scanner API </h2>
<table width="90%" border="0" align="center">
  <tr>
    <td class="table_title" colspan="10"> class scanner </td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>value_t</span></code></td>
    <td class="table_cells">typedef: The value type of the scanner's iterator</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>ref_t</span></code></td>
    <td class="table_cells">typedef: The reference type of the scanner's iterator</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=keyword>bool </span><span class=identifier>at_end</span><span class=special>()
    </span><span class=keyword>const</span></code></td>
    <td class="table_cells">Returns true if the input is exhausted</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>value_t </span><span class=keyword>operator</span><span class=special>*()
    </span><span class=keyword>const</span></code></td>
    <td class="table_cells">Dereference/get a <code><span class=identifier>value_t</span></code>
      from the input</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=keyword> </span><span class=identifier>scanner
    </span><span class=keyword>const</span><span class=special>&amp; </span><span class=keyword>operator</span><span class=special>++()</span></code></td>
    <td class="table_cells">Move the scanner forward</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>IteratorT&amp; first</span><span class=special></span></code></td>
    <td class="table_cells">The iterator pointing to the current input position.
      Held by reference</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>IteratorT </span><span class=keyword>const</span>
      <span class=identifier>last</span><span class=special></span></code></td>
    <td class="table_cells">The iterator pointing to the end of the input. Held
      by value</td>
  </tr>
</table>
<p> The basic behavior of the scanner is handled by policies: the public member
  functions listed in the table above delegate their actual work to the scanner
  policies.</p>
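<p> As a rough illustration of the interface above, the following stand-alone
  sketch models a scanner whose public members forward to an iteration policy.
  The names <tt>simple_iteration_policy</tt>, <tt>mini_scanner</tt>,
  <tt>count_remaining</tt> and <tt>count_chars</tt> are hypothetical and not
  part of Spirit; filtering, matching and semantic actions are deliberately
  left out.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical stand-in for an iteration policy (not Spirit's own class).
struct simple_iteration_policy
{
    template <typename ScannerT>
    void advance(ScannerT const& scan) const { ++scan.first; }

    template <typename ScannerT>
    bool at_end(ScannerT const& scan) const { return scan.first == scan.last; }

    template <typename ScannerT>
    typename ScannerT::ref_t get(ScannerT const& scan) const { return *scan.first; }
};

// Hypothetical miniature scanner: each public member forwards to the policy.
template <typename IteratorT, typename PolicyT = simple_iteration_policy>
class mini_scanner : public PolicyT
{
public:
    typedef typename std::iterator_traits<IteratorT>::value_type value_t;
    typedef typename std::iterator_traits<IteratorT>::reference  ref_t;

    mini_scanner(IteratorT& first_, IteratorT const& last_)
        : first(first_), last(last_) {}

    bool at_end() const { return PolicyT::at_end(*this); }
    value_t operator*() const { return PolicyT::get(*this); }
    mini_scanner const& operator++() const { PolicyT::advance(*this); return *this; }

    IteratorT&      first;  // current position, held by reference
    IteratorT const last;   // end of the input, held by value
};

// Walks the scanner to the end of the input and counts the steps taken.
template <typename ScannerT>
std::size_t count_remaining(ScannerT const& scan)
{
    std::size_t n = 0;
    for (; !scan.at_end(); ++scan)
        ++n;
    return n;
}

// Counts the characters of a string by scanning it from start to end.
inline std::size_t count_chars(std::string const& text)
{
    std::string::const_iterator first = text.begin();
    mini_scanner<std::string::const_iterator> scan(first, text.end());
    std::size_t const n = count_remaining(scan);
    assert(first == text.end());  // the caller's iterator moved with the scanner
    return n;
}
```

<p> Because <tt>first</tt> is held by reference, advancing the scanner also
  advances the caller's iterator, which is why a <tt>const</tt> scanner can still
  move through the input.</p>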
<p> Three sets of policies govern the behavior of the scanner. These policies
  make it possible to extend Spirit non-intrusively: core functionality can be
  extended without potentially destabilizing changes to the code. A library
  writer might provide her own policies that override the ones already in place
  to fine-tune the parsing process to fit her own needs. Layers above the core
  might also want to take advantage of this policy-based mechanism. Abstract
  syntax tree generation, debuggers and lexers come to mind.</p>
<p> The three sets of policies govern:</p>
<ul>
  <li>Iteration and filtering</li>
  <li>Recognition and matching</li>
  <li>Handling semantic actions</li>
</ul>
<a name="iteration_policy"></a>
<h2>iteration_policy</h2>
<p> Here are the default policies that govern iteration and filtering:</p>
<pre>
    <code><span class=keyword>struct </span><span class=identifier>iteration_policy
    </span><span class=special>{
        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
        </span><span class=keyword>void
        </span><span class=identifier>advance</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{ </span><span class=special>++</span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>first</span><span class=special>; </span><span class=special>}

        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
        </span><span class=keyword>bool </span><span class=identifier>at_end</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>first </span><span class=special>== </span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>last</span><span class=special>; </span><span class=special>}

        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>&gt;
        </span><span class=identifier>T </span><span class=identifier>filter</span><span class=special>(</span><span class=identifier>T </span><span class=identifier>ch</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>ch</span><span class=special>; </span><span class=special>}

        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
        </span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>::</span><span class=identifier>ref_t
        </span><span class=identifier>get</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{ </span><span class=keyword>return </span><span class=special>*</span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>first</span><span class=special>; </span><span class=special>}
    </span><span class=special>};</span></code></pre>
<table width="90%" border="0" align="center">
  <tr>
    <td class="table_title" colspan="8"> Iteration and filtering policies </td>
  </tr>
  <tr>
    <td class="table_cells"><b>advance</b></td>
    <td class="table_cells">Move the iterator forward</td>
  </tr>
  <tr>
    <td class="table_cells"><b>at_end</b></td>
    <td class="table_cells">Return true if the input is exhausted</td>
  </tr>
  <tr>
    <td class="table_cells"><b>filter</b></td>
    <td class="table_cells">Filter a character read from the input</td>
  </tr>
  <tr>
    <td class="table_cells"><b>get</b></td>
    <td class="table_cells">Read a character from the input</td>
  </tr>
</table>
<p> The following code snippet demonstrates a simple policy that converts all
  characters to lower case:</p>
<pre>
    <code><span class=keyword>struct </span><span class=identifier>inhibit_case_iteration_policy </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>iteration_policy
    </span><span class=special>{
        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>CharT</span><span class=special>&gt;
        </span><span class=identifier>CharT filter</span><span class=special>(</span><span class=identifier>CharT ch</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{
            </span><span class=keyword>return </span>std::<span class=identifier>tolower</span><span class=special>(</span><span class=identifier>ch</span><span class=special>);
        }
    };</span></code></pre>
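<p> To see how hiding <tt>filter</tt> in a derived policy changes what a client
  reads, here is a small stand-alone sketch. It does not use Spirit: the names
  <tt>pass_through_policy</tt>, <tt>lower_case_policy</tt>,
  <tt>read_filtered</tt> and <tt>filter_example</tt> are hypothetical, and the
  sketch assumes narrow <tt>char</tt> input only.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical base policy: filter() passes characters through unchanged.
struct pass_through_policy
{
    char filter(char ch) const { return ch; }
};

// Derived policy that hides filter() with a lower-casing version.
// The cast to unsigned char keeps std::tolower well defined for narrow chars.
struct lower_case_policy : pass_through_policy
{
    char filter(char ch) const
    {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
};

// Hypothetical helper: reads the whole input through PolicyT::filter.
template <typename PolicyT>
std::string read_filtered(std::string const& input)
{
    PolicyT policy = PolicyT();
    std::string out;
    for (std::string::const_iterator it = input.begin(); it != input.end(); ++it)
        out += policy.filter(*it);
    return out;
}

// Usage: the same input read through two different policies.
inline void filter_example()
{
    assert(read_filtered<pass_through_policy>("BeGiN") == "BeGiN");
    assert(read_filtered<lower_case_policy>("BeGiN") == "begin");
}
```

<p> The policy is chosen at compile time, so no virtual functions are involved;
  the derived <tt>filter</tt> simply hides the one in the base.</p>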
<a name="match_policy"></a>
<h2>match_policy</h2>
<p> Here are the default policies that govern recognition and matching:</p>
<pre>
    <code><span class=keyword>struct </span><span class=identifier>match_policy
    </span><span class=special>{
        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>&gt;
        </span><span class=keyword>struct </span><span class=identifier>result </span><span class=special>
        {
            </span><span class=keyword>typedef </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>T</span><span class=special>&gt; </span><span class=identifier>type</span><span class=special>; </span><span class=special>
        };

        </span><span class=keyword>const </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;
        </span><span class=identifier>no_match</span><span class=special>() </span><span class=keyword>const
        </span><span class=special>{ </span><span class=keyword>
            return </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;(); </span><span class=special>
        }

        </span><span class=keyword>const </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;
        </span><span class=identifier>empty_match</span><span class=special>() </span><span class=keyword>const
        </span><span class=special>{ </span><span class=keyword>
            return </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;(</span><span class=number>0</span><span class=special>, </span><span class=identifier>nil_t</span><span class=special>());
        </span><span class=special>}

        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>AttrT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;
        </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>AttrT</span><span class=special>&gt;
        </span><span class=identifier>create_match</span><span class=special>(
            </span><span class=keyword>std::size_t         </span><span class=identifier>length</span><span class=special>,
            </span><span class=identifier>AttrT </span><span class=keyword>const</span><span class=special>&amp;        </span><span class=identifier>val</span><span class=special>,
            </span><span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=comment>/*first*/</span><span class=special>,
            </span><span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=comment>/*last*/</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{ </span><span class=keyword>
            return </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>AttrT</span><span class=special>&gt;(</span><span class=identifier>length</span><span class=special>, </span><span class=identifier>val</span><span class=special>); </span><span class=special>
        }

        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>MatchT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;
        </span><span class=keyword>void
        </span><span class=identifier>group_match</span><span class=special>(
            </span><span class=identifier>MatchT</span><span class=special>&amp;             </span><span class=comment>/*m*/</span><span class=special>,
            </span><span class=identifier>parser_id </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=comment>/*id*/</span><span class=special>,
            </span><span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=comment>/*first*/</span><span class=special>,
            </span><span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=comment>/*last*/</span><span class=special>) </span><span class=keyword>const </span><span class=special>{}

        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>Match1T</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>Match2T</span><span class=special>&gt;
        </span><span class=keyword>void
        </span><span class=identifier>concat_match</span><span class=special>(</span><span class=identifier>Match1T</span><span class=special>&amp; </span><span class=identifier>l</span><span class=special>, </span><span class=identifier>Match2T </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>r</span><span class=special>) </span><span class=keyword>const
        </span><span class=special>{ </span><span class=identifier>
            l</span><span class=special>.</span><span class=identifier>concat</span><span class=special>(</span><span class=identifier>r</span><span class=special>);
        </span><span class=special>}
    </span><span class=special>};</span></code></pre>
<table width="90%" border="0" align="center">
  <tr>
    <td class="table_title" colspan="12"> Recognition and matching </td>
  </tr>
  <tr>
    <td class="table_cells"><b>result</b></td>
    <td class="table_cells">A metafunction that returns a match type given an
      attribute type (see In-depth: The Parser)</td>
  </tr>
  <tr>
    <td class="table_cells"><b>no_match</b></td>
    <td class="table_cells">Create a failed match</td>
  </tr>
  <tr>
    <td class="table_cells"><b>empty_match</b></td>
    <td class="table_cells">Create an empty match. An empty match is a successful
      epsilon match (matching length == 0)</td>
  </tr>
  <tr>
    <td class="table_cells"><b>create_match</b></td>
    <td class="table_cells">Create a match given the matching length, an attribute
      and the iterator pair pointing to the matching portion of the input</td>
  </tr>
  <tr>
    <td class="table_cells"><b>group_match</b></td>
    <td class="table_cells">For non-terminals such as rules, this is called after
      a successful match has been made to allow post-processing</td>
  </tr>
  <tr>
    <td class="table_cells"><b>concat_match</b></td>
    <td class="table_cells">Concatenate two match objects</td>
  </tr>
</table>
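<p> The difference between a failed match, an empty match and a concatenated
  match can be sketched with a deliberately reduced model. Everything below is
  hypothetical (<tt>mini_match</tt>, <tt>mini_match_policy</tt>,
  <tt>match_sequence</tt>): attributes and iterators are omitted, and encoding
  failure as a length of -1 is an assumption of this sketch.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reduced match object: only the matching length is tracked.
class mini_match
{
public:
    mini_match() : len_(-1) {}                        // failed match
    explicit mini_match(std::size_t length)           // successful match
        : len_(static_cast<std::ptrdiff_t>(length)) {}

    bool matched() const { return len_ >= 0; }
    std::ptrdiff_t length() const { return len_; }

    // Extends this match by the length of another successful match.
    void concat(mini_match const& other)
    {
        assert(matched() && other.matched());
        len_ += other.len_;
    }

private:
    std::ptrdiff_t len_;
};

// Hypothetical policy mirroring the roles listed in the table above.
struct mini_match_policy
{
    mini_match no_match() const { return mini_match(); }
    mini_match empty_match() const { return mini_match(0); }
    mini_match create_match(std::size_t length) const { return mini_match(length); }
    void concat_match(mini_match& l, mini_match const& r) const { l.concat(r); }
};

// How a sequence of two sub-parsers might combine its results via the policy.
inline mini_match match_sequence(
    mini_match_policy const& policy,
    mini_match const&        left,
    mini_match const&        right)
{
    if (!left.matched() || !right.matched())
        return policy.no_match();
    mini_match result = left;
    policy.concat_match(result, right);
    return result;
}
```

<p> Note that an empty match is still a success: concatenating it with another
  match leaves that match's length unchanged, whereas a failed match makes the
  whole sequence fail.</p>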
<a name="action_policy"></a>
<h2>action_policy</h2>
<p> The action policy has only one function for handling semantic actions:</p>
<pre>
    <code><span class=keyword>struct </span><span class=identifier>action_policy
    </span><span class=special>{
        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ActorT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>AttrT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;
        </span><span class=keyword>void
        </span><span class=identifier>do_action</span><span class=special>(
            </span><span class=identifier>ActorT </span><span class=keyword>const</span><span class=special>&amp;       </span><span class=identifier>actor</span><span class=special>,
            </span><span class=identifier>AttrT </span><span class=keyword>const</span><span class=special>&amp;        </span><span class=identifier>val</span><span class=special>,
            </span><span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=identifier>first</span><span class=special>,
            </span><span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp;    </span><span class=identifier>last</span><span class=special>) </span><span class=keyword>const</span><span class=special>;
    </span><span class=special>};</span></code></pre>
<p> If the attribute <tt>val</tt> is of type <tt>nil_t</tt>, the default action
  policy forwards to:</p>
<pre>
    <code><span class=identifier>actor</span><span class=special>(</span><span class=identifier>first</span><span class=special>, </span><span class=identifier>last</span><span class=special>);</span></code></pre>
<p> Otherwise, it forwards to:</p>
<pre>
    <code><span class=identifier>actor</span><span class=special>(</span><span class=identifier>val</span><span class=special>);</span></code></pre>
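<p> One way to obtain this two-way dispatch is overloading on the attribute
  type. The sketch below is not Spirit's implementation; <tt>nil_tag</tt>,
  <tt>mini_action_policy</tt>, <tt>record_value</tt>, <tt>record_range</tt> and
  <tt>action_example</tt> are hypothetical names chosen for illustration.</p>

```cpp
#include <cassert>
#include <string>

struct nil_tag {};  // hypothetical stand-in for a "no attribute" type

struct mini_action_policy
{
    // General case: the actor receives the attribute value.
    template <typename ActorT, typename AttrT, typename IteratorT>
    void do_action(ActorT const& actor, AttrT const& val,
                   IteratorT const& /*first*/, IteratorT const& /*last*/) const
    {
        actor(val);
    }

    // No attribute: the actor receives the matched iterator range instead.
    // This overload is more specialized, so it wins when val is a nil_tag.
    template <typename ActorT, typename IteratorT>
    void do_action(ActorT const& actor, nil_tag const& /*val*/,
                   IteratorT const& first, IteratorT const& last) const
    {
        actor(first, last);
    }
};

// Actor that stores an attribute value.
struct record_value
{
    explicit record_value(int& out) : out_(out) {}
    void operator()(int v) const { out_ = v; }
    int& out_;
};

// Actor that stores the matched text.
struct record_range
{
    explicit record_range(std::string& out) : out_(out) {}
    template <typename IteratorT>
    void operator()(IteratorT first, IteratorT last) const { out_.assign(first, last); }
    std::string& out_;
};

// Usage: the same policy calls each actor with the arguments it expects.
inline void action_example()
{
    std::string const text = "42";
    int value = 0;
    std::string range;
    mini_action_policy policy;
    policy.do_action(record_value(value), 42, text.begin(), text.end());
    policy.do_action(record_range(range), nil_tag(), text.begin(), text.end());
    assert(value == 42);
    assert(range == "42");
}
```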
<a name="scanner_policies_mixer"></a>
<h3>scanner_policies mixer</h3>
<p> The class <tt>scanner_policies</tt> combines the three scanner policy classes
  above into one:</p>
<pre>
    <code><span class=keyword>template </span><span class=special>&lt;
        </span><span class=keyword>typename </span><span class=identifier>IterationPolicyT   </span><span class=special>= </span><span class=identifier>iteration_policy</span><span class=special>,
        </span><span class=keyword>typename </span><span class=identifier>MatchPolicyT       </span><span class=special>= </span><span class=identifier>match_policy</span><span class=special>,
        </span><span class=keyword>typename </span><span class=identifier>ActionPolicyT      </span><span class=special>= </span><span class=identifier>action_policy</span><span class=special>&gt;
    </span><span class=keyword>struct </span><span class=identifier>scanner_policies</span><span class=special>;
</span></code></pre>
<p> This <i>mixer</i> class inherits from all three policies. The <tt>scanner_policies</tt>
  class is then used to parameterize the scanner:</p>
<pre>
    <code><span class=keyword>template </span><span class=special>&lt;
        </span><span class=keyword>typename </span><span class=identifier>IteratorT </span><span class=special>= </span><span class=keyword>char </span><span class=keyword>const</span><span class=special>*,
        </span><span class=keyword>typename </span><span class=identifier>PoliciesT </span><span class=special>= </span><span class=identifier>scanner_policies</span><span class=special>&lt;&gt; </span><span class=special>&gt;
    </span><span class=keyword>class </span><span class=identifier>scanner</span><span class=special>;
</span></code></pre>
<p> The scanner in turn inherits from <tt>PoliciesT</tt>.</p>
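<p> The inheritance chain described here (three policies, one mixer, one host
  class) can be modelled in a few lines. This is only a sketch with hypothetical
  names (<tt>plain_iteration</tt>, <tt>plain_match</tt>, <tt>run_actions</tt>,
  <tt>skip_actions</tt>, <tt>mini_policies</tt>, <tt>mini_host</tt>); the toy
  member functions exist only to show which policy ends up in the host.</p>

```cpp
#include <cassert>

// Three toy policies, one per concern.
struct plain_iteration { char filter(char ch) const { return ch; } };
struct plain_match     { int empty_length() const { return 0; } };
struct run_actions     { bool actions_enabled() const { return true; } };

// Alternative action policy that can be swapped in without touching the rest.
struct skip_actions    { bool actions_enabled() const { return false; } };

// The mixer inherits from all three policies.
template <
    typename IterationPolicyT = plain_iteration,
    typename MatchPolicyT     = plain_match,
    typename ActionPolicyT    = run_actions>
struct mini_policies : IterationPolicyT, MatchPolicyT, ActionPolicyT
{
    typedef IterationPolicyT iteration_policy_t;
    typedef MatchPolicyT     match_policy_t;
    typedef ActionPolicyT    action_policy_t;
};

// The host (a stand-in for the scanner) inherits from the mixer.
template <typename PoliciesT = mini_policies<> >
struct mini_host : PoliciesT {};

// Usage: replacing one policy leaves the other two at their defaults.
inline void mixer_example()
{
    mini_host<> defaults;
    assert(defaults.actions_enabled());

    mini_host<mini_policies<plain_iteration, plain_match, skip_actions> > quiet;
    assert(!quiet.actions_enabled());
    assert(quiet.filter('x') == 'x');
    assert(quiet.empty_length() == 0);
}
```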
<a name="rebinding_policies"></a>
<h3>Rebinding Policies</h3>
<p> The scanner can be rebound to a different set of policies at any time
  through its member function <tt>change_policies(new_policies)</tt>. Given a new
  set of policies, this member function creates a new scanner that uses them.
  The result type of the <i>rebound</i> scanner can be obtained from the
  metafunction:</p>
<pre>
    <code><span class=identifier>rebind_scanner_policies</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class=identifier>PoliciesT</span><span class=special>&gt;::</span><span class=identifier>type</span></code></pre>
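<p> The relationship between a <tt>change_policies</tt> member and its companion
  metafunction can be sketched as follows. This is a simplified stand-alone
  model, not Spirit's code: <tt>policy_scanner</tt>,
  <tt>identity_case_policy</tt>, <tt>upper_case_policy</tt>,
  <tt>mini_rebind_policies</tt> and <tt>rebind_policies_example</tt> are
  hypothetical, and the policy set is reduced to a single <tt>filter</tt>
  function.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

struct identity_case_policy
{
    char filter(char ch) const { return ch; }
};

struct upper_case_policy
{
    char filter(char ch) const
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
};

// Hypothetical scanner that can hand out a copy of itself with new policies.
template <typename IteratorT, typename PoliciesT = identity_case_policy>
class policy_scanner : public PoliciesT
{
public:
    policy_scanner(IteratorT& first_, IteratorT const& last_,
                   PoliciesT const& policies = PoliciesT())
        : PoliciesT(policies), first(first_), last(last_) {}

    // Same iterators, different policies.
    template <typename NewPoliciesT>
    policy_scanner<IteratorT, NewPoliciesT>
    change_policies(NewPoliciesT const& policies) const
    {
        return policy_scanner<IteratorT, NewPoliciesT>(first, last, policies);
    }

    char operator*() const { return this->filter(*first); }

    IteratorT&      first;
    IteratorT const last;
};

// Metafunction computing the type that change_policies returns.
template <typename ScannerT, typename PoliciesT>
struct mini_rebind_policies;

template <typename IteratorT, typename OldPoliciesT, typename PoliciesT>
struct mini_rebind_policies<policy_scanner<IteratorT, OldPoliciesT>, PoliciesT>
{
    typedef policy_scanner<IteratorT, PoliciesT> type;
};

// Usage: both scanners share one input position but filter differently.
inline void rebind_policies_example()
{
    typedef std::string::const_iterator iter_t;
    std::string const text = "ab";
    iter_t it = text.begin();
    policy_scanner<iter_t> scan(it, text.end());

    typedef mini_rebind_policies<policy_scanner<iter_t>, upper_case_policy>::type
        upper_scanner;
    upper_scanner upper = scan.change_policies(upper_case_policy());

    assert(*scan == 'a');
    assert(*upper == 'A');
    ++upper.first;         // advancing one scanner advances the shared iterator
    assert(*scan == 'b');
}
```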
<a name="rebinding_iterators"></a>
<h3>Rebinding Iterators</h3>
<p> The scanner can also be rebound to a different iterator type at any time
  through its member function <tt>change_iterator(first, last)</tt>. Given a new
  pair of iterators whose type differs from the ones held by the scanner, this
  member function creates a new scanner with the new pair of iterators. The
  result type of the <i>rebound</i> scanner can be obtained from the metafunction:</p>
<pre>
    <code><span class=identifier>rebind_scanner_iterator</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class=identifier>IteratorT</span><span class=special>&gt;::</span><span class=identifier>type</span></code></pre>
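<p> A reduced model of iterator rebinding might look like this. It is a sketch
  under simplifying assumptions: policies are left out, the new first iterator
  is taken by non-const reference, and <tt>iter_scanner</tt>,
  <tt>mini_rebind_iterator</tt> and <tt>rebind_iterator_example</tt> are
  hypothetical names rather than Spirit's.</p>

```cpp
#include <cassert>
#include <list>

// Hypothetical scanner that can be rebound to another iterator type.
template <typename IteratorT>
class iter_scanner
{
public:
    iter_scanner(IteratorT& first_, IteratorT const& last_)
        : first(first_), last(last_) {}

    // Creates a scanner over a different iterator pair (and iterator type).
    template <typename NewIteratorT>
    iter_scanner<NewIteratorT>
    change_iterator(NewIteratorT& new_first, NewIteratorT const& new_last) const
    {
        return iter_scanner<NewIteratorT>(new_first, new_last);
    }

    bool at_end() const { return first == last; }

    IteratorT&      first;
    IteratorT const last;
};

// Metafunction computing the type that change_iterator returns.
template <typename ScannerT, typename IteratorT>
struct mini_rebind_iterator;

template <typename OldIteratorT, typename IteratorT>
struct mini_rebind_iterator<iter_scanner<OldIteratorT>, IteratorT>
{
    typedef iter_scanner<IteratorT> type;
};

// Usage: switch from scanning a char array to scanning a std::list<char>.
inline void rebind_iterator_example()
{
    char const* raw_first = "xyz";
    char const* raw_last  = raw_first + 3;
    std::list<char> buffered(raw_first, raw_last);
    iter_scanner<char const*> scan(raw_first, raw_last);

    typedef std::list<char>::const_iterator list_iter;
    list_iter list_first = buffered.begin();
    list_iter list_last  = buffered.end();

    typedef mini_rebind_iterator<iter_scanner<char const*>, list_iter>::type
        list_scanner;
    list_scanner rebound = scan.change_iterator(list_first, list_last);

    assert(!rebound.at_end());
    ++rebound.first; ++rebound.first; ++rebound.first;
    assert(rebound.at_end());
    assert(list_first == list_last);  // the new first is held by reference
    assert(!scan.at_end());           // the original scanner is unaffected
}
```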
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser_context.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<br>
<hr size="1">
<p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
  <br>
  <font size="2">Use, modification and distribution is subject to the Boost Software
    License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
    http://www.boost.org/LICENSE_1_0.txt)</font></p>
<p class="copyright">&nbsp;</p>
</body>
</html>