• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[/==============================================================================
2    Copyright (C) 2001-2011 Joel de Guzman
3    Copyright (C) 2001-2011 Hartmut Kaiser
4
5    Distributed under the Boost Software License, Version 1.0. (See accompanying
6    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
7===============================================================================/]
8
9[section:string String Parsers]
10
11This module includes parsers for strings. Currently, this module
12includes the literal and string parsers and the symbol table.
13
14[heading Module Header]
15
16    // forwards to <boost/spirit/home/qi/string.hpp>
17    #include <boost/spirit/include/qi_string.hpp>
18
19Also, see __include_structure__.
20
21[/------------------------------------------------------------------------------]
22[section:string String Parsers (`string`, `lit`)]
23
24[heading Description]
25
26The `string` parser matches a string of characters. The `string` parser
27is an implicit lexeme: the `skip` parser is not applied in between
28characters of the string. The `string` parser has an associated
29__char_encoding_namespace__. This is needed when doing basic operations
30such as inhibiting case sensitivity. Examples:
31
32    string("Hello")
33    string(L"Hello")
34    string(s) // s is a std::string
35
36`lit`, like `string`, also matches a string of characters. The main
37difference is that `lit` does not synthesize an attribute. A plain
38string like `"hello"` or a `std::basic_string` is equivalent to a `lit`.
39Examples:
40
41    "Hello"
42    lit("Hello")
43    lit(L"Hello")
44    lit(s) // s is a std::string
45
46[heading Header]
47
48    // forwards to <boost/spirit/home/qi/string/lit.hpp>
49    #include <boost/spirit/include/qi_lit.hpp>
50
51[heading Namespace]
52
53[table
54    [[Name]]
55    [[`boost::spirit::lit // alias: boost::spirit::qi::lit`]]
56    [[`ns::string`]]
57]
58
59In the table above, `ns` represents a __char_encoding_namespace__.
60
61[heading Model of]
62
63[:__primitive_parser_concept__]
64
65[variablelist Notation
66    [[`s`]      [A __string__ or a __qi_lazy_argument__ that evaluates to a __string__.]]
67    [[`ns`]     [A __char_encoding_namespace__.]]]
68
69[heading Expression Semantics]
70
71Semantics of an expression is defined only where it differs from, or is
72not defined in __primitive_parser_concept__.
73
74[table
75    [[Expression]       [Semantics]]
76    [[`s`]              [Create string parser
77                        from a string, `s`.]]
78    [[`lit(s)`]         [Create a string parser
79                        from a string, `s`.]]
80    [[`ns::string(s)`]  [Create a string parser with `ns` encoding
81                        from a string, `s`.]]
82]
83
84[heading Attributes]
85
86[table
87    [[Expression]       [Attribute]]
88    [[`s`]              [__unused__]]
89    [[`lit(s)`]         [__unused__]]
90    [[`ns::string(s)`]  [`std::basic_string<T>` where `T`
91                        is the underlying character type
92                        of `s`.]]
93]
94
95[heading Complexity]
96
97[:O(N)]
98
99where `N` is the number of characters in the string to be parsed.
100
101[heading Example]
102
103[note The test harness for the example(s) below is presented in the
104__qi_basics_examples__ section.]
105
106Some using declarations:
107
108[reference_using_declarations_lit_string]
109
110Basic literals:
111
112[reference_string_literals]
113
114From a `std::string`
115
116[reference_string_std_string]
117
118Lazy strings using __phoenix__
119
120[reference_string_phoenix]
121
122[endsect] [/ lit/string]
123
124
125[/------------------------------------------------------------------------------]
126[section:symbols Symbols Parser (`symbols`)]
127
128[heading Description]
129
130The class `symbols` implements a symbol table: an associative container
131(or map) of key-value pairs where the keys are strings. The `symbols`
132class can work efficiently with 8, 16, 32 and even 64 bit characters.
133
134Traditionally, symbol table management is maintained separately outside
135the grammar through semantic actions. Contrary to standard practice, the
136Spirit symbol table class `symbols` is-a parser, an instance of which may
137be used anywhere in the grammar specification. It is an example of a
138dynamic parser. A dynamic parser is characterized by its ability to
139modify its behavior at run time. Initially, an empty symbols object
140matches nothing. At any time, symbols may be added, thus, dynamically
141altering its behavior.
142
143[heading Header]
144
145    // forwards to <boost/spirit/home/qi/string/symbols.hpp>
146    #include <boost/spirit/include/qi_symbols.hpp>
147
148Also, see __include_structure__.
149
150[heading Namespace]
151
152[table
153    [[Name]]
154    [[`boost::spirit::qi::symbols`]]
155    [[`boost::spirit::qi::tst`]]
156    [[`boost::spirit::qi::tst_map`]]
157]
158
159[heading Synopsis]
160
161    template <typename Char, typename T, typename Lookup>
162    struct symbols;
163
164[heading Template parameters]
165
166[table
167    [[Parameter]        [Description]               [Default]]
168    [[`Char`]           [The character type
169                        of the symbol strings.]     [`char`]]
170    [[`T`]              [The data type associated
171                        with each symbol.]          [__unused_type__]]
172    [[`Lookup`]         [The symbol search
173                        implementation]             [`tst<Char, T>`]]
174]
175
176[heading Model of]
177
178[:__primitive_parser_concept__]
179
180[variablelist Notation
181    [[`Sym`]        [A `symbols` type.]]
182    [[`Char`]       [A character type.]]
183    [[`T`]          [A data type.]]
184    [[`sym`, `sym2`][`symbols` objects.]]
185    [[`sseq`]       [An __stl__ container of strings.]]
186    [[`dseq`]       [An __stl__ container of data with `value_type` `T`.]]
187    [[`s1`...`sN`]  [A __string__.]]
188    [[`d1`...`dN`]  [Objects of type `T`.]]
189    [[`f`]          [A callable function or function object.]]
190    [[`f`, `l`]     [`ForwardIterator` first/last pair.]]
191]
192
193[heading Expression Semantics]
194
195Semantics of an expression is defined only where it differs from, or is not
196defined in __primitive_parser_concept__.
197
198[table
199    [[Expression]                   [Semantics]]
200    [[`Sym()`]                      [Construct an empty symbols names `"symbols"`.]]
201    [[`Sym(name)`]                  [Construct an empty symbols named `name`.]]
202    [[`Sym(sym2)`]                  [Copy construct a symbols from `sym2` (Another `symbols` object).]]
203    [[`Sym(sseq)`]                  [Construct symbols from `sseq` (an __stl__ container of strings) named `"symbols"`.]]
204    [[`Sym(sseq, name)`]            [Construct symbols from `sseq` (an __stl__ container of strings) named `name`.]]
205    [[`Sym(sseq, dseq)`]            [Construct symbols from `sseq` and `dseq`
206                                    (An __stl__ container of strings and an __stl__ container of
207                                    data with `value_type` `T`) which is named `"symbols"`.]]
208    [[`Sym(sseq, dseq, name)`]      [Construct symbols from `sseq` and `dseq`
209                                    (An __stl__ container of strings and an __stl__ container of
210                                    data with `value_type` `T`) which is named `name`.]]
211    [[`sym = sym2`]                 [Assign `sym2` to `sym`.]]
212    [[`sym = s1, s2, ..., sN`]      [Assign one or more symbols (`s1`...`sN`) to `sym`.]]
213    [[`sym += s1, s2, ..., sN`]     [Add one or more symbols (`s1`...`sN`) to `sym`.]]
214    [[`sym.add(s1)(s2)...(sN)`]     [Add one or more symbols (`s1`...`sN`) to `sym`.]]
215    [[`sym.add(s1, d1)(s2, d2)...(sN, dN)`]
216                                    [Add one or more symbols (`s1`...`sN`)
217                                    with associated data (`d1`...`dN`) to `sym`.]]
218    [[`sym -= s1, s2, ..., sN`]     [Remove one or more symbols (`s1`...`sN`) from `sym`.]]
219    [[`sym.remove(s1)(s2)...(sN)`]  [Remove one or more symbols (`s1`...`sN`) from `sym`.]]
220    [[`sym.clear()`]                [Erase all of the symbols in `sym`.]]
221    [[`sym.at(s)`]                  [Return a reference to the object associated
222                                    with symbol, `s`. If `sym` does not already
223                                    contain such an object, `at` inserts the default
224                                    object `T()`.]]
225    [[`sym.find(s)`]                [Return a pointer to the object associated
226                                    with symbol, `s`. If `sym` does not already
227                                    contain such an object, `find` returns a null
228                                    pointer.]]
229    [[`sym.prefix_find(f, l)`]      [Return a pointer to the object associated
230                                    with longest symbol that matches the beginning
231                                    of the range `[f, l)`, and updates `f` to point
232                                    to one past the end of that match. If no symbol matches,
233                                    then return a null pointer, and `f` is unchanged.]]
234    [[`sym.for_each(f)`]            [For each symbol in `sym`, `s`, a
235                                    `std::basic_string<Char>` with associated data,
236                                    `d`, an object of type `T`, invoke `f(s, d)`]]
237    [[`sym.name()`]                 [Retrieve the current name of the symbols object.]]
238    [[`sym.name(name)`]             [Set the current name of the symbols object to be `name`.]]
239]
240
241[heading Attributes]
242
243The attribute of `symbol<Char, T>` is `T`.
244
245[heading Complexity]
246
247The default implementation uses a Ternary Search Tree (TST) with
248complexity:
249
250[:O(log n+k)]
251
252Where k is the length of the string to be searched in a TST with n
253strings.
254
255TSTs are faster than hashing for many typical search problems especially
256when the search interface is iterator based. TSTs are many times faster
257than hash tables for unsuccessful searches since mismatches are
258discovered earlier after examining only a few characters. Hash tables
259always examine an entire key when searching.
260
261An alternative implementation uses a hybrid hash-map front end (for the
262first character) plus a TST: `tst_map`. This gives us a complexity of
263
264[:O(1 + log n+k-1)]
265
266This is found to be significantly faster than plain TST, albeit with a
267bit more memory usage requirements (each slot in the hash-map is a TST
268node). If you require a lot of symbols to be searched, use the `tst_map`
269implementation. This can be done by using `tst_map` as the third
270template parameter to the symbols class:
271
272    symbols<Char, T, tst_map<Char, T> > sym;
273
274[heading Example]
275
276[note The test harness for the example(s) below is presented in the
277__qi_basics_examples__ section.]
278
279Some using declarations:
280
281[reference_using_declarations_symbols]
282
283Symbols with data:
284
285[reference_symbols_with_data]
286
287When `symbols` is used for case-insensitive parsing (in a __qi_no_case__
288directive), added symbol strings should be in lowercase. Symbol strings
289containing one or more uppercase characters will not match any input
290when symbols is used in a `no_case` directive.
291
292[reference_symbols_with_no_case]
293
294
295[endsect] [/ symbols]
296
297[endsect] [/ String]
298