[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:string String Parsers]

This module includes parsers for strings. Currently, this module
includes the literal and string parsers and the symbol table.

[heading Module Header]

    // forwards to <boost/spirit/home/qi/string.hpp>
    #include <boost/spirit/include/qi_string.hpp>

Also, see __include_structure__.

[/------------------------------------------------------------------------------]
[section:string String Parsers (`string`, `lit`)]

[heading Description]

The `string` parser matches a string of characters. The `string` parser
is an implicit lexeme: the `skip` parser is not applied in between
characters of the string. The `string` parser has an associated
__char_encoding_namespace__. This is needed when doing basic operations
such as inhibiting case sensitivity. Examples:

    string("Hello")
    string(L"Hello")
    string(s)         // s is a std::string

`lit`, like `string`, also matches a string of characters. The main
difference is that `lit` does not synthesize an attribute. A plain
string like `"hello"` or a `std::basic_string` is equivalent to a `lit`.
Examples:

    "Hello"
    lit("Hello")
    lit(L"Hello")
    lit(s)            // s is a std::string

[heading Header]

    // forwards to <boost/spirit/home/qi/string/lit.hpp>
    #include <boost/spirit/include/qi_lit.hpp>

[heading Namespace]

[table
    [[Name]]
    [[`boost::spirit::lit    // alias: boost::spirit::qi::lit`]]
    [[`ns::string`]]
]

In the table above, `ns` represents a __char_encoding_namespace__.

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`s`]      [A __string__ or a __qi_lazy_argument__ that evaluates to a __string__.]]
    [[`ns`]     [A __char_encoding_namespace__.]]]

[heading Expression Semantics]

The semantics of an expression are defined only where they differ from, or are
not defined in, __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`s`]              [Create a string parser
                        from a string, `s`.]]
    [[`lit(s)`]         [Create a string parser
                        from a string, `s`.]]
    [[`ns::string(s)`]  [Create a string parser with `ns` encoding
                        from a string, `s`.]]
]

[heading Attributes]

[table
    [[Expression]       [Attribute]]
    [[`s`]              [__unused__]]
    [[`lit(s)`]         [__unused__]]
    [[`ns::string(s)`]  [`std::basic_string<T>` where `T`
                        is the underlying character type
                        of `s`.]]
]

[heading Complexity]

[:O(N)]

where `N` is the number of characters in the string to be parsed.

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_lit_string]

Basic literals:

[reference_string_literals]

From a `std::string`:

[reference_string_std_string]

Lazy strings using __phoenix__:

[reference_string_phoenix]

[endsect] [/ lit/string]


[/------------------------------------------------------------------------------]
[section:symbols Symbols Parser (`symbols`)]

[heading Description]

The class `symbols` implements a symbol table: an associative container
(or map) of key-value pairs where the keys are strings. The `symbols`
class can work efficiently with 8, 16, 32 and even 64 bit characters.

Traditionally, symbol table management is maintained separately outside
the grammar through semantic actions.
Contrary to standard practice, the
Spirit symbol table class `symbols` is-a parser, an instance of which may
be used anywhere in the grammar specification. It is an example of a
dynamic parser. A dynamic parser is characterized by its ability to
modify its behavior at run time. Initially, an empty `symbols` object
matches nothing. At any time, symbols may be added, thus dynamically
altering its behavior.

[heading Header]

    // forwards to <boost/spirit/home/qi/string/symbols.hpp>
    #include <boost/spirit/include/qi_symbols.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`boost::spirit::qi::symbols`]]
    [[`boost::spirit::qi::tst`]]
    [[`boost::spirit::qi::tst_map`]]
]

[heading Synopsis]

    template <typename Char, typename T, typename Lookup>
    struct symbols;

[heading Template parameters]

[table
    [[Parameter]        [Description]               [Default]]
    [[`Char`]           [The character type
                        of the symbol strings.]     [`char`]]
    [[`T`]              [The data type associated
                        with each symbol.]          [__unused_type__]]
    [[`Lookup`]         [The symbol search
                        implementation.]            [`tst<Char, T>`]]
]

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`Sym`]        [A `symbols` type.]]
    [[`Char`]       [A character type.]]
    [[`T`]          [A data type.]]
    [[`sym`, `sym2`][`symbols` objects.]]
    [[`sseq`]       [An __stl__ container of strings.]]
    [[`dseq`]       [An __stl__ container of data with `value_type` `T`.]]
    [[`s1`...`sN`]  [A __string__.]]
    [[`d1`...`dN`]  [Objects of type `T`.]]
    [[`f`]          [A callable function or function object.]]
    [[`f`, `l`]     [`ForwardIterator` first/last pair.]]
]

[heading Expression Semantics]

The semantics of an expression are defined only where they differ from, or are
not defined in, __primitive_parser_concept__.

[table
    [[Expression]               [Semantics]]
    [[`Sym()`]                  [Construct an empty symbols named `"symbols"`.]]
    [[`Sym(name)`]              [Construct an empty symbols named `name`.]]
    [[`Sym(sym2)`]              [Copy construct a symbols from `sym2` (another `symbols` object).]]
    [[`Sym(sseq)`]              [Construct symbols from `sseq` (an __stl__ container of strings) named `"symbols"`.]]
    [[`Sym(sseq, name)`]        [Construct symbols from `sseq` (an __stl__ container of strings) named `name`.]]
    [[`Sym(sseq, dseq)`]        [Construct symbols from `sseq` and `dseq`
                                (an __stl__ container of strings and an __stl__ container of
                                data with `value_type` `T`) which is named `"symbols"`.]]
    [[`Sym(sseq, dseq, name)`]  [Construct symbols from `sseq` and `dseq`
                                (an __stl__ container of strings and an __stl__ container of
                                data with `value_type` `T`) which is named `name`.]]
    [[`sym = sym2`]             [Assign `sym2` to `sym`.]]
    [[`sym = s1, s2, ..., sN`]  [Assign one or more symbols (`s1`...`sN`) to `sym`.]]
    [[`sym += s1, s2, ..., sN`] [Add one or more symbols (`s1`...`sN`) to `sym`.]]
    [[`sym.add(s1)(s2)...(sN)`] [Add one or more symbols (`s1`...`sN`) to `sym`.]]
    [[`sym.add(s1, d1)(s2, d2)...(sN, dN)`]
                                [Add one or more symbols (`s1`...`sN`)
                                with associated data (`d1`...`dN`) to `sym`.]]
    [[`sym -= s1, s2, ..., sN`] [Remove one or more symbols (`s1`...`sN`) from `sym`.]]
    [[`sym.remove(s1)(s2)...(sN)`] [Remove one or more symbols (`s1`...`sN`) from `sym`.]]
    [[`sym.clear()`]            [Erase all of the symbols in `sym`.]]
    [[`sym.at(s)`]              [Return a reference to the object associated
                                with symbol, `s`. If `sym` does not already
                                contain such an object, `at` inserts the default
                                object `T()`.]]
    [[`sym.find(s)`]            [Return a pointer to the object associated
                                with symbol, `s`.
                                If `sym` does not already
                                contain such an object, `find` returns a null
                                pointer.]]
    [[`sym.prefix_find(f, l)`]  [Return a pointer to the object associated
                                with the longest symbol that matches the beginning
                                of the range `[f, l)`, and update `f` to point
                                to one past the end of that match. If no symbol matches,
                                return a null pointer and leave `f` unchanged.]]
    [[`sym.for_each(f)`]        [For each symbol in `sym`, `s`, a
                                `std::basic_string<Char>` with associated data,
                                `d`, an object of type `T`, invoke `f(s, d)`.]]
    [[`sym.name()`]             [Retrieve the current name of the symbols object.]]
    [[`sym.name(name)`]         [Set the current name of the symbols object to be `name`.]]
]

[heading Attributes]

The attribute of `symbols<Char, T>` is `T`.

[heading Complexity]

The default implementation uses a Ternary Search Tree (TST) with
complexity:

[:O(log n+k)]

where `k` is the length of the string to be searched in a TST with `n`
strings.

TSTs are faster than hashing for many typical search problems, especially
when the search interface is iterator based. TSTs are many times faster
than hash tables for unsuccessful searches since mismatches are
discovered earlier, after examining only a few characters. Hash tables
always examine an entire key when searching.

An alternative implementation uses a hybrid hash-map front end (for the
first character) plus a TST: `tst_map`. This gives us a complexity of

[:O(1 + log n+k-1)]

This is found to be significantly faster than plain TST, albeit with
somewhat higher memory usage (each slot in the hash-map is a TST
node). If you require a lot of symbols to be searched, use the `tst_map`
implementation.
This can be done by using `tst_map` as the third
template parameter to the symbols class:

    symbols<Char, T, tst_map<Char, T> > sym;

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_symbols]

Symbols with data:

[reference_symbols_with_data]

When `symbols` is used for case-insensitive parsing (in a __qi_no_case__
directive), added symbol strings should be in lowercase. Symbol strings
containing one or more uppercase characters will not match any input
when symbols is used in a `no_case` directive.

[reference_symbols_with_no_case]


[endsect] [/ symbols]

[endsect] [/ String]