[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]
[section:char Character Parsers]

This module includes parsers for single characters. Currently, this
module includes literal chars (e.g. `'x'`, `L'x'`), `char_` (single
characters, ranges and character sets) and the encoding-specific
character classifiers (`alnum`, `alpha`, `digit`, `xdigit`, etc.).

[heading Module Header]

    // forwards to <boost/spirit/home/qi/char.hpp>
    #include <boost/spirit/include/qi_char.hpp>

Also, see __include_structure__.

[/------------------------------------------------------------------------------]
[section:char Character Parser (`char_`, `lit`)]

[heading Description]

The `char_` parser matches single characters. The `char_` parser has an
associated __char_encoding_namespace__. This is needed when doing basic
operations such as inhibiting case sensitivity and dealing with
character ranges.

There are various forms of `char_`.

[heading char_]

The no-argument form of `char_` matches any character in the associated
__char_encoding_namespace__.

    char_               // matches any character

[heading char_(ch)]

The single argument form of `char_` (with a character argument) matches
the supplied character.

    char_('x')          // matches 'x'
    char_(L'x')         // matches L'x'
    char_(x)            // matches x (a char)

[heading char_(first, last)]

`char_` with two arguments matches a range of characters.

    char_('a','z')      // lowercase letters a to z
    char_(L'0',L'9')    // digits

A range of characters is created from a low-high character pair.
Such a
parser matches a single character that is in the range, including both
endpoints. Note that the first character must come /before/ the second,
according to the underlying __char_encoding_namespace__.

Character mapping is inherently platform dependent. The standard does
not guarantee, for example, that `'A' < 'Z'`. That is why Spirit2
purposely attaches a specific __char_encoding_namespace__ (such as ASCII
or ISO-8859-1) to the `char_` parser to eliminate such ambiguities.

[note *Sparse bit vectors*

To accommodate 16-, 32- and 64-bit characters, the char-set statically
switches from a `std::bitset` implementation when the character type is
not greater than 8 bits, to a sparse bit/boolean set which uses a sorted
vector of disjoint ranges (`range_run`). The set is constructed from
ranges such that adjacent or overlapping ranges are coalesced.

`range_runs` are very space-economical in situations where there are lots
of ranges and a few individual disjoint values. Searching is O(log n)
where n is the number of ranges.]

[heading char_(def)]

Lastly, when given a string (a plain C string, a `std::basic_string`,
etc.), the string is regarded as a char-set definition string following
a syntax that resembles POSIX style regular expression character sets
(except that double quotes delimit the set elements instead of square
brackets and there is no special negation ^ character). Examples:

    char_("a-zA-Z")     // alphabetic characters
    char_("0-9a-fA-F")  // hexadecimal characters
    char_("actgACTG")   // DNA identifiers
    char_("\x7f\x7e")   // Hexadecimal 0x7F and 0x7E

[heading lit(ch)]

`lit`, when passed a single character, behaves like the single argument
`char_` except that `lit` does not synthesize an attribute. A plain
`char` or `wchar_t` is equivalent to a `lit`.

[note `lit` is reused by both the [qi_lit_string string parsers] and the
char parsers.
In general, a char parser is created when you pass in a
character and a string parser is created when you pass in a string. The
exception is when you pass a single-element literal string, e.g.
`lit("x")`. In this case, the library optimizes the result into a char
parser instead of a string parser.]

Examples:

    'x'
    lit('x')
    lit(L'x')
    lit(c)      // c is a char

[heading Header]

    // forwards to <boost/spirit/home/qi/char/char.hpp>
    #include <boost/spirit/include/qi_char_.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`boost::spirit::lit // alias: boost::spirit::qi::lit` ]]
    [[`ns::char_`]]
]

In the table above, `ns` represents a __char_encoding_namespace__.

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`c`, `f`, `l`]    [A literal char, e.g. `'x'`, `L'x'` or anything that can be
                         converted to a `char` or `wchar_t`, or a __qi_lazy_argument__
                         that evaluates to anything that can be converted to a `char`
                         or `wchar_t`.]]
    [[`ns`]             [A __char_encoding_namespace__.]]
    [[`cs`]             [A __string__ or a __qi_lazy_argument__ that evaluates to a __string__
                         that specifies a char-set definition string following a syntax
                         that resembles POSIX style regular expression character sets
                         (except the square brackets and the negation `^` character).]]
    [[`cp`]             [A char parser, a char range parser or a char set parser.]]
]

[heading Expression Semantics]

Semantics of an expression is defined only where it differs from, or is
not defined in __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`c`]              [Create a char parser from a char, `c`.]]
    [[`lit(c)`]         [Create a char parser from a char, `c`.]]
    [[`ns::char_`]      [Create a char parser that matches any character in the
                         `ns` encoding.]]
    [[`ns::char_(c)`]   [Create a char parser with `ns` encoding from a char, `c`.]]
    [[`ns::char_(f, l)`][Create a char-range parser that matches characters in the
                         range `f` to `l` (inclusive) with `ns` encoding.]]
    [[`ns::char_(cs)`]  [Create a char-set parser with `ns` encoding from a char-set
                         definition string, `cs`.]]
    [[`~cp`]            [Negate `cp`. The result is a negated char parser that
                         matches any character in the `ns` encoding except the
                         characters matched by `cp`.]]
]

[heading Attributes]

[table
    [[Expression]       [Attribute]]
    [[`c`]              [__unused__ or, if `c` is a __qi_lazy_argument__, the character
                         type returned by invoking it.]]
    [[`lit(c)`]         [__unused__ or, if `c` is a __qi_lazy_argument__, the character
                         type returned by invoking it.]]
    [[`ns::char_`]      [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(c)`]   [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(f, l)`][The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(cs)`]  [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`~cp`]            [The attribute of `cp`.]]
]

[heading Complexity]

[:*O(N)*, except for char-sets with 16-bit (or more) characters (e.g.
`wchar_t`). These have *O(log N)* complexity, where N is the number of
distinct character ranges in the set.]

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]
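
The sparse char-set mentioned in the description, and the *O(log N)*
lookup stated under Complexity, can be pictured with a small stand-alone
sketch. It is not Spirit's actual `range_run` code: the class name
`range_set` and its members `add`, `test` and `size` are hypothetical,
chosen only for illustration. The sketch assumes a sorted vector of
disjoint, inclusive ranges in which adjacent or overlapping ranges are
coalesced on insertion and membership is tested by binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is NOT Spirit's range_run class.
template <typename Char>
class range_set
{
public:
    struct range { Char first, last; };     // inclusive on both ends

    // Insert [first, last], merging it with every stored range that
    // overlaps it or is directly adjacent to it.
    void add(Char first, Char last)
    {
        assert(first <= last);
        long long const lo = first, hi = last;  // widened to avoid overflow

        // First stored range that overlaps or touches the new one on the
        // left, i.e. the first range r with r.last + 1 >= lo.
        auto begin = std::lower_bound(runs_.begin(), runs_.end(), lo,
            [](range const& r, long long v) { return r.last + 1LL < v; });

        // Absorb every following range that starts at or before hi + 1.
        auto end = begin;
        while (end != runs_.end() && end->first - 1LL <= hi)
        {
            first = std::min(first, end->first);
            last = std::max(last, end->last);
            ++end;
        }

        // Replace the absorbed ranges with the single merged range.
        begin = runs_.erase(begin, end);
        runs_.insert(begin, range{first, last});
    }

    // Binary search over the ranges: O(log n), n = number of ranges.
    bool test(Char ch) const
    {
        // First stored range whose upper bound is not below ch.
        auto it = std::lower_bound(runs_.begin(), runs_.end(), ch,
            [](range const& r, Char c) { return r.last < c; });
        return it != runs_.end() && it->first <= ch;
    }

    // Number of disjoint ranges currently stored.
    std::size_t size() const { return runs_.size(); }

private:
    std::vector<range> runs_;               // sorted, pairwise disjoint
};
```

For instance, adding `L'a'` to `L'f'` and then `L'g'` to `L'z'` to a
`range_set<wchar_t>` leaves a single stored range, because the two ranges
are adjacent and are therefore coalesced.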

Some using declarations:

[reference_using_declarations_lit_char]

Basic literals:

[reference_char_literals]

Range:

[reference_char_range]

Character set:

[reference_char_set]

Lazy `char_` using __phoenix__:

[reference_char_phoenix]

[endsect] [/ Char]

[/------------------------------------------------------------------------------]
[section:char_class Character Classification Parsers (`alnum`, `digit`, etc.)]

[heading Description]

The library has the full repertoire of single character parsers for
character classification. This includes the usual `alnum`, `alpha`,
`digit`, `xdigit`, etc. parsers. These parsers have an associated
__char_encoding_namespace__. This is needed when doing basic operations
such as inhibiting case sensitivity.

[heading Header]

    // forwards to <boost/spirit/home/qi/char/char_class.hpp>
    #include <boost/spirit/include/qi_char_class.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`ns::alnum`]]
    [[`ns::alpha`]]
    [[`ns::blank`]]
    [[`ns::cntrl`]]
    [[`ns::digit`]]
    [[`ns::graph`]]
    [[`ns::lower`]]
    [[`ns::print`]]
    [[`ns::punct`]]
    [[`ns::space`]]
    [[`ns::upper`]]
    [[`ns::xdigit`]]
]

In the table above, `ns` represents a __char_encoding_namespace__.

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`ns`]             [A __char_encoding_namespace__.]]
]

[heading Expression Semantics]

Semantics of an expression is defined only where it differs from, or is
not defined in __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`ns::alnum`]      [Matches alpha-numeric characters]]
    [[`ns::alpha`]      [Matches alphabetic characters]]
    [[`ns::blank`]      [Matches spaces or tabs]]
    [[`ns::cntrl`]      [Matches control characters]]
    [[`ns::digit`]      [Matches numeric digits]]
    [[`ns::graph`]      [Matches non-space printing characters]]
    [[`ns::lower`]      [Matches lower case letters]]
    [[`ns::print`]      [Matches printable characters]]
    [[`ns::punct`]      [Matches punctuation symbols]]
    [[`ns::space`]      [Matches spaces, tabs, returns, and newlines]]
    [[`ns::upper`]      [Matches upper case letters]]
    [[`ns::xdigit`]     [Matches hexadecimal digits]]
]

[heading Attributes]

[:The character type of the __char_encoding_namespace__, `ns`.]

[heading Complexity]

[:O(N)]

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_char_class]

Basic usage:

[reference_char_class]

[endsect] [/ Char Classification]

[endsect]