[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Employee - Parsing into structs]

It's a common question on the __spirit_list__: How do I parse and place
the results into a C++ struct? Of course, at this point, you already
know various ways to do it using semantic actions. There are many ways
to skin a cat. Spirit2, being fully attributed, makes it even easier.
The next example demonstrates some features of Spirit2 that make this
easy. In the process, you'll learn about:

* More about attributes
* Auto rules
* Some more built-in parsers
* Directives

[import ../../example/qi/employee.cpp]

First, let's create a struct representing an employee:

[tutorial_employee_struct]

Then, we need to tell __fusion__ about our employee struct to make it a first-class
fusion citizen that the grammar can utilize. If you don't know fusion yet,
it is a __boost__ library for working with heterogeneous collections of data,
commonly referred to as tuples. Spirit uses fusion extensively as part of its
infrastructure.

In fusion's view, a struct is just a form of a tuple. You can adapt any struct
to be a fully conforming fusion tuple:

[tutorial_employee_adapt_struct]

Now we'll write a parser for our employee. Inputs will be of the form:

    employee{ age, "surname", "forename", salary }

Here goes:

[tutorial_employee_parser]

The full cpp file for this example can be found here: [@../../example/qi/employee.cpp]

Let's walk through this one step at a time (not necessarily from top to bottom).

    template <typename Iterator>
    struct employee_parser : grammar<Iterator, employee(), space_type>

`employee_parser` is a grammar. Like before, we make it a template so that we can
reuse it for different iterator types. The grammar's signature is:

    employee()

meaning that the parser generates employee structs. `employee_parser` skips
whitespace using `space_type` as its skip parser.

    employee_parser() : employee_parser::base_type(start)

Initializes the base class.

    rule<Iterator, std::string(), space_type> quoted_string;
    rule<Iterator, employee(), space_type> start;

Declares two rules: `quoted_string` and `start`. `start` has the same template
parameters as the grammar itself. `quoted_string` has a `std::string` attribute.

[heading Lexeme]

    lexeme['"' >> +(char_ - '"') >> '"'];

`lexeme` inhibits space skipping from the opening double quote to the closing
double quote. The expression parses quoted strings.

    +(char_ - '"')

parses one or more chars, except the double quote. It stops when it sees
a double quote.

[heading Difference]

The expression:

    a - b

parses `a` but not `b`. Its attribute is just `A`, the attribute of `a`. `b`'s
attribute is ignored. Hence, the attribute of:

    char_ - '"'

is just `char`.

[heading Plus]

    +a

is similar to the Kleene star. Rather than matching zero or more, `+a` matches
one or more. Like its relative, the Kleene star, its attribute is a
`std::vector<A>` where `A` is the attribute of `a`. So, putting all these
together, the attribute of

    +(char_ - '"')

is then:

    std::vector<char>

[heading Sequence Attribute]

Now what's the attribute of

    '"' >> +(char_ - '"') >> '"'

?

Well, typically, the attribute of:

    a >> b >> c

is:

    fusion::vector<A, B, C>

where `A` is the attribute of `a`, `B` is the attribute of `b` and `C` is the
attribute of `c`. What is `fusion::vector`? - a tuple.

[note If you don't know what I am talking about, see: [@http://tinyurl.com/6xun4j
Fusion Vector]. It might be a good idea to have a look into __fusion__ at this
point. You'll definitely see more of it in the coming pages.]

[heading Attribute Collapsing]

Some parsers, especially the tiny literal parsers you see, like `'"'`,
do not have attributes.

Nodes without attributes are disregarded. In a sequence, like above, all nodes
with no attributes are filtered out of the `fusion::vector`. So, since `'"'` has
no attribute, and `+(char_ - '"')` has a `std::vector<char>` attribute, the
whole expression's attribute should have been:

    fusion::vector<std::vector<char> >

But wait, there's one more collapsing rule: If the attribute is a
single-element `fusion::vector`, the element is stripped naked from its container.
To make a long story short, the attribute of the expression:

    '"' >> +(char_ - '"') >> '"'

is:

    std::vector<char>

[heading Auto Rules]

It is typical to see rules like:

    r = p[_val = _1];

If you have a rule definition such as the above, where the attribute of the RHS
(right hand side) of the rule is compatible with the attribute of the LHS (left
hand side), then you can rewrite it as:

    r %= p;

`p` then works directly on the attribute of `r`, with no semantic action needed.

So, going back to our `quoted_string` rule:

    quoted_string %= lexeme['"' >> +(char_ - '"') >> '"'];

is a simplified version of:

    quoted_string = lexeme['"' >> +(char_ - '"') >> '"'][_val = _1];

The attribute of the `quoted_string` rule: `std::string` *is compatible* with
the attribute of the RHS: `std::vector<char>`. The RHS extracts the parsed
attribute directly into the rule's attribute, in-situ.

[note `r %= p` and `r = p` are equivalent if there are no semantic actions
 associated with `p`. ]


[heading Finally]

We're down to one rule, the start rule:

    start %=
        lit("employee")
        >> '{'
        >>  int_ >> ','
        >>  quoted_string >> ','
        >>  quoted_string >> ','
        >>  double_
        >>  '}'
        ;

Applying our collapsing rules above, the RHS has an attribute of:

    fusion::vector<int, std::string, std::string, double>

These nodes do not have an attribute:

* `lit("employee")`
* `'{'`
* `','`
* `'}'`

[note In case you are wondering, `lit("employee")` is the same as "employee". We
had to wrap it inside `lit` because immediately after it is `>> '{'`. You can't
right-shift a `char[]` and a `char` - you know, C++ syntax rules.]

Recall that the attribute of `start` is the `employee` struct:

[tutorial_employee_struct]

Now everything is clear, right? The `struct employee` *IS* compatible with
`fusion::vector<int, std::string, std::string, double>`. So, the RHS of `start`
uses start's attribute (a `struct employee`) in-situ when it does its work.

[endsect]