[/==============================================================================
    Copyright (C) 2001-2015 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:employee Employee - Parsing into structs]

It's a common question in the __spirit_list__: How do I parse and place
the results into a C++ struct? Of course, at this point, you already
know various ways to do it, using semantic actions. There are many ways
to skin a cat. Spirit X3, being fully attributed, makes it even easier.
The next example demonstrates some features of Spirit X3 that make this
easy. In the process, you'll learn about:

* More about attributes
* Auto rules
* Some more built-in parsers
* Directives

First, let's create a struct representing an employee:

    namespace client { namespace ast
    {
        struct employee
        {
            int age;
            std::string forename;
            std::string surname;
            double salary;
        };
    }}

Then, we need to tell __fusion__ about our employee struct to make it a first-class
fusion citizen that the grammar can utilize. If you don't know fusion yet,
it is a __boost__ library for working with heterogeneous collections of data,
commonly referred to as tuples. Spirit uses fusion extensively as part of its
infrastructure.

In fusion's view, a struct is just a form of a tuple. You can adapt any struct
to be a fully conforming fusion tuple:

    BOOST_FUSION_ADAPT_STRUCT(
        client::ast::employee,
        age, forename, surname, salary
    )

Now we'll write a parser for our employee.
Inputs will be of the form:

    employee{ age, "forename", "surname", salary }

[#__tutorial_employee_parser__]
Here goes:

    namespace parser
    {
        namespace x3 = boost::spirit::x3;
        namespace ascii = boost::spirit::x3::ascii;

        using x3::int_;
        using x3::lit;
        using x3::double_;
        using x3::lexeme;
        using ascii::char_;

        x3::rule<class employee, ast::employee> const employee = "employee";

        auto const quoted_string = lexeme['"' >> +(char_ - '"') >> '"'];

        auto const employee_def =
            lit("employee")
            >> '{'
            >>  int_ >> ','
            >>  quoted_string >> ','
            >>  quoted_string >> ','
            >>  double_
            >>  '}'
            ;

        BOOST_SPIRIT_DEFINE(employee);
    }

The full cpp file for this example can be found here:
[@../../../example/x3/employee.cpp employee.cpp]

Let's walk through this one step at a time (not necessarily from top to bottom).

[heading Rule Declaration]

We are assuming that you already know about rules. We introduced rules in the
previous [tutorial_roman Roman Numerals example]. Please go back and review
the previous tutorial if you have to.

    x3::rule<class employee, ast::employee> const employee = "employee";

[heading Lexeme]

    lexeme['"' >> +(char_ - '"') >> '"'];

`lexeme` inhibits space skipping inside its square brackets, so whitespace
between the opening and closing double quotes is kept as part of the string.
The expression parses quoted strings.

    +(char_ - '"')

parses one or more chars, except the double quote. It stops when it sees
a double quote.

[heading Difference]

The expression:

    a - b

parses `a` but not `b`: it matches wherever `a` matches and `b` does not.
Its attribute is just `A`, the attribute of `a`. `b`'s attribute is ignored.
Hence, the attribute of:

    char_ - '"'

is just `char`.

[heading Plus]

    +a

is similar to the Kleene star. Rather than matching zero or more, `+a` matches
one or more. Like its relative, the Kleene star, its attribute is a `std::vector<A>`
where `A` is the attribute of `a`.
So, putting all these together, the attribute
of

    +(char_ - '"')

is then:

    std::vector<char>

[heading Sequence Attribute]

Now what's the attribute of

    '"' >> +(char_ - '"') >> '"'

?

Well, typically, the attribute of:

    a >> b >> c

is:

    fusion::vector<A, B, C>

where `A` is the attribute of `a`, `B` is the attribute of `b` and `C` is the
attribute of `c`. What is `fusion::vector`? A tuple.

[note If you don't know what I am talking about, see: [@http://tinyurl.com/6xun4j
Fusion Vector]. It might be a good idea to have a look into __fusion__ at this
point. You'll definitely see more of it in the coming pages.]

[heading Attribute Collapsing]

Some parsers, especially those very small literal parsers you see, like `'"'`,
do not have attributes.

Nodes without attributes are disregarded. In a sequence, like above, all nodes
with no attributes are filtered out of the `fusion::vector`. So, since `'"'` has
no attribute, and `+(char_ - '"')` has a `std::vector<char>` attribute, the
whole expression's attribute should have been:

    fusion::vector<std::vector<char> >

But wait, there's one more collapsing rule: if the attribute is a
single-element `fusion::vector`, the element is stripped from its container.
To make a long story short, the attribute of the expression:

    '"' >> +(char_ - '"') >> '"'

is:

    std::vector<char>

[heading Rule Definition]

Again, we are assuming that you already know about rules and rule
definitions. We introduced rules in the previous [tutorial_roman Roman
Numerals example]. Please go back and review the previous tutorial if you
have to.

    auto const employee_def =
        lit("employee")
        >> '{'
        >>  int_ >> ','
        >>  quoted_string >> ','
        >>  quoted_string >> ','
        >>  double_
        >>  '}'
        ;

    BOOST_SPIRIT_DEFINE(employee);

Applying our collapsing rules above, the RHS has an attribute of:

    fusion::vector<int, std::string, std::string, double>

(The `std::vector<char>` attribute of `quoted_string` is compatible with
`std::string`, so it is shown as such here.)

These nodes do not have an attribute:

* `lit("employee")`
* `'{'`
* `','`
* `'}'`

[note In case you are wondering, `lit("employee")` is the same as `"employee"`. We
had to wrap it inside `lit` because immediately after it is `>> '{'`. You can't
right-shift a `char[]` and a `char` - you know, C++ syntax rules.]

Recall that the attribute of `parser::employee` is the `ast::employee` struct.

Now everything is clear, right? The `struct employee` *IS* compatible with
`fusion::vector<int, std::string, std::string, double>`. So, the RHS of `employee`
uses `employee`'s attribute (a `struct employee`) in-situ when it does its work.

[endsect]