1[/============================================================================== 2 Copyright (C) 2015 Joel de Guzman 3 4 Distributed under the Boost Software License, Version 1.0. (See accompanying 5 file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) 6===============================================================================/] 7 8[section:rexpr RExpressions - Recursive ASTs!] 9 10In this example, we'll explore more on how to create heierarchical ASTs. 11We will parse a minimalistic JSON-like language and compile the results 12into our data structures in the form of a tree. 13 14/rexpr/ is a parser for RExpressions, a language resembling a minimal subset 15of json, limited to a dictionary (composed of key=value pairs) where the value 16can itself be a string or a recursive dictionary. 17 18Here's an Example: 19 20 { 21 "color" = "blue" 22 "size" = "29 cm." 23 "position" = { 24 "x" = "123" 25 "y" = "456" 26 } 27 } 28 29This simple parser for X3 is intended as a minimal starting point. It is 30minimal, yet complete with all things necessary parts that make up rules (and 31grammars), except error handling and reporting. 32 33The example can be found here: 34[@../../../example/x3/rexpr/rexpr_min/rexpr.cpp rexpr.cpp] 35 36The same parser, but complete with error handling and reporting, better file 37organization, separate compilation of grammars, and a full test harness, can 38be found in this directory: 39 40[@../../../example/x3/rexpr/rexpr_full/ rexpr/rexpr_full/] 41 42[note rexpr_full is the canonical structure proposed as best practice on how 43parsers are written using Spirit X3.] 44 45[heading The AST] 46 47Here's the AST: 48 49 namespace client { namespace ast 50 { 51 namespace x3 = boost::spirit::x3; 52 53 struct rexpr; 54 55 struct rexpr_value : x3::variant< 56 std::string 57 , x3::forward_ast<rexpr> 58 > 59 { 60 using base_type::base_type; 61 using base_type::operator=; 62 }; 63 64 typedef std::map<std::string, rexpr_value> rexpr_map; 65 typedef std::pair<std::string, rexpr_value> rexpr_key_value; 66 67 struct rexpr 68 { 69 rexpr_map entries; 70 }; 71 }} 72 73`x3::variant` is a support utility in Spirit X3 that extends __boost_variant__. 74Typically, you use __boost_variant__ right out of the box and refer to a 75particular template instantiation using a typedef. For example: 76 77 typedef boost::variant<std::string, int> my_variant; 78 79Instead of doing that, we create a `class` (or `struct` in the case) that 80subclasses from `x3::variant`. By making the variant a subclass, you have a 81distinct type in your namespace. You also have control of the constructors 82and assignment operators, as well as giving you the freedom to add more 83functionality. 84 85`rexpr_value` has two variant elements. It can either be a `std::string` or a 86`rexpr`. Since `rexpr` recursively contains a `rexpr_value`, it has to be 87forward declared. This recursive data structure requires `rexpr` to be wrapped 88in a `x3::forward_ast` as shown in the declaration. 89 90We need to tell fusion about our `rexpr` struct to make it a first-class fusion 91citizen: 92 93 BOOST_FUSION_ADAPT_STRUCT( 94 client::ast::rexpr, 95 entries 96 ) 97 98So essentially, a `rexpr_value` value is either a `std::string` or a 99`std::map<std::string, rexpr_value>`. 100 101[heading Walking the AST] 102 103We add a utility to print out the AST: 104 105 int const tabsize = 4; 106 107 struct rexpr_printer 108 { 109 typedef void result_type; 110 111 rexpr_printer(int indent = 0) 112 : indent(indent) {} 113 114 void operator()(rexpr const& ast) const 115 { 116 std::cout << '{' << std::endl; 117 for (auto const& entry : ast.entries) 118 { 119 tab(indent+tabsize); 120 std::cout << '"' << entry.first << "\" = "; 121 boost::apply_visitor(rexpr_printer(indent+tabsize), entry.second); 122 } 123 tab(indent); 124 std::cout << '}' << std::endl; 125 } 126 127 void operator()(std::string const& text) const 128 { 129 std::cout << '"' << text << '"' << std::endl; 130 } 131 132 void tab(int spaces) const 133 { 134 for (int i = 0; i < spaces; ++i) 135 std::cout << ' '; 136 } 137 138 int indent; 139 }; 140 141Traversing the AST is a recursive exercise. `rexpr_printer` is a function 142object and the main entry point is `void operator()(rexpr const& ast)`. Notice 143how it recursively calls an instance of itself for each of the key value pairs, 144using `boost::apply_visitor`. Before and after iterating through all the 145elements in the map, the braces are printed to surround the entire body, taking 146care of correct indentation at each point in the recursive traversal. 147 148The `operator()` for `std::string` should be self explanatory. It simply prints 149the text inside double quotes. 150 151[heading The Grammar] 152 153 namespace client { namespace parser 154 { 155 namespace x3 = boost::spirit::x3; 156 namespace ascii = boost::spirit::x3::ascii; 157 158 using x3::lit; 159 using x3::lexeme; 160 161 using ascii::char_; 162 using ascii::string; 163 164 x3::rule<class rexpr_value, ast::rexpr_value> 165 rexpr_value = "rexpr_value"; 166 167 x3::rule<class rexpr, ast::rexpr> 168 rexpr = "rexpr"; 169 170 x3::rule<class rexpr_key_value, ast::rexpr_key_value> 171 rexpr_key_value = "rexpr_key_value"; 172 173 auto const quoted_string = 174 lexeme['"' >> *(char_ - '"') >> '"']; 175 176 auto const rexpr_value_def = 177 quoted_string | rexpr; 178 179 auto const rexpr_key_value_def = 180 quoted_string >> '=' >> rexpr_value; 181 182 auto const rexpr_def = 183 '{' >> *rexpr_key_value >> '}'; 184 185 BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value); 186 }} 187 188We declare three rules `rexpr_value`, `rexpr` and `rexpr_key_value`. Same deal 189as before. The attributes declared for each rule are the same structures we 190declared in our AST above. These are the structures our rules will produce. 191 192So, going top-down (from the start rule), `rexpr` is defined as zero or more 193`rexpr_key_value`s enclosed inside the curly braces. That's the kleene star in 194action there: `*rexpr_key_value`. 195 196[note Take note of the convention: we use `rexpr` name as the rule name and 197`rexpr_def` as the rule definition name.] 198 199To make the grammar clean, we capture the `quoted_string` production using 200C++ `auto`. 201 202`rexpr_value` is a `quoted_string` followed by the assignment operator `'='`, 203followed by a `rexpr_value`. `rexpr_value` is a `quoted_string` /OR/ a `rexpr`. 204This uses the alternative operator `|`, and maps nicely to the variant AST the 205rule is associated with. 206 207Finally, we associate the rules and their definitions: 208 209 BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value); 210 211[endsect] 212