• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[/==============================================================================
2    Copyright (C) 2015 Joel de Guzman
3
4    Distributed under the Boost Software License, Version 1.0. (See accompanying
5    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
6===============================================================================/]
7
8[section:rexpr RExpressions - Recursive ASTs!]
9
10In this example, we'll explore more on how to create heierarchical ASTs.
11We will parse a minimalistic JSON-like language and compile the results
12into our data structures in the form of a tree.
13
14/rexpr/ is a parser for RExpressions, a language resembling a minimal subset
15of json, limited to a dictionary (composed of key=value pairs) where the value
16can itself be a string or a recursive dictionary.
17
18Here's an Example:
19
20    {
21        "color" = "blue"
22        "size" = "29 cm."
23        "position" = {
24            "x" = "123"
25            "y" = "456"
26        }
27    }
28
29This simple parser for X3 is intended as a minimal starting point. It is
30minimal, yet complete with all things necessary parts that make up rules (and
31grammars), except error handling and reporting.
32
33The example can be found here:
34[@../../../example/x3/rexpr/rexpr_min/rexpr.cpp rexpr.cpp]
35
36The same parser, but complete with error handling and reporting, better file
37organization, separate compilation of grammars, and a full test harness, can
38be found in this directory:
39
40[@../../../example/x3/rexpr/rexpr_full/ rexpr/rexpr_full/]
41
42[note rexpr_full is the canonical structure proposed as best practice on how
43parsers are written using Spirit X3.]
44
45[heading The AST]
46
47Here's the AST:
48
49    namespace client { namespace ast
50    {
51        namespace x3 = boost::spirit::x3;
52
53        struct rexpr;
54
55        struct rexpr_value : x3::variant<
56                std::string
57              , x3::forward_ast<rexpr>
58            >
59        {
60            using base_type::base_type;
61            using base_type::operator=;
62        };
63
64        typedef std::map<std::string, rexpr_value> rexpr_map;
65        typedef std::pair<std::string, rexpr_value> rexpr_key_value;
66
67        struct rexpr
68        {
69            rexpr_map entries;
70        };
71    }}
72
73`x3::variant` is a support utility in Spirit X3 that extends __boost_variant__.
74Typically, you use __boost_variant__ right out of the box and refer to a
75particular template instantiation using a typedef. For example:
76
77    typedef boost::variant<std::string, int> my_variant;
78
79Instead of doing that, we create a `class` (or `struct` in the case) that
80subclasses from `x3::variant`. By making the variant a subclass, you have a
81distinct type in your namespace. You also have control of the constructors
82and assignment operators, as well as giving you the freedom to add more
83functionality.
84
85`rexpr_value` has two variant elements. It can either be a `std::string` or a
86`rexpr`. Since `rexpr` recursively contains a `rexpr_value`, it has to be
87forward declared. This recursive data structure requires `rexpr` to be wrapped
88in a `x3::forward_ast` as shown in the declaration.
89
90We need to tell fusion about our `rexpr` struct to make it a first-class fusion
91citizen:
92
93    BOOST_FUSION_ADAPT_STRUCT(
94        client::ast::rexpr,
95        entries
96    )
97
98So essentially, a `rexpr_value` value is either a `std::string` or a
99`std::map<std::string, rexpr_value>`.
100
101[heading Walking the AST]
102
103We add a utility to print out the AST:
104
105    int const tabsize = 4;
106
107    struct rexpr_printer
108    {
109        typedef void result_type;
110
111        rexpr_printer(int indent = 0)
112          : indent(indent) {}
113
114        void operator()(rexpr const& ast) const
115        {
116            std::cout << '{' << std::endl;
117            for (auto const& entry : ast.entries)
118            {
119                tab(indent+tabsize);
120                std::cout << '"' << entry.first << "\" = ";
121                boost::apply_visitor(rexpr_printer(indent+tabsize), entry.second);
122            }
123            tab(indent);
124            std::cout << '}' << std::endl;
125        }
126
127        void operator()(std::string const& text) const
128        {
129            std::cout << '"' << text << '"' << std::endl;
130        }
131
132        void tab(int spaces) const
133        {
134            for (int i = 0; i < spaces; ++i)
135                std::cout << ' ';
136        }
137
138        int indent;
139    };
140
141Traversing the AST is a recursive exercise. `rexpr_printer` is a function
142object and the main entry point is `void operator()(rexpr const& ast)`. Notice
143how it recursively calls an instance of itself for each of the key value pairs,
144using `boost::apply_visitor`. Before and after iterating through all the
145elements in the map, the braces are printed to surround the entire body, taking
146care of correct indentation at each point in the recursive traversal.
147
148The `operator()` for `std::string` should be self explanatory. It simply prints
149the text inside double quotes.
150
151[heading The Grammar]
152
153    namespace client { namespace parser
154    {
155        namespace x3 = boost::spirit::x3;
156        namespace ascii = boost::spirit::x3::ascii;
157
158        using x3::lit;
159        using x3::lexeme;
160
161        using ascii::char_;
162        using ascii::string;
163
164        x3::rule<class rexpr_value, ast::rexpr_value>
165            rexpr_value = "rexpr_value";
166
167        x3::rule<class rexpr, ast::rexpr>
168            rexpr = "rexpr";
169
170        x3::rule<class rexpr_key_value, ast::rexpr_key_value>
171            rexpr_key_value = "rexpr_key_value";
172
173        auto const quoted_string =
174            lexeme['"' >> *(char_ - '"') >> '"'];
175
176        auto const rexpr_value_def =
177            quoted_string | rexpr;
178
179        auto const rexpr_key_value_def =
180            quoted_string >> '=' >> rexpr_value;
181
182        auto const rexpr_def =
183            '{' >> *rexpr_key_value >> '}';
184
185        BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value);
186    }}
187
188We declare three rules `rexpr_value`, `rexpr` and `rexpr_key_value`. Same deal
189as before. The attributes declared for each rule are the same structures we
190declared in our AST above. These are the structures our rules will produce.
191
192So, going top-down (from the start rule), `rexpr` is defined as zero or more
193`rexpr_key_value`s enclosed inside the curly braces. That's the kleene star in
194action there: `*rexpr_key_value`.
195
196[note Take note of the convention: we use `rexpr` name as the rule name and
197`rexpr_def` as the rule definition name.]
198
199To make the grammar clean, we capture the `quoted_string` production using
200C++ `auto`.
201
202`rexpr_value` is a `quoted_string` followed by the assignment operator `'='`,
203followed by a `rexpr_value`. `rexpr_value` is a `quoted_string` /OR/ a `rexpr`.
204This uses the alternative operator `|`, and maps nicely to the variant AST the
205rule is associated with.
206
207Finally, we associate the rules and their definitions:
208
209    BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value);
210
211[endsect]
212