1[#getting_started] 2[section Getting started with Boost.Metaparse] 3 4[section 1. Introduction] 5 6This tutorial shows you how to build a parser for a small calculator language 7from the ground up. The goal is not to have a complete calculator, but to show 8you the most common situations one can face while building a parser using 9Metaparse. This tutorial assumes, that you have some template metaprogramming 10experience. 11 12[section 1.1. Testing environment] 13 14While you are using Metaparse, you will be writing parsers turning an input text 15into a type. These types can later be processed by further template 16metaprograms. While you are working on your parsers, you'll probably want to 17look at the result of parsing a test input. This tutorial assumes that you can 18use [@http://metashell.org Metashell]. Since the 19[@http://metashell.org/about/demo online demo] makes the Boost 20headers available, you can use that in the tutorial as well. 21 22If you install Metashell on your computer, make sure that you have the Boost 23libraries and the `getting_started` example of Metaparse on the include path. 24For example, you can start Metashell with the following arguments: 25 26 $ metashell -I$BOOST_ROOT -I$BOOST_ROOT/libs/metaparse/example/getting_started 27 28`$BOOST_ROOT` refers to the ['boost root directory] (where you have checked 29out the Boost source code). 30 31This tutorial is long and therefore you might want to make shorter or longer 32breaks while reading it. To make it easy for you to stop at a certain point and 33continue later (or to start in the middle if you are already familiar with the 34basics) Metaparse has a `getting_started` directory in the `example`s. This 35contains the definitions for each section of this tutorial. 36 37If you're about to start (or continue) this guide from section 5.2.1, you can 38include `5_2_1.hpp`. This will define everything you need to start with that 39section. 40 41[note 42You have access to these headers in the online Metashell demo as well. For 43example you can include the `<boost/metaparse/getting_started/5_2_1.hpp>` 44header to start from section 5.2.1. 45] 46 47[endsect] 48 49[section 1.2. Using a "regular" testing environment] 50 51If you have no access to Metashell or you prefer using your regular C++ 52development environment while processing this tutorial, this is also possible. 53 54The tutorial (and usually experimenting with Metaparse) requires that you 55evaluate different template metaprogramming expressions and check their result, 56which is a type. Thus, to try the examples of this tutorial you need a way to 57be able to display the result of evaluating a template metaprogram. This section 58shows you two options. 59 60[section 1.2.1. Enforcing an error message or a warning containing the result of 61the metafunction call] 62 63You can either use `boost::mpl::print` or `mpllibs::metamonad::fail_with_type` 64to enforce a warning or an error message containing the result of a metaprogram 65evaluation. For example to see what 66[link BOOST_METAPARSE_STRING `BOOST_METAPARSE_STRING`]`("11 + 2")` refers to, 67you can create a `test.cpp` with the following content: 68 69 #include <boost/metaparse/string.hpp> 70 #include <boost/mpl/print.hpp> 71 72 boost::mpl::print<BOOST_METAPARSE_STRING("11 + 2")> x; 73 74If you try to compile it, the compiler will display warnings containing the 75type the expression 76[link BOOST_METAPARSE_STRING `BOOST_METAPARSE_STRING`]`("11 + 2")` constructs. 77To use this technique for this tutorial, you need to add all the includes and 78definitions the tutorial suggests typing in the shell to your `test.cpp` file. 79When the shell suggests to try to call some metafunction (or you'd like to try 80something out), you need to replace the template argument of `boost::mpl::print` 81with the expression in question and recompile the code. 82 83[endsect] 84 85[section 1.2.2. Displaying the result of the metafunction call at runtime] 86 87You can also display the result of metaprograms at runtime. You can use the 88[@http://boost.org/libs/type_index Boost.TypeIndex] library to do this. For 89example to see what 90[link BOOST_METAPARSE_STRING `BOOST_METAPARSE_STRING`]`("11 + 2")` refers to, 91you can create a `test.cpp` with the following content: 92 93 #include <boost/metaparse/string.hpp> 94 #include <boost/type_index.hpp> 95 #include <iostream> 96 97 int main() 98 { 99 std::cout 100 << boost::typeindex::type_id_with_cvr<BOOST_METAPARSE_STRING("11 + 2")>() 101 << std::endl; 102 } 103 104If you compile and run this code, it will display the type on the standard 105output. 106 107[endsect] 108 109[endsect] 110 111[endsect] 112 113[section 2. The text to parse] 114 115With Metaparse you can create template metaprograms parsing an input text. To 116pass the input text to the metaprograms, you need to represent them as types. 117For example let's represent the text `"Hello world"` as a type. The most 118straightforward way of doing it would be creating a variadic template class 119taking the characters of the text as template arguments: 120 121 template <char... Cs> 122 struct string; 123 124The text `"11 + 2"` can be represented the following way: 125 126 string<'1', '1', ' ', '+', ' ', '2'> 127 128Metaparse provides this type for you. Run the following command in Metashell: 129 130 > #include <boost/metaparse/string.hpp> 131 132[note 133Note that the `>` character at the beginning of the above code example is the 134prompt of Metashell. It is added to the code examples as a hint to what you 135should run in Metashell (or add to your test `cpp` file if you are using a 136regular development environment). 137] 138 139[note 140Note that in the [@http://abel.web.elte.hu/shell/metashell.html online-demo] 141of Metashell you can paste code into the shell by right-clicking on the shell 142somewhere and choosing ['Paste from browser] in the context menu. 143] 144 145This will make this type available for you. Now you can try running the 146following command: 147 148 > boost::metaparse::string<'1', '1', ' ', '+', ' ', '2'> 149 150The shell will echo (almost) the same type back to you. The only difference is 151that it is in a sub-namespace indicating the version of Metaparse being used. 152 153The nice thing about this representation is that metaprograms can easily access 154the individual characters of the text. The not so nice thing about this 155representation is that if you want to write the text `"Hello world"` in your 156source code, you have to type a lot. 157 158Metaparse provides a macro that can turn a string literal into an instance of 159[link string `boost::metaparse::string`]. This is the 160[link BOOST_METAPARSE_STRING `BOOST_METAPARSE_STRING`] macro. You get it by 161including `<boost/metaparse/string.hpp>`. Let's try it by running the following 162command in Metashell: 163 164 > BOOST_METAPARSE_STRING("11 + 2") 165 166You will get the same result as you got by instantiating 167[link string `boost::metaparse::string`] yourself. 168 169[endsect] 170 171[section 3. Creating a simple parser] 172[note Note that you can find everything that has been included and defined so far [link before_3 here].] 173 174Let's try creating a parser. We will start with creating a parser for something 175simple: we will be parsing integer numbers, such as the text `"13"`. You can 176think of this first parsing exercise as a ['template metaprogramming 177string-to-int conversion] because we expect to get the value `13` as the result 178of parsing. 179 180[note 181You know the difference between `"13"` and `13` in C++. One of them is a 182character array, the other one is an integral value. But what is the 183difference between them in template metaprogramming? They are represented by 184different types. For example `"13"` is represented by 185[link string `string`]`<'1', '3'>` while `13` is represented by 186`std::integral_constant<int, 13>`. 187] 188 189To build a parser, we need to specify the grammar to use. Metaparse provides 190building blocks (called parsers) we can use to do this and one of them is the 191[link int_ `int_`] parser which does exactly what we need: it parses integers. 192To make it available, we need to include it: 193 194 > #include <boost/metaparse/int_.hpp> 195 196Our grammar is simple: [link int_ `int_`]. (Don't worry, we'll parse more 197complicated languages later). 198 199A parser is a [link metafunction_class template metafunction class]. It can be 200used directly, but its interface is designed for completeness and not for ease 201of use. Metaparse provides the [link build_parser `build_parser`] 202[link metafunction metafunction] that adds a wrapper to parsers with a simple 203interface. 204 205[note 206In this tutorial, we will always be wrapping our parsers with this. We will 207call these wrapped parsers parsers as well. If you are interested in it, you 208can learn about the complete interface of parsers [link parser here]. 209] 210 211Let's create a parser using [link int_ `int_`] and 212[link build_parser `build_parser`]: 213 214 > #include <boost/metaparse/build_parser.hpp> 215 > using namespace boost::metaparse; 216 > using exp_parser1 = build_parser<int_>; 217 218[link getting_started_0 copy-paste friendly version] 219 220First we need to include `build_parser.hpp` to make 221[link build_parser `build_parser`] available. Then we make our lives easier by 222running `using namespace boost::metaparse;`. The third command defines the 223parser: we need to instantiate the [link build_parser `build_parser`] template 224class with our parser ([link int_ `int_`] in this case) as argument. 225 226Now that we have a parser, let's parse some text with it (if you haven't done it 227yet, include `boost/metaparse/string.hpp`): 228 229 > exp_parser1::apply<BOOST_METAPARSE_STRING("13")>::type 230 mpl_::integral_c<int, 13> 231 232`exp_parser1` is a [link metafunction_class template metafunction class] taking 233the input text as it's argument and it returns the integral representation of 234the number in the string. Try it with different numbers and see how it converts 235them. 236 237[section 3.1. Dealing with invalid input] 238[note Note that you can find everything that has been included and defined so far [link before_3_1 here].] 239 240Have you tried parsing an invalid input? Something that is not a number, such 241as: 242 243 > exp_parser1::apply<BOOST_METAPARSE_STRING("thirteen")>::type 244 << compilation error >> 245 246Well, `"thirteen"` ['is] a number, but our parser does not speak English, so it 247is considered as invalid input. As a result of this, compilation fails and you 248get a compilation error from Metashell. 249 250In the [@#dealing-with-invalid-input-1 Dealing with invalid input] section we 251will go into further details on error handling. 252 253[endsect] 254 255[section 3.2. Dealing with input containing more than what is needed] 256[note Note that you can find everything that has been included and defined so far [link before_3_2 here].] 257 258Let's try to give the parser two numbers instead of one: 259 260 > exp_parser1::apply<BOOST_METAPARSE_STRING("11 13")>::type 261 mpl_::integral_c<int, 11> 262 263You might be surprised by this: the parser did not return an error. It parsed 264the first number, `11` and ignored `13`. The way [link int_ `int_`] works is 265that it parses the number at the beginning of the input text and ignores the 266rest of the input. 267 268So `exp_parser1` has a bug: our little language consists of ['one] number, not a 269['list of numbers]. Let's fix our parser to treat more than one numbers as an 270invalid input: 271 272 > #include <boost/metaparse/entire_input.hpp> 273 274This gives us the [link entire_input `entire_input`] template class. We can 275wrap [link int_ `int_`] with [link entire_input `entire_input`] indicating 276that the number we parse with [link int_ `int_`] should be the entire input. 277Anything that comes after that is an error. So our parser is 278[link entire_input `entire_input`]`<`[link int_ `int_`]`>` now. Let's wrap it 279with [link build_parser `build_parser`]: 280 281 > using exp_parser2 = build_parser<entire_input<int_>>; 282 283Let's try this new parser out: 284 285 > exp_parser2::apply<BOOST_METAPARSE_STRING("13")>::type 286 mpl_::integral_c<int, 13> 287 288It can still parse numbers. Let's try to give it two numbers: 289 290 > exp_parser2::apply<BOOST_METAPARSE_STRING("11 13")>::type 291 << compilation error >> 292 293This generates a compilation error, since the parser failed. 294 295[endsect] 296 297[section 3.3. Accepting optional whitespaces at the end of the input] 298[note Note that you can find everything that has been included and defined so far [link before_3_3 here].] 299 300Our parser became a bit too 301restrictive now. It doesn't allow ['anything] after the number, not even 302whitespaces: 303 304 > exp_parser2::apply<BOOST_METAPARSE_STRING("11 ")>::type 305 << compilation error >> 306 307Let's allow whitespaces after the number: 308 309 > #include <boost/metaparse/token.hpp> 310 311This makes the [link token `token`] template class available. It takes a parser 312as its argument and allows optional whitespaces after that. Let's create a third 313parser allowing whitespaces after the number: 314 315 > using exp_parser3 = build_parser<entire_input<token<int_>>>; 316 317We expect [link token `token`]`<`[link int_ `int_`]`>` to be the entire input 318in this case. We allow optional whitespaces after [link int_ `int_`] but 319nothing else: 320 321 > exp_parser3::apply<BOOST_METAPARSE_STRING("11 ")>::type 322 mpl_::integral_c<int, 11> 323 324[endsect] 325 326[endsect] 327 328[section 4. Parsing simple expressions] 329[note Note that you can find everything that has been included and defined so far [link before_4 here].] 330 331We can parse numbers. Let's try parsing something more complicated, such as 332`"11 + 2"`. This is a number followed by a `+` symbol followed by another 333number. [link int_ `int_`] (or [link token `token`]`<`[link int_ `int_`]`>`) 334implements the parser for one number. 335 336First, let's write a parser for the `+` symbol. We can use the following: 337 338 > #include <boost/metaparse/lit_c.hpp> 339 340This gives us [link lit_c `lit_c`] which we can use to parse specific 341characters, such as `+`. The grammar parsing the `+` character can be 342represented by [link lit_c `lit_c`]`<'+'>`. To allow optional whitespaces after 343it, we should use [link token `token`]`<`[link lit_c `lit_c`]`<'+'>>`. 344 345So to parse `"11 + 2"` we need the following sequence of parsers: 346 347 token<int_> token<lit_c<'+'>> token<int_> 348 349Metaparse provides [link sequence `sequence`] for parsing the sequence of 350things: 351 352 > #include <boost/metaparse/sequence.hpp> 353 354We can implement the parser for our expressions using 355[link sequence `sequence`]: 356 357 sequence<token<int_>, token<lit_c<'+'>>, token<int_>> 358 359Let's create a parser using it: 360 361 > using exp_parser4 = build_parser<sequence<token<int_>, token<lit_c<'+'>>, token<int_>>>; 362 363Try parsing a simple expression using it: 364 365 > exp_parser4::apply<BOOST_METAPARSE_STRING("11 + 2")>::type 366 boost::mpl::v_item<mpl_::integral_c<int, 2>, boost::mpl::v_item<mpl_::char_<'+'> 367 , boost::mpl::v_item<mpl_::integral_c<int, 11>, boost::mpl::vector0<mpl_::na>, 0 368 >, 0>, 0> 369 370What you get might look strange to you. It is a `vector` from [Boost.MPL]( 371http://boost.org/libs/mpl). What you can see in the shell is the way this vector 372is represented. Metashell offers 373[pretty printing](metashell.org/manual/getting_started#data-structures-of-boostmpl) 374for [@http://boost.org/libs/mpl Boost.MPL] containers: 375 376 > #include <metashell/formatter.hpp> 377 378After including this header, try parsing again: 379 380 > exp_parser4::apply<BOOST_METAPARSE_STRING("11 + 2")>::type 381 boost_::mpl::vector<mpl_::integral_c<int, 11>, mpl_::char_<'+'>, mpl_::integral_c<int, 2> > 382 383What you get now looks more simple: this is a vector of three elements: 384 385* `mpl_::integral_c<int, 11>` This is the result of parsing with 386 [link token `token`]`<`[link int_ `int_`]`>`. 387* `mpl_::char_<'+'>` This is the result of parsing with 388 [link token `token`]`<`[link lit_c `lit_c`]`<'+'>>`. 389* `mpl_::integral_c<int, 2> >` This is the result of parsing with 390 [link token `token`]`<`[link int_ `int_`]`>`. 391 392The result of parsing with a [link sequence `sequence`] is the `vector` of the 393individual parsing results. 394 395[section 4.1. Tokenizer] 396[note Note that you can find everything that has been included and defined so far [link before_4_1 here].] 397 398You might have noticed that our parsers have no separate tokenizers. 399Tokenization is part of the parsing process. However, it makes the code of the 400parsers cleaner if we separate the two layers. The previous example has two 401types of tokens: 402 403* a number (eg. `13`) 404* a `+` symbol 405 406In our last solution we parsed them by using the 407[link token `token`]`<`[@int_html `int_`]`>` and 408[link token `token`]`<`[link lit_c `lit_c`]`<'+'>>` parsers. Have you noticed 409a pattern? We wrap the parsers of the tokens with [link token `token`]`<...>`. 410It is not just syntactic sugar. Our tokens might be followed (separated) by 411whitespaces, which can be ignored. That is what [link token `token`]`<...>` 412implements. 413 414So let's make the implementation of `exp_parser` cleaner by separating the 415tokenization from the rest of the parser: 416 417 > using int_token = token<int_>; 418 > using plus_token = token<lit_c<'+'>>; 419 420[link getting_started_1 copy-paste friendly version] 421 422These two definitions create type aliases for the parsers of our tokens. For the 423compiler it doesn't matter if we use `plus_token` or 424[link token `token`]`<`[link lit_c `lit_c`]`<'+'>>`, since they refer to the 425same type. But it makes the code of the parser easier to understand. 426 427We can now define our expression parser using these tokens: 428 429 > using exp_parser5 = build_parser<sequence<int_token, plus_token, int_token>>; 430 431We can use it the same way as `exp_parser4`: 432 433 > exp_parser5::apply<BOOST_METAPARSE_STRING("11 + 2")>::type 434 boost_::mpl::vector<mpl_::integral_c<int, 11>, mpl_::char_<'+'>, mpl_::integral_c<int, 2> > 435 436[endsect] 437 438[section 4.2. Evaluating the expression] 439[note Note that you can find everything that has been included and defined so far [link before_4_2 here].] 440 441It would be nice if we could evaluate the expression as well. Instead of 442returning a `vector` as the result of parsing, we should return the evaluated 443expression. For example the result of parsing `"11 + 2"` should be 444`mpl_::integral_c<int, 13>`. 445 446Metaparse provides [link transform `transform`] which we can use to implement 447this: 448 449 > #include <boost/metaparse/transform.hpp> 450 451This can be used to transform the result of a parser. For example we have the 452[link sequence `sequence`]`<int_token, plus_token, int_token>` parser which 453returns a `vector`. We want to transform this `vector` into a number, which is 454the result of evaluating the expression. We need to pass 455[link transform `transform`] the [link sequence `sequence`]`<...>` parser and 456a function which turns the `vector` into the result we need. First let's create 457this [link metafunction metafunction]: 458 459 > #include <boost/mpl/plus.hpp> 460 > #include <boost/mpl/at.hpp> 461 > template <class Vector> \ 462 ...> struct eval_plus : \ 463 ...> boost::mpl::plus< \ 464 ...> typename boost::mpl::at_c<Vector, 0>::type, \ 465 ...> typename boost::mpl::at_c<Vector, 2>::type \ 466 ...> > {}; 467 468[link getting_started_2 copy-paste friendly version] 469 470[note 471Note that if the last character of your command is the `\` character in 472Metashell, then the shell assumes that you will continue typing the same command 473and waits for that before evaluating your command. When Metashell is waiting for 474the second (or third, or fourth, etc) line of a command, it uses a special 475prompt, `...>`. 476] 477 478What it does is that using `boost::mpl::at_c` it takes the first (index 0) and 479the third (index 2) elements of the `vector` that is the result of parsing with 480[link sequence `sequence`]`<...>` and adds them. We can try it out with an 481example `vector`: 482 483 > eval_plus< \ 484 ...> boost::mpl::vector< \ 485 ...> mpl_::integral_c<int, 11>, \ 486 ...> mpl_::char_<'+'>, \ 487 ...> mpl_::integral_c<int, 2> \ 488 ...> >>::type 489 mpl_::integral_c<int, 13> 490 491[link getting_started_3 copy-paste friendly version] 492 493We can use `eval_plus` to build a parser that evaluates the expression it 494parses: 495 496 > #include <boost/mpl/quote.hpp> 497 > using exp_parser6 = \ 498 ...> build_parser< \ 499 ...> transform< \ 500 ...> sequence<int_token, plus_token, int_token>, \ 501 ...> boost::mpl::quote1<eval_plus> \ 502 ...> > \ 503 ...> >; 504 505[link getting_started_4 copy-paste friendly version] 506 507[note 508Note that we have to use `boost::mpl::quote1` to turn our `eval_plus` 509[link metafunction metafunction] into a 510[link metafunction_class metafunction class]. 511] 512 513[link transform `transform`] parses the input using 514[link sequence `sequence`]`<int_token, plus_token, int_token>` and transforms 515the result of that using `eval_plus`. Let's try it out: 516 517 > exp_parser6::apply<BOOST_METAPARSE_STRING("11 + 2")>::type 518 mpl_::integral_c<int, 13> 519 520We have created a simple expression parser. The following diagram shows how it 521works: 522 523[$images/metaparse/tutorial_diag0.png [width 50%]] 524 525The rounded boxes in the diagram are the parsers parsing the input, which are 526functions ([link metafunction_class template metafunction class]es). The arrows 527represent how the results are passed around between these parsers (they are the 528return values of the function calls). 529 530It uses [link sequence `sequence`] to parse the different elements (the first 531number, the `+` symbol and the second number) and builds a `vector`. The final 532result is calculated from that `vector` by the [link transform `transform`] 533parser. 534 535[endsect] 536 537[endsect] 538 539[section 5. Parsing longer expressions] 540[note Note that you can find everything that has been included and defined so far [link before_5 here].] 541 542We can parse simple expressions adding two numbers together. But we can't parse 543expressions adding three, four or maybe more numbers together. In this section 544we will implement a parser for expressions adding lots of numbers together. 545 546[section 5.1. Parsing a subexpression repeatedly] 547[note Note that you can find everything that has been included and defined so far [link before_5_1 here].] 548 549We can't solve this problem with [link sequence `sequence`], since we don't 550know how many numbers the input will have. We need a parser that: 551 552* parses the first number 553* keeps parsing `+ <number>` elements until the end of the input 554 555Parsing the first number is something we can already do: the `int_token` parser 556does it for us. Parsing the `+ <number>` elements is more tricky. Metaparse 557offers different tools for approaching this. The most simple is 558[link repeated `repeated`]: 559 560 > #include <boost/metaparse/any.hpp> 561 562[link repeated `repeated`] needs a parser (which parses one `+ <number>` 563element) and it keeps parsing the input with it as long as it can. This will 564parse the entire input for us. Let's create a parser for our expressions using 565it: 566 567 > using exp_parser7 = \ 568 ...> build_parser< \ 569 ...> sequence< \ 570 ...> int_token, /* The first <number> */ \ 571 ...> repeated<sequence<plus_token, int_token>> /* The "+ <number>" elements */ \ 572 ...> > \ 573 ...> >; 574 575[link getting_started_5 copy-paste friendly version] 576 577We have a [link sequence `sequence`] with two elements: 578 579* The first number (`int_token`) 580* The `+ <number>` parts 581 582The second part is an [link repeated `repeated`], which parses the `+ <number>` 583elements. One such element is parsed by 584[link sequence `sequence`]`<plus_token, int_token>`. This is just a sequence of 585the `+` symbol and the number. 586 587Let's try parsing an expression using this: 588 589 > exp_parser7::apply<BOOST_METAPARSE_STRING("1 + 2 + 3 + 4")>::type 590 591Here is a formatted version of the result which is easier to read: 592 593 boost_::mpl::vector< 594 // The result of int_token 595 mpl_::integral_c<int, 1>, 596 597 // The result of repeated< sequence<plus_token, int_token> > 598 boost_::mpl::vector< 599 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 2> >, 600 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 3> >, 601 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 4> > 602 > 603 > 604 605The result is a `vector` of two elements. The first element of this `vector` is 606the result of parsing the input with `int_token`, the second element of this 607`vector` is the result of parsing the input with 608[link repeated `repeated`]`< `[link sequence `sequence`]`<plus_token, int_token>>`. 609This second element is also a `vector`. Each element of this `vector` is the 610result of parsing the input with 611[link sequence `sequence`]`<plus_token, int_token>` once. Here is a diagram 612showing how `exp_parser7` parses the input `1 + 2 + 3 + 4`: 613 614[$images/metaparse/tutorial_diag1.png [width 90%]] 615 616The diagram shows that the `+ <number>` elements are parsed by 617[link sequence `sequence`]`<plus_token, int_token>` elements and their results 618are collected by [link repeated `repeated`], which constructs a `vector` of 619these results. The value of the first `<number>` and this `vector` are placed in 620another `vector`, which is the result of parsing. 621 622[endsect] 623 624[section 5.2. Evaluating the parsed expression] 625[note Note that you can find everything that has been included and defined so far [link before_5_2 here].] 626 627The final result here is a pair of the first number and the `vector` of the rest 628of the values. To calculate the result we need to process that data structure. 629Let's give the example output we have just parsed a name. This will make it 630easier to test the code calculating the final result from this structure: 631 632 > using temp_result = exp_parser7::apply<BOOST_METAPARSE_STRING("1 + 2 + 3 + 4")>::type; 633 634Now we can write a [link metafunction template metafunction] turning this 635structure into the result of the calculation this structure represents. 636 637[section 5.2.1. Learning about `boost::mpl::fold`] 638[note Note that you can find everything that has been included and defined so far [link before_5_2_1 here].] 639 640We have a `vector` containing 641another `vector`. Therefore, we will need to be able to summarise the elements 642of different `vector`s. We can use the `boost::mpl::fold` 643[link metafunction metafunction] to do this: 644 645 > #include <boost/mpl/fold.hpp> 646 647With this [link metafunction metafunction], we can iterate over a `vector` of 648parsed numbers and summarise them. We can provide it a 649[link metafunction metafunction] taking two arguments: the sum we have so far 650and the next element of the `vector`. This [link metafunction metafunction] 651will be called for every element of the `vector`. 652 653[note 654Note that this is very similar to the `std::accumulate` algorithm. 655[@http://boost.org/libs/mpl Boost.MPL] provides `boost::mpl::accumulate` as 656well, which is a synonym for `boost::mpl::fold`. This tutorial (and Metaparse) 657uses the name `fold`. 658] 659 660Let's start with a simple case: a `vector` of numbers. For example let's 661summarise the elements of the following `vector`: 662 663 > using vector_of_numbers = \ 664 ...> boost::mpl::vector< \ 665 ...> boost::mpl::int_<2>, \ 666 ...> boost::mpl::int_<5>, \ 667 ...> boost::mpl::int_<6> \ 668 ...> >; 669 670[link getting_started_6 copy-paste friendly version] 671 672We will write a [link metafunction template metafunction], `sum_vector` for 673summarising the elements of a `vector` of numbers: 674 675 > template <class Vector> \ 676 ...> struct sum_vector : \ 677 ...> boost::mpl::fold< \ 678 ...> Vector, \ 679 ...> boost::mpl::int_<0>, \ 680 ...> boost::mpl::lambda< \ 681 ...> boost::mpl::plus<boost::mpl::_1, boost::mpl::_2> \ 682 ...> >::type \ 683 ...> > \ 684 ...> {}; 685 686[link getting_started_7 copy-paste friendly version] 687 688This [link metafunction metafunction] takes the `vector` to summarise the 689elements of as its argument and uses `boost::mpl::fold` to calculate the sum. 690`boost::mpl::fold` takes three arguments: 691 692* The container to summarise. This is `Vector`. 693* The starting value for ['the sum we have so far]. Using `0` means that we want 694 to start the sum from `0`. 695* The function to call in every iteration while looping over the container. We 696 are using a 697 [@http://www.boost.org/libs/mpl/doc/refmanual/lambda-expression.html lambda expression] 698 in our example, which is the expression wrapped by `boost::mpl::lambda`. This 699 expression adds its two arguments together using `boost::mpl::plus`. The 700 lambda expression refers to its arguments by `boost::mpl::_1` and 701 `boost::mpl::_2`. 702 703Let's try this [link metafunction metafunction] out: 704 705 > sum_vector<vector_of_numbers>::type 706 mpl_::integral_c<int, 13> 707 708It works as expected. Here is a diagram showing how it works: 709 710[$images/metaparse/tutorial_diag2.png [width 50%]] 711 712As the diagram shows, `boost::mpl::fold` evaluates the lambda expression for 713each element of the `vector` and passes the result of the previous evaluation to 714the next lambda expression invocation. 715 716We have a [link metafunction metafunction] that can summarise a `vector` of 717numbers. The result of parsing the `+ <number>` elements is a `vector` of 718`vector`s. As a recap, here is `temp_result`: 719 720 boost_::mpl::vector< 721 // The result of int_token 722 mpl_::integral_c<int, 1>, 723 724 // The result of repeated< sequence<plus_token, int_token> > 725 boost_::mpl::vector< 726 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 2> >, 727 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 3> >, 728 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 4> > 729 > 730 > 731 732First let's summarise the result of [link repeated `repeated`]`<...>` using 733`boost::mpl::fold`. This is a `vector` of `vector`s, but that's fine. 734`boost::mpl::fold` doesn't care about what the elements of the `vector` are. 735They can be numbers, `vector`s or something else as well. The function we use to 736add two numbers together (which was a lambda expression in our previous example) 737gets these elements as its argument and has to deal with them. So to summarise 738the elements of the `vector`s we get as the result of parsing with 739[link repeated `repeated`]`<...>`, we need to write a 740[link metafunction metafunction] that can deal with these elements. One such 741element is `boost_::mpl::vector<mpl_::char<'+'>, mpl_::integral_c<int, 2>>`. 742Here is a [link metafunction metafunction] that can be used in a 743`boost::mpl::fold`: 744 745 > template <class Sum, class Item> \ 746 ...> struct sum_items : \ 747 ...> boost::mpl::plus< \ 748 ...> Sum, \ 749 ...> typename boost::mpl::at_c<Item, 1>::type \ 750 ...> > \ 751 ...> {}; 752 753[link getting_started_8 copy-paste friendly version] 754 755This function takes two arguments: 756 757* `Sum`, which is a number. This is the summary of the already processed 758 elements. 759* `Item`, the next item of the `vector`. These items are `vector`s of size two: 760 the result of parsing the `+` symbol and the number. 761 762The [link metafunction metafunction] adds the sum we have so far and the next 763number together using the `boost::mpl::plus` [link metafunction metafunction]. 764To get the next number out of `Item`, it uses `boost::mpl::at_c`. Let's try 765`sum_items` out: 766 767 > sum_items< \ 768 ...> mpl_::integral_c<int, 1>, \ 769 ...> boost::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 2>> \ 770 ...> >::type 771 mpl_::integral_c<int, 3> 772 773[link getting_started_9 copy-paste friendly version] 774 775We have called `sum_items` with values from `temp_result` and saw that it works 776as expected: it added the partial sum (`mpl_::integral_c<int, 1>`) to the next 777number (`mpl_::integral_c<int, 2>`). 778 779`boost::mpl::fold` can summarise the list we get as the result of parsing the 780`+ <number>` elements of the input, so we need to extract this list from 781`temp_result` first: 782 783 > boost::mpl::at_c<temp_result, 1>::type 784 785Here is the formatted version of the result: 786 787 boost_::mpl::vector< 788 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 2>>, 789 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 3>>, 790 boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 4>> 791 > 792 793This is the second element of the `temp_result` vector (the first one is the 794value of the first `<number>` element). Let's try fold out for this: 795 796 > \ 797 ...> boost::mpl::fold< \ 798 ...> boost::mpl::at_c<temp_result, 1>::type, /* The vector to summarise */ \ 799 ...> boost::mpl::int_<0>, /* The value to start the sum from */ \ 800 ...> boost::mpl::quote2<sum_items> /* The function to call in each iteration */ \ 801 ...> >::type 802 mpl_::integral_c<int, 9> 803 804[link getting_started_10 copy-paste friendly version] 805 806[note 807We are using `sum_items` as the function to call in each iteration. We are 808passing a [link metafunction metafunction] (`sum_items`) to another 809[link metafunction metafunction] (`boost::mpl::fold`) as an argument. To be 810able to do this, we need to turn it into a 811[link metafunction_class template metafunction class] using 812`boost::mpl::quote2` (`2` means that it takes two arguments). 813] 814 815As we have seen, the result of this is the sum of the elements, which was `9` in 816our case. Here is a diagram showing how `boost::mpl::fold` works: 817 818[$images/metaparse/tutorial_diag3.png [width 50%]] 819 820It starts with the value `boost::mpl::int_<0>` and adds the elements of the 821`boost_::mpl::vector` containing the parsing results one by one. The diagram 822shows how the subresults are calculated and then used for further calculations. 823 824[endsect] 825 826[section 5.2.2. Evaluating the expression using `boost::mpl::fold`] 827[note Note that you can find everything that has been included and defined so far [link before_5_2_2 here].] 828 829Let's use `sum_items` with `boost::mpl::fold` to build the parser that 830summarises the values coming from the `+ <number>` elements. We can extend the 831parser we were using in `exp_parser7` by wrapping the 832[link repeated `repeated`]`<...>` part with [link transform `transform`], which 833transforms the result of [link repeated `repeated`]`<...>` with the folding 834expression we have just created: 835 836 > using exp_parser8 = \ 837 ...> build_parser< \ 838 ...> sequence< \ 839 ...> int_token, /* parse the first <number> */ \ 840 ...> transform< \ 841 ...> repeated<sequence<plus_token, int_token>>, /* parse the "+ <number>" elements */ \ 842 ...> \ 843 ...> /* lambda expression summarising the "+ <number>" elements using fold */ \ 844 ...> boost::mpl::lambda< \ 845 ...> /* The folding expression we have just created */ \ 846 ...> boost::mpl::fold< \ 847 ...> boost::mpl::_1, /* the argument of the lambda expression, the result */ \ 848 ...> /* of the repeated<...> parser */ \ 849 ...> boost::mpl::int_<0>, \ 850 ...> boost::mpl::quote2<sum_items> \ 851 ...> > \ 852 ...> >::type \ 853 ...> > \ 854 ...> > \ 855 ...> >; 856 857[link getting_started_11 copy-paste friendly version] 858 859It uses [link transform `transform`] to turn the result of the previous version 860of our parser into one that summarises the `+ <number>` elements. Let's try it 861out: 862 863 > exp_parser8::apply<BOOST_METAPARSE_STRING("1 + 2 + 3 + 4")>::type 864 boost_::mpl::vector<mpl_::integral_c<int, 1>, mpl_::integral_c<int, 9> > 865 866This returns a pair of numbers as the result of parsing: the first number and 867the sum of the rest. To get the value of the entire expression we need to add 868these two numbers together. We can extend our parser to do this final addition 869as well: 870 871 > using exp_parser9 = \ 872 ...> build_parser< \ 873 ...> transform< \ 874 ...> /* What we had so far */ \ 875 ...> sequence< \ 876 ...> int_token, \ 877 ...> transform< \ 878 ...> repeated<sequence<plus_token, int_token>>, \ 879 ...> boost::mpl::lambda< \ 880 ...> boost::mpl::fold< \ 881 ...> boost::mpl::_1, \ 882 ...> boost::mpl::int_<0>, \ 883 ...> boost::mpl::quote2<sum_items> \ 884 ...> > \ 885 ...> >::type \ 886 ...> > \ 887 ...> >, \ 888 ...> boost::mpl::quote1<sum_vector> /* summarise the vector of numbers */ \ 889 ...> > \ 890 ...> >; 891 892[link getting_started_12 copy-paste friendly version] 893 894`exp_parser9` wraps the parser we had so far (which gives us the two element 895`vector` as the result) with [link transform `transform`] to add the elements 896of that two element `vector` together. Since that two element `vector` is a 897`vector` of numbers, we can (re)use the `sum_vector` 898[link metafunction metafunction] for this. Let's try it out: 899 900 > exp_parser9::apply<BOOST_METAPARSE_STRING("1 + 2 + 3 + 4")>::type 901 mpl_::integral_c<int, 10> 902 903It gives us the correct result, but it is very inefficient. Let's see why: 904 905[$images/metaparse/tutorial_diag4.png [width 90%]] 906 907There are two loops in this process: 908 909* first [link repeated `repeated`] loops over the input to parse all of the 910 `+ <number>` elements. It builds a `vector` during this. (`Loop 1` on the 911 diagram) 912* then `boost::mpl::fold` loops over this `vector` to summarise the elements. 913 (`Loop 2` on the diagram) 914 915[note 916Note that we have been talking about ['loop]s while there is no such thing as 917a loop in template metaprogramming. Loops can be implemented using 918['recursion]: every recursive call is one iteration of the loop. The loop is 919stopped at the bottom of the recursive chain. 920] 921 922[endsect] 923 924[section 5.2.3. Using a folding parser combinator] 925[note Note that you can find everything that has been included and defined so far [link before_5_2_3 here].] 926 927It would be nice, if the two loops could be merged together and the temporary 928`vector` wouldn't have to be built in the middle (don't forget: there is no 929such thing as a ['garbage collector] for template metaprogramming. Once you 930instantiate a template, it will be available until the end of ... the 931compilation). 932 933Metaparse provides the [link foldl `foldl`] parser combinator: 934 935 > #include <boost/metaparse/foldl.hpp> 936 937It is almost the same as `boost::mpl::fold`, but instead of taking the `vector` 938as its first argument, which was coming from the repeated application of a 939parser ([link sequence `sequence`]`<plus_token, int_token>`) on the input, it 940takes the parser itself. [link foldl `foldl`] parses the input and calculates 941the summary on the fly. Here is how we can write our parser using it: 942 943 > using exp_parser10 = \ 944 ...> build_parser< \ 945 ...> transform< \ 946 ...> sequence< \ 947 ...> int_token, \ 948 ...> foldl< \ 949 ...> sequence<plus_token, int_token>, \ 950 ...> boost::mpl::int_<0>, \ 951 ...> boost::mpl::quote2<sum_items> \ 952 ...> > \ 953 ...> >, \ 954 ...> boost::mpl::quote1<sum_vector>> \ 955 ...> >; 956 957[link getting_started_13 copy-paste friendly version] 958 959Here are the formatted versions of `exp_parser9` and `exp_parser10` 960side-by-side: 961 962 // exp_parser9 exp_parser10 963 964 build_parser< build_parser< 965 transform< transform< 966 sequence< sequence< 967 int_token, int_token, 968 969 970 transform< foldl< 971 repeated<sequence<plus_token, int_token>>, sequence<plus_token, int_token>, 972 boost::mpl::lambda< 973 boost::mpl::fold< 974 boost::mpl::_1, 975 boost::mpl::int_<0>, boost::mpl::int_<0>, 976 boost::mpl::quote2<sum_items> boost::mpl::quote2<sum_items> 977 > 978 >::type 979 > > 980 981 982 >, >, 983 boost::mpl::quote1<sum_vector> boost::mpl::quote1<sum_vector> 984 > > 985 > > 986 987[link getting_started_14 copy-paste friendly version] 988 989In `exp_parser10` the "_[link repeated `repeated`] and then 990[link transform `transform`] with `boost::mpl::fold`_" part (the middle block of 991`exp_parser9`) has been replaced by one [link foldl `foldl`] parser that does 992the same thing but without building a `vector` in the middle. The same starting 993value (`boost::mpl::int_<0>`) and callback function (`sum_items`) could be used. 994 995Here is a diagram showing how `exp_parser10` works: 996 997[$images/metaparse/tutorial_diag5.png [width 90%]] 998 999In this case, the results of the 1000[link sequence `sequence`]`<plus_token, int_token>` parsers are passed directly 1001to a folding algorithm without an intermediate `vector`. Here is a diagram 1002showing `exp_parser9` and `exp_parser10` side-by-side to make it easier to see 1003the difference: 1004 1005[$images/metaparse/tutorial_diag6.png [width 90%]] 1006 1007[endsect] 1008 1009[section 5.2.4. Processing the initial element with the folding parser combinator] 1010[note Note that you can find everything that has been included and defined so far [link before_5_2_4 here].] 1011 1012This solution can still be improved. The [link foldl `foldl`] summarising the 1013`+ <number>` elements starts from `0` and once this is done, we add the value of 1014the first `<number>` of the input to it in the first iteration. It would be more 1015straightforward if [link foldl `foldl`] could use the value of the first 1016`<number>` as the initial value of the "['sum we have so far]". Metaparse 1017provides [link foldl_start_with_parser `foldl_start_with_parser`] for this: 1018 1019 > #include <boost/metaparse/foldl_start_with_parser.hpp> 1020 1021[link foldl_start_with_parser `foldl_start_with_parser`] is almost the same as 1022[link foldl `foldl`]. The difference is that instead of taking a starting 1023['value] for the sum it takes a ['parser]. First it parses the input with this 1024parser and uses the value it returns as the starting value. Here is how we can 1025implement our parser using it: 1026 1027 > using exp_parser11 = \ 1028 ...> build_parser< \ 1029 ...> foldl_start_with_parser< \ 1030 ...> sequence<plus_token, int_token>, /* apply this parser repeatedly */ \ 1031 ...> int_token, /* use this parser to get the initial value */ \ 1032 ...> boost::mpl::quote2<sum_items> /* use this function to add a new value to the summary */ \ 1033 ...> > \ 1034 ...> >; 1035 1036[link getting_started_15 copy-paste friendly version] 1037 1038This version of `exp_parser` uses 1039[link foldl_start_with_parser `foldl_start_with_parser`]. This implementation is 1040more compact than the earlier versions. There is no [link sequence `sequence`] 1041element in this: the first `<number>` is parsed by `int_token` and its value is 1042used as the initial value for the summary. Let's try it out: 1043 1044 > exp_parser11::apply<BOOST_METAPARSE_STRING("1 + 2 + 3 + 4")>::type 1045 mpl_::integral_c<int, 10> 1046 1047It returns the same result as the earlier version but works differently. Here is 1048a diagram showing how this implementation works: 1049 1050[$images/metaparse/tutorial_diag7.png [width 90%]] 1051 1052[endsect] 1053 1054[endsect] 1055 1056[endsect] 1057 1058[section 6. Adding support for other operators] 1059[note Note that you can find everything that has been included and defined so far [link before_6 here].] 1060 1061Our parsers now support expressions adding numbers together. In this section we 1062will add support for the `-` operator, so expressions like `1 + 2 - 3` can be 1063evaluated. 1064 1065[section 6.1. Parsing expressions containing `-` operators] 1066[note Note that you can find everything that has been included and defined so far [link before_6_1 here].] 1067 1068Currently we use the `plus_token` for parsing "the" operator, which has to be 1069`+`. We can define a new token for parsing the `-` symbol: 1070 1071 > using minus_token = token<lit_c<'-'>>; 1072 1073We need to build a parser that accepts either a `+` or a `-` symbol. This can be 1074implemented using [link one_of `one_of`]: 1075 1076 > #include <boost/metaparse/one_of.hpp> 1077 1078[link one_of `one_of`]`<plus_token, minus_token>` is a parser which accepts 1079either a `+` (using `plus_token`) or a `-` (using `minus_token`) symbol. The 1080result of parsing is the result of the parser that succeeded. 1081 1082[note 1083You can give any parser to [link one_of `one_of`], therefore it is possible 1084that more than one of them can parse the input. In those cases the order 1085matters: [link one_of `one_of`] tries parsing the input with the parsers from 1086left to right and the first one that succeeds, wins. 1087] 1088 1089Using this, we can make our parser accept subtractions as well: 1090 1091 > using exp_parser12 = \ 1092 ...> build_parser< \ 1093 ...> foldl_start_with_parser< \ 1094 ...> sequence<one_of<plus_token, minus_token>, int_token>, \ 1095 ...> int_token, \ 1096 ...> boost::mpl::quote2<sum_items> \ 1097 ...> > \ 1098 ...> >; 1099 1100[link getting_started_16 copy-paste friendly version] 1101 1102It uses [link one_of `one_of`]`<plus_token, minus_token>` as the separator for 1103the numbers. Let's try it out: 1104 1105 > exp_parser12::apply<BOOST_METAPARSE_STRING("1 + 2 - 3")>::type 1106 mpl_::integral_c<int, 6> 1107 1108The result is not correct. The reason for this is that `sum_items`, the function 1109we summarise with ignores which operator was used and assumes that it is always 1110`+`. 1111 1112[endsect] 1113 1114[section 6.2. Evaluating expressions containing `-` operators] 1115[note Note that you can find everything that has been included and defined so far [link before_6_2 here].] 1116 1117To fix the evaluation of expressions containing subtractions, we need to fix 1118the function we use for summarising. We need to write a version that takes the 1119operator being used into account. 1120 1121First of all we will need the `boost::mpl::minus` 1122[link metafunction metafunction] for implementing subtraction: 1123 1124 > #include <boost/mpl/minus.hpp> 1125 1126Let's write a helper metafunction that takes three arguments: the left operand, 1127the operator and the right operand: 1128 1129 > template <class L, char Op, class R> struct eval_binary_op; 1130 > template <class L, class R> struct eval_binary_op<L, '+', R> : boost::mpl::plus<L, R>::type {}; 1131 > template <class L, class R> struct eval_binary_op<L, '-', R> : boost::mpl::minus<L, R>::type {}; 1132 1133[link getting_started_17 copy-paste friendly version] 1134 1135The first command declares the `eval_binary_op` metafunction. The first and 1136third arguments are the left and right operands and the second argument is the 1137operator. 1138 1139[note 1140Note that it does not satisfy the expectations of a 1141[link metafunction template metafunction] since it takes the operator as a 1142`char` and not as a `class` (or `typename`) argument. For simplicity, we will 1143still call it a metafunction. 1144] 1145 1146The second and third commands define the operation for the cases when the 1147operator is `+` and `-`. When the `eval_binary_op` metafunction is called, 1148the C++ compiler chooses one of the definitions based on the operator. If you 1149have functional programming experience this approach (pattern matching) might be 1150familiar to you. Let's try `eval_binary_op` out: 1151 1152 > eval_binary_op<boost::mpl::int_<11>, '+', boost::mpl::int_<2>>::type 1153 mpl_::integral_c<int, 13> 1154 > eval_binary_op<boost::mpl::int_<13>, '-', boost::mpl::int_<2>>::type 1155 mpl_::integral_c<int, 11> 1156 1157[link getting_started_18 copy-paste friendly version] 1158 1159You might also try to use it with an operator it does not expect (yet). For 1160example `'*'`. You will see the C++ compiler complaining about that the 1161requested version of the `eval_binary_op` template has not been defined. This 1162solution can be extended and support for the `'*'` operator can always be added 1163later. 1164 1165Let's write the [link metafunction metafunction] we can use from the folding 1166parser to evaluate the expressions using `+` and `-` operators. This takes two 1167arguments: 1168 1169* The partial result we have evaluated so far. (This used to be the summary we 1170 have evaluated so far, but we are making it a more general evaluation now). 1171 This is the left operand, a number. 1172* The result of parsing `(+|-) <number>`. This a `vector` containing two 1173 elements: a character representing the operator (`+` or `-`) and the value of 1174 the `<number>`. The number is the right operand. 1175 1176Let's write the [link metafunction metafunction] `binary_op` that takes these 1177arguments and calls `eval_binary_op`: 1178 1179 > template <class S, class Item> \ 1180 ...> struct binary_op : \ 1181 ...> eval_binary_op< \ 1182 ...> S, \ 1183 ...> boost::mpl::at_c<Item, 0>::type::value, \ 1184 ...> typename boost::mpl::at_c<Item, 1>::type \ 1185 ...> > \ 1186 ...> {}; 1187 1188[link getting_started_19 copy-paste friendly version] 1189 1190This [link metafunction metafunction] takes the operator (the first element) 1191and the right operand (the second element) from `Item`. The operator is a class 1192representing a character, such as `mpl_::char_<'+'>`. To get the character value 1193out of it, one has to access its `::value`. For example `mpl_::char<'+'>::value` 1194is `'+'`. Since `eval_binary_op` takes this character value as its second 1195argument, we had to pass `boost::mpl::at_c<Item, 0>::type::value` to it. Let's 1196try it out: 1197 1198 > binary_op<boost::mpl::int_<11>, boost::mpl::vector<boost::mpl::char_<'+'>, boost::mpl::int_<2>>>::type 1199 mpl_::integral_c<int, 13> 1200 1201We passed it a number (`11`) and a `vector` of a character (`+`) and another 1202number (`2`). It added the two numbers as expected. Let's use this function as 1203the third argument of [link foldl_start_with_parser `foldl_start_with_parser`]: 1204 1205 > using exp_parser13 = \ 1206 ...> build_parser< \ 1207 ...> foldl_start_with_parser< \ 1208 ...> sequence<one_of<plus_token, minus_token>, int_token>, \ 1209 ...> int_token, \ 1210 ...> boost::mpl::quote2<binary_op> \ 1211 ...> > \ 1212 ...> >; 1213 1214[link getting_started_20 copy-paste friendly version] 1215 1216It uses `binary_op` instead of `sum_items`. Let's try it out: 1217 1218 > exp_parser13::apply<BOOST_METAPARSE_STRING("1 + 2 - 3")>::type 1219 mpl_::integral_c<int, 0> 1220 1221It returns the correct result. 1222 1223[endsect] 1224 1225[endsect] 1226 1227[section 7. Dealing with precedence] 1228[note Note that you can find everything that has been included and defined so far [link before_7 here].] 1229 1230We support addition and subtraction. Let's support multiplication as well. 1231 1232[section 7.1. Adding support for the `*` operator] 1233[note Note that you can find everything that has been included and defined so far [link before_7_1 here].] 1234 1235We can extend the solution we have built for addition and subtraction. To do 1236that, we need to add support for multiplication to `eval_binary_op`: 1237 1238 > #include <boost/mpl/times.hpp> 1239 > template <class L, class R> struct eval_binary_op<L, '*', R> : boost::mpl::times<L, R>::type {}; 1240 1241[link getting_started_21 copy-paste friendly version] 1242 1243We had to include `<boost/mpl/times.hpp>` to get the `boost::mpl::times` 1244[link metafunction metafunction] and then we could extend `eval_binary_op` to 1245support the `*` operator as well. We can try it out: 1246 1247 > eval_binary_op<boost::mpl::int_<3>, '*', boost::mpl::int_<4>>::type 1248 mpl_::integral_c<int, 12> 1249 1250This works as expected. Let's create a token for parsing the `*` symbol: 1251 1252 > using times_token = token<lit_c<'*'>>; 1253 1254Now we can extend our parser to accept the `*` symbol as an operator: 1255 1256 > using exp_parser14 = \ 1257 ...> build_parser< \ 1258 ...> foldl_start_with_parser< \ 1259 ...> sequence<one_of<plus_token, minus_token, times_token>, int_token>, \ 1260 ...> int_token, \ 1261 ...> boost::mpl::quote2<binary_op> \ 1262 ...> > \ 1263 ...> >; 1264 1265[link getting_started_22 copy-paste friendly version] 1266 1267This version accepts either a `+`, a `-` or a `*` symbol as the operator. Let's 1268try this out: 1269 1270 > exp_parser14::apply<BOOST_METAPARSE_STRING("2 * 3")>::type 1271 mpl_::integral_c<int, 6> 1272 1273This works as expected. Let's try another, slightly more complicated expression: 1274 1275 > exp_parser14::apply<BOOST_METAPARSE_STRING("1 + 2 * 3")>::type 1276 mpl_::integral_c<int, 9> 1277 1278This returns a wrong result. The value of this expression should be `7`, not 1279`9`. The problem with this is that our current implementation does not take 1280operator precedence into account. It treats this expression as `(1 + 2) * 3` 1281while we expect it to be `1 + (2 * 3)` since addition has higher precedence than 1282multiplication. 1283 1284[endsect] 1285 1286[section 7.2. Adding support for precedence of operators] 1287[note Note that you can find everything that has been included and defined so far [link before_7_2 here].] 1288 1289Let's make it possible for different operators to have different precedence. To 1290do this, we define a new parser for parsing expressions containing only `*` 1291operators (that is the operator with the lowest precedence): 1292 1293 > using mult_exp1 = foldl_start_with_parser<sequence<times_token, int_token>, int_token, boost::mpl::quote2<binary_op>>; 1294 1295`mult_exp1` can parse expressions containing only `*` operator. For example 1296`3 * 2` or `6 * 7 * 8`. Now we can create a parser supporting only the `+` and 1297`-` operators but instead of separating ['numbers] with these operators we will 1298separate ['expressions containing only `*` operators]. This means that the 1299expression `1 * 2 + 3 * 4` is interpreted as the expressions `1 * 2` and `3 * 4` 1300separated by a `+` operator. A number (eg. `13`) is the special case of an 1301['expression containing only `*` operators]. 1302 1303Here is the parser implementing this: 1304 1305 > using exp_parser15 = \ 1306 ...> build_parser< \ 1307 ...> foldl_start_with_parser< \ 1308 ...> sequence<one_of<plus_token, minus_token>, mult_exp1>, \ 1309 ...> mult_exp1, \ 1310 ...> boost::mpl::quote2<binary_op> \ 1311 ...> > \ 1312 ...> >; 1313 1314[link getting_started_23 copy-paste friendly version] 1315 1316Note that this is almost the same as `exp_parser13`. The only difference is that 1317it uses `mult_exp1` everywhere, where `exp_parser13` was using `int_token`. 1318Let's try it out: 1319 1320 > exp_parser15::apply<BOOST_METAPARSE_STRING("1 + 2 * 3")>::type 1321 mpl_::integral_c<int, 7> 1322 1323This takes the precedence rules into account. The following diagram shows how it 1324works: 1325 1326[$images/metaparse/tutorial_diag8.png [width 80%]] 1327 1328Subexpressions using `*` operators only are evaluated (by `mult_exp1`) and 1329treated as single units while interpreting expressions using `+` and `-` 1330operators. Numbers not surrounded by `*` operators are treated also as operators 1331using `*` only (containing no operations but a number). 1332 1333If you need more layers (eg. introducing the `^` operator) you can extend this 1334solution with further layers. The order of the layers determine the precedence 1335of the operators. 1336 1337[endsect] 1338 1339[endsect] 1340 1341[section 8. Dealing with associativity] 1342[note Note that you can find everything that has been included and defined so far [link before_8 here].] 1343 1344Let's add division to our calculator language. Since it has the same precedence 1345as multiplication, it should be added to that layer: 1346 1347 > #include <boost/mpl/divides.hpp> 1348 > template <class L, class R> struct eval_binary_op<L, '/', R> : boost::mpl::divides<L, R>::type {}; 1349 > using divides_token = token<lit_c<'/'>>; 1350 > using mult_exp2 = \ 1351 ...> foldl_start_with_parser< \ 1352 ...> sequence<one_of<times_token, divides_token>, int_token>, \ 1353 ...> int_token, \ 1354 ...> boost::mpl::quote2<binary_op> \ 1355 ...> >; 1356 > using exp_parser16 = \ 1357 ...> build_parser< \ 1358 ...> foldl_start_with_parser< \ 1359 ...> sequence<one_of<plus_token, minus_token>, mult_exp2>, \ 1360 ...> mult_exp2, \ 1361 ...> boost::mpl::quote2<binary_op> \ 1362 ...> > \ 1363 ...> >; 1364 1365[link getting_started_24 copy-paste friendly version] 1366 1367We have to include `<boost/mpl/divides.hpp>` to get a 1368[link metafunction metafunction] for doing a division. We need to extend the 1369`eval_binary_op` [link metafunction metafunction] to support division as well. 1370We had to introduce a new token, `divides_token` that can parse the `/` symbol. 1371 1372We have extended `mult_exp1` to accept either a `times_token` or a 1373`divides_token` as the operator. This extended parser is called `mult_exp2`. 1374 1375We have written a new parser, `exp_parser16` which is the same as `exp_parser15` 1376but uses `mult_exp2` instead of `mult_exp1`. This can parse expressions using 1377division as well (and this new operator has the right precedence). Let's try it 1378out: 1379 1380 > exp_parser16::apply<BOOST_METAPARSE_STRING("8 / 4")>::type 1381 mpl_::integral_c<int, 2> 1382 1383This works as expected. But what should be the value of `8 / 4 / 2`? The answer 1384can be either `1` or `4` depending on the associativity of the division 1385operator. If it is left associative, then this expressions is interpreted as 1386`(8 / 4) / 2` and the result is `1`. If it is right associative, this 1387expression is interpreted as `8 / (4 / 2)` and the result is `4`. 1388 1389Try to guess which result our current implementation gives before trying it 1390out. Once you have verified the current behaviour, continue reading. 1391 1392[section 8.1. Understanding the current implementation] 1393[note Note that you can find everything that has been included and defined so far [link before_8_1 here].] 1394 1395Here is a diagram showing how our current parser processes the expression 1396`8 / 4 / 2`: 1397 1398[$images/metaparse/tutorial_diag8.png [width 70%]] 1399 1400It takes the first number, `8`, divides it by the second one, `4` and then it 1401divides the result with the third one, `2`. This means, that in our current 1402implementation, division is left associative: `8 / 4 / 2` means `(8 / 4) / 2`. 1403 1404Another thing to note is that the initial value is `8` and the list of values 1405[link foldl `foldl`] iterates over is "`/ 4`", "`/ 2`". 1406 1407[endsect] 1408 1409[section 8.2. Folding in reverse order] 1410[note Note that you can find everything that has been included and defined so far [link before_8_2 here].] 1411 1412[link foldl `foldl`] applies a parser repeatedly and iterates over the parsing 1413results from ['left] to right. (This is where the `l` in the name comes from). 1414Metaparse provides another folding parser combinator, [link foldr `foldr`]. It 1415applies a parser on the input as well but it iterates from ['right] to left over 1416the results. 1417 1418Similarly to [link foldl_start_with_parser `foldl_start_with_parser`], Metaparse 1419provides [link foldr_start_with_parser `foldr_start_with_parser`] as well. A 1420major difference between the two 1421([link foldl_start_with_parser `foldl_start_with_parser`] and 1422[link foldr_start_with_parser `foldr_start_with-parser`]) solutions is that 1423while [link foldl_start_with_parser `foldl_start_with_parser`] treats the 1424['first] number as a special one, 1425[link foldr_start_with_parser `foldr_start_with_parser`] treats the ['last] 1426number as a special one. This might sound strange, but think about it: if you 1427want to summarise the elements from right to left, your starting value should be 1428the last element, not the first one, as the first one is the one you visit last. 1429 1430Due to the above difference 1431[link foldr_start_with_parser `foldr_start_with_parser`] is not a drop-in 1432replacement of [link foldl_start_with_parser `foldl_start_with_parser`]. While 1433the list of values [link foldl `foldl`] was iterating over is "`8`", "`/ 4`", 1434"`/ 2`", the list of values [link foldr `foldlr`] has to iterate over is "`2`", 1435"`4 /`", "`8 /`". 1436 1437This means that the function we use to ['"add"] a new value to the already 1438evaluated part of the expression (this has been `binary_op` so far) has to be 1439prepared for taking the next operator and operand in a reverse order (eg. by 1440taking "`4 /`" instead of "`/ 4`"). We write another 1441[link metafunction metafunction] for this purpose: 1442 1443 > template <class S, class Item> \ 1444 ...> struct reverse_binary_op : \ 1445 ...> eval_binary_op< \ 1446 ...> typename boost::mpl::at_c<Item, 0>::type, \ 1447 ...> boost::mpl::at_c<Item, 1>::type::value, \ 1448 ...> S \ 1449 ...> > \ 1450 ...> {}; 1451 1452[link getting_started_25 copy-paste friendly version] 1453 1454There are multiple differences between `binary_op` and `reverse_binary_op`: 1455 1456* The `Item` argument, which is a `vector` is expected to be 1457 `[operator, operand]` in `binary_op` and `[operand, operator]` in 1458 `reverse_binary_op`. 1459* Both versions use `eval_binary_op` to evaluate the subexpression, but 1460 `binary_op` treats `S`, the value representing the already evaluated part of 1461 the expression as the left operand, while `reverse_binary_op` treats it as the 1462 right operand. This is because in the first case we are going from left to 1463 right while in the second case we are going from right to left. 1464 1465We need to include [link foldr_start_with_parser `foldr_start_with_parser`]: 1466 1467 > #include <boost/metaparse/foldr_start_with_parser.hpp> 1468 1469We can rewrite `mult_exp` using 1470[link foldr_start_with_parser `foldr_start_with_parser`]: 1471 1472 > using mult_exp3 = \ 1473 ...> foldr_start_with_parser< \ 1474 ...> sequence<int_token, one_of<times_token, divides_token>>, /* The parser applied repeatedly */ \ 1475 ...> int_token, /* The parser parsing the last number */ \ 1476 ...> boost::mpl::quote2<reverse_binary_op> /* The function called for every result */ \ 1477 ...> /* of applying the above parser */ \ 1478 ...> >; 1479 1480[link getting_started_26 copy-paste friendly version] 1481 1482It is almost the same as `mult_exp2`, but ... 1483 1484* ... the parser applied repeatedly parses `<number> <operator>` elements 1485 instead of `<operator> <number>` elements (what `mult_exp2` did). 1486* ... this version uses `reverse_binary_op` instead of `binary_op` as the 1487 function that is called for every result of applying the above parser. 1488 1489We can create a new version of `exp_parser` that uses `mult_exp3` instead of 1490`mult_exp2`: 1491 1492 > using exp_parser17 = \ 1493 ...> build_parser< \ 1494 ...> foldl_start_with_parser< \ 1495 ...> sequence<one_of<plus_token, minus_token>, mult_exp3>, \ 1496 ...> mult_exp3, \ 1497 ...> boost::mpl::quote2<binary_op> \ 1498 ...> > \ 1499 ...> >; 1500 1501[link getting_started_27 copy-paste friendly version] 1502 1503The only difference between `exp_parser17` and the previous version, 1504`exp_parser16` is that it uses the updated version of `mult_exp`. Let's try this 1505parser out: 1506 1507 > exp_parser17::apply<BOOST_METAPARSE_STRING("8 / 4 / 2")>::type 1508 mpl_::integral_c<int, 4> 1509 1510This version of the parser gives ['the other] possible result. The one you get 1511when division is right associative, which means that the above expression is 1512evaluated as `8 / (4 / 2)`. Here is a diagram showing how the 1513[link foldr_start_with_parser `foldr_start_with_parser`]-based solution works: 1514 1515[$images/metaparse/tutorial_diag10.png [width 70%]] 1516 1517To make it easier to compare the two solutions, here is a diagram showing the 1518two approaches side-by-side: 1519 1520[$images/metaparse/tutorial_diag11.png [width 100%]] 1521 1522As we have seen, the associativity of the operators can be controlled by 1523choosing between folding solutions. The folding solutions going from left to 1524right implement left associativity, while the solutions going from right to left 1525implement right associativity. 1526 1527[note 1528Note that folding solutions going from left to right is implemented in a more 1529efficient way than folding from right to left. Therefore when both solutions 1530can be used you should prefer folding from left to right. 1531] 1532 1533[endsect] 1534 1535[endsect] 1536 1537[section 9. Dealing with unary operators] 1538[note Note that you can find everything that has been included and defined so far [link before_9 here].] 1539 1540Our calculator language provides no direct support for negative numbers. To get 1541a negative number, we need to do a subtraction. For example to get the number 1542`-13` we need to evaluate the expression `0 - 13`. 1543 1544We will implement `-` as a unary operator. Therefore the expression `-13` won't 1545be a ['negative number]. It will be the unary `-` operator applied on the number 1546`13`. 1547 1548Since `-` is an operator, it might be used multiple times. So the expression 1549`---13` is also valid and gives the same result as `-13`. This means that any 1550number of `-` symbols are valid before a number. 1551 1552Our parser can be extended to support the unary `-` operator by adding a new 1553layer to the list of precedence layers. This should have the lowest precedence, 1554which means that we should use this new layer where we have been using 1555`int_token`. Let's write a new parser: 1556 1557 > #include <boost/mpl/negate.hpp> 1558 > using unary_exp1 = \ 1559 ...> foldr_start_with_parser< \ 1560 ...> minus_token, \ 1561 ...> int_token, \ 1562 ...> boost::mpl::lambda<boost::mpl::negate<boost::mpl::_1>>::type \ 1563 ...> >; 1564 1565[link getting_started_28 copy-paste friendly version] 1566 1567We had to include `<boost/mpl/negate.hpp>` to get a 1568[link metafunction metafunction] we can negate a value with. 1569 1570`unary_exp1` is implemented with right to left folding: it starts from the 1571number (parsed by `int_token`) and processes the `-` symbols one by one. The 1572function to be called for each `-` symbol is a lambda expression that negates 1573the number. So the number is negated for every `-` symbol. 1574 1575We can implement a new version of `mult_exp` and `exp_parser`. They are the same 1576as `mult_exp2` and `exp_parser16`. The only difference is that they (directly 1577only `exp_parser18`) use `unary_exp1` instead of `int_token`. 1578 1579 > using mult_exp4 = \ 1580 ...> foldl_start_with_parser< \ 1581 ...> sequence<one_of<times_token, divides_token>, unary_exp1>, \ 1582 ...> unary_exp1, \ 1583 ...> boost::mpl::quote2<binary_op> \ 1584 ...> >; 1585 > using exp_parser18 = \ 1586 ...> build_parser< \ 1587 ...> foldl_start_with_parser< \ 1588 ...> sequence<one_of<plus_token, minus_token>, mult_exp4>, \ 1589 ...> mult_exp4, \ 1590 ...> boost::mpl::quote2<binary_op> \ 1591 ...> > \ 1592 ...> >; 1593 1594[link getting_started_29 copy-paste friendly version] 1595 1596Let's try these new parsers out: 1597 1598 > exp_parser18::apply<BOOST_METAPARSE_STRING("---13")>::type 1599 mpl_::integral_c<int, -13> 1600 > exp_parser18::apply<BOOST_METAPARSE_STRING("13")>::type 1601 mpl_::integral_c<int, 13> 1602 1603[link getting_started_30 copy-paste friendly version] 1604 1605It can deal with negative numbers correctly. 1606 1607[endsect] 1608 1609[section 10. Dealing with parens] 1610 1611Our parsers already support the precedence of the different operators. Let's add 1612support for parens as well, so users can override the precedence rules when they 1613need to. 1614 1615We can add a new parser for parsing (and evaluating) expressions in parens. 1616First we introduce tokens for parsing the `(` and `)` symbols: 1617 1618 > using lparen_token = token<lit_c<'('>>; 1619 > using rparen_token = token<lit_c<')'>>; 1620 1621[link getting_started_31 copy-paste friendly version] 1622 1623A paren can contain an expression with any operators in it, so we add a parser 1624for parsing (and evaluating) an expression containing operators of the highest 1625precedence: 1626 1627 > using plus_exp1 = \ 1628 ...> foldl_start_with_parser< \ 1629 ...> sequence<one_of<plus_token, minus_token>, mult_exp4>, \ 1630 ...> mult_exp4, \ 1631 ...> boost::mpl::quote2<binary_op> \ 1632 ...> >; 1633 1634[link getting_started_32 copy-paste friendly version] 1635 1636This was just a refactoring of our last parser for the calculator language. We 1637can build the parser for our calculator language by using 1638[link build_parser `build_parser`]`<plus_exp1>` now. Let's write a parser for a 1639paren expression: 1640 1641 > using paren_exp1 = sequence<lparen_token, plus_exp1, rparen_token>; 1642 1643This definition parses a left paren, then a complete expression followed by a 1644right paren. The result of parsing a paren expression is a `vector` of three 1645elements: the `(` character, the value of the expression and the `)` character. 1646We only need the value of the expression, which is the middle element. We could 1647wrap the whole thing with a [link transform `transform`] that gets the middle 1648element and throws the rest away, but we don't need to. This is such a common 1649pattern, that Metaparse provides [link middle_of `middle_of`] for this: 1650 1651 > #include <boost/metaparse/middle_of.hpp> 1652 > using paren_exp2 = middle_of<lparen_token, plus_exp1, rparen_token>; 1653 1654[link getting_started_33 copy-paste friendly version] 1655 1656This implementation is almost the same as `paren_exp1`. The difference is that 1657the result of parsing will be the value of the wrapped expression (the result of 1658the `plus_exp1` parser). 1659 1660Let's define a parser for a primary expression which is either a number or an 1661expression in parens: 1662 1663 > using primary_exp1 = one_of<int_token, paren_exp2>; 1664 1665This parser accepts either a number using `int_token` or an expression in parens 1666using `paren_exp1`. 1667 1668Everywhere, where one can write a number (parsed by `int_token`), one can write 1669a complete expression in parens as well. Our current parser implementation 1670parses `int_token`s in `unary_exp`, therefore we need to change that to use 1671`primary_exp` instead of `int_token`. 1672 1673There is a problem here: this makes the definitions of our parsers ['recursive]. 1674Think about it: 1675 1676* `plus_exp` uses `mult_exp` 1677* `mult_exp` uses `unary_exp` 1678* `unary_exp` uses `primary_exp` 1679* `primary_exp` uses `paren_exp` 1680* `paren_exp` uses `plus_exp` 1681 1682[note 1683Since we are versioning the different parser implementations in Metashell 1684(`paren_exp1`, `paren_exp2`, etc) you might try to define these recursive 1685parsers and it might seem to work for the first time. In that case, when you 1686later try creating a parser as part of a library (save your Metashell 1687environment to a file or re-implement the important/successful elements) you 1688face this issue. 1689] 1690 1691We have been using type aliases (`typedef` and `using`) for defining the 1692parsers. We can do it as long as their definition is not recursive. We can not 1693refer to a type alias until we have defined it and type aliases can not be 1694forward declared, so we can't find a point in the recursive cycle where we could 1695start defining things. 1696 1697A solution for this is making one of the parsers a new class instead of a type 1698alias. Classes can be forward declared, therefore we can declare the class, 1699implement the rest of the parsers as they can refer to that class and then 1700define the class at the end. 1701 1702Let's make `plus_exp` a class. So as a first step, let's forward declare it: 1703 1704 > struct plus_exp2; 1705 1706Now we can write the rest of the parsers and they can refer to `plus_exp2`: 1707 1708 > using paren_exp3 = middle_of<lparen_token, plus_exp2, rparen_token>; 1709 > using primary_exp2 = one_of<int_token, paren_exp2>; 1710 > using unary_exp2 = \ 1711 ...> foldr_start_with_parser< \ 1712 ...> minus_token, \ 1713 ...> primary_exp2, \ 1714 ...> boost::mpl::lambda<boost::mpl::negate<boost::mpl::_1>>::type \ 1715 ...> >; 1716 > using mult_exp5 = \ 1717 ...> foldl_start_with_parser< \ 1718 ...> sequence<one_of<times_token, divides_token>, unary_exp2>, \ 1719 ...> unary_exp2, \ 1720 ...> boost::mpl::quote2<binary_op> \ 1721 ...> >; 1722 1723[link getting_started_34 copy-paste friendly version] 1724 1725There is nothing new in the definition of these parsers. They build up the 1726hierarchy we have worked out in the earlier sections of this tutorial. The only 1727element missing is `plus_exp2`: 1728 1729 > struct plus_exp2 : \ 1730 ...> foldl_start_with_parser< \ 1731 ...> sequence<one_of<plus_token, minus_token>, mult_exp5>, \ 1732 ...> mult_exp5, \ 1733 ...> boost::mpl::quote2<binary_op> \ 1734 ...> > {}; 1735 1736[link getting_started_35 copy-paste friendly version] 1737 1738This definition makes use of inheritance instead of type aliasing. Now we can 1739write the parser for the calculator that supports parens as well: 1740 1741 > using exp_parser19 = build_parser<plus_exp2>; 1742 1743Let's try this parser out: 1744 1745 > exp_parser19::apply<BOOST_METAPARSE_STRING("(1 + 2) * 3")>::type 1746 mpl_::integral_c<int, 9> 1747 1748Our parser accepts and can deal with parens in the expressions. 1749 1750[endsect] 1751 1752[#dealing_with_invalid_input] 1753[section 11. Dealing with invalid input] 1754 1755So far we have been focusing on parsing valid user input. However, users of our 1756parsers will make mistakes and we should help them finding the source of the 1757problem. And we should make this process not too painful. 1758 1759The major difficulty in error reporting is that we have no direct way of showing 1760error messages to the user. The parsers are template metaprograms. When they 1761detect that the input is invalid, they can make the compilation fail and the 1762compiler (running the metaprogram) display an error message. What we can do is 1763making those error messages short and contain all information about the parsing 1764error. We should make it easy to find this information in whatever the compiler 1765displays. 1766 1767So let's try to parse some invalid expression and let's see what happens: 1768 1769 > exp_parser19::apply<BOOST_METAPARSE_STRING("hello")>::type 1770 << compilation error >> 1771 1772You will get a lot (if you have seen error messages coming from template 1773metaprograms you know: this is ['not] a lot.) of error messages. Take a closer 1774look. It contains this: 1775 1776 x__________________PARSING_FAILED__________________x< 1777 1, 1, 1778 boost::metaparse::v1::error::literal_expected<'('> 1779 > 1780 1781You can see a formatted version above. There are no line breaks in the real 1782output. This is relatively easy to spot (thanks to the `____________` part) and 1783contains answers to the main questions one has when parsing fails: 1784 1785* ['where] is the error? It is column `1` in line `1` (inside 1786 [link BOOST_METAPARSE_STRING `BOOST_METAPARSE_STRING`]). This is the `1, 1` 1787 part. 1788* ['what] is the problem? `literal_expected<'('>`. This is a bit misleading, as 1789 it contains only a part of the problem. An open paren is not the only 1790 acceptable token here, a number would also be fine. This misleading error 1791 message is ['our] fault: ['we] (the parser authors) need to make the parsing 1792 errors more descriptive. 1793 1794[section 11.1. Improving the error messages] 1795 1796So how can we improve the error messages? Let's look at what went wrong in the 1797previous case: 1798 1799* The input was `hello`. 1800* `plus_exp2` tried to parse it. 1801* `plus_exp2` tried to parse it using `mult_exp5` (assuming that this is the 1802 initial `mult_exp` in the list of `+` / `-` separated `mult_exp`s). 1803* so `mult_exp5` tried to parse it. 1804* `mult_exp5` tried to parse it using `unary_exp2` (assuming that this is the 1805 initial `unary_exp` in the list of `*` / `/` separated `unary_exp`s). 1806* so `unary_exp2` tried to parse it. 1807* `unary_exp2` parsed all of the `-` symbols using `minus_token`. There were 1808 none of them (the input started with an `h` character). 1809* so `unary_exp2` tried to parse it using `primary_exp2`. 1810* `primary_exp2` is: [link one_of `one_of`]`<int_token, paren_exp2>`. It tried 1811 parsing the input with `int_token` (which failed) and then with `paren_exp2` 1812 (which failed as well). So [link one_of `one_of`] could not parse the input 1813 with any of the choices and therefore it failed as well. In such situations 1814 `one_of` checks which parser made the most progress (consumed the most 1815 characters of the input) before failing and assumes, that that is the parser 1816 the user intended to use, thus it returns the error message coming from that 1817 parser. In this example none of the parsers could make any progress, in which 1818 case `one_of` returns the error coming from the last parser in the list. This 1819 was `paren_exp2`, and it expects the expression to start with an open paren. 1820 This is where the error message came from. The rest of the layers did not 1821 change or improve this error message so this was the error message displayed 1822 to the user. 1823 1824We, the parser authors know: we expect a primary expression there. When 1825[link one_of `one_of`] fails, it means that none was found. 1826 1827[endsect] 1828 1829[section 11.2. Defining custom errors] 1830 1831To be able to return custom error messages (like `missing_primary_expression`) 1832to the user, we need to define those error messages first. The error messages 1833are represented by classes with some requirements: 1834 1835* It should have a static method called `get_value()` returning a `std::string` 1836 containing the description of the error. 1837* It should be a [link metaprogramming_value template metaprogramming value]. 1838 1839These classes are called [link parsing_error_message parsing error message]s. 1840To make it easy to implement such classes and to make it difficult (if not 1841impossible) to forget to fulfill a requirement, Metaparse provides a macro for 1842defining these classes. To get this macro, include the following header: 1843 1844 > #include <boost/metaparse/define_error.hpp> 1845 1846Let's define the [link parsing_error_message parsing error message]: 1847 1848 > BOOST_METAPARSE_DEFINE_ERROR(missing_primary_expression, "Missing primary expression"); 1849 1850This defines a class called `missing_primary_expression` representing this error 1851message. What we need to do is making our parser return this error message when 1852[link one_of `one_of`] fails. 1853 1854Let's define `plus_exp` and `paren_exp` first. Their definition does not change: 1855 1856 > struct plus_exp3; 1857 > using paren_exp4 = middle_of<lparen_token, plus_exp3, rparen_token>; 1858 1859[link getting_started_36 copy-paste friendly version] 1860 1861When the input contains no number (parsed by `int_token`) and no paren 1862expression (parsed by `paren_exp4`), we should return the 1863`missing_primary_expression` error message. We can do it by adding a third 1864parser to `one_of<int_token, paren_exp4, ...>` which always fails with this 1865error message. Metaparse provides [link fail `fail`] for this: 1866 1867 > #include <boost/metaparse/fail.hpp> 1868 1869Now we can define the `primary_exp` parser using it: 1870 1871 > using primary_exp3 = one_of<int_token, paren_exp4, fail<missing_primary_expression>>; 1872 1873It adds [link fail `fail`]`<missing_primary_expression>` to `one_of` as the 1874last element. Therefore if none of the "real" cases parse the input ['and] none 1875of them makes any progress before failing, the error message will be 1876`missing_primary_expression`. 1877 1878We need to define the rest of the parsers. Their definition is the same as 1879before: 1880 1881 > using unary_exp3 = \ 1882 ...> foldr_start_with_parser< \ 1883 ...> minus_token, \ 1884 ...> primary_exp3, \ 1885 ...> boost::mpl::lambda<boost::mpl::negate<boost::mpl::_1>>::type \ 1886 ...> >; 1887 > using mult_exp6 = \ 1888 ...> foldl_start_with_parser< \ 1889 ...> sequence<one_of<times_token, divides_token>, unary_exp3>, \ 1890 ...> unary_exp3, \ 1891 ...> boost::mpl::quote2<binary_op> \ 1892 ...> >; 1893 > struct plus_exp3 : \ 1894 ...> foldl_start_with_parser< \ 1895 ...> sequence<one_of<plus_token, minus_token>, mult_exp6>, \ 1896 ...> mult_exp6, \ 1897 ...> boost::mpl::quote2<binary_op> \ 1898 ...> > {}; 1899 > using exp_parser20 = build_parser<plus_exp3>; 1900 1901[link getting_started_37 copy-paste friendly version] 1902 1903We can try to give our new parser an invalid input: 1904 1905 > exp_parser20::apply<BOOST_METAPARSE_STRING("hello")>::type 1906 << compilation error >> 1907 ..... x__________________PARSING_FAILED__________________x<1, 1, missing_primary_expression> .... 1908 << compilation error >> 1909 1910The error message is now more specific to the calculator language. This covers 1911only one case, where the error messages can be improved. Other cases (eg. 1912missing closing parens, missing operators, etc) can be covered in a similar way. 1913 1914[endsect] 1915 1916[section 11.3. Missing closing parens] 1917 1918Missing closing parens are common errors. Let's see how our parsers report them: 1919 1920 > exp_parser20::apply<BOOST_METAPARSE_STRING("(1+2")>::type 1921 << compilation error >> 1922 ..... x__________________PARSING_FAILED__________________x<1, 5, unpaired<1, 1, literal_expected<')'>>> .... 1923 << compilation error >> 1924 1925The parser could detect that there is a missing paren and the error report 1926points to the open paren which is not closed. This looks great, but we are not 1927done yet. Let's try a slightly more complex input: 1928 1929 > exp_parser20::apply<BOOST_METAPARSE_STRING("0+(1+2")>::type 1930 mpl_::integral_c<int, 0> 1931 1932This is getting strange now. We parse the `+ <mult_exp>` elements using 1933[link foldl_start_with_parser `foldl_start_with_parser`] (see the definition of 1934`plus_exp3`). [link foldl_start_with_parser `foldl_start_with_parser`] parses 1935the input as long as it can and stops when it fails to parse it. In the above 1936input, it parses `0` as the initial element and then it tries to parse the first 1937`+ <mult_exp>` element. But parsing the `<mult_exp>` part fails because of the 1938missing closing paren. So 1939[link foldl_start_with_parser `foldl_start_with_parser`] stops and ignores this 1940failing part of the input. 1941 1942The result of the above is that we parse only the `0` part of the input, ignore 1943the "garbage" at the end and assume that the value of the expression is `0`. 1944This could be fixed by using [link entire_input `entire_input`]. Our parser 1945would reject the input (because of the "garbage" at the end), but the error 1946message would not be useful. So we take a different approach. 1947 1948When [link foldl_start_with_parser `foldl_start_with_parser`] stops, we should 1949check if there is an extra broken `+ <mult_exp>` there or not. When there is, we 1950should report what is wrong with that broken `+ <mult_exp>` (eg. a missing 1951closing paren). Metaparse provides [link fail_at_first_char_expected 1952`fail_at_first_char_expected`] to implement such validations. 1953[link fail_at_first_char_expected `fail_at_first_char_expected`]`<parser>` 1954checks how `parser` fails to parse the input: when it fails right at the first 1955character, [link fail_at_first_char_expected `fail_at_first_char_expected`] 1956assumes that there is no garbage and accepts the input. When `parser` consumes 1957characters from the input before failing, 1958[link fail_at_first_char_expected `fail_at_first_char_expected`] assumes that 1959there is a broken expression and propagates the error. It can be used the 1960following way: 1961 1962 > #include <boost/metaparse/fail_at_first_char_expected.hpp> 1963 > #include <boost/metaparse/first_of.hpp> 1964 > struct plus_exp4 : \ 1965 ...> first_of< \ 1966 ...> foldl_start_with_parser< \ 1967 ...> sequence<one_of<plus_token, minus_token>, mult_exp6>, \ 1968 ...> mult_exp6, \ 1969 ...> boost::mpl::quote2<binary_op> \ 1970 ...> >, \ 1971 ...> fail_at_first_char_expected< \ 1972 ...> sequence<one_of<plus_token, minus_token>, mult_exp6> \ 1973 ...> > \ 1974 ...> > {}; 1975 > using exp_parser21 = build_parser<plus_exp4>; 1976 1977[link getting_started_38 copy-paste friendly version] 1978 1979[link first_of `first_of`] is similar to [link middle_of `middle_of`], but 1980keeps the result of the first element, not the middle one. We use it to keep the 1981"real" result (the result of 1982[link foldl_start_with_parser `foldl_start_with_parser`]) and to throw the dummy 1983result coming from 1984[link fail_at_first_char_expected `fail_at_first_char_expected`] away when 1985there is no broken expression at the end. [link first_of `first_of`] propagates 1986any error coming from 1987[link fail_at_first_char_expected `fail_at_first_char_expected`]. 1988 1989Let's try this new expression parser out with a missing closing paren: 1990 1991 > exp_parser21::apply<BOOST_METAPARSE_STRING("0+(1+2")>::type 1992 << compilation error >> 1993 ..... x__________________PARSING_FAILED__________________x<1, 7, unpaired<1, 3, literal_expected<')'>>> .... 1994 << compilation error >> 1995 1996This works as expected now: it tells us that there is a missing paren and it 1997points us the open paren which is not closed. 1998 1999[section 11.3.1. Simplifying the parser] 2000 2001Our parser provides useful error messages for missing closing parens, however, 2002the implementation of the parser (`plus_exp4`) is long and repetitive: it 2003contains the parser for the repeated element 2004([link sequence `sequence`]`<`[link one_of `one_of`]`<plus_token, minus_token>, mult_exp6>`) twice, and that is not ideal. 2005 2006`plus_exp4` uses [link foldl_start_with_parser `foldl_start_with_parser`] to 2007implement repetition. Metaparse provides 2008[link foldl_reject_incomplete_start_with_parser `foldl_reject_incomplete_start_with_parser`] 2009which does the same we did with [link first_of `first_of`], 2010[link foldl_start_with_parser `foldl_start_with_parser`] and 2011[link fail_at_first_char_expected `fail_at_first_char_expected`] together: 2012 2013 > #include <boost/metaparse/foldl_reject_incomplete_start_with_parser.hpp> 2014 > struct plus_exp5 : \ 2015 ...> foldl_reject_incomplete_start_with_parser< \ 2016 ...> sequence<one_of<plus_token, minus_token>, mult_exp6>, \ 2017 ...> mult_exp6, \ 2018 ...> boost::mpl::quote2<binary_op> \ 2019 ...> > {}; 2020 > using exp_parser22 = build_parser<plus_exp5>; 2021 2022[link getting_started_39 copy-paste friendly version] 2023 2024It parses the input using 2025[link sequence `sequence`]`<`[link one_of `one_of`]`<plus_token, minus_token>, mult_exp6>`) 2026repeatedly. When it fails, 2027[link foldl_reject_incomplete_start_with_parser `foldl_reject_incomplete_start_with_parser`] 2028checks if it consumed any character before failing (the same as what 2029[link fail_at_first_char_expected `fail_at_first_char_expected`] does), and if 2030yes, then 2031[link foldl_reject_incomplete_start_with_parser `foldl_reject_incomplete_start_with_parser`] 2032fails. 2033 2034This makes the implementation of the repetition with advanced error reporting 2035simpler. Let's try it out: 2036 2037 > exp_parser22::apply<BOOST_METAPARSE_STRING("0+(1+2")>::type 2038 << compilation error >> 2039 ..... x__________________PARSING_FAILED__________________x<1, 7, unpaired<1, 3, literal_expected<')'>>> .... 2040 << compilation error >> 2041 2042Note that other folding parsers have their `f` versions as well (eg. 2043[link foldr_reject_incomplete `foldr_reject_incomplete`], 2044[link foldl_reject_incomplete1 `foldl_reject_incomplete1`], etc). 2045 2046[endsect] 2047[section 11.3.2. Using `foldl_reject_incomplete_start_with_parser` at other places as well] 2048 2049We have replaced one [link foldl_start_with_parser `foldl_start_with_parser`] 2050with 2051[link foldl_reject_incomplete_start_with_parser `foldl_reject_incomplete_start_with_parser`]. 2052Other layers (`mult_exp`, `unary_exp`, etc) use folding as well. Let's use it at 2053all layers: 2054 2055 > struct plus_exp6; 2056 > using paren_exp5 = middle_of<lparen_token, plus_exp6, rparen_token>; 2057 > using primary_exp4 = one_of<int_token, paren_exp5, fail<missing_primary_expression>>; 2058 > using unary_exp4 = \ 2059 ...> foldr_start_with_parser< \ 2060 ...> minus_token, \ 2061 ...> primary_exp4, \ 2062 ...> boost::mpl::lambda<boost::mpl::negate<boost::mpl::_1>>::type \ 2063 ...> >; 2064 > using mult_exp7 = \ 2065 ...> foldl_reject_incomplete_start_with_parser< \ 2066 ...> sequence<one_of<times_token, divides_token>, unary_exp4>, \ 2067 ...> unary_exp4, \ 2068 ...> boost::mpl::quote2<binary_op> \ 2069 ...> >; 2070 > struct plus_exp6 : \ 2071 ...> foldl_reject_incomplete_start_with_parser< \ 2072 ...> sequence<one_of<plus_token, minus_token>, mult_exp7>, \ 2073 ...> mult_exp7, \ 2074 ...> boost::mpl::quote2<binary_op> \ 2075 ...> > {}; 2076 > using exp_parser23 = build_parser<plus_exp6>; 2077 2078[link getting_started_40 copy-paste friendly version] 2079 2080[note 2081Note that `unary_exp4` uses 2082[link foldr_start_with_parser `foldr_start_with_parser`] instead of 2083`foldr_reject_incomplete_start_with_parser`. The reason behind it is that there 2084is no `foldr_reject_incomplete_start_with_parser`. 2085[link foldr_start_with_parser `foldr_start_with_parser`] applies the 2086`primary_exp4` parser when `minus_token` does not accept the input any more. 2087Therefore, it is supposed to catch the errors of incomplete expressions after 2088the repetition. 2089] 2090 2091Let's try different invalid expressions: 2092 2093 > exp_parser23::apply<BOOST_METAPARSE_STRING("1+(2*")>::type 2094 << compilation error >> 2095 ..... x__________________PARSING_FAILED__________________x<1, 6, missing_primary_expression> .... 2096 << compilation error >> 2097 2098 > exp_parser23::apply<BOOST_METAPARSE_STRING("1+(2*3")>::type 2099 << compilation error >> 2100 ..... x__________________PARSING_FAILED__________________x<1, 7, unpaired<1, 3, literal_expected<')'>>> .... 2101 << compilation error >> 2102 2103[endsect] 2104 2105[endsect] 2106 2107[endsect] 2108 2109[section 12. Summary] 2110 2111This tutorial showed you how to build a parser for a calculator language. Now 2112that you understand how to do this, you should be able to use the same 2113techniques and building blocks presented here to build a parser for your own 2114language. You should start building the parser and once you face a problem (eg. 2115you need to add parens or you need better error messages) you can always return 2116to this tutorial and read the section showing you how to deal with those 2117situations. 2118 2119[endsect] 2120 2121[section Copy-paste friendly code examples] 2122 2123[include getting_started_0.qbk] 2124[include getting_started_1.qbk] 2125[include getting_started_2.qbk] 2126[include getting_started_3.qbk] 2127[include getting_started_4.qbk] 2128[include getting_started_5.qbk] 2129[include getting_started_6.qbk] 2130[include getting_started_7.qbk] 2131[include getting_started_8.qbk] 2132[include getting_started_9.qbk] 2133[include getting_started_10.qbk] 2134[include getting_started_11.qbk] 2135[include getting_started_12.qbk] 2136[include getting_started_13.qbk] 2137[include getting_started_14.qbk] 2138[include getting_started_15.qbk] 2139[include getting_started_16.qbk] 2140[include getting_started_17.qbk] 2141[include getting_started_18.qbk] 2142[include getting_started_19.qbk] 2143[include getting_started_20.qbk] 2144[include getting_started_21.qbk] 2145[include getting_started_22.qbk] 2146[include getting_started_23.qbk] 2147[include getting_started_24.qbk] 2148[include getting_started_25.qbk] 2149[include getting_started_26.qbk] 2150[include getting_started_27.qbk] 2151[include getting_started_28.qbk] 2152[include getting_started_29.qbk] 2153[include getting_started_30.qbk] 2154[include getting_started_31.qbk] 2155[include getting_started_32.qbk] 2156[include getting_started_33.qbk] 2157[include getting_started_34.qbk] 2158[include getting_started_35.qbk] 2159[include getting_started_36.qbk] 2160[include getting_started_37.qbk] 2161[include getting_started_38.qbk] 2162[include getting_started_39.qbk] 2163[include getting_started_40.qbk] 2164 2165[endsect] 2166 2167[section Definitions before each section] 2168 2169[include before_3.qbk] 2170[include before_3_1.qbk] 2171[include before_3_2.qbk] 2172[include before_3_3.qbk] 2173[include before_4.qbk] 2174[include before_4_1.qbk] 2175[include before_4_2.qbk] 2176[include before_5.qbk] 2177[include before_5_1.qbk] 2178[include before_5_2.qbk] 2179[include before_5_2_1.qbk] 2180[include before_5_2_2.qbk] 2181[include before_5_2_3.qbk] 2182[include before_5_2_4.qbk] 2183[include before_6.qbk] 2184[include before_6_1.qbk] 2185[include before_6_2.qbk] 2186[include before_7.qbk] 2187[include before_7_1.qbk] 2188[include before_7_2.qbk] 2189[include before_8.qbk] 2190[include before_8_1.qbk] 2191[include before_8_2.qbk] 2192[include before_9.qbk] 2193[include before_10.qbk] 2194[include before_11.qbk] 2195[include before_11_1.qbk] 2196[include before_11_2.qbk] 2197[include before_11_3.qbk] 2198[include before_11_3_1.qbk] 2199[include before_11_3_2.qbk] 2200[include before_12.qbk] 2201 2202[endsect] 2203 2204[endsect] 2205 2206