1[/ 2 / Copyright (c) 2007 Eric Niebler 3 / 4 / Distributed under the Boost Software License, Version 1.0. (See accompanying 5 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) 6 /] 7 8 9[/talk about: 10 11 tag types 12 generator metafunctions 13 accessing child nodes 14 metafunctions for tag type and arity 15 deep_copy 16 flatten 17 debug utilities 18 introspection with grammars and proto::matches 19] 20 21[/================================================================================] 22[section:intermediate_form Intermediate Form: 23 Understanding and Introspecting Expressions] 24[/================================================================================] 25 26By now, you know a bit about how to build a front-end for your EDSL "compiler" -- you can define terminals and functions that generate expression templates. But we haven't said anything about the expression templates themselves. What do they look like? What can you do with them? In this section we'll see. 27 28[/=========================] 29[heading The [^expr<>] Type] 30[/=========================] 31 32All Proto expressions are an instantiation of a template called _expr_ (or a wrapper around such an instantiation). When we define a terminal as below, we are really initializing an instance of the _expr_ template. 33 34 // Define a placeholder type 35 template<int I> 36 struct placeholder 37 {}; 38 39 // Define the Protofied placeholder terminal 40 proto::terminal< placeholder<0> >::type const _1 = {{}}; 41 42The actual type of `_1` looks like this: 43 44 proto::expr< proto::tag::terminal, proto::term< placeholder<0> >, 0 > 45 46The _expr_ template is the most important type in Proto. Although you will 47rarely need to deal with it directly, it's always there behind the scenes 48holding your expression trees together. In fact, _expr_ /is/ the expression 49tree -- branches, leaves and all. 50 51The _expr_ template makes up the nodes in expression trees. The first template 52parameter is the node type; in this case, `proto::tag::terminal`. That means 53that `_1` is a leaf-node in the expression tree. The second template parameter 54is a list of child types, or in the case of terminals, the terminal's value 55type. Terminals will always have only one type in the type list. The last 56parameter is the arity of the expression. Terminals have arity 0, unary 57expressions have arity 1, etc. 58 59The _expr_ struct is defined as follows: 60 61 template< typename Tag, typename Args, long Arity = Args::arity > 62 struct expr; 63 64 template< typename Tag, typename Args > 65 struct expr< Tag, Args, 1 > 66 { 67 typedef typename Args::child0 proto_child0; 68 proto_child0 child0; 69 // ... 70 }; 71 72The _expr_ struct does not define a constructor, or anything else that would 73prevent static initialization. All _expr_ objects are initialized using 74['aggregate initialization], with curly braces. In our example, `_1` is 75initialized with the initializer `{{}}`. The outer braces are the initializer 76for the _expr_ struct, and the inner braces are for the member `_1.child0` which 77is of type `placeholder<0>`. Note that we use braces to initialize `_1.child0` 78because `placeholder<0>` is also an aggregate. 79 80[/================================] 81[heading Building Expression Trees] 82[/================================] 83 84The `_1` node is an instantiation of _expr_, and expressions containing 85`_1` are also instantiations of _expr_. To use Proto effectively, you 86won't have to bother yourself with the actual types that Proto generates. 87These are details, but you're likely to encounter these types in compiler 88error messages, so it's helpful to be familiar with them. The types look 89like this: 90 91 // The type of the expression -_1 92 typedef 93 proto::expr< 94 proto::tag::negate 95 , proto::list1< 96 proto::expr< 97 proto::tag::terminal 98 , proto::term< placeholder<0> > 99 , 0 100 > const & 101 > 102 , 1 103 > 104 negate_placeholder_type; 105 106 negate_placeholder_type x = -_1; 107 108 // The type of the expression _1 + 42 109 typedef 110 proto::expr< 111 proto::tag::plus 112 , proto::list2< 113 proto::expr< 114 proto::tag::terminal 115 , proto::term< placeholder<0> > 116 , 0 117 > const & 118 , proto::expr< 119 proto::tag::terminal 120 , proto::term< int const & > 121 , 0 122 > 123 > 124 , 2 125 > 126 placeholder_plus_int_type; 127 128 placeholder_plus_int_type y = _1 + 42; 129 130There are a few things to note about these types: 131 132* Terminals have arity zero, unary expressions have arity one and binary 133 expressions have arity two. 134* When one Proto expression is made a child node of another Proto expression, 135 it is held by reference, ['even if it is a temporary object]. This last 136 point becomes important later. 137* Non-Proto expressions, such as the integer literal, are turned into Proto 138 expressions by wrapping them in new `expr<>` terminal objects. These new 139 wrappers are not themselves held by reference, but the object wrapped /is/. 140 Notice that the type of the Protofied `42` literal is `int const &` -- held 141 by reference. 142 143The types make it clear: everything in a Proto expression tree is held by 144reference. That means that building an expression tree is exceptionally cheap. 145It involves no copying at all. 146 147[note An astute reader will notice that the object `y` defined above will be 148left holding a dangling reference to a temporary int. In the sorts of 149high-performance applications Proto addresses, it is typical to build and 150evaluate an expression tree before any temporary objects go out of scope, so 151this dangling reference situation often doesn't arise, but it is certainly 152something to be aware of. Proto provides utilities for deep-copying expression 153trees so they can be passed around as value types without concern for dangling 154references.] 155 156[/========================================================] 157[section:left_right_child Accessing Parts of an Expression] 158[/========================================================] 159 160After assembling an expression into a tree, you'll naturally want to be 161able to do the reverse, and access a node's children. You may even want 162to be able to iterate over the children with algorithms from the 163Boost.Fusion library. This section shows how. 164 165[/==========================================] 166[heading Getting Expression Tags and Arities] 167[/==========================================] 168 169Every node in an expression tree has both a /tag/ type that describes the node, and an /arity/ corresponding to the number of child nodes it has. You can use the _tag_of_ and _arity_of_ metafunctions to fetch them. Consider the following: 170 171 template<typename Expr> 172 void check_plus_node(Expr const &) 173 { 174 // Assert that the tag type is proto::tag::plus 175 BOOST_STATIC_ASSERT(( 176 boost::is_same< 177 typename proto::tag_of<Expr>::type 178 , proto::tag::plus 179 >::value 180 )); 181 182 // Assert that the arity is 2 183 BOOST_STATIC_ASSERT( proto::arity_of<Expr>::value == 2 ); 184 } 185 186 // Create a binary plus node and use check_plus_node() 187 // to verify its tag type and arity: 188 check_plus_node( proto::lit(1) + 2 ); 189 190For a given type `Expr`, you could access the tag and arity directly as `Expr::proto_tag` and `Expr::proto_arity`, where `Expr::proto_arity` is an MPL Integral Constant. 191 192[/==============================] 193[heading Getting Terminal Values] 194[/==============================] 195 196There is no simpler expression than a terminal, and no more basic operation than extracting its value. As we've already seen, that is what _value_ is for. 197 198 proto::terminal< std::ostream & >::type cout_ = {std::cout}; 199 200 // Get the value of the cout_ terminal: 201 std::ostream & sout = proto::value( cout_ ); 202 203 // Assert that we got back what we put in: 204 assert( &sout == &std::cout ); 205 206To compute the return type of the _value_ function, you can use _result_of_value_. When the parameter to _result_of_value_ is a non-reference type, the result type of the metafunction is the type of the value as suitable for storage by value; that is, top-level reference and qualifiers are stripped from it. But when instantiated with a reference type, the result type has a reference /added/ to it, yielding a type suitable for storage by reference. If you want to know the actual type of the terminal's value including whether it is stored by value or reference, you can use `fusion::result_of::value_at<Expr, 0>::type`. 207 208The following table summarizes the above paragraph. 209 210[def _unless_ [footnote If `T` is a reference-to-function type, then the result type is simply `T`.]] 211 212[table Accessing Value Types 213 [[Metafunction Invocation][When the Value Type Is ...][The Result Is ...]] 214 [[`proto::result_of::value<Expr>::type`][`T`][``typename boost::remove_const< 215 typename boost::remove_reference<T>::type 216>::type _unless_``]] 217 [[`proto::result_of::value<Expr &>::type`][`T`][``typename boost::add_reference<T>::type``]] 218 [[`proto::result_of::value<Expr const &>::type`][`T`][``typename boost::add_reference< 219 typename boost::add_const<T>::type 220>::type``]] 221 [[`fusion::result_of::value_at<Expr, 0>::type`][`T`][`T`]] 222] 223 224[/================================] 225[heading Getting Child Expressions] 226[/================================] 227 228Each non-terminal node in an expression tree corresponds to an operator in an expression, and the children correspond to the operands, or arguments of the operator. To access them, you can use the _child_c_ function template, as demonstrated below: 229 230 proto::terminal<int>::type i = {42}; 231 232 // Get the 0-th operand of an addition operation: 233 proto::terminal<int>::type &ri = proto::child_c<0>( i + 2 ); 234 235 // Assert that we got back what we put in: 236 assert( &i == &ri ); 237 238You can use the _result_of_child_c_ metafunction to get the type of the Nth child of an expression node. Usually you don't care to know whether a child is stored by value or by reference, so when you ask for the type of the Nth child of an expression `Expr` (where `Expr` is not a reference type), you get the child's type after references and cv-qualifiers have been stripped from it. 239 240 template<typename Expr> 241 void test_result_of_child_c(Expr const &expr) 242 { 243 typedef typename proto::result_of::child_c<Expr, 0>::type type; 244 245 // Since Expr is not a reference type, 246 // result_of::child_c<Expr, 0>::type is a 247 // non-cv qualified, non-reference type: 248 BOOST_MPL_ASSERT(( 249 boost::is_same< type, proto::terminal<int>::type > 250 )); 251 } 252 253 // ... 254 proto::terminal<int>::type i = {42}; 255 test_result_of_child_c( i + 2 ); 256 257However, if you ask for the type of the Nth child of `Expr &` or `Expr const &` 258(note the reference), the result type will be a reference, regardless of whether 259the child is actually stored by reference or not. If you need to know exactly 260how the child is stored in the node, whether by reference or by value, you can 261use `fusion::result_of::value_at<Expr, N>::type`. The following table summarizes 262the behavior of the _result_of_child_c_ metafunction. 263 264[table Accessing Child Types 265 [[Metafunction Invocation][When the Child Is ...][The Result Is ...]] 266 [[`proto::result_of::child_c<Expr, N>::type`][`T`][``typename boost::remove_const< 267 typename boost::remove_reference<T>::type 268>::type``]] 269 [[`proto::result_of::child_c<Expr &, N>::type`][`T`][``typename boost::add_reference<T>::type``]] 270 [[`proto::result_of::child_c<Expr const &, N>::type`][`T`][``typename boost::add_reference< 271 typename boost::add_const<T>::type 272>::type``]] 273 [[`fusion::result_of::value_at<Expr, N>::type`][`T`][`T`]] 274] 275 276[/=======================] 277[heading Common Shortcuts] 278[/=======================] 279 280Most operators in C++ are unary or binary, so accessing the only operand, or the left and right operands, are very common operations. For this reason, Proto provides the _child_, _left_, and _right_ functions. _child_ and _left_ are synonymous with `proto::child_c<0>()`, and _right_ is synonymous with `proto::child_c<1>()`. 281 282There are also _result_of_child_, _result_of_left_, and _result_of_right_ metafunctions that merely forward to their _result_of_child_c_ counterparts. 283 284[endsect] 285 286[/===============================] 287[section Deep-copying Expressions] 288[/===============================] 289 290When you build an expression template with Proto, all the intermediate child nodes are held /by reference/. The avoids needless copies, which is crucial if you want your EDSL to perform well at runtime. Naturally, there is a danger if the temporary objects go out of scope before you try to evaluate your expression template. This is especially a problem in C++0x with the new `decltype` and `auto` keywords. Consider: 291 292 // OOPS: "ex" is left holding dangling references 293 auto ex = proto::lit(1) + 2; 294 295The problem can happen in today's C++ also if you use `BOOST_TYPEOF()` or `BOOST_AUTO()`, or if you try to pass an expression template outside the scope of its constituents. 296 297In these cases, you want to deep-copy your expression template so that all intermediate nodes and the terminals are held /by value/. That way, you can safely assign the expression template to a local variable or return it from a function without worrying about dangling references. You can do this with _deep_copy_ as fo 298llows: 299 300 // OK, "ex" has no dangling references 301 auto ex = proto::deep_copy( proto::lit(1) + 2 ); 302 303If you are using _typeof_, it would look like this: 304 305 // OK, use BOOST_AUTO() and proto::deep_copy() to 306 // store an expression template in a local variable 307 BOOST_AUTO( ex, proto::deep_copy( proto::lit(1) + 2 ) ); 308 309For the above code to work, you must include the [headerref boost/proto/proto_typeof.hpp] header, which also defines the _AUTO_ macro which automatically deep-copies its argument. With _AUTO_, the above code can be writen as: 310 311 // OK, BOOST_PROTO_AUTO() automatically deep-copies 312 // its argument: 313 BOOST_PROTO_AUTO( ex, proto::lit(1) + 2 ); 314 315When deep-copying an expression tree, all intermediate nodes and all terminals are stored by value. The only exception is terminals that are function references, which are left alone. 316 317[note _deep_copy_ makes no exception for arrays, which it stores by value. That can potentially cause a large amount of data to be copied.] 318 319[endsect] 320 321[/============================] 322[section Debugging Expressions] 323[/============================] 324 325Proto provides a utility for pretty-printing expression trees that comes in very handy when you're trying to debug your EDSL. It's called _display_expr_, and you pass it the expression to print and optionally, an `std::ostream` to which to send the output. Consider: 326 327 // Use display_expr() to pretty-print an expression tree 328 proto::display_expr( 329 proto::lit("hello") + 42 330 ); 331 332The above code writes this to `std::cout`: 333 334[pre plus( 335 terminal(hello) 336 , terminal(42) 337)] 338 339In order to call _display_expr_, all the terminals in the expression must be Streamable (that is, they can be written to a `std::ostream`). In addition, the tag types must all be Streamable as well. Here is an example that includes a custom terminal type and a custom tag: 340 341 // A custom tag type that is Streamable 342 struct MyTag 343 { 344 friend std::ostream &operator<<(std::ostream &s, MyTag) 345 { 346 return s << "MyTag"; 347 } 348 }; 349 350 // Some other Streamable type 351 struct MyTerminal 352 { 353 friend std::ostream &operator<<(std::ostream &s, MyTerminal) 354 { 355 return s << "MyTerminal"; 356 } 357 }; 358 359 int main() 360 { 361 // Display an expression tree that contains a custom 362 // tag and a user-defined type in a terminal 363 proto::display_expr( 364 proto::make_expr<MyTag>(MyTerminal()) + 42 365 ); 366 } 367 368The above code prints the following: 369 370[pre plus( 371 MyTag( 372 terminal(MyTerminal) 373 ) 374 , terminal(42) 375)] 376 377[endsect] 378 379[/=============================================================] 380[section:tags_and_metafunctions Operator Tags and Metafunctions] 381[/=============================================================] 382 383The following table lists the overloadable C++ operators, the Proto tag types for each, and the name of the metafunctions for generating the corresponding Proto expression types. And as we'll see later, the metafunctions are also usable as grammars for matching such nodes, as well as pass-through transforms. 384 385[table Operators, Tags and Metafunctions 386 [[Operator] 387 [Proto Tag] 388 [Proto Metafunction]] 389 390 [[unary `+`] 391 [`proto::tag::unary_plus`] 392 [`proto::unary_plus<>`]] 393 394 [[unary `-`] 395 [`proto::tag::negate`] 396 [`proto::negate<>`]] 397 398 [[unary `*`] 399 [`proto::tag::dereference`] 400 [`proto::dereference<>`]] 401 402 [[unary `~`] 403 [`proto::tag::complement`] 404 [`proto::complement<>`]] 405 406 [[unary `&`] 407 [`proto::tag::address_of`] 408 [`proto::address_of<>`]] 409 410 [[unary `!`] 411 [`proto::tag::logical_not`] 412 [`proto::logical_not<>`]] 413 414 [[unary prefix `++`] 415 [`proto::tag::pre_inc`] 416 [`proto::pre_inc<>`]] 417 418 [[unary prefix `--`] 419 [`proto::tag::pre_dec`] 420 [`proto::pre_dec<>`]] 421 422 [[unary postfix `++`] 423 [`proto::tag::post_inc`] 424 [`proto::post_inc<>`]] 425 426 [[unary postfix `--`] 427 [`proto::tag::post_dec`] 428 [`proto::post_dec<>`]] 429 430 [[binary `<<`] 431 [`proto::tag::shift_left`] 432 [`proto::shift_left<>`]] 433 434 [[binary `>>`] 435 [`proto::tag::shift_right`] 436 [`proto::shift_right<>`]] 437 438 [[binary `*`] 439 [`proto::tag::multiplies`] 440 [`proto::multiplies<>`]] 441 442 [[binary `/`] 443 [`proto::tag::divides`] 444 [`proto::divides<>`]] 445 446 [[binary `%`] 447 [`proto::tag::modulus`] 448 [`proto::modulus<>`]] 449 450 [[binary `+`] 451 [`proto::tag::plus`] 452 [`proto::plus<>`]] 453 454 [[binary `-`] 455 [`proto::tag::minus`] 456 [`proto::minus<>`]] 457 458 [[binary `<`] 459 [`proto::tag::less`] 460 [`proto::less<>`]] 461 462 [[binary `>`] 463 [`proto::tag::greater`] 464 [`proto::greater<>`]] 465 466 [[binary `<=`] 467 [`proto::tag::less_equal`] 468 [`proto::less_equal<>`]] 469 470 [[binary `>=`] 471 [`proto::tag::greater_equal`] 472 [`proto::greater_equal<>`]] 473 474 [[binary `==`] 475 [`proto::tag::equal_to`] 476 [`proto::equal_to<>`]] 477 478 [[binary `!=`] 479 [`proto::tag::not_equal_to`] 480 [`proto::not_equal_to<>`]] 481 482 [[binary `||`] 483 [`proto::tag::logical_or`] 484 [`proto::logical_or<>`]] 485 486 [[binary `&&`] 487 [`proto::tag::logical_and`] 488 [`proto::logical_and<>`]] 489 490 [[binary `&`] 491 [`proto::tag::bitwise_and`] 492 [`proto::bitwise_and<>`]] 493 494 [[binary `|`] 495 [`proto::tag::bitwise_or`] 496 [`proto::bitwise_or<>`]] 497 498 [[binary `^`] 499 [`proto::tag::bitwise_xor`] 500 [`proto::bitwise_xor<>`]] 501 502 [[binary `,`] 503 [`proto::tag::comma`] 504 [`proto::comma<>`]] 505 506 [[binary `->*`] 507 [`proto::tag::mem_ptr`] 508 [`proto::mem_ptr<>`]] 509 510 [[binary `=`] 511 [`proto::tag::assign`] 512 [`proto::assign<>`]] 513 514 [[binary `<<=`] 515 [`proto::tag::shift_left_assign`] 516 [`proto::shift_left_assign<>`]] 517 518 [[binary `>>=`] 519 [`proto::tag::shift_right_assign`] 520 [`proto::shift_right_assign<>`]] 521 522 [[binary `*=`] 523 [`proto::tag::multiplies_assign`] 524 [`proto::multiplies_assign<>`]] 525 526 [[binary `/=`] 527 [`proto::tag::divides_assign`] 528 [`proto::divides_assign<>`]] 529 530 [[binary `%=`] 531 [`proto::tag::modulus_assign`] 532 [`proto::modulus_assign<>`]] 533 534 [[binary `+=`] 535 [`proto::tag::plus_assign`] 536 [`proto::plus_assign<>`]] 537 538 [[binary `-=`] 539 [`proto::tag::minus_assign`] 540 [`proto::minus_assign<>`]] 541 542 [[binary `&=`] 543 [`proto::tag::bitwise_and_assign`] 544 [`proto::bitwise_and_assign<>`]] 545 546 [[binary `|=`] 547 [`proto::tag::bitwise_or_assign`] 548 [`proto::bitwise_or_assign<>`]] 549 550 [[binary `^=`] 551 [`proto::tag::bitwise_xor_assign`] 552 [`proto::bitwise_xor_assign<>`]] 553 554 [[binary subscript] 555 [`proto::tag::subscript`] 556 [`proto::subscript<>`]] 557 558 [[ternary `?:`] 559 [`proto::tag::if_else_`] 560 [`proto::if_else_<>`]] 561 562 [[n-ary function call] 563 [`proto::tag::function`] 564 [`proto::function<>`]] 565] 566 567[endsect] 568 569[/======================================] 570[section Expressions as Fusion Sequences] 571[/======================================] 572 573Boost.Fusion is a library of iterators, algorithms, containers and adaptors for manipulating heterogeneous sequences. In essence, a Proto expression is just a heterogeneous sequence of its child expressions, and so Proto expressions are valid Fusion random-access sequences. That means you can apply Fusion algorithms to them, transform them, apply Fusion filters and views to them, and access their elements using `fusion::at()`. The things Fusion can do to heterogeneous sequences are beyond the scope of this users' guide, but below is a simple example. It takes a lazy function invocation like `fun(1,2,3,4)` and uses Fusion to print the function arguments in order. 574 575 struct display 576 { 577 template<typename T> 578 void operator()(T const &t) const 579 { 580 std::cout << t << std::endl; 581 } 582 }; 583 584 struct fun_t {}; 585 proto::terminal<fun_t>::type const fun = {{}}; 586 587 // ... 588 fusion::for_each( 589 fusion::transform( 590 // pop_front() removes the "fun" child 591 fusion::pop_front(fun(1,2,3,4)) 592 // Extract the ints from the terminal nodes 593 , proto::functional::value() 594 ) 595 , display() 596 ); 597 598Recall from the Introduction that types in the `proto::functional` namespace 599define function objects that correspond to Proto's free functions. So 600`proto::functional::value()` creates a function object that is equivalent to 601the `proto::value()` function. The above invocation of `fusion::for_each()` 602displays the following: 603 604[pre 6051 6062 6073 6084 609] 610 611Terminals are also valid Fusion sequences. They contain exactly one element: their value. 612 613[/========================================] 614[heading Flattening Proto Expression Tress] 615[/========================================] 616 617Imagine a slight variation of the above example where, instead of iterating over the arguments of a lazy function invocation, we would like to iterate over the terminals in an addition expression: 618 619 proto::terminal<int>::type const _1 = {1}; 620 621 // ERROR: this doesn't work! Why? 622 fusion::for_each( 623 fusion::transform( 624 _1 + 2 + 3 + 4 625 , proto::functional::value() 626 ) 627 , display() 628 ); 629 630The reason this doesn't work is because the expression `_1 + 2 + 3 + 4` does not describe a flat sequence of terminals --- it describes a binary tree. We can treat it as a flat sequence of terminals, however, using Proto's _flatten_ function. _flatten_ returns a view which makes a tree appear as a flat Fusion sequence. If the top-most node has a tag type `T`, then the elements of the flattened sequence are the child nodes that do /not/ have tag type `T`. This process is evaluated recursively. So the above can correctly be written as: 631 632 proto::terminal<int>::type const _1 = {1}; 633 634 // OK, iterate over a flattened view 635 fusion::for_each( 636 fusion::transform( 637 proto::flatten(_1 + 2 + 3 + 4) 638 , proto::functional::value() 639 ) 640 , display() 641 ); 642 643The above invocation of `fusion::for_each()` displays the following: 644 645[pre 6461 6472 6483 6494 650] 651 652[endsect] 653 654[/============================================================================] 655[section:expression_introspection Expression Introspection: Defining a Grammar] 656[/============================================================================] 657 658Expression trees can have a very rich and complicated structure. Often, you need to know some things about an expression's structure before you can process it. This section describes the tools Proto provides for peering inside an expression tree and discovering its structure. And as you'll see in later sections, all the really interesting things you can do with Proto begin right here. 659 660[/===============================================] 661[section:patterns Finding Patterns in Expressions] 662[/===============================================] 663 664Imagine your EDSL is a miniature I/O facility, with iostream operations that execute lazily. You might want expressions representing input operations to be processed by one function, and output operations to be processed by a different function. How would you do that? 665 666The answer is to write patterns (a.k.a, /grammars/) that match the structure of input and output expressions. Proto provides utilities for defining the grammars, and the _matches_ template for checking whether a given expression type matches the grammar. 667 668First, let's define some terminals we can use in our lazy I/O expressions: 669 670 proto::terminal< std::istream & >::type cin_ = { std::cin }; 671 proto::terminal< std::ostream & >::type cout_ = { std::cout }; 672 673Now, we can use `cout_` instead of `std::cout`, and get I/O expression trees that we can execute later. To define grammars that match input and output expressions of the form `cin_ >> i` and `cout_ << 1` we do this: 674 675 struct Input 676 : proto::shift_right< proto::terminal< std::istream & >, proto::_ > 677 {}; 678 679 struct Output 680 : proto::shift_left< proto::terminal< std::ostream & >, proto::_ > 681 {}; 682 683We've seen the template `proto::terminal<>` before, but here we're using it 684without accessing the nested `::type`. When used like this, it is a very simple 685grammar, as are `proto::shift_right<>` and `proto::shift_left<>`. The newcomer 686here is `_` in the `proto` namespace. It is a wildcard that matches anything. 687The `Input` struct is a grammar that matches any right-shift expression that 688has a `std::istream` terminal as its left operand. 689 690We can use these grammars together with the _matches_ template to query at 691compile time whether a given I/O expression type is an input or output 692operation. Consider the following: 693 694 template< typename Expr > 695 void input_output( Expr const & expr ) 696 { 697 if( proto::matches< Expr, Input >::value ) 698 { 699 std::cout << "Input!\n"; 700 } 701 702 if( proto::matches< Expr, Output >::value ) 703 { 704 std::cout << "Output!\n"; 705 } 706 } 707 708 int main() 709 { 710 int i = 0; 711 input_output( cout_ << 1 ); 712 input_output( cin_ >> i ); 713 714 return 0; 715 } 716 717This program prints the following: 718 719[pre 720Output! 721Input! 722] 723 724If we wanted to break the `input_output()` function into two functions, one that handles input expressions and one for output expressions, we can use `boost::enable_if<>`, as follows: 725 726 template< typename Expr > 727 typename boost::enable_if< proto::matches< Expr, Input > >::type 728 input_output( Expr const & expr ) 729 { 730 std::cout << "Input!\n"; 731 } 732 733 template< typename Expr > 734 typename boost::enable_if< proto::matches< Expr, Output > >::type 735 input_output( Expr const & expr ) 736 { 737 std::cout << "Output!\n"; 738 } 739 740This works as the previous version did. However, the following does not compile at all: 741 742 input_output( cout_ << 1 << 2 ); // oops! 743 744What's wrong? The problem is that this expression does not match our grammar. The expression groups as if it were written like `(cout_ << 1) << 2`. It will not match the `Output` grammar, which expects the left operand to be a terminal, not another left-shift operation. We need to fix the grammar. 745 746We notice that in order to verify an expression as input or output, we'll need to recurse down to the bottom-left-most leaf and check that it is a `std::istream` or `std::ostream`. When we get to the terminal, we must stop recursing. We can express this in our grammar using _or_. Here are the correct `Input` and `Output` grammars: 747 748 struct Input 749 : proto::or_< 750 proto::shift_right< proto::terminal< std::istream & >, proto::_ > 751 , proto::shift_right< Input, proto::_ > 752 > 753 {}; 754 755 struct Output 756 : proto::or_< 757 proto::shift_left< proto::terminal< std::ostream & >, proto::_ > 758 , proto::shift_left< Output, proto::_ > 759 > 760 {}; 761 762This may look a little odd at first. We seem to be defining the `Input` and `Output` types in terms of themselves. This is perfectly OK, actually. At the point in the grammar that the `Input` and `Output` types are being used, they are /incomplete/, but by the time we actually evaluate the grammar with _matches_, the types will be complete. These are recursive grammars, and rightly so because they must match a recursive data structure! 763 764Matching an expression such as `cout_ << 1 << 2` against the `Output` grammar procedes as follows: 765 766# The first alternate of the _or_ is tried first. It will fail, because the 767 expression `cout_ << 1 << 2` does not match the grammar `proto::shift_left< 768 proto::terminal< std::ostream & >, proto::_ >`. 769# Then the second alternate is tried next. We match the expression against 770 `proto::shift_left< Output, proto::_ >`. The expression is a left-shift, so we 771 next try to match the operands. 772# The right operand `2` matches `proto::_` trivially. 773# To see if the left operand `cout_ << 1` matches `Output`, we must recursively 774 evaluate the `Output` grammar. This time we succeed, because `cout_ << 1` will 775 match the first alternate of the _or_. 776 777We're done -- the grammar matches successfully. 778 779[endsect] 780 781[/===========================================] 782[section Fuzzy and Exact Matches of Terminals] 783[/===========================================] 784 785The terminals in an expression tree could be const or non-const references, or they might not be references at all. When writing grammars, you usually don't have to worry about it because _matches_ gives you a little wiggle room when matching terminals. A grammar such as `proto::terminal<int>` will match a terminal of type `int`, `int &`, or `int const &`. 786 787You can explicitly specify that you want to match a reference type. If you do, the type must match exactly. For instance, a grammar such as `proto::terminal<int &>` will only match an `int &`. It will not match an `int` or an `int const &`. 788 789The table below shows how Proto matches terminals. The simple rule is: if you want to match only reference types, you must specify the reference in your grammar. Otherwise, leave it off and Proto will ignore const and references. 790 791[table proto::matches<> and Reference / CV-Qualification of Terminals 792 [[Terminal] [Grammar] [Matches?]] 793 [[T] [T] [yes]] 794 [[T &] [T] [yes]] 795 [[T const &] [T] [yes]] 796 [[T] [T &] [no]] 797 [[T &] [T &] [yes]] 798 [[T const &] [T &] [no]] 799 [[T] [T const &] [no]] 800 [[T &] [T const &] [no]] 801 [[T const &] [T const &] [yes]] 802] 803 804This begs the question: What if you want to match an `int`, but not an `int &` or an `int const &`? For forcing exact matches, Proto provides the _exact_ template. For instance, `proto::terminal< proto::exact<int> >` would only match an `int` held by value. 805 806Proto gives you extra wiggle room when matching array types. Array types match themselves or the pointer types they decay to. This is especially useful with character arrays. The type returned by `proto::as_expr("hello")` is `proto::terminal<char const[6]>::type`. That's a terminal containing a 6-element character array. Naturally, you can match this terminal with the grammar `proto::terminal<char const[6]>`, but the grammar `proto::terminal<char const *>` will match it as well, as the following code fragment illustrates. 807 808 struct CharString 809 : proto::terminal< char const * > 810 {}; 811 812 typedef proto::terminal< char const[6] >::type char_array; 813 814 BOOST_MPL_ASSERT(( proto::matches< char_array, CharString > )); 815 816What if we only wanted `CharString` to match terminals of exactly the type `char const *`? You can use _exact_ here to turn off the fuzzy matching of terminals, as follows: 817 818 struct CharString 819 : proto::terminal< proto::exact< char const * > > 820 {}; 821 822 typedef proto::terminal<char const[6]>::type char_array; 823 typedef proto::terminal<char const *>::type char_string; 824 825 BOOST_MPL_ASSERT(( proto::matches< char_string, CharString > )); 826 BOOST_MPL_ASSERT_NOT(( proto::matches< char_array, CharString > )); 827 828Now, `CharString` does not match array types, only character string pointers. 829 830The inverse problem is a little trickier: what if you wanted to match all character arrays, but not character pointers? As mentioned above, the expression `as_expr("hello")` has the type `proto::terminal< char const[ 6 ] >::type`. If you wanted to match character arrays of arbitrary size, you could use `proto::N`, which is an array-size wildcard. The following grammar would match any string literal: `proto::terminal< char const[ proto::N ] >`. 831 832Sometimes you need even more wiggle room when matching terminals. For example, maybe you're building a calculator EDSL and you want to allow any terminals that are convertible to `double`. For that, Proto provides the _convertible_to_ template. You can use it as: `proto::terminal< proto::convertible_to< double > >`. 833 834There is one more way you can perform a fuzzy match on terminals. Consider the problem of trying to match a `std::complex<>` terminal. You can easily match a `std::complex<float>` or a `std::complex<double>`, but how would you match any instantiation of `std::complex<>`? You can use `proto::_` here to solve this problem. Here is the grammar to match any `std::complex<>` instantiation: 835 836 struct StdComplex 837 : proto::terminal< std::complex< proto::_ > > 838 {}; 839 840When given a grammar like this, Proto will deconstruct the grammar and the terminal it is being matched against and see if it can match all the constituents. 841 842[endsect] 843 844[/====================================================] 845[section:if_and_not [^if_<>], [^and_<>], and [^not_<>]] 846[/====================================================] 847 848We've already seen how to use expression generators like `proto::terminal<>` and `proto::shift_right<>` as grammars. We've also seen _or_, which we can use to express a set of alternate grammars. There are a few others of interest; in particular, _if_, _and_ and _not_. 849 850The _not_ template is the simplest. It takes a grammar as a template parameter and logically negates it; `not_<Grammar>` will match any expression that `Grammar` does /not/ match. 851 852The _if_ template is used together with a Proto transform that is evaluated against expression types to find matches. (Proto transforms will be described later.) 853 854The _and_ template is like _or_, except that each argument of the _and_ must match in order for the _and_ to match. As an example, consider the definition of `CharString` above that uses _exact_. It could have been written without _exact_ as follows: 855 856 struct CharString 857 : proto::and_< 858 proto::terminal< proto::_ > 859 , proto::if_< boost::is_same< proto::_value, char const * >() > 860 > 861 {}; 862 863This says that a `CharString` must be a terminal, /and/ its value type must be the same as `char const *`. Notice the template argument of _if_: `boost::is_same< proto::_value, char const * >()`. This is Proto transform that compares the value type of a terminal to `char const *`. 864 865The _if_ template has a couple of variants. In addition to `if_<Condition>` you can also say `if_<Condition, ThenGrammar>` and `if_<Condition, ThenGrammar, ElseGrammar>`. These let you select one sub-grammar or another based on the `Condition`. 866 867[endsect] 868 869[/=======================================================] 870[section:switch Improving Compile Times With [^switch_<>]] 871[/=======================================================] 872 873When your Proto grammar gets large, you'll start to run into some scalability problems with _or_, the construct you use to specify alternate sub-grammars. First, due to limitations in C++, _or_ can only accept up to a certain number of sub-grammars, controlled by the `BOOST_PROTO_MAX_LOGICAL_ARITY` macro. This macro defaults to eight, and you can set it higher, but doing so will aggravate another scalability problem: long compile times. With _or_, alternate sub-grammars are tried in order -- like a series of cascading `if`'s -- leading to lots of unnecessary template instantiations. What you would prefer instead is something like `switch` that avoids the expense of cascading `if`'s. That's the purpose of _switch_; although less convenient than _or_, it improves compile times for larger grammars and does not have an arbitrary fixed limit on the number of sub-grammars. 874 875Let's illustrate how to use _switch_ by first writing a big grammar with _or_ and then translating it to an equivalent grammar using _switch_: 876 877 // Here is a big, inefficient grammar 878 struct ABigGrammar 879 : proto::or_< 880 proto::terminal<int> 881 , proto::terminal<double> 882 , proto::unary_plus<ABigGrammar> 883 , proto::negate<ABigGrammar> 884 , proto::complement<ABigGrammar> 885 , proto::plus<ABigGrammar, ABigGrammar> 886 , proto::minus<ABigGrammar, ABigGrammar> 887 , proto::or_< 888 proto::multiplies<ABigGrammar, ABigGrammar> 889 , proto::divides<ABigGrammar, ABigGrammar> 890 , proto::modulus<ABigGrammar, ABigGrammar> 891 > 892 > 893 {}; 894 895The above might be the grammar to a more elaborate calculator EDSL. Notice that since there are more than eight sub-grammars, we had to chain the sub-grammars with a nested _or_ -- not very nice. 896 897The idea behind _switch_ is to dispatch based on an expression's tag type to a sub-grammar that handles expressions of that type. To use _switch_, you define a struct with a nested `case_<>` template, specialized on tag types. The above grammar can be expressed using _switch_ as follows. It is described below. 898 899 // Redefine ABigGrammar more efficiently using proto::switch_<> 900 struct ABigGrammar; 901 902 struct ABigGrammarCases 903 { 904 // The primary template matches nothing: 905 template<typename Tag> 906 struct case_ 907 : proto::not_<_> 908 {}; 909 }; 910 911 // Terminal expressions are handled here 912 template<> 913 struct ABigGrammarCases::case_<proto::tag::terminal> 914 : proto::or_< 915 proto::terminal<int> 916 , proto::terminal<double> 917 > 918 {}; 919 920 // Non-terminals are handled similarly 921 template<> 922 struct ABigGrammarCases::case_<proto::tag::unary_plus> 923 : proto::unary_plus<ABigGrammar> 924 {}; 925 926 template<> 927 struct ABigGrammarCases::case_<proto::tag::negate> 928 : proto::negate<ABigGrammar> 929 {}; 930 931 template<> 932 struct ABigGrammarCases::case_<proto::tag::complement> 933 : proto::complement<ABigGrammar> 934 {}; 935 936 template<> 937 struct ABigGrammarCases::case_<proto::tag::plus> 938 : proto::plus<ABigGrammar, ABigGrammar> 939 {}; 940 941 template<> 942 struct ABigGrammarCases::case_<proto::tag::minus> 943 : proto::minus<ABigGrammar, ABigGrammar> 944 {}; 945 946 template<> 947 struct ABigGrammarCases::case_<proto::tag::multiplies> 948 : proto::multiplies<ABigGrammar, ABigGrammar> 949 {}; 950 951 template<> 952 struct ABigGrammarCases::case_<proto::tag::divides> 953 : proto::divides<ABigGrammar, ABigGrammar> 954 {}; 955 956 template<> 957 struct ABigGrammarCases::case_<proto::tag::modulus> 958 : proto::modulus<ABigGrammar, ABigGrammar> 959 {}; 960 961 // Define ABigGrammar in terms of ABigGrammarCases 962 // using proto::switch_<> 963 struct ABigGrammar 964 : proto::switch_<ABigGrammarCases> 965 {}; 966 967Matching an expression type `E` against `proto::switch_<C>` is equivalent to matching it against `C::case_<E::proto_tag>`. By dispatching on the expression's tag type, we can jump to the sub-grammar that handles expressions of that type, skipping over all the other sub-grammars that couldn't possibly match. If there is no specialization of `case_<>` for a particular tag type, we select the primary template. In this case, the primary template inherits from `proto::not_<_>` which matches no expressions. 968 969Notice the specialization that handles terminals: 970 971 // Terminal expressions are handled here 972 template<> 973 struct ABigGrammarCases::case_<proto::tag::terminal> 974 : proto::or_< 975 proto::terminal<int> 976 , proto::terminal<double> 977 > 978 {}; 979 980The `proto::tag::terminal` type by itself isn't enough to select an appropriate sub-grammar, so we use _or_ to list the alternate sub-grammars that match terminals. 981 982[note You might be tempted to define your `case_<>` specializations /in situ/ as follows: 983 984`` 985 struct ABigGrammarCases 986 { 987 template<typename Tag> 988 struct case_ : proto::not_<_> {}; 989 990 // ERROR: not legal C++ 991 template<> 992 struct case_<proto::tag::terminal> 993 /* ... */ 994 }; 995`` 996 997Unfortunately, for arcane reasons, it is not legal to define an explicit nested specialization /in situ/ like this. It is, however, perfectly legal to define /partial/ specializations /in situ/, so you can add a extra dummy template parameter that has a default, as follows: 998 999`` 1000 struct ABigGrammarCases 1001 { 1002 // Note extra "Dummy" template parameter here: 1003 template<typename Tag, int Dummy = 0> 1004 struct case_ : proto::not_<_> {}; 1005 1006 // OK: "Dummy" makes this a partial specialization 1007 // instead of an explicit specialization. 1008 template<int Dummy> 1009 struct case_<proto::tag::terminal, Dummy> 1010 /* ... */ 1011 }; 1012`` 1013 1014You might find this cleaner than defining explicit `case_<>` specializations outside of their enclosing struct. 1015] 1016 1017[endsect] 1018 1019[/==================================] 1020[section Matching Vararg Expressions] 1021[/==================================] 1022 1023Not all of C++'s overloadable operators are unary or binary. There is the oddball `operator()` -- the function call operator -- which can have any number of arguments. Likewise, with Proto you may define your own "operators" that could also take more that two arguments. As a result, there may be nodes in your Proto expression tree that have an arbitrary number of children (up to _MAX_ARITY_, which is configurable). How do you write a grammar to match such a node? 1024 1025For such cases, Proto provides the _vararg_ class template. Its template argument is a grammar, and the _vararg_ will match the grammar zero or more times. Consider a Proto lazy function called `fun()` that can take zero or more characters as arguments, as follows: 1026 1027 struct fun_tag {}; 1028 struct FunTag : proto::terminal< fun_tag > {}; 1029 FunTag::type const fun = {{}}; 1030 1031 // example usage: 1032 fun(); 1033 fun('a'); 1034 fun('a', 'b'); 1035 ... 1036 1037Below is the grammar that matches all the allowable invocations of `fun()`: 1038 1039 struct FunCall 1040 : proto::function< FunTag, proto::vararg< proto::terminal< char > > > 1041 {}; 1042 1043The `FunCall` grammar uses _vararg_ to match zero or more character literals as arguments of the `fun()` function. 1044 1045As another example, can you guess what the following grammar matches? 1046 1047 struct Foo 1048 : proto::or_< 1049 proto::terminal< proto::_ > 1050 , proto::nary_expr< proto::_, proto::vararg< Foo > > 1051 > 1052 {}; 1053 1054Here's a hint: the first template parameter to `proto::nary_expr<>` represents the node type, and any additional template parameters represent child nodes. The answer is that this is a degenerate grammar that matches every possible expression tree, from root to leaves. 1055 1056[endsect] 1057 1058[/=============================] 1059[section Defining EDSL Grammars] 1060[/=============================] 1061 1062In this section we'll see how to use Proto to define a grammar for your EDSL and use it to validate expression templates, giving short, readable compile-time errors for invalid expressions. 1063 1064[tip You might think that this is a backwards way of doing things. ["If Proto let 1065me select which operators to overload, my users wouldn't be able to create invalid 1066expressions in the first place, and I wouldn't need a grammar at all!] That may be 1067true, but there are reasons for preferring to do things this way. 1068 1069First, it lets you develop your EDSL rapidly -- all the operators are there for you 1070already! -- and worry about invalid syntax later. 1071 1072Second, it might be the case that some operators are only allowed in certain 1073contexts within your EDSL. This is easy to express with a grammar, and hard to do 1074with straight operator overloading. 1075 1076Third, using an EDSL grammar to flag invalid expressions can often yield better 1077errors than manually selecting the overloaded operators. 1078 1079Fourth, the grammar can be used for more than just validation. You can use your 1080grammar to define ['tree transformations] that convert expression templates into 1081other more useful objects. 1082 1083If none of the above convinces you, you actually /can/ use Proto to control which 1084operators are overloaded within your domain. And to do it, you need to define a 1085grammar!] 1086 1087In a previous section, we used Proto to define an EDSL for a lazily evaluated calculator that allowed any combination of placeholders, floating-point literals, addition, subtraction, multiplication, division and grouping. If we were to write the grammar for this EDSL in [@http://en.wikipedia.org/wiki/Extended_Backus_Naur_Form EBNF], it might look like this: 1088 1089[pre 1090group ::= '(' expression ')' 1091factor ::= double | '_1' | '_2' | group 1092term ::= factor (('\*' factor) | ('/' factor))\* 1093expression ::= term (('+' term) | ('-' term))* 1094] 1095 1096This captures the syntax, associativity and precedence rules of a calculator. 1097Writing the grammar for our calculator EDSL using Proto is /even simpler/. 1098Since we are using C++ as the host language, we are bound to the associativity 1099and precedence rules for the C++ operators. Our grammar can assume them. Also, 1100in C++ grouping is already handled for us with the use of parenthesis, so we 1101don't have to code that into our grammar. 1102 1103Let's begin our grammar for forward-declaring it: 1104 1105 struct CalculatorGrammar; 1106 1107It's an incomplete type at this point, but we'll still be able to use it to 1108define the rules of our grammar. Let's define grammar rules for the terminals: 1109 1110 struct Double 1111 : proto::terminal< proto::convertible_to< double > > 1112 {}; 1113 1114 struct Placeholder1 1115 : proto::terminal< placeholder<0> > 1116 {}; 1117 1118 struct Placeholder2 1119 : proto::terminal< placeholder<1> > 1120 {}; 1121 1122 struct Terminal 1123 : proto::or_< Double, Placeholder1, Placeholder2 > 1124 {}; 1125 1126Now let's define the rules for addition, subtraction, multiplication and division. 1127Here, we can ignore issues of associativity and precedence -- the C++ compiler will 1128enforce that for us. We only must enforce that the arguments to the operators must 1129themselves conform to the `CalculatorGrammar` that we forward-declared above. 1130 1131 struct Plus 1132 : proto::plus< CalculatorGrammar, CalculatorGrammar > 1133 {}; 1134 1135 struct Minus 1136 : proto::minus< CalculatorGrammar, CalculatorGrammar > 1137 {}; 1138 1139 struct Multiplies 1140 : proto::multiplies< CalculatorGrammar, CalculatorGrammar > 1141 {}; 1142 1143 struct Divides 1144 : proto::divides< CalculatorGrammar, CalculatorGrammar > 1145 {}; 1146 1147Now that we've defined all the parts of the grammar, we can define 1148`CalculatorGrammar`: 1149 1150 struct CalculatorGrammar 1151 : proto::or_< 1152 Terminal 1153 , Plus 1154 , Minus 1155 , Multiplies 1156 , Divides 1157 > 1158 {}; 1159 1160That's it! Now we can use `CalculatorGrammar` to enforce that an expression 1161template conforms to our grammar. We can use _matches_ and `BOOST_MPL_ASSERT()` 1162to issue readable compile-time errors for invalid expressions, as below: 1163 1164 template< typename Expr > 1165 void evaluate( Expr const & expr ) 1166 { 1167 BOOST_MPL_ASSERT(( proto::matches< Expr, CalculatorGrammar > )); 1168 // ... 1169 } 1170 1171[endsect] 1172 1173[endsect] 1174 1175[endsect] 1176