[section Object Code]

Let's look at some assembly.  All assembly here was produced with Clang 4.0
with `-O3`.  Given these definitions:

[arithmetic_perf_decls]

Here is a _yap_-based arithmetic function:

[arithmetic_perf_eval_as_yap_expr]

and the assembly it produces:

    arithmetic_perf[0x100001c00] <+0>:  pushq  %rbp
    arithmetic_perf[0x100001c01] <+1>:  movq   %rsp, %rbp
    arithmetic_perf[0x100001c04] <+4>:  mulsd  %xmm1, %xmm0
    arithmetic_perf[0x100001c08] <+8>:  addsd  %xmm2, %xmm0
    arithmetic_perf[0x100001c0c] <+12>: movapd %xmm0, %xmm1
    arithmetic_perf[0x100001c10] <+16>: mulsd  %xmm1, %xmm1
    arithmetic_perf[0x100001c14] <+20>: addsd  %xmm0, %xmm1
    arithmetic_perf[0x100001c18] <+24>: movapd %xmm1, %xmm0
    arithmetic_perf[0x100001c1c] <+28>: popq   %rbp
    arithmetic_perf[0x100001c1d] <+29>: retq

And for the equivalent function using builtin expressions:

[arithmetic_perf_eval_as_cpp_expr]

the assembly is:

    arithmetic_perf[0x100001e10] <+0>:  pushq  %rbp
    arithmetic_perf[0x100001e11] <+1>:  movq   %rsp, %rbp
    arithmetic_perf[0x100001e14] <+4>:  mulsd  %xmm1, %xmm0
    arithmetic_perf[0x100001e18] <+8>:  addsd  %xmm2, %xmm0
    arithmetic_perf[0x100001e1c] <+12>: movapd %xmm0, %xmm1
    arithmetic_perf[0x100001e20] <+16>: mulsd  %xmm1, %xmm1
    arithmetic_perf[0x100001e24] <+20>: addsd  %xmm0, %xmm1
    arithmetic_perf[0x100001e28] <+24>: movapd %xmm1, %xmm0
    arithmetic_perf[0x100001e2c] <+28>: popq   %rbp
    arithmetic_perf[0x100001e2d] <+29>: retq

If we increase the number of terminals by a factor of four:

[arithmetic_perf_eval_as_yap_expr_4x]

the results are the same: in this simple case, the _yap_ and builtin
expressions result in the same object code.

However, when the number of terminals is increased by an additional factor of
2.5 (for a total of 90 terminals), the inliner can no longer do as well for
_yap_ expressions as for builtin ones.

More complex nonarithmetic code produces more mixed results.
For example, here is a function using code from the Map Assign example:

    std::map<std::string, int> make_map_with_boost_yap ()
    {
        return map_list_of
            ("<", 1)
            ("<=",2)
            (">", 3)
            (">=",4)
            ("=", 5)
            ("<>",6)
            ;
    }

By contrast, here is the Boost.Assign version of the same function:

    std::map<std::string, int> make_map_with_boost_assign ()
    {
        return boost::assign::map_list_of
            ("<", 1)
            ("<=",2)
            (">", 3)
            (">=",4)
            ("=", 5)
            ("<>",6)
            ;
    }

Here is how you might do it "manually":

    std::map<std::string, int> make_map_manually ()
    {
        std::map<std::string, int> retval;
        retval.emplace("<", 1);
        retval.emplace("<=",2);
        retval.emplace(">", 3);
        retval.emplace(">=",4);
        retval.emplace("=", 5);
        retval.emplace("<>",6);
        return retval;
    }

Finally, here is the same map created from an initializer list:

    std::map<std::string, int> make_map_inializer_list ()
    {
        std::map<std::string, int> retval = {
            {"<", 1},
            {"<=",2},
            {">", 3},
            {">=",4},
            {"=", 5},
            {"<>",6}
        };
        return retval;
    }

All of these produce roughly the same number of assembly instructions.
Benchmarking these four functions with Google Benchmark yields these results:

[table Runtimes of Different Map Constructions
    [[Function] [Time (ns)]]

    [[make_map_with_boost_yap()] [1285]]
    [[make_map_with_boost_assign()] [1459]]
    [[make_map_manually()] [985]]
    [[make_map_inializer_list()] [954]]
]

The _yap_-based implementation finishes in the middle of the pack.

In general, the expression trees produced by _yap_ get evaluated down to
something close to the hand-written equivalent.  There is an abstraction
penalty, but it is small for reasonably-sized expressions.

[endsect]