1//// 2Copyright 2015-2017 Peter Dimov 3 4Distributed under the Boost Software License, Version 1.0. 5 6See accompanying file LICENSE_1_0.txt or copy at 7http://www.boost.org/LICENSE_1_0.txt 8//// 9 10# Simple {cpp}11 metaprogramming, part 2 11Peter Dimov 122015-06-20 13:toc: left 14:idprefix: 15:docinfo: shared-footer 16 17[.lead] 18__Efficient algorithms for membership testing, random access, and retrieval by 19key__ 20 21NOTE: Being late to the metaprogramming party, I make no claim of having 22invented the techniques in this article. A quick look at the implementations 23of, for example, Louis Dionne's https://github.com/ldionne/mpl11[mpl11] and 24Eric Niebler's https://github.com/ericniebler/meta[meta], shows that most of 25these tricks are already known. Dave Abrahams 26https://github.com/dabrahams/mpl11[has experimented] along these lines in 2012. 27The original inventor of the multiple inheritance trick and the `void*` 28arguments trick is probably Richard Smith, who has posted 29https://llvm.org/bugs/attachment.cgi?id=8825[two] 30https://llvm.org/bugs/attachment.cgi?id=8838[examples] in response to 31https://llvm.org/bugs/show_bug.cgi?id=13263[a Clang bug report]. 32 33## Vectors, sets, and maps 34 35<<simple_cxx11_metaprogramming.adoc#,Last time>>, I outlined a style of 36metaprogramming that operated on type lists -- variadic class templates: 37``` 38template<class... T> struct mp_list {}; 39``` 40Classic Lisp uses lists as its only data structure, but operating on a list is 41usually linear in the number of its elements. 42 43In addition to `list`, the STL has `vector`, `set`, and `map`. `vector` 44supports random access by index; `set` has efficient test for membership; `map` 45associates keys with values and has efficient lookup based on key. 46 47Instead of introducing separate data structure such as `mp_vector`, `mp_set`, 48`mp_map`, we'll keep our data in a list form, and attempt to provide efficient 49algorithms for random access, membership testing, and lookup. 50 51## mp_contains 52 53Let's starts with sets. A set is just a list with unique elements. To obtain a 54set from an arbitrary list, we'll need an algorithm that removes the 55duplicates. Let's call it `mp_unique<L>`: 56[subs=+quotes] 57``` 58// mp_if 59 60template<bool C, class T, class E> struct mp_if_c_impl; 61 62template<class T, class E> struct mp_if_c_impl<true, T, E> 63{ 64 using type = T; 65}; 66 67template<class T, class E> struct mp_if_c_impl<false, T, E> 68{ 69 using type = E; 70}; 71 72template<bool C, class T, class E> 73 using mp_if_c = typename mp_if_c_impl<C, T, E>::type; 74 75template<class C, class T, class E> 76 using mp_if = typename mp_if_c_impl<C::value != 0, T, E>::type; 77 78// mp_unique 79 80template<class L> struct mp_unique_impl; 81 82template<class L> using mp_unique = typename mp_unique_impl<L>::type; 83 84template<template<class...> class L> struct mp_unique_impl<L<>> 85{ 86 using type = L<>; 87}; 88 89template<template<class...> class L, class T1, class... T> 90 struct mp_unique_impl<L<T1, T...>> 91{ 92 using _rest = mp_unique<L<T...>>; 93 using type = mp_if<**mp_contains**<_rest, T1>, _rest, mp_push_front<_rest, T1>>; 94}; 95``` 96For membership testing, we've introduced an algorithm `mp_contains<L, V>` that 97returns `true` when `L` contains `V`. The straightforward recursive 98implementation of `mp_contains` is: 99``` 100template<class L, class V> struct mp_contains_impl; 101 102template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type; 103 104template<template<class...> class L, class V> struct mp_contains_impl<L<>, V> 105{ 106 using type = std::false_type; 107}; 108 109template<template<class...> class L, class... T, class V> 110 struct mp_contains_impl<L<V, T...>, V> 111{ 112 using type = std::true_type; 113}; 114 115template<template<class...> class L, class T1, class... T, class V> 116 struct mp_contains_impl<L<T1, T...>, V>: mp_contains_impl<L<T...>, V> 117{ 118}; 119``` 120Note that `mp_unique<L>` makes `N` calls to `mp_contains`, where `N` is the 121length of the list `L`. This means that `mp_contains` needs to be as fast as 122possible, which the above implementation, well, isn't. 123 124Here are the compile times in seconds for invoking `mp_unique` on a list with 125`N` (distinct) elements: 126|=== 127||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 128 129|VC$$++$$ 2013, recursive |2.1 |DNF |||||| 130 131|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF ||| 132 133|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF ||| 134|=== 135(Tests done under Windows/Cygwin. All compilers are 32 bit. No optimizations. 136DNF stands for "did not finish", which usually means that the compiler ran out 137of heap space or crashed.) 138 139We clearly need a better alternative. 140 141I ended the previous article with an implementation of `mp_contains` that 142relied on `mp_count`, which in turn relied on `mp_plus`. Let's see how it 143fares: 144|=== 145||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 146 147|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF |||| 148 149|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF ||| 150 151|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3 152|=== 153Not _that_ bad, at least if your compiler happens to be `g$$++$$`. Still, there 154ought to be room for improvement here. 155 156To do better, we have to somehow leverage the language features, such as pack 157expansion, to do more of the work for us. For inspiration, let's turn to 158section 14.5.3 paragraph 4 of the {cpp}11 standard, which explains that pack 159expansions can occur in the following contexts: 160 161* **In a function parameter pack (8.3.5); the pattern is the 162 __parameter-declaration__ without the ellipsis.** 163* In a template parameter pack that is a pack expansion (14.1): 164* **In an __initializer-list__ (8.5); the pattern is an 165 __initializer-clause__.** 166* **In a __base-specifier-list__ (Clause 10); the pattern is a 167 __base-specifier__.** 168* In a __mem-initializer-list__ (12.6.2); the pattern is a 169 __mem-initializer__. 170* In a __template-argument-list__ (14.3); the pattern is a 171 __template-argument__. 172* In a __dynamic-exception-specification__ (15.4); the pattern is a 173 __type-id__. 174* In an __attribute-list__ (7.6.1); the pattern is an __attribute__. 175* In an __alignment-specifier__ (7.6.2); the pattern is the 176 __alignment-specifier__ without the ellipsis. 177* In a __capture-list__ (5.1.2); the pattern is a __capture__. 178* In a `sizeof$$...$$` expression (5.3.3); the pattern is an __identifier__. 179 180The **emphasis** is mine and indicates possible leads. 181 182Our first option is to expand the parameter pack into arguments for a function 183call. Since we're interested in operations that occur at compile time, calling 184a function may not appear useful; but {cpp}11 functions can be `constexpr`, and 185`constexpr` function "calls" do occur at compile time. 186 187Recall our `mp_count`: 188``` 189template<class L, class V> struct mp_count_impl; 190 191template<template<class...> class L, class... T, class V> 192 struct mp_count_impl<L<T...>, V> 193{ 194 using type = mp_plus<std::is_same<T, V>...>; 195}; 196 197template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type; 198``` 199Instead of using the template alias `mp_plus` to sum the `is_same` expressions, 200we can use a `constexpr` function: 201``` 202constexpr std::size_t cx_plus() 203{ 204 return 0; 205} 206 207template<class T1, class... T> constexpr std::size_t cx_plus(T1 t1, T... t) 208{ 209 return t1 + cx_plus(t...); 210} 211 212// mp_size_t 213 214template<std::size_t N> using mp_size_t = std::integral_constant<std::size_t, N>; 215 216// mp_count 217 218template<class L, class V> struct mp_count_impl; 219 220template<template<class...> class L, class... T, class V> 221 struct mp_count_impl<L<T...>, V> 222{ 223 using type = mp_size_t<cx_plus(std::is_same<T, V>::value...)>; 224}; 225 226template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type; 227``` 228with the following results: 229|=== 230||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 231 232|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF ||| 233 234|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8 235|=== 236We've improved the times, but lost VC$$++$$ 2013 due to its not implementing 237`constexpr`. 238 239Let's try pack expansion into an __initializer-list__. Instead of passing the 240`is_same` expressions to a function, we can build a constant array out of them, 241then sum the array with a `constexpr` function: 242``` 243constexpr std::size_t cx_plus2(bool const * first, bool const * last) 244{ 245 return first == last? 0: *first + cx_plus2(first + 1, last); 246} 247 248// mp_count 249 250template<class L, class V> struct mp_count_impl; 251 252template<template<class...> class L, class... T, class V> 253 struct mp_count_impl<L<T...>, V> 254{ 255 static constexpr bool _v[] = { std::is_same<T, V>::value... }; 256 using type = mp_size_t<cx_plus2(_v, _v + sizeof...(T))>; 257}; 258 259template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type; 260``` 261This is a neat trick, but is it fast? 262|=== 263||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 264 265|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF |||| 266 267|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7 268|=== 269That's a bit disappointing. Let's see what can we do with expanding a parameter 270pack into a base-specifier-list. We would be able to define a class that 271derives from every element of the pack: 272``` 273struct U: T... {}; 274``` 275We can then use `std::is_base_of<V, U>` to test whether a type `V` is a base of 276`U`, that is, whether it's one of the elements of the parameter pack. Which is 277exactly what we need. 278 279Arbitrary types such as `void`, `int`, or `void(int)` can't be used as base 280classes, but we'll wrap the types in an empty class template, which we'll call 281`mp_identity`. 282``` 283template<class T> struct mp_identity 284{ 285 using type = T; 286}; 287 288template<class L, class V> struct mp_contains_impl; 289 290template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type; 291 292template<template<class...> class L, class... T, class V> 293 struct mp_contains_impl<L<T...>, V> 294{ 295 struct U: mp_identity<T>... {}; 296 using type = std::is_base_of<mp_identity<V>, U>; 297}; 298``` 299Performance? 300|=== 301||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 302 303|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF ||| 304 305|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF ||| 306 307|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0 308|=== 309This implementation is a clear winner. 310 311In fairness, we ought to note that the first four implementations of 312`mp_contains` do not rely on the list elements being unique. This makes 313`mp_contains` an algorithm that supports arbitrary lists, not just sets. 314 315The `is_base_of` implementation, however, does not support lists that contain 316duplicates, because it's not possible to inherit directly from the same type 317twice. So it does not implement the general `mp_contains`, but something that 318should probably be named `mp_set_contains`. 319 320We can avoid the "no duplicates" requirement by modifying the implementation to 321inherit from `mp_identity<T>` indirectly, via an intermediate base class: 322[subs=+macros] 323``` 324// indirect_inherit 325 326template<std::size_t I, class T> struct inherit_second: T {}; 327 328template<class L, class S> struct indirect_inherit_impl; 329 330template<template<class...> class L, class... T, std::size_t... J> 331 struct indirect_inherit_impl<L<T...>, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]<std::size_t, J...>>: 332 inherit_second<J, mp_identity<T>>... {}; 333 334template<class L> using indirect_inherit = 335 indirect_inherit_impl<L, http://en.cppreference.com/w/cpp/utility/integer_sequence[make_index_sequence]<mp_size<L>::value>>; 336 337// mp_contains 338 339template<class L, class V> struct mp_contains_impl 340{ 341 using U = indirect_inherit<L>; 342 using type = std::is_base_of<mp_identity<V>, U>; 343}; 344 345template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type; 346``` 347This, however, pretty much nullifies the spectacular performance gains we've 348observed with the original `is_base_of`-based implementation: 349|=== 350||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 351 352|VC$$++$$ 2013, recursive |2.1 |DNF |||||| 353 354|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF |||| 355 356|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF ||| 357 358|VC$$++$$ 2013, is_base_of/indirect |1.0 |9.3 |49.5 |153.8 |DNF ||| 359|=== 360|=== 361||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 362 363|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF ||| 364 365|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF ||| 366 367|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF ||| 368 369|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF |||| 370 371|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF ||| 372 373|clang$$++$$ 3.5.1, is_base_of/indirect |0.4 |0.9 |1.6 |2.5 |DNF ||| 374|=== 375|=== 376||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 377 378|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF ||| 379 380|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3 381 382|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8 383 384|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7 385 386|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0 387 388|g$$++$$ 4.9.2, is_base_of/indirect |0.5 |1.1 |2.3 |4.0 |6.6 |9.8 |13.6 |18.2 389|=== 390 391## mp_map_find 392 393A map, in the STL sense, is a data structure that associates keys with values 394and can efficiently retrieve, given a key, its associated value. For our 395purposes, a map will be any list of lists for which the inner lists have at 396least one element, the key; the rest of the elements we'll consider to be the 397associated value. For example, the list 398``` 399[[A, B], [C, D, E], [F], [G, H]] 400``` 401is a map with keys `A`, `C`, `F`, and `G`, with associated values `[B]`, 402`[D, E]`, `[]`, and `[H]`, respectively. We'll require unique keys, for reasons 403that'll become evident later. 404 405I'll show two other examples of maps, this time using real {cpp} code: 406``` 407using Map = mp_list<mp_list<int, int*>, mp_list<void, void*>, mp_list<char, char*>>; 408``` 409``` 410using Map2 = std::tuple<std::pair<int, int[2]>, std::pair<char, char[2]>>; 411``` 412The Lisp name of the algorithm that performs retrieval based on key is `ASSOC`, 413but I'll call it `mp_map_find`. `mp_map_find<M, K>` returns the element of `M` 414whose first element is `K`. For example, `mp_map_find<Map2, int>` would return 415`std::pair<int, int[2]>`. If there's no such key, it returns `void`. 416 417There's almost no need to implement and benchmark the recursive version of 418`mp_map_find` -- we can be pretty sure it will perform horribly. Still, 419``` 420template<class M, class K> struct mp_map_find_impl; 421 422template<class M, class K> using mp_map_find = typename mp_map_find_impl<M, K>::type; 423 424template<template<class...> class M, class K> struct mp_map_find_impl<M<>, K> 425{ 426 using type = void; 427}; 428 429template<template<class...> class M, class T1, class... T, class K> 430 struct mp_map_find_impl<M<T1, T...>, K> 431{ 432 using type = mp_if<std::is_same<mp_front<T1>, K>, T1, mp_map_find<M<T...>, K>>; 433}; 434``` 435The compile time, in seconds, for `N` lookups into a map of size `N`, is as 436follows: 437|=== 438||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 439 440|VC$$++$$ 2013, recursive |38.2 |DNF |||||| 441 442|clang$$++$$ 3.5.1, recursive |2.5 |13.7 |DNF ||||| 443 444|g$$++$$ 4.9.2, recursive |1.9 |10.2 |28.8 |DNF |||| 445|=== 446I told you there was no point. 447 448But, I hear some of you say, you're evaluating the else branch even if the 449condition is true, and that's horribly inefficient! 450 451Well, this would only improve the performance by a factor of approximately two 452on average, and only if the element is present, but fine, let's try it. The 453element happens to be present in the benchmark, so let's see. 454``` 455// mp_eval_if 456 457template<bool C, class T, template<class...> class E, class... A> 458 struct mp_eval_if_c_impl; 459 460template<class T, template<class...> class E, class... A> 461 struct mp_eval_if_c_impl<true, T, E, A...> 462{ 463 using type = T; 464}; 465 466template<class T, template<class...> class E, class... A> 467 struct mp_eval_if_c_impl<false, T, E, A...> 468{ 469 using type = E<A...>; 470}; 471 472template<bool C, class T, template<class...> class E, class... A> 473 using mp_eval_if_c = typename mp_eval_if_c_impl<C, T, E, A...>::type; 474 475template<class C, class T, template<class...> class E, class... A> 476 using mp_eval_if = typename mp_eval_if_c_impl<C::value != 0, T, E, A...>::type; 477 478// mp_map_find 479 480template<class M, class K> struct mp_map_find_impl; 481 482template<class M, class K> using mp_map_find = typename mp_map_find_impl<M, K>::type; 483 484template<template<class...> class M, class K> struct mp_map_find_impl<M<>, K> 485{ 486 using type = void; 487}; 488 489template<template<class...> class M, class T1, class... T, class K> 490 struct mp_map_find_impl<M<T1, T...>, K> 491{ 492 using type = mp_eval_if<std::is_same<mp_front<T1>, K>, T1, mp_map_find, M<T...>, K>; 493}; 494``` 495There you go: 496|=== 497||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 498 499|VC$$++$$ 2013, recursive |15.6 |DNF |||||| 500 501|clang$$++$$ 3.5.1, recursive |1.8 |9.5 |DNF ||||| 502 503|g$$++$$ 4.9.2, recursive |1.4 |7.0 |19.7 |DNF |||| 504|=== 505I told you there was no point. 506 507Point or no, to establish that the recursive implementation is inefficient is 508not the same as to come up with an efficient one. There are two things that 509make the `mp_contains` techniques inapplicable to our present case: first, 510`mp_contains` only had to return true or false, whereas `mp_map_find` returns a 511type, and second, in `mp_contains` we knew the exact type of the element for 512which we were looking, whereas here, we only know its `mp_front`. 513 514Fortunately, there does exist a language feature that can solve both: {cpp} can 515deduce the template parameters of base classes when passed a derived class. In 516this example, 517``` 518struct K1 {}; 519struct V1 {}; 520 521struct X: std::pair<K1, V1> {}; 522 523template<class A, class B> void f(std::pair<A, B> const & p); 524 525int main() 526{ 527 f(X()); 528} 529``` 530the call to `f(X())` deduces `A` as `K1` and `B` as `V1`. If we have more than 531one `std::pair` base class, we can fix `A` to be `K1`: 532``` 533struct K1 {}; 534struct V1 {}; 535 536struct K2 {}; 537struct V2 {}; 538 539struct X: std::pair<K1, V1>, std::pair<K2, V2> {}; 540 541template<class B> void f(std::pair<K1, B> const & p); 542 543int main() 544{ 545 f(X()); 546} 547``` 548and `B` will be deduced as `V1`. 549 550We can retrieve the results of the deduction by returning the type we want: 551``` 552template<class B> std::pair<K1, B> f(std::pair<K1, B> const & p); 553``` 554and then using `decltype(f(X()))` to obtain this return type. 555 556What if `X` doesn't have a base of type `std::pair<K1, B>`? The deduction will 557fail and we'll get an error that `f(X())` cannot be called. To avoid it, we can 558add an overload of `f` that takes anything and returns `void`. But in this 559case, what will happen if `X` has two bases of the form that match the first 560`f` overload, such as for example `std::pair<K1, Y>` and `std::pair<K1, Z>`? 561 562The deduction will fail, the second overload will again be chosen and we'll get 563`void`. This is why we require maps to have unique keys. 564 565Here's an implementation of `mp_map_find` based on this technique: 566``` 567template<class M, class K> struct mp_map_find_impl; 568 569template<class M, class K> 570 using mp_map_find = typename mp_map_find_impl<M, K>::type; 571 572template<template<class...> class M, class... T, class K> 573 struct mp_map_find_impl<M<T...>, K> 574{ 575 struct U: mp_identity<T>... {}; 576 577 template<template<class...> class L, class... U> 578 static mp_identity<L<K, U...>> 579 f( mp_identity<L<K, U...>>* ); 580 581 static mp_identity<void> f( ... ); 582 583 using V = decltype( f((U*)0) ); 584 585 using type = typename V::type; 586}; 587``` 588and its corresponding compile times: 589|=== 590||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 591 592|VC$$++$$ 2013, deduction |0.3 |0.7 |1.8 |3.6 |6.4 |10.4 |16.2 |DNF 593 594|clang$$++$$ 3.5.1, deduction |0.3 |0.4 |0.6 |0.9 |1.2 |1.6 |2.2 |2.7 595 596|g$$++$$ 4.9.2, deduction |0.3 |0.5 |0.9 |1.6 |2.3 |3.4 |4.7 |6.3 597|=== 598This looks ready to ship. 599 600The implementation contains one inefficiency though. If we evaluate 601`mp_map_find<M, K1>`, then `mp_map_find<M, K2>`, the two nested `U` types are 602the same as they only depend on `M`, but the compiler doesn't know that and 603will instantiate each one separately. We should move this type outside 604`mp_map_find_impl` so that it can be reused: 605[subs=+quotes] 606``` 607template<class... T> struct **mp_inherit**: T... {}; 608 609template<class M, class K> struct mp_map_find_impl; 610 611template<class M, class K> 612 using mp_map_find = typename mp_map_find_impl<M, K>::type; 613 614template<template<class...> class M, class... T, class K> 615 struct mp_map_find_impl<M<T...>, K> 616{ 617 **using U = mp_inherit<mp_identity<T>...>;** 618 619 template<template<class...> class L, class... U> 620 static mp_identity<L<K, U...>> 621 f( mp_identity<L<K, U...>>* ); 622 623 static mp_identity<void> f( ... ); 624 625 using V = decltype( f((U*)0) ); 626 627 using type = typename V::type; 628}; 629``` 630(This same optimization, by the way, applies to our `is_base_of` implementation 631of `mp_contains`.) 632 633The improvement in compile times on our benchmark is measurable: 634|=== 635||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 636 637|VC$$++$$ 2013, deduction+mp_inherit |0.3 |0.6 |1.4 |2.6 |4.5 |7.1 |10.7 |DNF 638 639|clang$$++$$ 3.5.1, deduction+mp_inherit |0.3 |0.4 |0.6 |0.8 |1.0 |1.4 |1.8 |2.2 640 641|g$$++$$ 4.9.2, deduction+mp_inherit |0.3 |0.4 |0.6 |0.9 |1.3 |1.8 |2.3 |2.9 642|=== 643 644## mp_at 645 646With sets and maps covered, it's time to tackle vectors. Vectors for us are 647just lists, to which we'll need to add the ability to efficiently access an 648element based on its index. The customary name for this accessor is 649`mp_at<L, I>`, where `L` is a list and `I` is an `integral_constant` that 650represents the index. We'll also follow the Boost.MPL convention and add 651`mp_at_c<L, I>`, where `I` is the index of type `size_t`. 652 653The recursive implementation of `mp_at` is: 654``` 655template<class L, std::size_t I> struct mp_at_c_impl; 656 657template<class L, std::size_t I> using mp_at_c = typename mp_at_c_impl<L, I>::type; 658 659template<class L, class I> using mp_at = typename mp_at_c_impl<L, I::value>::type; 660 661template<template<class...> class L, class T1, class... T> 662 struct mp_at_c_impl<L<T1, T...>, 0> 663{ 664 using type = T1; 665}; 666 667template<template<class...> class L, class T1, class... T, std::size_t I> 668 struct mp_at_c_impl<L<T1, T...>, I> 669{ 670 using type = mp_at_c<L<T...>, I-1>; 671}; 672``` 673and the compile times for making `N` calls to `mp_at` with a list of size `N` 674as the first argument are: 675|=== 676||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 677 678|VC$$++$$ 2013, recursive |3.6 |DNF |||||| 679 680|clang$$++$$ 3.5.1, recursive |1.0 |5.1 |15.3 |DNF |||| 681 682|g$$++$$ 4.9.2, recursive |0.9 |4.7 |14.2 |32.4 |DNF ||| 683|=== 684To improve upon this appalling result, we'll again exploit pack expansion into a 685function call, but in a novel way. Let's suppose that we need to access the 686fourth element (`I = 3`). We'll generate the function signature 687``` 688template<class W> W f( void*, void*, void*, W*, ... ); 689``` 690and then, given a list `L<T1, T2, T3, T4, T5, T6, T7>`, we'll evaluate the 691expression 692``` 693decltype( f( (T1*)0, (T2*)0, (T3*)0, (T4*)0, (T5*)0, (T6*)0, (T7*)0 ) ) 694``` 695The three `void*` parameters will eat the first three elements, `W` will be 696deduced as the fourth, and the ellipsis will take care of the rest. 697 698A working implementation based on this technique is shown below: 699``` 700// mp_repeat_c 701 702template<std::size_t N, class... T> struct mp_repeat_c_impl 703{ 704 using _l1 = typename mp_repeat_c_impl<N/2, T...>::type; 705 using _l2 = typename mp_repeat_c_impl<N%2, T...>::type; 706 707 using type = mp_append<_l1, _l1, _l2>; 708}; 709 710template<class... T> struct mp_repeat_c_impl<0, T...> 711{ 712 using type = mp_list<>; 713}; 714 715template<class... T> struct mp_repeat_c_impl<1, T...> 716{ 717 using type = mp_list<T...>; 718}; 719 720template<std::size_t N, class... T> using mp_repeat_c = 721 typename mp_repeat_c_impl<N, T...>::type; 722 723// mp_at 724 725template<class L, class L2> struct mp_at_c_impl; 726 727template<template<class...> class L, class... T, 728 template<class...> class L2, class... U> 729 struct mp_at_c_impl<L<T...>, L2<U...>> 730{ 731 template<class W> static W f( U*..., W*, ... ); 732 733 using R = decltype( f( (mp_identity<T>*)0 ... ) ); 734 735 using type = typename R::type; 736}; 737 738template<class L, std::size_t I> using mp_at_c = 739 typename mp_at_c_impl<L, mp_repeat_c<I, void>>::type; 740 741template<class L, class I> using mp_at = mp_at_c<L, I::value>; 742``` 743and the compile times in the following table show it to be good enough for most 744practical purposes. 745|=== 746||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 747 748|VC$$++$$ 2013, void* |0.4 |1.1 |2.4 |4.7 |DNF ||| 749 750|clang$$++$$ 3.5.1, void* |0.4 |0.7 |1.2 |1.9 |2.7 |3.8 |5.0 |6.6 751 752|g$$++$$ 4.9.2, void* |0.3 |0.5 |0.9 |1.3 |2.1 |3.0 |4.2 |5.5 753|=== 754Are we done with `mp_at`, then? 755 756Let's try something else -- transform the input list `[T1, T2, T3]` into a map 757`[[0, T1], [1, T2], [2, T3]]`, then use `mp_map_find` for the lookup: 758[subs=+macros] 759``` 760// mp_map_from_list 761 762template<class L, class S> struct mp_map_from_list_impl; 763 764template<template<class...> class L, class... T, std::size_t... J> 765 struct mp_map_from_list_impl<L<T...>, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]<std::size_t, J...>> 766{ 767 using type = mp_list<mp_list<mp_size_t<J>, T>...>; 768}; 769 770template<class L> using mp_map_from_list = typename mp_map_from_list_impl<L, 771 http://en.cppreference.com/w/cpp/utility/integer_sequence[make_index_sequence]<mp_size<L>::value>>::type; 772 773// mp_at 774 775template<class L, std::size_t I> struct mp_at_c_impl 776{ 777 using map = mp_map_from_list<L>; 778 using type = mp_second<mp_map_find<map, mp_size_t<I>>>; 779}; 780 781template<class L, std::size_t I> using mp_at_c = typename mp_at_c_impl<L, I>::type; 782 783template<class L, class I> using mp_at = typename mp_at_c_impl<L, I::value>::type; 784``` 785At first sight, this looks ridiculous, but metaprogramming has its own rules. 786Let's measure: 787|=== 788||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 789 790|VC$$++$$ 2013, map |0.3 |0.7 |1.5 |2.9 |5.0 |7.8 |11.9 |DNF 791 792|clang$$++$$ 3.5.1, map |0.3 |0.4 |0.6 |0.8 |1.1 |1.5 |1.8 |2.3 793 794|g$$++$$ 4.9.2, map |0.3 |0.4 |0.7 |1.0 |1.4 |1.9 |2.5 |3.2 795|=== 796Surprise, this is the best implementation. 797 798## mp_drop 799 800It turned out that we didn't need the `void*` trick for `mp_at`, but I'll show 801an example where we do: `mp_drop`. `mp_drop<L, N>` returns the list `L` without 802its first `N` elements; or, in other words, it drops the first `N` elements 803(presumably on the cutting room floor) and returns what's left. 804 805To implement `mp_drop`, we just need to change 806``` 807template<class W> W f( void*, void*, void*, W*, ... ); 808``` 809from above to return the rest of the elements, rather than just one: 810``` 811template<class... W> mp_list<W> f( void*, void*, void*, W*... ); 812``` 813Adding the necessary `mp_identity` seasoning produces the following working 814implementation: 815``` 816template<class L, class L2> struct mp_drop_c_impl; 817 818template<template<class...> class L, class... T, 819 template<class...> class L2, class... U> 820 struct mp_drop_c_impl<L<T...>, L2<U...>> 821{ 822 template<class... W> static mp_identity<L<W...>> f( U*..., mp_identity<W>*... ); 823 824 using R = decltype( f( (mp_identity<T>*)0 ... ) ); 825 826 using type = typename R::type; 827}; 828 829template<class L, std::size_t N> using mp_drop_c = 830 typename mp_drop_c_impl<L, mp_repeat_c<N, void>>::type; 831 832template<class L, class N> using mp_drop = mp_drop_c<L, N::value>; 833``` 834I'll skip the recursive implementation and the performance comparison for this 835one. We can pretty much tell who's going to win, and by how much. 836 837## mp_find_index 838 839The final algorithm that I'll bring to your attention is `mp_find_index`. 840`mp_find_index<L, V>` returns an integral constant of type `size_t` with a 841value that is the index of the first occurrence of `V` in `L`. If `V` is not in 842`L`, the return value is the size of `L`. 843 844We'll start with the recursive implementation, as usual: 845``` 846template<class L, class V> struct mp_find_index_impl; 847 848template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type; 849 850template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V> 851{ 852 using type = mp_size_t<0>; 853}; 854 855template<template<class...> class L, class... T, class V> 856 struct mp_find_index_impl<L<V, T...>, V> 857{ 858 using type = mp_size_t<0>; 859}; 860 861template<template<class...> class L, class T1, class... T, class V> 862 struct mp_find_index_impl<L<T1, T...>, V> 863{ 864 using type = mp_size_t<1 + mp_find_index<L<T...>, V>::value>; 865}; 866``` 867and will continue with the compile times for `N` calls to `mp_find_index` on a 868list with `N` elements, as usual: 869|=== 870||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 871 872|VC$$++$$ 2013, recursive |3.5 |DNF |||||| 873 874|clang$$++$$ 3.5.1, recursive |1.1 |5.5 |DNF ||||| 875 876|g$$++$$ 4.9.2, recursive |0.8 |4.6 |13.6 |DNF |||| 877|=== 878What can we do here? 879 880Let's go back to `mp_contains` and look at the "mp_count/cx_plus2" 881implementation which we rejected in favor of the inheritance-based one. It 882built a `constexpr` array of booleans and summed them in a `constexpr` 883function. We can do the same here, except instead of summing the array, we can 884find the index of the first true value: 885``` 886template<class L, class V> struct mp_find_index_impl; 887 888template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type; 889 890template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V> 891{ 892 using type = mp_size_t<0>; 893}; 894 895constexpr std::size_t cx_find_index( bool const * first, bool const * last ) 896{ 897 return first == last || *first? 0: 1 + cx_find_index( first + 1, last ); 898} 899 900template<template<class...> class L, class... T, class V> 901 struct mp_find_index_impl<L<T...>, V> 902{ 903 static constexpr bool _v[] = { std::is_same<T, V>::value... }; 904 905 using type = mp_size_t< cx_find_index( _v, _v + sizeof...(T) ) >; 906}; 907``` 908The performance of this version is: 909|=== 910||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 911 912|clang$$++$$ 3.5.1, constexpr |0.5 |1.3 |2.9 |DNF |||| 913 914|g$$++$$ 4.9.2, constexpr |0.5 |1.4 |3.1 |5.5 |8.7 |13.0 |18.0 |DNF 915|=== 916which, while not ideal, is significantly better than our recursive punching 917bag. But if our compiler of choice is VC$$++$$ 2013, we can't use `constexpr`. 918 919We may attempt an implementation along the same lines, but with the `constexpr` 920function replaced with ordinary metaprogramming: 921``` 922template<class L, class V> struct mp_find_index_impl; 923 924template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type; 925 926template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V> 927{ 928 using type = mp_size_t<0>; 929}; 930 931template<bool...> struct find_index_impl_; 932 933template<> struct find_index_impl_<> 934{ 935 static const std::size_t value = 0; 936}; 937 938template<bool B1, bool... R> struct find_index_impl_<B1, R...> 939{ 940 static const std::size_t value = B1? 0: 1 + find_index_impl_<R...>::value; 941}; 942 943template<bool B1, bool B2, bool B3, bool B4, bool B5, 944 bool B6, bool B7, bool B8, bool B9, bool B10, bool... R> 945 struct find_index_impl_<B1, B2, B3, B4, B5, B6, B7, B8, B9, B10, R...> 946{ 947 static const std::size_t value = B1? 0: B2? 1: B3? 2: B4? 3: B5? 4: 948 B6? 5: B7? 6: B8? 7: B9? 8: B10? 9: 10 + find_index_impl_<R...>::value; 949}; 950 951template<template<class...> class L, class... T, class V> 952 struct mp_find_index_impl<L<T...>, V> 953{ 954 using type = mp_size_t<find_index_impl_<std::is_same<T, V>::value...>::value>; 955}; 956``` 957This is still recursive, so we don't expect miracles, but it wouldn't hurt to 958measure: 959|=== 960||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 961 962|VC$$++$$ 2013, bool... |4.7 |94.5 |488.3 |XFA |||| 963 964|clang$$++$$ 3.5.1, bool... |0.6 |2.2 |5.8 |12.0 |21.7 |35.2 |DNF | 965 966|g$$++$$ 4.9.2, bool... |0.6 |2.4 |6.5 |13.2 |23.8 |39.1 |59.0 |DNF 967|=== 968(where XFA stands for "experimenter fell asleep".) 969 970This is an interesting tradeoff for VC$$++$$ 2013 and Clang. On the one hand, 971this implementation is slower; on the other, it doesn't crash the compiler as 972easily. Which to prefer is a matter of taste and of stern evaluation of one's 973needs to manipulate type lists of length 300. 974 975Note that once we have `mp_drop` and `mp_find_index`, we can derive the 976`mp_find<L, V>` algorithm, which returns the suffix of `L` starting with the 977first occurrence of `V`, if any, and an empty list otherwise, by using 978`mp_drop<L, mp_find_index<L, V>>`. 979 980## Conclusion 981 982In this article, I have shown efficient algorithms that allow us to treat type 983lists as sets, maps and vectors, demonstrating various {cpp}11 implementation 984techniques in the process. 985