• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1////
2Copyright 2015-2017 Peter Dimov
3
4Distributed under the Boost Software License, Version 1.0.
5
6See accompanying file LICENSE_1_0.txt or copy at
7http://www.boost.org/LICENSE_1_0.txt
8////
9
10# Simple {cpp}11 metaprogramming, part 2
11Peter Dimov
122015-06-20
13:toc: left
14:idprefix:
15:docinfo: shared-footer
16
17[.lead]
18__Efficient algorithms for membership testing, random access, and retrieval by
19key__
20
21NOTE: Being late to the metaprogramming party, I make no claim of having
22invented the techniques in this article. A quick look at the implementations
23of, for example, Louis Dionne's https://github.com/ldionne/mpl11[mpl11] and
24Eric Niebler's https://github.com/ericniebler/meta[meta], shows that most of
25these tricks are already known. Dave Abrahams
26https://github.com/dabrahams/mpl11[has experimented] along these lines in 2012.
27The original inventor of the multiple inheritance trick and the `void*`
28arguments trick is probably Richard Smith, who has posted
29https://llvm.org/bugs/attachment.cgi?id=8825[two]
30https://llvm.org/bugs/attachment.cgi?id=8838[examples] in response to
31https://llvm.org/bugs/show_bug.cgi?id=13263[a Clang bug report].
32
33## Vectors, sets, and maps
34
35<<simple_cxx11_metaprogramming.adoc#,Last time>>, I outlined a style of
36metaprogramming that operated on type lists -- variadic class templates:
37```
38template<class... T> struct mp_list {};
39```
40Classic Lisp uses lists as its only data structure, but operating on a list is
41usually linear in the number of its elements.
42
43In addition to `list`, the STL has `vector`, `set`, and `map`. `vector`
44supports random access by index; `set` has efficient test for membership; `map`
45associates keys with values and has efficient lookup based on key.
46
47Instead of introducing separate data structure such as `mp_vector`, `mp_set`,
48`mp_map`, we'll keep our data in a list form, and attempt to provide efficient
49algorithms for random access, membership testing, and lookup.
50
51## mp_contains
52
53Let's starts with sets. A set is just a list with unique elements. To obtain a
54set from an arbitrary list, we'll need an algorithm that removes the
55duplicates. Let's call it `mp_unique<L>`:
56[subs=+quotes]
57```
58// mp_if
59
60template<bool C, class T, class E> struct mp_if_c_impl;
61
62template<class T, class E> struct mp_if_c_impl<true, T, E>
63{
64    using type = T;
65};
66
67template<class T, class E> struct mp_if_c_impl<false, T, E>
68{
69    using type = E;
70};
71
72template<bool C, class T, class E>
73    using mp_if_c = typename mp_if_c_impl<C, T, E>::type;
74
75template<class C, class T, class E>
76    using mp_if = typename mp_if_c_impl<C::value != 0, T, E>::type;
77
78// mp_unique
79
80template<class L> struct mp_unique_impl;
81
82template<class L> using mp_unique = typename mp_unique_impl<L>::type;
83
84template<template<class...> class L> struct mp_unique_impl<L<>>
85{
86    using type = L<>;
87};
88
89template<template<class...> class L, class T1, class... T>
90    struct mp_unique_impl<L<T1, T...>>
91{
92    using _rest = mp_unique<L<T...>>;
93    using type = mp_if<**mp_contains**<_rest, T1>, _rest, mp_push_front<_rest, T1>>;
94};
95```
96For membership testing, we've introduced an algorithm `mp_contains<L, V>` that
97returns `true` when `L` contains `V`. The straightforward recursive
98implementation of `mp_contains` is:
99```
100template<class L, class V> struct mp_contains_impl;
101
102template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type;
103
104template<template<class...> class L, class V> struct mp_contains_impl<L<>, V>
105{
106    using type = std::false_type;
107};
108
109template<template<class...> class L, class... T, class V>
110    struct mp_contains_impl<L<V, T...>, V>
111{
112    using type = std::true_type;
113};
114
115template<template<class...> class L, class T1, class... T, class V>
116    struct mp_contains_impl<L<T1, T...>, V>: mp_contains_impl<L<T...>, V>
117{
118};
119```
120Note that `mp_unique<L>` makes `N` calls to `mp_contains`, where `N` is the
121length of the list `L`. This means that `mp_contains` needs to be as fast as
122possible, which the above implementation, well, isn't.
123
124Here are the compile times in seconds for invoking `mp_unique` on a list with
125`N` (distinct) elements:
126|===
127||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
128
129|VC$$++$$ 2013, recursive |2.1 |DNF ||||||
130
131|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF |||
132
133|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF |||
134|===
135(Tests done under Windows/Cygwin. All compilers are 32 bit. No optimizations.
136DNF stands for "did not finish", which usually means that the compiler ran out
137of heap space or crashed.)
138
139We clearly need a better alternative.
140
141I ended the previous article with an implementation of `mp_contains` that
142relied on `mp_count`, which in turn relied on `mp_plus`. Let's see how it
143fares:
144|===
145||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
146
147|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF ||||
148
149|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF |||
150
151|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3
152|===
153Not _that_ bad, at least if your compiler happens to be `g$$++$$`. Still, there
154ought to be room for improvement here.
155
156To do better, we have to somehow leverage the language features, such as pack
157expansion, to do more of the work for us. For inspiration, let's turn to
158section 14.5.3 paragraph 4 of the {cpp}11 standard, which explains that pack
159expansions can occur in the following contexts:
160
161* **In a function parameter pack (8.3.5); the pattern is the
162  __parameter-declaration__ without the ellipsis.**
163* In a template parameter pack that is a pack expansion (14.1):
164* **In an __initializer-list__ (8.5); the pattern is an
165  __initializer-clause__.**
166* **In a __base-specifier-list__ (Clause 10); the pattern is a
167  __base-specifier__.**
168* In a __mem-initializer-list__ (12.6.2); the pattern is a
169  __mem-initializer__.
170* In a __template-argument-list__ (14.3); the pattern is a
171  __template-argument__.
172* In a __dynamic-exception-specification__ (15.4); the pattern is a
173  __type-id__.
174* In an __attribute-list__ (7.6.1); the pattern is an __attribute__.
175* In an __alignment-specifier__ (7.6.2); the pattern is the
176  __alignment-specifier__ without the ellipsis.
177* In a __capture-list__ (5.1.2); the pattern is a __capture__.
178* In a `sizeof$$...$$` expression (5.3.3); the pattern is an __identifier__.
179
180The **emphasis** is mine and indicates possible leads.
181
182Our first option is to expand the parameter pack into arguments for a function
183call. Since we're interested in operations that occur at compile time, calling
184a function may not appear useful; but {cpp}11 functions can be `constexpr`, and
185`constexpr` function "calls" do occur at compile time.
186
187Recall our `mp_count`:
188```
189template<class L, class V> struct mp_count_impl;
190
191template<template<class...> class L, class... T, class V>
192    struct mp_count_impl<L<T...>, V>
193{
194    using type = mp_plus<std::is_same<T, V>...>;
195};
196
197template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type;
198```
199Instead of using the template alias `mp_plus` to sum the `is_same` expressions,
200we can use a `constexpr` function:
201```
202constexpr std::size_t cx_plus()
203{
204    return 0;
205}
206
207template<class T1, class... T> constexpr std::size_t cx_plus(T1 t1, T... t)
208{
209    return t1 + cx_plus(t...);
210}
211
212// mp_size_t
213
214template<std::size_t N> using mp_size_t = std::integral_constant<std::size_t, N>;
215
216// mp_count
217
218template<class L, class V> struct mp_count_impl;
219
220template<template<class...> class L, class... T, class V>
221    struct mp_count_impl<L<T...>, V>
222{
223    using type = mp_size_t<cx_plus(std::is_same<T, V>::value...)>;
224};
225
226template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type;
227```
228with the following results:
229|===
230||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
231
232|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF |||
233
234|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8
235|===
236We've improved the times, but lost VC$$++$$ 2013 due to its not implementing
237`constexpr`.
238
239Let's try pack expansion into an __initializer-list__. Instead of passing the
240`is_same` expressions to a function, we can build a constant array out of them,
241then sum the array with a `constexpr` function:
242```
243constexpr std::size_t cx_plus2(bool const * first, bool const * last)
244{
245    return first == last? 0: *first + cx_plus2(first + 1, last);
246}
247
248// mp_count
249
250template<class L, class V> struct mp_count_impl;
251
252template<template<class...> class L, class... T, class V>
253    struct mp_count_impl<L<T...>, V>
254{
255    static constexpr bool _v[] = { std::is_same<T, V>::value... };
256    using type = mp_size_t<cx_plus2(_v, _v + sizeof...(T))>;
257};
258
259template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type;
260```
261This is a neat trick, but is it fast?
262|===
263||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
264
265|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF ||||
266
267|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7
268|===
269That's a bit disappointing. Let's see what can we do with expanding a parameter
270pack into a base-specifier-list. We would be able to define a class that
271derives from every element of the pack:
272```
273struct U: T... {};
274```
275We can then use `std::is_base_of<V, U>` to test whether a type `V` is a base of
276`U`, that is, whether it's one of the elements of the parameter pack. Which is
277exactly what we need.
278
279Arbitrary types such as `void`, `int`, or `void(int)` can't be used as base
280classes, but we'll wrap the types in an empty class template, which we'll call
281`mp_identity`.
282```
283template<class T> struct mp_identity
284{
285    using type = T;
286};
287
288template<class L, class V> struct mp_contains_impl;
289
290template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type;
291
292template<template<class...> class L, class... T, class V>
293    struct mp_contains_impl<L<T...>, V>
294{
295    struct U: mp_identity<T>... {};
296    using type = std::is_base_of<mp_identity<V>, U>;
297};
298```
299Performance?
300|===
301||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
302
303|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF |||
304
305|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF |||
306
307|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0
308|===
309This implementation is a clear winner.
310
311In fairness, we ought to note that the first four implementations of
312`mp_contains` do not rely on the list elements being unique. This makes
313`mp_contains` an algorithm that supports arbitrary lists, not just sets.
314
315The `is_base_of` implementation, however, does not support lists that contain
316duplicates, because it's not possible to inherit directly from the same type
317twice. So it does not implement the general `mp_contains`, but something that
318should probably be named `mp_set_contains`.
319
320We can avoid the "no duplicates" requirement by modifying the implementation to
321inherit from `mp_identity<T>` indirectly, via an intermediate base class:
322[subs=+macros]
323```
324// indirect_inherit
325
326template<std::size_t I, class T> struct inherit_second: T {};
327
328template<class L, class S> struct indirect_inherit_impl;
329
330template<template<class...> class L, class... T, std::size_t... J>
331    struct indirect_inherit_impl<L<T...>, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]<std::size_t, J...>>:
332        inherit_second<J, mp_identity<T>>... {};
333
334template<class L> using indirect_inherit =
335    indirect_inherit_impl<L, http://en.cppreference.com/w/cpp/utility/integer_sequence[make_index_sequence]<mp_size<L>::value>>;
336
337// mp_contains
338
339template<class L, class V> struct mp_contains_impl
340{
341    using U = indirect_inherit<L>;
342    using type = std::is_base_of<mp_identity<V>, U>;
343};
344
345template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type;
346```
347This, however, pretty much nullifies the spectacular performance gains we've
348observed with the original `is_base_of`-based implementation:
349|===
350||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
351
352|VC$$++$$ 2013, recursive |2.1 |DNF ||||||
353
354|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF ||||
355
356|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF |||
357
358|VC$$++$$ 2013, is_base_of/indirect |1.0 |9.3 |49.5 |153.8 |DNF |||
359|===
360|===
361||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
362
363|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF |||
364
365|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF |||
366
367|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF |||
368
369|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF ||||
370
371|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF |||
372
373|clang$$++$$ 3.5.1, is_base_of/indirect |0.4 |0.9 |1.6 |2.5 |DNF |||
374|===
375|===
376||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
377
378|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF |||
379
380|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3
381
382|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8
383
384|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7
385
386|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0
387
388|g$$++$$ 4.9.2, is_base_of/indirect |0.5 |1.1 |2.3 |4.0 |6.6 |9.8 |13.6 |18.2
389|===
390
391## mp_map_find
392
393A map, in the STL sense, is a data structure that associates keys with values
394and can efficiently retrieve, given a key, its associated value. For our
395purposes, a map will be any list of lists for which the inner lists have at
396least one element, the key; the rest of the elements we'll consider to be the
397associated value. For example, the list
398```
399[[A, B], [C, D, E], [F], [G, H]]
400```
401is a map with keys `A`, `C`, `F`, and `G`, with associated values `[B]`,
402`[D, E]`, `[]`, and `[H]`, respectively. We'll require unique keys, for reasons
403that'll become evident later.
404
405I'll show two other examples of maps, this time using real {cpp} code:
406```
407using Map = mp_list<mp_list<int, int*>, mp_list<void, void*>, mp_list<char, char*>>;
408```
409```
410using Map2 = std::tuple<std::pair<int, int[2]>, std::pair<char, char[2]>>;
411```
412The Lisp name of the algorithm that performs retrieval based on key is `ASSOC`,
413but I'll call it `mp_map_find`. `mp_map_find<M, K>` returns the element of `M`
414whose first element is `K`. For example, `mp_map_find<Map2, int>` would return
415`std::pair<int, int[2]>`. If there's no such key, it returns `void`.
416
417There's almost no need to implement and benchmark the recursive version of
418`mp_map_find` -- we can be pretty sure it will perform horribly. Still,
419```
420template<class M, class K> struct mp_map_find_impl;
421
422template<class M, class K> using mp_map_find = typename mp_map_find_impl<M, K>::type;
423
424template<template<class...> class M, class K> struct mp_map_find_impl<M<>, K>
425{
426    using type = void;
427};
428
429template<template<class...> class M, class T1, class... T, class K>
430    struct mp_map_find_impl<M<T1, T...>, K>
431{
432    using type = mp_if<std::is_same<mp_front<T1>, K>, T1, mp_map_find<M<T...>, K>>;
433};
434```
435The compile time, in seconds, for `N` lookups into a map of size `N`, is as
436follows:
437|===
438||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
439
440|VC$$++$$ 2013, recursive |38.2 |DNF ||||||
441
442|clang$$++$$ 3.5.1, recursive |2.5 |13.7 |DNF |||||
443
444|g$$++$$ 4.9.2, recursive |1.9 |10.2 |28.8 |DNF ||||
445|===
446I told you there was no point.
447
448But, I hear some of you say, you're evaluating the else branch even if the
449condition is true, and that's horribly inefficient!
450
451Well, this would only improve the performance by a factor of approximately two
452on average, and only if the element is present, but fine, let's try it. The
453element happens to be present in the benchmark, so let's see.
454```
455// mp_eval_if
456
457template<bool C, class T, template<class...> class E, class... A>
458    struct mp_eval_if_c_impl;
459
460template<class T, template<class...> class E, class... A>
461    struct mp_eval_if_c_impl<true, T, E, A...>
462{
463    using type = T;
464};
465
466template<class T, template<class...> class E, class... A>
467    struct mp_eval_if_c_impl<false, T, E, A...>
468{
469    using type = E<A...>;
470};
471
472template<bool C, class T, template<class...> class E, class... A>
473    using mp_eval_if_c = typename mp_eval_if_c_impl<C, T, E, A...>::type;
474
475template<class C, class T, template<class...> class E, class... A>
476    using mp_eval_if = typename mp_eval_if_c_impl<C::value != 0, T, E, A...>::type;
477
478// mp_map_find
479
480template<class M, class K> struct mp_map_find_impl;
481
482template<class M, class K> using mp_map_find = typename mp_map_find_impl<M, K>::type;
483
484template<template<class...> class M, class K> struct mp_map_find_impl<M<>, K>
485{
486    using type = void;
487};
488
489template<template<class...> class M, class T1, class... T, class K>
490    struct mp_map_find_impl<M<T1, T...>, K>
491{
492    using type = mp_eval_if<std::is_same<mp_front<T1>, K>, T1, mp_map_find, M<T...>, K>;
493};
494```
495There you go:
496|===
497||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
498
499|VC$$++$$ 2013, recursive |15.6 |DNF ||||||
500
501|clang$$++$$ 3.5.1, recursive |1.8 |9.5 |DNF |||||
502
503|g$$++$$ 4.9.2, recursive |1.4 |7.0 |19.7 |DNF ||||
504|===
505I told you there was no point.
506
507Point or no, to establish that the recursive implementation is inefficient is
508not the same as to come up with an efficient one. There are two things that
509make the `mp_contains` techniques inapplicable to our present case: first,
510`mp_contains` only had to return true or false, whereas `mp_map_find` returns a
511type, and second, in `mp_contains` we knew the exact type of the element for
512which we were looking, whereas here, we only know its `mp_front`.
513
514Fortunately, there does exist a language feature that can solve both: {cpp} can
515deduce the template parameters of base classes when passed a derived class. In
516this example,
517```
518struct K1 {};
519struct V1 {};
520
521struct X: std::pair<K1, V1> {};
522
523template<class A, class B> void f(std::pair<A, B> const & p);
524
525int main()
526{
527    f(X());
528}
529```
530the call to `f(X())` deduces `A` as `K1` and `B` as `V1`. If we have more than
531one `std::pair` base class, we can fix `A` to be `K1`:
532```
533struct K1 {};
534struct V1 {};
535
536struct K2 {};
537struct V2 {};
538
539struct X: std::pair<K1, V1>, std::pair<K2, V2> {};
540
541template<class B> void f(std::pair<K1, B> const & p);
542
543int main()
544{
545    f(X());
546}
547```
548and `B` will be deduced as `V1`.
549
550We can retrieve the results of the deduction by returning the type we want:
551```
552template<class B> std::pair<K1, B> f(std::pair<K1, B> const & p);
553```
554and then using `decltype(f(X()))` to obtain this return type.
555
556What if `X` doesn't have a base of type `std::pair<K1, B>`? The deduction will
557fail and we'll get an error that `f(X())` cannot be called. To avoid it, we can
558add an overload of `f` that takes anything and returns `void`. But in this
559case, what will happen if `X` has two bases of the form that match the first
560`f` overload, such as for example `std::pair<K1, Y>` and `std::pair<K1, Z>`?
561
562The deduction will fail, the second overload will again be chosen and we'll get
563`void`. This is why we require maps to have unique keys.
564
565Here's an implementation of `mp_map_find` based on this technique:
566```
567template<class M, class K> struct mp_map_find_impl;
568
569template<class M, class K>
570    using mp_map_find = typename mp_map_find_impl<M, K>::type;
571
572template<template<class...> class M, class... T, class K>
573    struct mp_map_find_impl<M<T...>, K>
574{
575    struct U: mp_identity<T>... {};
576
577    template<template<class...> class L, class... U>
578        static mp_identity<L<K, U...>>
579        f( mp_identity<L<K, U...>>* );
580
581    static mp_identity<void> f( ... );
582
583    using V = decltype( f((U*)0) );
584
585    using type = typename V::type;
586};
587```
588and its corresponding compile times:
589|===
590||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
591
592|VC$$++$$ 2013, deduction |0.3 |0.7 |1.8 |3.6 |6.4 |10.4 |16.2 |DNF
593
594|clang$$++$$ 3.5.1, deduction |0.3 |0.4 |0.6 |0.9 |1.2 |1.6 |2.2 |2.7
595
596|g$$++$$ 4.9.2, deduction |0.3 |0.5 |0.9 |1.6 |2.3 |3.4 |4.7 |6.3
597|===
598This looks ready to ship.
599
600The implementation contains one inefficiency though. If we evaluate
601`mp_map_find<M, K1>`, then `mp_map_find<M, K2>`, the two nested `U` types are
602the same as they only depend on `M`, but the compiler doesn't know that and
603will instantiate each one separately. We should move this type outside
604`mp_map_find_impl` so that it can be reused:
605[subs=+quotes]
606```
607template<class... T> struct **mp_inherit**: T... {};
608
609template<class M, class K> struct mp_map_find_impl;
610
611template<class M, class K>
612    using mp_map_find = typename mp_map_find_impl<M, K>::type;
613
614template<template<class...> class M, class... T, class K>
615    struct mp_map_find_impl<M<T...>, K>
616{
617    **using U = mp_inherit<mp_identity<T>...>;**
618
619    template<template<class...> class L, class... U>
620        static mp_identity<L<K, U...>>
621        f( mp_identity<L<K, U...>>* );
622
623    static mp_identity<void> f( ... );
624
625    using V = decltype( f((U*)0) );
626
627    using type = typename V::type;
628};
629```
630(This same optimization, by the way, applies to our `is_base_of` implementation
631of `mp_contains`.)
632
633The improvement in compile times on our benchmark is measurable:
634|===
635||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
636
637|VC$$++$$ 2013, deduction+mp_inherit |0.3 |0.6 |1.4 |2.6 |4.5 |7.1 |10.7 |DNF
638
639|clang$$++$$ 3.5.1, deduction+mp_inherit |0.3 |0.4 |0.6 |0.8 |1.0 |1.4 |1.8 |2.2
640
641|g$$++$$ 4.9.2, deduction+mp_inherit |0.3 |0.4 |0.6 |0.9 |1.3 |1.8 |2.3 |2.9
642|===
643
644## mp_at
645
646With sets and maps covered, it's time to tackle vectors. Vectors for us are
647just lists, to which we'll need to add the ability to efficiently access an
648element based on its index. The customary name for this accessor is
649`mp_at<L, I>`, where `L` is a list and `I` is an `integral_constant` that
650represents the index. We'll also follow the Boost.MPL convention and add
651`mp_at_c<L, I>`, where `I` is the index of type `size_t`.
652
653The recursive implementation of `mp_at` is:
654```
655template<class L, std::size_t I> struct mp_at_c_impl;
656
657template<class L, std::size_t I> using mp_at_c = typename mp_at_c_impl<L, I>::type;
658
659template<class L, class I> using mp_at = typename mp_at_c_impl<L, I::value>::type;
660
661template<template<class...> class L, class T1, class... T>
662    struct mp_at_c_impl<L<T1, T...>, 0>
663{
664    using type = T1;
665};
666
667template<template<class...> class L, class T1, class... T, std::size_t I>
668    struct mp_at_c_impl<L<T1, T...>, I>
669{
670    using type = mp_at_c<L<T...>, I-1>;
671};
672```
673and the compile times for making `N` calls to `mp_at` with a list of size `N`
674as the first argument are:
675|===
676||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
677
678|VC$$++$$ 2013, recursive |3.6 |DNF ||||||
679
680|clang$$++$$ 3.5.1, recursive |1.0 |5.1 |15.3 |DNF ||||
681
682|g$$++$$ 4.9.2, recursive |0.9 |4.7 |14.2 |32.4 |DNF |||
683|===
684To improve upon this appalling result, we'll again exploit pack expansion into a
685function call, but in a novel way. Let's suppose that we need to access the
686fourth element (`I = 3`). We'll generate the function signature
687```
688template<class W> W f( void*, void*, void*, W*, ... );
689```
690and then, given a list `L<T1, T2, T3, T4, T5, T6, T7>`, we'll evaluate the
691expression
692```
693decltype( f( (T1*)0, (T2*)0, (T3*)0, (T4*)0, (T5*)0, (T6*)0, (T7*)0 ) )
694```
695The three `void*` parameters will eat the first three elements, `W` will be
696deduced as the fourth, and the ellipsis will take care of the rest.
697
698A working implementation based on this technique is shown below:
699```
700// mp_repeat_c
701
702template<std::size_t N, class... T> struct mp_repeat_c_impl
703{
704    using _l1 = typename mp_repeat_c_impl<N/2, T...>::type;
705    using _l2 = typename mp_repeat_c_impl<N%2, T...>::type;
706
707    using type = mp_append<_l1, _l1, _l2>;
708};
709
710template<class... T> struct mp_repeat_c_impl<0, T...>
711{
712    using type = mp_list<>;
713};
714
715template<class... T> struct mp_repeat_c_impl<1, T...>
716{
717    using type = mp_list<T...>;
718};
719
720template<std::size_t N, class... T> using mp_repeat_c =
721    typename mp_repeat_c_impl<N, T...>::type;
722
723// mp_at
724
725template<class L, class L2> struct mp_at_c_impl;
726
727template<template<class...> class L, class... T,
728    template<class...> class L2, class... U>
729    struct mp_at_c_impl<L<T...>, L2<U...>>
730{
731    template<class W> static W f( U*..., W*, ... );
732
733    using R = decltype( f( (mp_identity<T>*)0 ... ) );
734
735    using type = typename R::type;
736};
737
738template<class L, std::size_t I> using mp_at_c =
739    typename mp_at_c_impl<L, mp_repeat_c<I, void>>::type;
740
741template<class L, class I> using mp_at = mp_at_c<L, I::value>;
742```
743and the compile times in the following table show it to be good enough for most
744practical purposes.
745|===
746||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
747
748|VC$$++$$ 2013, void* |0.4 |1.1 |2.4 |4.7 |DNF |||
749
750|clang$$++$$ 3.5.1, void* |0.4 |0.7 |1.2 |1.9 |2.7 |3.8 |5.0 |6.6
751
752|g$$++$$ 4.9.2, void* |0.3 |0.5 |0.9 |1.3 |2.1 |3.0 |4.2 |5.5
753|===
754Are we done with `mp_at`, then?
755
756Let's try something else -- transform the input list `[T1, T2, T3]` into a map
757`[[0, T1], [1, T2], [2, T3]]`, then use `mp_map_find` for the lookup:
758[subs=+macros]
759```
760// mp_map_from_list
761
762template<class L, class S> struct mp_map_from_list_impl;
763
764template<template<class...> class L, class... T, std::size_t... J>
765    struct mp_map_from_list_impl<L<T...>, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]<std::size_t, J...>>
766{
767    using type = mp_list<mp_list<mp_size_t<J>, T>...>;
768};
769
770template<class L> using mp_map_from_list = typename mp_map_from_list_impl<L,
771    http://en.cppreference.com/w/cpp/utility/integer_sequence[make_index_sequence]<mp_size<L>::value>>::type;
772
773// mp_at
774
775template<class L, std::size_t I> struct mp_at_c_impl
776{
777    using map = mp_map_from_list<L>;
778    using type = mp_second<mp_map_find<map, mp_size_t<I>>>;
779};
780
781template<class L, std::size_t I> using mp_at_c = typename mp_at_c_impl<L, I>::type;
782
783template<class L, class I> using mp_at = typename mp_at_c_impl<L, I::value>::type;
784```
785At first sight, this looks ridiculous, but metaprogramming has its own rules.
786Let's measure:
787|===
788||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
789
790|VC$$++$$ 2013, map |0.3 |0.7 |1.5 |2.9 |5.0 |7.8 |11.9 |DNF
791
792|clang$$++$$ 3.5.1, map |0.3 |0.4 |0.6 |0.8 |1.1 |1.5 |1.8 |2.3
793
794|g$$++$$ 4.9.2, map |0.3 |0.4 |0.7 |1.0 |1.4 |1.9 |2.5 |3.2
795|===
796Surprise, this is the best implementation.
797
798## mp_drop
799
800It turned out that we didn't need the `void*` trick for `mp_at`, but I'll show
801an example where we do: `mp_drop`. `mp_drop<L, N>` returns the list `L` without
802its first `N` elements; or, in other words, it drops the first `N` elements
803(presumably on the cutting room floor) and returns what's left.
804
805To implement `mp_drop`, we just need to change
806```
807template<class W> W f( void*, void*, void*, W*, ... );
808```
809from above to return the rest of the elements, rather than just one:
810```
811template<class... W> mp_list<W> f( void*, void*, void*, W*... );
812```
813Adding the necessary `mp_identity` seasoning produces the following working
814implementation:
815```
816template<class L, class L2> struct mp_drop_c_impl;
817
818template<template<class...> class L, class... T,
819    template<class...> class L2, class... U>
820    struct mp_drop_c_impl<L<T...>, L2<U...>>
821{
822    template<class... W> static mp_identity<L<W...>> f( U*..., mp_identity<W>*... );
823
824    using R = decltype( f( (mp_identity<T>*)0 ... ) );
825
826    using type = typename R::type;
827};
828
829template<class L, std::size_t N> using mp_drop_c =
830    typename mp_drop_c_impl<L, mp_repeat_c<N, void>>::type;
831
832template<class L, class N> using mp_drop = mp_drop_c<L, N::value>;
833```
834I'll skip the recursive implementation and the performance comparison for this
835one. We can pretty much tell who's going to win, and by how much.
836
837## mp_find_index
838
839The final algorithm that I'll bring to your attention is `mp_find_index`.
840`mp_find_index<L, V>` returns an integral constant of type `size_t` with a
841value that is the index of the first occurrence of `V` in `L`. If `V` is not in
842`L`, the return value is the size of `L`.
843
844We'll start with the recursive implementation, as usual:
845```
846template<class L, class V> struct mp_find_index_impl;
847
848template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type;
849
850template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V>
851{
852    using type = mp_size_t<0>;
853};
854
855template<template<class...> class L, class... T, class V>
856    struct mp_find_index_impl<L<V, T...>, V>
857{
858    using type = mp_size_t<0>;
859};
860
861template<template<class...> class L, class T1, class... T, class V>
862    struct mp_find_index_impl<L<T1, T...>, V>
863{
864    using type = mp_size_t<1 + mp_find_index<L<T...>, V>::value>;
865};
866```
867and will continue with the compile times for `N` calls to `mp_find_index` on a
868list with `N` elements, as usual:
869|===
870||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
871
872|VC$$++$$ 2013, recursive |3.5 |DNF ||||||
873
874|clang$$++$$ 3.5.1, recursive |1.1 |5.5 |DNF |||||
875
876|g$$++$$ 4.9.2, recursive |0.8 |4.6 |13.6 |DNF ||||
877|===
878What can we do here?
879
880Let's go back to `mp_contains` and look at the "mp_count/cx_plus2"
881implementation which we rejected in favor of the inheritance-based one. It
882built a `constexpr` array of booleans and summed them in a `constexpr`
883function. We can do the same here, except instead of summing the array, we can
884find the index of the first true value:
885```
886template<class L, class V> struct mp_find_index_impl;
887
888template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type;
889
890template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V>
891{
892    using type = mp_size_t<0>;
893};
894
895constexpr std::size_t cx_find_index( bool const * first, bool const * last )
896{
897    return first == last || *first? 0: 1 + cx_find_index( first + 1, last );
898}
899
900template<template<class...> class L, class... T, class V>
901    struct mp_find_index_impl<L<T...>, V>
902{
903    static constexpr bool _v[] = { std::is_same<T, V>::value... };
904
905    using type = mp_size_t< cx_find_index( _v, _v + sizeof...(T) ) >;
906};
907```
908The performance of this version is:
909|===
910||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
911
912|clang$$++$$ 3.5.1, constexpr |0.5 |1.3 |2.9 |DNF ||||
913
914|g$$++$$ 4.9.2, constexpr |0.5 |1.4 |3.1 |5.5 |8.7 |13.0 |18.0 |DNF
915|===
916which, while not ideal, is significantly better than our recursive punching
917bag. But if our compiler of choice is VC$$++$$ 2013, we can't use `constexpr`.
918
919We may attempt an implementation along the same lines, but with the `constexpr`
920function replaced with ordinary metaprogramming:
921```
922template<class L, class V> struct mp_find_index_impl;
923
924template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type;
925
926template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V>
927{
928    using type = mp_size_t<0>;
929};
930
931template<bool...> struct find_index_impl_;
932
933template<> struct find_index_impl_<>
934{
935    static const std::size_t value = 0;
936};
937
938template<bool B1, bool... R> struct find_index_impl_<B1, R...>
939{
940    static const std::size_t value = B1? 0: 1 + find_index_impl_<R...>::value;
941};
942
943template<bool B1, bool B2, bool B3, bool B4, bool B5,
944    bool B6, bool B7, bool B8, bool B9, bool B10, bool... R>
945    struct find_index_impl_<B1, B2, B3, B4, B5, B6, B7, B8, B9, B10, R...>
946{
947    static const std::size_t value = B1? 0: B2? 1: B3? 2: B4? 3: B5? 4:
948        B6? 5: B7? 6: B8? 7: B9? 8: B10? 9: 10 + find_index_impl_<R...>::value;
949};
950
951template<template<class...> class L, class... T, class V>
952    struct mp_find_index_impl<L<T...>, V>
953{
954    using type = mp_size_t<find_index_impl_<std::is_same<T, V>::value...>::value>;
955};
956```
957This is still recursive, so we don't expect miracles, but it wouldn't hurt to
958measure:
959|===
960||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
961
962|VC$$++$$ 2013, bool... |4.7 |94.5 |488.3 |XFA ||||
963
964|clang$$++$$ 3.5.1, bool... |0.6 |2.2 |5.8 |12.0 |21.7 |35.2 |DNF |
965
966|g$$++$$ 4.9.2, bool... |0.6 |2.4 |6.5 |13.2 |23.8 |39.1 |59.0 |DNF
967|===
968(where XFA stands for "experimenter fell asleep".)
969
970This is an interesting tradeoff for VC$$++$$ 2013 and Clang. On the one hand,
971this implementation is slower; on the other, it doesn't crash the compiler as
972easily. Which to prefer is a matter of taste and of stern evaluation of one's
973needs to manipulate type lists of length 300.
974
975Note that once we have `mp_drop` and `mp_find_index`, we can derive the
976`mp_find<L, V>` algorithm, which returns the suffix of `L` starting with the
977first occurrence of `V`, if any, and an empty list otherwise, by using
978`mp_drop<L, mp_find_index<L, V>>`.
979
980## Conclusion
981
982In this article, I have shown efficient algorithms that allow us to treat type
983lists as sets, maps and vectors, demonstrating various {cpp}11 implementation
984techniques in the process.
985