• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[library Boost.PolyCollection
2  [quickbook 1.6]
3  [authors [López Muñoz, Joaquín M]]
4  [copyright 2016-2020 Joaquín M López Muñoz]
5  [category containers]
6  [id poly_collection]
7  [dirname poly_collection]
8  [purpose
9    High-performance containers for polymorphic objects.
10  ]
11  [license
12    Distributed under the Boost Software License, Version 1.0.
13    (See accompanying file LICENSE_1_0.txt or copy at
14    [@http://www.boost.org/LICENSE_1_0.txt])
15  ]
16]
17
18[def _Boost.TypeErasure_ [@boost:/libs/type_erasure Boost.TypeErasure]]
19[def _Boost.TypeIndex_ [@boost:/libs/type_index Boost.TypeIndex]]
20[def _CopyConstructible_ [@http://en.cppreference.com/w/cpp/named_req/CopyConstructible [* `CopyConstructible`]]]
21[def _duck_typing_ [@https://en.wikipedia.org/wiki/Duck_typing ['duck typing]]]
22[def _EqualityComparable_ [@http://en.cppreference.com/w/cpp/named_req/EqualityComparable [* `EqualityComparable`]]]
23[def _ForwardIterator_ [@http://en.cppreference.com/w/cpp/named_req/ForwardIterator [* `ForwardIterator`]]]
24[def _MoveConstructible_ [@http://en.cppreference.com/w/cpp/named_req/MoveConstructible [* `MoveConstructible`]]]
25[def _MoveAssignable_ [@http://en.cppreference.com/w/cpp/named_req/MoveAssignable [* `MoveAssignable`]]]
26[def _std::for_each_ [@http://en.cppreference.com/w/cpp/algorithm/for_each `std::for_each`]]
27[def _std::function_ [@http://en.cppreference.com/w/cpp/utility/functional/function `std::function`]]
28[def _std::unordered_multiset_ [@http://en.cppreference.com/w/cpp/container/unordered_multiset `std::unordered_multiset`]]
29[def _RandomAccessIterator_ [@http://en.cppreference.com/w/cpp/named_req/RandomAccessIterator [* `RandomAccessIterator`]]]
30
31[template ticket[number]'''<ulink url="https://svn.boost.org/trac/boost/ticket/'''[number]'''">'''#[number]'''</ulink>''']
32[template github[module number]'''<ulink url="https://github.com/boostorg/'''[module]'''/issues/'''[number]'''">'''#[number]'''</ulink>''']
33[template github_pr[module number]'''<ulink url="https://github.com/boostorg/'''[module]'''/pull/'''[number]'''">'''PR#[number]'''</ulink>''']
34
35[section Introduction]
36
37Dynamic polymorphism in C++ requires that objects (such as instances
38of classes derived from an abstract base) be accessed through an indirection pointer
39because their actual /type/ and /size/ are not known at the point of usage. As a
40consequence, regular containers cannot store polymorphic objects directly: the usual
41workaround is to have containers of pointers to heap-allocated elements. In modern
42computer architectures this pattern incurs two types of inefficiency:
43
44* The lack of memory contiguity produced by heap allocation degrades CPU cache
45performance.
46* Executing virtual operations on a sequence of polymorphic objects whose
47actual types differ from one to the next results in failures in
48[@https://en.wikipedia.org/wiki/Branch_predictor branch prediction] and a
49lower execution speed.
50
51When the particular traversal order is not relevant to the user application,
52Boost.PolyCollection proposes an alternative data structure that restores memory
53contiguity and packs elements according to their concrete type. Three container
54class templates are provided:
55
56* `boost::base_collection`
57* `boost::function_collection`
58* `boost::any_collection`
59
60respectively dealing with three different types of dynamic polymorphism
61available in C++:
62
63* Classic base/derived or OOP polymorphism.
64* Function wrapping in the spirit of _std::function_.
65* So-called _duck_typing_ as implemented by _Boost.TypeErasure_.
66
67The interface of these containers closely follows that of standard containers.
68Additionally, the library provides versions of many of the standard library
69algorithms (including
70[@http://en.cppreference.com/w/cpp/algorithm/for_each `std::for_each`])
71with improved performance and a special feature called /type restitution/
72that allows user code to provide clues on the concrete types of the elements
73stored for further opportunities of increased efficiency related to inlining and
74[@http://hubicka.blogspot.com.es/2014/01/devirtualization-in-c-part-1.html ['devirtualization]].
75
76[note Boost.PolyCollection is a header-only library. C++11 support is required.
77The library has been verified to work with Visual Studio 2015, GCC 4.8 and
78Clang 3.3.]
79
80[endsect]
81
82[section An efficient polymorphic data structure]
83
84Suppose we have a `base` abstract class with implementations `derived1`, `derived2`
85and `derived3`. The memory layout of a `std::vector<base*>` (or similar constructs
86such as `std::vector<std::unique_ptr<base>>` or
87[@boost:/libs/ptr_container/ `boost::ptr_vector<base>`]) looks like
88the following:
89
90[$poly_collection/img/ptr_vector.png]
91
92Elements that are adjacent in the vector are not necessarily allocated contiguously,
93much less so if the vector has undergone mid insertions and deletions. A typical
94processing operation
95
96  std::vector<base*> v;
97  ...
98  for(base* b: v){
99    ... // access base's virtual interface
100  }
101
102is impacted negatively by two factors:
103
104* Scattering of elements throughout memory reduces CPU caching efficiency,
105which in general favor regular access loops to contiguous memory areas.
106* [@https://en.wikipedia.org/wiki/Branch_predictor Branch prediction] tries
107to minimize the effect of running conditional code (such as an
108`if`-`else` statement or the invocation of a `base` virtual function) by
109speculatively executing a given branch based on past history. This
110mechanism is rendered mostly useless when `derived1`, `derived2` and
111`derived3` elements are interspersed along the sequence without a definite pattern.
112
113These limitations are imposed by the very nature of dynamic polymorphism:
114as the exact types of the elements accessed through `base`'s interface are not
115known, an indirection through `base*` (a particular form of /type erasure/)
116is required. There is however a critical observation: even though derived types
117are not known when traversing a `std::vector<base*>`, the information is typically
118available /at compile time/ at the point of insertion in the vector:
119
120  std::vector<base*> v;
121  ...
122  v.insert(new derived2{...}); // the type derived2 is "forgotten" by v
123
124A suitably designed container can take advantage of this information to
125arrange elements contiguously according to their exact type, which results
126in an internal data structure (a map of pointers to
127[@http://en.cppreference.com/w/cpp/types/type_info `std::type_info`] objects,
128actually) pointing to as many vectors or /segments/ as there are derived classes:
129
130[$poly_collection/img/segment_map.png]
131
132Traversing such a structure reduces to looping over all the segments one after
133another: this is extremely efficient both in terms of caching and branch
134prediction. In the process we have however lost the free-order capability of a
135`std::vector<base*>` (free order can only be retained at the segment level), but
136if this is not relevant to the user application the potential performance gains
137of switching to this structure are large.
138
139The discussion has focused on base/derived programming, also known as OOP, but it
140also applies to other forms of dynamic polymorphism:
141
142* _std::function_ abstracts callable entities with the same given signature
143under a common interface. Internally, pointer indirections and virtual-like
144function calls are used. Memory fragmentation is expected to be lower than
145with OOP, though, as implementations usually feature the so-called
146[@http://talesofcpp.fusionfenix.com/post-16/episode-nine-erasing-the-concrete#a-note-on-small-buffer-optimization [' small buffer optimization]]
147to avoid heap allocation in some situations.
148* The case of `std::function` can be seen as a particular example of a
149more general form of polymorphism called _duck_typing_, where unrelated types
150are treated uniformly if they conform to the same given /interface/ (a specified
151set of member functions and/or operations). Duck typing provides the power of
152OOP while allowing for greater flexibility as polymorphic types need not derive
153from a preexisting base class or in general be designed with any particular
154interface in mind --in fact, the same object can be duck-typed to different
155interfaces. Among other libraries, _Boost.TypeErasure_ provides duck typing
156for C++. Under the hood, duck typing requires pointer indirection and virtual
157call implementation techniques analogous to those of OOP, and so there are
158the same opportunities for efficient container data structures as we have
159described.
160
161Boost.PolyCollection provides three different container class templates
162dealing with OOP, `std::function`-like polymorphism and duck typing as
163implemented by Boost.TypeErasure:
164
165* `boost::base_collection`
166* `boost::function_collection`
167* `boost::any_collection`
168
169The interfaces of these containers are mostly the same and follow the usual
170conventions of standard library containers.
171
172[endsect]
173
174[section Tutorial]
175
176[section Basics]
177
178[import ../example/rolegame.hpp]
179
180[section `boost::base_collection`]
181
182[import ../example/basic_base.cpp]
183(Code samples from [@boost:/libs/poly_collection/example/basic_base.cpp `basic_base.cpp`].)
184
185Imagine we are developing a role playing game in C++ where sprites are rendered on
186screen; for the purposes of illustration we can model rendering simply as
187outputting some information to a `std::ostream`:
188
189[rolegame_1]
190
191The game features warriors, juggernauts (a special type of warrior) and goblins,
192each represented by its own class derived (directly or indirectly) from `sprite`:
193
194[rolegame_2]
195
196Let us populate a polymorphic collection with an assortment of sprites:
197
198[basic_base_1]
199
200There are two aspects to notice here:
201
202* `boost::base_collection` does not have a `push_back` member function like, say,
203`std::vector`, as the order in which elements are stored cannot be freely
204chosen by the user code \u2014we will see more about this later. The insertion mechanisms
205are rather those of containers like _std::unordered_multiset_, namely `insert` and
206`emplace` with or without a position /hint/.
207* Elements are not created with `new` but constructed on the stack and passed
208directly much like one would do with a standard non-polymorphic container.
209
210[important Elements inserted into a `boost::base_collection` (or the other
211containers of Boost.PolyCollection) must be copyable and assignable; strictly
212speaking, they must at least model _MoveConstructible_ and either be
213_MoveAssignable_ or not throw on move construction. This might
214force you to revisit your code as it is customary to explicitly forbid
215copying at the base level of a virtual hierarchy to avoid
216[@https://en.wikipedia.org/wiki/Object_slicing ['slicing]].]
217
218Rendering can now be implemented with a simple `for` loop over `c`:
219
220[basic_base_2]
221
222The output being:
223
224[pre
225juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,warrior 2,warrior 6
226]
227
228As we have forewarned, the traversal order does not correspond to that of insertion.
229Instead, the elements are grouped in /segments/ according to their concrete type.
230Here we see that juggernauts come first, followed by goblins and warriors. In
231general, no assumptions should be made about segment ordering, which may be
232different for this very example in your environment. On the other hand, elements
233inserted into an already existing segment always come at the end (except if a hint
234is provided). For instance, after inserting a latecomer goblin with `id==8`:
235
236[basic_base_3]
237
238the rendering loop outputs (new element in red):
239
240[pre
241juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,[role red goblin 8],warrior 2,warrior 6
242]
243
244Deletion can be done in the usual way:
245
246[basic_base_4]
247
248Now rendering emits:
249
250[pre
251juggernaut 0,juggernaut 4,goblin 1,goblin 3,goblin 5,goblin 8,warrior 2,warrior 6
252]
253
254[endsect]
255
256[section `boost::function_collection`]
257
258[import ../example/basic_function.cpp]
259(Code samples from [@boost:/libs/poly_collection/example/basic_function.cpp `basic_function.cpp`].
260C++14 `std::make_unique` is used.)
261
262Well into the development of the game, the product manager requests that two new
263types of entities be added to the rendering loop:
264
265* Overlaid messages, such as scores, modelled as `std::string`s.
266* Pop-up windows coming from a third party library that, unfortunately, do not
267derive from `sprite` and use their own `display` member functon for rendering:
268
269[rolegame_3]
270
271We then decide to refactor the code so that sprites, messsages and windows are
272stored in dedicated containers:
273
274[basic_function_1]
275
276and define our rendering container as a `boost::function_collection` of
277/callable entities/ \u2014function pointers or objects with a function
278call `operator()`\u2014 accepting a `std::ostream&` as a parameter
279
280[basic_function_2]
281
282which we fill with suitable adaptors for `sprite`s, `std::string`s and
283`window`s, respectively. Lambda functions allow for a particularly terse code.
284
285[basic_function_3]
286
287The rendering loop now looks like this:
288
289[basic_function_4]
290
291and produces the following for a particular scenario of sprites, messages and
292windows:
293
294[pre
295juggernaut 0,goblin 1,warrior 2,goblin 3,"stamina: 10,000","game over",'''[pop-up 1]''','''[pop-up 2]'''
296]
297
298The container we have just created works in many respects like a
299`std::vector<std::function<render_callback>>`, the main difference being
300that with `boost::function_collection` callable entities of the same type
301are packed together in memory
302[footnote Note that all `sprite`s come into one segment: this is why
303goblins #1 and #3 are not adjacent. Exercise for the reader: change
304the code of the example so that sprites are further segmented according
305to their concrete type.], thus increasing performance (which is the whole
306point of using Boost.PolyCollection), while a vector of `std::function`s
307results in an individual allocation for each entity stored
308[footnote Except when small buffer optimization applies.].
309In fact, the `value_type` elements held by a
310`boost::function_collection` are actually /not/ `std::function`s,
311although they behave analogously and can be converted to `std::function` if needed:
312
313[basic_function_5]
314
315There is a reason for this: elements of a polymorphic collection cannot be freely
316assigned to by the user as this could end up trying to insert an object into a
317segment of a different type. So, unlike with `std::function`, this will not work:
318
319[basic_function_6]
320
321[endsect]
322
323[section `boost::any_collection`]
324
325[import ../example/basic_any.cpp]
326(Code samples from [@boost:/libs/poly_collection/example/basic_any.cpp `basic_any.cpp`].)
327
328[note Here we just touch on the bare essentials of _Boost.TypeErasure_ needed
329to present `boost::any_collection`. The reader is advised to read
330Boost.TypeErasure documentation for further information.]
331
332After measuring the performance of the latest changes, we find that rendering
333is too slow and decide to refactor once again: if we could store all the
334entities --sprites, messages and windows-- into one single container, that would
335eliminate a level of indirection. The problem is that these types are totally
336unrelated to each other.
337
338_Boost.TypeErasure_ provides a class template `boost::type_erasure::any<Concept>`
339able to hold an object of whatever type as long as it conforms to the interface
340specified by `Concept`. In our case, we find it particularly easy to implement
341a common interface for rendering by providing overloads of `operator<<`
342
343[basic_any_1]
344
345so that we can use it to specify the interface all entities have to adhere to:
346
347[basic_any_2]
348
349The collection just created happily accepts insertion of heterogeneous entities
350(since interface conformity is silently checked at compile time)
351
352[basic_any_3]
353
354and rendering becomes
355
356[basic_any_4]
357
358with output
359
360[pre
361'''[pop-up 1]''','''[pop-up 2]''',juggernaut 0,goblin 1,goblin 3,warrior 2,"stamina: 10,000","game over"
362]
363
364As was the case with `boost::function_collection`, this container is similar
365to a `std::vector<boost::type_erasure::any<Concept>>` but has better performance
366due to packing of same-type elements. Also, the `value_type` of a
367`boost::any_collection<Concept>` is /not/ `boost::type_erasure::any<Concept>`,
368but a similarly behaving entity
369[footnote Actually, it is
370`boost::type_erasure::any<Concept2,boost::type_erasure::_self&>` for some
371internally defined `Concept2` that extends `Concept`.].
372In any case, we are not accessing sprites through an abstract `sprite&`
373anymore, so we could as well dismantle the virtual hierarchy and implement
374each type autonomously: this is left as an exercise to the reader.
375
376[endsect]
377
378[endsect]
379
380[section Deeper into the segmented nature of Boost.PolyCollection]
381
382[import ../example/segmented_structure.cpp]
383(Code samples from [@boost:/libs/poly_collection/example/segmented_structure.cpp `segmented_structure.cpp`].
384C++14 `std::make_unique` is used.)
385
386[section Type registration]
387
388Getting back to our [link poly_collection.tutorial.basics.boost_base_collection `boost::base_collection`]
389example, suppose we want to refactor the populating code as follows:
390
391[segmented_structure_1]
392
393Unexpectedly, this piece of code throws an exception of type
394[link poly_collection.reference.header_boost_poly_collection_exc.class_unregistered_type `boost::poly_collection::unregistered_type`].
395What has changed from our original code?
396
397Suppose a `warrior` has been created by `make_sprite`. The statement
398`c.insert(*make_sprite())` is passing the object as a `sprite&`: even though
399`boost::base_collection` is smart enough to know that the object is actually
400derived from `sprite` (by using
401[@http://en.cppreference.com/w/cpp/language/typeid `typeid()`]) and
402[@https://en.wikipedia.org/wiki/Object_slicing slicing] is to be avoided,
403there is no way that a segment for it can be created without accessing
404the type `warrior` /at compile time/ for the proper internal class templates
405to be instantiated
406[footnote If this is conceptually difficult to grasp, consider the potentially
407more obvious case where `warrior` is defined in a dynamic module linked to
408the main program: the code of `boost::base_collection`, which has been compiled
409before linking, cannot even know the size of this as-of-yet unseen class, so
410hardly can it allocate a segment for the received object.].
411This did not happen in the pre-refactoring code because objects were passed
412as references to their true types.
413
414A type is said to be /registered/ into a polymorphic collection if
415there is a (potentially empty) segment created for it. This can be checked
416with:
417
418[segmented_structure_2]
419
420Registration happens automatically when the object is passed as a reference
421to its true type or
422[link poly_collection.tutorial.insertion_and_emplacement.emplacement `emplace`]'d,
423and explicitly with
424`register_types`:
425
426[segmented_structure_3]
427
428Once `T` has been registered into a polymorphic collection, it remains
429so regardless of whether objects of type `T` are stored or not, except if
430the collection is moved from, assigned to, or swapped.
431
432As slicing is mainly an issue with OOP,  `unregistered_type` exceptions are
433expected to be much less frequent with `boost::function_collection` and
434`boost::any_collection`. Contrived examples can be found, though:
435
436[segmented_structure_4]
437
438[endsect]
439
440[section Segment-specific interface]
441
442For most of the interface of a polymorphic collection, it is possible to only
443refer to the elements of a given segment by providing either their type or
444`typeid()`. For instance:
445
446[segmented_structure_5]
447
448Note that any of these (except
449[link poly_collection.tutorial.deeper_into_the_segmented_nature.reserve `reserve`])
450will throw `boost::poly_collection::unregistered_type` if the type is not
451registered. Segment-specific addressability also includes traversal:
452
453[endsect]
454
455[section Local iterators]
456
457The following runs only over the `warrior`s of the collection:
458
459[segmented_structure_6]
460
461Output:
462
463[pre
464warrior 2,warrior 6
465]
466
467`begin|end(typeid(T))` return objects of type `local_base_iterator`
468associated to the segment for `T`. These iterators dereference to the same
469value as regular iterators (in the case of `boost::base_collection<base>`,
470`base&`) but can only be used to traverse a given segment (for instance,
471`local_base_iterator`'s from different segments cannot be compared between them).
472In exchange, `local_base_iterator` is a _RandomAccessIterator_, whereas
473whole-collection iterators only model _ForwardIterator_.
474
475A terser range-based `for` loop can be used with the convenience `segment`
476member function:
477
478[segmented_structure_7]
479
480Even more powerful than `local_base_iterator` is `local_iterator<T>`:
481
482[segmented_structure_8]
483
484This appends a `"super"` prefix to the `rank` data member of existing warriors:
485
486[pre
487superwarrior 2,superwarrior 6
488]
489
490The observant reader will have noticed that in order to access `rank`, which
491is a member of `warrior` rather than its base class `sprite`, `first` (or
492`x` in the range `for` loop version) has to refer to a `warrior&`, and this
493is precisely the difference between `local_iterator<warrior>` (the type
494returned by `begin<warrior>()`) and `local_base_iterator`.
495`local_iterator<warrior>` is also a _RandomAccessIterator_: for most respects,
496\[`begin<T>()`, `end<T>()`) can be regarded as a range over an array of `T`
497objects. `local_iterator<T>`s can be explicitly converted to
498`local_base_iterator`s. Conversely, if a `local_base_iterator` is associated
499to a segment for `T`, it can then be explictly converted to a
500`local_iterator<T>` (otherwise the conversion is undefined behavior).
501
502The figure shows the action scopes of all the iterators associated to
503a polymorphic collection (`const_` versions not included):
504
505[$ poly_collection/img/poly_collection_iterators.png]
506
507We have yet to describe the bottom part of the diagram.
508Whereas `segment(typeid(T))` is used to range over the /elements/
509of a particular segment, `segment_traversal()` returns an object for ranging over
510/segments/, so that the whole collection can be processed with a nested
511segment-element `for` loop like the following:
512
513[segmented_structure_9]
514
515Segment decomposition of traversal loops forms the basis of
516[link poly_collection.tutorial.algorithms improved-performance algorithms].
517
518[endsect]
519
520[section Reserve]
521
522Much like `std::vector`, segments can be made to reserve memory in
523advance to minimize reallocations:
524
525[segmented_structure_10]
526
527If there is no segment for the indicated type (here, `goblin`), one is
528automatically created. This is in contrast with the rest of segment-specific
529member functions, which throw
530[link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration
531`boost::poly_collection::unregistered_type`].
532
533`reserve` can deal with all segments at once. The following
534
535[segmented_structure_11]
536
537instructs every /existing/ segment to reserve 1,000 elements. If a
538segment is created later (through element insertion or with
539[link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration
540type registration]), its capacity will be different.
541
542[note Unlike standard containers, collection-level `capacity()` and
543`max_size()` are not provided because their usual semantics cannot be
544applied to Boost.PolyCollection: for instance, `capacity()` is typically
545used to check how many elements can be inserted without reallocation,
546but in a segmented structure that depends on the exact types of the
547elements and whether they are registered or not.
548]
549
550[endsect]
551
552[endsect]
553
554[section Insertion and emplacement]
555
556[import ../example/insertion_emplacement.cpp]
557(Code samples from [@boost:/libs/poly_collection/example/insertion_emplacement.cpp `insertion_emplacement.cpp`].)
558
559We already know that `insert(x)`, without further positional information,
560stores `x` at the end of its associated segment. When a regular iterator `it`
561is provided, insertion happens at the position indicated if both `it` and
562`x` belong in the same segment; otherwise, `it` is ignored. For instance,
563if our sprite collection has the following entities:
564
565[pre
566juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,warrior 2,warrior 6
567]
568
569then this code:
570
571[insertion_emplacement_1]
572
573puts the new `juggernaut` at the beginning:
574
575[pre
576[role red juggernaut 8],juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,...
577]
578
579whereas the position hint at
580
581[insertion_emplacement_2]
582
583is ignored and the new `goblin` gets inserted at the end of its segment:
584
585[pre
586juggernaut 8,...,juggernaut 7,goblin 1,goblin 3,goblin 5,[role red goblin 9],warrior 2,...
587]
588
589[link poly_collection.tutorial.deeper_into_the_segmented_nature.local_iterators Local iterators]
590can also be used to indicate the insertion position:
591
592[insertion_emplacement_3]
593[pre
594juggernaut 8,juggernaut 0,[role red juggernaut 10],juggernaut 4,juggernaut 7,goblin 1,...
595]
596
597There is a caveat, though: when using a local iterator, the element inserted
598/must belong to the indicated segment/:
599
600[insertion_emplacement_4]
601
602Member functions are provided for range insertion, with and without a position
603hint:
604
605[insertion_emplacement_5]
606
607As [link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration already explained],
608care must be taken that types be pre-registered into the collection if
609they are not passed as references to their actual type. So, why did not
610this line
611
612[insertion_emplacement_6]
613
614throw `boost::poly_collection::unregistered_type`? As it happens, in the
615special case where the inserted range belongs to a polymorphic collection
616of the same type, registration is done automatically
617[footnote That is, Boost.PolyCollection has enough static information to
618do type registration without further assistance from the user.].
619
620[#poly_collection.tutorial.insertion_and_emplacement.emplacement]
621Emplacement is slightly different for Boost.PolyCollection than with
622standard containers. Consider this attempt at emplacing a `goblin`:
623
624[insertion_emplacement_7]
625
626If examined carefully, it is only natural that the code above not compile:
627Boost.PolyCollection cannot possibly know that it is precisely a `goblin`
628that we want to emplace and not some other type constructible from an `int`
629(like `warrior`, `juggernaut` or an unrelated class). So, the type of
630the emplaced element has to be specified explicitly:
631
632[insertion_emplacement_8]
633
634As with `insert`, a position can be indicated for emplacing:
635
636[insertion_emplacement_9]
637
638Note the naming here: `emplace_hint` is used when the position is indicated with
639a regular iterator, and for local iterators it is `emplace_pos`.
640[endsect]
641
642[section Exceptions]
643
644[import ../example/exceptions.cpp]
645(Code samples from [@boost:/libs/poly_collection/example/exceptions.cpp `exceptions.cpp`].)
646
647Besides the usual exceptions like
648[@http://en.cppreference.com/w/cpp/memory/new/bad_alloc `std::bad_alloc`] and
649those generated by user-provided types, polymorphic collections can throw the following:
650
651* `boost::poly_collection::unregistered_type`
652* `boost::poly_collection::not_copy_constructible`
653* `boost::poly_collection::not_equality_comparable`
654
655The situations in which the first is raised have been
656[link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration already discussed];
657let us focus on the other two.
658
659We have a new type of sprite in our game defined by the non-copyable
660class `elf`:
661
662[rolegame_4]
663
664and we use it without problems until we write this:
665
666[exceptions_1]
667
668The first insertion works because the `elf` object passed is /temporary/
669and can be /moved/ into the container, but the second statement actually
670needs to /copy/ the `elf` elements in `c`  to `c2`, hence the exception.
671
672The potentially surprising aspect of this behavior is that standard
673containers signal this kind of problems by /failing at compile time/.
674Here we cannot afford this luxury because the exact types contained in a
675polymorphic collection are not known until run time (for instance,
676if `elf` elements had been erased before copying `c` to `c2` everything
677would have worked): basically, the deferral of errors from compile time
678to run time is an intrinsic feature of dynamic polymorphism.
679
680In the same vein, equality comparison requires that elements themselves
681be equality comparable:
682
683[exceptions_2]
684
685The above is unremarkable once we notice we have not defined `operator==`
686for any `sprite`. The problem may go unnoticed for quite some time, however,
687because determining that two polymorphic collections are equal (i.e.
688all their non-empty segments are equal) can return `false` without
689comparing anything at all (for instance, if segment sizes differ), in which
690case no exception is thrown.
691
692[note Operators for `<`, `<=`, `>` and `>=` comparison are not provided
693because /segment/ order is not fixed and may vary across otherwise
694identical collections. The situation is similar to that of standard
695[@http://en.cppreference.com/w/cpp/named_req/UnorderedAssociativeContainer unordered
696associative containers.]
697]
698
699These three are all the intrinsic exceptions thrown by Boost.PolyCollection.
700The implication is that if a type is _CopyConstructible_,
701_MoveAssignable_ (or move construction does not throw) and _EqualityComparable_, then
702the entire interface of Boost.PolyCollection is unrestrictedly available for it
703[footnote Provided, of course, that the type has the /right/ to be in the collection,
704that is, it is derived from the specified base, or callable with the specified
705signature, etc.].
706
707[endsect]
708
709[section Algorithms]
710
711[import ../example/algorithms.cpp]
712(Code samples from [@boost:/libs/poly_collection/example/algorithms.cpp `algorithms.cpp`].
713C++14 generic lambda expressions are used.)
714
715The ultimate purpose of Boost.PolyCollection is to speed up the processing of
716large quantities of polymorphic entities, in particular for those operations
717that involve linear traversal as implemented with a `for`-loop or using the
718quintessential _std::for_each_ algorithm.
719
720[algorithms_1]
721
722Replacing the container used in the program from classic alternatives to
723Boost.PolyCollection:
724
725* `std::vector<base*>` (or similar) → `boost::base_collection<base>`
726* `std::vector<std::function<signature>>` → `boost::function_collection<signature>`
727* `std::vector<boost::type_erasure::any<concept_>>` → `boost::any_collection<concept_>`
728
729is expected to increase performance due to better caching and branch prediction
730behavior. But there is still room for improvement.
731
732Consider this transformation of the code above using a double
733segment-element loop (based on the
734[link poly_collection.tutorial.deeper_into_the_segmented_nature.local_iterators local iterator]
735capabilities of Boost.PolyCollection):
736
737[algorithms_2]
738
739Although not obvious at first sight, this version of the same basic operation
740is actually /faster/ than the first one: for a segmented structure such as used
741by Boost.PolyCollection, each iteration with the non-local iterator passed to
742`std::for_each` involves:
743
744# hopping to next position in the segment,
745# checking for /end of segment/ (always),
746# if at end (infrequent), selecting the next segment,
747# checking for /end of range/ (always),
748
749whereas in the second version, iteration on the inner loop, where most
750processing happens, is a simple increment-and-check operation, that is,
751there is one less check (which happens at the much shorter outer loop). When
752the workload of the algorithm (the actually useful stuff done with the elements
753themselves) is relatively light, the overhead of looping can be very
754significant.
755
756To make it easier for the user to take advantage of faster segment-element
757looping, Boost.PolyCollection provides its own version of `for_each` based
758on that technique:
759
760[algorithms_3]
761
762`boost::poly_collection::for_each` has the same interface and behavior as
763`std::for_each` except for the fact that it only works for (non-local)
764iterators of a polymorphic container
765[footnote For any other type of iterator, it is guaranteed not to compile.].
766Versions of other standard algorithms are provided as well:
767
768[algorithms_4]
769
770In fact, variants are given of most (though not all) of the algorithms in
771[@http://en.cppreference.com/w/cpp/algorithm `<algorithm>`]
772among those that are compatible with polymorphic collections
773[footnote For example, algorithms requiring bidirectional iterators or a higher
774category are not provided because polymorphic collections have forward-only
775iterators.].
776Check the [link poly_collection.reference.header_boost_poly_collection_alg reference]
777for details.
778
779[section Type restitution]
780
781By /type restitution/ we mean the generic process of getting a concrete entity
782from an abstract one by providing missing type information:
783
784[algorithms_5]
785
786Type restitution has the potential to increase performance. Consider for
787instance the following:
788
789[algorithms_6]
790
791Behaviorwise this code is equivalent to simply executing `std::cout<<r<<"\n"`,
792but when type restitution succeeds the statement `std::cout<<str<<"\n"` is
793skipping a virtual-like call that would have happened if `r` were used instead.
794From a general point of view, supplying the compiler with extra type
795information allows it to perform more optimizations than in the abstract case,
796inlining being the prime example.
797
798All Boost.PolyCollection algorithms accept an optional list of types for
799restitution. Let us use the
800[link poly_collection.tutorial.basics.boost_any_collection `boost::any_collection`]
801scenario to illustrate this point:
802
803[algorithms_7]
804
805Output:
806
807[pre
808warrior 2,warrior 6,[pop-up 1],[pop-up 2],juggernaut 0,juggernaut 4,
809juggernaut 7,goblin 1,goblin 3,goblin 5,"stamina: 10,000","game over"
810]
811This rendering loop differs from the non-restituting one in two aspects:
812
813* A list of types is provided as template arguments to the algorithm. This is
814an indication that the types /may/ be actually present in the collection, not
815a promise. Also, the list of types need not be exhaustive, that is, some
816unlisted types could be present in the collection \u2014in the example above,
817the loop traverses *all* elements (including `std::string`s and `window`s),
818not only those corresponding to restituted types. In general, the more types
819are restituted, the greater the potential improvement in performance.
820* The lambda function passed is a generic one accepting its argument as
821`const auto&`
822[footnote This requires C++14, but the same effect can be achieved in C++11
823providing an equivalent, if more cumbersome, functor with a templatized
824call operator.].
825
826Internally, `boost::poly_collection::for_each` checks for each segment if its
827type, say `T`, belongs in the type restitution list: if this is the case, the lambda
828function is passed a `const T&` rather than the generic
829`const boost::any_collection::value_type&`. For each restituted type we are
830saving indirection calls and possibly getting inlining optimizations, etc.
831As we see in the
832[link poly_collection.performance performance section], the speedup can be
833very significant.
834
835Type restitution works equally for the rest of collections of
836Boost.PolyCollection. When using `boost::base_collection`, though, the case
837of improved performance is more tricky:
838
839[algorithms_8]
840
841The problem here is that, even though the argument to the lambda function will
842be restituted (when appropriate) to `warrior&`, `juggernaut&` or `goblin&`,
843using it still involves doing a virtual function call in `s.render(std::cout)`.
844Whether this call is resolved to a non-virtual one depends on the quality
845of implementation of the compiler:
846
847* If the concrete class is marked as
848[@http://en.cppreference.com/w/cpp/language/final `final`], the compiler /in
849principle/ has enough information to get rid of the virtual function call.
850* Other than this,
851[@http://hubicka.blogspot.com.es/2014/01/devirtualization-in-c-part-1.html ['devirtualization]]
852capabilities /may/ be able to figure out that the type restitution scenario
853is actually casting the element to its true type, in which case, again, virtual
854calls are not needed.
855
856[endsect]
857
858[endsect]
859
860[endsect]
861
862[section Performance]
863
864[def _vs2015_x86_ Visual Studio 2015 x86]
865[def _vs2015_x64_ Visual Studio 2015 x64]
866[def _gcc63_x64_ GCC 6.3 x64]
867[def _clang40_x64_ Clang 4.0 x64]
868
869[template perf_fig[name environment file]
870[section [environment]]
871
872[role aligncenter
873'''<inlinemediaobject>
874    <imageobject><imagedata fileref="'''poly_collection/img/[file].png'''"/></imageobject>
875  </inlinemediaobject>'''
876]
877[role aligncenter [*[name], [environment]]]
878
879[endsect]
880]
881
882[import ../example/perf.cpp]
883(Testing program at [@boost:/libs/poly_collection/example/perf.cpp `perf.cpp`].)
884
885We ran tests to measure the performance of the containers of
886Boost.PolyCollection in two scenarios:
887
888* Insertion of elements.
889* Linear traversal and processing with `std::for_each` and `boost::poly_collection::for_each`
890(with and without
891[link poly_collection.tutorial.algorithms.type_restitution type restitution]).
892
893As a comparison baseline we used containers and facilities from the
894standard library and Boost (details below). Tests were run in the
895following environments:
896
897* *_vs2015_x86_*: Visual C++ 2015 in 32-bit (x86) release mode, Windows 7 64-bit, Intel Core i5-2520M @2.5GHz
898* *_vs2015_x64_*: Visual C++ 2015 in 64-bit (x64) release mode, same machine
899* *_gcc63_x64_*: GCC 6.3 release mode, Xubuntu 17.04 x64, Intel Core i7-5820 @3.3GHz
900* *_clang40_x64_*: Clang 4.0 release mode, same machine
901
902[section Container definitions]
903
904[section `boost::base_collection`]
905
906* Baseline container: `ptr_vector` = `boost::ptr_vector<base>`
907* Polymorphic collection: `base_collection` = `boost::base_collection<base>`
908* Element types: `T1` = `derived1`, `T2` = `derived2`, `T3` = `derived3`
909
910[perf_base_types]
911
912[endsect]
913
914[section `boost::function_collection`]
915
916* Baseline container: `func_vector` = `std::vector<std::function<int(int)>>`
917* Polymorphic collection: `function_collection` = `boost::function_collection<int(int)>`
918* Element types: `T1` = `concrete1`, `T2` = `concrete2`, `T3` = `concrete3`
919
920[perf_function_types]
921
922[endsect]
923
924[section `boost::any_collection`]
925
926* Baseline container: `any_vector` = `std::vector<boost::type_erasure::any<concept_>>`
927* Polymorphic collection: `any_collection` = `boost::any_collection<concept_>`
928* Element types: `T1` = `int`, `T2` = `double`, `T3` = `char`
929
930[perf_any_types]
931
932[endsect]
933
934[endsect]
935
936[section Insertion tests]
937
938Tests measure the time taken to insert /n/ elements (/n/ between
93910'''<superscript>2</superscript>''' and 10'''<superscript>7</superscript>''')
940from a source of values with types following the cyclic sequence
941
942`T1` `T1` `T2`  `T2` `T3` `T1` `T1` `T2`  `T2` `T3` ···
943
944No `reserve` operation is done before insertion.
945The figures show resulting times in nanoseconds/element.
946The horizontal axis is logarithmic.
947
948[section Results for `boost::base_collection`]
949[perf_fig Insertion.._vs2015_x86_..insert_base_vs2015_x86]
950[perf_fig Insertion.._vs2015_x64_..insert_base_vs2015_x64]
951[perf_fig Insertion.._gcc63_x64_..insert_base_gcc63_x64]
952[perf_fig Insertion.._clang40_x64_..insert_base_clang40_x64]
953[endsect]
954
955[section Results for `boost::function_collection`]
956[perf_fig Insertion.._vs2015_x86_..insert_function_vs2015_x86]
957[perf_fig Insertion.._vs2015_x64_..insert_function_vs2015_x64]
958[perf_fig Insertion.._gcc63_x64_..insert_function_gcc63_x64]
959[perf_fig Insertion.._clang40_x64_..insert_function_clang40_x64]
960[endsect]
961
962[section Results for `boost::any_collection`]
963[perf_fig Insertion.._vs2015_x86_..insert_any_vs2015_x86]
964[perf_fig Insertion.._vs2015_x64_..insert_any_vs2015_x64]
965[perf_fig Insertion.._gcc63_x64_..insert_any_gcc63_x64]
966[perf_fig Insertion.._clang40_x64_..insert_any_clang40_x64]
967[endsect]
968
969[endsect]
970
971[section Processing tests]
972
973Tests measure the time taken to traverse a container of size
974/n/ (/n/ between
97510'''<superscript>2</superscript>''' and 10'''<superscript>7</superscript>''')
976and execute an operation on each of its elements. The operation for
977`boost::base_collection` and `boost::function_collection` (and the associated
978baseline containers) is defined as
979
980[perf_for_each_callable]
981
982whereas for `boost::any_collection` we use
983
984[perf_for_each_incrementable]
985
986The baseline container is tested with three different setups:
987
988* Directly as initialized by the process described for the
989[link poly_collection.performance.insertion_tests insertion tests].
990The sequence of types is complex enough that CPU's branch prediction mechanisms
991are not able to fully anticipate it
992[footnote This has been verified empirically: simpler cycles did indeed
993yield better execution times.]. As elements are ordered according to their
994construction time, certain degree of memory contiguity is expected.
995* With an extra post-insertion stage by which elements are sorted
996according to their `typeid()`. This increases branch prediction
997efficiency at the expense of having worse cache friendliness.
998* With an extra post-insertion stage that randomly
999shuffles all the elements in the container. This is the worst possible
1000scenario both in terms of caching and branch prediction.
1001
1002As for the polymorphic collection, three variations are measured:
1003
1004* With `std::for_each` (the same as the baseline container).
1005* Using `boost::poly_collection::for_each`.
1006* Using `boost::poly_collection::for_each` with
1007[link poly_collection.tutorial.algorithms.type_restitution /type restitution/]
1008of `T1`, `T2` and `T3`.
1009
1010The figures show resulting times in nanoseconds/element.
1011The horizontal axis is logarithmic.
1012
1013[section Results for `boost::base_collection`]
1014[perf_fig Processing.._vs2015_x86_..for_each_base_vs2015_x86]
1015[perf_fig Processing.._vs2015_x64_..for_each_base_vs2015_x64]
1016[perf_fig Processing.._gcc63_x64_..for_each_base_gcc63_x64]
1017[perf_fig Processing.._clang40_x64_..for_each_base_clang40_x64]
1018[endsect]
1019
1020[section Results for `boost::function_collection`]
1021[perf_fig Processing.._vs2015_x86_..for_each_function_vs2015_x86]
1022[perf_fig Processing.._vs2015_x64_..for_each_function_vs2015_x64]
1023[perf_fig Processing.._gcc63_x64_..for_each_function_gcc63_x64]
1024[perf_fig Processing.._clang40_x64_..for_each_function_clang40_x64]
1025[endsect]
1026
1027[section Results for `boost::any_collection`]
1028[perf_fig Processing.._vs2015_x86_..for_each_any_vs2015_x86]
1029[perf_fig Processing.._vs2015_x64_..for_each_any_vs2015_x64]
1030[perf_fig Processing.._gcc63_x64_..for_each_any_gcc63_x64]
1031[perf_fig Processing.._clang40_x64_..for_each_any_clang40_x64]
1032[endsect]
1033
1034[endsect]
1035
1036[endsect]
1037
1038[include reference.qbk]
1039
1040[section Future work]
1041
1042A number of features asked by reviewers and users of Boost.PolyCollection are
1043considered for inclusion into future versions of the library.
1044
1045[section Alternative RTTI systems]
1046
1047Boost.PolyCollection can be extended to use _Boost.TypeIndex_ in RTTI-challenged
1048scenarios. Taking this idea further, it is not unusual that some environments
1049(game engines, for instance) provide their own RTTI framework: an even more
1050ambitious extension to Boost.PolyCollection would then be to make it configurable
1051for user-provided RTTI through some sort of traits class specifying replacements for
1052[@http://en.cppreference.com/w/cpp/types/type_info `std::type_info`]
1053and [@http://en.cppreference.com/w/cpp/language/typeid `typeid`].
1054
1055[endsect]
1056
1057[section Copy traits]
1058
1059`boost::base_collection` requires that stored objects be _MoveConstructible_ and
1060_MoveAssignable_; unfortunately, it is customary to restrict copying in OOP
1061hierarchies to avoid slicing, which would force users to revisit their class
1062definitions in order to use Boost.PolyCollection. This can be alleviated by
1063offering a configurable traits class where copy and assignment can be defined
1064externally
1065
1066``
1067template<typename T>
1068struct copy_traits
1069{
1070  void construct(void*,T&&);
1071  void assign(T&,T&&);
1072};
1073``
1074
1075with default implementations resorting to regular placement `new` and
1076`T::operator=`.
1077
1078[endsect]
1079
1080[section Parallel algorithms]
1081
1082C++17 introduces
1083[@http://en.cppreference.com/w/cpp/experimental/parallelism parallel algorithms],
1084like for instance a parallel version of _std::for_each_; it is only
1085natural then to provide the corresponding
1086[link poly_collection.tutorial.algorithms Boost.PolyCollection-specific algorithms].
1087The segmented nature of polymorphic collections makes them particularly amenable to
1088parallel processing.
1089
1090[endsect]
1091
1092[section `variant_collection`]
1093
1094/Closed polymorphism/ is a kind of dynamic polymorphism where the set of implementation
1095types is fixed at definition time: the prime example of this paradigm in C++ is
1096[@http://en.cppreference.com/w/cpp/utility/variant `std::variant`]. Although
1097`boost::any_collection<boost::mpl::vector<>>` can act as a sort of replacement for
1098`std::vector<std::variant<T1,...,TN>>`, this is in fact more similar to a
1099`std::vector<`[@http://en.cppreference.com/w/cpp/utility/any `std::any`]`>`,
1100and a collection class `boost::variant_collection<T1,...,TN>` could be designed to
1101better model closed polymorphism and take further advantage of the fact that
1102implementation types are fixed (for instance, internal virtual calls can be
1103completely eliminated). From a conceptual point of view, this would require introducing
1104a new [* `ClosedPolymorphicCollection`] notion and renaming the current
1105[link poly_collection.reference.polymorphic_containers.polymorphic_collections [* `PolymorphicCollection`]]
1106model to [* `OpenPolymorphicCollection`].
1107
1108[endsect]
1109
1110[section Ordered polymorphic collections]
1111
1112Users have expressed interest in polymorphic collections where elements are kept
1113ordered within their segment and optionally duplicates are excluded, much like
1114[@http://boost.org/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx `boost::flat_set`/`boost::flat_multiset`]
1115do over their internal data vector. The open question remains of whether these
1116collections should also guarantee some order between segments (current ones don't)
1117to allow for the definition of container-level `operator<` and related operators.
1118
1119[endsect]
1120
1121[endsect]
1122
1123[section Release notes]
1124
1125[section Boost 1.74]
1126
1127* Fixed internal ambiguity problem between `boost::type_erasure::any` and
1128`boost::any` (issue [github poly_collection 17]).
1129* Maintenance work.
1130
1131[endsect]
1132
1133[section Boost 1.73]
1134
1135* Suppressed a potential redundant move warning in `boost::poly_collection::for_each`.
1136* Fixed a bug by which elements were copied rather than moved in
1137allocator-extended move construction and move assigment between collections
1138with non-propagating, unequal allocators.
1139* Allocator-extended move construction no longer decays to allocator-extended copy
1140construction for the legacy version of libstdc++-v3 shipped with GCC 4.8
1141(which can also be used by Clang).
1142
1143[endsect]
1144
1145[section Boost 1.72]
1146
1147* Maintenance work.
1148
1149[endsect]
1150
1151[section Boost 1.71]
1152
1153* Maintenance work.
1154
1155[endsect]
1156
1157[section Boost 1.70]
1158
1159* Improved handling of stateful allocators and allocator propagation traits,
1160after an error reported by Billy O'Neal ([github_pr poly_collection 9]).
1161* Fixed a potentially serious bug with an internal cache structure.
1162
1163[endsect]
1164
1165[section Boost 1.69]
1166
1167* Added Boost.PolyCollection-specific versions of algorithms
1168`std::for_each_n` and `std::sample`.
1169
1170[endsect]
1171
1172[section Boost 1.67]
1173
1174* Maintenance fixes.
1175
1176[endsect]
1177
1178[section Boost 1.66]
1179
1180* Boost.PolyCollection has been backported to GCC 4.8 to 4.9 and Clang 3.3 to 3.6.
1181The version of libstdc++-v3 shipped with GCC 4.8 (which can also be used by Clang)
1182has deficiencies that result in the following limitations when using
1183Boost.PolyCollection:
1184  * Stateful allocators are not properly supported.
1185  * Allocator-extended move construction decays to allocator-extended copy construction.
1186  * Copy construction crashes if an exception is thrown during element copying.
1187* Maintenance fixes.
1188
1189[endsect]
1190
1191[section Boost 1.65]
1192
1193* Initial release.
1194
1195[endsect]
1196
1197[endsect]
1198
1199[section Acknowledgments]
1200
1201The library uses [@http://www.pdimov.com/ Peter Dimov]'s
1202[@http://www.pdimov.com/cpp2/simple_cxx11_metaprogramming.html implementation
1203of `std::make_index_sequence`].
1204[@http://manu343726.github.io/ Manu Sánchez] aided immensely with CI setup and
1205performance testing. [@http://stackoverflow.com/users/85371/sehe Sehe]
1206contributed performance results for GCC 5.2, and
1207[@https://www.linkedin.com/in/francisco-jose-tapia-iba%C3%B1ez-4239a07a Francisco
1208José Tapia] for Clang 3.9.
1209
1210The Boost acceptance review took place between the 3rd and 17th of May, 2017. Thank you
1211to Ion Gaztañaga for smoothly managing the process. The following people participated
1212with full reviews or valuable comments:
1213Pete Bartlett, Hans Dembinski, Dominique Devienne, Edward Diener, Vinnie Falco,
1214Ion Gaztañaga, Andrzej Krzemienski, Brook Milligan, Thorsten Ottosen, Steven Watanabe,
1215Adam Wulkiewicz. Many thanks to all of them. Steven Watanabe gave crucial help in
1216solving some hard problems related to the usage of _Boost.TypeErasure_.
1217
1218Boost.PolyCollection was designed and written between rainy
1219[@https://es.wikipedia.org/wiki/Viav%C3%A9lez Viavélez], noisy Madrid and beautiful
1220[@https://en.wikipedia.org/wiki/C%C3%A1ceres,_Spain Cáceres], August-November, 2016.
1221Most of the after-review work in preparation for the official Boost release was done
1222in the quiet town of [@https://es.wikipedia.org/wiki/Oropesa_(Toledo) Oropesa] during
1223the spring of 2017.
1224
1225In memory of Joaquín López Borrella (1939-2015), in memory of Héctor (2004-2017):
1226may your ghosts keep us company.
1227
1228[endsect]