1[library Boost.PolyCollection 2 [quickbook 1.6] 3 [authors [López Muñoz, Joaquín M]] 4 [copyright 2016-2020 Joaquín M López Muñoz] 5 [category containers] 6 [id poly_collection] 7 [dirname poly_collection] 8 [purpose 9 High-performance containers for polymorphic objects. 10 ] 11 [license 12 Distributed under the Boost Software License, Version 1.0. 13 (See accompanying file LICENSE_1_0.txt or copy at 14 [@http://www.boost.org/LICENSE_1_0.txt]) 15 ] 16] 17 18[def _Boost.TypeErasure_ [@boost:/libs/type_erasure Boost.TypeErasure]] 19[def _Boost.TypeIndex_ [@boost:/libs/type_index Boost.TypeIndex]] 20[def _CopyConstructible_ [@http://en.cppreference.com/w/cpp/named_req/CopyConstructible [* `CopyConstructible`]]] 21[def _duck_typing_ [@https://en.wikipedia.org/wiki/Duck_typing ['duck typing]]] 22[def _EqualityComparable_ [@http://en.cppreference.com/w/cpp/named_req/EqualityComparable [* `EqualityComparable`]]] 23[def _ForwardIterator_ [@http://en.cppreference.com/w/cpp/named_req/ForwardIterator [* `ForwardIterator`]]] 24[def _MoveConstructible_ [@http://en.cppreference.com/w/cpp/named_req/MoveConstructible [* `MoveConstructible`]]] 25[def _MoveAssignable_ [@http://en.cppreference.com/w/cpp/named_req/MoveAssignable [* `MoveAssignable`]]] 26[def _std::for_each_ [@http://en.cppreference.com/w/cpp/algorithm/for_each `std::for_each`]] 27[def _std::function_ [@http://en.cppreference.com/w/cpp/utility/functional/function `std::function`]] 28[def _std::unordered_multiset_ [@http://en.cppreference.com/w/cpp/container/unordered_multiset `std::unordered_multiset`]] 29[def _RandomAccessIterator_ [@http://en.cppreference.com/w/cpp/named_req/RandomAccessIterator [* `RandomAccessIterator`]]] 30 31[template ticket[number]'''<ulink url="https://svn.boost.org/trac/boost/ticket/'''[number]'''">'''#[number]'''</ulink>'''] 32[template github[module number]'''<ulink url="https://github.com/boostorg/'''[module]'''/issues/'''[number]'''">'''#[number]'''</ulink>'''] 33[template github_pr[module number]'''<ulink url="https://github.com/boostorg/'''[module]'''/pull/'''[number]'''">'''PR#[number]'''</ulink>'''] 34 35[section Introduction] 36 37Dynamic polymorphism in C++ requires that objects (such as instances 38of classes derived from an abstract base) be accessed through an indirection pointer 39because their actual /type/ and /size/ are not known at the point of usage. As a 40consequence, regular containers cannot store polymorphic objects directly: the usual 41workaround is to have containers of pointers to heap-allocated elements. In modern 42computer architectures this pattern incurs two types of inefficiency: 43 44* The lack of memory contiguity produced by heap allocation degrades CPU cache 45performance. 46* Executing virtual operations on a sequence of polymorphic objects whose 47actual types differ from one to the next results in failures in 48[@https://en.wikipedia.org/wiki/Branch_predictor branch prediction] and a 49lower execution speed. 50 51When the particular traversal order is not relevant to the user application, 52Boost.PolyCollection proposes an alternative data structure that restores memory 53contiguity and packs elements according to their concrete type. Three container 54class templates are provided: 55 56* `boost::base_collection` 57* `boost::function_collection` 58* `boost::any_collection` 59 60respectively dealing with three different types of dynamic polymorphism 61available in C++: 62 63* Classic base/derived or OOP polymorphism. 64* Function wrapping in the spirit of _std::function_. 65* So-called _duck_typing_ as implemented by _Boost.TypeErasure_. 66 67The interface of these containers closely follows that of standard containers. 68Additionally, the library provides versions of many of the standard library 69algorithms (including 70[@http://en.cppreference.com/w/cpp/algorithm/for_each `std::for_each`]) 71with improved performance and a special feature called /type restitution/ 72that allows user code to provide clues on the concrete types of the elements 73stored for further opportunities of increased efficiency related to inlining and 74[@http://hubicka.blogspot.com.es/2014/01/devirtualization-in-c-part-1.html ['devirtualization]]. 75 76[note Boost.PolyCollection is a header-only library. C++11 support is required. 77The library has been verified to work with Visual Studio 2015, GCC 4.8 and 78Clang 3.3.] 79 80[endsect] 81 82[section An efficient polymorphic data structure] 83 84Suppose we have a `base` abstract class with implementations `derived1`, `derived2` 85and `derived3`. The memory layout of a `std::vector<base*>` (or similar constructs 86such as `std::vector<std::unique_ptr<base>>` or 87[@boost:/libs/ptr_container/ `boost::ptr_vector<base>`]) looks like 88the following: 89 90[$poly_collection/img/ptr_vector.png] 91 92Elements that are adjacent in the vector are not necessarily allocated contiguously, 93much less so if the vector has undergone mid insertions and deletions. A typical 94processing operation 95 96 std::vector<base*> v; 97 ... 98 for(base* b: v){ 99 ... // access base's virtual interface 100 } 101 102is impacted negatively by two factors: 103 104* Scattering of elements throughout memory reduces CPU caching efficiency, 105which in general favor regular access loops to contiguous memory areas. 106* [@https://en.wikipedia.org/wiki/Branch_predictor Branch prediction] tries 107to minimize the effect of running conditional code (such as an 108`if`-`else` statement or the invocation of a `base` virtual function) by 109speculatively executing a given branch based on past history. This 110mechanism is rendered mostly useless when `derived1`, `derived2` and 111`derived3` elements are interspersed along the sequence without a definite pattern. 112 113These limitations are imposed by the very nature of dynamic polymorphism: 114as the exact types of the elements accessed through `base`'s interface are not 115known, an indirection through `base*` (a particular form of /type erasure/) 116is required. There is however a critical observation: even though derived types 117are not known when traversing a `std::vector<base*>`, the information is typically 118available /at compile time/ at the point of insertion in the vector: 119 120 std::vector<base*> v; 121 ... 122 v.insert(new derived2{...}); // the type derived2 is "forgotten" by v 123 124A suitably designed container can take advantage of this information to 125arrange elements contiguously according to their exact type, which results 126in an internal data structure (a map of pointers to 127[@http://en.cppreference.com/w/cpp/types/type_info `std::type_info`] objects, 128actually) pointing to as many vectors or /segments/ as there are derived classes: 129 130[$poly_collection/img/segment_map.png] 131 132Traversing such a structure reduces to looping over all the segments one after 133another: this is extremely efficient both in terms of caching and branch 134prediction. In the process we have however lost the free-order capability of a 135`std::vector<base*>` (free order can only be retained at the segment level), but 136if this is not relevant to the user application the potential performance gains 137of switching to this structure are large. 138 139The discussion has focused on base/derived programming, also known as OOP, but it 140also applies to other forms of dynamic polymorphism: 141 142* _std::function_ abstracts callable entities with the same given signature 143under a common interface. Internally, pointer indirections and virtual-like 144function calls are used. Memory fragmentation is expected to be lower than 145with OOP, though, as implementations usually feature the so-called 146[@http://talesofcpp.fusionfenix.com/post-16/episode-nine-erasing-the-concrete#a-note-on-small-buffer-optimization [' small buffer optimization]] 147to avoid heap allocation in some situations. 148* The case of `std::function` can be seen as a particular example of a 149more general form of polymorphism called _duck_typing_, where unrelated types 150are treated uniformly if they conform to the same given /interface/ (a specified 151set of member functions and/or operations). Duck typing provides the power of 152OOP while allowing for greater flexibility as polymorphic types need not derive 153from a preexisting base class or in general be designed with any particular 154interface in mind --in fact, the same object can be duck-typed to different 155interfaces. Among other libraries, _Boost.TypeErasure_ provides duck typing 156for C++. Under the hood, duck typing requires pointer indirection and virtual 157call implementation techniques analogous to those of OOP, and so there are 158the same opportunities for efficient container data structures as we have 159described. 160 161Boost.PolyCollection provides three different container class templates 162dealing with OOP, `std::function`-like polymorphism and duck typing as 163implemented by Boost.TypeErasure: 164 165* `boost::base_collection` 166* `boost::function_collection` 167* `boost::any_collection` 168 169The interfaces of these containers are mostly the same and follow the usual 170conventions of standard library containers. 171 172[endsect] 173 174[section Tutorial] 175 176[section Basics] 177 178[import ../example/rolegame.hpp] 179 180[section `boost::base_collection`] 181 182[import ../example/basic_base.cpp] 183(Code samples from [@boost:/libs/poly_collection/example/basic_base.cpp `basic_base.cpp`].) 184 185Imagine we are developing a role playing game in C++ where sprites are rendered on 186screen; for the purposes of illustration we can model rendering simply as 187outputting some information to a `std::ostream`: 188 189[rolegame_1] 190 191The game features warriors, juggernauts (a special type of warrior) and goblins, 192each represented by its own class derived (directly or indirectly) from `sprite`: 193 194[rolegame_2] 195 196Let us populate a polymorphic collection with an assortment of sprites: 197 198[basic_base_1] 199 200There are two aspects to notice here: 201 202* `boost::base_collection` does not have a `push_back` member function like, say, 203`std::vector`, as the order in which elements are stored cannot be freely 204chosen by the user code \u2014we will see more about this later. The insertion mechanisms 205are rather those of containers like _std::unordered_multiset_, namely `insert` and 206`emplace` with or without a position /hint/. 207* Elements are not created with `new` but constructed on the stack and passed 208directly much like one would do with a standard non-polymorphic container. 209 210[important Elements inserted into a `boost::base_collection` (or the other 211containers of Boost.PolyCollection) must be copyable and assignable; strictly 212speaking, they must at least model _MoveConstructible_ and either be 213_MoveAssignable_ or not throw on move construction. This might 214force you to revisit your code as it is customary to explicitly forbid 215copying at the base level of a virtual hierarchy to avoid 216[@https://en.wikipedia.org/wiki/Object_slicing ['slicing]].] 217 218Rendering can now be implemented with a simple `for` loop over `c`: 219 220[basic_base_2] 221 222The output being: 223 224[pre 225juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,warrior 2,warrior 6 226] 227 228As we have forewarned, the traversal order does not correspond to that of insertion. 229Instead, the elements are grouped in /segments/ according to their concrete type. 230Here we see that juggernauts come first, followed by goblins and warriors. In 231general, no assumptions should be made about segment ordering, which may be 232different for this very example in your environment. On the other hand, elements 233inserted into an already existing segment always come at the end (except if a hint 234is provided). For instance, after inserting a latecomer goblin with `id==8`: 235 236[basic_base_3] 237 238the rendering loop outputs (new element in red): 239 240[pre 241juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,[role red goblin 8],warrior 2,warrior 6 242] 243 244Deletion can be done in the usual way: 245 246[basic_base_4] 247 248Now rendering emits: 249 250[pre 251juggernaut 0,juggernaut 4,goblin 1,goblin 3,goblin 5,goblin 8,warrior 2,warrior 6 252] 253 254[endsect] 255 256[section `boost::function_collection`] 257 258[import ../example/basic_function.cpp] 259(Code samples from [@boost:/libs/poly_collection/example/basic_function.cpp `basic_function.cpp`]. 260C++14 `std::make_unique` is used.) 261 262Well into the development of the game, the product manager requests that two new 263types of entities be added to the rendering loop: 264 265* Overlaid messages, such as scores, modelled as `std::string`s. 266* Pop-up windows coming from a third party library that, unfortunately, do not 267derive from `sprite` and use their own `display` member functon for rendering: 268 269[rolegame_3] 270 271We then decide to refactor the code so that sprites, messsages and windows are 272stored in dedicated containers: 273 274[basic_function_1] 275 276and define our rendering container as a `boost::function_collection` of 277/callable entities/ \u2014function pointers or objects with a function 278call `operator()`\u2014 accepting a `std::ostream&` as a parameter 279 280[basic_function_2] 281 282which we fill with suitable adaptors for `sprite`s, `std::string`s and 283`window`s, respectively. Lambda functions allow for a particularly terse code. 284 285[basic_function_3] 286 287The rendering loop now looks like this: 288 289[basic_function_4] 290 291and produces the following for a particular scenario of sprites, messages and 292windows: 293 294[pre 295juggernaut 0,goblin 1,warrior 2,goblin 3,"stamina: 10,000","game over",'''[pop-up 1]''','''[pop-up 2]''' 296] 297 298The container we have just created works in many respects like a 299`std::vector<std::function<render_callback>>`, the main difference being 300that with `boost::function_collection` callable entities of the same type 301are packed together in memory 302[footnote Note that all `sprite`s come into one segment: this is why 303goblins #1 and #3 are not adjacent. Exercise for the reader: change 304the code of the example so that sprites are further segmented according 305to their concrete type.], thus increasing performance (which is the whole 306point of using Boost.PolyCollection), while a vector of `std::function`s 307results in an individual allocation for each entity stored 308[footnote Except when small buffer optimization applies.]. 309In fact, the `value_type` elements held by a 310`boost::function_collection` are actually /not/ `std::function`s, 311although they behave analogously and can be converted to `std::function` if needed: 312 313[basic_function_5] 314 315There is a reason for this: elements of a polymorphic collection cannot be freely 316assigned to by the user as this could end up trying to insert an object into a 317segment of a different type. So, unlike with `std::function`, this will not work: 318 319[basic_function_6] 320 321[endsect] 322 323[section `boost::any_collection`] 324 325[import ../example/basic_any.cpp] 326(Code samples from [@boost:/libs/poly_collection/example/basic_any.cpp `basic_any.cpp`].) 327 328[note Here we just touch on the bare essentials of _Boost.TypeErasure_ needed 329to present `boost::any_collection`. The reader is advised to read 330Boost.TypeErasure documentation for further information.] 331 332After measuring the performance of the latest changes, we find that rendering 333is too slow and decide to refactor once again: if we could store all the 334entities --sprites, messages and windows-- into one single container, that would 335eliminate a level of indirection. The problem is that these types are totally 336unrelated to each other. 337 338_Boost.TypeErasure_ provides a class template `boost::type_erasure::any<Concept>` 339able to hold an object of whatever type as long as it conforms to the interface 340specified by `Concept`. In our case, we find it particularly easy to implement 341a common interface for rendering by providing overloads of `operator<<` 342 343[basic_any_1] 344 345so that we can use it to specify the interface all entities have to adhere to: 346 347[basic_any_2] 348 349The collection just created happily accepts insertion of heterogeneous entities 350(since interface conformity is silently checked at compile time) 351 352[basic_any_3] 353 354and rendering becomes 355 356[basic_any_4] 357 358with output 359 360[pre 361'''[pop-up 1]''','''[pop-up 2]''',juggernaut 0,goblin 1,goblin 3,warrior 2,"stamina: 10,000","game over" 362] 363 364As was the case with `boost::function_collection`, this container is similar 365to a `std::vector<boost::type_erasure::any<Concept>>` but has better performance 366due to packing of same-type elements. Also, the `value_type` of a 367`boost::any_collection<Concept>` is /not/ `boost::type_erasure::any<Concept>`, 368but a similarly behaving entity 369[footnote Actually, it is 370`boost::type_erasure::any<Concept2,boost::type_erasure::_self&>` for some 371internally defined `Concept2` that extends `Concept`.]. 372In any case, we are not accessing sprites through an abstract `sprite&` 373anymore, so we could as well dismantle the virtual hierarchy and implement 374each type autonomously: this is left as an exercise to the reader. 375 376[endsect] 377 378[endsect] 379 380[section Deeper into the segmented nature of Boost.PolyCollection] 381 382[import ../example/segmented_structure.cpp] 383(Code samples from [@boost:/libs/poly_collection/example/segmented_structure.cpp `segmented_structure.cpp`]. 384C++14 `std::make_unique` is used.) 385 386[section Type registration] 387 388Getting back to our [link poly_collection.tutorial.basics.boost_base_collection `boost::base_collection`] 389example, suppose we want to refactor the populating code as follows: 390 391[segmented_structure_1] 392 393Unexpectedly, this piece of code throws an exception of type 394[link poly_collection.reference.header_boost_poly_collection_exc.class_unregistered_type `boost::poly_collection::unregistered_type`]. 395What has changed from our original code? 396 397Suppose a `warrior` has been created by `make_sprite`. The statement 398`c.insert(*make_sprite())` is passing the object as a `sprite&`: even though 399`boost::base_collection` is smart enough to know that the object is actually 400derived from `sprite` (by using 401[@http://en.cppreference.com/w/cpp/language/typeid `typeid()`]) and 402[@https://en.wikipedia.org/wiki/Object_slicing slicing] is to be avoided, 403there is no way that a segment for it can be created without accessing 404the type `warrior` /at compile time/ for the proper internal class templates 405to be instantiated 406[footnote If this is conceptually difficult to grasp, consider the potentially 407more obvious case where `warrior` is defined in a dynamic module linked to 408the main program: the code of `boost::base_collection`, which has been compiled 409before linking, cannot even know the size of this as-of-yet unseen class, so 410hardly can it allocate a segment for the received object.]. 411This did not happen in the pre-refactoring code because objects were passed 412as references to their true types. 413 414A type is said to be /registered/ into a polymorphic collection if 415there is a (potentially empty) segment created for it. This can be checked 416with: 417 418[segmented_structure_2] 419 420Registration happens automatically when the object is passed as a reference 421to its true type or 422[link poly_collection.tutorial.insertion_and_emplacement.emplacement `emplace`]'d, 423and explicitly with 424`register_types`: 425 426[segmented_structure_3] 427 428Once `T` has been registered into a polymorphic collection, it remains 429so regardless of whether objects of type `T` are stored or not, except if 430the collection is moved from, assigned to, or swapped. 431 432As slicing is mainly an issue with OOP, `unregistered_type` exceptions are 433expected to be much less frequent with `boost::function_collection` and 434`boost::any_collection`. Contrived examples can be found, though: 435 436[segmented_structure_4] 437 438[endsect] 439 440[section Segment-specific interface] 441 442For most of the interface of a polymorphic collection, it is possible to only 443refer to the elements of a given segment by providing either their type or 444`typeid()`. For instance: 445 446[segmented_structure_5] 447 448Note that any of these (except 449[link poly_collection.tutorial.deeper_into_the_segmented_nature.reserve `reserve`]) 450will throw `boost::poly_collection::unregistered_type` if the type is not 451registered. Segment-specific addressability also includes traversal: 452 453[endsect] 454 455[section Local iterators] 456 457The following runs only over the `warrior`s of the collection: 458 459[segmented_structure_6] 460 461Output: 462 463[pre 464warrior 2,warrior 6 465] 466 467`begin|end(typeid(T))` return objects of type `local_base_iterator` 468associated to the segment for `T`. These iterators dereference to the same 469value as regular iterators (in the case of `boost::base_collection<base>`, 470`base&`) but can only be used to traverse a given segment (for instance, 471`local_base_iterator`'s from different segments cannot be compared between them). 472In exchange, `local_base_iterator` is a _RandomAccessIterator_, whereas 473whole-collection iterators only model _ForwardIterator_. 474 475A terser range-based `for` loop can be used with the convenience `segment` 476member function: 477 478[segmented_structure_7] 479 480Even more powerful than `local_base_iterator` is `local_iterator<T>`: 481 482[segmented_structure_8] 483 484This appends a `"super"` prefix to the `rank` data member of existing warriors: 485 486[pre 487superwarrior 2,superwarrior 6 488] 489 490The observant reader will have noticed that in order to access `rank`, which 491is a member of `warrior` rather than its base class `sprite`, `first` (or 492`x` in the range `for` loop version) has to refer to a `warrior&`, and this 493is precisely the difference between `local_iterator<warrior>` (the type 494returned by `begin<warrior>()`) and `local_base_iterator`. 495`local_iterator<warrior>` is also a _RandomAccessIterator_: for most respects, 496\[`begin<T>()`, `end<T>()`) can be regarded as a range over an array of `T` 497objects. `local_iterator<T>`s can be explicitly converted to 498`local_base_iterator`s. Conversely, if a `local_base_iterator` is associated 499to a segment for `T`, it can then be explictly converted to a 500`local_iterator<T>` (otherwise the conversion is undefined behavior). 501 502The figure shows the action scopes of all the iterators associated to 503a polymorphic collection (`const_` versions not included): 504 505[$ poly_collection/img/poly_collection_iterators.png] 506 507We have yet to describe the bottom part of the diagram. 508Whereas `segment(typeid(T))` is used to range over the /elements/ 509of a particular segment, `segment_traversal()` returns an object for ranging over 510/segments/, so that the whole collection can be processed with a nested 511segment-element `for` loop like the following: 512 513[segmented_structure_9] 514 515Segment decomposition of traversal loops forms the basis of 516[link poly_collection.tutorial.algorithms improved-performance algorithms]. 517 518[endsect] 519 520[section Reserve] 521 522Much like `std::vector`, segments can be made to reserve memory in 523advance to minimize reallocations: 524 525[segmented_structure_10] 526 527If there is no segment for the indicated type (here, `goblin`), one is 528automatically created. This is in contrast with the rest of segment-specific 529member functions, which throw 530[link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration 531`boost::poly_collection::unregistered_type`]. 532 533`reserve` can deal with all segments at once. The following 534 535[segmented_structure_11] 536 537instructs every /existing/ segment to reserve 1,000 elements. If a 538segment is created later (through element insertion or with 539[link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration 540type registration]), its capacity will be different. 541 542[note Unlike standard containers, collection-level `capacity()` and 543`max_size()` are not provided because their usual semantics cannot be 544applied to Boost.PolyCollection: for instance, `capacity()` is typically 545used to check how many elements can be inserted without reallocation, 546but in a segmented structure that depends on the exact types of the 547elements and whether they are registered or not. 548] 549 550[endsect] 551 552[endsect] 553 554[section Insertion and emplacement] 555 556[import ../example/insertion_emplacement.cpp] 557(Code samples from [@boost:/libs/poly_collection/example/insertion_emplacement.cpp `insertion_emplacement.cpp`].) 558 559We already know that `insert(x)`, without further positional information, 560stores `x` at the end of its associated segment. When a regular iterator `it` 561is provided, insertion happens at the position indicated if both `it` and 562`x` belong in the same segment; otherwise, `it` is ignored. For instance, 563if our sprite collection has the following entities: 564 565[pre 566juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,warrior 2,warrior 6 567] 568 569then this code: 570 571[insertion_emplacement_1] 572 573puts the new `juggernaut` at the beginning: 574 575[pre 576[role red juggernaut 8],juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,... 577] 578 579whereas the position hint at 580 581[insertion_emplacement_2] 582 583is ignored and the new `goblin` gets inserted at the end of its segment: 584 585[pre 586juggernaut 8,...,juggernaut 7,goblin 1,goblin 3,goblin 5,[role red goblin 9],warrior 2,... 587] 588 589[link poly_collection.tutorial.deeper_into_the_segmented_nature.local_iterators Local iterators] 590can also be used to indicate the insertion position: 591 592[insertion_emplacement_3] 593[pre 594juggernaut 8,juggernaut 0,[role red juggernaut 10],juggernaut 4,juggernaut 7,goblin 1,... 595] 596 597There is a caveat, though: when using a local iterator, the element inserted 598/must belong to the indicated segment/: 599 600[insertion_emplacement_4] 601 602Member functions are provided for range insertion, with and without a position 603hint: 604 605[insertion_emplacement_5] 606 607As [link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration already explained], 608care must be taken that types be pre-registered into the collection if 609they are not passed as references to their actual type. So, why did not 610this line 611 612[insertion_emplacement_6] 613 614throw `boost::poly_collection::unregistered_type`? As it happens, in the 615special case where the inserted range belongs to a polymorphic collection 616of the same type, registration is done automatically 617[footnote That is, Boost.PolyCollection has enough static information to 618do type registration without further assistance from the user.]. 619 620[#poly_collection.tutorial.insertion_and_emplacement.emplacement] 621Emplacement is slightly different for Boost.PolyCollection than with 622standard containers. Consider this attempt at emplacing a `goblin`: 623 624[insertion_emplacement_7] 625 626If examined carefully, it is only natural that the code above not compile: 627Boost.PolyCollection cannot possibly know that it is precisely a `goblin` 628that we want to emplace and not some other type constructible from an `int` 629(like `warrior`, `juggernaut` or an unrelated class). So, the type of 630the emplaced element has to be specified explicitly: 631 632[insertion_emplacement_8] 633 634As with `insert`, a position can be indicated for emplacing: 635 636[insertion_emplacement_9] 637 638Note the naming here: `emplace_hint` is used when the position is indicated with 639a regular iterator, and for local iterators it is `emplace_pos`. 640[endsect] 641 642[section Exceptions] 643 644[import ../example/exceptions.cpp] 645(Code samples from [@boost:/libs/poly_collection/example/exceptions.cpp `exceptions.cpp`].) 646 647Besides the usual exceptions like 648[@http://en.cppreference.com/w/cpp/memory/new/bad_alloc `std::bad_alloc`] and 649those generated by user-provided types, polymorphic collections can throw the following: 650 651* `boost::poly_collection::unregistered_type` 652* `boost::poly_collection::not_copy_constructible` 653* `boost::poly_collection::not_equality_comparable` 654 655The situations in which the first is raised have been 656[link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration already discussed]; 657let us focus on the other two. 658 659We have a new type of sprite in our game defined by the non-copyable 660class `elf`: 661 662[rolegame_4] 663 664and we use it without problems until we write this: 665 666[exceptions_1] 667 668The first insertion works because the `elf` object passed is /temporary/ 669and can be /moved/ into the container, but the second statement actually 670needs to /copy/ the `elf` elements in `c` to `c2`, hence the exception. 671 672The potentially surprising aspect of this behavior is that standard 673containers signal this kind of problems by /failing at compile time/. 674Here we cannot afford this luxury because the exact types contained in a 675polymorphic collection are not known until run time (for instance, 676if `elf` elements had been erased before copying `c` to `c2` everything 677would have worked): basically, the deferral of errors from compile time 678to run time is an intrinsic feature of dynamic polymorphism. 679 680In the same vein, equality comparison requires that elements themselves 681be equality comparable: 682 683[exceptions_2] 684 685The above is unremarkable once we notice we have not defined `operator==` 686for any `sprite`. The problem may go unnoticed for quite some time, however, 687because determining that two polymorphic collections are equal (i.e. 688all their non-empty segments are equal) can return `false` without 689comparing anything at all (for instance, if segment sizes differ), in which 690case no exception is thrown. 691 692[note Operators for `<`, `<=`, `>` and `>=` comparison are not provided 693because /segment/ order is not fixed and may vary across otherwise 694identical collections. The situation is similar to that of standard 695[@http://en.cppreference.com/w/cpp/named_req/UnorderedAssociativeContainer unordered 696associative containers.] 697] 698 699These three are all the intrinsic exceptions thrown by Boost.PolyCollection. 700The implication is that if a type is _CopyConstructible_, 701_MoveAssignable_ (or move construction does not throw) and _EqualityComparable_, then 702the entire interface of Boost.PolyCollection is unrestrictedly available for it 703[footnote Provided, of course, that the type has the /right/ to be in the collection, 704that is, it is derived from the specified base, or callable with the specified 705signature, etc.]. 706 707[endsect] 708 709[section Algorithms] 710 711[import ../example/algorithms.cpp] 712(Code samples from [@boost:/libs/poly_collection/example/algorithms.cpp `algorithms.cpp`]. 713C++14 generic lambda expressions are used.) 714 715The ultimate purpose of Boost.PolyCollection is to speed up the processing of 716large quantities of polymorphic entities, in particular for those operations 717that involve linear traversal as implemented with a `for`-loop or using the 718quintessential _std::for_each_ algorithm. 719 720[algorithms_1] 721 722Replacing the container used in the program from classic alternatives to 723Boost.PolyCollection: 724 725* `std::vector<base*>` (or similar) → `boost::base_collection<base>` 726* `std::vector<std::function<signature>>` → `boost::function_collection<signature>` 727* `std::vector<boost::type_erasure::any<concept_>>` → `boost::any_collection<concept_>` 728 729is expected to increase performance due to better caching and branch prediction 730behavior. But there is still room for improvement. 731 732Consider this transformation of the code above using a double 733segment-element loop (based on the 734[link poly_collection.tutorial.deeper_into_the_segmented_nature.local_iterators local iterator] 735capabilities of Boost.PolyCollection): 736 737[algorithms_2] 738 739Although not obvious at first sight, this version of the same basic operation 740is actually /faster/ than the first one: for a segmented structure such as used 741by Boost.PolyCollection, each iteration with the non-local iterator passed to 742`std::for_each` involves: 743 744# hopping to next position in the segment, 745# checking for /end of segment/ (always), 746# if at end (infrequent), selecting the next segment, 747# checking for /end of range/ (always), 748 749whereas in the second version, iteration on the inner loop, where most 750processing happens, is a simple increment-and-check operation, that is, 751there is one less check (which happens at the much shorter outer loop). When 752the workload of the algorithm (the actually useful stuff done with the elements 753themselves) is relatively light, the overhead of looping can be very 754significant. 755 756To make it easier for the user to take advantage of faster segment-element 757looping, Boost.PolyCollection provides its own version of `for_each` based 758on that technique: 759 760[algorithms_3] 761 762`boost::poly_collection::for_each` has the same interface and behavior as 763`std::for_each` except for the fact that it only works for (non-local) 764iterators of a polymorphic container 765[footnote For any other type of iterator, it is guaranteed not to compile.]. 766Versions of other standard algorithms are provided as well: 767 768[algorithms_4] 769 770In fact, variants are given of most (though not all) of the algorithms in 771[@http://en.cppreference.com/w/cpp/algorithm `<algorithm>`] 772among those that are compatible with polymorphic collections 773[footnote For example, algorithms requiring bidirectional iterators or a higher 774category are not provided because polymorphic collections have forward-only 775iterators.]. 776Check the [link poly_collection.reference.header_boost_poly_collection_alg reference] 777for details. 778 779[section Type restitution] 780 781By /type restitution/ we mean the generic process of getting a concrete entity 782from an abstract one by providing missing type information: 783 784[algorithms_5] 785 786Type restitution has the potential to increase performance. Consider for 787instance the following: 788 789[algorithms_6] 790 791Behaviorwise this code is equivalent to simply executing `std::cout<<r<<"\n"`, 792but when type restitution succeeds the statement `std::cout<<str<<"\n"` is 793skipping a virtual-like call that would have happened if `r` were used instead. 794From a general point of view, supplying the compiler with extra type 795information allows it to perform more optimizations than in the abstract case, 796inlining being the prime example. 797 798All Boost.PolyCollection algorithms accept an optional list of types for 799restitution. Let us use the 800[link poly_collection.tutorial.basics.boost_any_collection `boost::any_collection`] 801scenario to illustrate this point: 802 803[algorithms_7] 804 805Output: 806 807[pre 808warrior 2,warrior 6,[pop-up 1],[pop-up 2],juggernaut 0,juggernaut 4, 809juggernaut 7,goblin 1,goblin 3,goblin 5,"stamina: 10,000","game over" 810] 811This rendering loop differs from the non-restituting one in two aspects: 812 813* A list of types is provided as template arguments to the algorithm. This is 814an indication that the types /may/ be actually present in the collection, not 815a promise. Also, the list of types need not be exhaustive, that is, some 816unlisted types could be present in the collection \u2014in the example above, 817the loop traverses *all* elements (including `std::string`s and `window`s), 818not only those corresponding to restituted types. In general, the more types 819are restituted, the greater the potential improvement in performance. 820* The lambda function passed is a generic one accepting its argument as 821`const auto&` 822[footnote This requires C++14, but the same effect can be achieved in C++11 823providing an equivalent, if more cumbersome, functor with a templatized 824call operator.]. 825 826Internally, `boost::poly_collection::for_each` checks for each segment if its 827type, say `T`, belongs in the type restitution list: if this is the case, the lambda 828function is passed a `const T&` rather than the generic 829`const boost::any_collection::value_type&`. For each restituted type we are 830saving indirection calls and possibly getting inlining optimizations, etc. 831As we see in the 832[link poly_collection.performance performance section], the speedup can be 833very significant. 834 835Type restitution works equally for the rest of collections of 836Boost.PolyCollection. When using `boost::base_collection`, though, the case 837of improved performance is more tricky: 838 839[algorithms_8] 840 841The problem here is that, even though the argument to the lambda function will 842be restituted (when appropriate) to `warrior&`, `juggernaut&` or `goblin&`, 843using it still involves doing a virtual function call in `s.render(std::cout)`. 844Whether this call is resolved to a non-virtual one depends on the quality 845of implementation of the compiler: 846 847* If the concrete class is marked as 848[@http://en.cppreference.com/w/cpp/language/final `final`], the compiler /in 849principle/ has enough information to get rid of the virtual function call. 850* Other than this, 851[@http://hubicka.blogspot.com.es/2014/01/devirtualization-in-c-part-1.html ['devirtualization]] 852capabilities /may/ be able to figure out that the type restitution scenario 853is actually casting the element to its true type, in which case, again, virtual 854calls are not needed. 855 856[endsect] 857 858[endsect] 859 860[endsect] 861 862[section Performance] 863 864[def _vs2015_x86_ Visual Studio 2015 x86] 865[def _vs2015_x64_ Visual Studio 2015 x64] 866[def _gcc63_x64_ GCC 6.3 x64] 867[def _clang40_x64_ Clang 4.0 x64] 868 869[template perf_fig[name environment file] 870[section [environment]] 871 872[role aligncenter 873'''<inlinemediaobject> 874 <imageobject><imagedata fileref="'''poly_collection/img/[file].png'''"/></imageobject> 875 </inlinemediaobject>''' 876] 877[role aligncenter [*[name], [environment]]] 878 879[endsect] 880] 881 882[import ../example/perf.cpp] 883(Testing program at [@boost:/libs/poly_collection/example/perf.cpp `perf.cpp`].) 884 885We ran tests to measure the performance of the containers of 886Boost.PolyCollection in two scenarios: 887 888* Insertion of elements. 889* Linear traversal and processing with `std::for_each` and `boost::poly_collection::for_each` 890(with and without 891[link poly_collection.tutorial.algorithms.type_restitution type restitution]). 892 893As a comparison baseline we used containers and facilities from the 894standard library and Boost (details below). Tests were run in the 895following environments: 896 897* *_vs2015_x86_*: Visual C++ 2015 in 32-bit (x86) release mode, Windows 7 64-bit, Intel Core i5-2520M @2.5GHz 898* *_vs2015_x64_*: Visual C++ 2015 in 64-bit (x64) release mode, same machine 899* *_gcc63_x64_*: GCC 6.3 release mode, Xubuntu 17.04 x64, Intel Core i7-5820 @3.3GHz 900* *_clang40_x64_*: Clang 4.0 release mode, same machine 901 902[section Container definitions] 903 904[section `boost::base_collection`] 905 906* Baseline container: `ptr_vector` = `boost::ptr_vector<base>` 907* Polymorphic collection: `base_collection` = `boost::base_collection<base>` 908* Element types: `T1` = `derived1`, `T2` = `derived2`, `T3` = `derived3` 909 910[perf_base_types] 911 912[endsect] 913 914[section `boost::function_collection`] 915 916* Baseline container: `func_vector` = `std::vector<std::function<int(int)>>` 917* Polymorphic collection: `function_collection` = `boost::function_collection<int(int)>` 918* Element types: `T1` = `concrete1`, `T2` = `concrete2`, `T3` = `concrete3` 919 920[perf_function_types] 921 922[endsect] 923 924[section `boost::any_collection`] 925 926* Baseline container: `any_vector` = `std::vector<boost::type_erasure::any<concept_>>` 927* Polymorphic collection: `any_collection` = `boost::any_collection<concept_>` 928* Element types: `T1` = `int`, `T2` = `double`, `T3` = `char` 929 930[perf_any_types] 931 932[endsect] 933 934[endsect] 935 936[section Insertion tests] 937 938Tests measure the time taken to insert /n/ elements (/n/ between 93910'''<superscript>2</superscript>''' and 10'''<superscript>7</superscript>''') 940from a source of values with types following the cyclic sequence 941 942`T1` `T1` `T2` `T2` `T3` `T1` `T1` `T2` `T2` `T3` ··· 943 944No `reserve` operation is done before insertion. 945The figures show resulting times in nanoseconds/element. 946The horizontal axis is logarithmic. 947 948[section Results for `boost::base_collection`] 949[perf_fig Insertion.._vs2015_x86_..insert_base_vs2015_x86] 950[perf_fig Insertion.._vs2015_x64_..insert_base_vs2015_x64] 951[perf_fig Insertion.._gcc63_x64_..insert_base_gcc63_x64] 952[perf_fig Insertion.._clang40_x64_..insert_base_clang40_x64] 953[endsect] 954 955[section Results for `boost::function_collection`] 956[perf_fig Insertion.._vs2015_x86_..insert_function_vs2015_x86] 957[perf_fig Insertion.._vs2015_x64_..insert_function_vs2015_x64] 958[perf_fig Insertion.._gcc63_x64_..insert_function_gcc63_x64] 959[perf_fig Insertion.._clang40_x64_..insert_function_clang40_x64] 960[endsect] 961 962[section Results for `boost::any_collection`] 963[perf_fig Insertion.._vs2015_x86_..insert_any_vs2015_x86] 964[perf_fig Insertion.._vs2015_x64_..insert_any_vs2015_x64] 965[perf_fig Insertion.._gcc63_x64_..insert_any_gcc63_x64] 966[perf_fig Insertion.._clang40_x64_..insert_any_clang40_x64] 967[endsect] 968 969[endsect] 970 971[section Processing tests] 972 973Tests measure the time taken to traverse a container of size 974/n/ (/n/ between 97510'''<superscript>2</superscript>''' and 10'''<superscript>7</superscript>''') 976and execute an operation on each of its elements. The operation for 977`boost::base_collection` and `boost::function_collection` (and the associated 978baseline containers) is defined as 979 980[perf_for_each_callable] 981 982whereas for `boost::any_collection` we use 983 984[perf_for_each_incrementable] 985 986The baseline container is tested with three different setups: 987 988* Directly as initialized by the process described for the 989[link poly_collection.performance.insertion_tests insertion tests]. 990The sequence of types is complex enough that CPU's branch prediction mechanisms 991are not able to fully anticipate it 992[footnote This has been verified empirically: simpler cycles did indeed 993yield better execution times.]. As elements are ordered according to their 994construction time, certain degree of memory contiguity is expected. 995* With an extra post-insertion stage by which elements are sorted 996according to their `typeid()`. This increases branch prediction 997efficiency at the expense of having worse cache friendliness. 998* With an extra post-insertion stage that randomly 999shuffles all the elements in the container. This is the worst possible 1000scenario both in terms of caching and branch prediction. 1001 1002As for the polymorphic collection, three variations are measured: 1003 1004* With `std::for_each` (the same as the baseline container). 1005* Using `boost::poly_collection::for_each`. 1006* Using `boost::poly_collection::for_each` with 1007[link poly_collection.tutorial.algorithms.type_restitution /type restitution/] 1008of `T1`, `T2` and `T3`. 1009 1010The figures show resulting times in nanoseconds/element. 1011The horizontal axis is logarithmic. 1012 1013[section Results for `boost::base_collection`] 1014[perf_fig Processing.._vs2015_x86_..for_each_base_vs2015_x86] 1015[perf_fig Processing.._vs2015_x64_..for_each_base_vs2015_x64] 1016[perf_fig Processing.._gcc63_x64_..for_each_base_gcc63_x64] 1017[perf_fig Processing.._clang40_x64_..for_each_base_clang40_x64] 1018[endsect] 1019 1020[section Results for `boost::function_collection`] 1021[perf_fig Processing.._vs2015_x86_..for_each_function_vs2015_x86] 1022[perf_fig Processing.._vs2015_x64_..for_each_function_vs2015_x64] 1023[perf_fig Processing.._gcc63_x64_..for_each_function_gcc63_x64] 1024[perf_fig Processing.._clang40_x64_..for_each_function_clang40_x64] 1025[endsect] 1026 1027[section Results for `boost::any_collection`] 1028[perf_fig Processing.._vs2015_x86_..for_each_any_vs2015_x86] 1029[perf_fig Processing.._vs2015_x64_..for_each_any_vs2015_x64] 1030[perf_fig Processing.._gcc63_x64_..for_each_any_gcc63_x64] 1031[perf_fig Processing.._clang40_x64_..for_each_any_clang40_x64] 1032[endsect] 1033 1034[endsect] 1035 1036[endsect] 1037 1038[include reference.qbk] 1039 1040[section Future work] 1041 1042A number of features asked by reviewers and users of Boost.PolyCollection are 1043considered for inclusion into future versions of the library. 1044 1045[section Alternative RTTI systems] 1046 1047Boost.PolyCollection can be extended to use _Boost.TypeIndex_ in RTTI-challenged 1048scenarios. Taking this idea further, it is not unusual that some environments 1049(game engines, for instance) provide their own RTTI framework: an even more 1050ambitious extension to Boost.PolyCollection would then be to make it configurable 1051for user-provided RTTI through some sort of traits class specifying replacements for 1052[@http://en.cppreference.com/w/cpp/types/type_info `std::type_info`] 1053and [@http://en.cppreference.com/w/cpp/language/typeid `typeid`]. 1054 1055[endsect] 1056 1057[section Copy traits] 1058 1059`boost::base_collection` requires that stored objects be _MoveConstructible_ and 1060_MoveAssignable_; unfortunately, it is customary to restrict copying in OOP 1061hierarchies to avoid slicing, which would force users to revisit their class 1062definitions in order to use Boost.PolyCollection. This can be alleviated by 1063offering a configurable traits class where copy and assignment can be defined 1064externally 1065 1066`` 1067template<typename T> 1068struct copy_traits 1069{ 1070 void construct(void*,T&&); 1071 void assign(T&,T&&); 1072}; 1073`` 1074 1075with default implementations resorting to regular placement `new` and 1076`T::operator=`. 1077 1078[endsect] 1079 1080[section Parallel algorithms] 1081 1082C++17 introduces 1083[@http://en.cppreference.com/w/cpp/experimental/parallelism parallel algorithms], 1084like for instance a parallel version of _std::for_each_; it is only 1085natural then to provide the corresponding 1086[link poly_collection.tutorial.algorithms Boost.PolyCollection-specific algorithms]. 1087The segmented nature of polymorphic collections makes them particularly amenable to 1088parallel processing. 1089 1090[endsect] 1091 1092[section `variant_collection`] 1093 1094/Closed polymorphism/ is a kind of dynamic polymorphism where the set of implementation 1095types is fixed at definition time: the prime example of this paradigm in C++ is 1096[@http://en.cppreference.com/w/cpp/utility/variant `std::variant`]. Although 1097`boost::any_collection<boost::mpl::vector<>>` can act as a sort of replacement for 1098`std::vector<std::variant<T1,...,TN>>`, this is in fact more similar to a 1099`std::vector<`[@http://en.cppreference.com/w/cpp/utility/any `std::any`]`>`, 1100and a collection class `boost::variant_collection<T1,...,TN>` could be designed to 1101better model closed polymorphism and take further advantage of the fact that 1102implementation types are fixed (for instance, internal virtual calls can be 1103completely eliminated). From a conceptual point of view, this would require introducing 1104a new [* `ClosedPolymorphicCollection`] notion and renaming the current 1105[link poly_collection.reference.polymorphic_containers.polymorphic_collections [* `PolymorphicCollection`]] 1106model to [* `OpenPolymorphicCollection`]. 1107 1108[endsect] 1109 1110[section Ordered polymorphic collections] 1111 1112Users have expressed interest in polymorphic collections where elements are kept 1113ordered within their segment and optionally duplicates are excluded, much like 1114[@http://boost.org/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx `boost::flat_set`/`boost::flat_multiset`] 1115do over their internal data vector. The open question remains of whether these 1116collections should also guarantee some order between segments (current ones don't) 1117to allow for the definition of container-level `operator<` and related operators. 1118 1119[endsect] 1120 1121[endsect] 1122 1123[section Release notes] 1124 1125[section Boost 1.74] 1126 1127* Fixed internal ambiguity problem between `boost::type_erasure::any` and 1128`boost::any` (issue [github poly_collection 17]). 1129* Maintenance work. 1130 1131[endsect] 1132 1133[section Boost 1.73] 1134 1135* Suppressed a potential redundant move warning in `boost::poly_collection::for_each`. 1136* Fixed a bug by which elements were copied rather than moved in 1137allocator-extended move construction and move assigment between collections 1138with non-propagating, unequal allocators. 1139* Allocator-extended move construction no longer decays to allocator-extended copy 1140construction for the legacy version of libstdc++-v3 shipped with GCC 4.8 1141(which can also be used by Clang). 1142 1143[endsect] 1144 1145[section Boost 1.72] 1146 1147* Maintenance work. 1148 1149[endsect] 1150 1151[section Boost 1.71] 1152 1153* Maintenance work. 1154 1155[endsect] 1156 1157[section Boost 1.70] 1158 1159* Improved handling of stateful allocators and allocator propagation traits, 1160after an error reported by Billy O'Neal ([github_pr poly_collection 9]). 1161* Fixed a potentially serious bug with an internal cache structure. 1162 1163[endsect] 1164 1165[section Boost 1.69] 1166 1167* Added Boost.PolyCollection-specific versions of algorithms 1168`std::for_each_n` and `std::sample`. 1169 1170[endsect] 1171 1172[section Boost 1.67] 1173 1174* Maintenance fixes. 1175 1176[endsect] 1177 1178[section Boost 1.66] 1179 1180* Boost.PolyCollection has been backported to GCC 4.8 to 4.9 and Clang 3.3 to 3.6. 1181The version of libstdc++-v3 shipped with GCC 4.8 (which can also be used by Clang) 1182has deficiencies that result in the following limitations when using 1183Boost.PolyCollection: 1184 * Stateful allocators are not properly supported. 1185 * Allocator-extended move construction decays to allocator-extended copy construction. 1186 * Copy construction crashes if an exception is thrown during element copying. 1187* Maintenance fixes. 1188 1189[endsect] 1190 1191[section Boost 1.65] 1192 1193* Initial release. 1194 1195[endsect] 1196 1197[endsect] 1198 1199[section Acknowledgments] 1200 1201The library uses [@http://www.pdimov.com/ Peter Dimov]'s 1202[@http://www.pdimov.com/cpp2/simple_cxx11_metaprogramming.html implementation 1203of `std::make_index_sequence`]. 1204[@http://manu343726.github.io/ Manu Sánchez] aided immensely with CI setup and 1205performance testing. [@http://stackoverflow.com/users/85371/sehe Sehe] 1206contributed performance results for GCC 5.2, and 1207[@https://www.linkedin.com/in/francisco-jose-tapia-iba%C3%B1ez-4239a07a Francisco 1208José Tapia] for Clang 3.9. 1209 1210The Boost acceptance review took place between the 3rd and 17th of May, 2017. Thank you 1211to Ion Gaztañaga for smoothly managing the process. The following people participated 1212with full reviews or valuable comments: 1213Pete Bartlett, Hans Dembinski, Dominique Devienne, Edward Diener, Vinnie Falco, 1214Ion Gaztañaga, Andrzej Krzemienski, Brook Milligan, Thorsten Ottosen, Steven Watanabe, 1215Adam Wulkiewicz. Many thanks to all of them. Steven Watanabe gave crucial help in 1216solving some hard problems related to the usage of _Boost.TypeErasure_. 1217 1218Boost.PolyCollection was designed and written between rainy 1219[@https://es.wikipedia.org/wiki/Viav%C3%A9lez Viavélez], noisy Madrid and beautiful 1220[@https://en.wikipedia.org/wiki/C%C3%A1ceres,_Spain Cáceres], August-November, 2016. 1221Most of the after-review work in preparation for the official Boost release was done 1222in the quiet town of [@https://es.wikipedia.org/wiki/Oropesa_(Toledo) Oropesa] during 1223the spring of 2017. 1224 1225In memory of Joaquín López Borrella (1939-2015), in memory of Héctor (2004-2017): 1226may your ghosts keep us company. 1227 1228[endsect]