1<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> 2<html> 3<head> 4<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> 5<title>An efficient polymorphic data structure</title> 6<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css"> 7<meta name="generator" content="DocBook XSL Stylesheets V1.79.1"> 8<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset"> 9<link rel="up" href="../poly_collection.html" title="Chapter 28. Boost.PolyCollection"> 10<link rel="prev" href="../poly_collection.html" title="Chapter 28. Boost.PolyCollection"> 11<link rel="next" href="tutorial.html" title="Tutorial"> 12</head> 13<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> 14<table cellpadding="2" width="100%"><tr> 15<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td> 16<td align="center"><a href="../../../index.html">Home</a></td> 17<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td> 18<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> 19<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> 20<td align="center"><a href="../../../more/index.htm">More</a></td> 21</tr></table> 22<hr> 23<div class="spirit-nav"> 24<a accesskey="p" href="../poly_collection.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../poly_collection.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorial.html"><img src="../../../doc/src/images/next.png" alt="Next"></a> 25</div> 26<div class="section"> 27<div class="titlepage"><div><div><h2 class="title" style="clear: both"> 28<a name="poly_collection.an_efficient_polymorphic_data_st"></a><a class="link" href="an_efficient_polymorphic_data_st.html" title="An efficient polymorphic data structure">An efficient 29 polymorphic data structure</a> 30</h2></div></div></div> 31<p> 32 Suppose we have a <code class="computeroutput"><span class="identifier">base</span></code> abstract 33 class with implementations <code class="computeroutput"><span class="identifier">derived1</span></code>, 34 <code class="computeroutput"><span class="identifier">derived2</span></code> and <code class="computeroutput"><span class="identifier">derived3</span></code>. The memory layout of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span></code> (or similar constructs such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">unique_ptr</span><span class="special"><</span><span class="identifier">base</span><span class="special">>></span></code> 35 or <a href="../../../libs/ptr_container/" target="_top"><code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">ptr_vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">></span></code></a>) looks like the following: 36 </p> 37<p> 38 <span class="inlinemediaobject"><img src="img/ptr_vector.png"></span> 39 </p> 40<p> 41 Elements that are adjacent in the vector are not necessarily allocated contiguously, 42 much less so if the vector has undergone mid insertions and deletions. A typical 43 processing operation 44 </p> 45<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span> <span class="identifier">v</span><span class="special">;</span> 46<span class="special">...</span> 47<span class="keyword">for</span><span class="special">(</span><span class="identifier">base</span><span class="special">*</span> <span class="identifier">b</span><span class="special">:</span> <span class="identifier">v</span><span class="special">){</span> 48 <span class="special">...</span> <span class="comment">// access base's virtual interface</span> 49<span class="special">}</span> 50</pre> 51<p> 52 is impacted negatively by two factors: 53 </p> 54<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> 55<li class="listitem"> 56 Scattering of elements throughout memory reduces CPU caching efficiency, 57 which in general favor regular access loops to contiguous memory areas. 58 </li> 59<li class="listitem"> 60 <a href="https://en.wikipedia.org/wiki/Branch_predictor" target="_top">Branch prediction</a> 61 tries to minimize the effect of running conditional code (such as an <code class="computeroutput"><span class="keyword">if</span></code>-<code class="computeroutput"><span class="keyword">else</span></code> 62 statement or the invocation of a <code class="computeroutput"><span class="identifier">base</span></code> 63 virtual function) by speculatively executing a given branch based on past 64 history. This mechanism is rendered mostly useless when <code class="computeroutput"><span class="identifier">derived1</span></code>, 65 <code class="computeroutput"><span class="identifier">derived2</span></code> and <code class="computeroutput"><span class="identifier">derived3</span></code> elements are interspersed along 66 the sequence without a definite pattern. 67 </li> 68</ul></div> 69<p> 70 These limitations are imposed by the very nature of dynamic polymorphism: as 71 the exact types of the elements accessed through <code class="computeroutput"><span class="identifier">base</span></code>'s 72 interface are not known, an indirection through <code class="computeroutput"><span class="identifier">base</span><span class="special">*</span></code> (a particular form of <span class="emphasis"><em>type erasure</em></span>) 73 is required. There is however a critical observation: even though derived types 74 are not known when traversing a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span></code>, the information is typically available 75 <span class="emphasis"><em>at compile time</em></span> at the point of insertion in the vector: 76 </p> 77<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span> <span class="identifier">v</span><span class="special">;</span> 78<span class="special">...</span> 79<span class="identifier">v</span><span class="special">.</span><span class="identifier">insert</span><span class="special">(</span><span class="keyword">new</span> <span class="identifier">derived2</span><span class="special">{...});</span> <span class="comment">// the type derived2 is "forgotten" by v</span> 80</pre> 81<p> 82 A suitably designed container can take advantage of this information to arrange 83 elements contiguously according to their exact type, which results in an internal 84 data structure (a map of pointers to <a href="http://en.cppreference.com/w/cpp/types/type_info" target="_top"><code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">type_info</span></code></a> 85 objects, actually) pointing to as many vectors or <span class="emphasis"><em>segments</em></span> 86 as there are derived classes: 87 </p> 88<p> 89 <span class="inlinemediaobject"><img src="img/segment_map.png"></span> 90 </p> 91<p> 92 Traversing such a structure reduces to looping over all the segments one after 93 another: this is extremely efficient both in terms of caching and branch prediction. 94 In the process we have however lost the free-order capability of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span></code> (free order can only be retained at the 95 segment level), but if this is not relevant to the user application the potential 96 performance gains of switching to this structure are large. 97 </p> 98<p> 99 The discussion has focused on base/derived programming, also known as OOP, 100 but it also applies to other forms of dynamic polymorphism: 101 </p> 102<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> 103<li class="listitem"> 104 <a href="http://en.cppreference.com/w/cpp/utility/functional/function" target="_top"><code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">function</span></code></a> abstracts callable entities 105 with the same given signature under a common interface. Internally, pointer 106 indirections and virtual-like function calls are used. Memory fragmentation 107 is expected to be lower than with OOP, though, as implementations usually 108 feature the so-called <a href="http://talesofcpp.fusionfenix.com/post-16/episode-nine-erasing-the-concrete#a-note-on-small-buffer-optimization" target="_top"><span class="emphasis"><em>small 109 buffer optimization</em></span></a> to avoid heap allocation in some 110 situations. 111 </li> 112<li class="listitem"> 113 The case of <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">function</span></code> can be seen as a particular 114 example of a more general form of polymorphism called <a href="https://en.wikipedia.org/wiki/Duck_typing" target="_top"><span class="emphasis"><em>duck 115 typing</em></span></a>, where unrelated types are treated uniformly 116 if they conform to the same given <span class="emphasis"><em>interface</em></span> (a specified 117 set of member functions and/or operations). Duck typing provides the power 118 of OOP while allowing for greater flexibility as polymorphic types need 119 not derive from a preexisting base class or in general be designed with 120 any particular interface in mind --in fact, the same object can be duck-typed 121 to different interfaces. Among other libraries, <a href="../../../libs/type_erasure" target="_top">Boost.TypeErasure</a> 122 provides duck typing for C++. Under the hood, duck typing requires pointer 123 indirection and virtual call implementation techniques analogous to those 124 of OOP, and so there are the same opportunities for efficient container 125 data structures as we have described. 126 </li> 127</ul></div> 128<p> 129 Boost.PolyCollection provides three different container class templates dealing 130 with OOP, <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">function</span></code>-like polymorphism and duck typing 131 as implemented by Boost.TypeErasure: 132 </p> 133<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> 134<li class="listitem"> 135 <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">base_collection</span></code> 136 </li> 137<li class="listitem"> 138 <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">function_collection</span></code> 139 </li> 140<li class="listitem"> 141 <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">any_collection</span></code> 142 </li> 143</ul></div> 144<p> 145 The interfaces of these containers are mostly the same and follow the usual 146 conventions of standard library containers. 147 </p> 148</div> 149<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> 150<td align="left"></td> 151<td align="right"><div class="copyright-footer">Copyright © 2016-2020 Joaquín M López Muñoz<p> 152 Distributed under the Boost Software License, Version 1.0. (See accompanying 153 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) 154 </p> 155</div></td> 156</tr></table> 157<hr> 158<div class="spirit-nav"> 159<a accesskey="p" href="../poly_collection.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../poly_collection.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorial.html"><img src="../../../doc/src/images/next.png" alt="Next"></a> 160</div> 161</body> 162</html> 163