1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0.1 Transitional//EN"> 2 3<html> 4<head> 5<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> 6<title>Boost.Flyweight Documentation - Performance</title> 7<link rel="stylesheet" href="style.css" type="text/css"> 8<link rel="start" href="index.html"> 9<link rel="prev" href="reference/tracking.html"> 10<link rel="up" href="index.html"> 11<link rel="next" href="examples.html"> 12</head> 13 14<body> 15<h1><img src="../../../boost.png" alt="Boost logo" align= 16"middle" width="277" height="86">Boost.Flyweight Performance</h1> 17 18<div class="prev_link"><a href="reference/tracking.html"><img src="prev.gif" alt="tracking policies" border="0"><br> 19Tracking policies 20</a></div> 21<div class="up_link"><a href="index.html"><img src="up.gif" alt="index" border="0"><br> 22Index 23</a></div> 24<div class="next_link"><a href="examples.html"><img src="next.gif" alt="examples" border="0"><br> 25Examples 26</a></div><br clear="all" style="clear: all;"> 27<br clear="all" style="clear: all;"> 28 29<hr> 30 31<h2>Contents</h2> 32 33<ul> 34 <li><a href="#intro">Introduction</a></li> 35 <li><a href="#memory">Memory consumption</a> 36 <ul> 37 <li><a href="#flyweight_size">Flyweight size</a></li> 38 <li><a href="#entry_size">Entry size</a></li> 39 <li><a href="#overall_memory">Overall memory consumption</a></li> 40 </ul> 41 </li> 42 <li><a href="#time">Time efficiency</a> 43 <ul> 44 <li><a href="#initialization">Initialization</a></li> 45 <li><a href="#assignment">Assignment</a></li> 46 <li><a href="#equality_comparison">Equality comparison</a></li> 47 <li><a href="#value_access">Value access</a></li> 48 </ul> 49 </li> 50 <li><a href="#results">Experimental results</a> 51 <ul> 52 <li><a href="#msvc_80">Microsoft Visual C++ 8.0</a> 53 <ul> 54 <li><a href="#msvc_80_memory">Memory</a></li> 55 <li><a href="#msvc_80_time">Execution time</a></li> 56 </ul> 57 </li> 58 <li><a href="#gcc_344">GCC 3.4.4</a> 59 <ul> 60 <li><a href="#gcc_344_memory">Memory</a></li> 61 <li><a href="#gcc_344_time">Execution time</a></li> 62 </ul> 63 </li> 64 </ul> 65 </li> 66 <li><a href="#conclusions">Conclusions</a></li> 67</ul> 68 69<h2><a name="intro">Introduction</a></h2> 70 71<p> 72We show how to estimate the memory reduction obtained by the usage of 73Boost.Flyweight in a particular scenario and study the impact on the execution 74time for the different functional areas of <code>flyweight</code>. 75Some experimental results are provided. 76</p> 77 78<h2><a name="memory">Memory consumption</a></h2> 79 80<p> 81As we saw in the <a href="tutorial/index.html#rationale">tutorial rationale</a>, 82the flyweight pattern is based on two types of objects: 83<ul> 84 <li>The flyweight objects proper, which have very small size, typically 85 that of a pointer. 86 </li> 87 <li>The shared values, which are stored as internal <i>entries</i> into the 88 flyweight factory. 89 </li> 90</ul> 91The overall memory consumption is then a function of the size of the 92flyweight objects, the size of the entry objects and the degree of 93value redundancy. 94</p> 95 96<h3><a name="flyweight_size">Flyweight size</a></h3> 97 98<p> 99The only data member of a <code>flyweight</code> object is a so-called 100<i>handle</i>, an opaque object of small size provided by the internal 101flyweight factory to refer to the entries it stores. For the default 102<a href="tutorial/configuration.html#hashed_factory"><code>hashed_factory</code></a>, 103this handle is merely a pointer, so <code>sizeof(flyweight<T>)=sizeof(void*)</code>, 1044 bytes in typical 32-bit architectures. 105For other types of factories, the handle is an iterator to an internal 106container used in the implementation of the factory: again, its size 107is typically that of a pointer. 108</p> 109 110<h3><a name="entry_size">Entry size</a></h3> 111 112<p> 113The entries stored in the factory associated to <code>flyweight<T,...></code> 114need not only hold a value of <code>T</code>, but also contain additional 115information related to the internal implementation of 116<code>flyweight<T,...></code>: 117</p> 118 119<blockquote> 120<i>entry</i> = <code>sizeof(T)</code> + <i>overhead</i>. 121</blockquote> 122 123<p> 124For the current implementation of Boost.Flyweight, the following aspects 125contribute to <i>overhead</i>: 126<ul> 127 <li>Usage of <a href="tutorial/key_value.html">key-value flyweights</a>.</li> 128 <li>Internal overhead of the <a href="tutorial/configuration.html#factories">factory</a> 129 container. 130 </li> 131 <li>Bookkeeping information associated to the 132 <a href="tutorial/configuration.html#tracking">tracking mechanism</a>. 133 </li> 134</ul> 135The table summarizes the separate contributions to <i>overhead</i> introduced 136by the different components taking part of the definition of 137a <code>flyweight</code> instantiation. Values are given in <i>words</i>, 138i.e. the size of a pointer, which is 4 bytes in a typical 32-bit architecture. 139Alignment may introduce additional overhead. 140</p> 141 142<p align="center"> 143<table cellspacing="0"> 144 <caption><b>Entry overhead of the components of Boost.Flyweight.</b></caption> 145<tr> 146 <th align="center"colspan="2">component</th> 147 <th align="center">overhead (words)</th> 148</tr> 149<tr> 150 <td align="center" rowspan="2"> <a href="tutorial/key_value.html#key_value"><code>key_value</code></a> </td> 151 <td align="center"> with <a href="tutorial/key_value.html#key_extractor">key extractor</a> </td> 152 <td align="center"> 1<sup>(1)</sup> </td> 153</tr> 154<tr> 155 <td align="center"> without <a href="tutorial/key_value.html#key_extractor">key extractor</a> </td> 156 <td align="center"> 1 + <code>sizeof(Key) </td> 157</tr> 158<tr class="odd_tr"> 159 <td align="center" rowspan="3"> factory </td> 160 <td align="center"> <a href="tutorial/configuration.html#hashed_factory"><code>hashed_factory</code></a> </td> 161 <td align="center"> ~2.5 </td> 162</tr> 163<tr class="odd_tr"> 164 <td align="center"> <a href="tutorial/configuration.html#set_factory"><code>set_factory</code></a> </td> 165 <td align="center"> 4<sup>(2)</sup> </td> 166</tr> 167<tr class="odd_tr"> 168 <td align="center"> <a href="tutorial/configuration.html#assoc_container_factory"><code>assoc_container_factory</code></a> </td> 169 <td align="center"> depends on the container used </td> 170</tr> 171<tr> 172 <td align="center" rowspan="2"> tracking mechanism </td> 173 <td align="center"> <a href="tutorial/configuration.html#refcounted"><code>refcounted</code></a> </td> 174 <td align="center"> 2<sup>(3)</sup> </td> 175</tr> 176<tr> 177 <td align="center"> <a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a> </td> 178 <td align="center"> 0 </td> 179</tr> 180</table> 181<sup>(1)</sup> <small>Assuming that <code>sizeof(Key)<=sizeof(Value)</code>.</small><br> 182<sup>(2)</sup> <small>For some implementations of <code>std::set</code> this overhead reduces to 3.</small><br> 183<sup>(3)</sup> <small>In some platforms this value can be 3.</small> 184</p> 185 186<p> 187For instance, for the default configuration parameters of <code>flyweight</code>, 188<i>overhead</i> is typically 2.5(<code>hashed_factory</code>) + 2(<code>refcounted</code>) 189= 4.5 words. 190</p> 191 192<h3><a name="overall_memory">Overall memory consumption</a></h3> 193 194<p> 195Consider a scenario where there are <i>N</i> different objects of type <code>T</code> 196jointly taking <i>M</i> different values. The objects consume then 197<i>S</i> = <i>N</i>·<i>T</i> bytes, where <i>T</i> is defined as the 198average size of <code>T</code> (<code>sizeof(T)</code> plus dynamic 199memory allocated by <code>T</code> objects). 200If we now replace <code>T</code> by some instantiation 201<code>flyweight<T,...></code>, the resulting memory consumption 202will be 203</p> 204 205<blockquote> 206<i>S<sub>F</sub></i> = 207<i>N</i>·<i>P</i> + <i>M</i>·(<i>T</i> + <i>overhead</i>), 208</blockquote> 209 210<p> 211where <i>P</i> is <code>sizeof(flyweight<T,...>)</code>, typically 212equal to <code>sizeof(void*)</code>, as seen <a href="#flyweight_size">before</a>. 213The ratio <i>S<sub>F</sub></i> / <i>S</i> is then 214</p> 215 216<blockquote> 217<i>S<sub>F</sub></i> / <i>S</i> = 218(<i>P</i> / <i>T</i>)+ (<i>M</i> / <i>N</i>)(1 + <i>overhead</i> / <i>T</i>). 219</blockquote> 220 221<p> 222<i>S<sub>F</sub></i> / <i>S</i> tends to its minimum, <i>P</i> / <i>T</i>, 223as <i>M</i> / <i>N</i> tends to 0, i.e. when the degree of value redundancy 224among <code>T</code> objects grows higher. On the other hand, the worst possible case 225<i>S<sub>F</sub></i> / <i>S</i> = 1 + (<i>P</i> + <i>overhead</i>) / <i>T</i> 226happens when <i>M</i> / <i>N</i> = 1, that is, if there is no value redundancy at all; in this situation there is 227no point in applying the flyweight pattern in the first place. 228</p> 229 230<p align="center"> 231<img src="memory.png" alt="relative memory consumption of Boost.Flyweight as a function of value diversity" 232width="446" height="411"><br> 233<b>Fig. 1: Relative memory consumption of Boost.Flyweight as a function of value diversity.</b> 234</p> 235 236<h2> 237<a name="time">Time efficiency</a> 238</h2> 239 240<p> 241The introduction of the flyweight pattern involves an extra level of indirection 242that, in general, results in some execution overhead when accessing the values. On 243the other hand, manipulation of flyweight objects is considerably faster than 244moving around the heavy values they stand for. We analyze qualitatively the 245execution overheads or improvements associated to the different usage contexts 246of Boost.Flyweight. 247</p> 248 249<h3><a name="initialization">Initialization</a></h3> 250 251<p> 252As compared with the initialization an object of type <code>T</code>, constructing 253a <code>flyweight<T></code> performs important extra work like looking 254up the value in the flyweight factory and inserting it if it is not present. 255So, construction of flyweights (other than copy construction, which is 256cheap), is expected to be noticeably slower than the construction of the 257underlying type <code>T</code>. Much of the time spent at constructing 258the associated <code>T</code> value proper can be saved, however, by 259using <a href="tutorial/key_value.html">key-value flyweights</a>. 260</p> 261 262<h3><a name="assignment">Assignment</a></h3> 263 264<p> 265Assignment of flyweight objects is extremely fast, as it only involves 266assigning an internal handle type used to refer to the shared value. Moreover, 267assignment of <code>flyweight</code> objects never throws. Assignment time 268is influenced by the type of <a href="tutorial/configuration.html#tracking">tracking 269policy</a> used; in this regard, 270<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a> 271is the fastest option. 272</p> 273 274<h3><a name="equality_comparison">Equality comparison</a></h3> 275 276<p> 277Comparing two <code>flyweight</code> objects for equality reduces to 278checking that the <i>addresses</i> of the values they are associated to 279are equal; in general, this operation is much faster than comparing the 280underlying values. This aspect is of particular relevance when the flyweight 281objects stand for complex values like those arising in the application of 282the <a href="examples.html#example3"><i>composite pattern</i></a>. 283</p> 284 285<h3><a name="value_access">Value access</a></h3> 286 287<p> 288The conversion from <code>flyweight<T></code> to <code>const T&</code> 289relies on a level of indirection relating the flyweight objects to the 290values they are associated to; so, value access is expected to be slower 291when using Boost.Flyweight as compared to using the associated values 292directly. This overhead, however, can be masked by an indirect improvement 293resulting from locality and cache effects: as the set of different <code>T</code> 294values handled by an instantiation of <code>flyweight<T></code> is 295generally much smaller than the equivalent family of <code>T</code> objects 296when Boost.Flyweight is not used, active values can fit better 297into the processor cache. 298</p> 299 300<h2><a name="results">Experimental results</a></h2> 301 302<p> 303A <a href="examples.html#example7">profiling program</a> was devised to test 304the space and time efficiency of different instantiations of <code>flyweight</code> 305against a base situation not using Boost.Flyweight. The profiled scenarios are: 306<ol> 307 <li><code>std::string</code>.</li> 308 <li><code>flyweight<std::string></code> with default configuration aspects 309 (<a href="tutorial/configuration.html#hashed_factory"><code>hashed_factory</code></a>, 310 <a href="tutorial/configuration.html#refcounted"><code>refcounted</code></a> tracking, 311 <a href="tutorial/configuration.html#simple_locking"><code>simple_locking</code></a>). 312 </li> 313 <li><code>flyweight<std::string,<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>></code>.</li> 314 <li><code>flyweight<std::string,<a href="tutorial/configuration.html#set_factory"><code>set_factory</code></a>></code>.</li> 315 <li><code>flyweight<std::string,<a href="tutorial/configuration.html#set_factory"><code>set_factory</code></a>,<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>></code>.</li> 316</ol> 317</p> 318 319<p> 320Actually the types tested are not exactly those listed above, but instrumented 321versions that keep track of the allocated memory for profiling purposes. 322The program parses a text file into an array of words and then perform various 323manipulations involving the different context usages of Boost.Flyweight discussed 324<a href="#time">previously</a>. As our text file we have used the 325<a href="http://www.gutenberg.org/dirs/etext99/2donq10.txt">plain text</a> 326version of Project Gutenberg edition of <a href="http://www.gutenberg.org/etext/2000"><i>Don 327Quijote</i></a> (2.04 MB). 328</p> 329 330<h3><a name="msvc_80">Microsoft Visual C++ 8.0</a></h3> 331 332<p> 333The program was built with default release settings and <code>_SECURE_SCL=0</code>. 334Tests were run under Windows XP in a machine equipped with an Intel Core 2 Duo T5500 335processor and 1 GB of RAM. 336</p> 337 338<h4><a name="msvc_80_memory">Memory</a></h4> 339 340<p align="center"> 341<img src="memory_msvc_80.png" alt="memory consumption (MB), MSVC++ 8.0" 342width="798" height="323"><br> 343<b>Fig. 2: Memory consumption, MSVC++ 8.0. Values in MB.</b> 344</p> 345 346<p> 347The results show the memory consumption figures for the different profiled 348scenarios. 349The standard library implementation of MSVC++ 8.0 features the so-called 350small buffer optimization for strings, by which <code>std::string</code> 351objects hold a small buffer that can be used when the string is short, 352thus avoding dynamic allocations. This results in <code>sizeof(std::string)</code> 353being quite high, 28 bytes. In our particular test strings are almost always 354held in the small buffer, so the minimum <a href="#overall_memory"><i>S<sub>F</sub></i> / <i>S</i></a> 355achievable is 4/28 = 14.3%, which is quite close to the experimental 356results, given that the memory devoted to storage of shared values 357is residual (around 3% of total memory) due to the high word redundancy 358of the text source. 359</p> 360 361<h4><a name="msvc_80_time">Execution time</a></h4> 362 363<p align="center"> 364<img src="time_msvc_80.png" alt="execution time (s), MSVC++ 8.0" 365width="820" height="325"><br> 366<b>Fig. 3: Execution time, MSVC++ 8.0. Values in seconds.</b> 367</p> 368 369<p> 370The figure displays execution times for the profiled scenarios in different 371usage contexts. In accordance with our previous 372<a href="#time">qualitative analysis</a>, initialization of <code>flyweight</code>s 373carries an important overhead with respect to the base case scenario (between 20% and 40% 374of additional execution time), while the other usage contexts 375(assignment, equality comparison and value access) have performance gains, 376with speedup factors of more than 10 in some cases. The use of a 377<a href="tutorial/configuration.html#refcounted"><code>refcounted</code></a> 378tracking policy introduces penalties with respect to 379<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a> 380in initialization and assignment, but has no effect in equality comparison 381and value access. 382</p> 383 384<h3><a name="gcc_344">GNU GCC 3.4.4</a></h3> 385 386<p> 387The Cygwin/MinGW version of the compiler was used, with command options 388<code>-ftemplate-depth-128 -O3 -finline-functions -DNDEBUG</code>. 389Tests were run under a Cygwin terminal in the same machine as <a href="#msvc_80">before</a>. 390</p> 391 392<h4><a name="gcc_344_memory">Memory</a></h4> 393 394<p align="center"> 395<img src="memory_gcc_344.png" alt="memory consumption (MB), GCC 3.4.4" 396width="798" height="323"><br> 397<b>Fig. 4: Memory consumption, GCC 3.4.4. Values in MB.</b> 398</p> 399 400<p> 401The standard library used by GCC 3.4.4 implements <code>std::string</code> 402using <a href="http://en.wikipedia.org/wiki/Copy-on-write">copy-on-write</a> 403optimization techniques, which leads to very small value redundancy for 404some usage patterns. This explains why the memory reduction achieved 405by Boost.Flyweight is so poor in this case. Other contexts where assignment 406is much less used than direct construction will favor Boost.Flyweight 407over plain copy-on-write <code>std::string</code>s. 408</p> 409 410<h4><a name="gcc_344_time">Execution time</a></h4> 411 412<p align="center"> 413<img src="time_gcc_344.png" alt="execution time (s), GCC 3.4.4" 414width="820" height="325"><br> 415<b>Fig. 5: Execution time, GCC 3.4.4. Values in seconds.</b> 416</p> 417 418<p> 419Relative performance figures are similar to those obtained for 420<a href="#msvc_80_time">MSVC++ 8.0</a>, although some of the 421speedups achieved by Boost.Flyweight are higher here (×25 422in equality comparison and up to ×100 in assignment when 423<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a> 424is in effect). 425</p> 426 427<h2><a name="conclusions">Conclusions</a></h2> 428 429<p> 430The introduction of Boost.Flyweight in application scenarios with very 431high value redundancy yields important reductions in memory consumption: 432this is especially relevant when data volume approaches the limits of 433physical memory in the machine, since Boost.Flyweight can avoid virtual 434memory thrashing thus making the application viable. We have shown 435how to estimate the achievable reduction in memory consumption from 436some basic value statistics and knowledge of the <code>flyweight</code> 437configuration aspects being used. 438</p> 439 440<p> 441Boost.Flyweight can also accelerate execution times in areas other than 442object initialization, due to the fastest manipulation of small 443flyweight objects and to locality and cache effects arising from the 444drastic reduction of the set of allocated values. 445</p> 446 447<hr> 448 449<div class="prev_link"><a href="reference/tracking.html"><img src="prev.gif" alt="tracking policies" border="0"><br> 450Tracking policies 451</a></div> 452<div class="up_link"><a href="index.html"><img src="up.gif" alt="index" border="0"><br> 453Index 454</a></div> 455<div class="next_link"><a href="examples.html"><img src="next.gif" alt="examples" border="0"><br> 456Examples 457</a></div><br clear="all" style="clear: all;"> 458<br clear="all" style="clear: all;"> 459 460<br> 461 462<p>Revised September 1st 2014</p> 463 464<p>© Copyright 2006-2014 Joaquín M López Muñoz. 465Distributed under the Boost Software 466License, Version 1.0. (See accompanying file <a href="../../../LICENSE_1_0.txt"> 467LICENSE_1_0.txt</a> or copy at <a href="http://www.boost.org/LICENSE_1_0.txt"> 468http://www.boost.org/LICENSE_1_0.txt</a>) 469</p> 470 471</body> 472</html> 473