# base/containers library

## What goes here

This directory contains some STL-like containers.

Things should be moved here that are generally applicable across the code base.
Don't add things here just because you need them in one place and think others
may someday want something similar. You can put specialized containers in
your component's directory and we can promote them here later if we feel there
is broad applicability.

### Design and naming

Containers should adhere as closely to STL as possible. Functions and behaviors
not present in STL should only be added when they are related to the specific
data structure implemented by the container.

For STL-like containers our policy is that they should use STL-like naming even
when it may conflict with the style guide. So functions and class names should
be lower case with underscores. Non-STL-like classes and functions should use
Google naming. Be sure to use the base namespace.

## Map and set selection

### Usage advice

  * Generally avoid **std::unordered\_set** and **std::unordered\_map**. In the
    common case, query performance is unlikely to be sufficiently higher than
    std::map to make a difference, insert performance is slightly worse, and
    the memory overhead is high. This makes sense mostly for large tables where
    you expect a lot of lookups.

  * Most maps and sets in Chrome are small and contain objects that can be
    moved efficiently. In this case, consider **base::flat\_map** and
    **base::flat\_set**. You need to be aware of the maximum expected size of
    the container since individual inserts and deletes are O(n), giving O(n^2)
    construction time for the entire map. But because it avoids mallocs in most
    cases, inserts are better or comparable to other containers even for
    several dozen items, and efficiently-moved types are unlikely to have
    performance problems for most cases until you have hundreds of items.
    If your container can be constructed in one shot, the constructor from
    vector gives O(n log n) construction times and it should be strictly
    better than a std::map.

  * **base::small\_map** has better runtime memory usage without the poor
    mutation performance of large containers that base::flat\_map has. But this
    advantage is partially offset by additional code size. Prefer in cases
    where you make many objects so that the code/heap tradeoff is good.

  * Use **std::map** and **std::set** if you can't decide. Even if they're not
    great, they're unlikely to be bad or surprising.

### Map and set details

Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the
container is mutated.

| Container                                | Empty size            | Per-item overhead | Stable iterators? |
|:---------------------------------------- |:--------------------- |:----------------- |:----------------- |
| std::map, std::set                       | 16 bytes              | 32 bytes          | Yes               |
| std::unordered\_map, std::unordered\_set | 128 bytes             | 16-24 bytes       | No                |
| base::flat\_map and base::flat\_set      | 24 bytes              | 0 (see notes)     | No                |
| base::small\_map                         | 24 bytes (see notes)  | 32 bytes          | No                |

**Takeaways:** std::unordered\_map and std::unordered\_set have high
overhead for small container sizes, so prefer these only for larger workloads.

Code size comparisons for a block of code (see appendix) on Windows using
strings as keys.

| Container           | Code size  |
|:------------------- |:---------- |
| std::unordered\_map | 1646 bytes |
| std::map            | 1759 bytes |
| base::flat\_map     | 1872 bytes |
| base::small\_map    | 2410 bytes |

**Takeaways:** base::small\_map generates more code because of the inlining of
both brute-force and red-black tree searching. This makes it less attractive
for random one-off uses. But if your code is called frequently, the runtime
memory benefits will be more important.
The code sizes of the other maps are close enough that it's not worth
worrying about.

### std::map and std::set

A red-black tree. Each inserted item requires the memory allocation of a node
on the heap. Each node contains a left pointer, a right pointer, a parent
pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit
platforms).

### std::unordered\_map and std::unordered\_set

A hash table. Implemented on Windows as a std::vector + std::list and in libc++
as the equivalent of a std::vector + a std::forward\_list. Both implementations
allocate an 8-entry hash table (containing iterators into the list) on
initialization, and grow to 64 entries once 8 items are inserted. Above 64
items, the size doubles every time the load factor exceeds 1.

The empty size is sizeof(std::unordered\_map) = 64 +
the initial hash table size which is 8 pointers. The per-item overhead in the
table above counts the list node (2 pointers on Windows, 1 pointer in libc++),
plus amortizes the hash table assuming a 0.5 load factor on average.

In a microbenchmark on Windows, inserts of 1M integers into a
std::unordered\_set took 1.07x the time of std::set, and queries took 0.67x the
time of std::set. For a typical 4-entry set (the statistical mode of map sizes
in the browser), query performance is identical to std::set and base::flat\_set.
On ARM, unordered\_set performance can be worse because integer division to
compute the bucket is slow, and a few "less than" operations can be faster than
computing a hash depending on the key type. The takeaway is that you should not
default to using unordered maps because "they're faster."

### base::flat\_map and base::flat\_set

A sorted std::vector. Searched via binary search; inserts in the middle require
moving elements to make room. Good cache locality. For large objects and large
set sizes, std::vector's doubling-when-full strategy can waste memory.
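
The sorted-vector idea can be illustrated with a minimal sketch. This is not the base::flat\_set implementation; `SortedVectorSet` and its members are hypothetical names used only to show where the O(log n) lookup, the O(n) single insert, and the O(n log n) one-shot construction come from, assuming the element type only provides `operator<`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical illustration only; this is not base::flat_set.
template <typename T>
class SortedVectorSet {
 public:
  // Binary search over the sorted vector: O(log n).
  bool contains(const T& value) const {
    auto it = std::lower_bound(items_.begin(), items_.end(), value);
    return it != items_.end() && !(value < *it);
  }

  // Returns true if the value was inserted. Inserting in the middle shifts
  // every later element by one slot, so a single insert is O(n).
  bool insert(T value) {
    auto it = std::lower_bound(items_.begin(), items_.end(), value);
    if (it != items_.end() && !(value < *it))
      return false;  // Already present.
    items_.insert(it, std::move(value));
    assert(std::is_sorted(items_.begin(), items_.end()));
    return true;
  }

  // One-shot construction: sort once, then drop equivalent neighbors.
  // This is O(n log n) instead of the O(n^2) of n separate inserts.
  static SortedVectorSet FromVector(std::vector<T> items) {
    std::sort(items.begin(), items.end());
    auto equivalent = [](const T& a, const T& b) {
      return !(a < b) && !(b < a);
    };
    items.erase(std::unique(items.begin(), items.end(), equivalent),
                items.end());
    SortedVectorSet result;
    result.items_ = std::move(items);
    return result;
  }

  std::size_t size() const { return items_.size(); }
  const std::vector<T>& items() const { return items_; }

 private:
  std::vector<T> items_;  // Always kept sorted and free of duplicates.
};
```

Because all elements live in one contiguous buffer, there is no per-node allocation, which is where the low per-item overhead and good cache locality come from. The same buffer is also why any insert or erase can invalidate iterators.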

Supports efficient construction from a vector of items which avoids the O(n^2)
insertion time of each element separately.

The per-item overhead will depend on the underlying std::vector's reallocation
strategy and the memory access pattern. Assuming items are being linearly added,
one would expect it to be 3/4 full, so per-item overhead will be 0.25 *
sizeof(T).

flat\_set/flat\_map support a notion of transparent comparisons. Therefore you
can, for example, look up a base::StringPiece in a set of std::strings without
constructing a temporary std::string. This functionality is based on C++14
extensions to the std::set/std::map interface.

You can find more information about transparent comparisons here:
http://en.cppreference.com/w/cpp/utility/functional/less_void

Example, smart pointer set:

```cpp
// Declare a type alias using base::UniquePtrComparator.
template <typename T>
using UniquePtrSet = base::flat_set<std::unique_ptr<T>,
                                    base::UniquePtrComparator>;

// ...
// Collect data.
std::vector<std::unique_ptr<int>> ptr_vec;
ptr_vec.reserve(5);
std::generate_n(std::back_inserter(ptr_vec), 5, []{
  return std::make_unique<int>(0);
});

// Construct a set.
UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES);

// Use raw pointers to look up keys.
int* ptr = ptr_set.begin()->get();
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
```

Example flat_map<std\::string, int>:

```cpp
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2}, {"b", 2}},
                                            base::KEEP_FIRST_OF_DUPES);

// Does not construct temporary strings.
str_to_int.find("c")->second = 3;
str_to_int.erase("c");
// Compare iterators; dereferencing the end iterator would be invalid.
EXPECT_EQ(str_to_int.end(), str_to_int.find("c"));

// NOTE: This does construct a temporary string. This happens since if the
// item is not in the container, then it needs to be constructed, which is
// something that transparent comparators don't have to guarantee.
str_to_int["c"] = 3;
```

### base::small\_map

A small inline buffer that is brute-force searched and that overflows into a
full std::map or std::unordered\_map. This gives the memory benefit of
base::flat\_map for small data sizes without the degenerate insertion
performance for large container sizes.

Since instantiations require both code for a std::map and a brute-force search
of the inline container, plus a fancy iterator to cover both cases, code size
is larger.

The empty size in the above table assumes a very small inline table. The
actual size will be sizeof(int) + max(sizeof(std::map), sizeof(T) *
inline\_size), since the inline buffer and the fallback map share storage.

## Deque

### Usage advice

Chromium code should always use `base::circular_deque` or `base::queue` in
preference to `std::deque` or `std::queue` due to memory usage and platform
variation.

The `base::circular_deque` implementation (and the `base::queue` which uses it)
provides performance that is consistent across platforms and better matches
most programmers' expectations (it doesn't waste as much space as libc++ and
doesn't do as many heap allocations as MSVC). It also generates less code than
`std::queue`: using it across the code base saves several hundred kilobytes.

Since `base::circular_deque` does not have stable iterators and it will move
the objects it contains, it may not be appropriate for all uses. If you need
these, consider using a `std::list` which will provide constant time insert and
erase.

### std::deque and std::queue

The implementation of `std::deque` varies considerably which makes it hard to
reason about. All implementations use a sequence of data blocks referenced by
an array of pointers.
The standard guarantees random access, amortized
constant operations at the ends, and linear mutations in the middle.

In Microsoft's implementation, each block is the larger of 16 bytes or the
size of the contained element. This means in practice that every expansion of
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
and Mac) uses 4K blocks which eliminates the problem of many heap allocations,
but generally wastes a large amount of space (an Android analysis revealed more
than 2.5MB wasted space from deque alone, resulting in some optimizations).
libstdc++ uses an intermediate-size 512 byte buffer.

Microsoft's implementation never shrinks the deque capacity, so the capacity
will always be the maximum number of elements ever contained. libstdc++
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.

### base::circular_deque and base::queue

A deque implemented as a circular buffer in an array. The underlying array will
grow like a `std::vector` while the beginning and end of the deque will move
around. The items will wrap around the underlying buffer so the storage will
not be contiguous, but fast random access iterators are still possible.

When the underlying buffer is filled, it will be reallocated and the contents
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
not stable across mutations.

## Stack

`std::stack` is like `std::queue` in that it is a wrapper around an underlying
container. The default container is `std::deque` so everything from the deque
section applies.

Chromium provides `base/containers/stack.h` which defines `base::stack` that
should be used in preference to `std::stack`. This changes the underlying
container to `base::circular_deque`.
The result will be very similar to
manually specifying a `std::vector` for the underlying implementation except
that the storage will shrink when it gets too empty (vector will never
reallocate to a smaller size).

Watch out: with some stack usage patterns it's easy to depend on unstable
behavior:

```cpp
base::stack<Foo> stack;
for (...) {
  Foo& current = stack.top();
  DoStuff();  // May call stack.push(), say if writing a parser.
  current.done = true;  // Current may reference deleted item!
}
```

## Appendix

### Code for map code size comparison

This just calls insert and query a number of times, with printfs that prevent
things from being dead-code eliminated.

```cpp
TEST(Foo, Bar) {
  base::small_map<std::map<std::string, Flubber>> foo;
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  auto found = foo.find("asdf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("foo");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("asdfhf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar1");
  printf("Found is %d\n", (int)(found == foo.end()));
}
```
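
The `Flubber` type used in the test is not shown in this document. To try the same insert-and-query pattern outside the Chromium tree, a stand-in along these lines may be enough. This is a sketch under stated assumptions: the `Flubber` definition and the `CountFound` helper are hypothetical, and `std::map` is substituted for `base::small_map` so the block has no Chromium dependencies.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for Flubber; the real type is not shown in this
// document. It only needs to be constructible from an int and a string.
struct Flubber {
  Flubber(int size, std::string name) : size(size), name(std::move(name)) {}
  int size;
  std::string name;
};

// Repeats the insert-then-query pattern with std::map substituted for
// base::small_map. Returns how many of the five probe keys were found.
int CountFound() {
  std::map<std::string, Flubber> foo;
  for (const char* key : {"foo", "bar", "foo1", "bar1", "foo", "bar"})
    foo.insert(std::make_pair(key, Flubber(8, "bar")));
  // insert() ignores keys that are already present, so four entries remain.
  assert(foo.size() == 4);

  int found_count = 0;
  for (const char* key : {"asdf", "foo", "bar", "asdfhf", "bar1"}) {
    if (foo.find(key) != foo.end())
      ++found_count;
  }
  return found_count;
}
```

Swapping the container type in `CountFound` is one way to repeat the comparison for the other maps, although absolute code sizes will depend on the compiler and flags used.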