# base/containers library

## What goes here

This directory contains some STL-like containers.

Things should be moved here that are generally applicable across the code base.
Don't add things here just because you need them in one place and think others
may someday want something similar. You can put specialized containers in
your component's directory and we can promote them here later if we feel there
is broad applicability.

### Design and naming

Containers should adhere as closely to STL as possible. Functions and behaviors
not present in STL should only be added when they are related to the specific
data structure implemented by the container.

For STL-like containers our policy is that they should use STL-like naming even
when it may conflict with the style guide. So functions and class names should
be lower case with underscores. Non-STL-like classes and functions should use
Google naming. Be sure to use the base namespace.

## Map and set selection

### Usage advice

  * Generally avoid **std::unordered\_set** and **std::unordered\_map**. In the
    common case, query performance is unlikely to be sufficiently higher than
    std::map to make a difference, insert performance is slightly worse, and
    the memory overhead is high. These containers make sense mostly for large
    tables where you expect a lot of lookups.

  * Most maps and sets in Chrome are small and contain objects that can be
    moved efficiently. In this case, consider **base::flat\_map** and
    **base::flat\_set**. You need to be aware of the maximum expected size of
    the container since individual inserts and deletes are O(n), giving O(n^2)
    construction time for the entire map. But because it avoids mallocs in most
    cases, inserts are better than or comparable to other containers even for
    several dozen items, and efficiently-moved types are unlikely to have
    performance problems for most cases until you have hundreds of items. If
    your container can be constructed in one shot, the constructor from vector
    gives O(n log n) construction times and it should be strictly better than
    a std::map.

  * **base::small\_map** has better runtime memory usage without the poor
    mutation performance of large containers that base::flat\_map has. But this
    advantage is partially offset by additional code size. Prefer it in cases
    where you make many objects so that the code/heap tradeoff is good.

  * Use **std::map** and **std::set** if you can't decide. Even if they're not
    great, they're unlikely to be bad or surprising.

### Map and set details

Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the
container is mutated.

| Container                                | Empty size            | Per-item overhead | Stable iterators? |
|:---------------------------------------- |:--------------------- |:----------------- |:----------------- |
| std::map, std::set                       | 16 bytes              | 32 bytes          | Yes               |
| std::unordered\_map, std::unordered\_set | 128 bytes             | 16-24 bytes       | No                |
| base::flat\_map and base::flat\_set      | 24 bytes              | 0 (see notes)     | No                |
| base::small\_map                         | 24 bytes (see notes)  | 32 bytes          | No                |

**Takeaways:** std::unordered\_map and std::unordered\_set have high
overhead for small container sizes; prefer these only for larger workloads.

Code size comparisons for a block of code (see appendix) on Windows using
strings as keys.

| Container           | Code size  |
|:------------------- |:---------- |
| std::unordered\_map | 1646 bytes |
| std::map            | 1759 bytes |
| base::flat\_map     | 1872 bytes |
| base::small\_map    | 2410 bytes |

**Takeaways:** base::small\_map generates more code because of the inlining of
both brute-force and red-black tree searching. This makes it less attractive
for random one-off uses. But if your code is called frequently, the runtime
memory benefits will be more important. The code sizes of the other maps are
close enough that it's not worth worrying about.

### std::map and std::set

A red-black tree. Each inserted item requires the memory allocation of a node
on the heap. Each node contains a left pointer, a right pointer, a parent
pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit
platforms).
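As a rough illustration of where the 32 bytes come from, the sketch below lays
out a hypothetical node header. `RbNodeHeader` and `kNodeOverhead` are names
invented for this example; real standard library node types differ in detail,
so treat this as a sketch that assumes the straightforward layout described
above.

```cpp
#include <cstddef>

// Hypothetical node header, for illustration only. Real std::map nodes are
// implementation-specific but carry the same kind of bookkeeping.
struct RbNodeHeader {
  RbNodeHeader* left;
  RbNodeHeader* right;
  RbNodeHeader* parent;
  bool is_red;  // The "color"; padded out to pointer alignment.
};

// Three pointers plus a padded color flag occupy four pointer-sized slots,
// which is 32 bytes when pointers are 8 bytes wide.
constexpr std::size_t kNodeOverhead = sizeof(RbNodeHeader);
static_assert(kNodeOverhead == 4 * sizeof(void*),
              "expected three pointers plus a padded color flag");
```

The stored key and value come on top of this header, which is why the table
above lists it as per-item overhead.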

### std::unordered\_map and std::unordered\_set

A hash table. Implemented on Windows as a std::vector + std::list and in libc++
as the equivalent of a std::vector + a std::forward\_list. Both implementations
allocate an 8-entry hash table (containing iterators into the list) on
initialization, and grow to 64 entries once 8 items are inserted. Above 64
items, the size doubles every time the load factor exceeds 1.

The empty size is sizeof(std::unordered\_map) = 64 + the initial hash table
size, which is 8 pointers. The per-item overhead in the table above counts the
list node (2 pointers on Windows, 1 pointer in libc++), plus amortizes the hash
table assuming a 0.5 load factor on average.

In a microbenchmark on Windows, inserts of 1M integers into a
std::unordered\_set took 1.07x the time of std::set, and queries took 0.67x the
time of std::set. For a typical 4-entry set (the statistical mode of map sizes
in the browser), query performance is identical to std::set and base::flat\_set.
On ARM, unordered\_set performance can be worse because integer division to
compute the bucket is slow, and a few "less than" operations can be faster than
computing a hash depending on the key type. The takeaway is that you should not
default to using unordered maps because "they're faster."

### base::flat\_map and base::flat\_set

A sorted std::vector. Searched via binary search; inserts in the middle require
moving elements to make room. Good cache locality. For large objects and large
set sizes, std::vector's doubling-when-full strategy can waste memory.

Supports efficient construction from a vector of items, which avoids the O(n^2)
cost of inserting each element separately.
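The cost difference between the two construction strategies can be sketched
with a plain sorted `std::vector<int>`. This is not the `base::flat_set`
implementation; `MakeSortedUnique`, `InsertSorted`, and `ContainsSorted` are
hypothetical helpers that only illustrate the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Builds a sorted, duplicate-free vector in one shot: O(n log n) for the sort
// plus O(n) to drop duplicates.
inline std::vector<int> MakeSortedUnique(std::vector<int> items) {
  std::sort(items.begin(), items.end());
  items.erase(std::unique(items.begin(), items.end()), items.end());
  return items;
}

// Inserts one value while keeping the vector sorted. Finding the slot is
// O(log n), but shifting the later elements makes each insert O(n), which is
// where the O(n^2) cost of element-by-element construction comes from.
inline bool InsertSorted(std::vector<int>& sorted, int value) {
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
  if (it != sorted.end() && *it == value)
    return false;  // Already present.
  sorted.insert(it, value);
  return true;
}

// Binary-search lookup over the sorted storage.
inline bool ContainsSorted(const std::vector<int>& sorted, int value) {
  return std::binary_search(sorted.begin(), sorted.end(), value);
}
```

When all items are known up front, the one-shot path does a single sort
instead of shifting elements on every insert.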

The per-item overhead will depend on the underlying std::vector's reallocation
strategy and the memory access pattern. Assuming items are being linearly added,
one would expect the vector to be 3/4 full on average, so per-item overhead
will be 0.25 * sizeof(T).

flat\_set/flat\_map support a notion of transparent comparisons. Therefore you
can, for example, look up a base::StringPiece in a set of std::strings without
constructing a temporary std::string. This functionality is based on the C++14
extensions to the std::set/std::map interface.

You can find more information about transparent comparisons here:
http://en.cppreference.com/w/cpp/utility/functional/less_void
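The same mechanism can be tried with the standard containers alone. The
following minimal sketch assumes C++17 for `std::string_view` (standing in for
`base::StringPiece`); `NameSet`, `ContainsName`, and
`TransparentLookupExample` are names made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <string_view>

// std::less<> (that is, std::less<void>) is a transparent comparator, which
// enables the heterogeneous find() overloads on std::set.
using NameSet = std::set<std::string, std::less<>>;

// Looks up `name` without converting it to a std::string first.
inline bool ContainsName(const NameSet& names, std::string_view name) {
  return names.find(name) != names.end();
}

inline void TransparentLookupExample() {
  NameSet names = {"alpha", "beta"};
  assert(ContainsName(names, "beta"));
  assert(!ContainsName(names, "gamma"));
}
```

With the default `std::less<std::string>` comparator, `find()` only accepts the
key type, so the argument would have to be converted to a `std::string` before
the call.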

Example, smart pointer set:

```cpp
// Declare a type alias using base::UniquePtrComparator.
template <typename T>
using UniquePtrSet = base::flat_set<std::unique_ptr<T>,
                                    base::UniquePtrComparator>;

// ...
// Collect data.
std::vector<std::unique_ptr<int>> ptr_vec;
ptr_vec.reserve(5);
std::generate_n(std::back_inserter(ptr_vec), 5, []{
  return std::make_unique<int>(0);
});

// Construct a set.
UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES);

// Use raw pointers to look up keys.
int* ptr = ptr_set.begin()->get();
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
```

Example, `base::flat_map<std::string, int>`:

```cpp
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2}, {"b", 2}},
                                            base::KEEP_FIRST_OF_DUPES);

// Does not construct temporary strings.
str_to_int.find("c")->second = 3;
str_to_int.erase("c");
// After the erase, find() returns end(); do not dereference it.
EXPECT_EQ(str_to_int.end(), str_to_int.find("c"));

// NOTE: This does construct a temporary string. If the item is not in the
// container, the key needs to be constructed, which is something that
// transparent comparators don't have to guarantee.
str_to_int["c"] = 3;
```

### base::small\_map

A small inline buffer that is brute-force searched and overflows into a full
std::map or std::unordered\_map. This gives the memory benefit of
base::flat\_map for small data sizes without the degenerate insertion
performance for large container sizes.

Since instantiations require both code for a std::map and a brute-force search
of the inline container, plus a fancy iterator to cover both cases, code size
is larger.

The initial size in the above table is assuming a very small inline table. The
actual size will be sizeof(int) + max(sizeof(std::map), sizeof(T) *
inline\_size), since the inline buffer and the fallback map share storage.
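The inline-then-overflow behavior can be sketched as follows. This is a
simplified, hypothetical `TinyIntMap`, not `base::small_map`: it is restricted
to `int` keys and values, and it keeps the inline array and the `std::map` as
separate members rather than sharing their storage, so it shows the lookup
strategy but not the memory layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

template <std::size_t kInline>
class TinyIntMap {
 public:
  // Returns a pointer to the value for `key`, or nullptr if it is absent.
  int* Find(int key) {
    if (!overflowed_) {
      // Brute-force search of the inline buffer.
      for (std::size_t i = 0; i < inline_size_; ++i) {
        if (inline_items_[i].first == key)
          return &inline_items_[i].second;
      }
      return nullptr;
    }
    auto it = overflow_.find(key);
    return it == overflow_.end() ? nullptr : &it->second;
  }

  // Inserts `key` or overwrites its existing value.
  void Set(int key, int value) {
    if (int* existing = Find(key)) {
      *existing = value;
      return;
    }
    if (!overflowed_ && inline_size_ < kInline) {
      inline_items_[inline_size_++] = {key, value};
      return;
    }
    if (!overflowed_) {
      // The inline buffer is full: migrate everything into the map once.
      overflow_.insert(inline_items_.begin(),
                       inline_items_.begin() + inline_size_);
      inline_size_ = 0;
      overflowed_ = true;
    }
    overflow_[key] = value;
  }

  bool UsesInlineStorage() const { return !overflowed_; }

 private:
  std::array<std::pair<int, int>, kInline> inline_items_{};
  std::size_t inline_size_ = 0;
  bool overflowed_ = false;
  std::map<int, int> overflow_;
};

inline void TinyIntMapExample() {
  TinyIntMap<2> map;
  map.Set(1, 10);
  map.Set(2, 20);
  assert(map.UsesInlineStorage());
  map.Set(3, 30);  // A third distinct key exceeds the inline capacity.
  assert(!map.UsesInlineStorage());
  assert(*map.Find(1) == 10);
}
```

Both search paths are instantiated for every element type, which is the source
of the extra code size noted above.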

## Deque

### Usage advice

Chromium code should always use `base::circular_deque` or `base::queue` in
preference to `std::deque` or `std::queue` due to memory usage and platform
variation.

The `base::circular_deque` implementation (and the `base::queue` which uses it)
provides performance consistent across platforms that better matches most
programmers' expectations (it doesn't waste as much space as libc++ and doesn't
do as many heap allocations as MSVC). It also generates less code than
`std::queue`: using it across the code base saves several hundred kilobytes.

Since `base::circular_deque` does not have stable iterators and it will move
the objects it contains, it may not be appropriate for all uses. If you need
stable iterators or objects that stay in place, consider using a `std::list`,
which will provide constant time insert and erase.

### std::deque and std::queue

The implementation of `std::deque` varies considerably between standard
libraries, which makes it hard to reason about. All implementations use a
sequence of data blocks referenced by an array of pointers. The standard
guarantees random access, amortized constant operations at the ends, and linear
mutations in the middle.

In Microsoft's implementation, each block is the larger of 16 bytes or the
size of the contained element. This means in practice that every expansion of
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
and Mac) uses 4K blocks, which eliminates the problem of many heap allocations
but generally wastes a large amount of space (an Android analysis revealed more
than 2.5MB wasted space from deque alone, resulting in some optimizations).
libstdc++ uses an intermediate-size 512 byte buffer.

Microsoft's implementation never shrinks the deque capacity, so the capacity
will always be the maximum number of elements ever contained. libstdc++
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.

### base::circular_deque and base::queue

A deque implemented as a circular buffer in an array. The underlying array will
grow like a `std::vector` while the beginning and end of the deque will move
around. The items will wrap around the underlying buffer so the storage will
not be contiguous, but fast random access iterators are still possible.

When the underlying buffer is filled, it will be reallocated and the contents
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
not stable across mutations.
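The wrap-around indexing and the grow-and-move step can be sketched with a
small ring buffer of ints. `IntRing` is a hypothetical class written for this
illustration, not `base::circular_deque`; the initial capacity of 4 and the
doubling growth factor are assumptions, and shrinking is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class IntRing {
 public:
  void push_back(int value) {
    if (size_ == buffer_.size())
      Grow();
    // The end of the deque wraps around to the front of the buffer.
    buffer_[(begin_ + size_) % buffer_.size()] = value;
    ++size_;
  }

  int pop_front() {
    assert(size_ > 0);
    int value = buffer_[begin_];
    begin_ = (begin_ + 1) % buffer_.size();
    --size_;
    return value;
  }

  // Random access: logical index i maps to a wrapped physical slot.
  int& operator[](std::size_t i) {
    assert(i < size_);
    return buffer_[(begin_ + i) % buffer_.size()];
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return buffer_.size(); }

 private:
  // Reallocates to a larger buffer and unwraps the contents so the first
  // element lands at physical index 0. References into the old buffer become
  // invalid here, which is why iterators are not stable.
  void Grow() {
    std::vector<int> bigger(buffer_.empty() ? 4 : buffer_.size() * 2);
    for (std::size_t i = 0; i < size_; ++i)
      bigger[i] = buffer_[(begin_ + i) % buffer_.size()];
    buffer_ = std::move(bigger);
    begin_ = 0;
  }

  std::vector<int> buffer_;
  std::size_t begin_ = 0;
  std::size_t size_ = 0;
};
```

Popping from the front only advances `begin_`, so a queue-like workload reuses
the same buffer without reallocating until it is actually full.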

## Stack

`std::stack` is like `std::queue` in that it is a wrapper around an underlying
container. The default container is `std::deque` so everything from the deque
section applies.

Chromium provides `base/containers/stack.h` which defines `base::stack` that
should be used in preference to `std::stack`. This changes the underlying
container to `base::circular_deque`. The result will be very similar to
manually specifying a `std::vector` for the underlying implementation except
that the storage will shrink when it gets too empty (vector will never
reallocate to a smaller size).

Watch out: with some stack usage patterns it's easy to depend on unstable
behavior:

```cpp
base::stack<Foo> stack;
for (...) {
  Foo& current = stack.top();
  DoStuff();  // May call stack.push(), say if writing a parser.
  current.done = true;  // `current` may reference a moved or deleted item!
}
```
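One way to sidestep the dangling reference is to remember a position instead
of a reference. The sketch below uses a plain `std::vector<Foo>` as the stack
so that it is self-contained and indexable; `ProcessTop` and this `DoStuff`
are hypothetical stand-ins, and the approach assumes `DoStuff` never pops the
element being processed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Foo {
  bool done = false;
};

// Stand-in for work that may push onto the stack.
inline void DoStuff(std::vector<Foo>& stack) {
  stack.push_back(Foo());
}

inline void ProcessTop(std::vector<Foo>& stack) {
  assert(!stack.empty());
  // Remember the index, not a reference: the index stays meaningful even if
  // push_back() reallocates the storage and moves every element.
  const std::size_t current_index = stack.size() - 1;
  DoStuff(stack);
  stack[current_index].done = true;
}
```

Re-reading `top()` after `DoStuff()` would not help here, because the top may
by then be a newly pushed item rather than the one being processed.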

## Appendix

### Code for map code size comparison

This just calls insert and query a number of times, with printfs that prevent
things from being dead-code eliminated.

```cpp
TEST(Foo, Bar) {
  base::small_map<std::map<std::string, Flubber>> foo;
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  auto found = foo.find("asdf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("foo");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("asdfhf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar1");
  printf("Found is %d\n", (int)(found == foo.end()));
}
```