• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[/
2    Copyright (c) 2009-2009 Joachim Faulhaber
3
4    Distributed under the Boost Software License, Version 1.0.
5    (See accompanying file LICENSE_1_0.txt or copy at
6    http://www.boost.org/LICENSE_1_0.txt)
7]
8
9[section Projects]
10
11['*Projects*] are examples on the usage of interval containers
12that go beyond small toy snippets of code. The code presented
13here addresses more serious applications that approach the
14quality of real world programming. At the same time it aims to
15guide the reader more deeply into various aspects of
16the library. In order not to overburden the reader with implementation
17details, the code in ['*projects*] tries to be ['*minimal*]. It has a focus on
18the main aspects of the projects and is not intended to be complete
19and mature like the library code itself. Cause it's minimal,
20project code lives in `namespace mini`.
21
22[/
23[section Overview]
24
25[table Overview over Icl Examples
26    [[level]   [example] [classes] [features]]
27    [[basic]   [[link boost_icl.examples.large_bitset Large Bitset]]
28        [__itv_map__][The most simple application of an interval map:
29                      Counting the overlaps of added intervals.]]
30]
31
32[endsect][/ Overview IN Projects]
33]
34
35[section Large Bitset][/ IN Projects]
36
37Bitsets are just sets. Sets of unsigned integrals,
38to be more precise.
39The prefix ['*bit*] usually only indicates,
40that the representation of those sets is organized in a compressed
41form that exploits the fact, that we can switch on an off single
42bits in machine words. Bitsets are therefore known to be very small
43and thus efficient.
44The efficiency of bitsets is usually coupled to the
45precondition that the range of values of elements
46is relatively small, like [0..32) or [0..64), values that can
47be typically represented in single or a small number of
48machine words. If we wanted to represent a set containing
49two values {1, 1000000}, we would be much better off using
50other sets like e.g. an `std::set`.
51
52Bitsets compress well, if elements spread over narrow ranges
53only. Interval sets compress well, if many elements are clustered
54over intervals. They can span large sets very efficiently then.
55In project ['*Large Bitset*] we want to ['*combine the bit compression
56and the interval compression*] to achieve a set implementation,
57that is capable of spanning large chunks of contiguous elements
58using intervals and also to represent more narrow ['nests] of varying
59bit sequences using bitset compression.
60As we will see, this can be achieved using only a small
61amount of code because most of the properties we need
62are provided by an __itv_map__ of `bitsets`:
63
64``
65typedef interval_map<IntegralT, SomeBitSet<N>, partial_absorber,
66                     std::less, inplace_bit_add, inplace_bit_and> IntervalBitmap;
67``
68
69Such an `IntervalBitmap` represents `k*N` bits for every segment.
70``
71[a, a+k)->'1111....1111' // N bits associated: Represents a total of k*N bits.
72``
73
74For the interval `[a, a+k)` above all bits are set.
75But we can also have individual ['nests] or ['clusters]
76of bitsequences.
77
78``
79[b,  b+1)->'01001011...1'
80[b+1,b+2)->'11010001...0'
81. . .
82``
83
84and we can span intervals of equal bit sequences that represent
85periodic patterns.
86
87``
88[c,d)->'010101....01'  // Every second bit is set              in range [c,d)
89[d,e)->'001100..0011'  // Every two bits alterate              in range [d,e)
90[e,f)->'bit-sequence'  // 'bit-sequence' reoccurs every N bits in range [e,f)
91``
92
93An `IntervalBitmap` can represent
94`N*(2^M)` elements, if `M` is the number of bits of the
95integral type `IntegralT`. Unlike bitsets, that usually
96represent ['*unsigned*] integral numbers, large_bitset may
97range over negative numbers as well.
98There are fields where such large
99bitsets implementations are needed. E.g. for the compact
100representation of large file allocation tables.
101What remains to be done for project *Large Bitset* is
102to code a wrapper `class large_bitset` around `IntervalBitmap`
103so that `large_bitset` looks and feels like a usual
104set class.
105
106[section Using large_bitset]
107
108To quicken your appetite for a look at the implementation
109here are a few use cases first. Within the examples that follow,
110we will use `nat`[^['k]] for unsigned integrals
111and `bits`[^['k]] for bitsets containing [^['k]] bits.
112
113Let's start large.
114In the first example . . .
115
116[import ../example/large_bitset_/large_bitset.cpp]
117[large_bitset_test_large_set_all]
118
119. . . we are testing the limits. First we set all bits and
120then we switch off the very last bit.
121
122[large_bitset_test_large_erase_last]
123
124Program output (/a little beautified/):
125``
126----- Test function test_large() -----------------------------------------------
127We have just turned on the awesome amount of 18,446,744,073,709,551,616 bits ;-)
128[                 0, 288230376151711744) -> 1111111111111111111111111111111111111111111111111111111111111111
129---- Let's swich off the very last bit -----------------------------------------
130[                 0, 288230376151711743) -> 1111111111111111111111111111111111111111111111111111111111111111
131[288230376151711743, 288230376151711744) -> 1111111111111111111111111111111111111111111111111111111111111110
132---- Venti is plenty ... let's do something small: A tall ----------------------
133``
134
135More readable is a smaller version of `large_bitset`.
136In function `test_small()` we apply a few more operations . . .
137
138[large_bitset_test_small]
139
140. . . producing this output:
141``
142----- Test function test_small() -----------
143-- Switch on all bits in range [0,64] ------
144[0,8)->11111111
145[8,9)->10000000
146--------------------------------------------
147-- Turn off bits: 25,27,28 -----------------
148[0,3)->11111111
149[3,4)->10100111
150[4,8)->11111111
151[8,9)->10000000
152--------------------------------------------
153-- Flip bits in range [24,30) --------------
154[0,3)->11111111
155[3,4)->01011011
156[4,8)->11111111
157[8,9)->10000000
158--------------------------------------------
159-- Remove the first 10 bits ----------------
160[1,2)->00111111
161[2,3)->11111111
162[3,4)->01011011
163[4,8)->11111111
164[8,9)->10000000
165-- Remove even bits in range [0,72) --------
166[1,2)->00010101
167[2,3)->01010101
168[3,4)->01010001
169[4,8)->01010101
170--    Set odd  bits in range [0,72) --------
171[0,9)->01010101
172--------------------------------------------
173``
174
175Finally, we present a little /picturesque/ example,
176that demonstrates that `large_bitset` can also serve
177as a self compressing bitmap, that we can 'paint' with.
178
179[large_bitset_test_picturesque]
180
181Note that we have two `large_bitsets` for the /outline/
182and the /interior/. Both parts are compressed but we can
183compose both by `operator +`, because the right /positions/
184are provided. This is the program output:
185
186``
187----- Test function test_picturesque() -----
188-------- empty face:       3 intervals -----
189********
190*      *
191*      *
192*      *
193*      *
194 ******
195-------- compressed smile: 2 intervals -----
196  *  *
197  ****
198-------- staring bitset:   6 intervals -----
199********
200*      *
201* *  * *
202*      *
203* **** *
204 ******
205--------------------------------------------
206``
207
208So, may be you are curious how this class template
209is coded on top of __itv_map__ using only about 250 lines
210of code. This is shown in the sections that follow.
211
212[endsect][/ Usage of a large_bitset]
213
214[section The interval_bitmap]
215
216To begin, let's look at the basic data type again, that
217will be providing the major functionality:
218
219``
220typedef interval_map<DomainT, BitSetT, partial_absorber,
221                     std::less, inplace_bit_add, inplace_bit_and> IntervalBitmap;
222``
223
224`DomainT` is supposed to be an integral
225type, the bitset type `BitSetT` will be a wrapper class around an
226unsigned integral type. `BitSetT` has to implement bitwise operators
227that will be called by the functors `inplace_bit_add<BitSetT>`
228and `inplace_bit_and<BitSetT>`.
229The type trait of interval_map is `partial_absorber`, which means
230that it is /partial/ and that empty `BitSetTs` are not stored in
231the map. This is desired and keeps the `interval_map` minimal, storing
232only bitsets, that contain at least one bit switched on.
233Functor template `inplace_bit_add` for parameter `Combine` indicates that we
234do not expect `operator +=` as addition but the bitwise operator
235`|=`. For template parameter `Section` which is instaniated by
236`inplace_bit_and` we expect the bitwise `&=` operator.
237
238[endsect][/ section The interval_bitmap]
239
240[section A class implementation for the bitset type]
241
242The code of the project is enclosed in a `namespace mini`.
243The name indicates, that the implementation is a /minimal/
244example implementation. The name of the bitset class will
245be `bits` or `mini::bits` if qualified.
246
247To be used as a codomain parameter of class template __itv_map__,
248`mini::bits` has to implement all the functions that are required
249for a codomain_type in general, which are the default constructor `bits()`
250and an equality `operator==`. Moreover `mini::bits` has to implement operators
251required by the instantiations for parameter `Combine` and `Section`
252which are `inplace_bit_add` and `inplace_bit_and`. From functors
253`inplace_bit_add` and `inplace_bit_and` there are inverse functors
254`inplace_bit_subtract` and `inplace_bit_xor`. Those functors
255use operators `|= &= ^=` and `~`. Finally if we want to apply
256lexicographical and subset comparison on large_bitset, we also
257need an `operator <`. All the operators that we need can be implemented
258for `mini::bits` on a few lines:
259
260[import ../example/large_bitset_/bits.hpp]
261[mini_bits_class_bits]
262
263Finally there is one important piece of meta information, we have
264to provide: `mini::bits` has to be recognized as a `Set` by the
265icl code. Otherwise we can not exploit the fact that a map of
266sets is model of `Set` and the resulting large_bitset would not
267behave like a set. So we have to say that `mini::bits` shall be sets:
268
269[mini_bits_is_set]
270
271This is done by adding a partial template specialization to
272the type trait template `icl::is_set`.
273For the extension of this type trait template and the result
274values of inclusion_compare we need these `#includes` for the
275implementation of `mini::bits`:
276
277[mini_bits_includes]
278
279[endsect][/ A class implementation for the bitset type]
280
281[section Implementation of a large bitset]
282
283Having finished our `mini::bits` implementation, we can start to
284code the wrapper class that hides the efficient interval map of mini::bits
285and exposes a simple and convenient set behavior to the world of users.
286
287Let's start with the required `#include`s this time:
288
289[import ../example/large_bitset_/large_bitset.hpp]
290[large_bitset_includes]
291
292Besides `boost/icl/interval_map.hpp` and `bits.hpp` the most important
293include here is `boost/operators.hpp`. We use this library in order
294to further minimize the code and to provide pretty extensive operator
295functionality using very little code.
296
297For a short and concise naming of the most important unsigned integer
298types and the corresponding `mini::bits` we define this:
299
300[large_bitset_natural_typedefs]
301
302[section large_bitset: Public interface][/ ------------------------------------]
303
304And now let's code `large_bitset`.
305
306[large_bitset_class_template_header]
307
308The first template parameter `DomainT` will be instantiated with
309an integral type that defines the kind of numbers that can
310be elements of the set. Since we want to go for a large set we
311use `nat64` as default which is a 64 bit unsigned integer ranging
312from `0` to `2^64-1`. As bitset parameter we also choose a 64-bit
313default. Parameters `Combine` and `Interval` are necessary to
314be passed to dependent type expressions. An allocator can be
315chosen, if desired.
316
317The nested list of private inheritance contains groups of template
318instantiations from
319[@http://www.boost.org/doc/libs/1_39_0/libs/utility/operators.htm Boost.Operator],
320that provides derivable operators from more fundamental once.
321Implementing the fundamental operators, we get the derivable ones
322/for free/. Below is a short overview of what we get using Boost.Operator,
323where __S stands for `large_bitset`, __i for it's `interval_type`
324and __e for it's `domain_type` or `element_type`.
325
326[table
327[[Group                    ][fundamental]      [derivable     ]]
328[[Equality, ordering       ][`==`]             [`!=`          ]]
329[[                         ][`<` ]             [`> <= >=`     ]]
330[[Set operators (__S x __S)][`+= |= -= &= ^=` ][`+ | - & ^`   ]]
331[[Set operators (__S x __e)][`+= |= -= &= ^=` ][`+ | - & ^`   ]]
332[[Set operators (__S x __i)][`+= |= -= &= ^=` ][`+ | - & ^`   ]]
333]
334
335There is a number of associated types
336
337[large_bitset_associated_types]
338
339most importantly the implementing `interval_bitmap_type` that is used
340for the implementing container.
341
342[large_bitmap_impl_map]
343
344In order to use
345[@http://www.boost.org/doc/libs/1_39_0/libs/utility/operators.htm Boost.Operator]
346we have to implement the fundamental operators as class members. This can be
347done quite schematically.
348
349[large_bitset_operators]
350
351As we can see, the seven most important operators that work on the
352class type `large_bitset` can be directly implemented by propagating
353the operation to the implementing `_map`
354of type `interval_bitmap_type`. For the operators that work on segment and
355element types, we use member functions `add`, `subtract`, `intersect` and
356`flip`. As we will see only a small amount of adaper code is needed
357to couple those functions with the functionality of the implementing
358container.
359
360Member functions `add`, `subtract`, `intersect` and `flip`,
361that allow to combine ['*intervals*] to `large_bitsets` can
362be uniformly implemented using a private function
363`segment_apply` that applies /addition/, /subtraction/,
364/intersection/ or /symmetric difference/, after having
365translated the interval's borders into the right bitset
366positions.
367
368[large_bitset_fundamental_functions]
369
370In the sample programs, that we will present to demonstrate
371the capabilities of `large_bitset` we will need a few additional
372functions specifically output functions in two different
373flavors.
374
375[large_bitset_demo_functions]
376
377* The first one, `show_segments()` shows the container
378content as it is implemented, in the compressed form.
379
380* The second function `show_matrix` shows the complete
381matrix of bits that is represented by the container.
382
383[endsect][/ large_bitset: Public interface]
384[section large_bitset: Private implementation]  [/ -------------------------------------]
385
386In order to implement operations like the addition of
387an element say `42` to
388the large bitset, we need to translate the /value/ to the
389/position/ of the associated *bit* representing `42` in the
390interval container of bitsets. As an example, suppose we
391use a
392``
393large_bitset<nat, mini::bits8> lbs;
394``
395that carries small bitsets of 8 bits only.
396The first four interval of `lbs` are assumed to
397be associated with some bitsets. Now we want to add the interval
398`[a,b]==[5,27]`. This will result in the following situation:
399``
400   [0,1)->   [1,2)->   [2,3)->   [3,4)->
401   [00101100][11001011][11101001][11100000]
402+       [111  11111111  11111111  1111]      [5,27] as bitset
403         a                           b
404
405=> [0,1)->   [1,3)->   [3,4)->
406   [00101111][11111111][11110000]
407``
408So we have to convert values 5 and 27 into a part that
409points to the interval and a part that refers to the
410position within the interval, which is done by a
411/division/ and a /modulo/ computation. (In order to have a
412consistent representation of the bitsequences across the containers,
413within this project, bitsets are denoted with the
414['*least significant bit on the left!*])
415
416``
417A = a/8 =  5/8 = 0 // refers to interval
418B = b/8 = 27/8 = 3
419R = a%8 =  5%8 = 5 // refers to the position in the associated bitset.
420S = b%8 = 27%8 = 3
421``
422
423All /division/ and /modulo operations/ needed here are always done
424using a divisor `d` that is a power of `2`: `d = 2^x`. Therefore
425division and modulo can be expressed by bitset operations.
426The constants needed for those bitset computations are defined here:
427
428[large_bitset_impl_constants]
429
430Looking at the example again, we can see that we have to identify the
431positions of the beginning and ending of the interval [5,27] that is
432to insert, and then ['*subdivide*] that range of bitsets into three partitions.
433
434# The bitset where the interval starts.
435# the bitset where the interval ends
436# The bitsets that are completely overlapped by the interval
437
438``
439combine interval [5,27] to large_bitset lbs w.r.t. some operation o
440
441   [0,1)->   [1,2)->   [2,3)->   [3,4)->
442   [00101100][11001011][11101001][11100000]
443o       [111  11111111  11111111  1111]
444         a                           b
445subdivide:
446   [first!  ][mid_1] . . .[mid_n][   !last]
447   [00000111][1...1] . . .[1...1][11110000]
448``
449
450After subdividing, we perform the operation `o` as follows:
451
452# For the first bitset: Set all bits from ther starting bit (`!`)
453  to the end of the bitset to `1`. All other bits are `0`. Then
454  perform operation `o`: `_map o= ([0,1)->00000111)`
455
456# For the last bitset: Set all bits from the beginning of the bitset
457  to the ending bit (`!`) to `1`. All other bits are `0`. Then
458  perform operation `o`: `_map o= ([3,4)->11110000)`
459
460# For the range of bitsets in between the staring and ending one,
461  perform operation `o`: `_map o= ([1,3)->11111111)`
462
463The algorithm, that has been outlined and illustrated by the
464example, is implemented by the private member function
465`segment_apply`. To make the combiner operation a variable
466in this algorithm, we use a /pointer to member function type/
467
468[large_bitset_segment_combiner]
469
470as first function argument. We will pass member functions `combine_` here,
471``
472combine_(first_of_interval, end_of_interval, some_bitset);
473``
474that take the beginning and ending of an interval and a bitset and combine
475them to the implementing `interval_bitmap_type _map`. Here are these functions:
476
477[large_bitmap_combiners]
478
479Finally we can code function `segment_apply`, that does the partitioning
480and subsequent combining:
481
482[large_bitset_segment_apply]
483
484The functions that help filling bitsets to and from a given bit are
485implemented here:
486[large_bitset_bitset_filler]
487
488This completes the implementation of class template `large_bitset`.
489Using only a small amount of mostly schematic code,
490we have been able to provide a pretty powerful, self compressing
491and generally usable set type for all integral domain types.
492
493[endsect][/ large_bitset: Private implementation]
494
495[endsect][/ Implementation of a large bitset]
496
497[endsect][/ Large Bitsets IN Projects]
498
499
500[endsect][/ Projects]
501
502
503
504