<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//Boost//DTD BoostBook XML V1.1//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<section id="safe_numerics.tutorial">
  <title>Tutorial and Motivating Examples</title>

  <section id="safe_numerics.tutorial.1">
    <title>Arithmetic Expressions Can Yield Incorrect Results</title>

    <para>When an operation on signed integer types produces a result that
    exceeds the capacity of the variable meant to hold it, the behavior is
    undefined. For unsigned integer types, the same situation causes the
    value to wrap according to modulo arithmetic. In either case the result
    differs from that of integer arithmetic in the mathematical sense. This
    is called "overflow". Since word size can differ between machines, code
    which produces mathematically correct results in one set of
    circumstances may fail when recompiled on a machine with different
    hardware. When this occurs, most C++ programs continue to execute with
    no indication that the results are wrong. It is the programmer's
    responsibility to ensure that such overflow is avoided.</para>

    <para>The following program demonstrates the problem. The solution is to
    replace instances of built-in integer types with the corresponding safe
    types.</para>

    <programlisting><xi:include href="../../example/example1.cpp" parse="text"
        xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

    <screen>example 1:undetected erroneous expression evaluation
Not using safe numerics
error NOT detected!
-127 != 127 + 2
Using safe numerics
error detected:converted signed value too large: positive overflow error
Program ended with exit code: 0</screen>
  </section>

  <section id="safe_numerics.tutorial.2">
    <title>Arithmetic Operations Can Overflow Silently</title>

    <para>A variation of the above occurs when a value is incremented or
    decremented beyond its domain.</para>

    <programlisting><xi:include href="../../example/example2.cpp" parse="text"
        xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

    <screen>example 2:undetected overflow in data type
Not using safe numerics
-2147483648 != 2147483647 + 1
error NOT detected!
Using safe numerics
addition result too large
error detected!</screen>

    <para>When variables of unsigned integer type are decremented below zero,
    they "roll over" to the highest possible value of that unsigned integer
    type. This is a common problem which is almost never detected.</para>
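
    <para>The roll-over can be reproduced with built-in types alone. The
    sketch below is not taken from the library:
    <code>checked_decrement</code> is a hypothetical helper, written only to
    show the kind of test that would otherwise have to be added by hand at
    every decrement.</para>

    <programlisting><![CDATA[#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of Safe Numerics): decrement an unsigned
// value, but throw instead of wrapping around below zero.
unsigned int checked_decrement(unsigned int value) {
    if (value == 0)
        throw std::range_error("decrement below zero");
    return value - 1;
}

int main() {
    unsigned int count = 0;
    --count; // well defined, but wraps to the largest unsigned int
    assert(count == std::numeric_limits<unsigned int>::max());

    bool detected = false;
    try {
        checked_decrement(0);
    } catch (const std::range_error &) {
        detected = true;
    }
    assert(detected);
    return 0;
}]]></programlisting>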
  </section>

  <section id="safe_numerics.tutorial.3">
    <title>Arithmetic on Unsigned Integers Can Yield Incorrect Results</title>

    <para>Subtracting two unsigned values of the same size yields an
    unsigned value. If the first operand is less than the second, the result
    will be arithmetically incorrect. But if the size of the unsigned types
    is less than that of an <code>unsigned int</code>, C/C++ will promote the
    types to <code>signed int</code> before subtracting, resulting in a
    correct result. In either case, there is no indication of an error.
    Somehow, the programmer is expected to avoid this behavior. Advice usually
    takes the form of "Don't use unsigned integers for arithmetic". This is
    well and good, but often not practical. C/C++ itself uses an unsigned
    type for <code>sizeof(T)</code>, which users then use in
    arithmetic.</para>

    <para>The following program demonstrates the problem. The solution is to
    replace instances of built-in integer types with the corresponding safe
    types.</para>

    <programlisting><xi:include href="../../example/example8.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

    <screen>example 8:undetected erroneous expression evaluation
Not using safe numerics
error NOT detected!
4294967171 != 2 - 127
Using safe numerics
error detected:subtraction result cannot be negative: negative overflow error
Program ended with exit code: 0</screen>
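
    <para>The effect of integer promotion described in this section can be
    isolated with built-in types alone. The following sketch assumes only
    standard C++11; the function names <code>subtract_wide</code> and
    <code>subtract_narrow</code> are illustrative and do not come from the
    library.</para>

    <programlisting><![CDATA[#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Operands as wide as unsigned int: no promotion, so the result wraps.
unsigned int subtract_wide(unsigned int lhs, unsigned int rhs) {
    static_assert(std::is_same<decltype(lhs - rhs), unsigned int>::value,
                  "the result type stays unsigned int");
    return lhs - rhs;
}

// Operands narrower than int: both are promoted to int before subtracting.
int subtract_narrow(std::uint8_t lhs, std::uint8_t rhs) {
    static_assert(std::is_same<decltype(lhs - rhs), int>::value,
                  "the result type is int");
    return lhs - rhs;
}

int main() {
    // 2 - 127 wraps to 2^N - 125 for an N-bit unsigned int.
    assert(subtract_wide(2, 127) ==
           std::numeric_limits<unsigned int>::max() - 124);
    // The same values held in 8-bit variables give the arithmetic answer.
    assert(subtract_narrow(2, 127) == -125);
    return 0;
}]]></programlisting>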
  </section>

  <section id="safe_numerics.tutorial.4">
    <title>Implicit Conversions Can Lead to Erroneous Results</title>

    <para>At CppCon 2016 Jon Kalb gave a very entertaining (and disturbing)
    <ulink url="https://www.youtube.com/watch?v=wvtFGa6XJDU">lightning
    talk</ulink> related to C++ expressions.</para>

    <para>The talk included a very simple example similar to the
    following:</para>

    <para><programlisting><xi:include href="../../example/example4.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 3: implicit conversions change data values
Not using safe numerics
a is -1 b is 1
b is less than a
error NOT detected!
Using safe numerics
a is -1 b is 1
converted negative value to unsigned: domain error
error detected!
</screen></para>

    <para>A normal person reading the above code has to be dumbfounded by
    this. The code doesn't do what the text - according to the rules of
    algebra - says it does. But C++ doesn't follow the rules of algebra - it
    has its own rules. There is generally no compile time error, although you
    can get a compile time warning if you set some specific compiler
    switches. The explanation lies in how C++ reconciles binary expressions
    (<code>a &lt; b</code> is an expression here) whose operands have
    different types. In processing this expression, the compiler:</para>

    <para><itemizedlist>
        <listitem>
          <para>Determines the "best" common type for the two operands. In
          this case, application of the rules in the C++ standard dictates
          that this type will be an <code>unsigned int</code>.</para>
        </listitem>

        <listitem>
          <para>Converts each operand to this common type. The signed value of
          -1 is converted to an unsigned value with the same bit-wise
          contents, 0xFFFFFFFF, on a machine with 32 bit integers. This
          corresponds to a decimal value of 4294967295.</para>
        </listitem>

        <listitem>
          <para>Performs the calculation - in this case it's
          <code>&lt;</code>, the "less than" operation. Since 1 is less than
          4294967295 the program prints "b is less than a".</para>
        </listitem>
      </itemizedlist></para>
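
    <para>The three steps above can be written out explicitly with built-in
    types. This is only a sketch: <code>builtin_less</code> spells out by
    hand what the compiler does implicitly for <code>a &lt; b</code>, and
    <code>value_less</code> is a hypothetical alternative that compares the
    mathematical values. Neither name comes from the library.</para>

    <programlisting><![CDATA[#include <cassert>
#include <limits>
#include <type_traits>

// The type the compiler picks for a mixed int / unsigned int expression.
using common_t = std::common_type<int, unsigned int>::type;
static_assert(std::is_same<common_t, unsigned int>::value,
              "step 1: the common type is unsigned int");

// Spells out what the built-in a < b does for these operand types.
bool builtin_less(int lhs, unsigned int rhs) {
    return static_cast<common_t>(lhs) < static_cast<common_t>(rhs);
}

// Hypothetical alternative that compares the mathematical values.
bool value_less(int lhs, unsigned int rhs) {
    return lhs < 0 || static_cast<unsigned int>(lhs) < rhs;
}

int main() {
    const int a = -1;
    const unsigned int b = 1;

    // Step 2: -1 becomes the largest unsigned int (4294967295 for 32 bits).
    assert(static_cast<common_t>(a) ==
           std::numeric_limits<unsigned int>::max());

    // Step 3: the comparison is made on the converted values.
    assert(!builtin_less(a, b)); // "a < b" is false
    assert(value_less(a, b));    // mathematically, -1 < 1
    return 0;
}]]></programlisting>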

    <para>In order to detect and understand this error, a programmer must be
    quite familiar with the implicit conversion rules of the C++ standard.
    These are available in a copy of the standard and also in the canonical
    reference book <citetitle><link linkend="stroustrup">The C++
    Programming Language</link></citetitle> (both are over 1200 pages long!).
    Even experienced programmers often won't spot this issue or know to take
    precautions to avoid it. And this is a relatively easy one to spot. In the
    more general case the code will use integers which don't correspond to
    easily recognizable numbers and/or will be buried as part of some more
    complex expression.</para>

    <para>This example generated a good amount of web traffic along with
    everyone's pet suggestions. See for example <ulink
    url="https://bulldozer00.com/2016/10/16/the-unsigned-conundrum/">a blog
    post with everyone's favorite "solution"</ulink>. All the proposed
    "solutions" have disadvantages, and attempts to agree on how to handle
    this are ultimately fruitless in spite of, or maybe because of, the <ulink
    url="https://twitter.com/robertramey1/status/795742870045016065">emotional
    content</ulink>. Our solution is by far the simplest: just use the safe
    numerics library as shown in the example above.</para>

    <para>Note that in this particular case, using the safe types incurs no
    runtime overhead. The generated code will equal or exceed the efficiency
    of code using primitive integer types.</para>
  </section>

  <section id="safe_numerics.tutorial.5">
    <title>Mixing Data Types Can Create Subtle Errors</title>

    <para>C++ contains signed and unsigned integer types. In spite of their
    similar names, they function differently, which often produces surprising
    results for some operands. Program errors from this behavior can be
    exceedingly difficult to find. This has led to recommendations of various
    ad hoc "rules" to avoid these problems. It's not always easy to apply
    these "rules" to existing code without creating even more bugs. Here is a
    typical example of this problem:</para>

    <para><programlisting><xi:include href="../../example/example10.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>Here
    is the output of the above program:<screen>example 4: mixing types produces surprising results
Not using safe numerics
10000
4294957296
error NOT detected!
Using safe numerics
10000
error detected!converted negative value to unsigned: domain error
</screen>The solution is simple: just replace instances of <code>int</code>
    with <code>safe&lt;int&gt;</code>.</para>
  </section>

  <section id="safe_numerics.tutorial.6">
    <title>Array Index Value Can Exceed Array Limits</title>

    <para>With an intrinsic C++ array, it's very easy to exceed the array
    limits. This can go undetected when it occurs and create bugs which are
    hard to find. There are several ways to address this, but one of the
    simplest is to use <code>safe_unsigned_range</code>:</para>

    <para><programlisting><xi:include href="../../example/example5.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 5: array index values can exceed array bounds
Not using safe numerics
error NOT detected!
Using safe numerics
error detected:Value out of range for this safe type: domain error
</screen></para>

    <para>Collections like standard arrays and vectors do array index checking
    in some function calls and not in others, so this may not be the best
    example. However, it does illustrate the usage of
    <code>safe_range&lt;T&gt;</code> for assigning legal ranges to variables.
    This guarantees that under no circumstances will the variable contain
    a value outside of the specified range.</para>
  </section>

  <section id="safe_numerics.tutorial.7">
    <title>Checking of Input Values Can Be Easily Overlooked</title>

    <para>It's all too easy to overlook the checking of parameters received
    from outside the current program.<programlisting><xi:include
          href="../../example/example6.cpp" parse="text"
          xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 6: checking of externally produced value can be overlooked
Not using safe numerics
2147483647 0
error NOT detected!
Using safe numerics
error detected:error in file input: domain error
</screen>Without safe integers, one has to insert new checking code every
    time an integer variable is retrieved. This is a tedious and error-prone
    procedure. Here we have used program input, but in fact this problem can
    occur with any externally produced input.</para>
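
    <para>To make the "new code every time" point concrete, the sketch below
    uses only the standard stream library. <code>read_int</code> is a
    hypothetical helper, not part of the library; it shows the check that
    must otherwise be repeated after every extraction.</para>

    <programlisting><![CDATA[#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>

// Hypothetical helper (not part of Safe Numerics): extract one int and throw
// if the text is not a number or does not fit in an int.
int read_int(std::istream &in) {
    int value = 0;
    if (!(in >> value))
        throw std::domain_error("error in input");
    return value;
}

int main() {
    // Far too large for an int. The stream records the failure in its
    // state, but nothing forces the program to look at it.
    std::istringstream unchecked("99999999999999999999999");
    int x = 0;
    unchecked >> x;
    assert(unchecked.fail());

    std::istringstream checked("99999999999999999999999");
    bool detected = false;
    try {
        read_int(checked);
    } catch (const std::domain_error &) {
        detected = true;
    }
    assert(detected);
    return 0;
}]]></programlisting>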
  </section>

  <section id="safe_numerics.tutorial.8">
    <title>Cannot Recover From Arithmetic Errors</title>

    <para>If a divide by zero error occurs in a program, it's detected by
    hardware. The way this manifests itself to the program depends
    upon:</para>

    <para><itemizedlist>
        <listitem>
          <para>data type - int, float, etc.</para>
        </listitem>

        <listitem>
          <para>setting of compile time command line switches</para>
        </listitem>

        <listitem>
          <para>invocation of some configuration functions which convert these
          hardware events into C++ exceptions</para>
        </listitem>
      </itemizedlist>It's not at all clear how one would detect and recover
    from a divide by zero error in a simple, portable way. Usually, users just
    ignore the issue, which typically results in immediate program termination
    when this situation occurs.</para>

    <para>This library will detect divide by zero errors before the operation
    is invoked. Any errors of this nature are handled according to the <link
    linkend="safe_numeric.exception_policies">exception_policy</link> selected
    by the library user.</para>

    <para><programlisting><xi:include href="../../example/example13.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 7: cannot recover from arithmetic errors
Not using safe numerics
error NOT detectable!
Using safe numerics
error detected:divide by zero: domain error
</screen></para>
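
    <para>The idea of checking before the operation is invoked can be
    sketched by hand. <code>checked_divide</code> below is a hypothetical
    helper and not the library's implementation; it turns the two cases that
    built-in integer division cannot represent into ordinary C++ exceptions
    that a caller can catch and recover from.</para>

    <programlisting><![CDATA[#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of Safe Numerics): test the operands first,
// so the hardware never sees an invalid division.
int checked_divide(int dividend, int divisor) {
    if (divisor == 0)
        throw std::domain_error("divide by zero");
    if (dividend == std::numeric_limits<int>::min() && divisor == -1)
        throw std::overflow_error("quotient too large");
    return dividend / divisor;
}

int main() {
    assert(checked_divide(10, 2) == 5);

    bool recovered = false;
    try {
        checked_divide(1, 0);
    } catch (const std::domain_error &) {
        recovered = true; // the program keeps running
    }
    assert(recovered);
    return 0;
}]]></programlisting>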
  </section>

  <section id="safe_numerics.tutorial.9">
    <title>Compile Time Arithmetic is Not Always Correct</title>

    <para>If a divide by zero error occurs while a program is being compiled,
    there is no guarantee that it will be detected. The following is a real
    example compiled with a recent version of Clang.</para>

    <itemizedlist>
      <listitem>
        <para>Source code includes a constant expression containing a simple
        arithmetic error.</para>
      </listitem>

      <listitem>
        <para>The compiler emits a warning but otherwise calculates the wrong
        result.</para>
      </listitem>

      <listitem>
        <para>Replacing <code>int</code> with <code>safe&lt;int&gt;</code>
        will guarantee that the error is detected at runtime.</para>
      </listitem>

      <listitem>
        <para>Operations using safe types are marked constexpr. So we can
        force the operations to occur at compile time by marking the results
        as constexpr. This will result in an error at compile time if the
        operations cannot be correctly calculated.</para>
      </listitem>
    </itemizedlist>

    <para><programlisting><xi:include href="../../example/example14.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 8: cannot detect compile time arithmetic errors
Not using safe numerics
0error NOT detected!
Using safe numerics
error detected:positive overflow error
Program ended with exit code: 0</screen></para>
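
    <para>The mechanism behind the last point can be illustrated without the
    library. In the sketch below, <code>checked_add</code> is a hypothetical
    constexpr function which throws on overflow. When its result initializes
    a constexpr variable, the compiler must evaluate it, and reaching the
    <code>throw</code> makes the program ill-formed rather than silently
    wrong.</para>

    <programlisting><![CDATA[#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of Safe Numerics): an addition that throws
// on overflow. A throw reached during constant evaluation is a compile error.
constexpr int checked_add(int lhs, int rhs) {
    return ((rhs > 0 && lhs > std::numeric_limits<int>::max() - rhs) ||
            (rhs < 0 && lhs < std::numeric_limits<int>::min() - rhs))
               ? throw std::overflow_error("addition result out of range")
               : lhs + rhs;
}

// Forced to compile time by constexpr: accepted, because nothing overflows.
constexpr int ok = checked_add(1, 2);
static_assert(ok == 3, "evaluated by the compiler");

// Uncommenting the next line is expected to make compilation fail:
// constexpr int bad = checked_add(std::numeric_limits<int>::max(), 1);

int main() {
    // Without constexpr on the result, the same error surfaces at runtime.
    bool detected = false;
    try {
        const int max = std::numeric_limits<int>::max();
        checked_add(max, 1);
    } catch (const std::overflow_error &) {
        detected = true;
    }
    assert(detected);
    return 0;
}]]></programlisting>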
  </section>

  <section id="safe_numerics.tutorial.10">
    <title>Programming by Contract is Too Slow</title>

    <para>Programming by Contract is a highly regarded technique. Much has
    been written about it, and it has been proposed as an addition to the
    C++ language <citation><xref linkend="garcia"/></citation><citation><xref
    linkend="crowl2"/></citation>. It (mostly) depends upon runtime checking
    of parameter and object values upon entry to and exit from every
    function. This can slow the program down considerably, which in turn
    undermines the main motivation for using C++ in the first place! One
    popular scheme for addressing this issue is to enable parameter checking
    only during debugging and testing, which defeats the guarantee of
    correctness which we are seeking here! Programming by Contract will never
    be accepted by programmers as long as it is associated with significant
    additional runtime cost.</para>

    <para>The Safe Numerics Library has facilities which, in many cases, can
    check guaranteed parameter requirements with little or no runtime
    overhead. Consider the following example:</para>

    <para><programlisting><xi:include href="../../example/example7.cpp"
          parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 8:
enforce contracts with zero runtime cost
parameter error detected</screen></para>

    <para>In the example above, the function <code>convert</code> incurs
    significant runtime cost every time it is called. By using "safe" types,
    this cost is moved to the moment when the parameters are constructed.
    Depending on how the program is constructed, this may totally eliminate
    extraneous computations for checking parameter requirements. In this
    scenario, there is no reason to suppress the checking in release mode,
    and our program can be guaranteed to be always arithmetically
    correct.</para>
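
    <para>The idea of moving the check from the function body to the
    construction of its parameters can be sketched with a small hand-written
    type. Everything in the sketch is hypothetical: <code>bounded</code>,
    <code>hours_t</code>, <code>minutes_t</code> and <code>to_minutes</code>
    are illustrative names, not the library's types, and this simple version
    always performs its check at runtime in the constructor.</para>

    <programlisting><![CDATA[#include <cassert>
#include <stdexcept>

// Hypothetical range-checked integer (not the library's type): the range is
// verified once, when the object is constructed.
template <int Min, int Max>
class bounded {
public:
    explicit bounded(int value) : value_(value) {
        if (value < Min || value > Max)
            throw std::domain_error("value out of range");
    }
    int get() const { return value_; }

private:
    int value_;
};

using hours_t = bounded<0, 23>;
using minutes_t = bounded<0, 59>;

// No checks are needed here: the parameter types already carry the contract.
int to_minutes(hours_t hours, minutes_t minutes) {
    return hours.get() * 60 + minutes.get();
}

int main() {
    assert(to_minutes(hours_t(10), minutes_t(30)) == 630);

    bool detected = false;
    try {
        to_minutes(hours_t(10), minutes_t(83)); // rejected at construction
    } catch (const std::domain_error &) {
        detected = true;
    }
    assert(detected);
    return 0;
}]]></programlisting>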
  </section>
</section>