<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//Boost//DTD BoostBook XML V1.1//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<section id="safe_numerics.tutorial">
  <title>Tutorial and Motivating Examples</title>

  <section id="safe_numerics.tutorial.1">
    <title>Arithmetic Expressions Can Yield Incorrect Results</title>

    <para>When an operation on signed integer types produces a result which
    exceeds the capacity of the variable that holds it, the behavior is
    undefined. For unsigned integer types, the same situation causes the
    value to wrap as per modulo arithmetic. In either case the result differs
    from that of integer arithmetic in the mathematical sense. This is called
    "overflow". Since word size can differ between machines, code which
    produces mathematically correct results in one set of circumstances may
    fail when re-compiled on a machine with different hardware. When this
    occurs, most C++ programs will continue to execute with no indication
    that the results are wrong. It is the programmer's responsibility to
    ensure that such overflow is avoided.</para>

    <para>The following program demonstrates this problem. The solution is to
    replace instances of built-in integer types with corresponding safe
    types.</para>

    <programlisting><xi:include href="../../example/example1.cpp" parse="text"
    xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

    <screen>example 1:undetected erroneous expression evaluation
Not using safe numerics
error NOT detected!
-127 != 127 + 2
Using safe numerics
error detected:converted signed value too large: positive overflow error
Program ended with exit code: 0</screen>
  </section>

  <section id="safe_numerics.tutorial.2">
    <title>Arithmetic Operations Can Overflow Silently</title>

    <para>A variation of the above occurs when a value is incremented or
    decremented beyond its domain.</para>

    <programlisting><xi:include href="../../example/example2.cpp" parse="text"
    xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

    <screen>example 2:undetected overflow in data type
Not using safe numerics
-2147483648 != 2147483647 + 1
error NOT detected!
Using safe numerics
addition result too large
error detected!</screen>

    <para>When variables of unsigned integer type are decremented below zero,
    they "roll over" to the highest possible value of that unsigned integer
    type. This is a common problem which is rarely detected.</para>
  </section>

  <section id="safe_numerics.tutorial.3">
    <title>Arithmetic on Unsigned Integers Can Yield Incorrect Results</title>

    <para>Subtracting two unsigned values of the same size yields an unsigned
    value. If the first operand is less than the second, the result will be
    arithmetically incorrect. But if the size of the unsigned types is less
    than that of an <code>unsigned int</code>, C/C++ will promote the
    operands to <code>signed int</code> before subtracting, resulting in a
    correct result. In either case, there is no indication of an error.
    Somehow, the programmer is expected to avoid this behavior. Advice usually
    takes the form of "Don't use unsigned integers for arithmetic". This is
    well and good, but often not practical. C/C++ itself uses an unsigned
    type for <code>sizeof(T)</code>, which users then use in
    arithmetic.</para>

    <para>This program demonstrates this problem.
    The solution is to replace instances of built-in integer types with
    corresponding safe types.</para>

    <programlisting><xi:include href="../../example/example8.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

    <screen>example 8:undetected erroneous expression evaluation
Not using safe numerics
error NOT detected!
4294967171 != 2 - 127
Using safe numerics
error detected:subtraction result cannot be negative: negative overflow error
Program ended with exit code: 0</screen>
  </section>

  <section id="safe_numerics.tutorial.4">
    <title>Implicit Conversions Can Lead to Erroneous Results</title>

    <para>At CppCon 2016 Jon Kalb gave a very entertaining (and disturbing)
    <ulink url="https://www.youtube.com/watch?v=wvtFGa6XJDU">lightning
    talk</ulink> related to C++ expressions.</para>

    <para>The talk included a very, very simple example similar to the
    following:</para>

    <para><programlisting><xi:include href="../../example/example4.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 3: implicit conversions change data values
Not using safe numerics
a is -1 b is 1
b is less than a
error NOT detected!
Using safe numerics
a is -1 b is 1
converted negative value to unsigned: domain error
error detected!
</screen></para>

    <para>Most people who read the above code will be dumbfounded by this
    result. The code doesn't do what the text, read according to the rules of
    algebra, says it does. But C++ doesn't follow the rules of algebra; it
    has its own rules. There is generally no compile time error, although you
    can get a compile time warning if you set some specific compiler
    switches. The explanation lies in how C++ reconciles binary expressions
    (<code>a &lt; b</code> is an expression here) whose operands are of
    different types.
    In processing this expression, the compiler:</para>

    <para><itemizedlist>
        <listitem>
          <para>Determines the "best" common type for the two operands. In
          this case, the rules in the C++ standard dictate that this type
          will be an <code>unsigned int</code>.</para>
        </listitem>

        <listitem>
          <para>Converts each operand to this common type. The signed value
          of -1 is converted to an unsigned value with the same bit-wise
          contents, 0xFFFFFFFF, on a machine with 32 bit integers. This
          corresponds to a decimal value of 4294967295.</para>
        </listitem>

        <listitem>
          <para>Performs the calculation. In this case it's
          <code>&lt;</code>, the "less than" operation. Since 1 is less than
          4294967295 the program prints "b is less than a".</para>
        </listitem>
      </itemizedlist></para>

    <para>In order to detect and understand this error, a programmer must be
    quite familiar with the implicit conversion rules of the C++ standard.
    These are available in a copy of the standard and also in the canonical
    reference book <citetitle><link linkend="stroustrup">The C++
    Programming Language</link></citetitle> (both are over 1200 pages long!).
    Even experienced programmers may not spot this issue or know to take
    precautions to avoid it. And this is a relatively easy one to spot. In
    the more general case the expression will use integers which don't
    correspond to easily recognizable numbers and/or will be buried as a part
    of some more complex expression.</para>

    <para>This example generated a good amount of web traffic along with
    everyone's pet suggestions. See for example <ulink
    url="https://bulldozer00.com/2016/10/16/the-unsigned-conundrum/">a blog
    post with everyone's favorite "solution"</ulink>.
    All the proposed "solutions" have disadvantages, and attempts to agree on
    how to handle this are ultimately fruitless in spite of, or maybe because
    of, the <ulink
    url="https://twitter.com/robertramey1/status/795742870045016065">emotional
    content</ulink>. Our solution is by far the simplest: just use the safe
    numerics library as shown in the example above.</para>

    <para>Note that in this particular case, using the safe types incurs no
    runtime overhead. The generated code will equal or exceed the efficiency
    of code using primitive integer types.</para>
  </section>

  <section id="safe_numerics.tutorial.5">
    <title>Mixing Data Types Can Create Subtle Errors</title>

    <para>C++ contains signed and unsigned integer types. In spite of their
    similar names, they behave differently, which often produces surprising
    results for some operands. Program errors from this behavior can be
    exceedingly difficult to find. This has led to recommendations of various
    ad hoc "rules" to avoid these problems. It's not always easy to apply
    these "rules" to existing code without creating even more bugs. Here is a
    typical example of this problem:</para>

    <para><programlisting><xi:include href="../../example/example10.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>Here
    is the output of the above program:<screen>example 4: mixing types produces surprising results
Not using safe numerics
10000
4294957296
error NOT detected!
Using safe numerics
10000
error detected!converted negative value to unsigned: domain error
</screen>The solution is simple: just replace instances of <code>int</code>
    with <code>safe&lt;int&gt;</code>.</para>
  </section>

  <section id="safe_numerics.tutorial.6">
    <title>Array Index Value Can Exceed Array Limits</title>

    <para>Using an intrinsic C++ array, it's very easy to exceed array limits.
    This can go undetected when it occurs and create bugs which are hard to
    find. There are several ways to address this, but one of the simplest is
    to use <code>safe_unsigned_range</code>:</para>

    <para><programlisting><xi:include href="../../example/example5.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 5: array index values can exceed array bounds
Not using safe numerics
error NOT detected!
Using safe numerics
error detected:Value out of range for this safe type: domain error
</screen></para>

    <para>Collections like standard arrays and vectors check array indices in
    some function calls and not in others, so this may not be the best
    example. However, it does illustrate the usage of
    <code>safe_range&lt;T&gt;</code> for assigning legal ranges to variables.
    This guarantees that under no circumstances will the variable contain a
    value outside of the specified range.</para>
  </section>

  <section id="safe_numerics.tutorial.7">
    <title>Checking of Input Values Can Be Easily Overlooked</title>

    <para>It's way too easy to overlook the checking of parameters received
    from outside the current program.<programlisting><xi:include
    href="../../example/example6.cpp" parse="text"
    xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 6: checking of externally produced value can be overlooked
Not using safe numerics
2147483647 0
error NOT detected!
Using safe numerics
error detected:error in file input: domain error
</screen>Without safe integers, one has to insert new checking code every
    time an integer variable is retrieved. This is a tedious and error-prone
    procedure. Here we have used program input, but in fact this problem can
    occur with any externally produced input.</para>
  </section>

  <section id="safe_numerics.tutorial.8">
    <title>Cannot Recover From Arithmetic Errors</title>

    <para>If a divide by zero error occurs in a program, it's detected by
    hardware. The way this manifests itself to the program can and will depend
    upon</para>

    <para><itemizedlist>
        <listitem>
          <para>data type - int, float, etc</para>
        </listitem>

        <listitem>
          <para>setting of compile time command line switches</para>
        </listitem>

        <listitem>
          <para>invocation of some configuration functions which convert these
          hardware events into C++ exceptions</para>
        </listitem>
      </itemizedlist>It's not all that clear how one would detect and recover
    from a divide by zero error in a simple, portable way. Users often just
    ignore the issue, which usually results in immediate program termination
    when this situation occurs.</para>

    <para>This library will detect divide by zero errors before the operation
    is invoked. Any errors of this nature are handled according to the <link
    linkend="safe_numeric.exception_policies">exception_policy</link> selected
    by the library user.</para>

    <para><programlisting><xi:include href="../../example/example13.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 7: cannot recover from arithmetic errors
Not using safe numerics
error NOT detectable!
Using safe numerics
error detected:divide by zero: domain error
</screen></para>
  </section>

  <section id="safe_numerics.tutorial.9">
    <title>Compile Time Arithmetic is Not Always Correct</title>

    <para>If a divide by zero error occurs while a program is being compiled,
    there is no guarantee that it will be detected. This section shows a real
    example compiled with a recent version of Clang.</para>

    <itemizedlist>
      <listitem>
        <para>Source code includes a constant expression containing a simple
        arithmetic error.</para>
      </listitem>

      <listitem>
        <para>The compiler emits a warning but otherwise calculates the wrong
        result.</para>
      </listitem>

      <listitem>
        <para>Replacing <code>int</code> with <code>safe&lt;int&gt;</code>
        will guarantee that the error is detected at runtime.</para>
      </listitem>

      <listitem>
        <para>Operations using safe types are marked constexpr. So we can
        force the operations to occur at compile time by marking the results
        as constexpr. This will result in an error at compile time if the
        operations cannot be correctly calculated.</para>
      </listitem>
    </itemizedlist>

    <para><programlisting><xi:include href="../../example/example14.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 8: cannot detect compile time arithmetic errors
Not using safe numerics
0error NOT detected!
Using safe numerics
error detected:positive overflow error
Program ended with exit code: 0</screen></para>
  </section>

  <section id="safe_numerics.tutorial.10">
    <title>Programming by Contract is Too Slow</title>

    <para>Programming by Contract is a highly regarded technique.
    There has been much written about it and it has been proposed as an
    addition to the C++ language <citation><xref
    linkend="garcia"/></citation><citation><xref
    linkend="crowl2"/></citation>. It (mostly) depends upon runtime checking
    of parameter and object values upon entry to and exit from every function.
    This can slow the program down considerably, which in turn undermines the
    main motivation for using C++ in the first place! One popular scheme for
    addressing this issue is to enable parameter checking only during
    debugging and testing, which defeats the guarantee of correctness which we
    are seeking here! Programming by Contract will never be accepted by
    programmers as long as it is associated with significant additional
    runtime cost.</para>

    <para>The Safe Numerics Library has facilities which, in many cases, can
    check guaranteed parameter requirements with little or no runtime
    overhead. Consider the following example:</para>

    <para><programlisting><xi:include href="../../example/example7.cpp"
    parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 8:
enforce contracts with zero runtime cost
parameter error detected</screen></para>

    <para>In the example above, the function <code>convert</code> incurs
    significant runtime cost every time the function is called. By using
    "safe" types, this cost is moved to the moment when the parameters are
    constructed. Depending on how the program is constructed, this may totally
    eliminate extraneous computations for checking parameter requirements. In
    this scenario, there is no reason to suppress the checking in release
    mode, and our program can be guaranteed to be always arithmetically
    correct.</para>
  </section>
</section>