1<?xml version="1.0"?> <!-- -*- sgml -*- --> 2<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" 3 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" 4[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]> 5 6 7<chapter id="pc-manual" 8 xreflabel="Ptrcheck: an experimental heap, stack and global array overrun detector"> 9 <title>Ptrcheck: an experimental heap, stack and global array overrun detector</title> 10 11<para>To use this tool, you must specify 12<option>--tool=exp-ptrcheck</option> on the Valgrind 13command line.</para> 14 15 16 17 18<sect1 id="pc-manual.overview" xreflabel="Overview"> 19<title>Overview</title> 20 21<para>Ptrcheck is a tool for finding overruns of heap, stack 22and global arrays. Its functionality overlaps somewhat with 23Memcheck's, but it is able to catch invalid accesses in a number of 24cases that Memcheck would miss. A detailed comparison against 25Memcheck is presented below.</para> 26 27<para>Ptrcheck is composed of two almost completely independent tools 28that have been glued together. One part, 29in <computeroutput>h_main.[ch]</computeroutput>, checks accesses 30through heap-derived pointers. The other part, in 31<computeroutput>sg_main.[ch]</computeroutput>, checks accesses to 32stack and global arrays. The remaining 33files <computeroutput>pc_{common,main}.[ch]</computeroutput>, provide 34common error-management and coordination functions, so as to make it 35appear as a single tool.</para> 36 37<para>The heap-check part is an extensively-hacked (largely rewritten) 38version of the experimental "Annelid" tool developed and described by 39Nicholas Nethercote and Jeremy Fitzhardinge. The stack- and global- 40check part uses a heuristic approach derived from an observation about 41the likely forms of stack and global array accesses, and, as far as is 42known, is entirely novel.</para> 43 44</sect1> 45 46 47 48 49<sect1 id="pc-manual.options" xreflabel="Ptrcheck Command-line Options"> 50<title>Ptrcheck Command-line Options</title> 51 52<para>Ptrcheck-specific command-line options are:</para> 53 54<!-- start of xi:include in the manpage --> 55<variablelist id="pc.opts.list"> 56 57 <varlistentry id="opt.enable-sg-checks" xreflabel="--enable-sg-checks"> 58 <term> 59 <option><![CDATA[--enable-sg-checks=no|yes 60 [default: yes] ]]></option> 61 </term> 62 <listitem> 63 <para>By default, Ptrcheck checks for overruns of stack, global 64 and heap arrays. 65 With <varname>--enable-sg-checks=no</varname>, the stack and 66 global array checks are omitted, and only heap checking is 67 performed. This can be useful because the stack and global 68 checks are quite expensive, so omitting them speeds Ptrcheck up 69 a lot. 70 </para> 71 </listitem> 72 </varlistentry> 73 74 <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok"> 75 <term> 76 <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option> 77 </term> 78 <listitem> 79 <para>This option has the same meaning as it does for 80 Memcheck.</para> 81 <para>Controls how Ptrcheck handles word-sized, word-aligned 82 loads which partially overlap the end of heap blocks -- that is, 83 some of the bytes in the word are validly addressable, but 84 others are not. When <varname>yes</varname>, such loads do not 85 produce an address error. When <varname>no</varname> (the 86 default), loads from partially invalid addresses are treated the 87 same as loads from completely invalid addresses: an illegal heap 88 access error is issued. 89 </para> 90 <para>Note that code that behaves in this way is in violation of 91 the the ISO C/C++ standards, and should be considered broken. If 92 at all possible, such code should be fixed. This option should be 93 used only as a last resort.</para> 94 </listitem> 95 </varlistentry> 96 97</variablelist> 98<!-- end of xi:include in the manpage --> 99 100</sect1> 101 102 103 104 105<sect1 id="pc-manual.how-works.heap-checks" 106 xreflabel="How Ptrcheck Works: Heap Checks"> 107<title>How Ptrcheck Works: Heap Checks</title> 108 109<para>Ptrcheck can check for invalid uses of heap pointers, including 110out of range accesses and accesses to freed memory. The mechanism is 111however completely different from Memcheck's, and the checking is more 112powerful.</para> 113 114<para>For each pointer in the program, Ptrcheck keeps track of which 115heap block (if any) it was derived from. Then, when an access is made 116through that pointer, Ptrcheck compares the access address with the 117bounds of the associated block, and reports an error if the address is 118out of bounds, or if the block has been freed.</para> 119 120<para>Of course it is rarely the case that one wants to access a block 121only at the exact address returned by <function>malloc</function> et al. 122Ptrcheck understands that adding or subtracting offsets from a pointer to a 123block results in a pointer to the same block.</para> 124 125<para>At a fundamental level, this scheme works because a correct 126program cannot make assumptions about the addresses returned by 127<function>malloc</function> et al. In particular it cannot make any 128assumptions about the differences in addresses returned by subsequent calls 129to <function>malloc</function> et al. Hence there are very few ways to take 130an address returned by <function>malloc</function>, modify it, and still 131have a valid address. In short, the only allowable operations are adding 132and subtracting other non-pointer values. Almost all other operations 133produce a value which cannot possibly be a valid pointer.</para> 134 135</sect1> 136 137 138 139<sect1 id="pc-manual.how-works.sg-checks" 140 xreflabel="How Ptrcheck Works: Stack and Global Checks"> 141<title>How Ptrcheck Works: Stack and Global Checks</title> 142 143<para>When a source file is compiled 144with <option>-g</option>, the compiler attaches DWARF3 145debugging information which describes the location of all stack and 146global arrays in the file.</para> 147 148<para>Checking of accesses to such arrays would then be relatively 149simple, if the compiler could also tell us which array (if any) each 150memory referencing instruction was supposed to access. Unfortunately 151the DWARF3 debugging format does not provide a way to represent such 152information, so we have to resort to a heuristic technique to 153approximate the same information. The key observation is that 154 <emphasis> 155 if a memory referencing instruction accesses inside a stack or 156 global array once, then it is highly likely to always access that 157 same array</emphasis>.</para> 158 159<para>To see how this might be useful, consider the following buggy 160fragment:</para> 161<programlisting><![CDATA[ 162 { int i, a[10]; // both are auto vars 163 for (i = 0; i <= 10; i++) 164 a[i] = 42; 165 } 166]]></programlisting> 167 168<para>At run time we will know the precise address 169of <computeroutput>a[]</computeroutput> on the stack, and so we can 170observe that the first store resulting from <computeroutput>a[i] = 17142</computeroutput> writes <computeroutput>a[]</computeroutput>, and 172we will (correctly) assume that that instruction is intended always to 173access <computeroutput>a[]</computeroutput>. Then, on the 11th 174iteration, it accesses somewhere else, possibly a different local, 175possibly an un-accounted for area of the stack (eg, spill slot), so 176Ptrcheck reports an error.</para> 177 178<para>There is an important caveat.</para> 179 180<para>Imagine a function such as <function>memcpy</function>, which is used 181to read and write many different areas of memory over the lifetime of the 182program. If we insist that the read and write instructions in its memory 183copying loop only ever access one particular stack or global variable, we 184will be flooded with errors resulting from calls to 185<function>memcpy</function>.</para> 186 187<para>To avoid this problem, Ptrcheck instantiates fresh likely-target 188records for each entry to a function, and discards them on exit. This 189allows detection of cases where (e.g.) <function>memcpy</function> overflows 190its source or destination buffers for any specific call, but does not carry 191any restriction from one call to the next. Indeed, multiple threads may be 192multiple simultaneous calls to (e.g.) <function>memcpy</function> without 193mutual interference.</para> 194 195</sect1> 196 197 198 199 200<sect1 id="pc-manual.cmp-w-memcheck" 201 xreflabel="Comparison with Memcheck"> 202<title>Comparison with Memcheck</title> 203 204<para>Memcheck does not do any access checks for stack or global arrays, so 205the presence of those in Ptrcheck is a straight win. (But see 206"Limitations" below).</para> 207 208<para>Memcheck and Ptrcheck use different approaches for checking heap 209accesses. Memcheck maintains bitmaps telling it which areas of memory 210are accessible and which are not. If a memory access falls in an 211unaccessible area, it reports an error. By marking the 16 bytes 212before and after an allocated block unaccessible, Memcheck is able to 213detect small over- and underruns of the block. Similarly, by marking 214freed memory as unaccessible, Memcheck can detect all accesses to 215freed memory.</para> 216 217<para>Memcheck's approach is simple. But it's also weak. It can't 218catch block overruns beyond 16 bytes. And, more generally, because it 219focusses only on the question "is the target address accessible", it 220fails to detect invalid accesses which just happen to fall within some 221other valid area. This is not improbable, especially in crowded areas 222of the process' address space.</para> 223 224<para>Ptrcheck's approach is to keep track of pointers derived from 225heap blocks. It tracks pointers which are derived directly from calls 226to <function>malloc</function> et al, but also ones derived indirectly, by 227adding or subtracting offsets from the directly-derived pointers. When a 228pointer is finally used to access memory, Ptrcheck compares the access 229address with that of the block it was originally derived from, and 230reports an error if the access address is not within the block 231bounds.</para> 232 233<para>Consequently Ptrcheck can detect any out of bounds access 234through a heap-derived pointer, no matter how far from the original 235block it is.</para> 236 237<para>A second advantage is that Ptrcheck is better at detecting 238accesses to blocks freed very far in the past. Memcheck can detect 239these too, but only for blocks freed relatively recently. To detect 240accesses to a freed block, Memcheck must make it inaccessible, hence 241requiring a space overhead proportional to the size of the block. If 242the blocks are large, Memcheck will have to make them available for 243re-allocation relatively quickly, thereby losing the ability to detect 244invalid accesses to them.</para> 245 246<para>By contrast, Ptrcheck has a constant per-block space requirement 247of four machine words, for detection of accesses to freed blocks. A 248freed block can be reallocated immediately, yet Ptrcheck can still 249detect all invalid accesses through any pointers derived from the old 250allocation, providing only that the four-word descriptor for the old 251allocation is stored. For example, on a 64-bit machine, to detect 252accesses in any of the most recently freed 10 million blocks, Ptrcheck 253will require only 320MB of extra storage. Achieving the same level of 254detection with Memcheck is close to impossible and would likely 255involve several gigabytes of extra storage.</para> 256 257<para>Having said all that, remember that Memcheck performs uninitialised 258value checking, invalid and mismatched free checking, overlap checking, and 259leak checking, none of which Ptrcheck do. Memcheck has also benefitted from 260years of refinement, tuning, and experience with production-level usage, and 261so is much faster than Ptrcheck as it currently stands. 262</para> 263 264<para>Consequently we recommend you first make your programs run Memcheck 265clean. Once that's done, try Ptrcheck to see if you can shake out any 266further heap, global or stack errors.</para> 267 268</sect1> 269 270 271 272 273 274<sect1 id="pc-manual.limitations" 275 xreflabel="Limitations"> 276<title>Limitations</title> 277 278<para>This is an experimental tool, which relies rather too heavily on some 279not-as-robust-as-I-would-like assumptions on the behaviour of correct 280programs. There are a number of limitations which you should be aware 281of.</para> 282 283<itemizedlist> 284 285 <listitem> 286 <para>Heap checks: Ptrcheck can occasionally lose track of, or 287 become confused about, which heap block a given pointer has been 288 derived from. This can cause it to falsely report errors, or to 289 miss some errors. This is not believed to be a serious 290 problem.</para> 291 </listitem> 292 293 <listitem> 294 <para>Heap checks: Ptrcheck only tracks pointers that are stored 295 properly aligned in memory. If a pointer is stored at a misaligned 296 address, and then later read again, Ptrcheck will lose track of 297 what it points at. Similar problem if a pointer is split into 298 pieces and later reconsitituted.</para> 299 </listitem> 300 301 <listitem> 302 <para>Heap checks: Ptrcheck needs to "understand" which system 303 calls return pointers and which don't. Many, but not all system 304 calls are handled. If an unhandled one is encountered, Ptrcheck will 305 abort. Fortunately, adding support for a new syscall is very 306 easy.</para> 307 </listitem> 308 309 <listitem> 310 <para>Stack checks: It follows from the description above (<xref 311 linkend="pc-manual.how-works.sg-checks"/>) that the first access by a 312 memory referencing instruction to a stack or global array creates an 313 association between that instruction and the array, which is checked on 314 subsequent accesses by that instruction, until the containing function 315 exits. Hence, the first access by an instruction to an array (in any 316 given function instantiation) is not checked for overrun, since Ptrcheck 317 uses that as the "example" of how subsequent accesses should 318 behave.</para> 319 </listitem> 320 321 <listitem> 322 <para>Stack checks: Similarly, and more serious, it is clearly 323 possible to write legitimate pieces of code which break the basic 324 assumption upon which the stack/global checking rests. For 325 example:</para> 326 327<programlisting><![CDATA[ 328 { int a[10], b[10], *p, i; 329 for (i = 0; i < 10; i++) { 330 p = /* arbitrary condition */ ? &a[i] : &b[i]; 331 *p = 42; 332 } 333 } 334]]></programlisting> 335 336 <para>In this case the store sometimes 337 accesses <computeroutput>a[]</computeroutput> and 338 sometimes <computeroutput>b[]</computeroutput>, but in no cases is 339 the addressed array overrun. Nevertheless the change in target 340 will cause an error to be reported.</para> 341 342 <para>It is hard to see how to get around this problem. The only 343 mitigating factor is that such constructions appear very rare, at 344 least judging from the results using the tool so far. Such a 345 construction appears only once in the Valgrind sources (running 346 Valgrind on Valgrind) and perhaps two or three times for a start 347 and exit of Firefox. The best that can be done is to suppress the 348 errors.</para> 349 </listitem> 350 351 <listitem> 352 <para>Performance: the stack/global checks require reading all of 353 the DWARF3 type and variable information on the executable and its 354 shared objects. This is computationally expensive and makes 355 startup quite slow. You can expect debuginfo reading time to be in 356 the region of a minute for an OpenOffice sized application, on a 357 2.4 GHz Core 2 machine. Reading this information also requires a 358 lot of memory. To make it viable, Ptrcheck goes to considerable 359 trouble to compress the in-memory representation of the DWARF3 360 data, which is why the process of reading it appears slow.</para> 361 </listitem> 362 363 <listitem> 364 <para>Performance: Ptrcheck runs slower than Memcheck. This is 365 partly due to a lack of tuning, but partly due to algorithmic 366 difficulties. The heap-check side is potentially quite fast. The 367 stack and global checks can sometimes require a number of range 368 checks per memory access, and these are difficult to short-circuit 369 (despite considerable efforts having been made). 370 </para> 371 </listitem> 372 373 <listitem> 374 <para>Coverage: the heap checking is relatively robust, requiring 375 only that Ptrcheck can see calls to <function>malloc</function> et al. 376 In that sense it has debug-info requirements comparable with Memcheck, 377 and is able to heap-check programs even with no debugging information 378 attached.</para> 379 380 <para>Stack/global checking is much more fragile. If a shared 381 object does not have debug information attached, then Ptrcheck will 382 not be able to determine the bounds of any stack or global arrays 383 defined within that shared object, and so will not be able to check 384 accesses to them. This is true even when those arrays are accessed 385 from some other shared object which was compiled with debug 386 info.</para> 387 388 <para>At the moment Ptrcheck accepts objects lacking debuginfo 389 without comment. This is dangerous as it causes Ptrcheck to 390 silently skip stack and global checking for such objects. It would 391 be better to print a warning in such circumstances.</para> 392 </listitem> 393 394 <listitem> 395 <para>Coverage: Ptrcheck checks that the areas read or written by 396 system calls do not overrun heap blocks. But it doesn't currently 397 check them for overruns stack and global arrays. This would be 398 easy to add.</para> 399 </listitem> 400 401 <listitem> 402 <para>Platforms: the stack/global checks won't work properly on any 403 PowerPC platforms, only on x86 and amd64 targets. That's because 404 the stack and global checking requires tracking function calls and 405 exits reliably, and there's no obvious way to do it with the PPC 406 ABIs. (In comparison, with the x86 and amd64 ABIs this is relatively 407 straightforward.)</para> 408 </listitem> 409 410 <listitem> 411 <para>Robustness: related to the previous point. Function 412 call/exit tracking for x86/amd64 is believed to work properly even 413 in the presence of longjmps within the same stack (although this 414 has not been tested). However, code which switches stacks is 415 likely to cause breakage/chaos.</para> 416 </listitem> 417</itemizedlist> 418 419</sect1> 420 421 422 423 424 425<sect1 id="pc-manual.todo-user-visible" 426 xreflabel="Still To Do: User-visible Functionality"> 427<title>Still To Do: User-visible Functionality</title> 428 429<itemizedlist> 430 431 <listitem> 432 <para>Extend system call checking to work on stack and global arrays.</para> 433 </listitem> 434 435 <listitem> 436 <para>Print a warning if a shared object does not have debug info 437 attached, or if, for whatever reason, debug info could not be 438 found, or read.</para> 439 </listitem> 440 441</itemizedlist> 442 443</sect1> 444 445 446 447 448<sect1 id="pc-manual.todo-implementation" 449 xreflabel="Still To Do: Implementation Tidying"> 450<title>Still To Do: Implementation Tidying</title> 451 452<para>Items marked CRITICAL are considered important for correctness: 453non-fixage of them is liable to lead to crashes or assertion failures 454in real use.</para> 455 456<itemizedlist> 457 458 <listitem> 459 <para>h_main.c: make N_FREED_SEGS command-line configurable.</para> 460 </listitem> 461 462 <listitem> 463 <para> sg_main.c: Improve the performance of the stack / global 464 checks by doing some up-front filtering to ignore references in 465 areas which "obviously" can't be stack or globals. This will 466 require using information that m_aspacemgr knows about the address 467 space layout.</para> 468 </listitem> 469 470 <listitem> 471 <para>h_main.c: get rid of the last_seg_added hack; add suitable 472 plumbing to the core/tool interface to do this cleanly.</para> 473 </listitem> 474 475 <listitem> 476 <para>h_main.c: move vast amounts of arch-dependent ugliness 477 (get_IntRegInfo et al) to its own source file, a la 478 mc_machine.c.</para> 479 </listitem> 480 481 <listitem> 482 <para>h_main.c: make the lossage-check stuff work again, as a way 483 of doing quality assurance on the implementation.</para> 484 </listitem> 485 486 <listitem> 487 <para>h_main.c: schemeEw_Atom: don't generate a call to 488 nonptr_or_unknown, this is really stupid, since it could be done at 489 translation time instead.</para> 490 </listitem> 491 492 <listitem> 493 <para>CRITICAL: h_main.c: h_instrument (main instrumentation fn): 494 generate shadows for word-sized temps defined in the block's 495 preamble. (Why does this work at all, as it stands?)</para> 496 </listitem> 497 498 <listitem> 499 <para>sg_main.c: fix compute_II_hash to make it a bit more sensible 500 for ppc32/64 targets (except that sg_ doesn't work on ppc32/64 501 targets, so this is a bit academic at the moment).</para> 502 </listitem> 503 504</itemizedlist> 505 506</sect1> 507 508 509 510</chapter> 511