<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">


<chapter id="mc-manual" xreflabel="Memcheck: a memory error detector">
<title>Memcheck: a memory error detector</title>

<para>To use this tool, you may specify <option>--tool=memcheck</option>
on the Valgrind command line. You don't have to, though, since Memcheck
is the default tool.</para>


<sect1 id="mc-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>Memcheck is a memory error detector. It can detect the following
problems that are common in C and C++ programs.</para>

<itemizedlist>
  <listitem>
    <para>Accessing memory you shouldn't, e.g. overrunning and underrunning
    heap blocks, overrunning the top of the stack, and accessing memory after
    it has been freed.</para>
  </listitem>

  <listitem>
    <para>Using undefined values, i.e. values that have not been initialised,
    or that have been derived from other undefined values.</para>
  </listitem>

  <listitem>
    <para>Incorrect freeing of heap memory, such as double-freeing heap
    blocks, or mismatched use of
    <function>malloc</function>/<computeroutput>new</computeroutput>/<computeroutput>new[]</computeroutput>
    versus
    <function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>.</para>
  </listitem>

  <listitem>
    <para>Overlapping <computeroutput>src</computeroutput> and
    <computeroutput>dst</computeroutput> pointers in
    <computeroutput>memcpy</computeroutput> and related
    functions.</para>
  </listitem>

  <listitem>
    <para>Memory leaks.</para>
  </listitem>
</itemizedlist>

<para>Problems like these can be difficult to find by other means,
often remaining undetected for long periods, then causing occasional,
difficult-to-diagnose crashes.</para>

</sect1>



<sect1
id="mc-manual.errormsgs"
       xreflabel="Explanation of error messages from Memcheck">
<title>Explanation of error messages from Memcheck</title>

<para>Memcheck issues a range of error messages. This section presents a
quick summary of what error messages mean. The precise behaviour of the
error-checking machinery is described in <xref
linkend="mc-manual.machine"/>.</para>


<sect2 id="mc-manual.badrw"
       xreflabel="Illegal read / Illegal write errors">
<title>Illegal read / Illegal write errors</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid read of size 4
   at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
   by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
 Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
]]></programlisting>

<para>This happens when your program reads or writes memory at a place
which Memcheck reckons it shouldn't. In this example, the program did a
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
library libpng.so.2.1.0.9, which was called from somewhere else in the
same library, called from line 326 of <filename>qpngio.cpp</filename>,
and so on.</para>

<para>Memcheck tries to establish what the illegal address might relate
to, since that's often useful. So, if it points into a block of memory
which has already been freed, you'll be informed of this, and also where
the block was freed. Likewise, if it should turn out to be just off
the end of a heap block, a common result of off-by-one errors in
array subscripting, you'll be informed of this fact, and also where the
block was allocated.
If you use the <option><xref
linkend="opt.read-var-info"/></option> option Memcheck will run more slowly
but may give a more detailed description of any illegal address.</para>

<para>In this example, Memcheck can't identify the address. Actually
the address is on the stack, but, for some reason, this is not a valid
stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid
code, a known bug in some ancient versions of GCC.</para>

<para>Note that Memcheck only tells you that your program is about to
access memory at an illegal address. It can't stop the access from
happening. So, if your program makes an access which normally would
result in a segmentation fault, your program will still suffer the same
fate -- but you will get a message from Memcheck immediately prior to
this. In this particular example, reading junk on the stack is
non-fatal, and the program stays alive.</para>

</sect2>



<sect2 id="mc-manual.uninitvals"
       xreflabel="Use of uninitialised values">
<title>Use of uninitialised values</title>

<para>For example:</para>
<programlisting><![CDATA[
Conditional jump or move depends on uninitialised value(s)
   at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
   by 0x402E8476: _IO_printf (printf.c:36)
   by 0x8048472: main (tests/manuel1.c:8)
]]></programlisting>

<para>An uninitialised-value use error is reported when your program
uses a value which hasn't been initialised -- in other words, is
undefined. Here, the undefined value is used somewhere inside the
<function>printf</function> machinery of the C library.
This error was
reported when running the following small program:</para>
<programlisting><![CDATA[
#include <stdio.h>

int main()
{
  int x;
  printf ("x = %d\n", x);
}]]></programlisting>

<para>It is important to understand that your program can copy around
junk (uninitialised) data as much as it likes. Memcheck observes this
and keeps track of the data, but does not complain. A complaint is
issued only when your program attempts to make use of uninitialised
data in a way that might affect your program's externally-visible behaviour.
In this example, <varname>x</varname> is uninitialised. Memcheck observes
the value being passed to <function>_IO_printf</function> and thence to
<function>_IO_vfprintf</function>, but makes no comment. However,
<function>_IO_vfprintf</function> has to examine the value of
<varname>x</varname> so it can turn it into the corresponding ASCII string,
and it is at this point that Memcheck complains.</para>

<para>Sources of uninitialised data tend to be:</para>
<itemizedlist>
  <listitem>
    <para>Local variables in procedures which have not been initialised,
    as in the example above.</para>
  </listitem>
  <listitem>
    <para>The contents of heap blocks (allocated with
    <function>malloc</function>, <function>new</function>, or a similar
    function) before you (or a constructor) write something there.
    </para>
  </listitem>
</itemizedlist>

<para>To see information on the sources of uninitialised data in your
program, use the <option>--track-origins=yes</option> option.
This
makes Memcheck run more slowly, but can make it much easier to track down
the root causes of uninitialised value errors.</para>

</sect2>



<sect2 id="mc-manual.bad-syscall-args"
       xreflabel="Use of uninitialised or unaddressable values in system
       calls">
<title>Use of uninitialised or unaddressable values in system
       calls</title>

<para>Memcheck checks all parameters to system calls:
<itemizedlist>
  <listitem>
    <para>It checks whether all the direct parameters themselves are
    initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if a system call needs to read from a buffer provided by
    your program, Memcheck checks that the entire buffer is addressable
    and its contents are initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if the system call needs to write to a user-supplied
    buffer, Memcheck checks that the buffer is addressable.</para>
  </listitem>
</itemizedlist>
</para>

<para>After the system call, Memcheck updates its tracked information to
precisely reflect any changes in memory state caused by the system
call.</para>

<para>Here's an example of two system calls with invalid parameters:</para>
<programlisting><![CDATA[
  #include <stdlib.h>
  #include <unistd.h>
  int main( void )
  {
    char* arr  = malloc(10);
    int*  arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
  }
]]></programlisting>

<para>You get these complaints ...</para>
<programlisting><![CDATA[
  Syscall param write(buf) points to uninitialised byte(s)
     at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
     by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
     by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
   Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
     at 0x259852B0: malloc (vg_replace_malloc.c:130)
     by 0x80483F1: main (a.c:5)

  Syscall param exit(error_code) contains uninitialised byte(s)
     at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
     by 0x8048426: main (a.c:8)
]]></programlisting>

<para>... because the program has (a) written uninitialised junk
from the heap block to the standard output, and (b) passed an
uninitialised value to <function>exit</function>. Note that the first
error refers to the memory pointed to by
<computeroutput>buf</computeroutput> (not
<computeroutput>buf</computeroutput> itself), but the second error
refers directly to <computeroutput>exit</computeroutput>'s argument
<computeroutput>arr2[0]</computeroutput>.</para>

</sect2>


<sect2 id="mc-manual.badfrees" xreflabel="Illegal frees">
<title>Illegal frees</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
]]></programlisting>

<para>Memcheck keeps track of the blocks allocated by your program
with <function>malloc</function>/<computeroutput>new</computeroutput>,
so it knows exactly whether the argument to
<function>free</function>/<computeroutput>delete</computeroutput> is
legitimate. Here, this test program has freed the same block
twice. As with the illegal read/write errors, Memcheck attempts to
make sense of the address freed. If, as here, the address is one
which has previously been freed, you will be told that -- making
duplicate frees of the same block easy to spot.
You will also get this
message if you try to free a pointer that doesn't point to the start of a
heap block.</para>

</sect2>


<sect2 id="mc-manual.rudefn"
       xreflabel="When a heap block is freed with an inappropriate deallocation
function">
<title>When a heap block is freed with an inappropriate deallocation
function</title>

<para>In the following example, a block allocated with
<function>new[]</function> has wrongly been deallocated with
<function>free</function>:</para>
<programlisting><![CDATA[
Mismatched free() / delete / delete []
   at 0x40043249: free (vg_clientfuncs.c:171)
   by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
   by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
   by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
 Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
   at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
   by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
   by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
   by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)
]]></programlisting>

<para>In <literal>C++</literal> it's important to deallocate memory in a
way compatible with how it was allocated.
The deal is:</para>
<itemizedlist>
  <listitem>
    <para>If allocated with
    <function>malloc</function>,
    <function>calloc</function>,
    <function>realloc</function>,
    <function>valloc</function> or
    <function>memalign</function>, you must
    deallocate with <function>free</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new</function>, you must deallocate
    with <function>delete</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new[]</function>, you must
    deallocate with <function>delete[]</function>.</para>
  </listitem>
</itemizedlist>

<para>The worst thing is that on Linux apparently it doesn't matter if
you do mix these up, but the same program may then crash on a
different platform, Solaris for example. So it's best to fix it
properly. According to the KDE folks "it's amazing how many C++
programmers don't know this".</para>

<para>The reason behind the requirement is as follows. In some C++
implementations, <function>delete[]</function> must be used for
objects allocated by <function>new[]</function> because the compiler
stores the size of the array and the pointer-to-member to the
destructor of the array's content just before the pointer actually
returned. <function>delete</function> doesn't account for this and will get
confused, possibly corrupting the heap.</para>

</sect2>



<sect2 id="mc-manual.overlap"
       xreflabel="Overlapping source and destination blocks">
<title>Overlapping source and destination blocks</title>

<para>The following C library functions copy some data from one
memory block to another (or something similar):
<function>memcpy</function>,
<function>strcpy</function>,
<function>strncpy</function>,
<function>strcat</function>,
<function>strncat</function>.
The blocks pointed to by their <computeroutput>src</computeroutput> and
<computeroutput>dst</computeroutput> pointers aren't allowed to overlap.
The POSIX standards have wording along the lines of "If copying takes place
between objects that overlap, the behavior is undefined." Therefore,
Memcheck checks for this.
</para>

<para>For example:</para>
<programlisting><![CDATA[
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
]]></programlisting>

<para>You don't want the two blocks to overlap because one of them could
get partially overwritten by the copying.</para>

<para>You might think that Memcheck is being overly pedantic reporting
this in the case where <computeroutput>dst</computeroutput> is less than
<computeroutput>src</computeroutput>. For example, the obvious way to
implement <function>memcpy</function> is by copying from the first
byte to the last. However, the optimisation guides of some
architectures recommend copying from the last byte down to the first.
Also, some implementations of <function>memcpy</function> zero
<computeroutput>dst</computeroutput> before copying, because zeroing the
destination's cache line(s) can improve performance.</para>

<para>The moral of the story is: if you want to write truly portable
code, don't make any assumptions about the language
implementation.</para>

</sect2>


<sect2 id="mc-manual.leaks" xreflabel="Memory leak detection">
<title>Memory leak detection</title>

<para>Memcheck keeps track of all heap blocks issued in response to
calls to
<function>malloc</function>/<function>new</function> et al.
So when the program exits, it knows which blocks have not been freed.
</para>

<para>If <option>--leak-check</option> is set appropriately, for each
remaining block, Memcheck determines if the block is reachable from pointers
within the root-set. The root-set consists of (a) general purpose registers
of all threads, and (b) initialised, aligned, pointer-sized data words in
accessible client memory, including stacks.</para>

<para>There are two ways a block can be reached. The first is with a
"start-pointer", i.e. a pointer to the start of the block. The second is with
an "interior-pointer", i.e. a pointer to the middle of the block. There are
three ways we know of that an interior-pointer can occur:</para>

<itemizedlist>
  <listitem>
    <para>The pointer might have originally been a start-pointer and have been
    moved along deliberately (or not deliberately) by the program. In
    particular, this can happen if your program uses tagged pointers, i.e.
    if it uses the bottom one, two or three bits of a pointer, which are
    normally always zero due to alignment, in order to store extra
    information.</para>
  </listitem>

  <listitem>
    <para>It might be a random junk value in memory, entirely unrelated, just
    a coincidence.</para>
  </listitem>

  <listitem>
    <para>It might be a pointer to an array of C++ objects (which possess
    destructors) allocated with <computeroutput>new[]</computeroutput>. In
    this case, some compilers store a "magic cookie" containing the array
    length at the start of the allocated block, and return a pointer to just
    past that magic cookie, i.e. an interior-pointer.
    See <ulink url="http://theory.uwinnipeg.ca/gnu/gcc/gxxint_14.html">this
    page</ulink> for more information.</para>
  </listitem>
</itemizedlist>

<para>With that in mind, consider the nine possible cases described by the
following figure.</para>

<programlisting><![CDATA[
     Pointer chain            AAA Category    BBB Category
     -------------            ------------    ------------
(1)  RRR ------------> BBB                    DR
(2)  RRR ---> AAA ---> BBB    DR              IR
(3)  RRR               BBB                    DL
(4)  RRR      AAA ---> BBB    DL              IL
(5)  RRR ------?-----> BBB                    (y)DR, (n)DL
(6)  RRR ---> AAA -?-> BBB    DR              (y)IR, (n)DL
(7)  RRR -?-> AAA ---> BBB    (y)DR, (n)DL    (y)IR, (n)IL
(8)  RRR -?-> AAA -?-> BBB    (y)DR, (n)DL    (y,y)IR, (n,y)IL, (_,n)DL
(9)  RRR      AAA -?-> BBB    DL              (y)IL, (n)DL

Pointer chain legend:
- RRR: a root set node or DR block
- AAA, BBB: heap blocks
- --->: a start-pointer
- -?->: an interior-pointer

Category legend:
- DR: Directly reachable
- IR: Indirectly reachable
- DL: Directly lost
- IL: Indirectly lost
- (y)XY: it's XY if the interior-pointer is a real pointer
- (n)XY: it's XY if the interior-pointer is not a real pointer
- (_)XY: it's XY in either case
]]></programlisting>

<para>Every possible case can be reduced to one of the above nine. Memcheck
merges some of these cases in its output, resulting in the following four
categories.</para>


<itemizedlist>

  <listitem>
    <para>"Still reachable". This covers cases 1 and 2 (for the BBB blocks)
    above. A start-pointer or chain of start-pointers to the block is
    found. Since the block is still pointed at, the programmer could, at
    least in principle, have freed it before program exit. Because these
    are very common and arguably not a problem, Memcheck won't report such
    blocks individually unless <option>--show-reachable=yes</option> is
    specified.</para>
  </listitem>

  <listitem>
    <para>"Definitely lost".
This covers case 3 (for the BBB blocks) above.
    This means that no pointer to the block can be found. The block is
    classified as "lost", because the programmer could not possibly have
    freed it at program exit, since no pointer to it exists. This is likely
    a symptom of having lost the pointer at some earlier point in the
    program. Such cases should be fixed by the programmer.</para>
  </listitem>

  <listitem>
    <para>"Indirectly lost". This covers cases 4 and 9 (for the BBB blocks)
    above. This means that the block is lost, not because there are no
    pointers to it, but rather because all the blocks that point to it are
    themselves lost. For example, if you have a binary tree and the root
    node is lost, all its child nodes will be indirectly lost. Because
    the problem will disappear if the definitely lost block that caused the
    indirect leak is fixed, Memcheck won't report such blocks individually
    unless <option>--show-reachable=yes</option> is specified.</para>
  </listitem>

  <listitem>
    <para>"Possibly lost". This covers cases 5--8 (for the BBB blocks)
    above. This means that a chain of one or more pointers to the block has
    been found, but at least one of the pointers is an interior-pointer.
    This could just be a random value in memory that happens to point into a
    block, and so you shouldn't consider this ok unless you know you have
    interior-pointers.</para>
  </listitem>

</itemizedlist>

<para>(Note: This mapping of the nine possible cases onto four categories is
not necessarily the best way that leaks could be reported; in particular,
interior-pointers are treated inconsistently.
It is possible the
categorisation may be improved in the future.)</para>

<para>Furthermore, if a suppression exists for a block, it will be reported
as "suppressed" no matter which of the above four categories it belongs
to.</para>


<para>The following is an example leak summary.</para>

<programlisting><![CDATA[
LEAK SUMMARY:
   definitely lost: 48 bytes in 3 blocks.
   indirectly lost: 32 bytes in 2 blocks.
     possibly lost: 96 bytes in 6 blocks.
   still reachable: 64 bytes in 4 blocks.
        suppressed: 0 bytes in 0 blocks.
]]></programlisting>

<para>If <option>--leak-check=full</option> is specified,
Memcheck will give details for each definitely lost or possibly lost block,
including where it was allocated. (Actually, it merges results for all
blocks that have the same category and sufficiently similar stack traces
into a single "loss record". The
<option>--leak-resolution</option> option lets you control the
meaning of "sufficiently similar".) It cannot tell you when or how or why
the pointer to a leaked block was lost; you have to work that out for
yourself. In general, you should attempt to ensure your programs do not
have any definitely lost or possibly lost blocks at exit.</para>

<para>For example:</para>
<programlisting><![CDATA[
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:39)

88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:25)
]]></programlisting>

<para>The first message describes a simple case of a single 8 byte block
that has been definitely lost.
The second case mentions another 8 byte
block that has been definitely lost; the difference is that a further 80
bytes in other blocks are indirectly lost because of this lost block.
The loss records are not presented in any notable order, so the loss record
numbers aren't particularly meaningful.</para>

<para>If you specify <option>--show-reachable=yes</option>,
reachable and indirectly lost blocks will also be shown, as the following
two examples show.</para>

<programlisting><![CDATA[
64 bytes in 4 blocks are still reachable in loss record 2 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:74)

32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:80)
]]></programlisting>

<para>Because there are different kinds of leaks with different severities, an
interesting question is this: which leaks should be counted as true "errors"
and which should not? The answer to this question affects the numbers printed
in the <computeroutput>ERROR SUMMARY</computeroutput> line, and also the effect
of the <option>--error-exitcode</option> option. Memcheck uses the following
criteria:</para>

<itemizedlist>
  <listitem>
    <para>First, a leak is only counted as a true "error" if
    <option>--leak-check=full</option> is specified. In other words, an
    unprinted leak is not considered a true "error". If this were not the
    case, it would be possible to get a high error count but not have any
    errors printed, which would be confusing.</para>
  </listitem>

  <listitem>
    <para>After that, definitely lost and possibly lost blocks are counted as
    true "errors".
Indirectly lost and still reachable blocks are not counted
    as true "errors", even if <option>--show-reachable=yes</option> is
    specified and they are printed; this is because such blocks don't need
    direct fixing by the programmer.
    </para>
  </listitem>
</itemizedlist>

</sect2>

</sect1>



<sect1 id="mc-manual.options"
       xreflabel="Memcheck Command-Line Options">
<title>Memcheck Command-Line Options</title>

<!-- start of xi:include in the manpage -->
<variablelist id="mc.opts.list">

  <varlistentry id="opt.leak-check" xreflabel="--leak-check">
    <term>
      <option><![CDATA[--leak-check=<no|summary|yes|full> [default: summary] ]]></option>
    </term>
    <listitem>
      <para>When enabled, search for memory leaks when the client
      program finishes. If set to <varname>summary</varname>, it says how
      many leaks occurred. If set to <varname>full</varname> or
      <varname>yes</varname>, it also gives details of each individual
      leak.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.show-possibly-lost" xreflabel="--show-possibly-lost">
    <term>
      <option><![CDATA[--show-possibly-lost=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>When disabled, the memory leak detector will not show "possibly lost" blocks.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.leak-resolution" xreflabel="--leak-resolution">
    <term>
      <option><![CDATA[--leak-resolution=<low|med|high> [default: high] ]]></option>
    </term>
    <listitem>
      <para>When doing leak checking, determines how willing
      Memcheck is to consider different backtraces to
      be the same for the purposes of merging multiple leaks into a single
      leak report. When set to <varname>low</varname>, only the first
      two entries need match. When <varname>med</varname>, four entries
      have to match.
When <varname>high</varname>, all entries need to
      match.</para>

      <para>For hardcore leak debugging, you probably want to use
      <option>--leak-resolution=high</option> together with
      <option>--num-callers=40</option> or some such large number.
      </para>

      <para>Note that the <option>--leak-resolution</option> setting
      does not affect Memcheck's ability to find
      leaks. It only changes how the results are presented.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.show-reachable" xreflabel="--show-reachable">
    <term>
      <option><![CDATA[--show-reachable=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>When disabled, the memory leak detector only shows "definitely
      lost" and "possibly lost" blocks. When enabled, the leak detector also
      shows "reachable" and "indirectly lost" blocks. (In other words, it
      shows all blocks, except suppressed ones, so
      <option>--show-all</option> would be a better name for
      it.)</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.undef-value-errors" xreflabel="--undef-value-errors">
    <term>
      <option><![CDATA[--undef-value-errors=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>Controls whether Memcheck reports
      uses of undefined values. Set this to
      <varname>no</varname> if you don't want to see undefined value
      errors. It also has the side effect of speeding up
      Memcheck somewhat.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.track-origins" xreflabel="--track-origins">
    <term>
      <option><![CDATA[--track-origins=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Controls whether Memcheck tracks
      the origin of uninitialised values.
By default, it does not,
      which means that although it can tell you that an
      uninitialised value is being used in a dangerous way, it
      cannot tell you where the uninitialised value came from. This
      often makes it difficult to track down the root problem.
      </para>
      <para>When set
      to <varname>yes</varname>, Memcheck keeps
      track of the origins of all uninitialised values. Then, when
      an uninitialised value error is
      reported, Memcheck will try to show the
      origin of the value. An origin can be one of the following
      four places: a heap block, a stack allocation, a client
      request, or miscellaneous other sources (e.g. a call
      to <varname>brk</varname>).
      </para>
      <para>For uninitialised values originating from a heap
      block, Memcheck shows where the block was
      allocated. For uninitialised values originating from a stack
      allocation, Memcheck can tell you which
      function allocated the value, but no more than that -- typically
      it shows you the source location of the opening brace of the
      function. So you should carefully check that all of the
      function's local variables are initialised properly.
      </para>
      <para>Performance overhead: origin tracking is expensive. It
      halves Memcheck's speed and increases
      memory use by a minimum of 100MB, and possibly more.
      Nevertheless it can drastically reduce the effort required to
      identify the root cause of uninitialised value errors, and so
      is often a programmer productivity win, despite running
      more slowly.
      </para>
      <para>Accuracy: Memcheck tracks origins
      quite accurately. To avoid very large space and time
      overheads, some approximations are made. It is possible,
      although unlikely, that Memcheck will report an incorrect origin, or
      not be able to identify any origin.
      </para>
      <para>Note that the combination
      <option>--track-origins=yes</option>
      and <option>--undef-value-errors=no</option> is
      nonsensical. Memcheck checks for and
      rejects this combination at startup.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
    <term>
      <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Controls how Memcheck handles word-sized,
      word-aligned loads from addresses for which some bytes are
      addressable and others are not. When <varname>yes</varname>, such
      loads do not produce an address error. Instead, loaded bytes
      originating from illegal addresses are marked as uninitialised, and
      those corresponding to legal addresses are handled in the normal
      way.</para>

      <para>When <varname>no</varname>, loads from partially invalid
      addresses are treated the same as loads from completely invalid
      addresses: an illegal-address error is issued, and the resulting
      bytes are marked as initialised.</para>

      <para>Note that code that behaves in this way is in violation of
      the ISO C/C++ standards, and should be considered broken. If
      at all possible, such code should be fixed. This option should be
      used only as a last resort.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.freelist-vol" xreflabel="--freelist-vol">
    <term>
      <option><![CDATA[--freelist-vol=<number> [default: 20000000] ]]></option>
    </term>
    <listitem>
      <para>When the client program releases memory using
      <function>free</function> (in <literal>C</literal>) or
      <computeroutput>delete</computeroutput>
      (<literal>C++</literal>), that memory is not immediately made
      available for re-allocation. Instead, it is marked inaccessible
      and placed in a queue of freed blocks.
The purpose is to defer as
      long as possible the point at which freed-up memory comes back
      into circulation. This increases the chance that
      Memcheck will be able to detect invalid
      accesses to blocks for some significant period of time after they
      have been freed.</para>

      <para>This option specifies the maximum total size, in bytes, of the
      blocks in the queue. The default value is twenty million bytes.
      Increasing this increases the total amount of memory used by
      Memcheck but may detect invalid uses of freed
      blocks which would otherwise go undetected.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.freelist-big-blocks" xreflabel="--freelist-big-blocks">
    <term>
      <option><![CDATA[--freelist-big-blocks=<number> [default: 1000000] ]]></option>
    </term>
    <listitem>
      <para>When making blocks from the queue of freed blocks available
      for re-allocation, Memcheck gives priority to re-circulating the blocks
      with a size greater than or equal to <option>--freelist-big-blocks</option>.
      This ensures that freeing big blocks (in particular freeing blocks bigger than
      <option>--freelist-vol</option>) does not immediately lead to a re-circulation
      of all (or a lot of) the small blocks in the free list. In other words,
      this option increases the likelihood of discovering dangling pointers
      for the "small" blocks, even when big blocks are freed.</para>
      <para>Setting a value of 0 means that all the blocks are re-circulated
      in FIFO order.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.workaround-gcc296-bugs" xreflabel="--workaround-gcc296-bugs">
    <term>
      <option><![CDATA[--workaround-gcc296-bugs=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>When enabled, Memcheck assumes that reads and writes some small
      distance below the stack pointer are due to bugs in GCC 2.96, and
      does not report them.
The "small distance" is 256 bytes by
      default. Note that GCC 2.96 is the default compiler on some ancient
      Linux distributions (RedHat 7.X) and so you may need to use this
      option. Do not use it if you do not have to, as it can cause real
      errors to be overlooked. A better alternative is to use a more
      recent GCC in which this bug is fixed.</para>

      <para>You may also need to use this option when working with
      GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because
      GCC generates code which occasionally accesses below the
      stack pointer, particularly for floating-point to/from integer
      conversions. This is in violation of the 32-bit PowerPC ELF
      specification, which makes no provision for locations below the
      stack pointer to be accessible.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.ignore-ranges" xreflabel="--ignore-ranges">
    <term>
      <option><![CDATA[--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] ]]></option>
    </term>
    <listitem>
      <para>Any ranges listed in this option (and multiple ranges can be
      specified, separated by commas) will be ignored by Memcheck's
      addressability checking.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.malloc-fill" xreflabel="--malloc-fill">
    <term>
      <option><![CDATA[--malloc-fill=<hexnumber> ]]></option>
    </term>
    <listitem>
      <para>Fills blocks allocated
      by <computeroutput>malloc</computeroutput>,
      <computeroutput>new</computeroutput>, etc, but not
      by <computeroutput>calloc</computeroutput>, with the specified
      byte. This can be useful when trying to shake out obscure
      memory corruption problems. The allocated area is still
      regarded by Memcheck as undefined -- this option only affects its
      contents.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.free-fill" xreflabel="--free-fill">
    <term>
      <option><![CDATA[--free-fill=<hexnumber> ]]></option>
    </term>
    <listitem>
      <para>Fills blocks freed
      by <computeroutput>free</computeroutput>,
      <computeroutput>delete</computeroutput>, etc, with the
      specified byte value. This can be useful when trying to shake out
      obscure memory corruption problems. The freed area is still
      regarded by Memcheck as not valid for access -- this option only
      affects its contents.
      </para>
    </listitem>
  </varlistentry>

</variablelist>
<!-- end of xi:include in the manpage -->

</sect1>


<sect1 id="mc-manual.suppfiles" xreflabel="Writing suppression files">
<title>Writing suppression files</title>

<para>The basic suppression format is described in
<xref linkend="manual-core.suppress"/>.</para>

<para>The suppression-type (second) line should have the form:</para>
<programlisting><![CDATA[
Memcheck:suppression_type]]></programlisting>

<para>The Memcheck suppression types are as follows:</para>

<itemizedlist>
  <listitem>
    <para><varname>Value1</varname>,
    <varname>Value2</varname>,
    <varname>Value4</varname>,
    <varname>Value8</varname>,
    <varname>Value16</varname>,
    meaning an uninitialised-value error when
    using a value of 1, 2, 4, 8 or 16 bytes.</para>
  </listitem>

  <listitem>
    <para><varname>Cond</varname> (or its old
    name, <varname>Value0</varname>), meaning use
    of an uninitialised CPU condition code.</para>
  </listitem>

  <listitem>
    <para><varname>Addr1</varname>,
    <varname>Addr2</varname>,
    <varname>Addr4</varname>,
    <varname>Addr8</varname>,
    <varname>Addr16</varname>,
    meaning an invalid address during a
    memory access of 1, 2, 4, 8 or 16 bytes respectively.</para>
  </listitem>

  <listitem>
    <para><varname>Jump</varname>, meaning a
    jump to an unaddressable location.</para>
  </listitem>

  <listitem>
    <para><varname>Param</varname>, meaning an
    invalid system call parameter error.</para>
  </listitem>

  <listitem>
    <para><varname>Free</varname>, meaning an
    invalid or mismatching free.</para>
  </listitem>

  <listitem>
    <para><varname>Overlap</varname>, meaning a
    <computeroutput>src</computeroutput> /
    <computeroutput>dst</computeroutput> overlap in
    <function>memcpy</function> or a similar function.</para>
  </listitem>

  <listitem>
    <para><varname>Leak</varname>, meaning
    a memory leak.</para>
  </listitem>

</itemizedlist>

<para><computeroutput>Param</computeroutput> errors have an extra
information line at this point, which is the name of the offending
system call parameter. No other error kinds have this extra
line.</para>

<para>The first line of the calling context: for <varname>ValueN</varname>
and <varname>AddrN</varname> errors, it is either the name of the function
in which the error occurred, or, failing that, the full path of the
<filename>.so</filename> file
or executable containing the error location. For <varname>Free</varname> errors, it is the name
of the function doing the freeing (e.g. <function>free</function>,
<function>__builtin_vec_delete</function>, etc). For
<varname>Overlap</varname> errors, it is the name of the function with the
overlapping arguments (e.g.
<function>memcpy</function>,
<function>strcpy</function>, etc).</para>

<para>Lastly, there's the rest of the calling context.</para>

</sect1>



<sect1 id="mc-manual.machine"
       xreflabel="Details of Memcheck's checking machinery">
<title>Details of Memcheck's checking machinery</title>

<para>Read this section if you want to know, in detail, exactly
what and how Memcheck is checking.</para>


<sect2 id="mc-manual.value" xreflabel="Valid-value (V) bit">
<title>Valid-value (V) bits</title>

<para>It is simplest to think of Memcheck as implementing a synthetic CPU
which is identical to a real CPU, except for one crucial detail. Every
bit (literally) of data processed, stored and handled by the real CPU
has, in the synthetic CPU, an associated "valid-value" bit, which says
whether or not the accompanying bit has a legitimate value. In the
discussions which follow, this bit is referred to as the V (valid-value)
bit.</para>

<para>Each byte in the system therefore has 8 V bits which follow it
wherever it goes. For example, when the CPU loads a word-size item (4
bytes) from memory, it also loads the corresponding 32 V bits from a
bitmap which stores the V bits for the process' entire address space.
If the CPU should later write the whole or some part of that value to
memory at a different address, the relevant V bits will be stored back
in the V-bit bitmap.</para>

<para>In short, each bit in the system has (conceptually) an associated V
bit, which follows it around everywhere, even inside the CPU. Yes, all the
CPU's registers (integer, floating point, vector and condition registers)
have their own V bit vectors. For this to work, Memcheck uses a great deal
of compression to represent the V bits compactly.</para>

<para>Copying values around does not cause Memcheck to check for, or
report on, errors.
However, when a value is used in a way which might
conceivably affect your program's externally-visible behaviour,
the associated V bits are immediately checked. If any of these indicate
that the value is undefined (even partially), an error is reported.</para>

<para>Here's an (admittedly nonsensical) example:</para>
<programlisting><![CDATA[
int i, j;
int a[10], b[10];
for ( i = 0; i < 10; i++ ) {
  j = a[i];
  b[i] = j;
}]]></programlisting>

<para>Memcheck emits no complaints about this, since it merely copies
uninitialised values from <varname>a[]</varname> into
<varname>b[]</varname>, and doesn't use them in a way which could
affect the behaviour of the program. However, if
the loop is changed to:</para>
<programlisting><![CDATA[
for ( i = 0; i < 10; i++ ) {
  j += a[i];
}
if ( j == 77 )
  printf("hello there\n");
]]></programlisting>

<para>then Memcheck will complain, at the
<computeroutput>if</computeroutput>, that the condition depends on
uninitialised values. Note that it <command>doesn't</command> complain
at the <varname>j += a[i];</varname>, since at that point the
undefinedness is not "observable". It's only when a decision has to be
made as to whether or not to do the <function>printf</function> -- an
observable action of your program -- that Memcheck complains.</para>

<para>Most low level operations, such as adds, cause Memcheck to use the
V bits for the operands to calculate the V bits for the result.
Even if
the result is partially or wholly undefined, it does not
complain.</para>

<para>Checks on definedness only occur in three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, in which case Memcheck
checks the definedness of parameters as required.</para>

<para>If a check should detect undefinedness, an error message is
issued. The resulting value is subsequently regarded as well-defined.
To do otherwise would give long chains of error messages. In other
words, once Memcheck reports an undefined value error, it tries to
avoid reporting further errors derived from that same undefined
value.</para>

<para>This sounds overcomplicated. Why not just check all reads from
memory, and complain if an undefined value is loaded into a CPU
register? Well, that doesn't work well, because perfectly legitimate C
programs routinely copy uninitialised values around in memory, and we
don't want endless complaints about that. Here's the canonical example.
Consider a struct like this:</para>
<programlisting><![CDATA[
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
]]></programlisting>

<para>The question to ask is: how large is <varname>struct S</varname>,
in bytes? An <varname>int</varname> is 4 bytes and a
<varname>char</varname> one byte, so perhaps a <varname>struct
S</varname> occupies 5 bytes? Wrong. All non-toy compilers we know
of will round the size of <varname>struct S</varname> up to a whole
number of words, in this case 8 bytes. Not doing this forces compilers
to generate truly appalling code for accessing arrays of
<varname>struct S</varname>'s on some architectures.</para>

<para>So <varname>s1</varname> occupies 8 bytes, yet only 5 of them will
be initialised.
For the assignment <varname>s2 = s1</varname>, GCC
generates code to copy all 8 bytes wholesale into <varname>s2</varname>
without regard for their meaning. If Memcheck simply checked values as
they came out of memory, it would yelp every time a structure assignment
like this happened. So the more complicated behaviour described above
is necessary. This allows GCC to copy
<varname>s1</varname> into <varname>s2</varname> any way it likes, and a
warning will only be emitted if the uninitialised values are later
used.</para>

</sect2>


<sect2 id="mc-manual.vaddress" xreflabel=" Valid-address (A) bits">
<title>Valid-address (A) bits</title>

<para>Notice that the previous subsection describes how the validity of
values is established and maintained without having to say whether the
program does or does not have the right to access any particular memory
location. We now consider the latter question.</para>

<para>As described above, every bit in memory or in the CPU has an
associated valid-value (V) bit. In addition, all bytes in memory, but
not in the CPU, have an associated valid-address (A) bit. This
indicates whether or not the program can legitimately read or write that
location. It does not give any indication of the validity of the data
at that location -- that's the job of the V bits -- only whether or not
the location may be accessed.</para>

<para>Every time your program reads or writes memory, Memcheck checks
the A bits associated with the address. If any of them indicate an
invalid address, an error is emitted. Note that the reads and writes
themselves do not change the A bits, only consult them.</para>

<para>So how do the A bits get set/cleared?
Like this:</para>

<itemizedlist>
  <listitem>
    <para>When the program starts, all the global data areas are
    marked as accessible.</para>
  </listitem>

  <listitem>
    <para>When the program does
    <function>malloc</function>/<computeroutput>new</computeroutput>,
    the A bits for exactly the area allocated, and not a byte more,
    are marked as accessible. Upon freeing the area the A bits are
    changed to indicate inaccessibility.</para>
  </listitem>

  <listitem>
    <para>When the stack pointer register (<literal>SP</literal>) moves
    up or down, A bits are set. The rule is that the area from
    <literal>SP</literal> up to the base of the stack is marked as
    accessible, and below <literal>SP</literal> is inaccessible. (If
    that sounds illogical, bear in mind that the stack grows down, not
    up, on almost all Unix systems, including GNU/Linux.) Tracking
    <literal>SP</literal> like this has the useful side-effect that the
    section of stack used by a function for local variables etc is
    automatically marked accessible on function entry and inaccessible
    on exit.</para>
  </listitem>

  <listitem>
    <para>When doing system calls, A bits are changed appropriately.
    For example, <literal>mmap</literal>
    magically makes files appear in the process'
    address space, so the A bits must be updated if <literal>mmap</literal>
    succeeds.</para>
  </listitem>

  <listitem>
    <para>Optionally, your program can tell Memcheck about such changes
    explicitly, using the client request mechanism described
    in <xref linkend="mc-manual.clientreqs"/>.</para>
  </listitem>

</itemizedlist>

</sect2>


<sect2 id="mc-manual.together" xreflabel="Putting it all together">
<title>Putting it all together</title>

<para>Memcheck's checking machinery can be summarised as
follows:</para>

<itemizedlist>
  <listitem>
    <para>Each byte in memory has 8 associated V (valid-value) bits,
    saying whether or not the byte has a defined value, and a single A
    (valid-address) bit, saying whether or not the program currently has
    the right to read/write that address. As mentioned above, heavy
    use of compression means the overhead is typically around 25%.</para>
  </listitem>

  <listitem>
    <para>When memory is read or written, the relevant A bits are
    consulted. If they indicate an invalid address, Memcheck emits an
    Invalid read or Invalid write error.</para>
  </listitem>

  <listitem>
    <para>When memory is read into the CPU's registers, the relevant V
    bits are fetched from memory and stored in the simulated CPU.
They
    are not consulted.</para>
  </listitem>

  <listitem>
    <para>When a register is written out to memory, the V bits for that
    register are written back to memory too.</para>
  </listitem>

  <listitem>
    <para>When values in CPU registers are used to generate a memory
    address, or to determine the outcome of a conditional branch, the V
    bits for those values are checked, and an error emitted if any of
    them are undefined.</para>
  </listitem>

  <listitem>
    <para>When values in CPU registers are used for any other purpose,
    Memcheck computes the V bits for the result, but does not check
    them.</para>
  </listitem>

  <listitem>
    <para>Once the V bits for a value in the CPU have been checked, they
    are then set to indicate validity. This avoids long chains of
    errors.</para>
  </listitem>

  <listitem>
    <para>When values are loaded from memory, Memcheck checks the A bits
    for that location and issues an illegal-address warning if needed.
    In that case, the V bits loaded are forced to indicate Valid,
    despite the location being invalid.</para>

    <para>This apparently strange choice reduces the amount of confusing
    information presented to the user. It avoids the unpleasant
    phenomenon in which memory is read from a place which is both
    unaddressable and contains invalid values, and, as a result, you get
    not only an invalid-address (read/write) error, but also a
    potentially large set of uninitialised-value errors, one for every
    time the value is used.</para>

    <para>There is a hazy boundary case to do with multi-byte loads from
    addresses which are partially valid and partially invalid. See
    the description of the option <option>--partial-loads-ok</option> for details.
    </para>
  </listitem>

</itemizedlist>


<para>Memcheck intercepts calls to <function>malloc</function>,
<function>calloc</function>, <function>realloc</function>,
<function>valloc</function>, <function>memalign</function>,
<function>free</function>, <computeroutput>new</computeroutput>,
<computeroutput>new[]</computeroutput>,
<computeroutput>delete</computeroutput> and
<computeroutput>delete[]</computeroutput>. The behaviour you get
is:</para>

<itemizedlist>

  <listitem>
    <para><function>malloc</function>/<function>new</function>/<computeroutput>new[]</computeroutput>:
    the returned memory is marked as addressable but not having valid
    values. This means you have to write to it before you can read
    it.</para>
  </listitem>

  <listitem>
    <para><function>calloc</function>: returned memory is marked both
    addressable and valid, since <function>calloc</function> clears
    the area to zero.</para>
  </listitem>

  <listitem>
    <para><function>realloc</function>: if the new size is larger than
    the old, the new section is addressable but invalid, as with
    <function>malloc</function>. If the new size is smaller, the
    dropped-off section is marked as unaddressable. You may only pass to
    <function>realloc</function> a pointer previously issued to you by
    <function>malloc</function>/<function>calloc</function>/<function>realloc</function>.</para>
  </listitem>

  <listitem>
    <para><function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>:
    you may only pass to these functions a pointer previously issued
    to you by the corresponding allocation function. Otherwise,
    Memcheck complains. If the pointer is indeed valid, Memcheck
    marks the entire area it points at as unaddressable, and places
    the block in the freed-blocks-queue.
The aim is to defer as long
    as possible reallocation of this block. Until that happens, all
    attempts to access it will elicit an invalid-address error, as you
    would hope.</para>
  </listitem>

</itemizedlist>

</sect2>
</sect1>

<sect1 id="mc-manual.monitor-commands" xreflabel="Memcheck Monitor Commands">
<title>Memcheck Monitor Commands</title>
<para>The Memcheck tool provides monitor commands handled by Valgrind's
built-in gdbserver (see <xref linkend="manual-core-adv.gdbserver-commandhandling"/>).
</para>

<itemizedlist>
  <listitem>
    <para><varname>get_vbits &lt;addr&gt; [&lt;len&gt;]</varname>
    shows the definedness (V) bits for &lt;len&gt; (default 1) bytes
    starting at &lt;addr&gt;. The definedness of each byte in the
    range is given using two hexadecimal digits. These hexadecimal
    digits encode the validity of each bit of the corresponding byte,
    using 0 if the bit is defined and 1 if the bit is undefined.
    If a byte is not addressable, its validity bits are replaced
    by <varname>__</varname> (a double underscore).
    </para>
    <para>
    In the following example, <varname>string10</varname> is an array
    of 10 characters, in which the even numbered bytes are
    undefined. In addition, the byte corresponding
    to <varname>string10[5]</varname> is not addressable.
    </para>
<programlisting><![CDATA[
(gdb) p &string10
$4 = (char (*)[10]) 0x8049e28
(gdb) monitor get_vbits 0x8049e28 10
ff00ff00 ff__ff00 ff00
(gdb)
]]></programlisting>

    <para>The command get_vbits cannot be used with registers. To get
    the validity bits of a register, you must start Valgrind with the
    option <option>--vgdb-shadow-registers=yes</option>. The validity
    bits of a register can be obtained by printing the corresponding
    'shadow 1' register.
In the x86 example below, the register
    eax has all its bits undefined, while the register ebx is fully
    defined.
    </para>
<programlisting><![CDATA[
(gdb) p /x $eaxs1
$9 = 0xffffffff
(gdb) p /x $ebxs1
$10 = 0x0
(gdb)
]]></programlisting>

  </listitem>

  <listitem>
    <para><varname>make_memory
    [noaccess|undefined|defined|Definedifaddressable] &lt;addr&gt;
    [&lt;len&gt;]</varname> marks the range of &lt;len&gt; (default 1)
    bytes at &lt;addr&gt; as having the given status. Parameter
    <varname>noaccess</varname> marks the range as non-accessible, so
    Memcheck will report an error on any access to it.
    <varname>undefined</varname> or <varname>defined</varname> mark
    the area as accessible, but Memcheck regards the bytes in it
    respectively as having undefined or defined values.
    <varname>Definedifaddressable</varname> marks as defined those bytes in
    the range which are already addressable, but makes no change to
    the status of bytes in the range which are not addressable. Note
    that the first letter of <varname>Definedifaddressable</varname>
    is an uppercase D to avoid confusion with <varname>defined</varname>.
    </para>

    <para>
    In the following example, the first byte of the
    <varname>string10</varname> is marked as defined:
    </para>
<programlisting><![CDATA[
(gdb) monitor make_memory defined 0x8049e28 1
(gdb) monitor get_vbits 0x8049e28 10
0000ff00 ff00ff00 ff00
(gdb)
]]></programlisting>
  </listitem>

  <listitem>
    <para><varname>check_memory [addressable|defined] &lt;addr&gt;
    [&lt;len&gt;]</varname> checks that the range of &lt;len&gt;
    (default 1) bytes at &lt;addr&gt; has the specified accessibility.
    It then outputs a description of &lt;addr&gt;.
In the following
    example, a detailed description is available because the
    option <option>--read-var-info=yes</option> was given to Valgrind at
    startup:
    </para>
<programlisting><![CDATA[
(gdb) monitor check_memory defined 0x8049e28 1
Address 0x8049E28 len 1 defined
==14698==  Location 0x8049e28 is 0 bytes inside string10[0],
==14698==  declared at prog.c:10, in frame #0 of thread 1
(gdb)
]]></programlisting>
  </listitem>

  <listitem>
    <para><varname>leak_check [full*|summary]
                              [reachable|possibleleak*|definiteleak]
                              [increased*|changed|any]
          </varname>
    performs a leak check. The <varname>*</varname> in the arguments
    indicates the default value.</para>

    <para>If the first argument is <varname>summary</varname>, only a
    summary of the leak search is given; otherwise a full leak report
    is produced. A full leak report gives detailed information for
    each leak: the stack trace where the leaked blocks were allocated,
    the number of blocks leaked and their total size. When a full
    report is requested, the next two arguments further specify what
    kind of leaks to report. A leak's details are shown if they match
    both the second and third argument.
    </para>

    <para>The second argument controls what kind of blocks are shown for
    a <varname>full</varname> leak search. The
    value <varname>definiteleak</varname> specifies that only
    definitely leaked blocks should be shown. The
    value <varname>possibleleak</varname> will also show possibly
    leaked blocks (those for which only an interior pointer was
    found). The value
    <varname>reachable</varname> will show all block categories
    (reachable, possibly leaked, definitely leaked).
    </para>

    <para>The third argument controls what kinds of changes are shown
    for a <varname>full</varname> leak search.
The
    value <varname>increased</varname> specifies that only block
    allocation stacks with an increased number of leaked bytes or
    blocks since the previous leak check should be shown. The
    value <varname>changed</varname> specifies that allocation stacks
    with any change since the previous leak check should be shown.
    The value <varname>any</varname> specifies that all leak entries
    should be shown, regardless of any increase or decrease.
    If <varname>increased</varname> or <varname>changed</varname> is
    specified, the leak report entries will show the delta relative to
    the previous leak report.
    </para>

    <para>The following example shows usage of the
    <varname>leak_check</varname> monitor command on
    the <varname>memcheck/tests/leak-cases.c</varname> regression
    test. The first command outputs one entry having an increase in
    the leaked bytes. The second command is the same as the first
    command, but uses the abbreviated forms accepted by GDB and the
    Valgrind gdbserver. It only outputs the summary information, as
    there was no increase since the previous leak search.</para>
<programlisting><![CDATA[
(gdb) monitor leak_check full possibleleak increased
==14729== 16 (+16) bytes in 1 (+1) blocks are possibly lost in loss record 13 of 16
==14729==    at 0x4006E9E: malloc (vg_replace_malloc.c:236)
==14729==    by 0x80484D5: mk (leak-cases.c:52)
==14729==    by 0x804855F: f (leak-cases.c:81)
==14729==    by 0x80488F5: main (leak-cases.c:107)
==14729==
==14729== LEAK SUMMARY:
==14729==    definitely lost: 32 (+0) bytes in 2 (+0) blocks
==14729==    indirectly lost: 16 (+0) bytes in 1 (+0) blocks
==14729==      possibly lost: 32 (+16) bytes in 2 (+1) blocks
==14729==    still reachable: 96 (+16) bytes in 6 (+1) blocks
==14729==         suppressed: 0 (+0) bytes in 0 (+0) blocks
==14729== Reachable blocks (those to which a pointer was found) are not shown.
==14729== To see them, add 'reachable any' args to leak_check
==14729==
(gdb) mo l
==14729== LEAK SUMMARY:
==14729==    definitely lost: 32 (+0) bytes in 2 (+0) blocks
==14729==    indirectly lost: 16 (+0) bytes in 1 (+0) blocks
==14729==      possibly lost: 32 (+0) bytes in 2 (+0) blocks
==14729==    still reachable: 96 (+0) bytes in 6 (+0) blocks
==14729==         suppressed: 0 (+0) bytes in 0 (+0) blocks
==14729== Reachable blocks (those to which a pointer was found) are not shown.
==14729== To see them, add 'reachable any' args to leak_check
==14729==
(gdb)
]]></programlisting>
    <para>Note that when using Valgrind's gdbserver, it is not
    necessary to rerun
    with <option>--leak-check=full</option>
    <option>--show-reachable=yes</option> to see the reachable
    blocks. You can obtain the same information without rerunning by
    using the GDB command <computeroutput>monitor leak_check full
    reachable any</computeroutput> (or, using
    abbreviation: <computeroutput>mo l f r a</computeroutput>).
    </para>
  </listitem>
</itemizedlist>

</sect1>

<sect1 id="mc-manual.clientreqs" xreflabel="Client requests">
<title>Client Requests</title>

<para>The following client requests are defined in
<filename>memcheck.h</filename>.
See <filename>memcheck.h</filename> for exact details of their
arguments.</para>

<itemizedlist>

  <listitem>
    <para><varname>VALGRIND_MAKE_MEM_NOACCESS</varname>,
    <varname>VALGRIND_MAKE_MEM_UNDEFINED</varname> and
    <varname>VALGRIND_MAKE_MEM_DEFINED</varname>.
    These mark address ranges as completely inaccessible,
    accessible but containing undefined data, and accessible and
    containing defined data, respectively.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</varname>.
    This is just like <varname>VALGRIND_MAKE_MEM_DEFINED</varname> but only
    affects those bytes that are already addressable.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CHECK_MEM_IS_ADDRESSABLE</varname> and
    <varname>VALGRIND_CHECK_MEM_IS_DEFINED</varname>: check immediately
    whether or not the given address range has the relevant property,
    and if not, print an error message. Also, for the convenience of
    the client, they return zero if the relevant property holds; otherwise,
    the returned value is the address of the first byte for which the
    property is not true. They always return 0 when not run on
    Valgrind.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CHECK_VALUE_IS_DEFINED</varname>: a quick and easy
    way to find out whether Valgrind thinks a particular value
    (lvalue, to be precise) is addressable and defined. Prints an error
    message if not. It has no return value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_LEAK_CHECK</varname>: does a full memory leak
    check (like <option>--leak-check=full</option>) right now.
    This is useful for incrementally checking for leaks between arbitrary
    places in the program's execution. It has no return value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_ADDED_LEAK_CHECK</varname>: same as
    <varname>VALGRIND_DO_LEAK_CHECK</varname> but only shows the
    entries for which there was an increase in leaked bytes or leaked
    number of blocks since the previous leak search. It has no return
    value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_CHANGED_LEAK_CHECK</varname>: same as
    <varname>VALGRIND_DO_LEAK_CHECK</varname> but only shows the
    entries for which there was an increase or decrease in leaked
    bytes or leaked number of blocks since the previous leak search.
    It has no return value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>: like
    <varname>VALGRIND_DO_LEAK_CHECK</varname>, except it produces only a leak
    summary (like <option>--leak-check=summary</option>).
    It has no return value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_COUNT_LEAKS</varname>: fills in the four
    arguments with the number of bytes of memory found by the previous
    leak check to be leaked (i.e. the sum of direct leaks and indirect leaks),
    dubious, reachable and suppressed. This is useful in test harness code,
    after calling <varname>VALGRIND_DO_LEAK_CHECK</varname> or
    <varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_COUNT_LEAK_BLOCKS</varname>: identical to
    <varname>VALGRIND_COUNT_LEAKS</varname> except that it returns the
    number of blocks rather than the number of bytes in each
    category.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_GET_VBITS</varname> and
    <varname>VALGRIND_SET_VBITS</varname>: allow you to get and set the
    V (validity) bits for an address range. You should probably only
    set V bits that you have got with
    <varname>VALGRIND_GET_VBITS</varname>. Only for those who really
    know what they are doing.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CREATE_BLOCK</varname> and
    <varname>VALGRIND_DISCARD</varname>. <varname>VALGRIND_CREATE_BLOCK</varname>
    takes an address, a number of bytes and a character string. The
    specified address range is then associated with that string. When
    Memcheck reports an invalid access to an address in the range, it
    will describe it in terms of this block rather than in terms of
    any other block it knows about.
    Note that the use of this macro
    does not actually change the state of memory in any way -- it
    merely gives a name for the range.
    </para>

    <para>At some point you may want Memcheck to stop reporting errors
    in terms of the block named
    by <varname>VALGRIND_CREATE_BLOCK</varname>. To make this
    possible, <varname>VALGRIND_CREATE_BLOCK</varname> returns a
    "block handle", which is a C <varname>int</varname> value. You
    can pass this block handle to <varname>VALGRIND_DISCARD</varname>.
    After doing so, Valgrind will no longer relate addressing errors
    in the specified range to the block. Passing invalid handles to
    <varname>VALGRIND_DISCARD</varname> is harmless.
    </para>
  </listitem>

</itemizedlist>

</sect1>


<sect1 id="mc-manual.mempools" xreflabel="Memory Pools">
<title>Memory Pools: describing and working with custom allocators</title>

<para>Some programs use custom memory allocators, often for performance
reasons. Left to itself, Memcheck is unable to understand the
behaviour of custom allocation schemes as well as it understands the
standard allocators, and so may miss errors and leaks in your program.
This section describes a way to give Memcheck enough of a description of
your custom allocator that it can make at least some sense of what is
happening.</para>

<para>There are many different sorts of custom allocator, so Memcheck
attempts to reason about them using a loose, abstract model. We
use the following terminology when describing custom allocation
systems:</para>

<itemizedlist>
  <listitem>
    <para>Custom allocation involves a set of independent "memory pools".
    </para>
  </listitem>
  <listitem>
    <para>Memcheck's notion of a memory pool consists of a single "anchor
    address" and a set of non-overlapping "chunks" associated with the
    anchor address.</para>
  </listitem>
  <listitem>
    <para>Typically a pool's anchor address is the address of a
    book-keeping "header" structure.</para>
  </listitem>
  <listitem>
    <para>Typically the pool's chunks are drawn from a contiguous
    "superblock" acquired through the system
    <function>malloc</function> or
    <function>mmap</function>.</para>
  </listitem>

</itemizedlist>

<para>Keep in mind that the last two points above say "typically": the
Valgrind mempool client request API is intentionally vague about the
exact structure of a mempool. There is no specific mention made of
headers or superblocks. Nevertheless, the following picture may help
elucidate the intention of the terms in the API:</para>

<programlisting><![CDATA[
   "pool"
   (anchor address)
   |
   v
   +--------+---+
   | header | o |
   +--------+-|-+
              |
              v                  superblock
              +------+---+--------------+---+------------------+
              |      |rzB|  allocation  |rzB|                  |
              +------+---+--------------+---+------------------+
                         ^              ^
                         |              |
                       "addr"     "addr"+"size"
]]></programlisting>

<para>
Note that the header and the superblock may be contiguous or
discontiguous, and there may be multiple superblocks associated with a
single header; such variations are opaque to Memcheck.
The API only requires that your allocation scheme can present sensible
values of "pool", "addr" and "size".</para>

<para>
Typically, before making client requests related to mempools, a client
program will have allocated such a header and superblock for its
mempool, and marked the superblock NOACCESS using the
<varname>VALGRIND_MAKE_MEM_NOACCESS</varname> client request.</para>

<para>
When dealing with mempools, the goal is to maintain a particular
invariant condition: that Memcheck believes the unallocated portions
of the pool's superblock (including redzones) are NOACCESS. To
maintain this invariant, the client program must ensure that the
superblock starts out in that state; Memcheck cannot make it so, since
Memcheck never explicitly learns about the superblock of a pool, only
the allocated chunks within the pool.</para>

<para>
Once the header and superblock for a pool are established and properly
marked, there are a number of client requests programs can use to
inform Memcheck about changes to the state of a mempool:</para>

<itemizedlist>

  <listitem>
    <para>
    <varname>VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed)</varname>:
    This request registers the address <varname>pool</varname> as the anchor
    address for a memory pool. It also provides a size
    <varname>rzB</varname>, specifying how large the redzones placed around
    chunks allocated from the pool should be. Finally, it provides an
    <varname>is_zeroed</varname> argument that specifies whether the pool's
    chunks are zeroed (more precisely: defined) when allocated.
    </para>
    <para>
    Upon completion of this request, no chunks are associated with the
    pool. The request simply tells Memcheck that the pool exists, so that
    subsequent calls can refer to it as a pool.
    </para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DESTROY_MEMPOOL(pool)</varname>:
    This request tells Memcheck that a pool is being torn down. Memcheck
    then removes all records of chunks associated with the pool, as well
    as its record of the pool's existence. While destroying its records of
    a mempool, Memcheck resets the redzones of any live chunks in the pool
    to NOACCESS.
    </para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MEMPOOL_ALLOC(pool, addr, size)</varname>:
    This request informs Memcheck that a <varname>size</varname>-byte chunk
    has been allocated at <varname>addr</varname>, and associates the chunk
    with the specified
    <varname>pool</varname>. If the pool was created with nonzero
    <varname>rzB</varname> redzones, Memcheck will mark the
    <varname>rzB</varname> bytes before and after the chunk as NOACCESS. If
    the pool was created with the <varname>is_zeroed</varname> argument set,
    Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark
    the chunk as UNDEFINED.
    </para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MEMPOOL_FREE(pool, addr)</varname>:
    This request informs Memcheck that the chunk at <varname>addr</varname>
    should no longer be considered allocated. Memcheck will mark the chunk
    associated with <varname>addr</varname> as NOACCESS, and delete its
    record of the chunk's existence.
    </para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MEMPOOL_TRIM(pool, addr, size)</varname>:
    This request trims the chunks associated with <varname>pool</varname>.
    The request only operates on chunks associated with
    <varname>pool</varname>.
    Trimming is formally defined as:</para>
    <itemizedlist>
      <listitem>
        <para>All chunks entirely inside the range
        <varname>addr..(addr+size-1)</varname> are preserved.</para>
      </listitem>
      <listitem>
        <para>All chunks entirely outside the range
        <varname>addr..(addr+size-1)</varname> are discarded, as though
        <varname>VALGRIND_MEMPOOL_FREE</varname> was called on them.</para>
      </listitem>
      <listitem>
        <para>All other chunks must intersect with the range
        <varname>addr..(addr+size-1)</varname>; areas outside the
        intersection are marked as NOACCESS, as though they had been
        independently freed with
        <varname>VALGRIND_MEMPOOL_FREE</varname>.</para>
      </listitem>
    </itemizedlist>
    <para>This is a somewhat rare request, but can be useful in
    implementing the type of mass-free operations common in custom
    LIFO allocators.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MOVE_MEMPOOL(poolA, poolB)</varname>: This
    request informs Memcheck that the pool previously anchored at
    address <varname>poolA</varname> has moved to anchor address
    <varname>poolB</varname>. This is a rare request, typically only needed
    if you <function>realloc</function> the header of a mempool.</para>
    <para>No memory-status bits are altered by this request.</para>
  </listitem>

  <listitem>
    <para>
    <varname>VALGRIND_MEMPOOL_CHANGE(pool, addrA, addrB,
    size)</varname>: This request informs Memcheck that the chunk
    previously allocated at address <varname>addrA</varname> within
    <varname>pool</varname> has been moved and/or resized, and should be
    changed to cover the region <varname>addrB..(addrB+size-1)</varname>. This
    is a rare request, typically only needed if you
    <function>realloc</function> a superblock or wish to extend a chunk
    without changing its memory-status bits.
    </para>
    <para>No memory-status bits are altered by this request.
    </para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MEMPOOL_EXISTS(pool)</varname>:
    This request informs the caller whether or not Memcheck is currently
    tracking a mempool at anchor address <varname>pool</varname>. It
    evaluates to 1 when there is a mempool associated with that address, 0
    otherwise. This is a rare request, only useful in circumstances when
    client code might have lost track of the set of active mempools.
    </para>
  </listitem>

</itemizedlist>

</sect1>


<sect1 id="mc-manual.mpiwrap" xreflabel="MPI Wrappers">
<title>Debugging MPI Parallel Programs with Valgrind</title>

<para>Memcheck supports debugging of distributed-memory applications
which use the MPI message passing standard. This support consists of a
library of wrapper functions for the
<computeroutput>PMPI_*</computeroutput> interface. When incorporated
into the application's address space, either by direct linking or by
<computeroutput>LD_PRELOAD</computeroutput>, the wrappers intercept
calls to <computeroutput>PMPI_Send</computeroutput>,
<computeroutput>PMPI_Recv</computeroutput>, etc. They then
use client requests to inform Memcheck of memory state changes caused
by the function being wrapped.
This reduces the number of false
positives that Memcheck otherwise typically reports for MPI
applications.</para>

<para>The wrappers also take the opportunity to carefully check
size and definedness of buffers passed as arguments to MPI functions, hence
detecting errors such as passing undefined data to
<computeroutput>PMPI_Send</computeroutput>, or receiving data into a
buffer which is too small.</para>

<para>Unlike most of the rest of Valgrind, the wrapper library is subject to a
BSD-style license, so you can link it into any code base you like.
See the top of <computeroutput>mpi/libmpiwrap.c</computeroutput>
for license details.</para>


<sect2 id="mc-manual.mpiwrap.build" xreflabel="Building MPI Wrappers">
<title>Building and installing the wrappers</title>

<para>The wrapper library will be built automatically if possible.
Valgrind's configure script will look for a suitable
<computeroutput>mpicc</computeroutput> to build it with. This must be
the same <computeroutput>mpicc</computeroutput> you use to build the
MPI application you want to debug. By default, Valgrind tries
<computeroutput>mpicc</computeroutput>, but you can specify a
different one by using the configure-time option
<option>--with-mpicc</option>. Currently the
wrappers are only buildable with
<computeroutput>mpicc</computeroutput>s which are based on GNU
GCC or Intel's C++ Compiler.</para>

<para>Check that the configure script prints a line like this:</para>

<programlisting><![CDATA[
checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
]]></programlisting>

<para>If it says <computeroutput>...
no</computeroutput>, your
<computeroutput>mpicc</computeroutput> has failed to compile and link
a test MPI2 program.</para>

<para>If the configure test succeeds, continue in the usual way with
<computeroutput>make</computeroutput> and <computeroutput>make
install</computeroutput>. The final install tree should then contain
<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>.
</para>

<para>Compile up a test MPI program (e.g. MPI hello-world) and try
this:</para>

<programlisting><![CDATA[
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so \
   mpirun [args] $prefix/bin/valgrind ./hello
]]></programlisting>

<para>You should see something similar to the following</para>

<programlisting><![CDATA[
valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
]]></programlisting>

<para>repeated for every process in the group. If you do not see
these, there is a build/installation problem of some kind.</para>

<para>The MPI functions to be wrapped are assumed to be in an ELF
shared object with soname matching
<computeroutput>libmpi.so*</computeroutput>.
This is known to be
correct at least for Open MPI and Quadrics MPI, and can easily be
changed if required.</para>
</sect2>


<sect2 id="mc-manual.mpiwrap.gettingstarted"
       xreflabel="Getting started with MPI Wrappers">
<title>Getting started</title>

<para>Compile your MPI application as usual, taking care to link it
using the same <computeroutput>mpicc</computeroutput> that your
Valgrind build was configured with.</para>

<para>
Use the following basic scheme to run your application on Valgrind with
the wrappers engaged:</para>

<programlisting><![CDATA[
MPIWRAP_DEBUG=[wrapper-args] \
   LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so \
   mpirun [mpirun-args] \
   $prefix/bin/valgrind [valgrind-args] \
   [application] [app-args]
]]></programlisting>

<para>As an alternative to
<computeroutput>LD_PRELOAD</computeroutput>ing
<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>, you can
simply link it to your application if desired. This should not disturb
native behaviour of your application in any way.</para>
</sect2>


<sect2 id="mc-manual.mpiwrap.controlling"
       xreflabel="Controlling the MPI Wrappers">
<title>Controlling the wrapper library</title>

<para>Environment variable
<computeroutput>MPIWRAP_DEBUG</computeroutput> is consulted at
startup. The default behaviour is to print a starting banner</para>

<programlisting><![CDATA[
valgrind MPI wrappers 16386: Active for pid 16386
valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
]]></programlisting>

<para>and then be relatively quiet.</para>

<para>You can give a list of comma-separated options in
<computeroutput>MPIWRAP_DEBUG</computeroutput>.
These are:</para>

<itemizedlist>
  <listitem>
    <para><computeroutput>verbose</computeroutput>:
    show entries/exits of all wrappers. Also show extra
    debugging info, such as the status of outstanding
    <computeroutput>MPI_Request</computeroutput>s resulting
    from uncompleted <computeroutput>MPI_Irecv</computeroutput>s.</para>
  </listitem>
  <listitem>
    <para><computeroutput>quiet</computeroutput>:
    opposite of <computeroutput>verbose</computeroutput>, only print
    anything when the wrappers want
    to report a detected programming error, or in case of catastrophic
    failure of the wrappers.</para>
  </listitem>
  <listitem>
    <para><computeroutput>warn</computeroutput>:
    by default, functions which lack proper wrappers
    are not commented on, just silently
    ignored. This option causes a warning to be printed for each unwrapped
    function used, up to a maximum of three warnings per function.</para>
  </listitem>
  <listitem>
    <para><computeroutput>strict</computeroutput>:
    print an error message and abort the program if
    a function lacking a wrapper is used.</para>
  </listitem>
</itemizedlist>

<para>If you want to use Valgrind's XML output facility
(<option>--xml=yes</option>), you should pass
<computeroutput>quiet</computeroutput> in
<computeroutput>MPIWRAP_DEBUG</computeroutput> so as to get rid of any
extraneous printing from the wrappers.</para>

</sect2>


<sect2 id="mc-manual.mpiwrap.limitations.functions"
       xreflabel="Functions: Abilities and Limitations">
<title>Functions</title>

<para>All MPI2 functions except
<computeroutput>MPI_Wtick</computeroutput>,
<computeroutput>MPI_Wtime</computeroutput> and
<computeroutput>MPI_Pcontrol</computeroutput> have wrappers.
The first two are not wrapped because they return a
<computeroutput>double</computeroutput>, which Valgrind's
function-wrap mechanism cannot handle (but it could easily be
extended to do so). <computeroutput>MPI_Pcontrol</computeroutput> cannot be
wrapped as it has variable arity:
<computeroutput>int MPI_Pcontrol(const int level, ...)</computeroutput></para>

<para>Most functions are wrapped with a default wrapper which does
nothing except complain or abort if it is called, depending on
settings in <computeroutput>MPIWRAP_DEBUG</computeroutput> listed
above. The following functions have "real", do-something-useful
wrappers:</para>

<programlisting><![CDATA[
PMPI_Send PMPI_Bsend PMPI_Ssend PMPI_Rsend

PMPI_Recv PMPI_Get_count

PMPI_Isend PMPI_Ibsend PMPI_Issend PMPI_Irsend

PMPI_Irecv
PMPI_Wait PMPI_Waitall
PMPI_Test PMPI_Testall

PMPI_Iprobe PMPI_Probe

PMPI_Cancel

PMPI_Sendrecv

PMPI_Type_commit PMPI_Type_free

PMPI_Pack PMPI_Unpack

PMPI_Bcast PMPI_Gather PMPI_Scatter PMPI_Alltoall
PMPI_Reduce PMPI_Allreduce PMPI_Op_create

PMPI_Comm_create PMPI_Comm_dup PMPI_Comm_free PMPI_Comm_rank PMPI_Comm_size

PMPI_Error_string
PMPI_Init PMPI_Initialized PMPI_Finalize
]]></programlisting>

<para>A few functions such as
<computeroutput>PMPI_Address</computeroutput> are listed as
<computeroutput>HAS_NO_WRAPPER</computeroutput>. They have no wrapper
at all as there is nothing worth checking, and giving a no-op wrapper
would reduce performance for no reason.</para>

<para>Note that the wrapper library itself can generate large
numbers of calls to the MPI implementation, especially when walking
complex types.
The most common functions called are
<computeroutput>PMPI_Extent</computeroutput>,
<computeroutput>PMPI_Type_get_envelope</computeroutput>,
<computeroutput>PMPI_Type_get_contents</computeroutput>, and
<computeroutput>PMPI_Type_free</computeroutput>.</para>
</sect2>

<sect2 id="mc-manual.mpiwrap.limitations.types"
       xreflabel="Types: Abilities and Limitations">
<title>Types</title>

<para>MPI-1.1 structured types are supported, and walked exactly.
The currently supported combiners are
<computeroutput>MPI_COMBINER_NAMED</computeroutput>,
<computeroutput>MPI_COMBINER_CONTIGUOUS</computeroutput>,
<computeroutput>MPI_COMBINER_VECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_HVECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_INDEXED</computeroutput>,
<computeroutput>MPI_COMBINER_HINDEXED</computeroutput> and
<computeroutput>MPI_COMBINER_STRUCT</computeroutput>. This should
cover all MPI-1.1 types. The mechanism (function
<computeroutput>walk_type</computeroutput>) should extend easily to
cover MPI2 combiners.</para>

<para>MPI defines some named structured types
(<computeroutput>MPI_FLOAT_INT</computeroutput>,
<computeroutput>MPI_DOUBLE_INT</computeroutput>,
<computeroutput>MPI_LONG_INT</computeroutput>,
<computeroutput>MPI_2INT</computeroutput>,
<computeroutput>MPI_SHORT_INT</computeroutput>,
<computeroutput>MPI_LONG_DOUBLE_INT</computeroutput>) which are pairs
of some basic type and a C <computeroutput>int</computeroutput>.
Unfortunately the MPI specification makes it impossible to look inside
these types and see where the fields are. Therefore these wrappers
assume the types are laid out as <computeroutput>struct { float val;
int loc; }</computeroutput> (for
<computeroutput>MPI_FLOAT_INT</computeroutput>), etc, and act
accordingly.
This appears to be correct at least for Open MPI 1.0.2
and for Quadrics MPI.</para>

<para>If <computeroutput>strict</computeroutput> is an option specified
in <computeroutput>MPIWRAP_DEBUG</computeroutput>, the application
will abort if an unhandled type is encountered. Otherwise, the
application will print a warning message and continue.</para>

<para>Some effort is made to mark/check memory ranges corresponding to
arrays of values in a single pass. This is important for performance
since asking Valgrind to mark/check any range, no matter how small,
carries quite a large constant cost. This optimisation is applied to
arrays of primitive types (<computeroutput>double</computeroutput>,
<computeroutput>float</computeroutput>,
<computeroutput>int</computeroutput>,
<computeroutput>long</computeroutput>, <computeroutput>long
long</computeroutput>, <computeroutput>short</computeroutput>,
<computeroutput>char</computeroutput>, and <computeroutput>long
double</computeroutput> on platforms where <computeroutput>sizeof(long
double) == 8</computeroutput>). For arrays of all other types, the
wrappers handle each element individually and so there can be a very
large performance cost.</para>

</sect2>


<sect2 id="mc-manual.mpiwrap.writingwrappers"
       xreflabel="Writing new MPI Wrappers">
<title>Writing new wrappers</title>

<para>
For the most part the wrappers are straightforward. The only
significant complexity arises with nonblocking receives.</para>

<para>The issue is that <computeroutput>MPI_Irecv</computeroutput>
states the recv buffer and returns immediately, giving a handle
(<computeroutput>MPI_Request</computeroutput>) for the transaction.
Later the user will have to poll for completion with
<computeroutput>MPI_Wait</computeroutput> etc, and when the
transaction completes successfully, the wrappers have to paint the
recv buffer. But the recv buffer details are not presented to
<computeroutput>MPI_Wait</computeroutput> -- only the handle is. The
library therefore maintains a shadow table which associates
uncompleted <computeroutput>MPI_Request</computeroutput>s with the
corresponding buffer address/count/type. When an operation completes,
the table is searched for the associated address/count/type info, and
memory is marked accordingly.</para>

<para>Access to the table is guarded by a (POSIX pthreads) lock, so as
to make the library thread-safe.</para>

<para>The table is allocated with
<computeroutput>malloc</computeroutput> and never
<computeroutput>free</computeroutput>d, so it will show up in leak
checks.</para>

<para>Writing new wrappers should be fairly easy. The source file is
<computeroutput>mpi/libmpiwrap.c</computeroutput>. If possible,
find an existing wrapper for a function of similar behaviour to the
one you want to wrap, and use it as a starting point. The wrappers
are organised in sections in the same order as the MPI 1.1 spec, to
aid navigation. When adding a wrapper, remember to comment out the
definition of the default wrapper in the long list of defaults at the
bottom of the file (do not remove it, just comment it out).</para>
</sect2>

<sect2 id="mc-manual.mpiwrap.whattoexpect"
       xreflabel="What to expect with MPI Wrappers">
<title>What to expect when using the wrappers</title>

<para>The wrappers should reduce Memcheck's false-error rate on MPI
applications. Because the wrapping is done at the MPI interface,
there will still potentially be a large number of errors reported in
the MPI implementation below the interface.
The best you can do is
try to suppress them.</para>

<para>You may also find that the input-side (buffer
length/definedness) checks find errors in your MPI use, for example
passing too short a buffer to
<computeroutput>MPI_Recv</computeroutput>.</para>

<para>Functions which are not wrapped may increase the false
error rate. A possible approach is to run with
<computeroutput>MPIWRAP_DEBUG</computeroutput> containing
<computeroutput>warn</computeroutput>. This will show you functions
which lack proper wrappers but which are nevertheless used. You can
then write wrappers for them.
</para>

<para>A known source of potential false errors is the
<computeroutput>PMPI_Reduce</computeroutput> family of functions, when
using a custom (user-defined) reduction function. In a reduction
operation, each node notionally sends data to a "central point" which
uses the specified reduction function to merge the data items into a
single item. Hence, in general, data is passed between nodes and fed
to the reduction function, but the wrapper library cannot mark the
transferred data as initialised before it is handed to the reduction
function, because all that happens "inside" the
<computeroutput>PMPI_Reduce</computeroutput> call. As a result you
may see false positives reported in your reduction function.</para>

</sect2>

</sect1>

</chapter>