<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>4.&nbsp;Memcheck: a memory error detector</title>
<link rel="stylesheet" href="vg_basic.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.2">
<link rel="home" href="index.html" title="Valgrind Documentation">
<link rel="up" href="manual.html" title="Valgrind User Manual">
<link rel="prev" href="manual-core-adv.html" title="3.&nbsp;Using and understanding the Valgrind core: Advanced Topics">
<link rel="next" href="cg-manual.html" title="5.&nbsp;Cachegrind: a cache and branch-prediction profiler">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr>
<td width="22px" align="center" valign="middle"><a accesskey="p" href="manual-core-adv.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td>
<td width="25px" align="center" valign="middle"><a accesskey="u" href="manual.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td>
<td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Home"></a></td>
<th align="center" valign="middle">Valgrind User Manual</th>
<td width="22px" align="center" valign="middle"><a accesskey="n" href="cg-manual.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td>
</tr></table></div>
<div class="chapter" title="4.&nbsp;Memcheck: a memory error detector">
<div class="titlepage"><div><div><h2 class="title">
<a name="mc-manual"></a>4.&nbsp;Memcheck: a memory error detector</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.overview">4.1.
Overview</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.errormsgs">4.2. Explanation of error messages from Memcheck</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.badrw">4.2.1. Illegal read / Illegal write errors</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.uninitvals">4.2.2. Use of uninitialised values</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.bad-syscall-args">4.2.3. Use of uninitialised or unaddressable values in system
 calls</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.badfrees">4.2.4. Illegal frees</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.rudefn">4.2.5. When a heap block is freed with an inappropriate deallocation
function</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.overlap">4.2.6. Overlapping source and destination blocks</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.leaks">4.2.7. Memory leak detection</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.options">4.3. Memcheck Command-Line Options</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.suppfiles">4.4. Writing suppression files</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.machine">4.5. Details of Memcheck's checking machinery</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.value">4.5.1. Valid-value (V) bits</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.vaddress">4.5.2. Valid-address (A) bits</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.together">4.5.3. Putting it all together</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.clientreqs">4.6.
Client Requests</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.mempools">4.7. Memory Pools: describing and working with custom allocators</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.mpiwrap">4.8. Debugging MPI Parallel Programs with Valgrind</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.build">4.8.1. Building and installing the wrappers</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.gettingstarted">4.8.2. Getting started</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.controlling">4.8.3. Controlling the wrapper library</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.limitations.functions">4.8.4. Functions</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.limitations.types">4.8.5. Types</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.writingwrappers">4.8.6. Writing new wrappers</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.whattoexpect">4.8.7. What to expect when using the wrappers</a></span></dt>
</dl></dd>
</dl>
</div>
<p>To use this tool, you may specify <code class="option">--tool=memcheck</code>
on the Valgrind command line. You don't have to, though, since Memcheck
is the default tool.</p>
<div class="sect1" title="4.1.&nbsp;Overview">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.overview"></a>4.1.&nbsp;Overview</h2></div></div></div>
<p>Memcheck is a memory error detector. It can detect the following
problems that are common in C and C++ programs.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Accessing memory you shouldn't, e.g.
overrunning and underrunning
  heap blocks, overrunning the top of the stack, and accessing memory after
  it has been freed.</p></li>
<li class="listitem"><p>Using undefined values, i.e. values that have not been initialised,
  or that have been derived from other undefined values.</p></li>
<li class="listitem"><p>Incorrect freeing of heap memory, such as double-freeing heap
  blocks, or mismatched use of
  <code class="function">malloc</code>/<code class="computeroutput">new</code>/<code class="computeroutput">new[]</code>
  versus
  <code class="function">free</code>/<code class="computeroutput">delete</code>/<code class="computeroutput">delete[]</code>.</p></li>
<li class="listitem"><p>Overlapping <code class="computeroutput">src</code> and
  <code class="computeroutput">dst</code> pointers in
  <code class="computeroutput">memcpy</code> and related
  functions.</p></li>
<li class="listitem"><p>Memory leaks.</p></li>
</ul></div>
<p>Problems like these can be difficult to find by other means,
often remaining undetected for long periods, then causing occasional,
difficult-to-diagnose crashes.</p>
</div>
<div class="sect1" title="4.2.&nbsp;Explanation of error messages from Memcheck">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.errormsgs"></a>4.2.&nbsp;Explanation of error messages from Memcheck</h2></div></div></div>
<p>Memcheck issues a range of error messages. This section presents a
quick summary of what error messages mean.
The precise behaviour of the
error-checking machinery is described in <a class="xref" href="mc-manual.html#mc-manual.machine" title="4.5.&nbsp;Details of Memcheck's checking machinery">Details of Memcheck's checking machinery</a>.</p>
<div class="sect2" title="4.2.1.&nbsp;Illegal read / Illegal write errors">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.badrw"></a>4.2.1.&nbsp;Illegal read / Illegal write errors</h3></div></div></div>
<p>For example:</p>
<pre class="programlisting">
Invalid read of size 4
   at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
   by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
 Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
</pre>
<p>This happens when your program reads or writes memory at a place
which Memcheck reckons it shouldn't. In this example, the program did a
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
library libpng.so.2.1.0.9, which was called from somewhere else in the
same library, called from line 326 of <code class="filename">qpngio.cpp</code>,
and so on.</p>
<p>Memcheck tries to establish what the illegal address might relate
to, since that's often useful. So, if it points into a block of memory
which has already been freed, you'll be informed of this, and also where
the block was freed. Likewise, if it should turn out to be just off
the end of a heap block, a common result of off-by-one errors in
array subscripting, you'll be informed of this fact, and also where the
block was allocated. If you use the <code class="option"><a class="xref" href="manual-core.html#opt.read-var-info">--read-var-info</a></code> option, Memcheck will run more slowly
but may give a more detailed description of any illegal address.</p>
<p>In this example, Memcheck can't identify the address.
Actually
the address is on the stack, but, for some reason, this is not a valid
stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid
code, a known bug in some ancient versions of GCC.</p>
<p>Note that Memcheck only tells you that your program is about to
access memory at an illegal address. It can't stop the access from
happening. So, if your program makes an access which normally would
result in a segmentation fault, your program will still suffer the same
fate -- but you will get a message from Memcheck immediately prior to
this. In this particular example, reading junk on the stack is
non-fatal, and the program stays alive.</p>
</div>
<div class="sect2" title="4.2.2.&nbsp;Use of uninitialised values">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.uninitvals"></a>4.2.2.&nbsp;Use of uninitialised values</h3></div></div></div>
<p>For example:</p>
<pre class="programlisting">
Conditional jump or move depends on uninitialised value(s)
   at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
   by 0x402E8476: _IO_printf (printf.c:36)
   by 0x8048472: main (tests/manuel1.c:8)
</pre>
<p>An uninitialised-value use error is reported when your program
uses a value which hasn't been initialised -- in other words, is
undefined. Here, the undefined value is used somewhere inside the
<code class="function">printf</code> machinery of the C library. This error was
reported when running the following small program:</p>
<pre class="programlisting">
#include &lt;stdio.h&gt;

int main()
{
  int x;                    /* never initialised */
  printf ("x = %d\n", x);
}</pre>
<p>It is important to understand that your program can copy around
junk (uninitialised) data as much as it likes. Memcheck observes this
and keeps track of the data, but does not complain.
A complaint is
issued only when your program attempts to make use of uninitialised
data in a way that might affect your program's externally-visible behaviour.
In this example, <code class="varname">x</code> is uninitialised. Memcheck observes
the value being passed to <code class="function">_IO_printf</code> and thence to
<code class="function">_IO_vfprintf</code>, but makes no comment. However,
<code class="function">_IO_vfprintf</code> has to examine the value of
<code class="varname">x</code> so it can turn it into the corresponding ASCII string,
and it is at this point that Memcheck complains.</p>
<p>Sources of uninitialised data tend to be:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Local variables in procedures which have not been initialised,
  as in the example above.</p></li>
<li class="listitem"><p>The contents of heap blocks (allocated with
  <code class="function">malloc</code>, <code class="function">new</code>, or a similar
  function) before you (or a constructor) write something there.
  </p></li>
</ul></div>
<p>To see information on the sources of uninitialised data in your
program, use the <code class="option">--track-origins=yes</code> option.
This
makes Memcheck run more slowly, but can make it much easier to track down
the root causes of uninitialised value errors.</p>
</div>
<div class="sect2" title="4.2.3.&nbsp;Use of uninitialised or unaddressable values in system calls">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.bad-syscall-args"></a>4.2.3.&nbsp;Use of uninitialised or unaddressable values in system
 calls</h3></div></div></div>
<p>Memcheck checks all parameters to system calls:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>It checks whether all the direct parameters themselves are
  initialised.</p></li>
<li class="listitem"><p>Also, if a system call needs to read from a buffer provided by
  your program, Memcheck checks that the entire buffer is addressable
  and its contents are initialised.</p></li>
<li class="listitem"><p>Also, if the system call needs to write to a user-supplied
  buffer, Memcheck checks that the buffer is addressable.</p></li>
</ul></div>
<p>
</p>
<p>After the system call, Memcheck updates its tracked information to
precisely reflect any changes in memory state caused by the system
call.</p>
<p>Here's an example of two system calls with invalid parameters:</p>
<pre class="programlisting">
  #include &lt;stdlib.h&gt;
  #include &lt;unistd.h&gt;
  int main( void )
  {
    char* arr  = malloc(10);
    int*  arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
  }
</pre>
<p>You get these complaints ...</p>
<pre class="programlisting">
  Syscall param write(buf) points to uninitialised byte(s)
     at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
     by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
     by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
   Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
     at 0x259852B0: malloc (vg_replace_malloc.c:130)
     by 0x80483F1: main (a.c:5)

  Syscall param exit(error_code) contains uninitialised byte(s)
     at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
     by 0x8048426: main (a.c:8)
</pre>
<p>... because the program has (a) written uninitialised junk
from the heap block to the standard output, and (b) passed an
uninitialised value to <code class="function">exit</code>. Note that the first
error refers to the memory pointed to by
<code class="computeroutput">buf</code> (not
<code class="computeroutput">buf</code> itself), but the second error
refers directly to <code class="computeroutput">exit</code>'s argument
<code class="computeroutput">arr2[0]</code>.</p>
</div>
<div class="sect2" title="4.2.4.&nbsp;Illegal frees">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.badfrees"></a>4.2.4.&nbsp;Illegal frees</h3></div></div></div>
<p>For example:</p>
<pre class="programlisting">
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
</pre>
<p>Memcheck keeps track of the blocks allocated by your program
with <code class="function">malloc</code>/<code class="computeroutput">new</code>,
so it knows exactly whether the argument to
<code class="function">free</code>/<code class="computeroutput">delete</code> is
legitimate. Here, this test program has freed the same block
twice. As with the illegal read/write errors, Memcheck attempts to
make sense of the address freed. If, as here, the address is one
which has previously been freed, you will be told that -- making
duplicate frees of the same block easy to spot.
You will also get this
message if you try to free a pointer that doesn't point to the start of a
heap block.</p>
</div>
<div class="sect2" title="4.2.5.&nbsp;When a heap block is freed with an inappropriate deallocation function">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.rudefn"></a>4.2.5.&nbsp;When a heap block is freed with an inappropriate deallocation
function</h3></div></div></div>
<p>In the following example, a block allocated with
<code class="function">new[]</code> has wrongly been deallocated with
<code class="function">free</code>:</p>
<pre class="programlisting">
Mismatched free() / delete / delete []
   at 0x40043249: free (vg_clientfuncs.c:171)
   by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
   by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
   by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
 Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
   at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
   by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
   by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
   by 0x4C21788F: OLEFilter::convert(QCString const &amp;) (olefilter.cc:272)
</pre>
<p>In <code class="literal">C++</code> it's important to deallocate memory in a
way compatible with how it was allocated.
The deal is:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>If allocated with
  <code class="function">malloc</code>,
  <code class="function">calloc</code>,
  <code class="function">realloc</code>,
  <code class="function">valloc</code> or
  <code class="function">memalign</code>, you must
  deallocate with <code class="function">free</code>.</p></li>
<li class="listitem"><p>If allocated with <code class="function">new</code>, you must deallocate
  with <code class="function">delete</code>.</p></li>
<li class="listitem"><p>If allocated with <code class="function">new[]</code>, you must
  deallocate with <code class="function">delete[]</code>.</p></li>
</ul></div>
<p>The worst thing is that on Linux apparently it doesn't matter if
you do mix these up, but the same program may then crash on a
different platform, Solaris for example. So it's best to fix it
properly. According to the KDE folks "it's amazing how many C++
programmers don't know this".</p>
<p>The reason behind the requirement is as follows. In some C++
implementations, <code class="function">delete[]</code> must be used for
objects allocated by <code class="function">new[]</code> because the compiler
stores the size of the array and the pointer-to-member to the
destructor of the array's content just before the pointer actually
returned.
<code class="function">delete</code> doesn't account for this and will get
confused, possibly corrupting the heap.</p>
</div>
<div class="sect2" title="4.2.6.&nbsp;Overlapping source and destination blocks">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.overlap"></a>4.2.6.&nbsp;Overlapping source and destination blocks</h3></div></div></div>
<p>The following C library functions copy some data from one
memory block to another (or something similar):
<code class="function">memcpy</code>,
<code class="function">strcpy</code>,
<code class="function">strncpy</code>,
<code class="function">strcat</code>,
<code class="function">strncat</code>.
The blocks pointed to by their <code class="computeroutput">src</code> and
<code class="computeroutput">dst</code> pointers aren't allowed to overlap.
The POSIX standards have wording along the lines of "If copying takes place
between objects that overlap, the behavior is undefined." Therefore,
Memcheck checks for this.
</p>
<p>For example:</p>
<pre class="programlisting">
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
</pre>
<p>You don't want the two blocks to overlap because one of them could
get partially overwritten by the copying.</p>
<p>You might think that Memcheck is being overly pedantic reporting
this in the case where <code class="computeroutput">dst</code> is less than
<code class="computeroutput">src</code>. For example, the obvious way to
implement <code class="function">memcpy</code> is by copying from the first
byte to the last. However, the optimisation guides of some
architectures recommend copying from the last byte down to the first.
Also, some implementations of <code class="function">memcpy</code> zero
<code class="computeroutput">dst</code> before copying, because zeroing the
destination's cache line(s) can improve performance.</p>
<p>The moral of the story is: if you want to write truly portable
code, don't make any assumptions about the language
implementation.</p>
</div>
<div class="sect2" title="4.2.7.&nbsp;Memory leak detection">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.leaks"></a>4.2.7.&nbsp;Memory leak detection</h3></div></div></div>
<p>Memcheck keeps track of all heap blocks issued in response to
calls to
<code class="function">malloc</code>/<code class="function">new</code> et al.
So when the program exits, it knows which blocks have not been freed.
</p>
<p>If <code class="option">--leak-check</code> is set appropriately, for each
remaining block, Memcheck determines if the block is reachable from pointers
within the root-set. The root-set consists of (a) general purpose registers
of all threads, and (b) initialised, aligned, pointer-sized data words in
accessible client memory, including stacks.</p>
<p>There are two ways a block can be reached. The first is with a
"start-pointer", i.e. a pointer to the start of the block. The second is with
an "interior-pointer", i.e. a pointer to the middle of the block. There are
three ways we know of that an interior-pointer can occur:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>The pointer might have originally been a start-pointer and have been
  moved along deliberately (or not deliberately) by the program. In
  particular, this can happen if your program uses tagged pointers, i.e.
  if it uses the bottom one, two or three bits of a pointer, which are
  normally always zero due to alignment, in order to store extra
  information.</p></li>
<li class="listitem"><p>It might be a random junk value in memory, entirely unrelated, just
  a coincidence.</p></li>
<li class="listitem"><p>It might be a pointer to an array of C++ objects (which possess
  destructors) allocated with <code class="computeroutput">new[]</code>. In
  this case, some compilers store a "magic cookie" containing the array
  length at the start of the allocated block, and return a pointer to just
  past that magic cookie, i.e. an interior-pointer.
  See <a class="ulink" href="http://theory.uwinnipeg.ca/gnu/gcc/gxxint_14.html" target="_top">this
  page</a> for more information.</p></li>
</ul></div>
<p>With that in mind, consider the nine possible cases described by the
following figure.</p>
<pre class="programlisting">
     Pointer chain            AAA Category    BBB Category
     -------------            ------------    ------------
(1)  RRR ------------> BBB                    DR
(2)  RRR ---> AAA ---> BBB    DR              IR
(3)  RRR               BBB                    DL
(4)  RRR      AAA ---> BBB    DL              IL
(5)  RRR ------?-----> BBB                    (y)DR, (n)DL
(6)  RRR ---> AAA -?-> BBB    DR              (y)IR, (n)DL
(7)  RRR -?-> AAA ---> BBB    (y)DR, (n)DL    (y)IR, (n)IL
(8)  RRR -?-> AAA -?-> BBB    (y)DR, (n)DL    (y,y)IR, (n,y)IL, (_,n)DL
(9)  RRR      AAA -?-> BBB    DL              (y)IL, (n)DL

Pointer chain legend:
- RRR: a root set node or DR block
- AAA, BBB: heap blocks
- --->: a start-pointer
- -?->: an interior-pointer

Category legend:
- DR: Directly reachable
- IR: Indirectly reachable
- DL: Directly lost
- IL: Indirectly lost
- (y)XY: it's XY if the interior-pointer is a real pointer
- (n)XY: it's XY if the interior-pointer is not a real pointer
- (_)XY: it's XY in either case
</pre>
<p>Every possible case can be reduced to one of the above nine.
Memcheck
merges some of these cases in its output, resulting in the following four
categories.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>"Still reachable". This covers cases 1 and 2 (for the BBB blocks)
  above. A start-pointer or chain of start-pointers to the block is
  found. Since the block is still pointed at, the programmer could, at
  least in principle, have freed it before program exit. Because these
  are very common and arguably not a problem, Memcheck won't report such
  blocks individually unless <code class="option">--show-reachable=yes</code> is
  specified.</p></li>
<li class="listitem"><p>"Definitely lost". This covers case 3 (for the BBB blocks) above.
  This means that no pointer to the block can be found. The block is
  classified as "lost", because the programmer could not possibly have
  freed it at program exit, since no pointer to it exists. This is likely
  a symptom of having lost the pointer at some earlier point in the
  program. Such cases should be fixed by the programmer.</p></li>
<li class="listitem"><p>"Indirectly lost". This covers cases 4 and 9 (for the BBB blocks)
  above. This means that the block is lost, not because there are no
  pointers to it, but rather because all the blocks that point to it are
  themselves lost. For example, if you have a binary tree and the root
  node is lost, all its children nodes will be indirectly lost. Because
  the problem will disappear if the definitely lost block that caused the
  indirect leak is fixed, Memcheck won't report such blocks individually
  unless <code class="option">--show-reachable=yes</code> is specified.</p></li>
<li class="listitem"><p>"Possibly lost". This covers cases 5 to 8 (for the BBB blocks)
  above. This means that a chain of one or more pointers to the block has
  been found, but at least one of the pointers is an interior-pointer.
  This could just be a random value in memory that happens to point into a
  block, and so you shouldn't consider this OK unless you know you have
  interior-pointers.</p></li>
</ul></div>
<p>(Note: This mapping of the nine possible cases onto four categories is
not necessarily the best way that leaks could be reported; in particular,
interior-pointers are treated inconsistently. It is possible the
categorisation may be improved in the future.)</p>
<p>Furthermore, if a suppression exists for a block, it will be reported
as "suppressed" no matter which of the above four categories it belongs
to.</p>
<p>The following is an example leak summary.</p>
<pre class="programlisting">
LEAK SUMMARY:
   definitely lost: 48 bytes in 3 blocks.
   indirectly lost: 32 bytes in 2 blocks.
     possibly lost: 96 bytes in 6 blocks.
   still reachable: 64 bytes in 4 blocks.
        suppressed: 0 bytes in 0 blocks.
</pre>
<p>If <code class="option">--leak-check=full</code> is specified,
Memcheck will give details for each definitely lost or possibly lost block,
including where it was allocated. (Actually, it merges results for all
blocks that have the same category and sufficiently similar stack traces
into a single "loss record". The
<code class="option">--leak-resolution</code> option lets you control the
meaning of "sufficiently similar".) It cannot tell you when or how or why
the pointer to a leaked block was lost; you have to work that out for
yourself. In general, you should attempt to ensure your programs do not
have any definitely lost or possibly lost blocks at exit.</p>
<p>For example:</p>
<pre class="programlisting">
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:39)

88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:25)
</pre>
<p>The first message describes a simple case of a single 8-byte block
that has been definitely lost. The second case mentions another 8-byte
block that has been definitely lost; the difference is that a further 80
bytes in other blocks are indirectly lost because of this lost block.
The loss records are not presented in any notable order, so the loss record
numbers aren't particularly meaningful.</p>
<p>If you specify <code class="option">--show-reachable=yes</code>,
reachable and indirectly lost blocks will also be shown, as the following
two examples show.</p>
<pre class="programlisting">
64 bytes in 4 blocks are still reachable in loss record 2 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:74)

32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:80)
</pre>
<p>Because there are different kinds of leaks with different severities, an
interesting question is this: which leaks should be counted as true "errors"
and which should not? The answer to this question affects the numbers printed
in the <code class="computeroutput">ERROR SUMMARY</code> line, and also the effect
of the <code class="option">--error-exitcode</code> option.
Memcheck uses the following
criteria:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>First, a leak is only counted as a true "error" if
  <code class="option">--leak-check=full</code> is specified. In other words, an
  unprinted leak is not considered a true "error". If this were not the
  case, it would be possible to get a high error count but not have any
  errors printed, which would be confusing.</p></li>
<li class="listitem"><p>After that, definitely lost and possibly lost blocks are counted as
  true "errors". Indirectly lost and still reachable blocks are not counted
  as true "errors", even if <code class="option">--show-reachable=yes</code> is
  specified and they are printed; this is because such blocks don't need
  direct fixing by the programmer.
  </p></li>
</ul></div>
</div>
</div>
<div class="sect1" title="4.3.&nbsp;Memcheck Command-Line Options">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.options"></a>4.3.&nbsp;Memcheck Command-Line Options</h2></div></div></div>
<div class="variablelist">
<a name="mc.opts.list"></a><dl>
<dt>
<a name="opt.leak-check"></a><span class="term">
  <code class="option">--leak-check=&lt;no|summary|yes|full&gt; [default: summary] </code>
  </span>
</dt>
<dd><p>When enabled, search for memory leaks when the client
  program finishes. If set to <code class="varname">summary</code>, it says how
  many leaks occurred. If set to <code class="varname">full</code> or
  <code class="varname">yes</code>, it also gives details of each individual
  leak.</p></dd>
<dt>
<a name="opt.show-possibly-lost"></a><span class="term">
  <code class="option">--show-possibly-lost=&lt;yes|no&gt; [default: yes] </code>
  </span>
</dt>
<dd><p>When disabled, the memory leak detector will not show "possibly lost" blocks.
 </p></dd>
<dt>
<a name="opt.leak-resolution"></a><span class="term">
 <code class="option">--leak-resolution=<low|med|high> [default: high] </code>
 </span>
</dt>
<dd>
<p>When doing leak checking, determines how willing
 Memcheck is to consider different backtraces to
 be the same for the purposes of merging multiple leaks into a single
 leak report. When set to <code class="varname">low</code>, only the first
 two entries need match. When <code class="varname">med</code>, four entries
 have to match. When <code class="varname">high</code>, all entries need to
 match.</p>
<p>For hardcore leak debugging, you probably want to use
 <code class="option">--leak-resolution=high</code> together with
 <code class="option">--num-callers=40</code> or some such large number.
 </p>
<p>Note that the <code class="option">--leak-resolution</code> setting
 does not affect Memcheck's ability to find
 leaks. It only changes how the results are presented.</p>
</dd>
<dt>
<a name="opt.show-reachable"></a><span class="term">
 <code class="option">--show-reachable=<yes|no> [default: no] </code>
 </span>
</dt>
<dd><p>When disabled, the memory leak detector only shows "definitely
 lost" and "possibly lost" blocks. When enabled, the leak detector also
 shows "reachable" and "indirectly lost" blocks. (In other words, it
 shows all blocks, except suppressed ones, so
 <code class="option">--show-all</code> would be a better name for
 it.)</p></dd>
<dt>
<a name="opt.undef-value-errors"></a><span class="term">
 <code class="option">--undef-value-errors=<yes|no> [default: yes] </code>
 </span>
</dt>
<dd><p>Controls whether Memcheck reports
 uses of undefined values. Set this to
 <code class="varname">no</code> if you don't want to see undefined value
 errors. It also has the side effect of speeding up
 Memcheck somewhat.
 </p></dd>
<dt>
<a name="opt.track-origins"></a><span class="term">
 <code class="option">--track-origins=<yes|no> [default: no] </code>
 </span>
</dt>
<dd>
<p>Controls whether Memcheck tracks
 the origin of uninitialised values. By default, it does not,
 which means that although it can tell you that an
 uninitialised value is being used in a dangerous way, it
 cannot tell you where the uninitialised value came from. This
 often makes it difficult to track down the root problem.
 </p>
<p>When set
 to <code class="varname">yes</code>, Memcheck keeps
 track of the origins of all uninitialised values. Then, when
 an uninitialised value error is
 reported, Memcheck will try to show the
 origin of the value. An origin can be one of the following
 four places: a heap block, a stack allocation, a client
 request, or miscellaneous other sources (eg, a call
 to <code class="varname">brk</code>).
 </p>
<p>For uninitialised values originating from a heap
 block, Memcheck shows where the block was
 allocated. For uninitialised values originating from a stack
 allocation, Memcheck can tell you which
 function allocated the value, but no more than that -- typically
 it shows you the source location of the opening brace of the
 function. So you should carefully check that all of the
 function's local variables are initialised properly.
 </p>
<p>Performance overhead: origin tracking is expensive. It
 halves Memcheck's speed and increases
 memory use by a minimum of 100MB, and possibly more.
 Nevertheless it can drastically reduce the effort required to
 identify the root cause of uninitialised value errors, and so
 is often a programmer productivity win, despite running
 more slowly.
 </p>
<p>Accuracy: Memcheck tracks origins
 quite accurately. To avoid very large space and time
 overheads, some approximations are made.
It is possible,
 although unlikely, that Memcheck will report an incorrect origin, or
 not be able to identify any origin.
 </p>
<p>Note that the combination
 <code class="option">--track-origins=yes</code>
 and <code class="option">--undef-value-errors=no</code> is
 nonsensical. Memcheck checks for and
 rejects this combination at startup.
 </p>
</dd>
<dt>
<a name="opt.partial-loads-ok"></a><span class="term">
 <code class="option">--partial-loads-ok=<yes|no> [default: no] </code>
 </span>
</dt>
<dd>
<p>Controls how Memcheck handles word-sized,
 word-aligned loads from addresses for which some bytes are
 addressable and others are not. When <code class="varname">yes</code>, such
 loads do not produce an address error. Instead, loaded bytes
 originating from illegal addresses are marked as uninitialised, and
 those corresponding to legal addresses are handled in the normal
 way.</p>
<p>When <code class="varname">no</code>, loads from partially invalid
 addresses are treated the same as loads from completely invalid
 addresses: an illegal-address error is issued, and the resulting
 bytes are marked as initialised.</p>
<p>Note that code that behaves in this way is in violation of
 the ISO C/C++ standards, and should be considered broken. If
 at all possible, such code should be fixed. This option should be
 used only as a last resort.</p>
</dd>
<dt>
<a name="opt.freelist-vol"></a><span class="term">
 <code class="option">--freelist-vol=<number> [default: 20000000] </code>
 </span>
</dt>
<dd>
<p>When the client program releases memory using
 <code class="function">free</code> (in <code class="literal">C</code>) or
 <code class="computeroutput">delete</code>
 (<code class="literal">C++</code>), that memory is not immediately made
 available for re-allocation. Instead, it is marked inaccessible
 and placed in a queue of freed blocks.
The purpose is to defer as
 long as possible the point at which freed-up memory comes back
 into circulation. This increases the chance that
 Memcheck will be able to detect invalid
 accesses to blocks for some significant period of time after they
 have been freed.</p>
<p>This option specifies the maximum total size, in bytes, of the
 blocks in the queue. The default value is twenty million bytes.
 Increasing this increases the total amount of memory used by
 Memcheck but may detect invalid uses of freed
 blocks which would otherwise go undetected.</p>
</dd>
<dt>
<a name="opt.workaround-gcc296-bugs"></a><span class="term">
 <code class="option">--workaround-gcc296-bugs=<yes|no> [default: no] </code>
 </span>
</dt>
<dd>
<p>When enabled, assume that reads and writes some small
 distance below the stack pointer are due to bugs in GCC 2.96, and
 do not report them. The "small distance" is 256 bytes by
 default. Note that GCC 2.96 is the default compiler on some ancient
 Linux distributions (RedHat 7.X) and so you may need to use this
 option. Do not use it if you do not have to, as it can cause real
 errors to be overlooked. A better alternative is to use a more
 recent GCC in which this bug is fixed.</p>
<p>You may also need to use this option when working with
 GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because
 GCC generates code which occasionally accesses below the
 stack pointer, particularly for floating-point to/from integer
 conversions.
This is in violation of the 32-bit PowerPC ELF
 specification, which makes no provision for locations below the
 stack pointer to be accessible.</p>
</dd>
<dt>
<a name="opt.ignore-ranges"></a><span class="term">
 <code class="option">--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] </code>
 </span>
</dt>
<dd><p>Any ranges listed in this option (and multiple ranges can be
 specified, separated by commas) will be ignored by Memcheck's
 addressability checking.</p></dd>
<dt>
<a name="opt.malloc-fill"></a><span class="term">
 <code class="option">--malloc-fill=<hexnumber> </code>
 </span>
</dt>
<dd><p>Fills blocks allocated
 by <code class="computeroutput">malloc</code>,
 <code class="computeroutput">new</code>, etc, but not
 by <code class="computeroutput">calloc</code>, with the specified
 byte. This can be useful when trying to shake out obscure
 memory corruption problems. The allocated area is still
 regarded by Memcheck as undefined -- this option only affects its
 contents.
 </p></dd>
<dt>
<a name="opt.free-fill"></a><span class="term">
 <code class="option">--free-fill=<hexnumber> </code>
 </span>
</dt>
<dd><p>Fills blocks freed
 by <code class="computeroutput">free</code>,
 <code class="computeroutput">delete</code>, etc, with the
 specified byte value. This can be useful when trying to shake out
 obscure memory corruption problems. The freed area is still
 regarded by Memcheck as not valid for access -- this option only
 affects its contents.
 </p></dd>
</dl>
</div>
</div>
<div class="sect1" title="4.4. Writing suppression files">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.suppfiles"></a>4.4. Writing suppression files</h2></div></div></div>
<p>The basic suppression format is described in
<a class="xref" href="manual-core.html#manual-core.suppress" title="2.5. Suppressing errors">Suppressing errors</a>.</p>
<p>The suppression-type (second) line should have the form:</p>
<pre class="programlisting">
Memcheck:suppression_type</pre>
<p>The Memcheck suppression types are as follows:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="varname">Value1</code>,
 <code class="varname">Value2</code>,
 <code class="varname">Value4</code>,
 <code class="varname">Value8</code>,
 <code class="varname">Value16</code>,
 meaning an uninitialised-value error when
 using a value of 1, 2, 4, 8 or 16 bytes.</p></li>
<li class="listitem"><p><code class="varname">Cond</code> (or its old
 name, <code class="varname">Value0</code>), meaning use
 of an uninitialised CPU condition code.</p></li>
<li class="listitem"><p><code class="varname">Addr1</code>,
 <code class="varname">Addr2</code>,
 <code class="varname">Addr4</code>,
 <code class="varname">Addr8</code>,
 <code class="varname">Addr16</code>,
 meaning an invalid address during a
 memory access of 1, 2, 4, 8 or 16 bytes respectively.</p></li>
<li class="listitem"><p><code class="varname">Jump</code>, meaning a
 jump to an unaddressable location.</p></li>
<li class="listitem"><p><code class="varname">Param</code>, meaning an
 invalid system call parameter error.</p></li>
<li class="listitem"><p><code class="varname">Free</code>, meaning an
 invalid or mismatching free.</p></li>
<li class="listitem"><p><code class="varname">Overlap</code>, meaning a
 <code class="computeroutput">src</code> /
 <code class="computeroutput">dst</code> overlap in
 <code class="function">memcpy</code> or a similar function.</p></li>
<li class="listitem"><p><code class="varname">Leak</code>, meaning
 a memory leak.</p></li>
</ul></div>
<p><code class="computeroutput">Param</code> errors have an extra
information line at this point, which is the name of the offending
system call parameter. No other error kinds have this extra
line.</p>
<p>The first line of the calling context: for <code class="varname">ValueN</code>
and <code class="varname">AddrN</code> errors, it is either the name of the function
in which the error occurred, or, failing that, the full path of the
<code class="filename">.so</code> file
or executable containing the error location. For <code class="varname">Free</code> errors, it is the name
of the function doing the freeing (eg, <code class="function">free</code>,
<code class="function">__builtin_vec_delete</code>, etc). For
<code class="varname">Overlap</code> errors, it is the name of the function with the
overlapping arguments (eg.
<code class="function">memcpy</code>,
<code class="function">strcpy</code>, etc).</p>
<p>Lastly, there's the rest of the calling context.</p>
</div>
<div class="sect1" title="4.5. Details of Memcheck's checking machinery">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.machine"></a>4.5. Details of Memcheck's checking machinery</h2></div></div></div>
<p>Read this section if you want to know, in detail, exactly
what and how Memcheck is checking.</p>
<div class="sect2" title="4.5.1. Valid-value (V) bits">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.value"></a>4.5.1. Valid-value (V) bits</h3></div></div></div>
<p>It is simplest to think of Memcheck implementing a synthetic CPU
which is identical to a real CPU, except for one crucial detail. Every
bit (literally) of data processed, stored and handled by the real CPU
has, in the synthetic CPU, an associated "valid-value" bit, which says
whether or not the accompanying bit has a legitimate value. In the
discussions which follow, this bit is referred to as the V (valid-value)
bit.</p>
<p>Each byte in the system therefore has 8 V bits which follow it
wherever it goes. For example, when the CPU loads a word-size item (4
bytes) from memory, it also loads the corresponding 32 V bits from a
bitmap which stores the V bits for the process' entire address space.
If the CPU should later write the whole or some part of that value to
memory at a different address, the relevant V bits will be stored back
in the V-bit bitmap.</p>
<p>In short, each bit in the system has (conceptually) an associated V
bit, which follows it around everywhere, even inside the CPU. Yes, all the
CPU's registers (integer, floating point, vector and condition registers)
have their own V bit vectors.
For this to work, Memcheck uses a great deal
of compression to represent the V bits compactly.</p>
<p>Copying values around does not cause Memcheck to check for, or
report on, errors. However, when a value is used in a way which might
conceivably affect your program's externally-visible behaviour,
the associated V bits are immediately checked. If any of these indicate
that the value is undefined (even partially), an error is reported.</p>
<p>Here's an (admittedly nonsensical) example:</p>
<pre class="programlisting">
int i, j;
int a[10], b[10];
for ( i = 0; i < 10; i++ ) {
  j = a[i];
  b[i] = j;
}</pre>
<p>Memcheck emits no complaints about this, since it merely copies
uninitialised values from <code class="varname">a[]</code> into
<code class="varname">b[]</code>, and doesn't use them in a way which could
affect the behaviour of the program. However, if
the loop is changed to:</p>
<pre class="programlisting">
for ( i = 0; i < 10; i++ ) {
  j += a[i];
}
if ( j == 77 )
  printf("hello there\n");
</pre>
<p>then Memcheck will complain, at the
<code class="computeroutput">if</code>, that the condition depends on
uninitialised values. Note that it <span class="command"><strong>doesn't</strong></span> complain
at the <code class="varname">j += a[i];</code>, since at that point the
undefinedness is not "observable". It's only when a decision has to be
made as to whether or not to do the <code class="function">printf</code> -- an
observable action of your program -- that Memcheck complains.</p>
<p>Most low level operations, such as adds, cause Memcheck to use the
V bits for the operands to calculate the V bits for the result.
Even if
the result is partially or wholly undefined, it does not
complain.</p>
<p>Checks on definedness only occur in three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, in which case Memcheck checks
the definedness of parameters as required.</p>
<p>If a check should detect undefinedness, an error message is
issued. The resulting value is subsequently regarded as well-defined.
To do otherwise would give long chains of error messages. In other
words, once Memcheck reports an undefined value error, it tries to
avoid reporting further errors derived from that same undefined
value.</p>
<p>This sounds overcomplicated. Why not just check all reads from
memory, and complain if an undefined value is loaded into a CPU
register? Well, that doesn't work well, because perfectly legitimate C
programs routinely copy uninitialised values around in memory, and we
don't want endless complaints about that. Here's the canonical example.
Consider a struct like this:</p>
<pre class="programlisting">
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
</pre>
<p>The question to ask is: how large is <code class="varname">struct S</code>,
in bytes? An <code class="varname">int</code> is 4 bytes and a
<code class="varname">char</code> one byte, so perhaps a <code class="varname">struct S</code>
occupies 5 bytes? Wrong. All non-toy compilers we know
of will round the size of <code class="varname">struct S</code> up to a whole
number of words, in this case 8 bytes. Not doing this forces compilers
to generate truly appalling code for accessing arrays of
<code class="varname">struct S</code>'s on some architectures.</p>
<p>So <code class="varname">s1</code> occupies 8 bytes, yet only 5 of them will
be initialised.
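</p>
<p>The size claim is easy to check for yourself. The following is a minimal
sketch, assuming a typical ABI with a 4-byte <code class="varname">int</code>;
the helper <code class="function">padding_bytes</code> is a name invented for this
illustration, and the numbers printed depend on your compiler and platform.
On such an ABI one would expect a size of 8, an offset of 4 for
<code class="varname">c</code>, and 3 padding bytes.</p>
<pre class="programlisting">
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct S { int x; char c; };

/* Hypothetical helper: bytes at the end of struct S that belong to no member. */
static size_t padding_bytes(void)
{
    return sizeof(struct S) - (offsetof(struct S, c) + sizeof(char));
}

int main(void)
{
    /* x comes first, so c cannot start before the end of x. */
    assert(offsetof(struct S, c) >= sizeof(int));

    printf("sizeof(struct S) = %zu\n", sizeof(struct S));
    printf("offsetof(c)      = %zu\n", offsetof(struct S, c));
    printf("padding bytes    = %zu\n", padding_bytes());
    return 0;
}
</pre>
<p>The padding bytes are never written by the assignments to
<code class="varname">s1.x</code> and <code class="varname">s1.c</code>, so as far as
Memcheck is concerned they stay undefined.</p>
<p>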
For the assignment <code class="varname">s2 = s1</code>, GCC
generates code to copy all 8 bytes wholesale into <code class="varname">s2</code>
without regard for their meaning. If Memcheck simply checked values as
they came out of memory, it would yelp every time a structure assignment
like this happened. So the more complicated behaviour described above
is necessary. This allows GCC to copy
<code class="varname">s1</code> into <code class="varname">s2</code> any way it likes, and a
warning will only be emitted if the uninitialised values are later
used.</p>
</div>
<div class="sect2" title="4.5.2. Valid-address (A) bits">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.vaddress"></a>4.5.2. Valid-address (A) bits</h3></div></div></div>
<p>Notice that the previous subsection describes how the validity of
values is established and maintained without having to say whether the
program does or does not have the right to access any particular memory
location. We now consider the latter question.</p>
<p>As described above, every bit in memory or in the CPU has an
associated valid-value (V) bit. In addition, all bytes in memory, but
not in the CPU, have an associated valid-address (A) bit. This
indicates whether or not the program can legitimately read or write that
location. It does not give any indication of the validity of the data
at that location -- that's the job of the V bits -- only whether or not
the location may be accessed.</p>
<p>Every time your program reads or writes memory, Memcheck checks
the A bits associated with the address. If any of them indicate an
invalid address, an error is emitted. Note that the reads and writes
themselves do not change the A bits, only consult them.</p>
<p>So how do the A bits get set/cleared?
Like this:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>When the program starts, all the global data areas are
 marked as accessible.</p></li>
<li class="listitem"><p>When the program does
 <code class="function">malloc</code>/<code class="computeroutput">new</code>,
 the A bits for exactly the area allocated, and not a byte more,
 are marked as accessible. Upon freeing the area the A bits are
 changed to indicate inaccessibility.</p></li>
<li class="listitem"><p>When the stack pointer register (<code class="literal">SP</code>) moves
 up or down, A bits are set. The rule is that the area from
 <code class="literal">SP</code> up to the base of the stack is marked as
 accessible, and below <code class="literal">SP</code> is inaccessible. (If
 that sounds illogical, bear in mind that the stack grows down, not
 up, on almost all Unix systems, including GNU/Linux.) Tracking
 <code class="literal">SP</code> like this has the useful side-effect that the
 section of stack used by a function for local variables etc is
 automatically marked accessible on function entry and inaccessible
 on exit.</p></li>
<li class="listitem"><p>When doing system calls, A bits are changed appropriately.
 For example, <code class="literal">mmap</code>
 magically makes files appear in the process'
 address space, so the A bits must be updated if <code class="literal">mmap</code>
 succeeds.</p></li>
<li class="listitem"><p>Optionally, your program can tell Memcheck about such changes
 explicitly, using the client request mechanism described
 above.</p></li>
</ul></div>
</div>
<div class="sect2" title="4.5.3. Putting it all together">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.together"></a>4.5.3. Putting it all together</h3></div></div></div>
<p>Memcheck's checking machinery can be summarised as
follows:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Each byte in memory has 8 associated V (valid-value) bits,
 saying whether or not the byte has a defined value, and a single A
 (valid-address) bit, saying whether or not the program currently has
 the right to read/write that address. As mentioned above, heavy
 use of compression means the overhead is typically around 25%.</p></li>
<li class="listitem"><p>When memory is read or written, the relevant A bits are
 consulted. If they indicate an invalid address, Memcheck emits an
 Invalid read or Invalid write error.</p></li>
<li class="listitem"><p>When memory is read into the CPU's registers, the relevant V
 bits are fetched from memory and stored in the simulated CPU.
They
 are not consulted.</p></li>
<li class="listitem"><p>When a register is written out to memory, the V bits for that
 register are written back to memory too.</p></li>
<li class="listitem"><p>When values in CPU registers are used to generate a memory
 address, or to determine the outcome of a conditional branch, the V
 bits for those values are checked, and an error emitted if any of
 them are undefined.</p></li>
<li class="listitem"><p>When values in CPU registers are used for any other purpose,
 Memcheck computes the V bits for the result, but does not check
 them.</p></li>
<li class="listitem"><p>Once the V bits for a value in the CPU have been checked, they
 are then set to indicate validity. This avoids long chains of
 errors.</p></li>
<li class="listitem">
<p>When values are loaded from memory, Memcheck checks the A bits
 for that location and issues an illegal-address warning if needed.
 In that case, the V bits loaded are forced to indicate Valid,
 despite the location being invalid.</p>
<p>This apparently strange choice reduces the amount of confusing
 information presented to the user. It avoids the unpleasant
 phenomenon in which memory is read from a place which is both
 unaddressable and contains invalid values, and, as a result, you get
 not only an invalid-address (read/write) error, but also a
 potentially large set of uninitialised-value errors, one for every
 time the value is used.</p>
<p>There is a hazy boundary case to do with multi-byte loads from
 addresses which are partially valid and partially invalid. See
 the description of the option <code class="option">--partial-loads-ok</code> for details.
 </p>
</li>
</ul></div>
<p>Memcheck intercepts calls to <code class="function">malloc</code>,
<code class="function">calloc</code>, <code class="function">realloc</code>,
<code class="function">valloc</code>, <code class="function">memalign</code>,
<code class="function">free</code>, <code class="computeroutput">new</code>,
<code class="computeroutput">new[]</code>,
<code class="computeroutput">delete</code> and
<code class="computeroutput">delete[]</code>. The behaviour you get
is:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="function">malloc</code>/<code class="function">new</code>/<code class="computeroutput">new[]</code>:
 the returned memory is marked as addressable but not having valid
 values. This means you have to write to it before you can read
 it.</p></li>
<li class="listitem"><p><code class="function">calloc</code>: returned memory is marked both
 addressable and valid, since <code class="function">calloc</code> clears
 the area to zero.</p></li>
<li class="listitem"><p><code class="function">realloc</code>: if the new size is larger than
 the old, the new section is addressable but invalid, as with
 <code class="function">malloc</code>. If the new size is smaller, the
 dropped-off section is marked as unaddressable. You may only pass to
 <code class="function">realloc</code> a pointer previously issued to you by
 <code class="function">malloc</code>/<code class="function">calloc</code>/<code class="function">realloc</code>.</p></li>
<li class="listitem"><p><code class="function">free</code>/<code class="computeroutput">delete</code>/<code class="computeroutput">delete[]</code>:
 you may only pass to these functions a pointer previously issued
 to you by the corresponding allocation function. Otherwise,
 Memcheck complains.
If the pointer is indeed valid, Memcheck
 marks the entire area it points at as unaddressable, and places
 the block in the freed-blocks-queue. The aim is to defer as long
 as possible reallocation of this block. Until that happens, all
 attempts to access it will elicit an invalid-address error, as you
 would hope.</p></li>
</ul></div>
</div>
</div>
<div class="sect1" title="4.6. Client Requests">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.clientreqs"></a>4.6. Client Requests</h2></div></div></div>
<p>The following client requests are defined in
<code class="filename">memcheck.h</code>.
See <code class="filename">memcheck.h</code> for exact details of their
arguments.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="varname">VALGRIND_MAKE_MEM_NOACCESS</code>,
 <code class="varname">VALGRIND_MAKE_MEM_UNDEFINED</code> and
 <code class="varname">VALGRIND_MAKE_MEM_DEFINED</code>.
 These mark address ranges as completely inaccessible,
 accessible but containing undefined data, and accessible and
 containing defined data, respectively.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</code>.
 This is just like <code class="varname">VALGRIND_MAKE_MEM_DEFINED</code> but only
 affects those bytes that are already addressable.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_CHECK_MEM_IS_ADDRESSABLE</code> and
 <code class="varname">VALGRIND_CHECK_MEM_IS_DEFINED</code>: check immediately
 whether or not the given address range has the relevant property,
 and if not, print an error message. Also, for the convenience of
 the client, returns zero if the relevant property holds; otherwise,
 the returned value is the address of the first byte for which the
 property is not true.
Always returns 0 when not run on
 Valgrind.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_CHECK_VALUE_IS_DEFINED</code>: a quick and easy
 way to find out whether Valgrind thinks a particular value
 (lvalue, to be precise) is addressable and defined. Prints an error
 message if not. It has no return value.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_DO_LEAK_CHECK</code>: does a full memory leak
 check (like <code class="option">--leak-check=full</code>) right now.
 This is useful for incrementally checking for leaks between arbitrary
 places in the program's execution. It has no return value.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_DO_QUICK_LEAK_CHECK</code>: like
 <code class="varname">VALGRIND_DO_LEAK_CHECK</code>, except it produces only a leak
 summary (like <code class="option">--leak-check=summary</code>).
 It has no return value.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_COUNT_LEAKS</code>: fills in the four
 arguments with the number of bytes of memory found by the previous
 leak check to be leaked (i.e. the sum of direct leaks and indirect leaks),
 dubious, reachable and suppressed. This is useful in test harness code,
 after calling <code class="varname">VALGRIND_DO_LEAK_CHECK</code> or
 <code class="varname">VALGRIND_DO_QUICK_LEAK_CHECK</code>.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_COUNT_LEAK_BLOCKS</code>: identical to
 <code class="varname">VALGRIND_COUNT_LEAKS</code> except that it returns the
 number of blocks rather than the number of bytes in each
 category.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_GET_VBITS</code> and
 <code class="varname">VALGRIND_SET_VBITS</code>: allow you to get and set the
 V (validity) bits for an address range.
You should probably only
 set V bits that you have got with
 <code class="varname">VALGRIND_GET_VBITS</code>. Only for those who really
 know what they are doing.</p></li>
<li class="listitem">
<p><code class="varname">VALGRIND_CREATE_BLOCK</code> and
 <code class="varname">VALGRIND_DISCARD</code>. <code class="varname">VALGRIND_CREATE_BLOCK</code>
 takes an address, a number of bytes and a character string. The
 specified address range is then associated with that string. When
 Memcheck reports an invalid access to an address in the range, it
 will describe it in terms of this block rather than in terms of
 any other block it knows about. Note that the use of this macro
 does not actually change the state of memory in any way -- it
 merely gives a name for the range.
 </p>
<p>At some point you may want Memcheck to stop reporting errors
 in terms of the block named
 by <code class="varname">VALGRIND_CREATE_BLOCK</code>. To make this
 possible, <code class="varname">VALGRIND_CREATE_BLOCK</code> returns a
 "block handle", which is a C <code class="varname">int</code> value. You
 can pass this block handle to <code class="varname">VALGRIND_DISCARD</code>.
 After doing so, Valgrind will no longer relate addressing errors
 in the specified range to the block. Passing invalid handles to
 <code class="varname">VALGRIND_DISCARD</code> is harmless.
 </p>
</li>
</ul></div>
</div>
<div class="sect1" title="4.7. Memory Pools: describing and working with custom allocators">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.mempools"></a>4.7. Memory Pools: describing and working with custom allocators</h2></div></div></div>
<p>Some programs use custom memory allocators, often for performance
reasons.
Left to itself, Memcheck is unable to understand the
behaviour of custom allocation schemes as well as it understands the
standard allocators, and so may miss errors and leaks in your program. What
this section describes is a way to give Memcheck enough of a description of
your custom allocator that it can make at least some sense of what is
happening.</p>
<p>There are many different sorts of custom allocator, so Memcheck
attempts to reason about them using a loose, abstract model. We
use the following terminology when describing custom allocation
systems:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Custom allocation involves a set of independent "memory pools".
 </p></li>
<li class="listitem"><p>Memcheck's notion of a memory pool consists of a single "anchor
 address" and a set of non-overlapping "chunks" associated with the
 anchor address.</p></li>
<li class="listitem"><p>Typically a pool's anchor address is the address of a
 book-keeping "header" structure.</p></li>
<li class="listitem"><p>Typically the pool's chunks are drawn from a contiguous
 "superblock" acquired through the system
 <code class="function">malloc</code> or
 <code class="function">mmap</code>.</p></li>
</ul></div>
<p>Keep in mind that the last two points above say "typically": the
Valgrind mempool client request API is intentionally vague about the
exact structure of a mempool. There is no specific mention made of
headers or superblocks.
Nevertheless, the following picture may help
elucidate the intention of the terms in the API:</p>
<pre class="programlisting">
   "pool"
   (anchor address)
   |
   v
   +--------+---+
   | header | o |
   +--------+-|-+
              |
              v                  superblock
              +------+---+--------------+---+------------------+
              |      |rzB|  allocation  |rzB|                  |
              +------+---+--------------+---+------------------+
                         ^              ^
                         |              |
                       "addr"     "addr"+"size"
</pre>
<p>
Note that the header and the superblock may be contiguous or
discontiguous, and there may be multiple superblocks associated with a
single header; such variations are opaque to Memcheck. The API
only requires that your allocation scheme can present sensible values
of "pool", "addr" and "size".</p>
<p>
Typically, before making client requests related to mempools, a client
program will have allocated such a header and superblock for their
mempool, and marked the superblock NOACCESS using the
<code class="varname">VALGRIND_MAKE_MEM_NOACCESS</code> client request.</p>
<p>
When dealing with mempools, the goal is to maintain a particular
invariant condition: that Memcheck believes the unallocated portions
of the pool's superblock (including redzones) are NOACCESS.
To
maintain this invariant, the client program must ensure that the
superblock starts out in that state; Memcheck cannot make it so, since
Memcheck never explicitly learns about the superblock of a pool, only
the allocated chunks within the pool.</p>
<p>
Once the header and superblock for a pool are established and properly
marked, a program can use the following client requests to
inform Memcheck about changes to the state of a mempool:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>
  <code class="varname">VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed)</code>:
  This request registers the address <code class="varname">pool</code> as the anchor
  address for a memory pool. It also provides a size
  <code class="varname">rzB</code>, specifying how large the redzones placed around
  chunks allocated from the pool should be. Finally, it provides an
  <code class="varname">is_zeroed</code> argument that specifies whether the pool's
  chunks are zeroed (more precisely: defined) when allocated.
  </p>
<p>
  Upon completion of this request, no chunks are associated with the
  pool. The request simply tells Memcheck that the pool exists, so that
  subsequent calls can refer to it as a pool.
  </p>
</li>
<li class="listitem"><p><code class="varname">VALGRIND_DESTROY_MEMPOOL(pool)</code>:
  This request tells Memcheck that a pool is being torn down. Memcheck
  then removes all records of chunks associated with the pool, as well
  as its record of the pool's existence. While destroying its records of
  a mempool, Memcheck resets the redzones of any live chunks in the pool
  to NOACCESS.
  </p></li>
<li class="listitem"><p><code class="varname">VALGRIND_MEMPOOL_ALLOC(pool, addr, size)</code>:
  This request informs Memcheck that a <code class="varname">size</code>-byte chunk
  has been allocated at <code class="varname">addr</code>, and associates the chunk with the
  specified
  <code class="varname">pool</code>. If the pool was created with nonzero
  <code class="varname">rzB</code> redzones, Memcheck will mark the
  <code class="varname">rzB</code> bytes before and after the chunk as NOACCESS. If
  the pool was created with the <code class="varname">is_zeroed</code> argument set,
  Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark
  the chunk as UNDEFINED.
  </p></li>
<li class="listitem"><p><code class="varname">VALGRIND_MEMPOOL_FREE(pool, addr)</code>:
  This request informs Memcheck that the chunk at <code class="varname">addr</code>
  should no longer be considered allocated. Memcheck will mark the chunk
  associated with <code class="varname">addr</code> as NOACCESS, and delete its
  record of the chunk's existence.
  </p></li>
<li class="listitem">
<p><code class="varname">VALGRIND_MEMPOOL_TRIM(pool, addr, size)</code>:
  This request trims the chunks associated with <code class="varname">pool</code>.
  The request only operates on chunks associated with
  <code class="varname">pool</code>. Trimming is formally defined as:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="circle">
<li class="listitem"><p>All chunks entirely inside the range
    <code class="varname">addr..(addr+size-1)</code> are preserved.</p></li>
<li class="listitem"><p>All chunks entirely outside the range
    <code class="varname">addr..(addr+size-1)</code> are discarded, as though
    <code class="varname">VALGRIND_MEMPOOL_FREE</code> had been called on them.
</p></li>
<li class="listitem"><p>All other chunks must intersect with the range
    <code class="varname">addr..(addr+size-1)</code>; areas outside the
    intersection are marked as NOACCESS, as though they had been
    independently freed with
    <code class="varname">VALGRIND_MEMPOOL_FREE</code>.</p></li>
</ul></div>
<p>This is a somewhat rare request, but can be useful in
  implementing the type of mass-free operations common in custom
  LIFO allocators.</p>
</li>
<li class="listitem">
<p><code class="varname">VALGRIND_MOVE_MEMPOOL(poolA, poolB)</code>: This
  request informs Memcheck that the pool previously anchored at
  address <code class="varname">poolA</code> has moved to anchor address
  <code class="varname">poolB</code>. This is a rare request, typically only needed
  if you <code class="function">realloc</code> the header of a mempool.</p>
<p>No memory-status bits are altered by this request.</p>
</li>
<li class="listitem">
<p>
  <code class="varname">VALGRIND_MEMPOOL_CHANGE(pool, addrA, addrB,
  size)</code>: This request informs Memcheck that the chunk
  previously allocated at address <code class="varname">addrA</code> within
  <code class="varname">pool</code> has been moved and/or resized, and should be
  changed to cover the region <code class="varname">addrB..(addrB+size-1)</code>. This
  is a rare request, typically only needed if you
  <code class="function">realloc</code> a superblock or wish to extend a chunk
  without changing its memory-status bits.
  </p>
<p>No memory-status bits are altered by this request.
  </p>
</li>
<li class="listitem"><p><code class="varname">VALGRIND_MEMPOOL_EXISTS(pool)</code>:
  This request informs the caller whether or not Memcheck is currently
  tracking a mempool at anchor address <code class="varname">pool</code>.
It
  evaluates to 1 when there is a mempool associated with that address, 0
  otherwise. This is a rare request, only useful in circumstances when
  client code might have lost track of the set of active mempools.
  </p></li>
</ul></div>
</div>
<div class="sect1" title="4.8.&nbsp;Debugging MPI Parallel Programs with Valgrind">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.mpiwrap"></a>4.8.&nbsp;Debugging MPI Parallel Programs with Valgrind</h2></div></div></div>
<p>Memcheck supports debugging of distributed-memory applications
which use the MPI message passing standard. This support consists of a
library of wrapper functions for the
<code class="computeroutput">PMPI_*</code> interface. When incorporated
into the application's address space, either by direct linking or by
<code class="computeroutput">LD_PRELOAD</code>, the wrappers intercept
calls to <code class="computeroutput">PMPI_Send</code>,
<code class="computeroutput">PMPI_Recv</code>, etc. They then
use client requests to inform Memcheck of memory state changes caused
by the function being wrapped. This reduces the number of false
positives that Memcheck otherwise typically reports for MPI
applications.</p>
<p>The wrappers also check the size and definedness of buffers passed
as arguments to MPI functions, and so detect errors such as passing
undefined data to
<code class="computeroutput">PMPI_Send</code>, or receiving data into a
buffer which is too small.</p>
<p>Unlike most of the rest of Valgrind, the wrapper library is subject to a
BSD-style license, so you can link it into any code base you like.
See the top of <code class="computeroutput">mpi/libmpiwrap.c</code>
for license details.</p>
<div class="sect2" title="4.8.1.&nbsp;Building and installing the wrappers">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.build"></a>4.8.1.&nbsp;Building and installing the wrappers</h3></div></div></div>
<p>The wrapper library will be built automatically if possible.
Valgrind's configure script will look for a suitable
<code class="computeroutput">mpicc</code> to build it with. This must be
the same <code class="computeroutput">mpicc</code> you use to build the
MPI application you want to debug. By default, Valgrind tries
<code class="computeroutput">mpicc</code>, but you can specify a
different one by using the configure-time option
<code class="option">--with-mpicc</code>. Currently the
wrappers are only buildable with
<code class="computeroutput">mpicc</code>s which are based on GNU
GCC or Intel's C++ Compiler.</p>
<p>Check that the configure script prints a line like this:</p>
<pre class="programlisting">
checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
</pre>
<p>If it says <code class="computeroutput">... no</code>, your
<code class="computeroutput">mpicc</code> has failed to compile and link
a test MPI2 program.</p>
<p>If the configure test succeeds, continue in the usual way with
<code class="computeroutput">make</code> and <code class="computeroutput">make
install</code>. The final install tree should then contain
<code class="computeroutput">libmpiwrap-&lt;platform&gt;.so</code>.
</p>
<p>Compile up a test MPI program (eg, MPI hello-world) and try
this:</p>
<pre class="programlisting">
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-&lt;platform&gt;.so   \
           mpirun [args] $prefix/bin/valgrind ./hello
</pre>
<p>You should see something similar to the following:</p>
<pre class="programlisting">
valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
</pre>
<p>repeated for every process in the group. If you do not see
these, there is a build/installation problem of some kind.</p>
<p>The MPI functions to be wrapped are assumed to be in an ELF
shared object with soname matching
<code class="computeroutput">libmpi.so*</code>. This is known to be
correct at least for Open MPI and Quadrics MPI, and can easily be
changed if required.</p>
</div>
<div class="sect2" title="4.8.2.&nbsp;Getting started">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.gettingstarted"></a>4.8.2.&nbsp;Getting started</h3></div></div></div>
<p>Compile your MPI application as usual, taking care to link it
using the same <code class="computeroutput">mpicc</code> that your
Valgrind build was configured with.</p>
<p>
Use the following basic scheme to run your application on Valgrind with
the wrappers engaged:</p>
<pre class="programlisting">
MPIWRAP_DEBUG=[wrapper-args]                                  \
   LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-&lt;platform&gt;.so   \
   mpirun [mpirun-args]                                       \
   $prefix/bin/valgrind [valgrind-args]                       \
   [application] [app-args]
</pre>
<p>As an alternative to
<code class="computeroutput">LD_PRELOAD</code>ing
<code class="computeroutput">libmpiwrap-&lt;platform&gt;.so</code>, you can
simply link it to your application if desired.
This should not disturb
native behaviour of your application in any way.</p>
</div>
<div class="sect2" title="4.8.3.&nbsp;Controlling the wrapper library">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.controlling"></a>4.8.3.&nbsp;Controlling the wrapper library</h3></div></div></div>
<p>Environment variable
<code class="computeroutput">MPIWRAP_DEBUG</code> is consulted at
startup. The default behaviour is to print a starting banner</p>
<pre class="programlisting">
valgrind MPI wrappers 16386: Active for pid 16386
valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
</pre>
<p> and then be relatively quiet.</p>
<p>You can give a list of comma-separated options in
<code class="computeroutput">MPIWRAP_DEBUG</code>. These are</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="computeroutput">verbose</code>:
  show entries/exits of all wrappers. Also show extra
  debugging info, such as the status of outstanding
  <code class="computeroutput">MPI_Request</code>s resulting
  from uncompleted <code class="computeroutput">MPI_Irecv</code>s.</p></li>
<li class="listitem"><p><code class="computeroutput">quiet</code>:
  opposite of <code class="computeroutput">verbose</code>, only print
  anything when the wrappers want
  to report a detected programming error, or in case of catastrophic
  failure of the wrappers.</p></li>
<li class="listitem"><p><code class="computeroutput">warn</code>:
  by default, functions which lack proper wrappers
  are not commented on, just silently
  ignored.
The <code class="computeroutput">warn</code> option causes a warning to be printed for each unwrapped
  function used, up to a maximum of three warnings per function.</p></li>
<li class="listitem"><p><code class="computeroutput">strict</code>:
  print an error message and abort the program if
  a function lacking a wrapper is used.</p></li>
</ul></div>
<p>If you want to use Valgrind's XML output facility
(<code class="option">--xml=yes</code>), you should pass
<code class="computeroutput">quiet</code> in
<code class="computeroutput">MPIWRAP_DEBUG</code> so as to get rid of any
extraneous printing from the wrappers.</p>
</div>
<div class="sect2" title="4.8.4.&nbsp;Functions">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.limitations.functions"></a>4.8.4.&nbsp;Functions</h3></div></div></div>
<p>All MPI2 functions except
<code class="computeroutput">MPI_Wtick</code>,
<code class="computeroutput">MPI_Wtime</code> and
<code class="computeroutput">MPI_Pcontrol</code> have wrappers. The
first two are not wrapped because they return a
<code class="computeroutput">double</code>, which Valgrind's
function-wrap mechanism cannot handle (but it could easily be
extended to do so). <code class="computeroutput">MPI_Pcontrol</code> cannot be
wrapped as it has variable arity:
<code class="computeroutput">int MPI_Pcontrol(const int level, ...)</code></p>
<p>Most functions are wrapped with a default wrapper which does
nothing except complain or abort if it is called, depending on
settings in <code class="computeroutput">MPIWRAP_DEBUG</code> listed
above.
The following functions have "real", do-something-useful
wrappers:</p>
<pre class="programlisting">
PMPI_Send PMPI_Bsend PMPI_Ssend PMPI_Rsend

PMPI_Recv PMPI_Get_count

PMPI_Isend PMPI_Ibsend PMPI_Issend PMPI_Irsend

PMPI_Irecv
PMPI_Wait PMPI_Waitall
PMPI_Test PMPI_Testall

PMPI_Iprobe PMPI_Probe

PMPI_Cancel

PMPI_Sendrecv

PMPI_Type_commit PMPI_Type_free

PMPI_Pack PMPI_Unpack

PMPI_Bcast PMPI_Gather PMPI_Scatter PMPI_Alltoall
PMPI_Reduce PMPI_Allreduce PMPI_Op_create

PMPI_Comm_create PMPI_Comm_dup PMPI_Comm_free PMPI_Comm_rank PMPI_Comm_size

PMPI_Error_string
PMPI_Init PMPI_Initialized PMPI_Finalize
</pre>
<p>A few functions such as
<code class="computeroutput">PMPI_Address</code> are listed as
<code class="computeroutput">HAS_NO_WRAPPER</code>. They have no wrapper
at all as there is nothing worth checking, and giving a no-op wrapper
would reduce performance for no reason.</p>
<p>Note that the wrapper library can itself generate large
numbers of calls to the MPI implementation, especially when walking
complex types. The most common functions called are
<code class="computeroutput">PMPI_Extent</code>,
<code class="computeroutput">PMPI_Type_get_envelope</code>,
<code class="computeroutput">PMPI_Type_get_contents</code>, and
<code class="computeroutput">PMPI_Type_free</code>.</p>
</div>
<div class="sect2" title="4.8.5.&nbsp;Types">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.limitations.types"></a>4.8.5.&nbsp;Types</h3></div></div></div>
<p>MPI-1.1 structured types are supported, and walked exactly.
The currently supported combiners are
<code class="computeroutput">MPI_COMBINER_NAMED</code>,
<code class="computeroutput">MPI_COMBINER_CONTIGUOUS</code>,
<code class="computeroutput">MPI_COMBINER_VECTOR</code>,
<code class="computeroutput">MPI_COMBINER_HVECTOR</code>,
<code class="computeroutput">MPI_COMBINER_INDEXED</code>,
<code class="computeroutput">MPI_COMBINER_HINDEXED</code> and
<code class="computeroutput">MPI_COMBINER_STRUCT</code>. This should
cover all MPI-1.1 types. The mechanism (function
<code class="computeroutput">walk_type</code>) should extend easily to
cover MPI2 combiners.</p>
<p>MPI defines some named structured types
(<code class="computeroutput">MPI_FLOAT_INT</code>,
<code class="computeroutput">MPI_DOUBLE_INT</code>,
<code class="computeroutput">MPI_LONG_INT</code>,
<code class="computeroutput">MPI_2INT</code>,
<code class="computeroutput">MPI_SHORT_INT</code>,
<code class="computeroutput">MPI_LONG_DOUBLE_INT</code>) which are pairs
of some basic type and a C <code class="computeroutput">int</code>.
Unfortunately the MPI specification makes it impossible to look inside
these types and see where the fields are. Therefore these wrappers
assume the types are laid out as <code class="computeroutput">struct { float val;
int loc; }</code> (for
<code class="computeroutput">MPI_FLOAT_INT</code>), etc, and act
accordingly. This appears to be correct at least for Open MPI 1.0.2
and for Quadrics MPI.</p>
<p>If <code class="computeroutput">strict</code> is an option specified
in <code class="computeroutput">MPIWRAP_DEBUG</code>, the application
will abort if an unhandled type is encountered. Otherwise, the
application will print a warning message and continue.</p>
<p>Some effort is made to mark/check memory ranges corresponding to
arrays of values in a single pass.
This is important for performance
since asking Valgrind to mark/check any range, no matter how small,
carries quite a large constant cost. This optimisation is applied to
arrays of primitive types (<code class="computeroutput">double</code>,
<code class="computeroutput">float</code>,
<code class="computeroutput">int</code>,
<code class="computeroutput">long</code>, <code class="computeroutput">long
long</code>, <code class="computeroutput">short</code>,
<code class="computeroutput">char</code>, and <code class="computeroutput">long
double</code> on platforms where <code class="computeroutput">sizeof(long
double) == 8</code>). For arrays of all other types, the
wrappers handle each element individually and so there can be a very
large performance cost.</p>
</div>
<div class="sect2" title="4.8.6.&nbsp;Writing new wrappers">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.writingwrappers"></a>4.8.6.&nbsp;Writing new wrappers</h3></div></div></div>
<p>
For the most part the wrappers are straightforward. The only
significant complexity arises with nonblocking receives.</p>
<p>The issue is that <code class="computeroutput">MPI_Irecv</code>
is given the recv buffer and returns immediately, giving a handle
(<code class="computeroutput">MPI_Request</code>) for the transaction.
Later the user will have to poll for completion with
<code class="computeroutput">MPI_Wait</code> etc, and when the
transaction completes successfully, the wrappers have to mark ("paint") the
recv buffer. But the recv buffer details are not presented to
<code class="computeroutput">MPI_Wait</code> -- only the handle is. The
library therefore maintains a shadow table which associates
uncompleted <code class="computeroutput">MPI_Request</code>s with the
corresponding buffer address/count/type.
When an operation completes,
the table is searched for the associated address/count/type info, and
memory is marked accordingly.</p>
<p>Access to the table is guarded by a (POSIX pthreads) lock, so as
to make the library thread-safe.</p>
<p>The table is allocated with
<code class="computeroutput">malloc</code> and never
<code class="computeroutput">free</code>d, so it will show up in leak
checks.</p>
<p>Writing new wrappers should be fairly easy. The source file is
<code class="computeroutput">mpi/libmpiwrap.c</code>. If possible,
find an existing wrapper for a function of similar behaviour to the
one you want to wrap, and use it as a starting point. The wrappers
are organised in sections in the same order as the MPI 1.1 spec, to
aid navigation. When adding a wrapper, remember to comment out the
definition of the default wrapper in the long list of defaults at the
bottom of the file (do not remove it, just comment it out).</p>
</div>
<div class="sect2" title="4.8.7.&nbsp;What to expect when using the wrappers">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.whattoexpect"></a>4.8.7.&nbsp;What to expect when using the wrappers</h3></div></div></div>
<p>The wrappers should reduce Memcheck's false-error rate on MPI
applications. Because the wrapping is done at the MPI interface,
there will still potentially be a large number of errors reported in
the MPI implementation below the interface. The best you can do is
try to suppress them.</p>
<p>You may also find that the input-side (buffer
length/definedness) checks find errors in your MPI use, for example
passing too short a buffer to
<code class="computeroutput">MPI_Recv</code>.</p>
<p>Functions which are not wrapped may increase the false
error rate.
A possible approach is to run with
<code class="computeroutput">MPIWRAP_DEBUG</code> containing
<code class="computeroutput">warn</code>. This will show you functions
which lack proper wrappers but which are nevertheless used. You can
then write wrappers for them.
</p>
<p>A known source of potential false errors is the
<code class="computeroutput">PMPI_Reduce</code> family of functions, when
using a custom (user-defined) reduction function. In a reduction
operation, each node notionally sends data to a "central point" which
uses the specified reduction function to merge the data items into a
single item. Hence, in general, data is passed between nodes and fed
to the reduction function, but the wrapper library cannot mark the
transferred data as initialised before it is handed to the reduction
function, because all that happens "inside" the
<code class="computeroutput">PMPI_Reduce</code> call. As a result you
may see false positives reported in your reduction function.</p>
</div>
</div>
</div>
<div>
<br><table class="nav" width="100%" cellspacing="3" cellpadding="2" border="0" summary="Navigation footer">
<tr>
<td rowspan="2" width="40%" align="left">
<a accesskey="p" href="manual-core-adv.html">&lt;&lt;&nbsp;3.&nbsp;Using and understanding the Valgrind core: Advanced Topics</a>&nbsp;</td>
<td width="20%" align="center"><a accesskey="u" href="manual.html">Up</a></td>
<td rowspan="2" width="40%" align="right">&nbsp;<a accesskey="n" href="cg-manual.html">5.&nbsp;Cachegrind: a cache and branch-prediction profiler&nbsp;&gt;&gt;</a>
</td>
</tr>
<tr><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td></tr>
</table>
</div>
</body>
</html>