• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1<?xml version="1.0"?> <!-- -*- sgml -*- -->
2<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3          "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
4[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
5
6
7<chapter id="pc-manual"
8         xreflabel="Ptrcheck: an experimental heap, stack and global array overrun detector">
9  <title>Ptrcheck: an experimental heap, stack and global array overrun detector</title>
10
11<para>To use this tool, you must specify
12<option>--tool=exp-ptrcheck</option> on the Valgrind
13command line.</para>
14
15
16
17
18<sect1 id="pc-manual.overview" xreflabel="Overview">
19<title>Overview</title>
20
21<para>Ptrcheck is a tool for finding overruns of heap, stack
22and global arrays.  Its functionality overlaps somewhat with
23Memcheck's, but it is able to catch invalid accesses in a number of
24cases that Memcheck would miss.  A detailed comparison against
25Memcheck is presented below.</para>
26
27<para>Ptrcheck is composed of two almost completely independent tools
28that have been glued together.  One part,
29in <computeroutput>h_main.[ch]</computeroutput>, checks accesses
30through heap-derived pointers.  The other part, in
31<computeroutput>sg_main.[ch]</computeroutput>, checks accesses to
32stack and global arrays.  The remaining
33files <computeroutput>pc_{common,main}.[ch]</computeroutput>, provide
34common error-management and coordination functions, so as to make it
35appear as a single tool.</para>
36
37<para>The heap-check part is an extensively-hacked (largely rewritten)
38version of the experimental "Annelid" tool developed and described by
39Nicholas Nethercote and Jeremy Fitzhardinge.  The stack- and global-
40check part uses a heuristic approach derived from an observation about
41the likely forms of stack and global array accesses, and, as far as is
42known, is entirely novel.</para>
43
44</sect1>
45
46
47
48
49<sect1 id="pc-manual.options" xreflabel="Ptrcheck Command-line Options">
50<title>Ptrcheck Command-line Options</title>
51
52<para>Ptrcheck-specific command-line options are:</para>
53
54<!-- start of xi:include in the manpage -->
55<variablelist id="pc.opts.list">
56
57  <varlistentry id="opt.enable-sg-checks" xreflabel="--enable-sg-checks">
58    <term>
59      <option><![CDATA[--enable-sg-checks=no|yes
60      [default: yes] ]]></option>
61    </term>
62    <listitem>
63      <para>By default, Ptrcheck checks for overruns of stack, global
64       and heap arrays.
65       With <varname>--enable-sg-checks=no</varname>, the stack and
66       global array checks are omitted, and only heap checking is
67       performed.  This can be useful because the stack and global
68       checks are quite expensive, so omitting them speeds Ptrcheck up
69       a lot.
70      </para>
71    </listitem>
72  </varlistentry>
73
74  <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
75    <term>
76      <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
77    </term>
78    <listitem>
79      <para>This option has the same meaning as it does for
80      Memcheck.</para>
81      <para>Controls how Ptrcheck handles word-sized, word-aligned
82      loads which partially overlap the end of heap blocks -- that is,
83      some of the bytes in the word are validly addressable, but
84      others are not.  When <varname>yes</varname>, such loads do not
85      produce an address error.  When <varname>no</varname> (the
86      default), loads from partially invalid addresses are treated the
87      same as loads from completely invalid addresses: an illegal heap
88      access error is issued.
89      </para>
90      <para>Note that code that behaves in this way is in violation of
91      the the ISO C/C++ standards, and should be considered broken.  If
92      at all possible, such code should be fixed.  This option should be
93      used only as a last resort.</para>
94    </listitem>
95  </varlistentry>
96
97</variablelist>
98<!-- end of xi:include in the manpage -->
99
100</sect1>
101
102
103
104
105<sect1 id="pc-manual.how-works.heap-checks"
106       xreflabel="How Ptrcheck Works: Heap Checks">
107<title>How Ptrcheck Works: Heap Checks</title>
108
109<para>Ptrcheck can check for invalid uses of heap pointers, including
110out of range accesses and accesses to freed memory.  The mechanism is
111however completely different from Memcheck's, and the checking is more
112powerful.</para>
113
114<para>For each pointer in the program, Ptrcheck keeps track of which
115heap block (if any) it was derived from.  Then, when an access is made
116through that pointer, Ptrcheck compares the access address with the
117bounds of the associated block, and reports an error if the address is
118out of bounds, or if the block has been freed.</para>
119
120<para>Of course it is rarely the case that one wants to access a block
121only at the exact address returned by <function>malloc</function> et al.
122Ptrcheck understands that adding or subtracting offsets from a pointer to a
123block results in a pointer to the same block.</para>
124
125<para>At a fundamental level, this scheme works because a correct
126program cannot make assumptions about the addresses returned by
127<function>malloc</function> et al.  In particular it cannot make any
128assumptions about the differences in addresses returned by subsequent calls
129to <function>malloc</function> et al.  Hence there are very few ways to take
130an address returned by <function>malloc</function>, modify it, and still
131have a valid address.  In short, the only allowable operations are adding
132and subtracting other non-pointer values.  Almost all other operations
133produce a value which cannot possibly be a valid pointer.</para>
134
135</sect1>
136
137
138
139<sect1 id="pc-manual.how-works.sg-checks"
140       xreflabel="How Ptrcheck Works: Stack and Global Checks">
141<title>How Ptrcheck Works: Stack and Global Checks</title>
142
143<para>When a source file is compiled
144with <option>-g</option>, the compiler attaches DWARF3
145debugging information which describes the location of all stack and
146global arrays in the file.</para>
147
148<para>Checking of accesses to such arrays would then be relatively
149simple, if the compiler could also tell us which array (if any) each
150memory referencing instruction was supposed to access.  Unfortunately
151the DWARF3 debugging format does not provide a way to represent such
152information, so we have to resort to a heuristic technique to
153approximate the same information.  The key observation is that
154   <emphasis>
155   if a memory referencing instruction accesses inside a stack or
156   global array once, then it is highly likely to always access that
157   same array</emphasis>.</para>
158
159<para>To see how this might be useful, consider the following buggy
160fragment:</para>
161<programlisting><![CDATA[
162   { int i, a[10];  // both are auto vars
163     for (i = 0; i <= 10; i++)
164        a[i] = 42;
165   }
166]]></programlisting>
167
168<para>At run time we will know the precise address
169of <computeroutput>a[]</computeroutput> on the stack, and so we can
170observe that the first store resulting from <computeroutput>a[i] =
17142</computeroutput> writes <computeroutput>a[]</computeroutput>, and
172we will (correctly) assume that that instruction is intended always to
173access <computeroutput>a[]</computeroutput>.  Then, on the 11th
174iteration, it accesses somewhere else, possibly a different local,
175possibly an un-accounted for area of the stack (eg, spill slot), so
176Ptrcheck reports an error.</para>
177
178<para>There is an important caveat.</para>
179
180<para>Imagine a function such as <function>memcpy</function>, which is used
181to read and write many different areas of memory over the lifetime of the
182program.  If we insist that the read and write instructions in its memory
183copying loop only ever access one particular stack or global variable, we
184will be flooded with errors resulting from calls to
185<function>memcpy</function>.</para>
186
187<para>To avoid this problem, Ptrcheck instantiates fresh likely-target
188records for each entry to a function, and discards them on exit.  This
189allows detection of cases where (e.g.) <function>memcpy</function> overflows
190its source or destination buffers for any specific call, but does not carry
191any restriction from one call to the next.  Indeed, multiple threads may be
192multiple simultaneous calls to (e.g.) <function>memcpy</function> without
193mutual interference.</para>
194
195</sect1>
196
197
198
199
200<sect1 id="pc-manual.cmp-w-memcheck"
201       xreflabel="Comparison with Memcheck">
202<title>Comparison with Memcheck</title>
203
204<para>Memcheck does not do any access checks for stack or global arrays, so
205the presence of those in Ptrcheck is a straight win.  (But see
206"Limitations" below).</para>
207
208<para>Memcheck and Ptrcheck use different approaches for checking heap
209accesses.  Memcheck maintains bitmaps telling it which areas of memory
210are accessible and which are not.  If a memory access falls in an
211unaccessible area, it reports an error.  By marking the 16 bytes
212before and after an allocated block unaccessible, Memcheck is able to
213detect small over- and underruns of the block.  Similarly, by marking
214freed memory as unaccessible, Memcheck can detect all accesses to
215freed memory.</para>
216
217<para>Memcheck's approach is simple.  But it's also weak.  It can't
218catch block overruns beyond 16 bytes.  And, more generally, because it
219focusses only on the question "is the target address accessible", it
220fails to detect invalid accesses which just happen to fall within some
221other valid area.  This is not improbable, especially in crowded areas
222of the process' address space.</para>
223
224<para>Ptrcheck's approach is to keep track of pointers derived from
225heap blocks.  It tracks pointers which are derived directly from calls
226to <function>malloc</function> et al, but also ones derived indirectly, by
227adding or subtracting offsets from the directly-derived pointers.  When a
228pointer is finally used to access memory, Ptrcheck compares the access
229address with that of the block it was originally derived from, and
230reports an error if the access address is not within the block
231bounds.</para>
232
233<para>Consequently Ptrcheck can detect any out of bounds access
234through a heap-derived pointer, no matter how far from the original
235block it is.</para>
236
237<para>A second advantage is that Ptrcheck is better at detecting
238accesses to blocks freed very far in the past.  Memcheck can detect
239these too, but only for blocks freed relatively recently.  To detect
240accesses to a freed block, Memcheck must make it inaccessible, hence
241requiring a space overhead proportional to the size of the block.  If
242the blocks are large, Memcheck will have to make them available for
243re-allocation relatively quickly, thereby losing the ability to detect
244invalid accesses to them.</para>
245
246<para>By contrast, Ptrcheck has a constant per-block space requirement
247of four machine words, for detection of accesses to freed blocks.  A
248freed block can be reallocated immediately, yet Ptrcheck can still
249detect all invalid accesses through any pointers derived from the old
250allocation, providing only that the four-word descriptor for the old
251allocation is stored.  For example, on a 64-bit machine, to detect
252accesses in any of the most recently freed 10 million blocks, Ptrcheck
253will require only 320MB of extra storage.  Achieving the same level of
254detection with Memcheck is close to impossible and would likely
255involve several gigabytes of extra storage.</para>
256
257<para>Having said all that, remember that Memcheck performs uninitialised
258value checking, invalid and mismatched free checking, overlap checking, and
259leak checking, none of which Ptrcheck do.  Memcheck has also benefitted from
260years of refinement, tuning, and experience with production-level usage, and
261so is much faster than Ptrcheck as it currently stands.
262</para>
263
264<para>Consequently we recommend you first make your programs run Memcheck
265clean.  Once that's done, try Ptrcheck to see if you can shake out any
266further heap, global or stack errors.</para>
267
268</sect1>
269
270
271
272
273
274<sect1 id="pc-manual.limitations"
275       xreflabel="Limitations">
276<title>Limitations</title>
277
278<para>This is an experimental tool, which relies rather too heavily on some
279not-as-robust-as-I-would-like assumptions on the behaviour of correct
280programs.  There are a number of limitations which you should be aware
281of.</para>
282
283<itemizedlist>
284
285  <listitem>
286   <para>Heap checks: Ptrcheck can occasionally lose track of, or
287   become confused about, which heap block a given pointer has been
288   derived from.  This can cause it to falsely report errors, or to
289   miss some errors.  This is not believed to be a serious
290   problem.</para>
291  </listitem>
292
293  <listitem>
294   <para>Heap checks: Ptrcheck only tracks pointers that are stored
295   properly aligned in memory.  If a pointer is stored at a misaligned
296   address, and then later read again, Ptrcheck will lose track of
297   what it points at.  Similar problem if a pointer is split into
298   pieces and later reconsitituted.</para>
299  </listitem>
300
301  <listitem>
302   <para>Heap checks: Ptrcheck needs to "understand" which system
303   calls return pointers and which don't.  Many, but not all system
304   calls are handled.  If an unhandled one is encountered, Ptrcheck will
305   abort.  Fortunately, adding support for a new syscall is very
306   easy.</para>
307  </listitem>
308
309  <listitem>
310   <para>Stack checks: It follows from the description above (<xref
311   linkend="pc-manual.how-works.sg-checks"/>) that the first access by a
312   memory referencing instruction to a stack or global array creates an
313   association between that instruction and the array, which is checked on
314   subsequent accesses by that instruction, until the containing function
315   exits.  Hence, the first access by an instruction to an array (in any
316   given function instantiation) is not checked for overrun, since Ptrcheck
317   uses that as the "example" of how subsequent accesses should
318   behave.</para>
319  </listitem>
320
321  <listitem>
322   <para>Stack checks: Similarly, and more serious, it is clearly
323   possible to write legitimate pieces of code which break the basic
324   assumption upon which the stack/global checking rests.  For
325   example:</para>
326
327<programlisting><![CDATA[
328  { int a[10], b[10], *p, i;
329    for (i = 0; i < 10; i++) {
330       p = /* arbitrary condition */  ? &a[i]  : &b[i];
331       *p = 42;
332    }
333  }
334]]></programlisting>
335
336   <para>In this case the store sometimes
337   accesses <computeroutput>a[]</computeroutput> and
338   sometimes <computeroutput>b[]</computeroutput>, but in no cases is
339   the addressed array overrun.  Nevertheless the change in target
340   will cause an error to be reported.</para>
341
342   <para>It is hard to see how to get around this problem.  The only
343   mitigating factor is that such constructions appear very rare, at
344   least judging from the results using the tool so far.  Such a
345   construction appears only once in the Valgrind sources (running
346   Valgrind on Valgrind) and perhaps two or three times for a start
347   and exit of Firefox.  The best that can be done is to suppress the
348   errors.</para>
349  </listitem>
350
351  <listitem>
352   <para>Performance: the stack/global checks require reading all of
353   the DWARF3 type and variable information on the executable and its
354   shared objects.  This is computationally expensive and makes
355   startup quite slow.  You can expect debuginfo reading time to be in
356   the region of a minute for an OpenOffice sized application, on a
357   2.4 GHz Core 2 machine.  Reading this information also requires a
358   lot of memory.  To make it viable, Ptrcheck goes to considerable
359   trouble to compress the in-memory representation of the DWARF3
360   data, which is why the process of reading it appears slow.</para>
361  </listitem>
362
363  <listitem>
364   <para>Performance: Ptrcheck runs slower than Memcheck.  This is
365   partly due to a lack of tuning, but partly due to algorithmic
366   difficulties.  The heap-check side is potentially quite fast.  The
367   stack and global checks can sometimes require a number of range
368   checks per memory access, and these are difficult to short-circuit
369   (despite considerable efforts having been made).
370   </para>
371  </listitem>
372
373  <listitem>
374   <para>Coverage: the heap checking is relatively robust, requiring
375   only that Ptrcheck can see calls to <function>malloc</function> et al.
376   In that sense it has debug-info requirements comparable with Memcheck,
377   and is able to heap-check programs even with no debugging information
378   attached.</para>
379
380   <para>Stack/global checking is much more fragile.  If a shared
381   object does not have debug information attached, then Ptrcheck will
382   not be able to determine the bounds of any stack or global arrays
383   defined within that shared object, and so will not be able to check
384   accesses to them.  This is true even when those arrays are accessed
385   from some other shared object which was compiled with debug
386   info.</para>
387
388   <para>At the moment Ptrcheck accepts objects lacking debuginfo
389   without comment.  This is dangerous as it causes Ptrcheck to
390   silently skip stack and global checking for such objects.  It would
391   be better to print a warning in such circumstances.</para>
392  </listitem>
393
394  <listitem>
395   <para>Coverage: Ptrcheck checks that the areas read or written by
396   system calls do not overrun heap blocks.  But it doesn't currently
397   check them for overruns stack and global arrays.  This would be
398   easy to add.</para>
399  </listitem>
400
401  <listitem>
402   <para>Platforms: the stack/global checks won't work properly on any
403   PowerPC platforms, only on x86 and amd64 targets.  That's because
404   the stack and global checking requires tracking function calls and
405   exits reliably, and there's no obvious way to do it with the PPC
406   ABIs.  (In comparison, with the x86 and amd64 ABIs this is relatively
407   straightforward.)</para>
408  </listitem>
409
410  <listitem>
411   <para>Robustness: related to the previous point.  Function
412   call/exit tracking for x86/amd64 is believed to work properly even
413   in the presence of longjmps within the same stack (although this
414   has not been tested).  However, code which switches stacks is
415   likely to cause breakage/chaos.</para>
416  </listitem>
417</itemizedlist>
418
419</sect1>
420
421
422
423
424
425<sect1 id="pc-manual.todo-user-visible"
426       xreflabel="Still To Do: User-visible Functionality">
427<title>Still To Do: User-visible Functionality</title>
428
429<itemizedlist>
430
431  <listitem>
432   <para>Extend system call checking to work on stack and global arrays.</para>
433  </listitem>
434
435  <listitem>
436   <para>Print a warning if a shared object does not have debug info
437   attached, or if, for whatever reason, debug info could not be
438   found, or read.</para>
439  </listitem>
440
441</itemizedlist>
442
443</sect1>
444
445
446
447
448<sect1 id="pc-manual.todo-implementation"
449       xreflabel="Still To Do: Implementation Tidying">
450<title>Still To Do: Implementation Tidying</title>
451
452<para>Items marked CRITICAL are considered important for correctness:
453non-fixage of them is liable to lead to crashes or assertion failures
454in real use.</para>
455
456<itemizedlist>
457
458  <listitem>
459   <para>h_main.c: make N_FREED_SEGS command-line configurable.</para>
460  </listitem>
461
462  <listitem>
463   <para> sg_main.c: Improve the performance of the stack / global
464   checks by doing some up-front filtering to ignore references in
465   areas which "obviously" can't be stack or globals.  This will
466   require using information that m_aspacemgr knows about the address
467   space layout.</para>
468  </listitem>
469
470  <listitem>
471   <para>h_main.c: get rid of the last_seg_added hack; add suitable
472   plumbing to the core/tool interface to do this cleanly.</para>
473  </listitem>
474
475  <listitem>
476   <para>h_main.c: move vast amounts of arch-dependent ugliness
477   (get_IntRegInfo et al) to its own source file, a la
478   mc_machine.c.</para>
479  </listitem>
480
481  <listitem>
482   <para>h_main.c: make the lossage-check stuff work again, as a way
483   of doing quality assurance on the implementation.</para>
484  </listitem>
485
486  <listitem>
487   <para>h_main.c: schemeEw_Atom: don't generate a call to
488   nonptr_or_unknown, this is really stupid, since it could be done at
489   translation time instead.</para>
490  </listitem>
491
492  <listitem>
493   <para>CRITICAL: h_main.c: h_instrument (main instrumentation fn):
494   generate shadows for word-sized temps defined in the block's
495   preamble.  (Why does this work at all, as it stands?)</para>
496  </listitem>
497
498  <listitem>
499   <para>sg_main.c: fix compute_II_hash to make it a bit more sensible
500   for ppc32/64 targets (except that sg_ doesn't work on ppc32/64
501   targets, so this is a bit academic at the moment).</para>
502  </listitem>
503
504</itemizedlist>
505
506</sect1>
507
508
509
510</chapter>
511