<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "vg-entities.xml"> %vg-entities; ]>
5
6
7<chapter id="manual-core-adv" xreflabel="Valgrind's core: advanced topics">
8<title>Using and understanding the Valgrind core: Advanced Topics</title>
9
10<para>This chapter describes advanced aspects of the Valgrind core
11services, which are mostly of interest to power users who wish to
12customise and modify Valgrind's default behaviours in certain useful
13ways.  The subjects covered are:</para>
14
15<itemizedlist>
16  <listitem><para>The "Client Request" mechanism</para></listitem>
17  <listitem><para>Debugging your program using Valgrind's gdbserver
18      and GDB</para></listitem>
19  <listitem><para>Function Wrapping</para></listitem>
20</itemizedlist>
21
22
23
24<sect1 id="manual-core-adv.clientreq"
25       xreflabel="The Client Request mechanism">
26<title>The Client Request mechanism</title>
27
28<para>Valgrind has a trapdoor mechanism via which the client
29program can pass all manner of requests and queries to Valgrind
30and the current tool.  Internally, this is used extensively
31to make various things work, although that's not visible from the
32outside.</para>
33
34<para>For your convenience, a subset of these so-called client
35requests is provided to allow you to tell Valgrind facts about
36the behaviour of your program, and also to make queries.
37In particular, your program can tell Valgrind about things that it
38otherwise would not know, leading to better results.
39</para>
40
41<para>Clients need to include a header file to make this work.
42Which header file depends on which client requests you use.  Some
43client requests are handled by the core, and are defined in the
44header file <filename>valgrind/valgrind.h</filename>.  Tool-specific
45header files are named after the tool, e.g.
46<filename>valgrind/memcheck.h</filename>.  Each tool-specific header file
47includes <filename>valgrind/valgrind.h</filename> so you don't need to
48include it in your client if you include a tool-specific header.  All header
49files can be found in the <literal>include/valgrind</literal> directory of
50wherever Valgrind was installed.</para>
51
<para>The macros in these header files have the magical property
that they generate code in-line which Valgrind can spot.
However, the code does nothing when not run on Valgrind, so you
are not forced to run your program under Valgrind just because you
use the macros in these files.  Also, you are not required to link your
program with any extra supporting libraries.</para>
58
59<para>The code added to your binary has negligible performance impact:
60on x86, amd64, ppc32, ppc64 and ARM, the overhead is 6 simple integer
61instructions and is probably undetectable except in tight loops.
62However, if you really wish to compile out the client requests, you
63can compile with <option>-DNVALGRIND</option> (analogous to
64<option>-DNDEBUG</option>'s effect on
65<function>assert</function>).
66</para>
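<para>As an illustration of the points above, the following is a minimal
sketch of a client that includes the core header and queries
<computeroutput>RUNNING_ON_VALGRIND</computeroutput> (described in the list
below).  The <computeroutput>__has_include</computeroutput> fallback and the
helper names <computeroutput>layers_of_valgrind</computeroutput> and
<computeroutput>report_environment</computeroutput> are assumptions made for
this example so that it still builds on a machine without the Valgrind
headers; they are not part of Valgrind.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>

/* Assumption for this sketch: use the real header when it is available,
   otherwise fall back to a stub so that the example still compiles. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef RUNNING_ON_VALGRIND
#  define RUNNING_ON_VALGRIND 0
#endif

/* Hypothetical helper: number of Valgrind layers the program runs on
   (0 when it runs directly on the real CPU). */
static unsigned layers_of_valgrind(void)
{
    return (unsigned)RUNNING_ON_VALGRIND;
}

/* Hypothetical helper: print where the program is executing. */
static void report_environment(void)
{
    unsigned layers = layers_of_valgrind();
    if (layers > 0)
        printf("running on %u layer(s) of Valgrind\n", layers);
    else
        printf("running natively\n");
}
```
]]></programlisting>
<para>Calling <computeroutput>report_environment()</computeroutput> from
<computeroutput>main</computeroutput> should take the native branch when the
program is started directly, and also when it is built with
<option>-DNVALGRIND</option>.</para>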
67
68<para>You are encouraged to copy the <filename>valgrind/*.h</filename> headers
69into your project's include directory, so your program doesn't have a
70compile-time dependency on Valgrind being installed.  The Valgrind headers,
71unlike most of the rest of the code, are under a BSD-style license so you may
72include them without worrying about license incompatibility.</para>
73
74<para>Here is a brief description of the macros available in
75<filename>valgrind.h</filename>, which work with more than one
76tool (see the tool-specific documentation for explanations of the
77tool-specific macros).</para>
78
79 <variablelist>
80
81  <varlistentry>
82   <term><command><computeroutput>RUNNING_ON_VALGRIND</computeroutput></command>:</term>
83   <listitem>
84    <para>Returns 1 if running on Valgrind, 0 if running on the
85    real CPU.  If you are running Valgrind on itself, returns the
86    number of layers of Valgrind emulation you're running on.
87    </para>
88   </listitem>
89  </varlistentry>
90
91  <varlistentry>
92   <term><command><computeroutput>VALGRIND_DISCARD_TRANSLATIONS</computeroutput>:</command></term>
93   <listitem>
    <para>Discards translations of code in the specified address
    range.  Useful if you are debugging a JIT compiler or some other
    dynamic code generation system.  After this call, attempts to
    execute code in the invalidated address range will cause
    Valgrind to make new translations of that code, which is
    probably the semantics you want.  Code invalidations
    are expensive because finding all the relevant translations
    quickly is very difficult, so try not to call it often.
    You can be clever about
    this: you only need to call it when an area which previously
    contained code is overwritten with new code.  You can choose
    to write code into fresh memory, and just call this
    occasionally to discard large chunks of old code all at
    once.</para>
    <para>
    Alternatively, for transparent self-modifying-code support,
    use <option>--smc-check=all</option>, or run
    on ppc32/Linux, ppc64/Linux or ARM/Linux.
    </para>
113   </listitem>
114  </varlistentry>
115
116  <varlistentry>
117   <term><command><computeroutput>VALGRIND_COUNT_ERRORS</computeroutput>:</command></term>
118   <listitem>
119    <para>Returns the number of errors found so far by Valgrind.  Can be
120    useful in test harness code when combined with the
121    <option>--log-fd=-1</option> option; this runs Valgrind silently,
122    but the client program can detect when errors occur.  Only useful
123    for tools that report errors, e.g. it's useful for Memcheck, but for
124    Cachegrind it will always return zero because Cachegrind doesn't
125    report errors.</para>
126   </listitem>
127  </varlistentry>
128
129  <varlistentry>
130   <term><command><computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>:</command></term>
131   <listitem>
132    <para>If your program manages its own memory instead of using
133    the standard <function>malloc</function> /
134    <function>new</function> /
135    <function>new[]</function>, tools that track
136    information about heap blocks will not do nearly as good a
137    job.  For example, Memcheck won't detect nearly as many
138    errors, and the error messages won't be as informative.  To
139    improve this situation, use this macro just after your custom
140    allocator allocates some new memory.  See the comments in
141    <filename>valgrind.h</filename> for information on how to use
142    it.</para>
143   </listitem>
144  </varlistentry>
145
146  <varlistentry>
147   <term><command><computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>:</command></term>
148   <listitem>
149    <para>This should be used in conjunction with
150    <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>.
151    Again, see <filename>valgrind.h</filename> for
152    information on how to use it.</para>
153   </listitem>
154  </varlistentry>
155
156  <varlistentry>
157   <term><command><computeroutput>VALGRIND_RESIZEINPLACE_BLOCK</computeroutput>:</command></term>
158   <listitem>
159    <para>Informs a Valgrind tool that the size of an allocated block has been
160    modified but not its address. See <filename>valgrind.h</filename> for
161    more information on how to use it.</para>
162   </listitem>
163  </varlistentry>
164
165  <varlistentry>
166   <term>
167   <command><computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput></command>,
168   <command><computeroutput>VALGRIND_DESTROY_MEMPOOL</computeroutput></command>,
169   <command><computeroutput>VALGRIND_MEMPOOL_ALLOC</computeroutput></command>,
170   <command><computeroutput>VALGRIND_MEMPOOL_FREE</computeroutput></command>,
171   <command><computeroutput>VALGRIND_MOVE_MEMPOOL</computeroutput></command>,
172   <command><computeroutput>VALGRIND_MEMPOOL_CHANGE</computeroutput></command>,
173   <command><computeroutput>VALGRIND_MEMPOOL_EXISTS</computeroutput></command>:
174   </term>
175   <listitem>
176    <para>These are similar to
177    <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput> and
178    <computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>
179    but are tailored towards code that uses memory pools.  See
180    <xref linkend="mc-manual.mempools"/> for a detailed description.</para>
181   </listitem>
182  </varlistentry>
183
184  <varlistentry>
185   <term><command><computeroutput>VALGRIND_NON_SIMD_CALL[0123]</computeroutput>:</command></term>
186   <listitem>
187    <para>Executes a function in the client program on the
188    <emphasis>real</emphasis> CPU, not the virtual CPU that Valgrind
189    normally runs code on.  The function must take an integer (holding a
190    thread ID) as the first argument and then 0, 1, 2 or 3 more arguments
191    (depending on which client request is used).  These are used in various
192    ways internally to Valgrind.  They might be useful to client
193    programs.</para>
194
195    <para><command>Warning:</command> Only use these if you
196    <emphasis>really</emphasis> know what you are doing.  They aren't
197    entirely reliable, and can cause Valgrind to crash.  See
198    <filename>valgrind.h</filename> for more details.
199    </para>
200   </listitem>
201  </varlistentry>
202
203  <varlistentry>
204   <term><command><computeroutput>VALGRIND_PRINTF(format, ...)</computeroutput>:</command></term>
205   <listitem>
206    <para>Print a printf-style message to the Valgrind log file.  The
207    message is prefixed with the PID between a pair of
208    <computeroutput>**</computeroutput> markers.  (Like all client requests,
209    nothing is output if the client program is not running under Valgrind.)
210    Output is not produced until a newline is encountered, or subsequent
211    Valgrind output is printed; this allows you to build up a single line of
212    output over multiple calls.  Returns the number of characters output,
213    excluding the PID prefix.</para>
214   </listitem>
215  </varlistentry>
216
217  <varlistentry>
218   <term><command><computeroutput>VALGRIND_PRINTF_BACKTRACE(format, ...)</computeroutput>:</command></term>
219   <listitem>
220    <para>Like <computeroutput>VALGRIND_PRINTF</computeroutput> (in
221    particular, the return value is identical), but prints a stack backtrace
222    immediately afterwards.</para>
223   </listitem>
224  </varlistentry>
225
226  <varlistentry>
227   <term><command><computeroutput>VALGRIND_STACK_REGISTER(start, end)</computeroutput>:</command></term>
228   <listitem>
229    <para>Registers a new stack.  Informs Valgrind that the memory range
230    between start and end is a unique stack.  Returns a stack identifier
231    that can be used with other
232    <computeroutput>VALGRIND_STACK_*</computeroutput> calls.</para>
233    <para>Valgrind will use this information to determine if a change to
234    the stack pointer is an item pushed onto the stack or a change over
235    to a new stack.  Use this if you're using a user-level thread package
236    and are noticing spurious errors from Valgrind about uninitialized
237    memory reads.</para>
238
239    <para><command>Warning:</command> Unfortunately, this client request is
240    unreliable and best avoided.</para>
241   </listitem>
242  </varlistentry>
243
244  <varlistentry>
245   <term><command><computeroutput>VALGRIND_STACK_DEREGISTER(id)</computeroutput>:</command></term>
246   <listitem>
    <para>Deregisters a previously registered stack.  Informs
    Valgrind that the previously registered memory range with stack id
    <computeroutput>id</computeroutput> is no longer a stack.</para>
250
251    <para><command>Warning:</command> Unfortunately, this client request is
252    unreliable and best avoided.</para>
253   </listitem>
254  </varlistentry>
255
256  <varlistentry>
257   <term><command><computeroutput>VALGRIND_STACK_CHANGE(id, start, end)</computeroutput>:</command></term>
258   <listitem>
259    <para>Changes a previously registered stack.  Informs
260    Valgrind that the previously registered stack with stack id
261    <computeroutput>id</computeroutput> has changed its start and end
262    values.  Use this if your user-level thread package implements
263    stack growth.</para>
264
265    <para><command>Warning:</command> Unfortunately, this client request is
266    unreliable and best avoided.</para>
267   </listitem>
268  </varlistentry>
269
270 </variablelist>
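<para>To make the custom-allocator requests above more concrete, here is a
hedged sketch of a tiny bump allocator annotated with
<computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput> and
<computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>.  The allocator
itself (<computeroutput>arena_alloc</computeroutput>,
<computeroutput>arena_free</computeroutput>, the arena size) is hypothetical,
and the no-op stand-in macros are an assumption so that the example builds
without the Valgrind headers.  The argument order used here
(address, size, red zone size, zeroed flag) should be checked against the
comments in the <filename>valgrind.h</filename> of your installation.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
/* Assumption: no-op stand-ins when the Valgrind headers are missing. */
#ifndef VALGRIND_MALLOCLIKE_BLOCK
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed) ((void)0)
#endif
#ifndef VALGRIND_FREELIKE_BLOCK
#  define VALGRIND_FREELIKE_BLOCK(addr, rzB) ((void)0)
#endif

/* Hypothetical bump allocator carving blocks out of a static arena. */
#define ARENA_SIZE 4096
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

static void *arena_alloc(size_t size)
{
    /* Round the request up to a multiple of 16 bytes. */
    size_t rounded = (size + 15u) & ~(size_t)15u;
    if (size == 0 || rounded < size || rounded > ARENA_SIZE - arena_used)
        return NULL;
    void *p = &arena[arena_used];
    arena_used += rounded;
    /* Tell the tool that a heap-like block of `size` bytes starts at p:
       no red zones (0) and contents not zeroed (0). */
    VALGRIND_MALLOCLIKE_BLOCK(p, size, 0, 0);
    return p;
}

static void arena_free(void *p)
{
    if (p == NULL)
        return;
    /* This allocator never reuses memory; it only informs the tool that
       the block (with no red zones) is no longer in use. */
    VALGRIND_FREELIKE_BLOCK(p, 0);
}
```
]]></programlisting>
<para>This is only a sketch: the comments in <filename>valgrind.h</filename>
describe further steps that a real allocator may need in order to get
accurate reports from a tool such as Memcheck.</para>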
271
272</sect1>
273
274
275
276
277
278
279
280<sect1 id="manual-core-adv.gdbserver"
281       xreflabel="Debugging your program using Valgrind's gdbserver and GDB">
282<title>Debugging your program using Valgrind gdbserver and GDB</title>
283
<para>A program running under Valgrind is not executed directly by the
CPU.  Instead it runs on a synthetic CPU provided by Valgrind.  This is
why a debugger cannot natively debug your program when it runs on Valgrind.
</para>
<para>
This section describes how GDB can interact with the
Valgrind gdbserver to provide a fully debuggable program under
Valgrind. Used in this way, GDB also gives interactive access to
Valgrind core and tool functionality, including incremental leak search
under Memcheck and on-demand Massif snapshot production.
</para>
295
296<sect2 id="manual-core-adv.gdbserver-simple"
297       xreflabel="gdbserver simple example">
298<title>Quick Start: debugging in 3 steps</title>
299
300<para>The simplest way to get started is to run Valgrind with the
301flag <option>--vgdb-error=0</option>.  Then follow the on-screen
302directions, which give you the precise commands needed to start GDB
303and connect it to your program.</para>
304
305<para>Otherwise, here's a slightly more verbose overview.</para>
306
<para>If you want to debug a program with GDB when using the Memcheck
tool, start Valgrind like this:
<screen><![CDATA[
valgrind --vgdb=yes --vgdb-error=0 prog
]]></screen></para>

<para>In another shell, start GDB:
<screen><![CDATA[
gdb prog
]]></screen></para>

<para>Then give the following command to GDB:
<screen><![CDATA[
(gdb) target remote | vgdb
]]></screen></para>
322
323<para>You can now debug your program e.g. by inserting a breakpoint
324and then using the GDB <computeroutput>continue</computeroutput>
325command.</para>
326
327<para>This quick start information is enough for basic usage of the
328Valgrind gdbserver.  The sections below describe more advanced
329functionality provided by the combination of Valgrind and GDB. Note
330that the command line flag <option>--vgdb=yes</option> can be omitted,
331as this is the default value.
332</para>
333
334</sect2>
335
336<sect2 id="manual-core-adv.gdbserver-concept"
337       xreflabel="gdbserver">
338<title>Valgrind gdbserver overall organisation</title>
339<para>The GNU GDB debugger is typically used to debug a process
340running on the same machine.  In this mode, GDB uses system calls to
341control and query the program being debugged.  This works well, but
342only allows GDB to debug a program running on the same computer.
343</para>
344
345<para>GDB can also debug processes running on a different computer.
346To achieve this, GDB defines a protocol (that is, a set of query and
347reply packets) that facilitates fetching the value of memory or
348registers, setting breakpoints, etc.  A gdbserver is an implementation
349of this "GDB remote debugging" protocol.  To debug a process running
350on a remote computer, a gdbserver (sometimes called a GDB stub)
351must run at the remote computer side.
352</para>
353
354<para>The Valgrind core provides a built-in gdbserver implementation,
355which is activated using <option>--vgdb=yes</option>
356or <option>--vgdb=full</option>.  This gdbserver allows the process
357running on Valgrind's synthetic CPU to be debugged remotely.
358GDB sends protocol query packets (such as "get register contents") to
359the Valgrind embedded gdbserver.  The gdbserver executes the queries
360(for example, it will get the register values of the synthetic CPU)
361and gives the results back to GDB.
362</para>
363
364<para>GDB can use various kinds of channels (TCP/IP, serial line, etc)
365to communicate with the gdbserver.  In the case of Valgrind's
366gdbserver, communication is done via a pipe and a small helper program
367called <xref linkend="manual-core-adv.vgdb"/>, which acts as an
368intermediary.  If no GDB is in use, vgdb can also be
369used to send monitor commands to the Valgrind gdbserver from a shell
370command line.
371</para>
372
373</sect2>
374
375<sect2 id="manual-core-adv.gdbserver-gdb"
376       xreflabel="Connecting GDB to a Valgrind gdbserver">
377<title>Connecting GDB to a Valgrind gdbserver</title>
378<para>To debug a program "<filename>prog</filename>" running under
379Valgrind, you must ensure that the Valgrind gdbserver is activated by
380specifying either <option>--vgdb=yes</option>
381or <option>--vgdb=full</option>.  A secondary command line option,
382<option>--vgdb-error=number</option>, can be used to tell the gdbserver
383only to become active once the specified number of errors have been
384reported.  A value of zero will therefore cause
385the gdbserver to become active at startup, which allows you to
386insert breakpoints before starting the run.  For example:
<screen><![CDATA[
valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
]]></screen></para>

<para>The Valgrind gdbserver is invoked at startup
and indicates it is waiting for a connection from GDB:</para>

<programlisting><![CDATA[
==2418== Memcheck, a memory error detector
==2418== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==2418== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==2418== Command: ./prog
==2418==
==2418== (action at startup) vgdb me ...
]]></programlisting>
402
403
<para>GDB (in another shell) can then be connected to the Valgrind gdbserver.
For this, GDB must be started on the program <filename>prog</filename>:
<screen><![CDATA[
gdb ./prog
]]></screen></para>


<para>You then indicate to GDB that you want to debug a remote target:
<screen><![CDATA[
(gdb) target remote | vgdb
]]></screen>
GDB then starts a vgdb relay application to communicate with the
Valgrind embedded gdbserver:</para>

<programlisting><![CDATA[
(gdb) target remote | vgdb
Remote debugging using | vgdb
relaying data between gdb and process 2418
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /usr/lib/debug/lib/ld-2.11.2.so.debug...done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 2418]
0x001f2850 in _start () from /lib/ld-linux.so.2
(gdb)
]]></programlisting>
429
430<para>Note that vgdb is provided as part of the Valgrind
431distribution.  You do not need to install it separately.</para>
432
433<para>If vgdb detects that there are multiple Valgrind gdbservers that
434can be connected to, it will list all such servers and their PIDs, and
435then exit.  You can then reissue the GDB "target" command, but
436specifying the PID of the process you want to debug:
437</para>
438
<programlisting><![CDATA[
(gdb) target remote | vgdb
Remote debugging using | vgdb
no --pid= arg given and multiple valgrind pids found:
use --pid=2479 for valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
use --pid=2481 for valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
use --pid=2483 for valgrind --vgdb=yes --vgdb-error=0 ./another_prog
Remote communication error: Resource temporarily unavailable.
(gdb) target remote | vgdb --pid=2479
Remote debugging using | vgdb --pid=2479
relaying data between gdb and process 2479
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /usr/lib/debug/lib/ld-2.11.2.so.debug...done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 2479]
0x001f2850 in _start () from /lib/ld-linux.so.2
(gdb)
]]></programlisting>
457
458<para>Once GDB is connected to the Valgrind gdbserver, it can be used
459in the same way as if you were debugging the program natively:</para>
460 <itemizedlist>
461  <listitem>
462    <para>Breakpoints can be inserted or deleted.</para>
463  </listitem>
464  <listitem>
465    <para>Variables and register values can be examined or modified.
466    </para>
467  </listitem>
468  <listitem>
469    <para>Signal handling can be configured (printing, ignoring).
470    </para>
471  </listitem>
472  <listitem>
473    <para>Execution can be controlled (continue, step, next, stepi, etc).
474    </para>
475  </listitem>
476  <listitem>
477    <para>Program execution can be interrupted using Control-C.</para>
478  </listitem>
479 </itemizedlist>
480
481<para>And so on.  Refer to the GDB user manual for a complete
482description of GDB's functionality.
483</para>
484
485</sect2>
486
487<sect2 id="manual-core-adv.gdbserver-gdb-android"
488       xreflabel="Connecting to an Android gdbserver">
489<title>Connecting to an Android gdbserver</title>
<para>When developing applications for Android, you will typically use
a development system (on which the Android NDK is installed) to compile your
application. An Android target system or emulator will be used to run
the application.
In this setup, Valgrind and vgdb will run on the Android system,
while GDB will run on the development system. GDB will connect
to the vgdb running on the Android system using the Android NDK
'adb forward' application.
</para>
<para>Example: on the Android system, execute the following:
    <screen><![CDATA[
valgrind --vgdb-error=0 prog
# and then in another shell, run:
vgdb --port=1234
]]></screen>
</para>

<para>On the development system, execute the following commands:
<screen><![CDATA[
adb forward tcp:1234 tcp:1234
gdb prog
(gdb) target remote :1234
]]></screen>
GDB will use a local TCP/IP connection to connect to the Android adb forwarder.
Adb will establish a relay connection between the host system and the Android
target system. Be sure to use the GDB delivered with the
Android NDK (typically, arm-linux-androideabi-gdb), as the host
GDB is probably not able to debug Android ARM applications.
Note that the local port number (used by GDB) need not be equal
to the port number used by vgdb: adb can forward TCP/IP between different
port numbers.
</para>
522
523</sect2>
524
525<sect2 id="manual-core-adv.gdbserver-commandhandling"
526       xreflabel="Monitor command handling by the Valgrind gdbserver">
527<title>Monitor command handling by the Valgrind gdbserver</title>
528
529<para> The Valgrind gdbserver provides additional Valgrind-specific
530functionality via "monitor commands".  Such monitor commands can
531be sent from the GDB command line or from the shell command line.  See
532<xref linkend="manual-core-adv.valgrind-monitor-commands"/> for the list
533of the Valgrind core monitor commands.
534</para>
535
<para>Each tool can also provide tool-specific monitor commands.
An example of a tool-specific monitor command is the Memcheck monitor
command <computeroutput>leak_check full
reachable any</computeroutput>.  This requests a full reporting of the
allocated memory blocks.  To have this leak check executed, use the GDB
command:
<screen><![CDATA[
(gdb) monitor leak_check full reachable any
]]></screen>
</para>
546
547<para>GDB will send the <computeroutput>leak_check</computeroutput>
548command to the Valgrind gdbserver.  The Valgrind gdbserver will
549execute the monitor command itself, if it recognises it to be a Valgrind core
550monitor command.  If it is not recognised as such, it is assumed to
551be tool-specific and is handed to the tool for execution.  For example:
552</para>
<programlisting><![CDATA[
(gdb) monitor leak_check full reachable any
==2418== 100 bytes in 1 blocks are still reachable in loss record 1 of 1
==2418==    at 0x4006E9E: malloc (vg_replace_malloc.c:236)
==2418==    by 0x804884F: main (prog.c:88)
==2418==
==2418== LEAK SUMMARY:
==2418==    definitely lost: 0 bytes in 0 blocks
==2418==    indirectly lost: 0 bytes in 0 blocks
==2418==      possibly lost: 0 bytes in 0 blocks
==2418==    still reachable: 100 bytes in 1 blocks
==2418==         suppressed: 0 bytes in 0 blocks
==2418==
(gdb)
]]></programlisting>
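<para>A report of this shape can be reproduced with a program that simply
keeps a pointer to an allocation.  The following sketch is a hypothetical
stand-in for <filename>prog.c</filename>, not the program that produced the
output above; the names <computeroutput>kept_block</computeroutput> and
<computeroutput>keep_block</computeroutput> are invented for the
example.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for prog.c.  The pointer lives in a global, so
   the block is still pointed to when a leak search runs. */
static char *kept_block = NULL;

/* Allocates 100 bytes and keeps them; returns the size kept, or 0 if
   the allocation failed. */
static size_t keep_block(void)
{
    const size_t size = 100;
    kept_block = malloc(size);
    if (kept_block == NULL)
        return 0;
    memset(kept_block, 0, size);   /* initialise the whole block */
    return size;
}
```
]]></programlisting>
<para>If <computeroutput>keep_block()</computeroutput> is called from
<computeroutput>main</computeroutput> and the program is run under Memcheck
with <option>--vgdb-error=0</option>, issuing the monitor command after the
call has executed would be expected to list the block as still reachable,
although addresses and line numbers will differ.</para>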
568
<para>As with other GDB commands, the Valgrind gdbserver will accept
abbreviated monitor command names and arguments, as long as the given
abbreviation is unambiguous.  For example, the above
<computeroutput>leak_check</computeroutput>
command can also be typed as:
<screen><![CDATA[
(gdb) mo l f r a
]]></screen>

The letters <computeroutput>mo</computeroutput> are recognised by GDB as being
an abbreviation for <computeroutput>monitor</computeroutput>.  So GDB sends the
string <computeroutput>l f r a</computeroutput> to the Valgrind
gdbserver.  The letters provided in this string are unambiguous for the
Valgrind gdbserver.  This therefore gives the same output as the
unabbreviated command and arguments.  If the provided abbreviation is
ambiguous, the Valgrind gdbserver will report the list of commands (or
argument values) that can match:
<programlisting><![CDATA[
(gdb) mo v. n
v. can match v.set v.info v.wait v.kill v.translate
(gdb) mo v.i n
n_errs_found 0 (vgdb-error 0)
(gdb)
]]></programlisting>
</para>
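<para>The abbreviation rule described above can be sketched as a simple
prefix search.  This is an illustration only, not the code used by the
Valgrind gdbserver: the function name
<computeroutput>match_command</computeroutput>, its return codes and the
choice to let an exact name win are assumptions of this example.  The command
names are the ones shown in the ambiguity message above.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <string.h>

/* Command names taken from the ambiguous-match message above. */
static const char *const commands[] = {
    "v.set", "v.info", "v.wait", "v.kill", "v.translate"
};
enum { N_COMMANDS = sizeof commands / sizeof commands[0] };

/* Hypothetical matcher.  Returns the index of the single command that
   starts with `abbrev`, -1 if nothing matches, or -2 if more than one
   command matches (the abbreviation is ambiguous). */
static int match_command(const char *abbrev)
{
    size_t len = strlen(abbrev);
    int found = -1;
    int n_matches = 0;

    for (int i = 0; i < N_COMMANDS; i++) {
        if (strncmp(commands[i], abbrev, len) != 0)
            continue;               /* not a prefix of this command */
        if (commands[i][len] == '\0')
            return i;               /* full name given: use it directly */
        found = i;
        n_matches++;
    }
    if (n_matches == 0)
        return -1;
    if (n_matches > 1)
        return -2;
    return found;
}
```
]]></programlisting>
<para>With this sketch, <computeroutput>v.i</computeroutput> selects
<computeroutput>v.info</computeroutput>, whereas
<computeroutput>v.</computeroutput> is reported as ambiguous because all five
names start with it.</para>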
594
<para>Instead of sending monitor commands from GDB, you can also send
them from a shell command line.  For example, the following command
lines, when given in a shell, will cause the same leak search to be executed
by the process 3145:
<screen><![CDATA[
vgdb --pid=3145 leak_check full reachable any
vgdb --pid=3145 l f r a
]]></screen></para>
603
604<para>Note that the Valgrind gdbserver automatically continues the
605execution of the program after a standalone invocation of
606vgdb.  Monitor commands sent from GDB do not cause the program to
607continue: the program execution is controlled explicitly using GDB
608commands such as "continue" or "next".</para>
609
610</sect2>
611
612<sect2 id="manual-core-adv.gdbserver-threads"
613       xreflabel="Valgrind gdbserver thread information">
614<title>Valgrind gdbserver thread information</title>
615
616<para>Valgrind's gdbserver enriches the output of the
617GDB <computeroutput>info threads</computeroutput> command
618with Valgrind-specific information.
619The operating system's thread number is followed
620by Valgrind's internal index for that thread ("tid") and by
621the Valgrind scheduler thread state:</para>
622
<programlisting><![CDATA[
(gdb) info threads
  4 Thread 6239 (tid 4 VgTs_Yielding)  0x001f2832 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
* 3 Thread 6238 (tid 3 VgTs_Runnable)  make_error (s=0x8048b76 "called from London") at prog.c:20
  2 Thread 6237 (tid 2 VgTs_WaitSys)  0x001f2832 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
  1 Thread 6234 (tid 1 VgTs_Yielding)  main (argc=1, argv=0xbedcc274) at prog.c:105
(gdb)
]]></programlisting>
631
632</sect2>
633
634<sect2 id="manual-core-adv.gdbserver-shadowregisters"
635       xreflabel="Examining and modifying Valgrind shadow registers">
636<title>Examining and modifying Valgrind shadow registers</title>
637
638<para> When the option <option>--vgdb-shadow-registers=yes</option> is
639given, the Valgrind gdbserver will let GDB examine and/or modify
640Valgrind's shadow registers.  GDB version 7.1 or later is needed for this
641to work.</para>
642
<para>For each CPU register, the Valgrind core maintains two
shadow register sets.  These shadow registers can be accessed from
GDB by appending the suffix <computeroutput>s1</computeroutput>
or <computeroutput>s2</computeroutput> to the register name, for the first
and second shadow register respectively.  For example, the x86 register
<computeroutput>eax</computeroutput> and its two shadows
can be examined using the following commands:</para>

<programlisting><![CDATA[
(gdb) p $eax
$1 = 0
(gdb) p $eaxs1
$2 = 0
(gdb) p $eaxs2
$3 = 0
(gdb)
]]></programlisting>
660
661</sect2>
662
663
664<sect2 id="manual-core-adv.gdbserver-limitations"
665       xreflabel="Limitations of the Valgrind gdbserver">
666<title>Limitations of the Valgrind gdbserver</title>
667
668<para>Debugging with the Valgrind gdbserver is very similar to native
669debugging.  Valgrind's gdbserver implementation is quite
670complete, and so provides most of the GDB debugging functionality.  There
671are however some limitations and peculiarities:</para>
672 <itemizedlist>
673   <listitem>
674     <para>Precision of "stop-at" commands.</para>
675     <para>
676       GDB commands such as "step", "next", "stepi", breakpoints
677       and watchpoints, will stop the execution of the process.  With
678       the option <option>--vgdb=yes</option>, the process might not
679       stop at the exact requested instruction. Instead, it might
680       continue execution of the current basic block and stop at one
681       of the following basic blocks. This is linked to the fact that
682       Valgrind gdbserver has to instrument a block to allow stopping
683       at the exact instruction requested.  Currently,
684       re-instrumentation of the block currently being executed is not
685       supported. So, if the action requested by GDB (e.g. single
686       stepping or inserting a breakpoint) implies re-instrumentation
687       of the current block, the GDB action may not be executed
688       precisely.
689     </para>
     <para>
       This limitation applies when the basic block
       currently being executed has not yet been instrumented for debugging.
       This typically happens when the gdbserver is activated due to the
       tool reporting an error or to a watchpoint.  If the gdbserver
       has been activated following a breakpoint, or if a
       breakpoint has been inserted in the block before its execution,
       then the block has already been instrumented for debugging.
     </para>
     <para>
       If you use the option <option>--vgdb=full</option>, then GDB
       "stop-at" commands will be obeyed precisely.  The
       downside is that this requires each instruction to be
       instrumented with an additional call to a gdbserver helper
       function, which gives considerable overhead compared to
       <option>--vgdb=no</option>.  Option <option>--vgdb=yes</option>
       has negligible overhead compared
       to <option>--vgdb=no</option>.
     </para>
709   </listitem>
710
711   <listitem>
712     <para>Hardware watchpoint support by the Valgrind
713     gdbserver.</para>
714
     <para> The Valgrind gdbserver can simulate hardware watchpoints
     if the selected tool provides support for it.  Currently,
     only Memcheck provides hardware watchpoint simulation.  The
     hardware watchpoint simulation provided by Memcheck is much
     faster than GDB software watchpoints, which are implemented by
     GDB checking the value of the watched zone(s) after each
     instruction.  Hardware watchpoint simulation also provides read
     watchpoints.  The hardware watchpoint simulation by Memcheck has
     some limitations compared to real hardware
     watchpoints. However, the number and length of simulated
     watchpoints are not limited.
     </para>
     <para>Typically, the number of (real) hardware watchpoints is
     limited.  For example, the x86 architecture supports a maximum of
     4 hardware watchpoints, each watchpoint watching 1, 2, 4 or 8
     bytes. The Valgrind gdbserver does not have any limitation on the
     number of simulated hardware watchpoints. It also has no
     limitation on the length of the memory zone being
     watched.  Using GDB version 7.4 or later allows full use of the
     flexibility of the Valgrind gdbserver's simulated hardware watchpoints.
     Previous GDB versions do not understand that Valgrind gdbserver
     watchpoints have no length limit.
     </para>
738     <para>Memcheck implements hardware watchpoint simulation by
739     marking the watched address ranges as being unaddressable.  When
740     a hardware watchpoint is removed, the range is marked as
741     addressable and defined.  Hardware watchpoint simulation of
742     addressable-but-undefined memory zones works properly, but has
743     the undesirable side effect of marking the zone as defined when
744     the watchpoint is removed.
745     </para>
746     <para>Write watchpoints might not be reported at the
747     exact instruction that writes the monitored area,
748     unless option <option>--vgdb=full</option> is given.  Read watchpoints
749     will always be reported at the exact instruction reading the
750     watched memory.
751     </para>
     <para>It is better to avoid setting hardware watchpoints on memory
     that is not (yet) addressable: in such a case, GDB will fall back to
     extremely slow software watchpoints.  Also, if you do not quit GDB
     between two debugging sessions, the hardware watchpoints of the
     previous sessions will be re-inserted as software watchpoints if
     the watched memory zone is not addressable at program startup.
     </para>
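     <para>The GDB commands below sketch how write, read and access
     watchpoints could be requested when running under Memcheck.  The
     names <computeroutput>counter</computeroutput> and
     <computeroutput>buf</computeroutput> are hypothetical variables
     of the program being debugged, and both are assumed to be
     addressable when the watchpoints are set:</para>
     <screen><![CDATA[
(gdb) watch counter
(gdb) rwatch buf[0]
(gdb) awatch counter
(gdb) continue
]]></screen>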
759   </listitem>
760
761   <listitem>
762     <para>Stepping inside shared libraries on ARM.</para>
763     <para>For unknown reasons, stepping inside shared
764     libraries on ARM may fail.  A workaround is to use the
765     <computeroutput>ldd</computeroutput> command
766     to find the list of shared libraries and their loading address
767     and inform GDB of the loading address using the GDB command
768     "add-symbol-file". Example:
     <programlisting><![CDATA[
(gdb) shell ldd ./prog
	libc.so.6 => /lib/libc.so.6 (0x4002c000)
	/lib/ld-linux.so.3 (0x40000000)
(gdb) add-symbol-file /lib/libc.so.6 0x4002c000
add symbol table from file "/lib/libc.so.6" at
	.text_addr = 0x4002c000
(y or n) y
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
(gdb)
]]></programlisting>
780     </para>
781   </listitem>
782
783   <listitem>
784     <para>GDB version needed for ARM and PPC32/64.</para>
     <para>You must use a GDB version that can read the XML
     target description sent by a gdbserver.  This is the standard setup
787     if GDB was configured and built with the "expat"
788     library.  If your GDB was not configured with XML support, it
789     will report an error message when using the "target"
790     command.  Debugging will not work because GDB will then not be
791     able to fetch the registers from the Valgrind gdbserver.
792     For ARM programs using the Thumb instruction set, you must use
793     a GDB version of 7.1 or later, as earlier versions have problems
794     with next/step/breakpoints in Thumb code.
795     </para>
796   </listitem>
797
798   <listitem>
799     <para>Stack unwinding on PPC32/PPC64. </para>
800     <para>On PPC32/PPC64, stack unwinding for leaf functions
801     (functions that do not call any other functions) works properly
802     only when you give the option
803     <option>--vex-iropt-precise-memory-exns=yes</option>.
804     You must also pass this option in order to get a precise stack when
805     a signal is trapped by GDB.
806     </para>
807   </listitem>
808
809   <listitem>
810     <para>Breakpoints encountered multiple times.</para>
811     <para>Some instructions (e.g. x86 "rep movsb")
812     are translated by Valgrind using a loop.  If a breakpoint is placed
813     on such an instruction, the breakpoint will be encountered
814     multiple times -- once for each step of the "implicit" loop
815     implementing the instruction.
816     </para>
817   </listitem>
818
819   <listitem>
     <para>Execution of inferior function calls by the Valgrind
     gdbserver.</para>
822
823     <para>GDB allows the user to "call" functions inside the process
824     being debugged.  Such calls are named "inferior calls" in the GDB
825     terminology.  A typical use of an inferior call is to execute
826     a function that prints a human-readable version of a complex data
827     structure.  To make an inferior call, use the GDB "print" command
828     followed by the function to call and its arguments.  As an
829     example, the following GDB command causes an inferior call to the
830     libc "printf" function to be executed by the process
831     being debugged:
832     </para>
     <programlisting><![CDATA[
(gdb) p printf("process being debugged has pid %d\n", getpid())
$5 = 36
(gdb)
]]></programlisting>
838
839     <para>The Valgrind gdbserver supports inferior function calls.
840     Whilst an inferior call is running, the Valgrind tool will report
841     errors as usual.  If you do not want to have such errors stop the
842     execution of the inferior call, you can
843     use <computeroutput>v.set vgdb-error</computeroutput> to set a
844     big value before the call, then manually reset it to its original
845     value when the call is complete.</para>
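
     <para>For instance, the sequence below is one way to do this.
     <computeroutput>print_tree</computeroutput> and
     <computeroutput>root</computeroutput> are hypothetical names, and
     the values 999999 and 0 are only illustrative; the first command
     shows the current setting so that it can be restored
     afterwards:</para>
     <screen><![CDATA[
(gdb) monitor v.info n_errs_found
(gdb) monitor v.set vgdb-error 999999
(gdb) print print_tree(root)
(gdb) monitor v.set vgdb-error 0
]]></screen>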
846
847     <para>To execute inferior calls, GDB changes registers such as
848     the program counter, and then continues the execution of the
849     program. In a multithreaded program, all threads are continued,
850     not just the thread instructed to make the inferior call.  If
851     another thread reports an error or encounters a breakpoint, the
852     evaluation of the inferior call is abandoned.</para>
853
     <para>Note that inferior function calls are a powerful GDB
     feature, but should be used with caution. For example, if
     the program being debugged is stopped inside the function "printf",
     forcing a recursive call to printf via an inferior call will
     very probably create problems.  The Valgrind tool might also add
     another level of complexity to inferior calls, e.g. by reporting
     tool errors during the inferior call, or because of the
     instrumentation it adds.
     </para>
863
864   </listitem>
865
866   <listitem>
867     <para>Connecting to or interrupting a Valgrind process blocked in
868     a system call.</para>
869
870     <para>Connecting to or interrupting a Valgrind process blocked in
871     a system call requires the "ptrace" system call to be usable.
872     This may be disabled in your kernel for security reasons.</para>
873
874     <para>When running your program, Valgrind's scheduler
875     periodically checks whether there is any work to be handled by
876     the gdbserver.  Unfortunately this check is only done if at least
877     one thread of the process is runnable.  If all the threads of the
878     process are blocked in a system call, then the checks do not
879     happen, and the Valgrind scheduler will not invoke the gdbserver.
880     In such a case, the vgdb relay application will "force" the
881     gdbserver to be invoked, without the intervention of the Valgrind
882     scheduler.
883     </para>
884
885     <para>Such forced invocation of the Valgrind gdbserver is
886     implemented by vgdb using ptrace system calls.  On a properly
887     implemented kernel, the ptrace calls done by vgdb will not
888     influence the behaviour of the program running under Valgrind.
889     If however they do, giving the
890     option <option>--max-invoke-ms=0</option> to the vgdb relay
891     application will disable the usage of ptrace calls.  The
892     consequence of disabling ptrace usage in vgdb is that a Valgrind
893     process blocked in a system call cannot be woken up or
894     interrupted from GDB until it executes enough basic blocks to let
895     the Valgrind scheduler's normal checking take effect.
896     </para>
897
     <para>When ptrace is disabled in vgdb, you can increase the
     responsiveness of the Valgrind gdbserver to commands or
     interrupts by giving a lower value to the
     option <option>--vgdb-poll</option>.  If your application is
     blocked in system calls most of the time, using a very low value
     for <option>--vgdb-poll</option> will cause the gdbserver to be
     invoked sooner.  The gdbserver polling done by Valgrind's
     scheduler is very efficient, so the increased polling frequency
     should not cause significant performance degradation.
     </para>
908
909     <para>When ptrace is disabled in vgdb, a query packet sent by GDB
910     may take significant time to be handled by the Valgrind
911     gdbserver.  In such cases, GDB might encounter a protocol
912     timeout.  To avoid this,
913     you can increase the value of the timeout by using the GDB
914     command "set remotetimeout".
915     </para>
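
     <para>Putting these pieces together, a session with ptrace
     disabled in vgdb might look like the sketch below.  The values 50
     and 10 are illustrative assumptions rather than recommendations,
     and <computeroutput>./prog</computeroutput> is a placeholder:</para>
     <screen><![CDATA[
# Shell 1: poll for gdbserver work more often than the default
valgrind --vgdb-poll=50 --vgdb-error=0 ./prog

# Shell 2: give GDB a longer protocol timeout, then connect without ptrace
gdb ./prog
(gdb) set remotetimeout 10
(gdb) target remote | vgdb --max-invoke-ms=0
]]></screen>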
916
917     <para>Ubuntu versions 10.10 and later may restrict the scope of
918     ptrace to the children of the process calling ptrace.  As the
919     Valgrind process is not a child of vgdb, such restricted scoping
920     causes the ptrace calls to fail.  To avoid that, when Valgrind
921     gdbserver receives the first packet from a vgdb, it calls
922     <computeroutput>prctl(PR_SET_PTRACER, vgdb_pid, 0, 0,
923     0)</computeroutput> to ensure vgdb can reliably use ptrace.
924     Once <computeroutput>vgdb_pid</computeroutput> has been marked as
925     a ptracer, vgdb can then properly force the invocation of
     Valgrind gdbserver when needed.  To ensure that vgdb is set as a
     ptracer before the Valgrind process gets blocked in a system
     call, connect your GDB to the Valgrind gdbserver at startup by
     passing <option>--vgdb-error=0</option> to Valgrind.</para>
930
931     <para>Note that
932     this "set ptracer" technique does not solve the problem in the
933     case where a standalone vgdb process wants to connect to the
934     gdbserver, since the first command to be sent by a standalone
935     vgdb must wake up the Valgrind process before Valgrind gdbserver
936     will mark vgdb as a ptracer.
937     </para>
938
939     <para>Unblocking processes blocked in system calls is not
940     currently implemented on Mac OS X and Android.  So you cannot
941     connect to or interrupt a process blocked in a system call on Mac
942     OS X or Android.
943     </para>
944
945   </listitem>
946
947   <listitem>
948     <para>Changing register values.</para>
949     <para>The Valgrind gdbserver will only modify the values of the
950     thread's registers when the thread is in status Runnable or
951     Yielding.  In other states (typically, WaitSys), attempts to
952     change register values will fail.  Amongst other things, this
953     means that inferior calls are not executed for a thread which is
954     in a system call, since the Valgrind gdbserver does not implement
955     system call restart.
956     </para>
957   </listitem>
958
959   <listitem>
960     <para>Unsupported GDB functionality.</para>
961     <para>GDB provides a lot of debugging functionality and not all
962     of it is supported.  Specifically, the following are not
963     supported: reversible debugging and tracepoints.
964     </para>
965   </listitem>
966
967   <listitem>
968     <para>Unknown limitations or problems.</para>
969     <para>The combination of GDB, Valgrind and the Valgrind gdbserver
970     probably has unknown other limitations and problems.  If you
971     encounter strange or unexpected behaviour, feel free to report a
972     bug.  But first please verify that the limitation or problem is
     not inherent to GDB or the GDB remote protocol.  You may be able
     to do so by checking the behaviour when using the standard
     gdbserver that is part of the GDB package.
976     </para>
977   </listitem>
978
979 </itemizedlist>
980
981</sect2>
982
983<sect2 id="manual-core-adv.vgdb"
984       xreflabel="vgdb">
985<title>vgdb command line options</title>
986<para> Usage: <computeroutput>vgdb [OPTION]... [[-c] COMMAND]...</computeroutput></para>
987
988<para> vgdb ("Valgrind to GDB") is a small program that is used as an
989intermediary between Valgrind and GDB or a shell.
990Therefore, it has two usage modes:
991</para>
992<orderedlist>
993  <listitem id="manual-core-adv.vgdb-standalone" xreflabel="vgdb standalone">
994    <para>As a standalone utility, it is used from a shell command
995    line to send monitor commands to a process running under
996    Valgrind. For this usage, the vgdb OPTION(s) must be followed by
997    the monitor command to send. To send more than one command,
998    separate them with the <option>-c</option> option.
999    </para>
1000  </listitem>
1001
1002  <listitem id="manual-core-adv.vgdb-relay" xreflabel="vgdb relay">
    <para>In combination with the GDB "target remote |" command, it is
    used as the relay application between GDB and the Valgrind
    gdbserver.  For this usage, only OPTION(s) can be given; no
    COMMAND is allowed.
    </para>
1008  </listitem>
1009
1010</orderedlist>
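
<para>As an illustration of the two modes, assuming a Valgrind process
with the hypothetical PID 1234 and a placeholder executable
<computeroutput>./prog</computeroutput>:</para>
<screen><![CDATA[
# Standalone: send one monitor command from the shell
vgdb --pid=1234 v.info n_errs_found

# Relay: let GDB talk to the Valgrind gdbserver through vgdb
gdb ./prog
(gdb) target remote | vgdb --pid=1234
]]></screen>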
1011
1012<para><computeroutput>vgdb</computeroutput> accepts the following
1013options:</para>
1014<itemizedlist>
1015  <listitem>
    <para><option>--pid=&lt;number&gt;</option>: specifies the PID of
    the process to which vgdb must connect.  This option is useful
    when more than one Valgrind gdbserver can be connected to.  If
    the <option>--pid</option> argument is not given and multiple
    Valgrind gdbserver processes are running, vgdb will report the
    list of such processes and then exit.</para>
1022  </listitem>
1023
1024  <listitem>
1025    <para><option>--vgdb-prefix</option> must be given to both
1026    Valgrind and vgdb if you want to change the default prefix for the
1027    FIFOs (named pipes) used for communication between the Valgrind
1028    gdbserver and vgdb. </para>
1029  </listitem>
1030
1031  <listitem>
    <para><option>--wait=&lt;number&gt;</option> instructs vgdb to
    search for available Valgrind gdbservers for the specified number
    of seconds.  This makes it possible to start a vgdb process
    before starting the Valgrind gdbserver with which you intend
    vgdb to communicate.  This option is useful when used in
    conjunction with a <option>--vgdb-prefix</option> that is
    unique to the process you want to wait for.
    Also, if you use the <option>--wait</option> argument in the GDB
    "target remote" command, you must set the GDB remotetimeout to a
    value bigger than the <option>--wait</option> argument value.  See option
    <option>--max-invoke-ms</option> (just below)
    for an example of setting the remotetimeout value.</para>
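    <para>A possible sequence is sketched below.  The prefix
    <computeroutput>/tmp/vgdb-job42</computeroutput>, the program name
    and the timings are made-up values; the only requirement taken
    from the text above is that remotetimeout exceeds the
    <option>--wait</option> value:</para>
    <screen><![CDATA[
# Shell 1: GDB waits up to 60 seconds for the gdbserver to appear
gdb ./prog
(gdb) set remotetimeout 70
(gdb) target remote | vgdb --wait=60 --vgdb-prefix=/tmp/vgdb-job42

# Shell 2: start the program under Valgrind with the same prefix
valgrind --vgdb-error=0 --vgdb-prefix=/tmp/vgdb-job42 ./prog
]]></screen>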
1044  </listitem>
1045
1046  <listitem>
1047    <para><option>--max-invoke-ms=&lt;number&gt;</option> gives the
1048    number of milliseconds after which vgdb will force the invocation
1049    of gdbserver embedded in Valgrind.  The default value is 100
1050    milliseconds. A value of 0 disables forced invocation. The forced
1051    invocation is used when vgdb is connected to a Valgrind gdbserver,
1052    and the Valgrind process has all its threads blocked in a system
1053    call.
1054    </para>
1055
    <para>If you specify a large value, you might need to increase the
    GDB "remotetimeout" value from its default value of 2 seconds.
    You should ensure that the timeout (in seconds) is
    bigger than the <option>--max-invoke-ms</option> value (in
    milliseconds).  For
    example, for <option>--max-invoke-ms=5000</option>, the following
    GDB command is suitable:
    <screen><![CDATA[
(gdb) set remotetimeout 6
]]></screen>
    </para>
1066  </listitem>
1067
1068  <listitem>
1069    <para><option>--cmd-time-out=&lt;number&gt;</option> instructs a
1070    standalone vgdb to exit if the Valgrind gdbserver it is connected
1071    to does not process a command in the specified number of seconds.
1072    The default value is to never time out.</para>
1073  </listitem>
1074
1075  <listitem>
    <para><option>--port=&lt;portnr&gt;</option> instructs vgdb to
    use TCP/IP and listen for GDB on the specified port number rather
    than use a pipe to communicate with GDB.  Using TCP/IP makes it
    possible to run GDB on one computer while debugging a Valgrind
    process running on another (target) computer.
    Example:
    <screen><![CDATA[
# On the target computer, start your program under valgrind using
valgrind --vgdb-error=0 prog
# and then in another shell, run:
vgdb --port=1234
]]></screen></para>
    <para>On the computer which hosts GDB, execute the command:
    <screen><![CDATA[
gdb prog
(gdb) target remote targetip:1234
]]></screen>
    where targetip is the IP address or hostname of the target computer.
    </para>
1095  </listitem>
1096
1097  <listitem>
    <para><option>-c</option> separates commands: to give more than
    one command to a standalone vgdb, place the
    option <option>-c</option> between them. Example:
    <screen><![CDATA[
vgdb v.set log_output -c leak_check any
]]></screen></para>
1104  </listitem>
1105
1106  <listitem>
1107    <para><option>-l</option> instructs a standalone vgdb to report
1108    the list of the Valgrind gdbserver processes running and then
1109    exit.</para>
1110  </listitem>
1111
1112  <listitem>
1113    <para><option>-D</option> instructs a standalone vgdb to show the
1114    state of the shared memory used by the Valgrind gdbserver.  vgdb
1115    will exit after having shown the Valgrind gdbserver shared memory
1116    state.</para>
1117  </listitem>
1118
1119  <listitem>
    <para><option>-d</option> instructs vgdb to produce debugging
    output.  Give multiple <option>-d</option> args to increase the
    verbosity. When giving <option>-d</option> to a relay vgdb, you should
    redirect the standard error (stderr) of vgdb to a file to avoid
    interaction between GDB and vgdb debugging output.</para>
1125  </listitem>
1126
1127</itemizedlist>
1128
1129</sect2>
1130
1131
1132<sect2 id="manual-core-adv.valgrind-monitor-commands"
1133       xreflabel="Valgrind monitor commands">
1134<title>Valgrind monitor commands</title>
1135
1136<para>The Valgrind monitor commands are available regardless of the
1137Valgrind tool selected.  They can be sent either from a shell command
1138line, by using a standalone vgdb, or from GDB, by using GDB's
1139"monitor" command.</para>
1140
1141<itemizedlist>
1142  <listitem>
    <para><varname>help [debug]</varname> instructs Valgrind's gdbserver
    to give the list of all monitor commands of the Valgrind core and
    of the tool. The optional "debug" argument additionally requests help
    for the monitor commands aimed at debugging Valgrind internals.
    </para>
1148  </listitem>
1149
1150  <listitem>
1151    <para><varname>v.info all_errors</varname> shows all errors found
1152    so far.</para>
1153  </listitem>
1154  <listitem>
1155    <para><varname>v.info last_error</varname> shows the last error
1156    found.</para>
1157  </listitem>
1158
1159  <listitem>
1160    <para><varname>v.info n_errs_found</varname> shows the number of
1161    errors found so far and the current value of the
1162    <option>--vgdb-error</option>
1163    argument.</para>
1164  </listitem>
1165
1166  <listitem>
1167    <para><varname>v.set {gdb_output | log_output |
1168    mixed_output}</varname> allows redirection of the Valgrind output
1169    (e.g. the errors detected by the tool).  The default setting is
1170    <computeroutput>mixed_output</computeroutput>.</para>
1171
1172    <para>With <computeroutput>mixed_output</computeroutput>, the
1173    Valgrind output goes to the Valgrind log (typically stderr) while
1174    the output of the interactive GDB monitor commands (e.g.
1175    <computeroutput>v.info last_error</computeroutput>)
1176    is displayed by GDB.</para>
1177
1178    <para>With <computeroutput>gdb_output</computeroutput>, both the
1179    Valgrind output and the interactive GDB monitor commands output are
1180    displayed by GDB.</para>
1181
1182    <para>With <computeroutput>log_output</computeroutput>, both the
1183    Valgrind output and the interactive GDB monitor commands output go
1184    to the Valgrind log.</para>
1185  </listitem>
1186
1187  <listitem>
    <para><varname>v.wait [ms (default 0)]</varname> instructs
    Valgrind gdbserver to sleep "ms" milliseconds and then
    continue.  When sent from a standalone vgdb, if this is the last
    command, the Valgrind process will continue the execution of the
    guest process. The typical usage of this is to use vgdb to send a
    "no-op" command to a Valgrind gdbserver so as to continue the
    execution of the guest process.
    </para>
1196  </listitem>
1197
1198  <listitem>
1199    <para><varname>v.kill</varname> requests the gdbserver to kill
1200    the process. This can be used from a standalone vgdb to properly
1201    kill a Valgrind process which is currently expecting a vgdb
1202    connection.</para>
1203  </listitem>
1204
1205  <listitem>
1206    <para><varname>v.set vgdb-error &lt;errornr&gt;</varname>
1207    dynamically changes the value of the
1208    <option>--vgdb-error</option> argument. A
1209    typical usage of this is to start with
1210    <option>--vgdb-error=0</option> on the
1211    command line, then set a few breakpoints, set the vgdb-error value
1212    to a huge value and continue execution.</para>
1213  </listitem>
1214
1215</itemizedlist>
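
<para>As a brief illustration, the same commands can be issued from
either side.  Output is omitted here, since it depends on the tool and
on the program being run:</para>
<screen><![CDATA[
# From GDB, prefix each command with "monitor":
(gdb) monitor help
(gdb) monitor v.set gdb_output
(gdb) monitor v.info last_error

# From a shell, using a standalone vgdb:
vgdb v.info n_errs_found
vgdb v.info all_errors
]]></screen>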
1216
1217<para>The following Valgrind monitor commands are useful for
1218investigating the behaviour of Valgrind or its gdbserver in case of
1219problems or bugs.</para>
1220
1221<itemizedlist>
1222
1223  <listitem>
1224    <para><varname>v.info gdbserver_status</varname> shows the
1225    gdbserver status. In case of problems (e.g. of communications),
1226    this shows the values of some relevant Valgrind gdbserver internal
1227    variables.  Note that the variables related to breakpoints and
1228    watchpoints (e.g. the number of breakpoint addresses and the number of
1229    watchpoints) will be zero, as GDB by default removes all
1230    watchpoints and breakpoints when execution stops, and re-inserts
1231    them when resuming the execution of the debugged process.  You can
1232    change this GDB behaviour by using the GDB command
1233    <computeroutput>set breakpoint always-inserted on</computeroutput>.
1234    </para>
1235  </listitem>
1236
1237  <listitem>
1238    <para><varname>v.info memory</varname> shows the statistics of
1239    Valgrind's internal heap management. If
1240    option <option>--profile-heap=yes</option> was given, detailed
1241    statistics will be output.
1242    </para>
1243  </listitem>
1244
1245  <listitem>
    <para><varname>v.info scheduler</varname> shows the state and
    stack trace for all threads, as known by Valgrind.  This makes it
    possible to compare the stack traces produced by the Valgrind
    unwinder with the stack traces produced by GDB+Valgrind gdbserver.
    Note that GDB and the Valgrind scheduler each have their own thread
    numbering scheme.  To map a GDB thread
    number to the corresponding Valgrind scheduler thread number,
    use the GDB command <computeroutput>info
    threads</computeroutput>.  The output of this command shows the
    GDB thread number and the Valgrind 'tid'.  The 'tid' is the thread number
    output by <computeroutput>v.info scheduler</computeroutput>.
    When using the callgrind tool, the callgrind monitor command
    <computeroutput>status</computeroutput> outputs internal callgrind
    information about the stack/call graph it maintains.
    </para>
1261  </listitem>
1262
1263  <listitem>
    <para><varname>v.set debuglog &lt;intvalue&gt;</varname> sets the
    Valgrind debug log level to &lt;intvalue&gt;.  This makes it
    possible to change the log level of Valgrind dynamically, e.g. when
    a problem is detected.</para>
1268  </listitem>
1269
1270  <listitem>
    <para><varname>v.translate &lt;address&gt;
    [&lt;traceflags&gt;]</varname> shows the translation of the block
    containing <computeroutput>address</computeroutput> with the given
    trace flags. The <computeroutput>traceflags</computeroutput> value
    bit patterns have similar meaning to Valgrind's
    <option>--trace-flags</option> option.  It can be given
    in hexadecimal (e.g. 0x20), in decimal (e.g. 32) or as a binary
    string of 1s and 0s (e.g. 0b00100000). The default value of the traceflags
    is 0b00100000, corresponding to "show after instrumentation".
    The output of this command always goes to the Valgrind
    log.</para>
1282    <para>The additional bit flag 0b100000000 (bit 8)
1283    has no equivalent in the <option>--trace-flags</option> option.
1284    It enables tracing of the gdbserver specific instrumentation.  Note
1285    that this bit 8 can only enable the addition of gdbserver
1286    instrumentation in the trace.  Setting it to 0 will not
1287    disable the tracing of the gdbserver instrumentation if it is
1288    active for some other reason, for example because there is a breakpoint at
1289    this address or because gdbserver is in single stepping
1290    mode.</para>
1291  </listitem>
1292
1293</itemizedlist>
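
<para>A short sketch of how these commands might be combined from GDB
follows.  The address <computeroutput>0x8048400</computeroutput> is a
hypothetical code address and the debug log level 2 is an arbitrary
choice.  The trace flags value 0b100100000 combines the default
0b00100000 with the gdbserver-specific bit 8 described above:</para>
<screen><![CDATA[
(gdb) info threads
(gdb) monitor v.info scheduler
(gdb) monitor v.set debuglog 2
(gdb) monitor v.translate 0x8048400 0b100100000
]]></screen>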
1294
1295</sect2>
1296
1297</sect1>
1298
1299
1300
1301
1302
1303<sect1 id="manual-core-adv.wrapping" xreflabel="Function Wrapping">
1304<title>Function wrapping</title>
1305
1306<para>
1307Valgrind allows calls to some specified functions to be intercepted and
1308rerouted to a different, user-supplied function.  This can do whatever it
1309likes, typically examining the arguments, calling onwards to the original,
1310and possibly examining the result.  Any number of functions may be
1311wrapped.</para>
1312
1313<para>
1314Function wrapping is useful for instrumenting an API in some way.  For
1315example, Helgrind wraps functions in the POSIX pthreads API so it can know
1316about thread status changes, and the core is able to wrap
1317functions in the MPI (message-passing) API so it can know
1318of memory status changes associated with message arrival/departure.
1319Such information is usually passed to Valgrind by using client
1320requests in the wrapper functions, although the exact mechanism may vary.
1321</para>
1322
1323<sect2 id="manual-core-adv.wrapping.example" xreflabel="A Simple Example">
1324<title>A Simple Example</title>
1325
<para>Suppose we want to wrap some function:</para>
1327
<programlisting><![CDATA[
int foo ( int x, int y ) { return x + y; }]]></programlisting>
1330
1331<para>A wrapper is a function of identical type, but with a special name
1332which identifies it as the wrapper for <computeroutput>foo</computeroutput>.
1333Wrappers need to include
1334supporting macros from <filename>valgrind.h</filename>.
1335Here is a simple wrapper which prints the arguments and return value:</para>
1336
<programlisting><![CDATA[
#include <stdio.h>
#include "valgrind.h"
int I_WRAP_SONAME_FNNAME_ZU(NONE,foo)( int x, int y )
{
   int    result;
   OrigFn fn;
   VALGRIND_GET_ORIG_FN(fn);
   printf("foo's wrapper: args %d %d\n", x, y);
   CALL_FN_W_WW(result, fn, x,y);
   printf("foo's wrapper: result %d\n", result);
   return result;
}
]]></programlisting>
1351
1352<para>To become active, the wrapper merely needs to be present in a text
1353section somewhere in the same process' address space as the function
1354it wraps, and for its ELF symbol name to be visible to Valgrind.  In
1355practice, this means either compiling to a
1356<computeroutput>.o</computeroutput> and linking it in, or
1357compiling to a <computeroutput>.so</computeroutput> and
1358<computeroutput>LD_PRELOAD</computeroutput>ing it in.  The latter is more
1359convenient in that it doesn't require relinking.</para>
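
<para>As a rough sketch of the second approach, assume the wrapper
above is saved as <filename>foowrap.c</filename>, that
<filename>valgrind.h</filename> is installed under
<filename>/usr/include/valgrind</filename>, and that the program
defining <computeroutput>foo</computeroutput> is
<computeroutput>./prog</computeroutput>.  All three names are
assumptions made for illustration; adjust them to your setup:</para>
<screen><![CDATA[
# Build the wrapper as a shared object
gcc -g -fPIC -shared -I/usr/include/valgrind -o libfoowrap.so foowrap.c
# Preload it into the process run under Valgrind
LD_PRELOAD=./libfoowrap.so valgrind ./prog
]]></screen>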
1360
1361<para>All wrappers have approximately the above form.  There are three
1362crucial macros:</para>
1363
1364<para><computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>:
1365this generates the real name of the wrapper.
1366This is an encoded name which Valgrind notices when reading symbol
1367table information.  What it says is: I am the wrapper for any function
1368named <computeroutput>foo</computeroutput> which is found in
1369an ELF shared object with an empty
1370("<computeroutput>NONE</computeroutput>") soname field.  The specification
1371mechanism is powerful in
1372that wildcards are allowed for both sonames and function names.
1373The details are discussed below.</para>
1374
<para><computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>:
once in the wrapper, the first priority is
to get hold of the address of the original (and any other supporting
information needed).  This is stored in a value of opaque
type <computeroutput>OrigFn</computeroutput>.
The information is acquired using
<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>.  It is crucial
to make this macro call before calling any other wrapped function
in the same thread.</para>
1384
1385<para><computeroutput>CALL_FN_W_WW</computeroutput>: eventually we will
1386want to call the function being
1387wrapped.  Calling it directly does not work, since that just gets us
1388back to the wrapper and leads to an infinite loop.  Instead, the result
1389lvalue,
1390<computeroutput>OrigFn</computeroutput> and arguments are
1391handed to one of a family of macros of the form
1392<computeroutput>CALL_FN_*</computeroutput>.  These
1393cause Valgrind to call the original and avoid recursion back to the
1394wrapper.</para>
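
<para>The suffix of the <computeroutput>CALL_FN_*</computeroutput>
macro reflects the shape of the call: in
<computeroutput>CALL_FN_W_WW</computeroutput> the result and both
arguments are word-sized, while a <computeroutput>v</computeroutput>
in the result position is used for functions returning void.  As a
minimal sketch, assuming a hypothetical function
<computeroutput>void bar ( long n )</computeroutput> in the main
executable, a wrapper could look like this:</para>

<programlisting><![CDATA[
#include <stdio.h>
#include "valgrind.h"
/* bar is a hypothetical function used only for illustration */
void I_WRAP_SONAME_FNNAME_ZU(NONE,bar)( long n )
{
   OrigFn fn;
   VALGRIND_GET_ORIG_FN(fn);      /* must come before any other wrapped call */
   printf("bar's wrapper: arg %ld\n", n);
   CALL_FN_v_W(fn, n);            /* void result, one word-sized argument */
}
]]></programlisting>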
1395</sect2>
1396
1397<sect2 id="manual-core-adv.wrapping.specs" xreflabel="Wrapping Specifications">
1398<title>Wrapping Specifications</title>
1399
1400<para>This scheme has the advantage of being self-contained.  A library of
1401wrappers can be compiled to object code in the normal way, and does
1402not rely on an external script telling Valgrind which wrappers pertain
1403to which originals.</para>
1404
1405<para>Each wrapper has a name which, in the most general case says: I am the
1406wrapper for any function whose name matches FNPATT and whose ELF
1407"soname" matches SOPATT.  Both FNPATT and SOPATT may contain wildcards
1408(asterisks) and other characters (spaces, dots, @, etc) which are not
1409generally regarded as valid C identifier names.</para>
1410
1411<para>This flexibility is needed to write robust wrappers for POSIX pthread
1412functions, where typically we are not completely sure of either the
1413function name or the soname, or alternatively we want to wrap a whole
1414set of functions at once.</para>
1415
<para>For example, <computeroutput>pthread_create</computeroutput>
in GNU libpthread is usually a
versioned symbol, that is, one whose name ends in, e.g.,
<computeroutput>@GLIBC_2.3</computeroutput>.  Hence we
are not sure what its real name is.  We also want to cover any soname
of the form <computeroutput>libpthread.so*</computeroutput>.
So the header of the wrapper will be</para>
1423
<programlisting><![CDATA[
int I_WRAP_SONAME_FNNAME_ZZ(libpthreadZdsoZd0,pthreadZucreateZAZa)
  ( ... formals ... )
  { ... body ... }
]]></programlisting>
1429
1430<para>In order to write unusual characters as valid C function names, a
1431Z-encoding scheme is used.  Names are written literally, except that
1432a capital Z acts as an escape character, with the following encoding:</para>
1433
1434<programlisting><![CDATA[
1435     Za   encodes    *
1436     Zp              +
1437     Zc              :
1438     Zd              .
1439     Zu              _
1440     Zh              -
1441     Zs              (space)
1442     ZA              @
1443     ZZ              Z
1444     ZL              (       # only in valgrind 3.3.0 and later
1445     ZR              )       # only in valgrind 3.3.0 and later
1446]]></programlisting>
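
<para>Hence <computeroutput>libpthreadZdsoZd0</computeroutput> is an
encoding of the soname <computeroutput>libpthread.so.0</computeroutput>
and <computeroutput>pthreadZucreateZAZa</computeroutput> is an encoding
of the function name pattern
<computeroutput>pthread_create@*</computeroutput>.  As written, the soname
matches only <computeroutput>libpthread.so.0</computeroutput>; to cover
every soname of the form <computeroutput>libpthread.so*</computeroutput>
it would instead be written
<computeroutput>libpthreadZdsoZa</computeroutput>.
</para>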
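
<para>The encoding can be checked mechanically.  The following is a
minimal sketch of a decoder for the table above.  It is not code taken
from Valgrind, and the function name
<computeroutput>z_decode</computeroutput> and its interface are
hypothetical.</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Hypothetical helper: decode a Z-encoded name into out[0..out_size-1].
   Returns 0 on success, or -1 on an unknown escape, a trailing 'Z',
   or insufficient space in out. */
int z_decode(const char *in, char *out, size_t out_size)
{
   size_t n = 0;
   while (*in != '\0') {
      char c = *in++;
      if (c == 'Z') {
         /* 'Z' is the escape character; the next character selects
            the symbol it stands for. */
         switch (*in++) {
            case 'a': c = '*'; break;
            case 'p': c = '+'; break;
            case 'c': c = ':'; break;
            case 'd': c = '.'; break;
            case 'u': c = '_'; break;
            case 'h': c = '-'; break;
            case 's': c = ' '; break;
            case 'A': c = '@'; break;
            case 'Z': c = 'Z'; break;
            case 'L': c = '('; break;
            case 'R': c = ')'; break;
            default:  return -1;
         }
      }
      /* Keep one byte free for the terminating NUL. */
      if (n + 1 >= out_size) return -1;
      out[n++] = c;
   }
   if (out_size == 0) return -1;
   out[n] = '\0';
   return 0;
}
]]></programlisting>

<para>Applied to the two components of the wrapper header above, this
sketch yields the soname and function name pattern given in the preceding
paragraph.</para>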

<para>The macro <computeroutput>I_WRAP_SONAME_FNNAME_ZZ</computeroutput>
constructs a wrapper name in which
both the soname (first component) and function name (second component)
are Z-encoded.  Encoding the function name can be tiresome and is
often unnecessary, so a second macro,
<computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>, can be
used instead.  The <computeroutput>_ZU</computeroutput> variant is
also useful for writing wrappers for
C++ functions, in which the function name is usually already mangled
using some other convention in which Z plays an important role.  Having
to encode a second time quickly becomes confusing.</para>

<para>Since the function name field may contain wildcards, it can be
anything, including just <computeroutput>*</computeroutput>.
The same is true for the soname.
However, some ELF objects - specifically, main executables - do not
have sonames.  Any object lacking a soname is treated as if its soname
was <computeroutput>NONE</computeroutput>, which is why the original
example above had a name
<computeroutput>I_WRAP_SONAME_FNNAME_ZU(NONE,foo)</computeroutput>.</para>
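
<para>The matching rule can be modelled in a few lines of C.  This is a
toy sketch and not Valgrind's own matcher, which may differ in detail.
The names <computeroutput>glob_match</computeroutput> and
<computeroutput>spec_matches</computeroutput> are hypothetical, and the
only wildcard handled is the asterisk.</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Toy glob matcher: '*' matches any run of characters (possibly empty);
   every other character matches only itself. */
static int glob_match(const char *pat, const char *str)
{
   if (*pat == '\0') return *str == '\0';
   if (*pat == '*') {
      /* Let '*' absorb 0, 1, 2, ... characters of str in turn. */
      for (;; str++) {
         if (glob_match(pat + 1, str)) return 1;
         if (*str == '\0') return 0;
      }
   }
   return *str == *pat && glob_match(pat + 1, str + 1);
}

/* Hypothetical model of a wrapper specification being tested against
   one symbol.  soname may be NULL for an object without a soname, in
   which case it is treated as "NONE", as described above. */
int spec_matches(const char *sopatt, const char *fnpatt,
                 const char *soname, const char *fnname)
{
   if (soname == NULL) soname = "NONE";
   return glob_match(sopatt, soname) && glob_match(fnpatt, fnname);
}
]]></programlisting>

<para>In this model the decoded patterns
<computeroutput>libpthread.so*</computeroutput> and
<computeroutput>pthread_create@*</computeroutput> match the symbol
<computeroutput>pthread_create@GLIBC_2.3</computeroutput> in
<computeroutput>libpthread.so.0</computeroutput>, and the patterns
<computeroutput>NONE</computeroutput> and
<computeroutput>foo</computeroutput> match
<computeroutput>foo</computeroutput> only in an object that has no
soname.</para>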

<para>Note that the soname of an ELF object is not the same as its
file name, although it is often similar.  You can find the soname of
an object <computeroutput>libfoo.so</computeroutput> using the command
<computeroutput>readelf -a libfoo.so | grep soname</computeroutput>.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.semantics" xreflabel="Wrapping Semantics">
<title>Wrapping Semantics</title>

<para>The ability for a wrapper to replace an infinite family of functions
is powerful but brings complications in situations where ELF objects
appear and disappear (are dlopen'd and dlclose'd) on the fly.
Valgrind tries to maintain sensible behaviour in such situations.</para>

<para>For example, suppose a process has dlopen'd (an ELF object with
soname) <filename>object1.so</filename>, which contains
<computeroutput>function1</computeroutput>.  It starts to use
<computeroutput>function1</computeroutput> immediately.</para>

<para>After a while it dlopens <filename>wrappers.so</filename>,
which contains a wrapper
for <computeroutput>function1</computeroutput> in (soname)
<filename>object1.so</filename>.  All subsequent calls to
<computeroutput>function1</computeroutput> are rerouted to the wrapper.</para>

<para>If <filename>wrappers.so</filename> is
later dlclose'd, calls to <computeroutput>function1</computeroutput> are
naturally routed back to the original.</para>

<para>Alternatively, if <filename>object1.so</filename>
is dlclose'd but <filename>wrappers.so</filename> remains,
then the wrapper exported by <filename>wrappers.so</filename>
becomes inactive, since there
is no way to get to it - there is no original to call any more.  However,
Valgrind remembers that the wrapper is still present.  If
<filename>object1.so</filename> is
eventually dlopen'd again, the wrapper will become active again.</para>

<para>In short, Valgrind inspects all code loading/unloading events to
ensure that the set of currently active wrappers remains consistent.</para>

<para>A second possible problem is that of conflicting wrappers.  It is
easily possible to load two or more wrappers, all of which claim
to be wrappers for the same function.  In such cases Valgrind will
complain about conflicting wrappers when the second one appears, and
will honour only the first one.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.debugging" xreflabel="Debugging">
<title>Debugging</title>

<para>Figuring out what's going on given the dynamic nature of wrapping
can be difficult.  The
<option>--trace-redir=yes</option> option makes
this possible
by showing the complete state of the redirection subsystem after
every
<function>mmap</function>/<function>munmap</function>
event affecting code (text).</para>

<para>There are two central concepts:</para>

<itemizedlist>

  <listitem><para>A "redirection specification" is a binding of
  a (soname pattern, fnname pattern) pair to a code address.
  These bindings are created by writing functions with names
  made with the
  <computeroutput>I_WRAP_SONAME_FNNAME_{ZZ,ZU}</computeroutput>
  macros.</para></listitem>

  <listitem><para>An "active redirection" is a code-address to
  code-address binding currently in effect.</para></listitem>

</itemizedlist>

<para>The state of the wrapping-and-redirection subsystem comprises a set of
specifications and a set of active bindings.  The specifications are
acquired/discarded by watching all
<function>mmap</function>/<function>munmap</function>
events on code (text)
sections.  The active binding set is (conceptually) recomputed from
the specifications, and all known symbol names, following any change
to the specification set.</para>
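
<para>The recomputation can be pictured with a toy model.  The sketch
below is purely illustrative and does not reflect Valgrind's internal
data structures: all type and function names are hypothetical, patterns
are compared as literal names (wildcards are omitted for brevity), and
the rule that the first matching specification wins follows the
description of conflicting wrappers given earlier.</para>

<programlisting><![CDATA[
#include <string.h>

/* Toy model only; these types and names are hypothetical. */
typedef struct { const char *soname, *fnname; unsigned long wrapper_addr; } Spec;
typedef struct { const char *soname, *fnname; unsigned long addr; } Symbol;
typedef struct { unsigned long from_addr, to_addr; } Active;

/* Recompute the active set from scratch, given the current
   specifications and the currently known symbols.  Returns the
   number of active bindings written to out[]. */
int recompute_active(const Spec *specs, int n_specs,
                     const Symbol *syms, int n_syms,
                     Active *out, int max_out)
{
   int n_out = 0;
   for (int i = 0; i < n_syms; i++) {
      for (int j = 0; j < n_specs; j++) {
         if (strcmp(specs[j].soname, syms[i].soname) == 0 &&
             strcmp(specs[j].fnname, syms[i].fnname) == 0) {
            if (n_out < max_out) {
               out[n_out].from_addr = syms[i].addr;
               out[n_out].to_addr   = specs[j].wrapper_addr;
               n_out++;
            }
            break;   /* the first matching specification wins */
         }
      }
   }
   return n_out;
}
]]></programlisting>

<para>In this model, unloading <filename>object1.so</filename> removes
its symbols, so the active set becomes empty while the specification
survives.  Loading it again restores the binding on the next
recomputation.</para>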

<para><option>--trace-redir=yes</option> shows the contents
of both sets following any such event.</para>

<para><option>-v</option> prints a line of text each
time an active redirection is used for the first time.</para>

<para>Hence for maximum debugging effectiveness you will need to use both
options.</para>
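
<para>A typical invocation combining the two options might look as
follows, where <computeroutput>./myprog</computeroutput> is a placeholder
for the program under test:</para>

<programlisting><![CDATA[
valgrind --trace-redir=yes -v ./myprog
]]></programlisting>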

<para>One final comment.  The function-wrapping facility is closely
tied to Valgrind's ability to replace (redirect) specified
functions, for example to redirect calls to
<function>malloc</function> to its
own implementation.  Indeed, a replacement function can be
regarded as a wrapper function which does not call the original.
However, to make the implementation more robust, the two kinds
of interception (wrapping vs replacement) are treated differently.
</para>

<para><option>--trace-redir=yes</option> shows
specifications and bindings for both
replacement and wrapper functions.  To differentiate the
two, replacement bindings are printed using
<computeroutput>R-></computeroutput> whereas
wraps are printed using <computeroutput>W-></computeroutput>.
</para>
</sect2>


<sect2 id="manual-core-adv.wrapping.limitations-cf"
       xreflabel="Limitations - control flow">
<title>Limitations - control flow</title>

<para>For the most part, the function wrapping implementation is robust.
The only important caveat is: in a wrapper, get hold of
the <computeroutput>OrigFn</computeroutput> information using
<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput> before calling any
other wrapped function.  Once you have the
<computeroutput>OrigFn</computeroutput>, arbitrary
calls between, recursion between, and longjumps out of wrappers
should work correctly.  There is never any interaction between wrapped
functions and merely replaced functions
(eg <function>malloc</function>), so you can call
<function>malloc</function> etc safely from within wrappers.
</para>

<para>The above comments are true for {x86,amd64,ppc32,arm}-linux.  On
ppc64-linux function wrapping is more fragile due to the (arguably
poorly designed) ppc64-linux ABI.  This mandates the use of a shadow
stack which tracks entries/exits of both wrapper and replacement
functions.  This gives two limitations: firstly, longjumping out of
wrappers will rapidly lead to disaster, since the shadow stack will
not get correctly cleared.  Secondly, since the shadow stack has
finite size, recursion between wrapper/replacement functions is only
possible to a limited depth, beyond which Valgrind has to abort the
run.  This depth is currently 16 calls.</para>

<para>For all platforms ({x86,amd64,ppc32,ppc64,arm}-linux) all the above
comments apply on a per-thread basis.  In other words, wrapping is
thread-safe: each thread must individually observe the above
restrictions, but there is no need for any kind of inter-thread
cooperation.</para>
</sect2>


<sect2 id="manual-core-adv.wrapping.limitations-sigs"
       xreflabel="Limitations - original function signatures">
<title>Limitations - original function signatures</title>

<para>As shown in the above example, to call the original you must use a
macro of the form <computeroutput>CALL_FN_*</computeroutput>.
For technical reasons it is impossible
to create a single macro to deal with all argument types and numbers,
so a family of macros covering the most common cases is supplied.  In
what follows, 'W' denotes a machine-word-typed value (a pointer or a
C <computeroutput>long</computeroutput>),
and 'v' denotes C's <computeroutput>void</computeroutput> type.
The currently available macros are:</para>

<programlisting><![CDATA[
CALL_FN_v_v    -- call an original of type  void fn ( void )
CALL_FN_W_v    -- call an original of type  long fn ( void )

CALL_FN_v_W    -- call an original of type  void fn ( long )
CALL_FN_W_W    -- call an original of type  long fn ( long )

CALL_FN_v_WW   -- call an original of type  void fn ( long, long )
CALL_FN_W_WW   -- call an original of type  long fn ( long, long )

CALL_FN_v_WWW  -- call an original of type  void fn ( long, long, long )
CALL_FN_W_WWW  -- call an original of type  long fn ( long, long, long )

CALL_FN_W_WWWW -- call an original of type  long fn ( long, long, long, long )
CALL_FN_W_5W   -- call an original of type  long fn ( long, long, long, long, long )
CALL_FN_W_6W   -- call an original of type  long fn ( long, long, long, long, long, long )
and so on, up to
CALL_FN_W_12W
]]></programlisting>
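
<para>As an illustration of how a signature maps onto this family,
consider a function taking a pointer and a
<computeroutput>size_t</computeroutput> and returning a pointer.  Both
arguments and the result are machine words, so
<computeroutput>CALL_FN_W_WW</computeroutput> is the matching macro.  The
sketch below assumes a hypothetical function
<computeroutput>dup_prefix</computeroutput> defined in the main
executable; it only has an effect when the program is run under
Valgrind.</para>

<programlisting><![CDATA[
#include <stddef.h>
#include <valgrind/valgrind.h>

/* Assumed original, defined in the main executable (soname NONE):
      char *dup_prefix(const char *s, size_t n);
   dup_prefix is a hypothetical name used only for illustration. */
char *I_WRAP_SONAME_FNNAME_ZU(NONE, dup_prefix)(const char *s, size_t n)
{
   OrigFn        fn;
   unsigned long result;   /* machine word receiving the return value */

   /* Fetch the original's details before calling anything else. */
   VALGRIND_GET_ORIG_FN(fn);

   /* Two word-sized arguments, word-sized result: CALL_FN_W_WW. */
   CALL_FN_W_WW(result, fn, s, n);

   return (char *)result;
}
]]></programlisting>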

<para>The set of supported types can be expanded as needed.  It is
regrettable that this limitation exists.  Function wrapping has proven
difficult to implement, with a certain apparently unavoidable level of
ickiness.  After several implementation attempts, the present
arrangement appears to be the least-worst tradeoff.  At least it works
reliably in the presence of dynamic linking and dynamic code
loading/unloading.</para>

<para>You should not attempt to wrap a function of one type signature with a
wrapper of a different type signature.  Such trickery will surely lead
to crashes or strange behaviour.  This is not a limitation
of the function wrapping implementation, merely a reflection of the
fact that it gives you sweeping powers to shoot yourself in the foot
if you are not careful.  Imagine the instant havoc you could wreak by
writing a wrapper which matched any function name in any soname - in
effect, one which claimed to be a wrapper for all functions in the
process.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.examples" xreflabel="Examples">
<title>Examples</title>

<para>In the source tree,
<filename>memcheck/tests/wrap[1-8].c</filename> provide a series of
examples, ranging from very simple to quite advanced.</para>

<para><filename>mpi/libmpiwrap.c</filename> is an example
of wrapping a big, complex API (the MPI-2 interface).  This file defines
almost 300 different wrappers.</para>
</sect2>

</sect1>




</chapter>