<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>


<chapter id="sg-manual"
         xreflabel="SGCheck: an experimental stack and global array overrun detector">
  <title>SGCheck: an experimental stack and global array overrun detector</title>

<para>To use this tool, you must specify
<option>--tool=exp-sgcheck</option> on the Valgrind
command line.</para>


<sect1 id="sg-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>SGCheck is a tool for finding overruns of stack and global
arrays.  It works by using a heuristic approach derived from an
observation about the likely forms of stack and global array accesses.
</para>

</sect1>


<sect1 id="sg-manual.options" xreflabel="SGCheck Command-line Options">
<title>SGCheck Command-line Options</title>

<para>There are no SGCheck-specific command-line options at present.</para>
<!--
<para>SGCheck-specific command-line options are:</para>


<variablelist id="sg.opts.list">
</variablelist>
-->

</sect1>


<sect1 id="sg-manual.how-works.sg-checks"
       xreflabel="How SGCheck Works">
<title>How SGCheck Works</title>

<para>When a source file is compiled
with <option>-g</option>, the compiler attaches DWARF3
debugging information which describes the location of all stack and
global arrays in the file.</para>

<para>Checking of accesses to such arrays would then be relatively
simple, if the compiler could also tell us which array (if any) each
memory referencing instruction was supposed to access.  Unfortunately
the DWARF3 debugging format does not provide a way to represent such
information, so we have to resort to a heuristic technique to
approximate it.
The key observation is that
   <emphasis>
   if a memory referencing instruction accesses inside a stack or
   global array once, then it is highly likely to always access that
   same array</emphasis>.</para>

<para>To see how this might be useful, consider the following buggy
fragment:</para>
<programlisting><![CDATA[
   { int i, a[10];  // both are auto vars
     for (i = 0; i <= 10; i++)
        a[i] = 42;
   }
]]></programlisting>

<para>At run time we will know the precise address
of <computeroutput>a[]</computeroutput> on the stack, and so we can
observe that the first store resulting from
<computeroutput>a[i] = 42</computeroutput>
writes <computeroutput>a[]</computeroutput>, and
we will (correctly) assume that that instruction is intended always to
access <computeroutput>a[]</computeroutput>.  Then, on the 11th
iteration, it accesses somewhere else, possibly a different local,
possibly an unaccounted-for area of the stack (e.g., a spill slot), so
SGCheck reports an error.</para>

<para>There is an important caveat.</para>

<para>Imagine a function such as <function>memcpy</function>, which is used
to read and write many different areas of memory over the lifetime of the
program.  If we insist that the read and write instructions in its memory
copying loop only ever access one particular stack or global variable, we
will be flooded with errors resulting from calls to
<function>memcpy</function>.</para>

<para>To avoid this problem, SGCheck instantiates fresh likely-target
records for each entry to a function, and discards them on exit.  This
allows detection of cases where (e.g.) <function>memcpy</function>
overflows its source or destination buffers for any specific call, but
does not carry any restriction from one call to the next.  Indeed,
multiple threads may make multiple simultaneous calls to
(e.g.)
<function>memcpy</function> without mutual interference.</para>

</sect1>


<sect1 id="sg-manual.cmp-w-memcheck"
       xreflabel="Comparison with Memcheck">
<title>Comparison with Memcheck</title>

<para>SGCheck and Memcheck are complementary: their capabilities do
not overlap.  Memcheck performs bounds checks and use-after-free
checks for heap arrays.  It also finds uses of uninitialised values
created by heap or stack allocations.  But it does not perform bounds
checking for stack or global arrays.</para>

<para>SGCheck, on the other hand, does do bounds checking for stack or
global arrays, but it doesn't do anything else.</para>

</sect1>


<sect1 id="sg-manual.limitations"
       xreflabel="Limitations">
<title>Limitations</title>

<para>This is an experimental tool, which relies rather too heavily on some
not-as-robust-as-I-would-like assumptions about the behaviour of correct
programs.  There are a number of limitations which you should be aware
of.</para>

<itemizedlist>

  <listitem>
   <para>False negatives (missed errors): it follows from the
   description above (<xref linkend="sg-manual.how-works.sg-checks"/>)
   that the first access by a memory referencing instruction to a
   stack or global array creates an association between that
   instruction and the array, which is checked on subsequent accesses
   by that instruction, until the containing function exits.  Hence,
   the first access by an instruction to an array (in any given
   function instantiation) is not checked for overrun, since SGCheck
   uses that as the "example" of how subsequent accesses should
   behave.</para>
  </listitem>

  <listitem>
   <para>False positives (false errors): similarly, and more seriously,
   it is clearly possible to write legitimate pieces of code which
   break the basic assumption upon which the checking algorithm
   depends.
   For example:</para>

<programlisting><![CDATA[
  { int a[10], b[10], *p, i;
    for (i = 0; i < 10; i++) {
       p = /* arbitrary condition */  ? &a[i]  : &b[i];
       *p = 42;
    }
  }
]]></programlisting>

   <para>In this case the store sometimes
   accesses <computeroutput>a[]</computeroutput> and
   sometimes <computeroutput>b[]</computeroutput>, but in no case is
   the addressed array overrun.  Nevertheless the change in target
   will cause an error to be reported.</para>

   <para>It is hard to see how to get around this problem.  The only
   mitigating factor is that such constructions appear to be very rare,
   at least judging from the results using the tool so far.  Such a
   construction appears only once in the Valgrind sources (running
   Valgrind on Valgrind) and perhaps two or three times for a start
   and exit of Firefox.  The best that can be done is to suppress the
   errors.</para>
  </listitem>

  <listitem>
   <para>Performance: SGCheck has to read all of
   the DWARF3 type and variable information on the executable and its
   shared objects.  This is computationally expensive and makes
   startup quite slow.  You can expect debuginfo reading time to be in
   the region of a minute for an OpenOffice-sized application, on a
   2.4 GHz Core 2 machine.  Reading this information also requires a
   lot of memory.  To make it viable, SGCheck goes to considerable
   trouble to compress the in-memory representation of the DWARF3
   data, which is why the process of reading it appears slow.</para>
  </listitem>

  <listitem>
   <para>Performance: SGCheck runs slower than Memcheck.  This is
   partly due to a lack of tuning, but partly due to algorithmic
   difficulties.  The
   stack and global checks can sometimes require a number of range
   checks per memory access, and these are difficult to short-circuit,
   despite considerable efforts having been made.
   A redesign and reimplementation could potentially make it much faster.
   </para>
  </listitem>

  <listitem>
   <para>Coverage: Stack and global checking is fragile.  If a shared
   object does not have debug information attached, then SGCheck will
   not be able to determine the bounds of any stack or global arrays
   defined within that shared object, and so will not be able to check
   accesses to them.  This is true even when those arrays are accessed
   from some other shared object which was compiled with debug
   info.</para>

   <para>At the moment SGCheck accepts objects lacking debuginfo
   without comment.  This is dangerous as it causes SGCheck to
   silently skip stack and global checking for such objects.  It would
   be better to print a warning in such circumstances.</para>
  </listitem>

  <listitem>
   <para>Coverage: SGCheck does not check whether the areas read
   or written by system calls overrun stack or global arrays.  This
   would be easy to add.</para>
  </listitem>

  <listitem>
   <para>Platforms: the stack/global checks won't work properly on
   PowerPC, ARM or S390X platforms, only on X86 and AMD64 targets.
   That's because the stack and global checking requires tracking
   function calls and exits reliably, and there's no obvious way to do
   it on ABIs that use a link register for function returns.
   </para>
  </listitem>

  <listitem>
   <para>Robustness: related to the previous point.  Function
   call/exit tracking for X86 and AMD64 is believed to work properly
   even in the presence of longjmps within the same stack (although
   this has not been tested).
   However, code which switches stacks is
   likely to break the tracking and produce chaotic results.</para>
  </listitem>
</itemizedlist>

</sect1>


<sect1 id="sg-manual.todo-user-visible"
       xreflabel="Still To Do: User-visible Functionality">
<title>Still To Do: User-visible Functionality</title>

<itemizedlist>

  <listitem>
   <para>Extend system call checking to work on stack and global arrays.</para>
  </listitem>

  <listitem>
   <para>Print a warning if a shared object does not have debug info
   attached, or if, for whatever reason, debug info could not be
   found, or read.</para>
  </listitem>

  <listitem>
   <para>Add some heuristic filtering that removes obvious false
   positives.  This would be easy to do.  For example, an access
   transition from a heap to a stack object almost certainly isn't a
   bug and so should not be reported to the user.</para>
  </listitem>

</itemizedlist>

</sect1>


<sect1 id="sg-manual.todo-implementation"
       xreflabel="Still To Do: Implementation Tidying">
<title>Still To Do: Implementation Tidying</title>

<para>Items marked CRITICAL are considered important for correctness:
failure to fix them is liable to lead to crashes or assertion failures
in real use.</para>

<itemizedlist>

  <listitem>
   <para>sg_main.c: Redesign and reimplement the basic checking
   algorithm.  It could be done much faster than it is; the current
   implementation isn't very good.
   </para>
  </listitem>

  <listitem>
   <para>sg_main.c: Improve the performance of the stack / global
   checks by doing some up-front filtering to ignore references in
   areas which "obviously" can't be stack or globals.
   This will
   require using information that m_aspacemgr knows about the address
   space layout.</para>
  </listitem>

  <listitem>
   <para>sg_main.c: fix compute_II_hash to make it a bit more sensible
   for ppc32/64 targets (except that sg_ doesn't work on ppc32/64
   targets, so this is a bit academic at the moment).</para>
  </listitem>

</itemizedlist>

</sect1>


</chapter>