1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2          "http://www.w3.org/TR/html4/strict.dtd">
3<html>
4<head>
5  <title>Checker Developer Manual</title>
6  <link type="text/css" rel="stylesheet" href="menu.css">
7  <link type="text/css" rel="stylesheet" href="content.css">
8  <script type="text/javascript" src="scripts/menu.js"></script>
9</head>
10<body>
11
12<div id="page">
13<!--#include virtual="menu.html.incl"-->
14
15<div id="content">
16
17<h3 style="color:red">This Page Is Under Construction</h3>
18
19<h1>Checker Developer Manual</h1>
20
<p>The static analyzer engine performs path-sensitive exploration of the program and
relies on a set of checkers to implement the logic for detecting and
constructing specific bug reports. Anyone interested in implementing their own
checker should check out the Building a Checker in 24 Hours talk
(<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
<a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>)
and refer to this page for additional information on writing a checker. The static analyzer is
part of the Clang project, so consult <a href="http://clang.llvm.org/hacking.html">Hacking on Clang</a>
and the <a href="http://llvm.org/docs/ProgrammersManual.html">LLVM Programmer's Manual</a>
for developer guidelines, and send your questions and proposals to the
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">cfe-dev mailing list</a>.
</p>
33
34    <ul>
35      <li><a href="#start">Getting Started</a></li>
36      <li><a href="#analyzer">Static Analyzer Overview</a>
37      <ul>
38        <li><a href="#interaction">Interaction with Checkers</a></li>
39        <li><a href="#values">Representing Values</a></li>
40      </ul></li>
41      <li><a href="#idea">Idea for a Checker</a></li>
42      <li><a href="#registration">Checker Registration</a></li>
43      <li><a href="#events_callbacks">Events, Callbacks, and Checker Class Structure</a></li>
44      <li><a href="#extendingstates">Custom Program States</a></li>
45      <li><a href="#bugs">Bug Reports</a></li>
46      <li><a href="#ast">AST Visitors</a></li>
47      <li><a href="#testing">Testing</a></li>
48      <li><a href="#commands">Useful Commands/Debugging Hints</a></li>
49      <li><a href="#additioninformation">Additional Sources of Information</a></li>
50      <li><a href="#links">Useful Links</a></li>
51    </ul>
52
53<h2 id=start>Getting Started</h2>
54  <ul>
55    <li>To check out the source code and build the project, follow steps 1-4 of
56    the <a href="http://clang.llvm.org/get_started.html">Clang Getting Started</a>
57  page.</li>
58
59    <li>The analyzer source code is located under the Clang source tree:
60    <br><tt>
61    $ <b>cd llvm/tools/clang</b>
62    </tt>
63    <br>See: <tt>include/clang/StaticAnalyzer</tt>, <tt>lib/StaticAnalyzer</tt>,
64     <tt>test/Analysis</tt>.</li>
65
    <li>The analyzer regression tests can be executed from Clang's build
    directory:
68    <br><tt>
69    $ <b>cd ../../../; cd build/tools/clang; TESTDIRS=Analysis make test</b>
70    </tt></li>
71
72    <li>Analyze a file with the specified checker:
73    <br><tt>
74    $ <b>clang -cc1 -analyze -analyzer-checker=core.DivideZero test.c</b>
75    </tt></li>
76
77    <li>List the available checkers:
78    <br><tt>
79    $ <b>clang -cc1 -analyzer-checker-help</b>
80    </tt></li>
81
82    <li>See the analyzer help for different output formats, fine tuning, and
83    debug options:
84    <br><tt>
85    $ <b>clang -cc1 -help | grep "analyzer"</b>
86    </tt></li>
87
88  </ul>
89
90<h2 id=analyzer>Static Analyzer Overview</h2>
  The analyzer core performs symbolic execution of the given program. All
  input values are represented with symbolic values; the engine then deduces
  the values of all the expressions in the program based on the input symbols
  and the path. The execution is path-sensitive and every possible path through
  the program is explored. The explored execution traces are represented by an
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedGraph.html">ExplodedGraph</a> object.
  Each node of the graph is an
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedNode.html">ExplodedNode</a>,
  which consists of a <tt>ProgramPoint</tt> and a <tt>ProgramState</tt>.
100  <p>
101  <a href="http://clang.llvm.org/doxygen/classclang_1_1ProgramPoint.html">ProgramPoint</a>
  represents the corresponding location in the program (or in the CFG).
  <tt>ProgramPoint</tt> is also used to record additional information on
  when and how the state was added. For example, the <tt>PostPurgeDeadSymbolsKind</tt>
  kind means that the state is the result of purging dead symbols, the
  analyzer's equivalent of garbage collection.
107  <p>
108  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html">ProgramState</a>
  represents the abstract state of the program. It consists of:
  <ul>
    <li><tt>Environment</tt> - a mapping from source code expressions to symbolic
    values
    <li><tt>Store</tt> - a mapping from memory locations to symbolic values
    <li><tt>GenericDataMap</tt> - constraints on symbolic values, plus any
    checker-defined state
  </ul>
116
117  <h3 id=interaction>Interaction with Checkers</h3>
  Checkers are not merely passive receivers of changes made by the analyzer core; they
  actively participate in the <tt>ProgramState</tt> construction through the
  <tt>GenericDataMap</tt>, which can be used to store the checker-defined part
  of the state. Each time the analyzer engine explores a new statement, it
  notifies each checker registered to listen for that statement, giving it an
  opportunity to either report a bug or modify the state. (As a rule of thumb,
  the checker itself should be stateless.) The checkers are called one after another
  in a predefined order; thus, calling all the checkers adds a chain to the
  <tt>ExplodedGraph</tt>.
127
128  <h3 id=values>Representing Values</h3>
129  During symbolic execution, <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html">SVal</a>
130  objects are used to represent the semantic evaluation of expressions.
131  They can represent things like concrete
132  integers, symbolic values, or memory locations (which are memory regions).
133  They are a discriminated union of "values", symbolic and otherwise.
134  If a value isn't symbolic, usually that means there is no symbolic
135  information to track. For example, if the value was an integer, such as
136  <tt>42</tt>, it would be a <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1nonloc_1_1ConcreteInt.html">ConcreteInt</a>,
137  and the checker doesn't usually need to track any state with the concrete
  number. In some cases, an <tt>SVal</tt> is not a symbol even though it really should be
  a symbolic value. This happens when the analyzer cannot reason about something
  (yet). Floating point numbers are one example. In such cases, the
141  <tt>SVal</tt> will evaluate to <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1UnknownVal.html">UnknownVal</a>.
142  This represents a case that is outside the realm of the analyzer's reasoning
143  capabilities. <tt>SVals</tt> are value objects and their values can be viewed
144  using the <tt>.dump()</tt> method. Often they wrap persistent objects such as
145  symbols or regions.
146  <p>
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymExpr.html">SymExpr</a> (symbol)
  is meant to represent an abstract, but named, symbolic value. A symbol represents
  an actual (immutable) value. We might not know what its specific value is, but
  we can associate constraints with that value as we analyze a path. For
  example, we might record that the value of a symbol is greater than
  <tt>0</tt>.
  <p>
156  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1MemRegion.html">MemRegion</a> is similar to a symbol.
157  It is used to provide a lexicon of how to describe abstract memory. Regions can
158  layer on top of other regions, providing a layered approach to representing memory.
159  For example, a struct object on the stack might be represented by a <tt>VarRegion</tt>,
160  but a <tt>FieldRegion</tt> which is a subregion of the <tt>VarRegion</tt> could
161  be used to represent the memory associated with a specific field of that object.
162  So how do we represent symbolic memory regions? That's what
163  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymbolicRegion.html">SymbolicRegion</a>
  is for. It is a <tt>MemRegion</tt> that has an associated symbol. Since the
  symbol is unique and has a unique name, that symbol names the region.
166
167  <P>
168  Let's see how the analyzer processes the expressions in the following example:
169  <p>
170  <pre class="code_example">
171  int foo(int x) {
172     int y = x * 2;
173     int z = x;
174     ...
175  }
176  </pre>
177  <p>
Let's look at how <tt>x*2</tt> gets evaluated. When <tt>x</tt> is evaluated,
we first construct an <tt>SVal</tt> that represents the lvalue of <tt>x</tt>; in
this case, it is an <tt>SVal</tt> that references the <tt>MemRegion</tt> for <tt>x</tt>.
181Afterwards, when we do the lvalue-to-rvalue conversion, we get a new <tt>SVal</tt>,
182which references the value <b>currently bound</b> to <tt>x</tt>. That value is
183symbolic; it's whatever <tt>x</tt> was bound to at the start of the function.
184Let's call that symbol <tt>$0</tt>. Similarly, we evaluate the expression for <tt>2</tt>,
185and get an <tt>SVal</tt> that references the concrete number <tt>2</tt>. When
186we evaluate <tt>x*2</tt>, we take the two <tt>SVals</tt> of the subexpressions,
187and create a new <tt>SVal</tt> that represents their multiplication (which in
188this case is a new symbolic expression, which we might call <tt>$1</tt>). When we
189evaluate the assignment to <tt>y</tt>, we again compute its lvalue (a <tt>MemRegion</tt>),
190and then bind the <tt>SVal</tt> for the RHS (which references the symbolic value <tt>$1</tt>)
191to the <tt>MemRegion</tt> in the symbolic store.
192<br>
The second line is similar. When we evaluate <tt>x</tt> again, we do the same
dance, and create an <tt>SVal</tt> that references the symbol <tt>$0</tt>. Note that two <tt>SVals</tt>
might reference the same underlying value.
196
197<p>
To summarize, MemRegions are unique names for blocks of memory. Symbols are
unique names for abstract symbolic values. Some MemRegions represent abstract
symbolic chunks of memory, and thus are also based on symbols. SVals are just
references to values, and can reference either MemRegions, Symbols, or concrete
values (e.g., the number 1).
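<p>
The relationships summarized above can be mimicked with a small, self-contained
C++17 toy model. This is only a sketch under stated assumptions:
<tt>ToySVal</tt>, <tt>ToyStore</tt>, and every other name below are hypothetical
and are not the analyzer's classes. A mutable map stands in for the symbolic
store, and any product that is not a compile-time constant simply receives a
fresh symbol.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

// Toy stand-ins for the analyzer's value classes (hypothetical names).
struct Unknown {};                     // outside the model's reasoning
struct ConcreteInt { int Value; };     // a known integer such as 2
struct Symbol { int Id; };             // an opaque named value such as $0
struct Region { std::string Name; };   // a named block of memory

// A discriminated union of "values", symbolic and otherwise.
using ToySVal = std::variant<Unknown, ConcreteInt, Symbol, Region>;

struct ToyStore {
  std::unordered_map<std::string, ToySVal> Bindings; // region name -> value
  int NextSymbol = 0;

  Symbol freshSymbol() { return Symbol{NextSymbol++}; }

  // Lvalue-to-rvalue conversion: fetch what is currently bound to a region.
  ToySVal load(const Region &R) const {
    auto It = Bindings.find(R.Name);
    return It == Bindings.end() ? ToySVal{Unknown{}} : It->second;
  }

  void bind(const Region &R, ToySVal V) { Bindings[R.Name] = std::move(V); }

  ToySVal multiply(const ToySVal &L, const ToySVal &R) {
    const auto *A = std::get_if<ConcreteInt>(&L);
    const auto *B = std::get_if<ConcreteInt>(&R);
    if (A && B)
      return ConcreteInt{A->Value * B->Value}; // nothing symbolic to track
    if (std::holds_alternative<Unknown>(L) ||
        std::holds_alternative<Unknown>(R))
      return Unknown{};
    return freshSymbol(); // a new symbolic expression for the product
  }
};

// Mirrors: int foo(int x) { int y = x * 2; int z = x; ... }
ToyStore analyzeFoo() {
  ToyStore S;
  Region X{"x"}, Y{"y"}, Z{"z"};
  S.bind(X, S.freshSymbol());                  // x starts out bound to $0
  assert(std::holds_alternative<Symbol>(S.load(X)));
  S.bind(Y, S.multiply(S.load(X), ConcreteInt{2})); // y is bound to $1
  S.bind(Z, S.load(X));                        // z references $0 again
  return S;
}
```

<p>
In this model, <tt>analyzeFoo()</tt> leaves <tt>x</tt> and <tt>z</tt> bound to
the same symbol, while <tt>y</tt> is bound to a different one, which mirrors the
walkthrough above.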
203
204  <!--
205  TODO: Add a picture.
206  <br>
207  Symbols<br>
208  FunctionalObjects are used throughout.
209  -->
210
211<h2 id=idea>Idea for a Checker</h2>
212  Here are several questions which you should consider when evaluating your
213  checker idea:
214  <ul>
215    <li>Can the check be effectively implemented without path-sensitive
216    analysis? See <a href="#ast">AST Visitors</a>.</li>
217
    <li>How high is the false positive rate going to be? Looking at the occurrences
    of the issue you want to write a checker for in the existing code bases might
    give you some ideas. </li>

    <li>How will the current limitations of the analysis affect the false alarm
    rate? Currently, the analyzer only reasons about one procedure at a time (no
    inter-procedural analysis). Also, it uses a simple range-tracking-based
    solver to model symbolic execution.</li>
226
227    <li>Consult the <a
228    href="http://llvm.org/bugs/buglist.cgi?query_format=advanced&amp;bug_status=NEW&amp;bug_status=REOPENED&amp;version=trunk&amp;component=Static%20Analyzer&amp;product=clang">Bugzilla database</a>
229    to get some ideas for new checkers and consider starting with improving/fixing
230    bugs in the existing checkers.</li>
231  </ul>
232
233<p>Once an idea for a checker has been chosen, there are two key decisions that
234need to be made:
235  <ul>
236    <li> Which events the checker should be tracking. This is discussed in more
237    detail in the section <a href="#events_callbacks">Events, Callbacks, and
238    Checker Class Structure</a>.
239    <li> What checker-specific data needs to be stored as part of the program
240    state (if any). This should be minimized as much as possible. More detail about
241    implementing custom program state is given in section <a
242    href="#extendingstates">Custom Program States</a>.
243  </ul>
244
245
246<h2 id=registration>Checker Registration</h2>
  All checker implementation files are located in the
  <tt>clang/lib/StaticAnalyzer/Checkers</tt> folder. The steps below describe
249  how the checker <tt>SimpleStreamChecker</tt>, which checks for misuses of
250  stream APIs, was registered with the analyzer.
251  Similar steps should be followed for a new checker.
252<ol>
253  <li>A new checker implementation file, <tt>SimpleStreamChecker.cpp</tt>, was
254  created in the directory <tt>lib/StaticAnalyzer/Checkers</tt>.
255  <li>The following registration code was added to the implementation file:
256<pre class="code_example">
257void ento::registerSimpleStreamChecker(CheckerManager &amp;mgr) {
  mgr.registerChecker&lt;SimpleStreamChecker&gt;();
259}
260</pre>
261<li>A package was selected for the checker and the checker was defined in the
262table of checkers at <tt>lib/StaticAnalyzer/Checkers/Checkers.td</tt>. Since all
263checkers should first be developed as "alpha", and the SimpleStreamChecker
264performs UNIX API checks, the correct package is "alpha.unix", and the following
265was added to the corresponding <tt>UnixAlpha</tt> section of <tt>Checkers.td</tt>:
266<pre class="code_example">
267let ParentPackage = UnixAlpha in {
268...
def SimpleStreamChecker : Checker&lt;"SimpleStream"&gt;,
  HelpText&lt;"Check for misuses of stream APIs"&gt;,
  DescFile&lt;"SimpleStreamChecker.cpp"&gt;;
272...
273} // end "alpha.unix"
274</pre>
275
276<li>The source code file was made visible to CMake by adding it to
277<tt>lib/StaticAnalyzer/Checkers/CMakeLists.txt</tt>.
278
279</ol>
280
281After adding a new checker to the analyzer, one can verify that the new checker
282was successfully added by seeing if it appears in the list of available checkers:
<br> <tt>$ <b>clang -cc1 -analyzer-checker-help</b></tt>
284
285<h2 id=events_callbacks>Events, Callbacks, and Checker Class Structure</h2>
286
287<p> All checkers inherit from the <tt><a
288href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1Checker.html">
289Checker</a></tt> template class; the template parameter(s) describe the type of
290events that the checker is interested in processing. The various types of events
that are available are described in the file <a
href="http://clang.llvm.org/doxygen/CheckerDocumentation_8cpp_source.html">
CheckerDocumentation.cpp</a>.
294
295<p> For each event type requested, a corresponding callback function must be
296defined in the checker class (<a
297href="http://clang.llvm.org/doxygen/CheckerDocumentation_8cpp_source.html">
298CheckerDocumentation.cpp</a> shows the
299correct function name and signature for each event type).
300
301<p> As an example, consider <tt>SimpleStreamChecker</tt>. This checker needs to
302take action at the following times:
303
304<ul>
305<li>Before making a call to a function, check if the function is <tt>fclose</tt>.
306If so, check the parameter being passed.
307<li>After making a function call, check if the function is <tt>fopen</tt>. If
308so, process the return value.
309<li>When values go out of scope, check whether they are still-open file
310descriptors, and report a bug if so. In addition, remove any information about
311them from the program state in order to keep the state as small as possible.
<li>When file pointers "escape" (are used in a way that prevents the analyzer
from tracking them), mark them as such. This prevents false positives in the cases where
the analyzer cannot be sure whether the file was closed or not.
315</ul>
316
<p>The events that will be used for each of these actions are, respectively, <a
318href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PreCall.html">PreCall</a>,
319<a
320href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PostCall.html">PostCall</a>,
321<a
322href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1DeadSymbols.html">DeadSymbols</a>,
323and <a
324href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PointerEscape.html">PointerEscape</a>.
325The high-level structure of the checker's class is thus:
326
327<pre class="code_example">
328class SimpleStreamChecker : public Checker&lt;check::PreCall,
329                                           check::PostCall,
330                                           check::DeadSymbols,
331                                           check::PointerEscape&gt; {
332public:
333
334  void checkPreCall(const CallEvent &amp;Call, CheckerContext &amp;C) const;
335
336  void checkPostCall(const CallEvent &amp;Call, CheckerContext &amp;C) const;
337
338  void checkDeadSymbols(SymbolReaper &amp;SR, CheckerContext &amp;C) const;
339
340  ProgramStateRef checkPointerEscape(ProgramStateRef State,
341                                     const InvalidatedSymbols &amp;Escaped,
342                                     const CallEvent *Call,
343                                     PointerEscapeKind Kind) const;
344};
345</pre>
346
347<h2 id=extendingstates>Custom Program States</h2>
348
349<p> Checkers often need to keep track of information specific to the checks they
350perform. However, since checkers have no guarantee about the order in which the
351program will be explored, or even that all possible paths will be explored, this
352state information cannot be kept within individual checkers. Therefore, if
353checkers need to store custom information, they need to add new categories of
354data to the <tt>ProgramState</tt>. The preferred way to do so is to use one of
355several macros designed for this purpose. They are:
356
357<ul>
358<li><a
359href="http://clang.llvm.org/doxygen/ProgramStateTrait_8h.html#ae4cddb54383cd702a045d7c61b009147">REGISTER_TRAIT_WITH_PROGRAMSTATE</a>:
360Used when the state information is a single value. The methods available for
361state types declared with this macro are <tt>get</tt>, <tt>set</tt>, and
362<tt>remove</tt>.
363<li><a
364href="http://clang.llvm.org/doxygen/CheckerContext_8h.html#aa27656fa0ce65b0d9ba12eb3c02e8be9">REGISTER_LIST_WITH_PROGRAMSTATE</a>:
365Used when the state information is a list of values. The methods available for
366state types declared with this macro are <tt>add</tt>, <tt>get</tt>,
367<tt>remove</tt>, and <tt>contains</tt>.
368<li><a
369href="http://clang.llvm.org/doxygen/CheckerContext_8h.html#ad90f9387b94b344eaaf499afec05f4d1">REGISTER_SET_WITH_PROGRAMSTATE</a>:
370Used when the state information is a set of values. The methods available for
371state types declared with this macro are <tt>add</tt>, <tt>get</tt>,
372<tt>remove</tt>, and <tt>contains</tt>.
373<li><a
374href="http://clang.llvm.org/doxygen/CheckerContext_8h.html#a6d1893bb8c18543337b6c363c1319fcf">REGISTER_MAP_WITH_PROGRAMSTATE</a>:
375Used when the state information is a map from a key to a value. The methods
376available for state types declared with this macro are <tt>add</tt>,
377<tt>set</tt>, <tt>get</tt>, <tt>remove</tt>, and <tt>contains</tt>.
378</ul>
379
380<p>All of these macros take as parameters the name to be used for the custom
381category of state information and the data type(s) to be used for storage. The
382data type(s) specified will become the parameter type and/or return type of the
methods that manipulate the new category of state information. Each of these
methods is templated with the name of the custom data type.
385
386<p>For example, a common case is the need to track data associated with a
387symbolic expression; a map type is the most logical way to implement this. The
388key for this map will be a pointer to a symbolic expression
389(<tt>SymbolRef</tt>). If the data type to be associated with the symbolic
390expression is an integer, then the custom category of state information would be
391declared as
392
393<pre class="code_example">
394REGISTER_MAP_WITH_PROGRAMSTATE(ExampleDataType, SymbolRef, int)
395</pre>
396
397The data would be accessed with the function
398
399<pre class="code_example">
ProgramStateRef state;
SymbolRef Sym;
...
// For map traits, get() returns a pointer that is null when Sym has no entry.
const int *currentValue = state-&gt;get&lt;ExampleDataType&gt;(Sym);
404</pre>
405
406and set with the function
407
408<pre class="code_example">
409ProgramStateRef state;
410SymbolRef Sym;
411int newValue;
412...
413ProgramStateRef newState = state-&gt;set&lt;ExampleDataType&gt;(Sym, newValue);
414</pre>
415
416<p>In addition, the macros define a data type used for storing the data of the
new data category; the name of this type is the name of the data category with
"Ty" appended. For <tt>REGISTER_TRAIT_WITH_PROGRAMSTATE</tt>, this will simply
be the data type that was passed; for the other three macros, this will be a specialized
420version of the <a
421href="http://llvm.org/doxygen/classllvm_1_1ImmutableList.html">llvm::ImmutableList</a>,
422<a
423href="http://llvm.org/doxygen/classllvm_1_1ImmutableSet.html">llvm::ImmutableSet</a>,
424or <a
425href="http://llvm.org/doxygen/classllvm_1_1ImmutableMap.html">llvm::ImmutableMap</a>
426templated class. For the <tt>ExampleDataType</tt> example above, the type
427created would be equivalent to writing the declaration:
428
429<pre class="code_example">
430typedef llvm::ImmutableMap&lt;SymbolRef, int&gt; ExampleDataTypeTy;
431</pre>
432
433<p>These macros will cover a majority of use cases; however, they still have a
434few limitations. They cannot be used inside namespaces (since they expand to
435contain top-level namespace references), and the data types that they define
436cannot be referenced from more than one file.
437
<p>Note that <tt>ProgramStates</tt> are immutable; instead of modifying an existing
one, functions that modify the state will return a copy of the previous state
with the change applied. This updated state must then be provided to the
analyzer core by calling the <tt>CheckerContext::addTransition</tt> function.
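<p>
The copy-on-update pattern described above can be sketched with a standalone toy
model. The classes below (<tt>ToyState</tt>, <tt>ToyContext</tt>) are
hypothetical stand-ins written only for illustration; they are not the analyzer's
<tt>ProgramState</tt> or <tt>CheckerContext</tt>, and symbols are reduced to
plain integers.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

class ToyState;
using ToyStateRef = std::shared_ptr<const ToyState>;

// Toy immutable state: every update produces a new object.
class ToyState {
  std::map<int, int> Data; // symbol id -> checker-specific value
public:
  // Returns a pointer to the stored value, or nullptr if Sym has no entry.
  const int *get(int Sym) const {
    auto It = Data.find(Sym);
    return It == Data.end() ? nullptr : &It->second;
  }

  // Never modifies *this; returns a copy with the binding applied.
  ToyStateRef set(int Sym, int Value) const {
    auto Copy = std::make_shared<ToyState>(*this);
    Copy->Data[Sym] = Value;
    return Copy;
  }
};

// Toy context: collects the states a callback hands back to the "core".
struct ToyContext {
  std::vector<ToyStateRef> Transitions;
  void addTransition(ToyStateRef S) {
    assert(S && "state must not be null");
    Transitions.push_back(std::move(S));
  }
};

// A callback in the style of a checker: derive a new state, then publish it.
void onEvent(const ToyStateRef &State, ToyContext &C) {
  ToyStateRef NewState = State->set(/*Sym=*/7, /*Value=*/1);
  C.addTransition(NewState); // without this call the update would be lost
}
```

<p>
After <tt>onEvent</tt> runs, the state that was passed in still has no entry
for the symbol; only the state recorded through <tt>addTransition</tt> carries
the new value.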
442<h2 id=bugs>Bug Reports</h2>
443
444
445<p> When a checker detects a mistake in the analyzed code, it needs a way to
446report it to the analyzer core so that it can be displayed. The two classes used
447to construct this report are <tt><a
448href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1BugType.html">BugType</a></tt>
449and <tt><a
450href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1BugReport.html">
451BugReport</a></tt>.
452
453<p>
<tt>BugType</tt>, as the name suggests, represents a type of bug. The
constructor for <tt>BugType</tt> takes two parameters: the name of the bug
type, and the name of the category of the bug. These are used, for example, in the
summary page generated by the scan-build tool.
458
459<P>
460  The <tt>BugReport</tt> class represents a specific occurrence of a bug. In
461  the most common case, three parameters are used to form a <tt>BugReport</tt>:
462<ol>
463<li>The type of bug, specified as an instance of the <tt>BugType</tt> class.
464<li>A short descriptive string. This is placed at the location of the bug in
465the detailed line-by-line output generated by scan-build.
466<li>The context in which the bug occurred. This includes both the location of
467the bug in the program and the program's state when the location is reached. These are
468both encapsulated in an <tt>ExplodedNode</tt>.
469</ol>
470
471<p>In order to obtain the correct <tt>ExplodedNode</tt>, a decision must be made
472as to whether or not analysis can continue along the current path. This decision
473is based on whether the detected bug is one that would prevent the program under
474analysis from continuing. For example, leaking of a resource should not stop
475analysis, as the program can continue to run after the leak. Dereferencing a
476null pointer, on the other hand, should stop analysis, as there is no way for
477the program to meaningfully continue after such an error.
478
479<p>If analysis can continue, then the most recent <tt>ExplodedNode</tt>
480generated by the checker can be passed to the <tt>BugReport</tt> constructor
481without additional modification. This <tt>ExplodedNode</tt> will be the one
482returned by the most recent call to <a
483href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#a264f48d97809707049689c37aa35af78">CheckerContext::addTransition</a>.
484If no transition has been performed during the current callback, the checker should call <a
485href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#a264f48d97809707049689c37aa35af78">CheckerContext::addTransition()</a>
486and use the returned node for bug reporting.
487
<p>If analysis cannot continue, then the current state should be transitioned
489into a so-called <i>sink node</i>, a node from which no further analysis will be
490performed. This is done by calling the <a
491href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#adeea33a5a2bed190210c4a2bb807a6f0">
492CheckerContext::generateSink</a> function; this function is the same as the
493<tt>addTransition</tt> function, but marks the state as a sink node. Like
494<tt>addTransition</tt>, this returns an <tt>ExplodedNode</tt> with the updated
495state, which can then be passed to the <tt>BugReport</tt> constructor.
496
497<p>
498After a <tt>BugReport</tt> is created, it should be passed to the analyzer core
499by calling <a href = "http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#ae7738af2cbfd1d713edec33d3203dff5">CheckerContext::emitReport</a>.
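<p>
The choice between an ordinary transition and a sink node can be illustrated
with a standalone toy model. Everything below is a hypothetical sketch:
<tt>ToyNode</tt>, <tt>ToyReport</tt>, and <tt>ToyContext</tt> only imitate the
roles of <tt>ExplodedNode</tt>, <tt>BugReport</tt>, and
<tt>CheckerContext</tt>, and the bug type is reduced to a string.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToyNode {
  bool IsSink = false; // a sink node has no successors
};

struct ToyReport {
  std::string BugType;      // stands in for a BugType instance
  std::string Description;  // short message shown at the bug location
  const ToyNode *ErrorNode; // where, and in which state, the bug occurred
};

class ToyContext {
  std::vector<std::unique_ptr<ToyNode>> Nodes;

  const ToyNode *makeNode(bool Sink) {
    Nodes.push_back(std::make_unique<ToyNode>());
    Nodes.back()->IsSink = Sink;
    return Nodes.back().get();
  }

public:
  std::vector<ToyReport> Reports;

  const ToyNode *addTransition() { return makeNode(false); }
  const ToyNode *generateSink() { return makeNode(true); }

  void emitReport(ToyReport R) {
    assert(R.ErrorNode && "a report needs an error node");
    Reports.push_back(std::move(R));
  }

  // In this toy, the path stays alive until a sink node has been generated.
  bool pathIsAlive() const {
    for (const auto &N : Nodes)
      if (N->IsSink)
        return false;
    return true;
  }
};

// A leak does not stop the program, so analysis continues after the report.
void reportLeak(ToyContext &C) {
  C.emitReport({"Resource leak", "Opened file is never closed",
                C.addTransition()});
}

// A null dereference is fatal, so the path ends in a sink node.
void reportNullDeref(ToyContext &C) {
  C.emitReport({"Null dereference", "Dereference of null pointer",
                C.generateSink()});
}
```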
500
501<h2 id=ast>AST Visitors</h2>
  Some checks might not require path-sensitivity to be effective; a simple AST walk
  might be sufficient. If that is the case, consider implementing a Clang
  compiler warning. On the other hand, a check might not be acceptable as a compiler
  warning, for example because of a relatively high false positive rate. In this
  situation, the AST callbacks <tt><b>checkASTDecl</b></tt> and
  <tt><b>checkASTCodeBody</b></tt> are your best friends.
508
509<h2 id=testing>Testing</h2>
  Every patch should be well tested with Clang regression tests. The checker tests
  live in the <tt>clang/test/Analysis</tt> folder. To run all of the analyzer tests,
  execute the following from the <tt>clang</tt> build directory:
513    <pre class="code">
514    $ <b>TESTDIRS=Analysis make test</b>
515    </pre>
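<p>
A regression test is usually a small source file that carries its own
<tt>RUN</tt> line and the diagnostics it expects. The file below is a minimal
sketch, assuming the <tt>%clang_cc1</tt> substitution and the <tt>-verify</tt>
mode used by Clang's test suite; the function name is hypothetical, and the
exact <tt>RUN</tt> line used by neighboring tests in <tt>test/Analysis</tt> may
differ, so it is safest to copy theirs.

```cpp
// RUN: %clang_cc1 -analyze -analyzer-checker=core.DivideZero -verify %s

// On the path where Denominator is known to be zero, the checker is expected
// to report the division. The other path should stay silent.
int divide(int Numerator, int Denominator) {
  if (Denominator == 0)
    return Numerator / Denominator; // expected-warning{{Division by zero}}
  return Numerator / Denominator;
}
```

<p>
With <tt>-verify</tt>, the test fails both when an expected warning is missing
and when an unexpected diagnostic appears, so every reported line needs a
matching <tt>expected-warning</tt> comment.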
516
517<h2 id=commands>Useful Commands/Debugging Hints</h2>
518<ul>
519<li>
520While investigating a checker-related issue, instruct the analyzer to only
521execute a single checker:
522<br><tt>
523$ <b>clang -cc1 -analyze -analyzer-checker=osx.KeychainAPI test.c</b>
524</tt>
525</li>
526<li>
To dump the AST:
528<br><tt>
529$ <b>clang -cc1 -ast-dump test.c</b>
530</tt>
531</li>
532<li>
To view or dump the CFG, use the <tt>debug.ViewCFG</tt> or <tt>debug.DumpCFG</tt> checkers:
534<br><tt>
535$ <b>clang -cc1 -analyze -analyzer-checker=debug.ViewCFG test.c</b>
536</tt>
537</li>
538<li>
539To see all available debug checkers:
540<br><tt>
541$ <b>clang -cc1 -analyzer-checker-help | grep "debug"</b>
542</tt>
543</li>
544<li>
To see which function is failing while processing a large file, use the
<tt>-analyzer-display-progress</tt> option.
547</li>
548<li>
While debugging, execute <tt>clang -cc1 -analyze -analyzer-checker=core</tt>
instead of <tt>clang --analyze</tt>, as the latter would call the compiler
in a separate process.
552</li>
553<li>
To view the <tt>ExplodedGraph</tt> (the state graph explored by the analyzer) while
debugging, go to a frame that has a <tt>clang::ento::ExprEngine</tt> object and
execute:
557<br><tt>
558(gdb) <b>p ViewGraph(0)</b>
559</tt>
560</li>
561<li>
To see the <tt>ProgramState</tt> while debugging, use the following command:
<br><tt>
(gdb) <b>p State-&gt;dump()</b>
565</tt>
566</li>
567<li>
To see a <tt>clang::Expr</tt> while debugging, use the following command. If you
pass in a SourceManager object, it will also dump the corresponding line in the
source code.
<br><tt>
(gdb) <b>p E-&gt;dump()</b>
573</tt>
574</li>
575<li>
To dump the AST of the method that the current <tt>ExplodedNode</tt> belongs to:
<br><tt>
(gdb) <b>p C.getPredecessor()-&gt;getCodeDecl().getBody()-&gt;dump()</b>
<br>
(gdb) <b>p C.getPredecessor()-&gt;getCodeDecl().getBody()-&gt;dump(getContext().getSourceManager())</b>
580</tt>
581</li>
582</ul>
583
584<h2 id=additioninformation>Additional Sources of Information</h2>
585
586Here are some additional resources that are useful when working on the Clang
587Static Analyzer:
588
589<ul>
590<li> <a href="http://clang.llvm.org/doxygen">Clang doxygen</a>. Contains
591up-to-date documentation about the APIs available in Clang. Relevant entries
592have been linked throughout this page. Also of use is the
593<a href="http://llvm.org/doxygen">LLVM doxygen</a>, when dealing with classes
594from LLVM.
595<li> The <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">
596cfe-dev mailing list</a>. This is the primary mailing list used for
597discussion of Clang development (including static code analysis). The
598<a href="http://lists.cs.uiuc.edu/pipermail/cfe-dev">archive</a> also contains
599a lot of information.
<li> The "Building a Checker in 24 Hours" presentation given at the <a
601href="http://llvm.org/devmtg/2012-11">November 2012 LLVM Developer's
602meeting</a>. Describes the construction of SimpleStreamChecker. <a
603href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">Slides</a>
604and <a
605href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>
606are available.
607</ul>
608
609<h2 id=links>Useful Links</h2>
610<ul>
611<li>The list of <a href="implicit_checks.html">Implicit Checkers</a></li>
612</ul>
613
614</div>
615</div>
616</body>
617</html>
618