=============================
Introduction to the Clang AST
=============================

This document gives a gentle introduction to the mysteries of the Clang
AST. It is targeted at developers who either want to contribute to
Clang, or use tools that work based on Clang's AST, like the AST
matchers.

.. raw:: html

  <center><iframe width="560" height="315" src="http://www.youtube.com/embed/VqCkCDFLSsc?vq=hd720" frameborder="0" allowfullscreen></iframe></center>

`Slides <http://llvm.org/devmtg/2013-04/klimek-slides.pdf>`_

Introduction
============

Clang's AST is different from ASTs produced by some other compilers in
that it closely resembles both the written C++ code and the C++
standard. For example, parenthesis expressions and compile-time
constants are available in an unreduced form in the AST. This makes
Clang's AST a good fit for refactoring tools.

Documentation for all Clang AST nodes is available via the generated
`Doxygen <http://clang.llvm.org/doxygen>`_. The online Doxygen
documentation is also indexed by search engines, so a search for
"clang" plus the AST node's class name usually turns up the Doxygen
page of the class you're looking for (for example, search for:
clang ParenExpr).

Examining the AST
=================

A good way to familiarize yourself with the Clang AST is to actually look
at it on some simple example code. Clang has built-in AST-dump modes,
which can be enabled with the flags ``-ast-dump`` and ``-ast-dump-xml``. Note
that ``-ast-dump-xml`` currently only works with debug builds of clang.

Let's look at a simple example AST:

::

    $ cat test.cc
    int f(int x) {
      int result = (x / 42);
      return result;
    }

    # Clang by default is a frontend for many tools; -cc1 tells it to directly
    # use the C++ compiler mode. -undef leaves out some internal declarations.
    $ clang -cc1 -undef -ast-dump-xml test.cc
    ... cutting out internal declarations of clang ...
    <TranslationUnit ptr="0x4871160">
     <Function ptr="0x48a5800" name="f" prototype="true">
      <FunctionProtoType ptr="0x4871de0" canonical="0x4871de0">
       <BuiltinType ptr="0x4871250" canonical="0x4871250"/>
       <parameters>
        <BuiltinType ptr="0x4871250" canonical="0x4871250"/>
       </parameters>
      </FunctionProtoType>
      <ParmVar ptr="0x4871d80" name="x" initstyle="c">
       <BuiltinType ptr="0x4871250" canonical="0x4871250"/>
      </ParmVar>
      <Stmt>
    (CompoundStmt 0x48a5a38 <t2.cc:1:14, line:4:1>
      (DeclStmt 0x48a59c0 <line:2:3, col:24>
        0x48a58c0 "int result =
          (ParenExpr 0x48a59a0 <col:16, col:23> 'int'
            (BinaryOperator 0x48a5978 <col:17, col:21> 'int' '/'
              (ImplicitCastExpr 0x48a5960 <col:17> 'int' <LValueToRValue>
                (DeclRefExpr 0x48a5918 <col:17> 'int' lvalue ParmVar 0x4871d80 'x' 'int'))
              (IntegerLiteral 0x48a5940 <col:21> 'int' 42)))")
      (ReturnStmt 0x48a5a18 <line:3:3, col:10>
        (ImplicitCastExpr 0x48a5a00 <col:10> 'int' <LValueToRValue>
          (DeclRefExpr 0x48a59d8 <col:10> 'int' lvalue Var 0x48a58c0 'result' 'int'))))

      </Stmt>
     </Function>
    </TranslationUnit>

In general, ``-ast-dump-xml`` dumps declarations in an XML-style format and
statements in an S-expression-style format. The top-level declaration in
a translation unit is always the `translation unit
declaration <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_.
In this example, our first user-written declaration is the `function
declaration <http://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html>`_
of "``f``". The body of "``f``" is a `compound
statement <http://clang.llvm.org/doxygen/classclang_1_1CompoundStmt.html>`_,
whose child nodes are a `declaration
statement <http://clang.llvm.org/doxygen/classclang_1_1DeclStmt.html>`_
that declares our result variable, and the `return
statement <http://clang.llvm.org/doxygen/classclang_1_1ReturnStmt.html>`_.

AST Context
===========

All information about the AST for a translation unit is bundled up in
the class
`ASTContext <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html>`_.
It allows traversal of the whole translation unit starting from
`getTranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#abd909fb01ef10cfd0244832a67b1dd64>`_,
and gives access to Clang's `table of
identifiers <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#a4f95adb9958e22fbe55212ae6482feb4>`_
for the parsed translation unit.

AST Nodes
=========

Clang's AST nodes are modeled on a class hierarchy that does not have a
common ancestor. Instead, there are multiple larger hierarchies for
basic node types like
`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ and
`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_. Many
important AST nodes derive from
`Type <http://clang.llvm.org/doxygen/classclang_1_1Type.html>`_,
`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_,
`DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_
or `Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_, with
some classes deriving from both Decl and DeclContext.

There is also a multitude of nodes in the AST that are not part of a
larger hierarchy, and are only reachable from specific other nodes, like
`CXXBaseSpecifier <http://clang.llvm.org/doxygen/classclang_1_1CXXBaseSpecifier.html>`_.

Thus, to traverse the full AST, one starts from the
`TranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_
and then recursively traverses everything that can be reached from that
node; how to reach a node's children has to be encoded for each specific
node type. This algorithm is encoded in the
`RecursiveASTVisitor <http://clang.llvm.org/doxygen/classclang_1_1RecursiveASTVisitor.html>`_.
See the `RecursiveASTVisitor
tutorial <http://clang.llvm.org/docs/RAVFrontendAction.html>`_.

The two most basic nodes in the Clang AST are statements
(`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_) and
declarations
(`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_). Note
that expressions
(`Expr <http://clang.llvm.org/doxygen/classclang_1_1Expr.html>`_) are
also statements in Clang's AST.

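To make the traversal idea more concrete, here is a minimal sketch in plain
C++. It is a toy model only: everything in the hypothetical namespace ``toy``
is invented for illustration and merely borrows Clang's class names; it is not
Clang's API and does not use ``RecursiveASTVisitor``. It assumes a simplified
tree for the function ``f`` from the example above (implicit casts are left
out) and shows why child access has to be written per node type when ``Decl``
and ``Stmt`` share no common ancestor.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these types mimic the shape of Clang's hierarchies.
// They are hypothetical and are NOT Clang's real classes.
namespace toy {

// Declaration hierarchy.
struct Decl {
  explicit Decl(std::string k) : kind(std::move(k)) {}
  virtual ~Decl() = default;
  std::string kind;
};

// Statement hierarchy. It shares no base class with Decl.
struct Stmt {
  explicit Stmt(std::string k) : kind(std::move(k)) {}
  virtual ~Stmt() = default;
  std::string kind;
  std::vector<std::unique_ptr<Stmt>> children;
};

// Expressions are statements as well.
struct Expr : Stmt {
  using Stmt::Stmt;
};

// A declaration statement refers to the declaration it introduces.
struct DeclStmt : Stmt {
  DeclStmt() : Stmt("DeclStmt") {}
  std::unique_ptr<Decl> decl;
};

struct VarDecl : Decl {
  VarDecl() : Decl("VarDecl") {}
  std::unique_ptr<Expr> init;  // initializer expression, may be null
};

struct FunctionDecl : Decl {
  FunctionDecl() : Decl("FunctionDecl") {}
  std::unique_ptr<Stmt> body;
};

struct TranslationUnitDecl : Decl {
  TranslationUnitDecl() : Decl("TranslationUnitDecl") {}
  std::vector<std::unique_ptr<Decl>> decls;
};

void traverse(const Decl& d, int depth, std::vector<std::string>& out);

// Visits a statement, then everything reachable from it.
void traverse(const Stmt& s, int depth, std::vector<std::string>& out) {
  assert(depth >= 0);
  out.push_back(std::string(2 * depth, ' ') + s.kind);
  if (const auto* ds = dynamic_cast<const DeclStmt*>(&s)) {
    if (ds->decl) traverse(*ds->decl, depth + 1, out);
  }
  for (const auto& child : s.children) traverse(*child, depth + 1, out);
}

// Visits a declaration. How to reach the children differs per node type.
void traverse(const Decl& d, int depth, std::vector<std::string>& out) {
  assert(depth >= 0);
  out.push_back(std::string(2 * depth, ' ') + d.kind);
  if (const auto* tu = dynamic_cast<const TranslationUnitDecl*>(&d)) {
    for (const auto& child : tu->decls) traverse(*child, depth + 1, out);
  } else if (const auto* fn = dynamic_cast<const FunctionDecl*>(&d)) {
    if (fn->body) traverse(*fn->body, depth + 1, out);
  } else if (const auto* var = dynamic_cast<const VarDecl*>(&d)) {
    if (var->init) traverse(*var->init, depth + 1, out);
  }
}

// Builds a simplified tree for: int f(int x) { int result = (x / 42); return result; }
std::unique_ptr<TranslationUnitDecl> buildExample() {
  auto quotient = std::make_unique<Expr>("BinaryOperator");
  quotient->children.push_back(std::make_unique<Expr>("DeclRefExpr"));
  quotient->children.push_back(std::make_unique<Expr>("IntegerLiteral"));
  auto paren = std::make_unique<Expr>("ParenExpr");  // parentheses stay a node
  paren->children.push_back(std::move(quotient));

  auto var = std::make_unique<VarDecl>();
  var->init = std::move(paren);
  auto declStmt = std::make_unique<DeclStmt>();
  declStmt->decl = std::move(var);

  auto ret = std::make_unique<Stmt>("ReturnStmt");
  ret->children.push_back(std::make_unique<Expr>("DeclRefExpr"));

  auto body = std::make_unique<Stmt>("CompoundStmt");
  body->children.push_back(std::move(declStmt));
  body->children.push_back(std::move(ret));

  auto fn = std::make_unique<FunctionDecl>();
  fn->body = std::move(body);
  auto tu = std::make_unique<TranslationUnitDecl>();
  tu->decls.push_back(std::move(fn));
  return tu;
}

}  // namespace toy
```

Calling ``toy::traverse(*toy::buildExample(), 0, out)`` with an empty
``std::vector<std::string> out`` fills it with one indented line per node,
starting at ``TranslationUnitDecl``. The walk crosses between the two
hierarchies twice (``FunctionDecl`` to its body, ``DeclStmt`` to its
``VarDecl``), which is the kind of per-node knowledge the real
``RecursiveASTVisitor`` encodes for Clang's actual node types.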