<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Pretokenized Headers (PTH)</title>
  <link type="text/css" rel="stylesheet" href="../menu.css">
  <link type="text/css" rel="stylesheet" href="../content.css">
  <style type="text/css">
    td {
      vertical-align: top;
    }
  </style>
</head>
<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Pretokenized Headers (PTH)</h1>

<p>This document first describes the low-level
interface for using PTH and then briefly elaborates on its design and
implementation. If you are interested in the end-user view, please see the
<a href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>


<h2>Using Pretokenized Headers with <tt>clang</tt> (Low-level Interface)</h2>

<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports three command-line
options for generating and using PTH files.</p>

<p>To generate PTH files using <tt>clang -cc1</tt>, use the option
<b><tt>-emit-pth</tt></b>:</p>

<pre> $ clang -cc1 test.h -emit-pth -o test.h.pth </pre>

<p>This option is transparently used by <tt>clang</tt> when generating PTH
files. Similarly, PTH files can be used as prefix headers with the
<b><tt>-include-pth</tt></b> option:</p>

<pre>
  $ clang -cc1 -include-pth test.h.pth test.c -o test.s
</pre>

<p>Alternatively, Clang's PTH files can be used as a raw "token-cache"
(or "content" cache) of the source included by the original header
file. This means that <tt>clang -cc1</tt> searches the PTH file for a cached
substitute for <em>any</em> source file that it reads while processing a
source file.
This is done by specifying the <b><tt>-token-cache</tt></b>
option:</p>

<pre>
  $ cat test.h
  #include &lt;stdio.h&gt;
  $ clang -cc1 -emit-pth test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang -cc1 test.c -o test -token-cache test.h.pth
</pre>

<p>In this example the contents of <tt>stdio.h</tt> (and the files it includes)
will be retrieved from <tt>test.h.pth</tt>, as the PTH file is being used in
this case as a raw cache of the contents of <tt>test.h</tt>. This low-level
interface is used both to implement the high-level PTH interface and to
provide an alternative means of using PTH-style caching.</p>

<h2>PTH Design and Implementation</h2>

<p>Unlike GCC's precompiled headers, which cache the full ASTs and preprocessor
state of a header file, Clang's pretokenized header files mainly cache the raw
lexer <em>tokens</em> that are needed to segment the stream of characters in a
source file into keywords, identifiers, and operators. Consequently, PTH
mainly speeds up the lexing and preprocessing of a source file, while
parsing and type-checking must be completely redone every time a PTH file is
used.</p>

<h3>Basic Design Tradeoffs</h3>

<p>In the long term there are plans to provide an alternative PCH implementation
for Clang that also caches the work of parsing and type-checking the contents
of header files. The current implementation of PCH in Clang as pretokenized
header files was motivated by the following factors:</p>

<ul>

<li><p><b>Language independence</b>: PTH files work with any language that
Clang's lexer can handle, including C, Objective-C, and (in the early stages)
C++.
This means that development of language features at the parsing level or above
(which covers almost all of the interesting pieces) does not require PTH to be
modified.</p></li>

<li><p><b>Simple design</b>: Relatively speaking, PTH has a simple design and
implementation, making it easy to test. Further, because the machinery for PTH
resides at the lower levels of the Clang library stack, it is fairly
straightforward to profile and optimize.</p></li>
</ul>

<p>Further, compared to GCC's PCH implementation (which is the dominant
precompiled header file implementation that Clang can be directly compared
against), the PTH design in Clang yields several attractive features:</p>

<ul>

<li><p><b>Architecture independence</b>: In contrast to GCC's PCH files (and
those of several other compilers), Clang's PTH files are
architecture-independent, requiring only a single PTH file when building a
program for multiple architectures.</p>

<p>For example, on Mac OS X one may wish to
compile a "universal binary" that runs on PowerPC, 32-bit Intel
(i386), and 64-bit Intel architectures. GCC requires a separate PCH file for
each architecture, as the definitions of types in the AST are
architecture-specific. Since a Clang PTH file essentially represents a lexical
cache of header files, a single PTH file can be safely used when compiling for
multiple architectures. This can also reduce compile times because only a single
PTH file needs to be generated during a build instead of several.</p></li>

<li><p><b>Reduced memory pressure</b>: Similar to GCC,
Clang reads PTH files using memory mapping (i.e., <tt>mmap</tt>).
Clang, however, memory maps PTH files as read-only, meaning that multiple
invocations of <tt>clang -cc1</tt> can share the same pages in memory from a
memory-mapped PTH file.
In comparison, GCC memory maps its PCH files but
then modifies those pages in memory, incurring copy-on-write costs. The
read-only nature of PTH can greatly reduce memory pressure for builds involving
multiple cores, thus improving overall scalability.</p></li>

<li><p><b>Fast generation</b>: PTH files can be generated in a small fraction
of the time needed to generate GCC's PCH files. Since PTH/PCH generation is a
serial operation that typically blocks progress during a build, faster
generation time leads to improved processor utilization with parallel builds on
multicore machines.</p></li>

</ul>

<p>Despite these strengths, PTH's simple design suffers some algorithmic
handicaps compared to other PCH strategies such as those used by GCC. While PTH
can greatly speed up the processing of a header file, the amount of work
required to process a header file is still roughly linear in the size of the
header file. In contrast, the amount of work done by GCC to process a
precompiled header is (theoretically) constant (the ASTs for the header are
literally memory mapped into the compiler). This means that during actual
compilation the compiler needs to process only those pieces of the header file
that are referenced by the source file including the header. While
GCC's particular implementation of PCH gives up some of these algorithmic
strengths through its use of copy-on-write pages, the approach itself can
fundamentally dominate at an algorithmic level, especially when one considers
header files of arbitrary size.</p>

<p>There are tentative plans to implement a complementary PCH implementation
for Clang based on the lazy deserialization of ASTs.
This approach would
theoretically have the same constant-time algorithmic advantages just mentioned
but would also retain some of the strengths of PTH, such as reduced memory
pressure (ideal for multicore builds).</p>

<h3>Internal PTH Optimizations</h3>

<p>While the main optimization employed by PTH is to reduce the lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:</p>

<ul>

<li><p><em><tt>stat</tt> caching</em>: PTH files cache information obtained via
calls to <tt>stat</tt> that <tt>clang -cc1</tt> uses to resolve which files are
included by <tt>#include</tt> directives. This greatly reduces the overhead
involved in context-switching to the kernel to resolve included files.</p></li>

<li><p><em>Fast skipping of <tt>#ifdef</tt>...<tt>#endif</tt> chains</em>:
PTH files record the basic structure of nested preprocessor blocks. When the
condition of a preprocessor block is false, all of its tokens are immediately
skipped instead of being handled by Clang's
preprocessor.</p></li>

</ul>

</div>
</body>
</html>