<html>
  <head>
  <title>Pretokenized Headers (PTH)</title>
  <link type="text/css" rel="stylesheet" href="../menu.css" />
  <link type="text/css" rel="stylesheet" href="../content.css" />
  <style type="text/css">
    td {
    vertical-align: top;
    }
  </style>
</head>
<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Pretokenized Headers (PTH)</h1>

<p>This document first describes the low-level
interface for using PTH and then briefly elaborates on its design and
implementation.  If you are interested in the end-user view, please see the
<a href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>


<h2>Using Pretokenized Headers with <tt>clang</tt> (Low-level Interface)</h2>

<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports three command-line
options for generating and using PTH files.</p>

<p>To generate PTH files using <tt>clang -cc1</tt>, use the option
<b><tt>-emit-pth</tt></b>:</p>

<pre> $ clang -cc1 test.h -emit-pth -o test.h.pth </pre>

<p>This option is transparently used by <tt>clang</tt> when generating PTH
files. Similarly, PTH files can be used as prefix headers using the
<b><tt>-include-pth</tt></b> option:</p>

<pre>
  $ clang -cc1 -include-pth test.h.pth test.c -o test.s
</pre>

<p>Alternatively, Clang's PTH files can be used as a raw &quot;token-cache&quot;
(or &quot;content&quot; cache) of the source included by the original header
file. This means that the contents of the PTH file are searched as substitutes
for <em>any</em> source files that are used by <tt>clang -cc1</tt> to process a
source file. This is done by specifying the <b><tt>-token-cache</tt></b>
option:</p>

<pre>
  $ cat test.h
  #include &lt;stdio.h&gt;
  $ clang -cc1 -emit-pth test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang -cc1 test.c -o test -token-cache test.h.pth
</pre>

<p>In this example the contents of <tt>stdio.h</tt> (and the files it includes)
will be retrieved from <tt>test.h.pth</tt>, as the PTH file is being used in
this case as a raw cache of the contents of <tt>test.h</tt>. This is a low-level
interface used both to implement the high-level PTH interface and to
provide an alternative means of using PTH-style caching.</p>
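
<p>The token-cache lookup described above can be sketched roughly as follows.
This is an illustration in Python, not Clang's C++ implementation: a cache
keyed by file path is consulted before a file is lexed, and only misses fall
back to reading and lexing the source. The names <tt>lex</tt>,
<tt>build_cache</tt>, and <tt>tokens_for</tt>, and the whitespace-splitting
stand-in for a lexer, are hypothetical and chosen purely for this sketch.</p>

```python
import os
import tempfile


def lex(text):
    """Toy stand-in for a lexer: split on whitespace (an assumption, not Clang's lexer)."""
    return text.split()


def build_cache(paths):
    """Lex each file once and keep its tokens keyed by path (plays the role of a PTH file)."""
    cache = {}
    for path in paths:
        with open(path) as handle:
            cache[path] = lex(handle.read())
    return cache


def tokens_for(path, cache, stats):
    """Return the tokens for path, preferring the cache over re-lexing the source."""
    if path in cache:
        stats["hits"] += 1
        return cache[path]
    stats["misses"] += 1
    with open(path) as handle:
        return lex(handle.read())


def demo():
    with tempfile.TemporaryDirectory() as root:
        header = os.path.join(root, "test.h")
        source = os.path.join(root, "test.c")
        with open(header, "w") as handle:
            handle.write("int puts ( const char * ) ;\n")
        with open(source, "w") as handle:
            handle.write("int main ( void ) { return 0 ; }\n")
        # Only the header is cached; the source file is lexed on demand.
        cache = build_cache([header])
        stats = {"hits": 0, "misses": 0}
        header_tokens = tokens_for(header, cache, stats)
        source_tokens = tokens_for(source, cache, stats)
        return header_tokens, source_tokens, stats
```

<p>In the sketch, calling <tt>demo()</tt> serves the header from the cache and
lexes only the uncached source file, which mirrors how a cached file's contents
stand in for the file on disk.</p>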

<h2>PTH Design and Implementation</h2>

<p>Unlike GCC's precompiled headers, which cache the full ASTs and preprocessor
state of a header file, Clang's pretokenized header files mainly cache the raw
lexer <em>tokens</em> that are needed to segment the stream of characters in a
source file into keywords, identifiers, and operators. Consequently, PTH mainly
speeds up the lexing and preprocessing of a source file, while
parsing and type-checking must be completely redone every time a PTH file is
used.</p>

<h3>Basic Design Tradeoffs</h3>

<p>In the long term there are plans to provide an alternate PCH implementation
for Clang that also caches the work for parsing and type checking the contents
of header files. The current implementation of PCH in Clang as pretokenized
header files was motivated by the following factors:</p>

<ul>

<li><p><b>Language independence</b>: PTH files work with any language that
Clang's lexer can handle, including C, Objective-C, and (in the early stages)
C++. This means that development of language features at the parsing level or
above (which covers almost all of the interesting pieces) does not require PTH
to be modified.</p></li>

<li><b>Simple design</b>: Relatively speaking, PTH has a simple design and
implementation, making it easy to test. Further, because the machinery for PTH
resides at the lower levels of the Clang library stack, it is fairly
straightforward to profile and optimize.</li>
</ul>

<p>Further, compared to GCC's PCH implementation (which is the dominant
precompiled header file implementation that Clang can be directly compared
against), the PTH design in Clang yields several attractive features:</p>

<ul>

<li><p><b>Architecture independence</b>: In contrast to GCC's PCH files (and
those of several other compilers), Clang's PTH files are architecture
independent, requiring only a single PTH file when building a program for
multiple architectures.</p>

<p>For example, on Mac OS X one may wish to
compile a &quot;universal binary&quot; that runs on PowerPC, 32-bit Intel
(i386), and 64-bit Intel architectures. GCC requires a separate PCH file for
each architecture, as the definitions of types in the AST are
architecture-specific. Since a Clang PTH file essentially represents a lexical
cache of header files, a single PTH file can be safely used when compiling for
multiple architectures. This can also reduce compile times because only a single
PTH file needs to be generated during a build instead of several.</p></li>

<li><p><b>Reduced memory pressure</b>: Similar to GCC,
Clang reads PTH files via the use of memory mapping (i.e., <tt>mmap</tt>).
Clang, however, memory maps PTH files as read-only, meaning that multiple
invocations of <tt>clang -cc1</tt> can share the same pages in memory from a
memory-mapped PTH file. In comparison, GCC memory maps its PCH files but
also modifies those pages in memory, incurring copy-on-write costs. The
read-only nature of PTH can greatly reduce memory pressure for builds involving
multiple cores, thus improving overall scalability.</p></li>

<li><p><b>Fast generation</b>: PTH files can be generated in a small fraction
of the time needed to generate GCC's PCH files. Since PTH/PCH generation is a
serial operation that typically blocks progress during a build, faster
generation time leads to improved processor utilization with parallel builds on
multicore machines.</p></li>

</ul>
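
<p>The difference between a read-only mapping and a copy-on-write mapping,
mentioned under reduced memory pressure above, can be illustrated with a small
Python sketch. It uses the standard <tt>mmap</tt> module rather than anything
from Clang or GCC, and the file name <tt>cache.bin</tt> and its contents are
hypothetical stand-ins for a cache file. A read-only mapping rejects writes, so
its pages can stay shared, while a copy-on-write mapping accepts writes by
giving the process its own private copy of the touched page.</p>

```python
import mmap
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical stand-in for a PTH or PCH file on disk.
        path = os.path.join(root, "cache.bin")
        with open(path, "wb") as handle:
            handle.write(b"TOKENS")

        with open(path, "rb") as handle:
            # Read-only mapping: the process cannot dirty these pages.
            shared = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                shared[0:1] = b"X"
                write_rejected = False
            except TypeError:
                write_rejected = True
            shared_view = bytes(shared[:])
            shared.close()

        with open(path, "rb") as handle:
            # Copy-on-write mapping: the write lands in a private copy of the page.
            private = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_COPY)
            private[0:1] = b"X"
            private_view = bytes(private[:])
            private.close()

        # The file itself is untouched by the copy-on-write modification.
        with open(path, "rb") as handle:
            on_disk = handle.read()

    return write_rejected, shared_view, private_view, on_disk
```

<p>In the sketch, the private mapping sees the modified bytes while the file on
disk keeps its original contents. That private copy is the per-process memory
cost that a read-only mapping avoids.</p>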

<p>Despite these strengths, PTH's simple design suffers from some algorithmic
handicaps compared to other PCH strategies such as those used by GCC. While PTH
can greatly speed up the processing time of a header file, the amount of work
required to process a header file is still roughly linear in the size of the
header file. In contrast, the amount of work done by GCC to process a
precompiled header is (theoretically) constant (the ASTs for the header are
literally memory mapped into the compiler). This means that the only pieces of
the header file the compiler needs to process during actual compilation are
those referenced by the source file including the header. While
GCC's particular implementation of PCH gives up some of these algorithmic
strengths through its use of copy-on-write pages, the approach itself can
fundamentally dominate at an algorithmic level, especially when one considers
header files of arbitrary size.</p>

<p>There are plans to potentially implement a complementary PCH implementation
for Clang based on the lazy deserialization of ASTs. This approach would
theoretically have the same constant-time algorithmic advantages just mentioned
but would also retain some of the strengths of PTH such as reduced memory
pressure (ideal for multi-core builds).</p>

<h3>Internal PTH Optimizations</h3>

<p>While the main optimization employed by PTH is to reduce lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:</p>

<ul>

<li><p><em><tt>stat</tt> caching</em>: PTH files cache information obtained via
calls to <tt>stat</tt> that <tt>clang -cc1</tt> uses to resolve which files are
included by <tt>#include</tt> directives. This greatly reduces the overhead
involved in context-switching to the kernel to resolve included files.</p></li>

<li><p><em>Fast skipping of <tt>#ifdef</tt>...<tt>#endif</tt> chains</em>:
PTH files record the basic structure of nested preprocessor blocks. When the
condition of the preprocessor block is false, all of its tokens are immediately
skipped instead of requiring them to be handled by Clang's
preprocessor.</p></li>

</ul>
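
<p>Both optimizations listed above can be sketched in a few lines of Python.
This is a rough illustration under stated assumptions, not Clang's
implementation or the PTH file format: the input is treated as a list of lines
rather than tokens, only <tt>#ifdef</tt> and <tt>#endif</tt> are recognized,
and the <tt>stat</tt> cache lives in memory instead of inside a PTH file. The
names <tt>build_skip_table</tt>, <tt>preprocess</tt>, and <tt>StatCache</tt>
are hypothetical.</p>

```python
import os


def build_skip_table(lines):
    """Map the index of each '#ifdef' line to the index of its matching '#endif'."""
    table = {}
    open_blocks = []
    for index, line in enumerate(lines):
        words = line.split()
        if not words:
            continue
        if words[0] == "#ifdef":
            open_blocks.append(index)
        elif words[0] == "#endif":
            table[open_blocks.pop()] = index
    return table


def preprocess(lines, defined, skip_table):
    """Return the surviving lines and how many lines were actually visited."""
    kept = []
    visited = 0
    index = 0
    while index != len(lines):
        visited += 1
        words = lines[index].split()
        if words and words[0] == "#ifdef":
            if words[1] not in defined:
                # False condition: jump past the whole block in one step.
                index = skip_table[index] + 1
                continue
        elif words and words[0] == "#endif":
            pass
        else:
            kept.append(lines[index])
        index += 1
    return kept, visited


class StatCache:
    """Remember os.stat results so repeated lookups avoid another system call."""

    def __init__(self):
        self.entries = {}
        self.system_calls = 0

    def stat(self, path):
        if path not in self.entries:
            self.system_calls += 1
            try:
                self.entries[path] = os.stat(path)
            except OSError:
                # Remembering failed lookups is a choice made for this sketch.
                self.entries[path] = None
        return self.entries[path]


def demo():
    lines = [
        "#ifdef FEATURE_A",
        "int a;",
        "#ifdef FEATURE_B",
        "int b;",
        "#endif",
        "int c;",
        "#endif",
        "int d;",
    ]
    skip_table = build_skip_table(lines)
    kept, visited = preprocess(lines, set(), skip_table)
    cache = StatCache()
    for _ in range(3):
        cache.stat(os.curdir)
    return skip_table, kept, visited, cache.system_calls
```

<p>With no macros defined, the sketch visits only two of the eight lines,
because the outer block is skipped in a single jump, and three lookups of the
same path cost one call to <tt>os.stat</tt>.</p>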

</div>
</body>
</html>