===============
pycparser v2.20
===============


.. image:: https://travis-ci.org/eliben/pycparser.png?branch=master
  :align: center
  :target: https://travis-ci.org/eliben/pycparser

.. image:: https://ci.appveyor.com/api/projects/status/wrup68o5y8nuk1i9?svg=true
  :align: center
  :target: https://ci.appveyor.com/project/eliben/pycparser/

.. contents::
  :backlinks: none

.. sectnum::


Introduction
============

What is pycparser?
------------------

**pycparser** is a parser for the C language, written in pure Python. It is a
module designed to be easily integrated into applications that need to parse
C source code.

What is it good for?
--------------------

Anything that needs C code to be parsed. The following are some uses for
**pycparser**, taken from real user reports:

* C code obfuscator
* Front-end for various specialized C compilers
* Static code checker
* Automatic unit-test discovery
* Adding specialized extensions to the C language

One of the most popular uses of **pycparser** is in the `cffi
<https://cffi.readthedocs.io/en/latest/>`_ library, which uses it to parse the
declarations of C functions and types in order to auto-generate FFIs.

**pycparser** is unique in the sense that it's written in pure Python, a very
high-level language that's easy to experiment with and tweak. To people familiar
with Lex and Yacc, **pycparser**'s code will be simple to understand. It also
has no external dependencies (except for a Python interpreter), making it very
simple to install and deploy.

Which version of C does pycparser support?
------------------------------------------

**pycparser** aims to support the full C99 language (according to the standard
ISO/IEC 9899). Some features from C11 are also supported, and patches to support
more are welcome.

**pycparser** supports very few GCC extensions, but it's fairly easy to set
things up so that it parses code with a lot of GCC-isms successfully. See the
`FAQ <https://github.com/eliben/pycparser/wiki/FAQ>`_ for more details.

What grammar does pycparser follow?
-----------------------------------

**pycparser** very closely follows the C grammar provided in Annex A of the C99
standard (ISO/IEC 9899).

How is pycparser licensed?
--------------------------

`BSD license <https://github.com/eliben/pycparser/blob/master/LICENSE>`_.

Contact details
---------------

For reporting problems with **pycparser** or submitting feature requests, please
open an `issue <https://github.com/eliben/pycparser/issues>`_, or submit a
pull request.


Installing
==========

Prerequisites
-------------

* **pycparser** was tested on Python 2.7, 3.4-3.6, on both Linux and
  Windows. It should work on any later version (in both the 2.x and 3.x lines)
  as well.

* **pycparser** has no external dependencies. The only non-stdlib library it
  uses is PLY, which is bundled in ``pycparser/ply``. The current PLY version is
  3.10, retrieved from `<http://www.dabeaz.com/ply/>`_.

Note that **pycparser** (and PLY) uses docstrings for grammar specifications.
Python installations that strip docstrings (such as when using the Python
``-OO`` option) will fail to instantiate and use **pycparser**. You can try to
work around this problem by making sure the PLY parsing tables are pre-generated
in normal mode; this isn't an officially supported/tested mode of operation,
though.
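
As a minimal sketch of guarding against this, an application could check up
front whether the interpreter strips docstrings. The helper name
``docstrings_available`` is hypothetical and not part of **pycparser**; the
check relies only on standard Python behavior (``sys.flags.optimize`` is 2
under ``-OO``):

.. code-block:: python

    import sys


    def docstrings_available():
        """Return True when this interpreter keeps docstrings."""
        # Under -OO this function's own docstring is stripped, so __doc__ is None.
        return docstrings_available.__doc__ is not None


    if not docstrings_available():
        raise RuntimeError(
            "docstrings are stripped (optimize level %d); pycparser needs them "
            "for its grammar specifications" % sys.flags.optimize
        )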

Installation process
--------------------

Installing **pycparser** is very simple. Once you download and unzip the
package, you just have to execute the standard ``python setup.py install``. The
setup script will then place the ``pycparser`` module into ``site-packages`` in
your Python installation's library directory.

Alternatively, since **pycparser** is listed in the `Python Package Index
<https://pypi.org/project/pycparser/>`_ (PyPI), you can install it using your
favorite Python packaging/distribution tool, for example with::

    > pip install pycparser

Known problems
--------------

* Some users who've installed a new version of **pycparser** over an existing
  version ran into a problem using the newly installed library. This has to do
  with parse tables staying around as ``.pyc`` files from the older version. If
  you see unexplained errors from **pycparser** after an upgrade, remove it (by
  deleting the ``pycparser`` directory in your Python's ``site-packages``, or
  wherever you installed it) and install again.
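
Before reinstalling, it can help to see what is actually left behind. The
following is a minimal sketch for Python 3, assuming the generated PLY tables
use the default module names ``lextab`` and ``yacctab``. The helper
``find_stale_tables`` is hypothetical; it only lists files and does not delete
anything:

.. code-block:: python

    from pathlib import Path


    def find_stale_tables(package_dir):
        """List generated table files (source and bytecode) under package_dir."""
        root = Path(package_dir)
        stale = []
        for pattern in ("lextab*.py*", "yacctab*.py*"):
            # rglob also descends into __pycache__, where Python 3 keeps .pyc files.
            stale.extend(root.rglob(pattern))
        return sorted(stale)

Pass it the directory of the installed package, for example
``os.path.dirname(pycparser.__file__)``.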


Using
=====

Interaction with the C preprocessor
-----------------------------------

In order to be compilable, C code must be preprocessed by the C preprocessor,
``cpp``. ``cpp`` handles preprocessing directives like ``#include`` and
``#define``, removes comments, and performs other minor tasks that prepare the C
code for compilation.

For all but the most trivial snippets of C code, **pycparser**, like a C
compiler, must receive preprocessed C code in order to function correctly. If
you import the top-level ``parse_file`` function from the **pycparser** package,
it will interact with ``cpp`` for you, as long as it's in your PATH, or you
provide a path to it.

Note also that you can use ``gcc -E`` or ``clang -E`` instead of ``cpp``. See
the ``using_gcc_E_libc.py`` example for more details. Windows users can download
and install a binary build of Clang for Windows `from this website
<http://llvm.org/releases/download.html>`_.
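
A minimal sketch of calling ``parse_file`` with the preprocessor enabled,
loosely modeled on ``using_gcc_E_libc.py``. The helper names ``build_cpp_args``
and ``parse_preprocessed`` and the default include directory are assumptions
made for illustration; ``parse_file`` and its ``use_cpp``, ``cpp_path`` and
``cpp_args`` parameters belong to **pycparser**:

.. code-block:: python

    def build_cpp_args(include_dirs, needs_dash_e=True):
        """Build the argument list handed to the preprocessor."""
        # gcc and clang need -E to stop after preprocessing; plain cpp does not.
        args = ["-E"] if needs_dash_e else []
        args.extend("-I" + path for path in include_dirs)
        return args


    def parse_preprocessed(filename, cpp_path="gcc",
                           include_dirs=("utils/fake_libc_include",),
                           needs_dash_e=True):
        """Run the preprocessor on filename and parse the result into an AST."""
        # Imported here so that build_cpp_args stays usable on its own.
        from pycparser import parse_file

        return parse_file(filename, use_cpp=True, cpp_path=cpp_path,
                          cpp_args=build_cpp_args(include_dirs, needs_dash_e))

With plain ``cpp``, the call would be
``parse_preprocessed(path, cpp_path="cpp", needs_dash_e=False)``.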

What about the standard C library headers?
------------------------------------------

C code almost always ``#include``\s various header files from the standard C
library, like ``stdio.h``. While (with some effort) **pycparser** can be made to
parse the standard headers from any C compiler, it's much simpler to use the
provided "fake" standard includes in ``utils/fake_libc_include``. These are
standard C header files that contain only the bare necessities to allow valid
parsing of the files that use them. As a bonus, since they're minimal, they can
significantly improve the performance of parsing large C files.

The key point to understand here is that **pycparser** doesn't really care about
the semantics of types. It only needs to know whether some token encountered in
the source is a previously defined type. This is essential in order to be able
to parse C correctly.

See `this blog post
<https://eli.thegreenplace.net/2015/on-parsing-c-type-declarations-and-fake-headers>`_
for more details.
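
To make the point about type names concrete, here is a small sketch that
assumes **pycparser** is importable. The same statement ``T * x;`` parses as a
multiplication when ``T`` is an unknown identifier, and as a pointer
declaration once ``T`` has been introduced with ``typedef``. The helper name
``first_statement_kind`` is hypothetical:

.. code-block:: python

    def first_statement_kind(source):
        """Return the AST class name of the first statement in the last function."""
        from pycparser import c_parser

        ast = c_parser.CParser().parse(source, filename="<sketch>")
        func = ast.ext[-1]  # the function definition is the last top-level item
        return type(func.body.block_items[0]).__name__


    WITHOUT_TYPEDEF = "void f(void) { T * x; }"
    WITH_TYPEDEF = "typedef int T; void f(void) { T * x; }"

``first_statement_kind(WITHOUT_TYPEDEF)`` should report ``BinaryOp``, while
``first_statement_kind(WITH_TYPEDEF)`` should report ``Decl``. The fake headers
exist to supply exactly this kind of ``typedef``.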

Note that the fake headers are not included in the ``pip`` package nor installed
via ``setup.py`` (`#224 <https://github.com/eliben/pycparser/issues/224>`_).

Basic usage
-----------

Take a look at the |examples|_ directory of the distribution for a few examples
of using **pycparser**. These should be enough to get you started. Please note
that most realistic C code samples would require running the C preprocessor
before passing the code to **pycparser**; see the previous sections for more
details.

.. |examples| replace:: ``examples``
.. _examples: examples
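
For a first taste without the preprocessor, the following minimal sketch parses
a self-contained snippet from a string and lists the functions it defines.
``function_names`` is a hypothetical helper and not part of the package;
``CParser.parse`` and the ``c_ast.FuncDef`` node class are:

.. code-block:: python

    def function_names(source):
        """Return the names of all functions defined at the top level of source."""
        from pycparser import c_ast, c_parser

        ast = c_parser.CParser().parse(source, filename="<string>")
        # ast.ext holds the top-level items: declarations, typedefs, definitions.
        return [node.decl.name for node in ast.ext
                if isinstance(node, c_ast.FuncDef)]


    SNIPPET = """
    int add(int a, int b) { return a + b; }
    int counter;
    static void reset(void) { counter = 0; }
    """

Here ``function_names(SNIPPET)`` should list ``add`` and ``reset`` but not the
variable ``counter``. Calling ``show()`` on the parsed tree prints all of its
nodes, which is a convenient way to explore the AST.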


Advanced usage
--------------

The public interface of **pycparser** is well documented with comments in
``pycparser/c_parser.py``. For a detailed overview of the various AST nodes
created by the parser, see ``pycparser/_c_ast.cfg``.
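
Each entry in ``_c_ast.cfg`` lists the fields of a node, and those fields become
attributes of the generated class. As a sketch, assuming the entry
``BinaryOp: [op, left*, right*]``, the parts of an expression can be read as
follows (``describe_binop`` is a hypothetical helper):

.. code-block:: python

    def describe_binop(expr_source):
        """Parse 'int v = <expr>;' and describe its top-level binary operation."""
        from pycparser import c_parser

        ast = c_parser.CParser().parse("int v = %s;" % expr_source)
        init = ast.ext[0].init  # Decl.init holds the initializer expression
        # 'op' is a plain attribute; 'left' and 'right' are child nodes.
        return init.op, type(init.left).__name__, type(init.right).__name__

For ``describe_binop("a + b * 2")`` the operator should be ``+``, with an ``ID``
on the left and a nested ``BinaryOp`` on the right.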

There's also a `FAQ available here <https://github.com/eliben/pycparser/wiki/FAQ>`_.
In any case, you can always drop me an `email <eliben@gmail.com>`_ for help.


Modifying
=========

There are a few points to keep in mind when modifying **pycparser**:

* The code for **pycparser**'s AST nodes is automatically generated from a
  configuration file, ``_c_ast.cfg``, by ``_ast_gen.py``. If you modify the AST
  configuration, make sure to re-generate the code.
* Make sure you understand the optimized mode of **pycparser**; for that, you
  must read the docstring in the constructor of the ``CParser`` class. For
  development you should create the parser without optimizations, so that it
  will regenerate the Yacc and Lex tables when you change the grammar.
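
For the second point, a sketch of creating a non-optimized parser during
development. ``lex_optimize`` and ``yacc_optimize`` are keyword arguments of the
``CParser`` constructor in this version; ``make_dev_parser`` is a hypothetical
helper, and the constructor docstring remains the authoritative description:

.. code-block:: python

    def make_dev_parser():
        """Create a CParser that rebuilds its Lex and Yacc tables from the grammar."""
        from pycparser import c_parser

        # Slower to construct than the default, but picks up grammar edits.
        return c_parser.CParser(lex_optimize=False, yacc_optimize=False)

Note that in this mode PLY may write a freshly generated ``yacctab.py`` to the
table output directory, so it is best run from a scratch directory.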


Package contents
================

Once you unzip the ``pycparser`` package, you'll see the following files and
directories:

README.rst:
  This README file.

LICENSE:
  The pycparser license.

setup.py:
  Installation script.

examples/:
  A directory with some examples of using **pycparser**.

pycparser/:
  The **pycparser** module source code.

tests/:
  Unit tests.

utils/fake_libc_include:
  Minimal standard C library include files that should allow parsing any C code.

utils/internal/:
  Internal utilities for my own use. You probably don't need them.


Contributors
============

Some people have contributed to **pycparser** by opening issues on bugs they've
found and/or submitting patches. The list of contributors is in the CONTRIBUTORS
file in the source distribution. After **pycparser** moved to GitHub I stopped
updating this list because GitHub does a much better job at tracking
contributions.