================================
Using CFFI for embedding
================================

.. contents::

You can use CFFI to generate C code which exports the API of your choice
to any C application that wants to link with this C code.  This API,
which you define yourself, ends up as the API of a ``.so/.dll/.dylib``
library---or you can statically link it within a larger application.

Possible use cases:

* Exposing a library written in Python directly to C/C++ programs.

* Using Python to make a "plug-in" for an existing C/C++ program that is
  already written to load them.

* Using Python to implement part of a larger C/C++ application (with
  static linking).

* Writing a small C/C++ wrapper around Python, hiding the fact that the
  application is actually written in Python (to make a custom
  command-line interface; for distribution purposes; or simply to make
  it a bit harder to reverse-engineer the application).

The general idea is as follows:

* You write and execute a Python script, which produces a ``.c`` file
  with the API of your choice (and optionally compiles it into a
  ``.so/.dll/.dylib``).  The script also gives some Python code to be
  "frozen" inside the ``.so``.

* At runtime, the C application loads this ``.so/.dll/.dylib`` (or is
  statically linked with the ``.c`` source) without having to know that
  it was produced from Python and CFFI.

* The first time a C function is called, Python is initialized and
  the frozen Python code is executed.

* The frozen Python code defines more Python functions that implement the
  C functions of your API, which are then used for all subsequent C
  function calls.

One of the goals of this approach is to be entirely independent from
the CPython C API: no ``Py_Initialize()`` nor ``PyRun_SimpleString()``
nor even ``PyObject``.  It works identically on CPython and PyPy.

This is entirely *new in version 1.5.*  (PyPy contains CFFI 1.5 since
release 5.0.)


Usage
-----

.. __: overview.html#embedding

See the `paragraph in the overview page`__ for a quick introduction.
In this section, we explain every step in more detail.  We will use
this slightly expanded example:

.. code-block:: c

    /* file plugin.h */
    typedef struct { int x, y; } point_t;
    extern int do_stuff(point_t *);

.. code-block:: c

    /* file plugin.h, Windows-friendly version */
    typedef struct { int x, y; } point_t;

    /* When including this file from ffibuilder.set_source(), the
       following macro is defined to '__declspec(dllexport)'.  When
       including this file directly from your C program, we define
       it to 'extern __declspec(dllimport)' instead.

       With non-MSVC compilers we simply define it to 'extern'.
       (The 'extern' is needed for sharing global variables;
       functions would be fine without it.  The macros always
       include 'extern': you must not repeat it when using the
       macros later.)
    */
    #ifndef CFFI_DLLEXPORT
    #  if defined(_MSC_VER)
    #    define CFFI_DLLEXPORT  extern __declspec(dllimport)
    #  else
    #    define CFFI_DLLEXPORT  extern
    #  endif
    #endif

    CFFI_DLLEXPORT int do_stuff(point_t *);

.. code-block:: python

    # file plugin_build.py
    import cffi
    ffibuilder = cffi.FFI()

    with open('plugin.h') as f:
        # read plugin.h and pass it to embedding_api(), manually
        # removing the '#' directives and the CFFI_DLLEXPORT
        data = ''.join([line for line in f if not line.startswith('#')])
        data = data.replace('CFFI_DLLEXPORT', '')
        ffibuilder.embedding_api(data)

    ffibuilder.set_source("my_plugin", r'''
        #include "plugin.h"
    ''')

    ffibuilder.embedding_init_code("""
        from my_plugin import ffi

        @ffi.def_extern()
        def do_stuff(p):
            print("adding %d and %d" % (p.x, p.y))
            return p.x + p.y
    """)

    ffibuilder.compile(target="plugin-1.5.*", verbose=True)
    # or: ffibuilder.emit_c_code("my_plugin.c")

Running the code above produces a *DLL*, i.e. a dynamically-loadable
library.  It is a file with the extension ``.dll`` on Windows,
``.dylib`` on Mac OS/X, or ``.so`` on other platforms.  As usual, it
is produced by generating some intermediate ``.c`` code and then
calling the regular platform-specific C compiler.  See below__ for
some pointers to C-level issues with using the produced library.

.. __: `Issues about using the .so`_

Here are some details about the methods used above:

* **ffibuilder.embedding_api(source):** parses the given C source, which
  declares functions that you want to be exported by the DLL.  It can
  also declare types, constants and global variables that are part of
  the C-level API of your DLL.

  The functions that are found in ``source`` will be automatically
  defined in the ``.c`` file: they will contain code that initializes
  the Python interpreter the first time any of them is called,
  followed by code to call the attached Python function (with
  ``@ffi.def_extern()``, see next point).

  The global variables, on the other hand, are not automatically
  produced.  You have to write their definition explicitly in
  ``ffibuilder.set_source()``, as regular C code (see the point after next).

* **ffibuilder.embedding_init_code(python_code):** this gives
  initialization-time Python source code.  This code is copied
  ("frozen") inside the DLL.  At runtime, the code is executed when
  the DLL is first initialized, just after Python itself is
  initialized.  This newly initialized Python interpreter has got an
  extra "built-in" module that can be loaded magically without
  accessing any files, with a line like "``from my_plugin import ffi,
  lib``".  The name ``my_plugin`` comes from the first argument to
  ``ffibuilder.set_source()``.  This module represents "the caller's C world"
  from the point of view of Python.

  The initialization-time Python code can import other modules or
  packages as usual.  You may have typical Python issues like needing
  to set up ``sys.path`` somehow manually first.

  For every function declared within ``ffibuilder.embedding_api()``, the
  initialization-time Python code or one of the modules it imports
  should use the decorator ``@ffi.def_extern()`` to attach a
  corresponding Python function to it.

  If the initialization-time Python code fails with an exception, then
  you get a traceback printed to stderr, along with more information
  to help you identify problems like wrong ``sys.path``.  If some
  function remains unattached at the time where the C code tries to
  call it, an error message is also printed to stderr and the function
  returns zero/null.

  Note that the CFFI module never calls ``exit()``, but CPython itself
  contains code that calls ``exit()``, for example if importing
  ``site`` fails.  This may be worked around in the future.

* **ffibuilder.set_source(c_module_name, c_code):** set the name of the
  module from Python's point of view.  It also gives more C code which
  will be included in the generated C code.  In trivial examples it
  can be an empty string.  It is where you would ``#include`` some
  other files, define global variables, and so on.  The macro
  ``CFFI_DLLEXPORT`` is available to this C code: it expands to the
  platform-specific way of saying "the following declaration should be
  exported from the DLL".  For example, you would put "``extern int
  my_glob;``" in ``ffibuilder.embedding_api()`` and "``CFFI_DLLEXPORT int
  my_glob = 42;``" in ``ffibuilder.set_source()``.

  Currently, any *type* declared in ``ffibuilder.embedding_api()`` must also
  be present in the ``c_code``.  This is automatic if this code
  contains a line like ``#include "plugin.h"`` in the example above.

* **ffibuilder.compile([target=...] [, verbose=True]):** make the C code and
  compile it.  By default, it produces a file called
  ``c_module_name.dll``, ``c_module_name.dylib`` or
  ``c_module_name.so``, but the default can be changed with the
  optional ``target`` keyword argument.  You can use
  ``target="foo.*"`` with a literal ``*`` to ask for a file called
  ``foo.dll`` on Windows, ``foo.dylib`` on OS/X and ``foo.so``
  elsewhere.  One reason for specifying an alternate ``target`` is to
  include characters not usually allowed in Python module names, like
  "``plugin-1.5.*``".

  For more complicated cases, you can call instead
  ``ffibuilder.emit_c_code("foo.c")`` and compile the resulting ``foo.c``
  file using other means.  CFFI's compilation logic is based on the
  standard library ``distutils`` package, which is really developed
  and tested for the purpose of making CPython extension modules; it
  might not always be appropriate for making general DLLs.  Also, just
  getting the C code is what you need if you do not want to make a
  stand-alone ``.so/.dll/.dylib`` file: this C file can be compiled
  and statically linked as part of a larger application.
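
The ``target="foo.*"`` naming rule described above can be modelled in a
few lines of Python.  This is only an illustrative sketch:
``expand_target()`` is a hypothetical helper, not part of CFFI, and it
mirrors nothing more than the three extensions named in this section.

.. code-block:: python

    import sys

    def expand_target(target, platform=sys.platform):
        """Replace a trailing '.*' with the DLL extension that this page
        names for the platform (hypothetical helper, not CFFI's code)."""
        if not target.endswith('.*'):
            return target              # explicit file name: keep as-is
        if platform == 'win32':
            ext = '.dll'
        elif platform == 'darwin':
            ext = '.dylib'
        else:
            ext = '.so'
        return target[:-2] + ext       # drop the '.*', append the extension

    print(expand_target("plugin-1.5.*"))

Such a helper can be handy in a build script that needs to know, after
``ffibuilder.compile()``, which file name to copy or package.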


More reading
------------

If you're reading this page about embedding and you are not familiar
with CFFI already, here are a few pointers to what you could read
next:

* For the ``@ffi.def_extern()`` functions, integer C types are passed
  simply as Python integers; and simple pointers-to-struct and basic
  arrays are all straightforward enough.  However, sooner or later you
  will need to read about this topic in more detail here__.

* ``@ffi.def_extern()``: see `documentation here,`__ notably on what
  happens if the Python function raises an exception.

* To create Python objects attached to C data, one common solution is
  to use ``ffi.new_handle()``.  See documentation here__.

* In embedding mode, the major direction is C code that calls Python
  functions.  This is the opposite of the regular extending mode of
  CFFI, in which the major direction is Python code calling C.  That's
  why the page `Using the ffi/lib objects`_ talks first about the
  latter, and why the direction "C code that calls Python" is
  generally referred to as "callbacks" in that page.  If you also
  need to have your Python code call C code, read more about
  `Embedding and Extending`_ below.

* ``ffibuilder.embedding_api(source)``: follows the same syntax as
  ``ffibuilder.cdef()``, `documented here.`__  You can use the "``...``"
  syntax as well, although in practice it may be less useful than it
  is for ``cdef()``.  On the other hand, it is expected that often the
  C sources that you need to give to ``ffibuilder.embedding_api()`` would be
  exactly the same as the content of some ``.h`` file that you want to
  give to users of your DLL.  That's why the example above reads the
  header file, which in the simplest case looks like this::

      with open('foo.h') as f:
          ffibuilder.embedding_api(f.read())

  Note that a drawback of this approach is that ``ffibuilder.embedding_api()``
  doesn't support ``#ifdef`` directives.  You may have to use a more
  convoluted expression like::

      with open('foo.h') as f:
          lines = [line for line in f if not line.startswith('#')]
          ffibuilder.embedding_api(''.join(lines))

  As in the example above, you can also use the same ``foo.h`` from
  ``ffibuilder.set_source()``::

      ffibuilder.set_source('module_name', r'''
          #include "foo.h"
      ''')
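
The header filtering described in the last point can be wrapped in a
small function.  The sketch below is one possible way to do it, under
stated assumptions: ``header_to_cdef()`` is a hypothetical name, it only
drops whole lines that start with ``#``, and it would not cope with
multi-line macros continued with a backslash.

.. code-block:: python

    def header_to_cdef(text, macros=('CFFI_DLLEXPORT',)):
        """Drop preprocessor lines and export macros from a header's text,
        so that the result can be passed to ffibuilder.embedding_api()."""
        kept = [line for line in text.splitlines(keepends=True)
                if not line.lstrip().startswith('#')]
        source = ''.join(kept)
        for macro in macros:
            source = source.replace(macro, '')   # e.g. CFFI_DLLEXPORT
        return source

    # a shortened stand-in for the Windows-friendly plugin.h shown earlier
    header = '''\
    #ifndef CFFI_DLLEXPORT
    #  define CFFI_DLLEXPORT  extern
    #endif
    typedef struct { int x, y; } point_t;
    CFFI_DLLEXPORT int do_stuff(point_t *);
    '''
    print(header_to_cdef(header))

In a build script the result would then be used as
``ffibuilder.embedding_api(header_to_cdef(f.read()))``.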


.. __: using.html#working
.. __: using.html#def-extern
.. __: ref.html#ffi-new-handle
.. __: cdef.html#cdef

.. _`Using the ffi/lib objects`: using.html


Troubleshooting
---------------

* The error message

    cffi extension module 'c_module_name' has unknown version 0x2701

  means that the running Python interpreter located a CFFI version older
  than 1.5.  CFFI 1.5 or newer must be installed in the running Python.

* On PyPy, the error message

    debug: pypy_setup_home: directories 'lib-python' and 'lib_pypy' not
    found in pypy's shared library location or in any parent directory

  means that the ``libpypy-c.so`` file was found, but the standard library
  was not found from this location.  This occurs at least on some Linux
  distributions, because they put ``libpypy-c.so`` inside ``/usr/lib/``,
  instead of the way we recommend, which is: keep that file inside
  ``/opt/pypy/bin/`` and put a symlink to there from ``/usr/lib/``.
  The quickest fix is to do that change manually.


Issues about using the .so
--------------------------

This section describes issues that are not necessarily specific to
CFFI.  It assumes that you have obtained the ``.so/.dylib/.dll`` file as
described above, but that you have trouble using it.  (In summary: it
is a mess.  This is my own experience, slowly built by using Google and
by listening to reports from various platforms.  Please report any
inaccuracies in this section or better ways to do things.)

* The file produced by CFFI should follow this naming pattern:
  ``libmy_plugin.so`` on Linux, ``libmy_plugin.dylib`` on Mac, or
  ``my_plugin.dll`` on Windows (no ``lib`` prefix on Windows).

* First note that this file does not contain the Python interpreter
  nor the standard library of Python.  You still need it to be
  somewhere.  There are ways to compact it to a smaller number of files,
  but this is outside the scope of CFFI (please report if you used some
  of these ways successfully so that I can add some links here).

* In what we'll call the "main program", the ``.so`` can be either
  used dynamically (e.g. by calling ``dlopen()`` or ``LoadLibrary()``
  inside the main program), or at compile-time (e.g. by compiling it
  with ``gcc -lmy_plugin``).  The former case is always used if you're
  building a plugin for a program, and the program itself doesn't need
  to be recompiled.  The latter case is for making a CFFI library that
  is more tightly integrated inside the main program.

* In the case of compile-time usage: you can add the gcc
  option ``-Lsome/path/`` before ``-lmy_plugin`` to describe where the
  ``libmy_plugin.so`` is.  On some platforms, notably Linux, ``gcc``
  will complain if it can find ``libmy_plugin.so`` but not
  ``libpython27.so`` or ``libpypy-c.so``.  To fix it, you need to call
  ``LD_LIBRARY_PATH=/some/path/to/libpypy gcc``.

* When actually executing the main program, it needs to find the
  ``libmy_plugin.so`` but also ``libpython27.so`` or ``libpypy-c.so``.
  For PyPy, unpack a PyPy distribution and you get a full directory
  structure with ``libpypy-c.so`` inside a ``bin`` subdirectory, or on
  Windows ``pypy-c.dll`` inside the top directory; you must not move
  this file around, but just point to it.  One way to point to it is by
  running the main program with some environment variable:
  ``LD_LIBRARY_PATH=/some/path/to/libpypy`` on Linux,
  ``DYLD_LIBRARY_PATH=/some/path/to/libpypy`` on OS/X.

* You can avoid the ``LD_LIBRARY_PATH`` issue if you compile
  ``libmy_plugin.so`` with the path hard-coded inside in the first
  place.  On Linux, this is done by ``gcc -Wl,-rpath=/some/path``.  You
  would put this option in ``ffibuilder.set_source("my_plugin", ...,
  extra_link_args=['-Wl,-rpath=/some/path/to/libpypy'])``.  The path can
  start with ``$ORIGIN`` to mean "the directory where
  ``libmy_plugin.so`` is".  You can then specify a path relative to that
  place, like ``extra_link_args=['-Wl,-rpath=$ORIGIN/../venv/bin']``.
  (Use ``ldd libmy_plugin.so`` to look at what path is currently compiled
  in after the expansion of ``$ORIGIN``.)

  After this, you don't need ``LD_LIBRARY_PATH`` any more to locate
  ``libpython27.so`` or ``libpypy-c.so`` at runtime.  In theory it
  should also cover the call to ``gcc`` for the main program.  I wasn't
  able to make ``gcc`` happy without ``LD_LIBRARY_PATH`` on Linux if
  the rpath starts with ``$ORIGIN``, though.

* The same rpath trick might be used to let the main program find
  ``libmy_plugin.so`` in the first place without ``LD_LIBRARY_PATH``.
  (This doesn't apply if the main program uses ``dlopen()`` to load it
  as a dynamic plugin.)  You'd make the main program with ``gcc
  -Wl,-rpath=/path/to/libmyplugin``, possibly with ``$ORIGIN``.  The
  ``$`` in ``$ORIGIN`` causes various shell problems on its own: if
  using a common shell you need to say ``gcc
  -Wl,-rpath=\$ORIGIN``.  From a Makefile, you need to say
  something like ``gcc -Wl,-rpath=\$$ORIGIN``.

* On some Linux distributions, notably Debian, the ``.so`` files of
  CPython C extension modules may be compiled without saying that they
  depend on ``libpythonX.Y.so``.  This makes such Python systems
  unsuitable for embedding if the embedder uses ``dlopen(...,
  RTLD_LOCAL)``.  You get an ``undefined symbol`` error.  See
  `issue #264`__.  A workaround is to first call
  ``dlopen("libpythonX.Y.so", RTLD_LAZY|RTLD_GLOBAL)``, which will
  force ``libpythonX.Y.so`` to be loaded first.

.. __: https://foss.heptapod.net/pypy/cffi/-/issues/264
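
The rpath options discussed above are plain strings, so the quoting
issues around ``$ORIGIN`` can be checked from Python before they reach
a shell or a Makefile.  The following is a minimal sketch;
``rpath_link_args()`` is a hypothetical helper, and the relative path is
the one used as an example in this section.  It protects the ``$`` with
single quotes, which has the same effect as the backslash form shown
above.

.. code-block:: python

    import shlex

    def rpath_link_args(path):
        """Return the extra_link_args entry that hard-codes `path` as the
        rpath (hypothetical helper; the flag is the one named above)."""
        return ['-Wl,-rpath=' + path]

    args = rpath_link_args('$ORIGIN/../venv/bin')
    print(args)          # value to pass as extra_link_args=... (no shell)

    # typed into a POSIX shell, the '$' must be protected from expansion
    shell_word = shlex.quote(args[0])
    print('gcc ' + shell_word)

    # in a Makefile recipe, '$' must additionally be doubled for make
    print('gcc ' + shell_word.replace('$', '$$'))

No quoting is needed when the list is passed as ``extra_link_args`` to
``ffibuilder.set_source()``, because no shell is involved at that point.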


Using multiple CFFI-made DLLs
-----------------------------

Multiple CFFI-made DLLs can be used by the same process.

Note that all CFFI-made DLLs in a process share a single Python
interpreter.  The effect is the same as the one you get by trying to
build a large Python application by assembling a lot of unrelated
packages.  Some of these might be libraries that monkey-patch some
functions from the standard library, for example, which might be
unexpected from other parts.


Multithreading
--------------

Multithreading should work transparently, based on Python's standard
Global Interpreter Lock.

If two threads both try to call a C function when Python is not yet
initialized, then locking occurs.  One thread proceeds with
initialization and blocks the other thread.  The other thread will be
allowed to continue only when the execution of the initialization-time
Python code is done.

If the two threads call two *different* CFFI-made DLLs, the Python
initialization itself will still be serialized, but the two pieces of
initialization-time Python code will not.  The idea is that there is a
priori no reason for one DLL to wait for initialization of the other
DLL to be complete.

After initialization, Python's standard Global Interpreter Lock kicks
in.  The end result is that when one CPU progresses on executing
Python code, no other CPU can progress on executing more Python code
from another thread of the same process.  At regular intervals, the
lock switches to a different thread, so that no single thread should
appear to block indefinitely.
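
The "initialize once, block the other callers meanwhile" behaviour
described above can be pictured with ordinary Python threads.  This is
only an analogy written for this page, not CFFI's actual C
implementation; all names in it are made up.

.. code-block:: python

    import threading

    _init_lock = threading.Lock()
    _initialized = False
    init_runs = 0

    def _run_init_code():
        """Stand-in for the frozen initialization-time Python code."""
        global init_runs
        init_runs += 1

    def exported_function(x):
        """Model of one exported C function of a single DLL."""
        global _initialized
        with _init_lock:          # other callers wait here during init
            if not _initialized:
                _run_init_code()
                _initialized = True
        return x * 2              # placeholder for the attached function

    threads = [threading.Thread(target=exported_function, args=(i,))
               for i in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(init_runs)

However many threads race on the first call, the initialization code in
this model runs a single time; in the model, a second DLL would simply
have its own lock and flag.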


Testing
-------

For testing purposes, a CFFI-made DLL can be imported in a running
Python interpreter instead of being loaded like a C shared library.

You might have some issues with the file name: for example, on
Windows, Python expects the file to be called ``c_module_name.pyd``,
but the CFFI-made DLL is called ``target.dll`` instead.  The base name
``target`` is the one specified in ``ffibuilder.compile()``, and on Windows
the extension is ``.dll`` instead of ``.pyd``.  You have to rename or
copy the file, or on POSIX use a symlink.

The module then works like a regular CFFI extension module.  It is
imported with "``from c_module_name import ffi, lib``" and exposes on
the ``lib`` object all C functions.  You can test it by calling these
C functions.  The initialization-time Python code frozen inside the
DLL is executed the first time such a call is done.
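
The renaming step can be scripted.  The sketch below copies a DLL to a
file name that ``import`` should accept; ``importable_copy()`` is a
hypothetical helper, and picking the shortest entry of
``importlib.machinery.EXTENSION_SUFFIXES`` as "the generic suffix" is an
assumption that holds on common CPython builds (``.pyd`` on Windows,
``.so`` on Linux).  A dummy file stands in for the real DLL here.

.. code-block:: python

    import importlib.machinery
    import os
    import shutil
    import tempfile

    def importable_copy(dll_path, module_name, dest_dir):
        """Copy a CFFI-made DLL to a name that 'import' can locate."""
        # assumption: the shortest registered suffix is the generic one
        suffix = min(importlib.machinery.EXTENSION_SUFFIXES, key=len)
        dest = os.path.join(dest_dir, module_name + suffix)
        shutil.copyfile(dll_path, dest)
        return dest

    # demonstration with a dummy file instead of a real library
    with tempfile.TemporaryDirectory() as tmp:
        fake_dll = os.path.join(tmp, 'plugin-1.5.dll')
        with open(fake_dll, 'wb') as f:
            f.write(b'not a real library')
        copied = importable_copy(fake_dll, 'my_plugin', tmp)
        copied_exists = os.path.exists(copied)
    print(os.path.basename(copied), copied_exists)

With a real DLL, adding ``dest_dir`` to ``sys.path`` and running
``from my_plugin import ffi, lib`` would be the next step.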


Embedding and Extending
-----------------------

The embedding mode is not incompatible with the non-embedding mode of
CFFI.

You can use *both* ``ffibuilder.embedding_api()`` and
``ffibuilder.cdef()`` in the
same build script.  You put in the former the declarations you want to
be exported by the DLL; you put in the latter only the C functions and
types that you want to share between C and Python, but not export from
the DLL.

As an example of that, consider the case where you would like to have
a DLL-exported C function written in C directly, maybe to handle some
cases before calling Python functions.  To do that, you must *not* put
the function's signature in ``ffibuilder.embedding_api()``.  (Note that this
requires more hacks if you use ``ffibuilder.embedding_api(f.read())``.)
You must only write the custom function definition in
``ffibuilder.set_source()``, and prefix it with the macro ``CFFI_DLLEXPORT``:

.. code-block:: c

    CFFI_DLLEXPORT int myfunc(int a, int b)
    {
        /* implementation here */
    }

This function can, if it wants, invoke Python functions using the
general mechanism of "callbacks"---called this way because it is a
call from C to Python, although in this case it is not calling
anything back:

.. code-block:: python

    ffibuilder.cdef("""
        extern "Python" int mycb(int);
    """)

    ffibuilder.set_source("my_plugin", r"""

        static int mycb(int);   /* the callback: forward declaration, to make
                                   it accessible from the C code that follows */

        CFFI_DLLEXPORT int myfunc(int a, int b)
        {
            int product = a * b;   /* some custom C code */
            return mycb(product);
        }
    """)

and then the Python initialization code needs to contain the lines:

.. code-block:: python

    @ffi.def_extern()
    def mycb(x):
        print("hi, I'm called with x =", x)
        return x * 10

This ``@ffi.def_extern`` is attaching a Python function to the C
callback ``mycb()``, which in this case is not exported from the DLL.
Nevertheless, the automatic initialization of Python occurs when
``mycb()`` is called, if it happens to be the first function called
from C.  More precisely, it does not happen when ``myfunc()`` is
called: this is just a C function, with no extra code magically
inserted around it.  It only happens when ``myfunc()`` calls
``mycb()``.

As the above explanation hints, this is how ``ffibuilder.embedding_api()``
actually implements function calls that directly invoke Python code;
here, we have merely decomposed it explicitly, in order to add some
custom C code in the middle.

In case you need to force, from C code, Python to be initialized
before the first ``@ffi.def_extern()`` is called, you can do so by
calling the C function ``cffi_start_python()`` with no argument.  It
returns an integer, 0 or -1, to tell if the initialization succeeded
or not.  Currently there is no way to prevent a failing initialization
from also dumping a traceback and more information to stderr.
Note that the function ``cffi_start_python()`` is static: it must be
called from C source written inside ``ffibuilder.set_source()``.  To
call it from somewhere else, you need to make a function (with a
different non-static name) in the ``ffibuilder.set_source()`` that just
calls ``cffi_start_python()``.  The reason it is static is to avoid
naming conflicts in case you are ultimately trying to link a large C
program with more than one cffi embedded module in it.
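
The wrapper described in the last paragraph could look as follows in a
build script.  This is a sketch under stated assumptions: the exported
name ``my_plugin_start`` and the trivial ``do_stuff(int)`` API are made
up for the illustration, and ``cffi`` is imported inside the function
only so that the strings can be inspected without CFFI installed.

.. code-block:: python

    API = "int do_stuff(int);"

    C_SOURCE = r"""
        /* cffi_start_python() is static, so export a wrapper under a
           different name; 'my_plugin_start' is a made-up name */
        CFFI_DLLEXPORT int my_plugin_start(void)
        {
            return cffi_start_python();   /* 0 on success, -1 on failure */
        }
    """

    INIT_CODE = """
    from my_plugin import ffi

    @ffi.def_extern()
    def do_stuff(x):
        return x + 1
    """

    def make_builder():
        """Assemble the FFI builder from the three pieces above."""
        import cffi                   # needed only when actually building
        ffibuilder = cffi.FFI()
        ffibuilder.embedding_api(API)
        ffibuilder.set_source("my_plugin", C_SOURCE)
        ffibuilder.embedding_init_code(INIT_CODE)
        return ffibuilder

Calling ``make_builder().emit_c_code("my_plugin.c")`` would then write
the C file; a host program can call ``my_plugin_start()`` early and
check its return value before using ``do_stuff()``.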