================================
Using CFFI for embedding
================================

.. contents::

You can use CFFI to generate C code which exports the API of your choice
to any C application that wants to link with this C code. This API,
which you define yourself, ends up as the API of a ``.so/.dll/.dylib``
library, or you can statically link it within a larger application.

Possible use cases:

* Exposing a library written in Python directly to C/C++ programs.

* Using Python to make a "plug-in" for an existing C/C++ program that is
  already written to load them.

* Using Python to implement part of a larger C/C++ application (with
  static linking).

* Writing a small C/C++ wrapper around Python, hiding the fact that the
  application is actually written in Python (to make a custom
  command-line interface; for distribution purposes; or simply to make
  it a bit harder to reverse-engineer the application).

The general idea is as follows:

* You write and execute a Python script, which produces a ``.c`` file
  with the API of your choice (and optionally compiles it into a
  ``.so/.dll/.dylib``). The script also gives some Python code to be
  "frozen" inside the ``.so``.

* At runtime, the C application loads this ``.so/.dll/.dylib`` (or is
  statically linked with the ``.c`` source) without having to know that
  it was produced from Python and CFFI.

* The first time a C function is called, Python is initialized and
  the frozen Python code is executed.

* The frozen Python code defines more Python functions that implement the
  C functions of your API, which are then used for all subsequent C
  function calls.

One of the goals of this approach is to be entirely independent from
the CPython C API: no ``Py_Initialize()`` nor ``PyRun_SimpleString()``
nor even ``PyObject``. It works identically on CPython and PyPy.

This is entirely *new in version 1.5.* (PyPy contains CFFI 1.5 since
release 5.0.)


Usage
-----

.. __: overview.html#embedding

See the `paragraph in the overview page`__ for a quick introduction.
In this section, we explain every step in more detail. We will use
the following slightly expanded example:

.. code-block:: c

    /* file plugin.h */
    typedef struct { int x, y; } point_t;
    extern int do_stuff(point_t *);

.. code-block:: c

    /* file plugin.h, Windows-friendly version */
    typedef struct { int x, y; } point_t;

    /* When including this file from ffibuilder.set_source(), the
       following macro is defined to '__declspec(dllexport)'.  When
       including this file directly from your C program, we define
       it to 'extern __declspec(dllimport)' instead.

       With non-MSVC compilers we simply define it to 'extern'.
       (The 'extern' is needed for sharing global variables;
       functions would be fine without it.  The macros always
       include 'extern': you must not repeat it when using the
       macros later.)
    */
    #ifndef CFFI_DLLEXPORT
    #  if defined(_MSC_VER)
    #    define CFFI_DLLEXPORT  extern __declspec(dllimport)
    #  else
    #    define CFFI_DLLEXPORT  extern
    #  endif
    #endif

    CFFI_DLLEXPORT int do_stuff(point_t *);

.. code-block:: python

    # file plugin_build.py
    import cffi
    ffibuilder = cffi.FFI()

    with open('plugin.h') as f:
        # read plugin.h and pass it to embedding_api(), manually
        # removing the '#' directives and the CFFI_DLLEXPORT
        data = ''.join([line for line in f if not line.startswith('#')])
        data = data.replace('CFFI_DLLEXPORT', '')
        ffibuilder.embedding_api(data)

    ffibuilder.set_source("my_plugin", r'''
        #include "plugin.h"
    ''')

    ffibuilder.embedding_init_code("""
        from my_plugin import ffi

        @ffi.def_extern()
        def do_stuff(p):
            print("adding %d and %d" % (p.x, p.y))
            return p.x + p.y
    """)

    ffibuilder.compile(target="plugin-1.5.*", verbose=True)
    # or: ffibuilder.emit_c_code("my_plugin.c")

Running the code above produces a *DLL*, i.e. a dynamically-loadable
library. It is a file with the extension ``.dll`` on Windows,
``.dylib`` on macOS, or ``.so`` on other platforms. As usual, it
is produced by generating some intermediate ``.c`` code and then
calling the regular platform-specific C compiler. See below__ for
some pointers to C-level issues with using the produced library.

.. __: `Issues about using the .so`_

Here are some details about the methods used above:

* **ffibuilder.embedding_api(source):** parses the given C source, which
  declares functions that you want to be exported by the DLL. It can
  also declare types, constants and global variables that are part of
  the C-level API of your DLL.

  The functions that are found in ``source`` will be automatically
  defined in the ``.c`` file: they will contain code that initializes
  the Python interpreter the first time any of them is called,
  followed by code to call the attached Python function (with
  ``@ffi.def_extern()``, see next point).

  The global variables, on the other hand, are not automatically
  produced.
  You have to write their definition explicitly in
  ``ffibuilder.set_source()``, as regular C code (see the point after next).

* **ffibuilder.embedding_init_code(python_code):** this gives
  initialization-time Python source code. This code is copied
  ("frozen") inside the DLL. At runtime, the code is executed when
  the DLL is first initialized, just after Python itself is
  initialized. This newly initialized Python interpreter has an
  extra "built-in" module that can be loaded magically without
  accessing any files, with a line like
  "``from my_plugin import ffi, lib``". The name ``my_plugin`` comes
  from the first argument to ``ffibuilder.set_source()``. This module
  represents "the caller's C world" from the point of view of Python.

  The initialization-time Python code can import other modules or
  packages as usual. You may have typical Python issues like needing
  to set up ``sys.path`` somehow manually first.

  For every function declared within ``ffibuilder.embedding_api()``, the
  initialization-time Python code or one of the modules it imports
  should use the decorator ``@ffi.def_extern()`` to attach a
  corresponding Python function to it.

  If the initialization-time Python code fails with an exception, then
  you get a traceback printed to stderr, along with more information
  to help you identify problems like a wrong ``sys.path``. If some
  function remains unattached at the time when the C code tries to
  call it, an error message is also printed to stderr and the function
  returns zero/null.

  Note that the CFFI module never calls ``exit()``, but CPython itself
  contains code that calls ``exit()``, for example if importing
  ``site`` fails. This may be worked around in the future.

* **ffibuilder.set_source(c_module_name, c_code):** sets the name of the
  module from Python's point of view.
  It also gives more C code which
  will be included in the generated C code. In trivial examples it
  can be an empty string. It is where you would ``#include`` some
  other files, define global variables, and so on. The macro
  ``CFFI_DLLEXPORT`` is available to this C code: it expands to the
  platform-specific way of saying "the following declaration should be
  exported from the DLL". For example, you would put
  "``extern int my_glob;``" in ``ffibuilder.embedding_api()`` and
  "``CFFI_DLLEXPORT int my_glob = 42;``" in ``ffibuilder.set_source()``.

  Currently, any *type* declared in ``ffibuilder.embedding_api()`` must also
  be present in the ``c_code``. This is automatic if this code
  contains a line like ``#include "plugin.h"``, as in the example above.

* **ffibuilder.compile([target=...] [, verbose=True]):** makes the C code and
  compiles it. By default, it produces a file called
  ``c_module_name.dll``, ``c_module_name.dylib`` or
  ``c_module_name.so``, but the default can be changed with the
  optional ``target`` keyword argument. You can use
  ``target="foo.*"`` with a literal ``*`` to ask for a file called
  ``foo.dll`` on Windows, ``foo.dylib`` on macOS and ``foo.so``
  elsewhere. One reason for specifying an alternate ``target`` is to
  include characters not usually allowed in Python module names, like
  "``plugin-1.5.*``".

  For more complicated cases, you can call instead
  ``ffibuilder.emit_c_code("foo.c")`` and compile the resulting ``foo.c``
  file using other means. CFFI's compilation logic is based on the
  standard library ``distutils`` package, which is really developed
  and tested for the purpose of making CPython extension modules; it
  might not always be appropriate for making general DLLs.
  Also, just
  getting the C code is what you need if you do not want to make a
  stand-alone ``.so/.dll/.dylib`` file: this C file can be compiled
  and statically linked as part of a larger application.


More reading
------------

If you're reading this page about embedding and you are not familiar
with CFFI already, here are a few pointers to what you could read
next:

* For the ``@ffi.def_extern()`` functions, integer C types are passed
  simply as Python integers; and simple pointers-to-struct and basic
  arrays are all straightforward enough. However, sooner or later you
  will need to read about this topic in more detail here__.

* ``@ffi.def_extern()``: see `documentation here,`__ notably on what
  happens if the Python function raises an exception.

* To create Python objects attached to C data, one common solution is
  to use ``ffi.new_handle()``. See documentation here__.

* In embedding mode, the major direction is C code that calls Python
  functions. This is the opposite of the regular extending mode of
  CFFI, in which the major direction is Python code calling C. That's
  why the page `Using the ffi/lib objects`_ talks first about the
  latter, and why the direction "C code that calls Python" is
  generally referred to as "callbacks" in that page. If you also
  need to have your Python code call C code, read more about
  `Embedding and Extending`_ below.

* ``ffibuilder.embedding_api(source)``: follows the same syntax as
  ``ffibuilder.cdef()``, `documented here.`__ You can use the "``...``"
  syntax as well, although in practice it may be less useful than it
  is for ``cdef()``. On the other hand, it is expected that often the
  C sources that you need to give to ``ffibuilder.embedding_api()`` would be
  exactly the same as the content of some ``.h`` file that you want to
  give to users of your DLL.
  That's why the example above reads the declarations from a header
  file, which in its simplest form looks like this::

      with open('foo.h') as f:
          ffibuilder.embedding_api(f.read())

  Note that a drawback of this approach is that ``ffibuilder.embedding_api()``
  doesn't support preprocessor directives such as ``#ifdef``. You may
  have to use a more convoluted expression like::

      with open('foo.h') as f:
          lines = [line for line in f if not line.startswith('#')]
          ffibuilder.embedding_api(''.join(lines))

  As in the example above, you can also use the same ``foo.h`` from
  ``ffibuilder.set_source()``::

      ffibuilder.set_source('module_name', r'''
          #include "foo.h"
      ''')


.. __: using.html#working
.. __: using.html#def-extern
.. __: ref.html#ffi-new-handle
.. __: cdef.html#cdef

.. _`Using the ffi/lib objects`: using.html


Troubleshooting
---------------

* The error message

    cffi extension module 'c_module_name' has unknown version 0x2701

  means that the running Python interpreter located a CFFI version older
  than 1.5. CFFI 1.5 or newer must be installed in the running Python.

* On PyPy, the error message

    debug: pypy_setup_home: directories 'lib-python' and 'lib_pypy' not
    found in pypy's shared library location or in any parent directory

  means that the ``libpypy-c.so`` file was found, but the standard library
  was not found from this location. This occurs at least on some Linux
  distributions, because they put ``libpypy-c.so`` inside ``/usr/lib/``,
  instead of the way we recommend, which is: keep that file inside
  ``/opt/pypy/bin/`` and put a symlink to there from ``/usr/lib/``.
  The quickest fix is to make that change manually.


Issues about using the .so
--------------------------

This paragraph describes issues that are not necessarily specific to
CFFI. It assumes that you have obtained the ``.so/.dylib/.dll`` file as
described above, but that you have trouble using it.
(In summary: it
is a mess. This is my own experience, slowly built by using Google and
by listening to reports from various platforms. Please report any
inaccuracies in this paragraph or better ways to do things.)

* The file produced by CFFI should follow this naming pattern:
  ``libmy_plugin.so`` on Linux, ``libmy_plugin.dylib`` on Mac, or
  ``my_plugin.dll`` on Windows (no ``lib`` prefix on Windows).

* First note that this file does not contain the Python interpreter
  nor the standard library of Python. You still need it to be
  somewhere. There are ways to compact it to a smaller number of files,
  but this is outside the scope of CFFI (please report if you used some
  of these ways successfully so that I can add some links here).

* In what we'll call the "main program", the ``.so`` can be either
  used dynamically (e.g. by calling ``dlopen()`` or ``LoadLibrary()``
  inside the main program), or at compile-time (e.g. by compiling it
  with ``gcc -lmy_plugin``). The former case is always used if you're
  building a plugin for a program, and the program itself doesn't need
  to be recompiled. The latter case is for making a CFFI library that
  is more tightly integrated inside the main program.

* In the case of compile-time usage: you can add the gcc
  option ``-Lsome/path/`` before ``-lmy_plugin`` to describe where the
  ``libmy_plugin.so`` is. On some platforms, notably Linux, ``gcc``
  will complain if it can find ``libmy_plugin.so`` but not
  ``libpythonX.Y.so`` or ``libpypy-c.so``. To fix it, you need to call
  ``LD_LIBRARY_PATH=/some/path/to/libpypy gcc``.

* When actually executing the main program, it needs to find the
  ``libmy_plugin.so`` but also ``libpythonX.Y.so`` or ``libpypy-c.so``.
  For PyPy, unpack a PyPy distribution and you get a full directory
  structure with ``libpypy-c.so`` inside a ``bin`` subdirectory, or on
  Windows ``pypy-c.dll`` inside the top directory; you must not move
  this file around, but just point to it. One way to point to it is by
  running the main program with some environment variable:
  ``LD_LIBRARY_PATH=/some/path/to/libpypy`` on Linux,
  ``DYLD_LIBRARY_PATH=/some/path/to/libpypy`` on macOS.

* You can avoid the ``LD_LIBRARY_PATH`` issue if you compile
  ``libmy_plugin.so`` with the path hard-coded inside in the first
  place. On Linux, this is done by ``gcc -Wl,-rpath=/some/path``. You
  would put this option in ``ffibuilder.set_source("my_plugin", ...,
  extra_link_args=['-Wl,-rpath=/some/path/to/libpypy'])``. The path can
  start with ``$ORIGIN`` to mean "the directory where
  ``libmy_plugin.so`` is". You can then specify a path relative to that
  place, like ``extra_link_args=['-Wl,-rpath=$ORIGIN/../venv/bin']``.
  Use ``ldd libmy_plugin.so`` to look at what path is currently compiled
  in after the expansion of ``$ORIGIN``.

  After this, you don't need ``LD_LIBRARY_PATH`` any more to locate
  ``libpythonX.Y.so`` or ``libpypy-c.so`` at runtime. In theory it
  should also cover the call to ``gcc`` for the main program. I wasn't
  able to make ``gcc`` happy without ``LD_LIBRARY_PATH`` on Linux if
  the rpath starts with ``$ORIGIN``, though.

* The same rpath trick might be used to let the main program find
  ``libmy_plugin.so`` in the first place without ``LD_LIBRARY_PATH``.
  (This doesn't apply if the main program uses ``dlopen()`` to load it
  as a dynamic plugin.) You'd make the main program with
  ``gcc -Wl,-rpath=/path/to/libmyplugin``, possibly with ``$ORIGIN``. The
  ``$`` in ``$ORIGIN`` causes various shell problems on its own: if
  using a common shell you need to say
  ``gcc -Wl,-rpath=\$ORIGIN``.
  From a Makefile, you need to say
  something like ``gcc -Wl,-rpath=\$$ORIGIN``.

* On some Linux distributions, notably Debian, the ``.so`` files of
  CPython C extension modules may be compiled without saying that they
  depend on ``libpythonX.Y.so``. This makes such Python systems
  unsuitable for embedding if the embedder uses ``dlopen(...,
  RTLD_LOCAL)``. You get an ``undefined symbol`` error. See
  `issue #264`__. A workaround is to first call
  ``dlopen("libpythonX.Y.so", RTLD_LAZY|RTLD_GLOBAL)``, which will
  force ``libpythonX.Y.so`` to be loaded first.

.. __: https://foss.heptapod.net/pypy/cffi/-/issues/264


Using multiple CFFI-made DLLs
-----------------------------

Multiple CFFI-made DLLs can be used by the same process.

Note that all CFFI-made DLLs in a process share a single Python
interpreter. The effect is the same as the one you get by trying to
build a large Python application by assembling a lot of unrelated
packages. Some of these might be libraries that monkey-patch some
functions from the standard library, for example, which might be
unexpected from other parts.


Multithreading
--------------

Multithreading should work transparently, based on Python's standard
Global Interpreter Lock.

If two threads both try to call a C function when Python is not yet
initialized, then locking occurs. One thread proceeds with
initialization and blocks the other thread. The other thread will be
allowed to continue only when the execution of the initialization-time
Python code is done.

If the two threads call two *different* CFFI-made DLLs, the Python
initialization itself will still be serialized, but the two pieces of
initialization-time Python code will not. The idea is that there is a
priori no reason for one DLL to wait for initialization of the other
DLL to be complete.
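
The two initialization rules above can be pictured with a toy model in
pure Python. This is only an illustration of the described behaviour,
not CFFI's actual implementation (which lives in the generated C code).
The names ``ToyDLL``, ``_start_interpreter_once`` and ``demo`` are made
up for this sketch.

.. code-block:: python

    import threading

    _interp_lock = threading.Lock()   # shared by all toy DLLs in the process
    _interp_started = False


    def _start_interpreter_once():
        """Stand-in for the one-time start of the shared interpreter."""
        global _interp_started
        with _interp_lock:             # serialized across *all* DLLs
            if not _interp_started:
                _interp_started = True


    class ToyDLL:
        """Hypothetical model of one CFFI-made DLL; not a CFFI class."""

        def __init__(self, init_code):
            self._init_code = init_code      # plays the frozen init-time code
            self._lock = threading.Lock()    # private to this DLL
            self._ready = False
            self.init_runs = 0

        def call(self, func, *args):
            if not self._ready:
                with self._lock:             # a second caller blocks here ...
                    if not self._ready:      # ... and skips init once it resumes
                        _start_interpreter_once()
                        self._init_code()    # runs without the process-wide lock
                        self.init_runs += 1
                        self._ready = True
            return func(*args)


    def demo(n_threads=8):
        """Call the same toy DLL from several threads at once."""
        dll = ToyDLL(init_code=lambda: None)
        results = []
        threads = [
            threading.Thread(target=lambda: results.append(dll.call(pow, 2, 5)))
            for _ in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return dll.init_runs, results

In this model, callers of the same ``ToyDLL`` wait for each other, while
two different ``ToyDLL`` objects only share the short interpreter
start-up step and can run their initialization code independently.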

After initialization, Python's standard Global Interpreter Lock kicks
in. The end result is that when one CPU progresses on executing
Python code, no other CPU can progress on executing more Python code
from another thread of the same process. At regular intervals, the
lock switches to a different thread, so that no single thread should
appear to block indefinitely.


Testing
-------

For testing purposes, a CFFI-made DLL can be imported in a running
Python interpreter instead of being loaded like a C shared library.

You might have some issues with the file name: for example, on
Windows, Python expects the file to be called ``c_module_name.pyd``,
but the CFFI-made DLL is called ``target.dll`` instead. The base name
``target`` is the one specified in ``ffibuilder.compile()``, and on Windows
the extension is ``.dll`` instead of ``.pyd``. You have to rename or
copy the file, or on POSIX use a symlink.

The module then works like a regular CFFI extension module. It is
imported with "``from c_module_name import ffi, lib``" and exposes on
the ``lib`` object all C functions. You can test it by calling these
C functions. The initialization-time Python code frozen inside the
DLL is executed the first time such a call is done.


Embedding and Extending
-----------------------

The embedding mode is not incompatible with the non-embedding mode of
CFFI.

You can use *both* ``ffibuilder.embedding_api()`` and
``ffibuilder.cdef()`` in the
same build script. You put in the former the declarations you want to
be exported by the DLL; you put in the latter only the C functions and
types that you want to share between C and Python, but not export from
the DLL.

As an example of that, consider the case where you would like to have
a DLL-exported C function written in C directly, maybe to handle some
cases before calling Python functions.
To do that, you must *not* put
the function's signature in ``ffibuilder.embedding_api()``. (Note that this
requires more hacks if you use ``ffibuilder.embedding_api(f.read())``.)
You must only write the custom function definition in
``ffibuilder.set_source()``, and prefix it with the macro ``CFFI_DLLEXPORT``:

.. code-block:: c

    CFFI_DLLEXPORT int myfunc(int a, int b)
    {
        /* implementation here */
    }

This function can, if it wants, invoke Python functions using the
general mechanism of "callbacks" (called this way because it is a
call from C to Python, although in this case it is not calling
anything back):

.. code-block:: python

    ffibuilder.cdef("""
        extern "Python" int mycb(int);
    """)

    ffibuilder.set_source("my_plugin", r"""

        static int mycb(int);   /* the callback: forward declaration, to make
                                   it accessible from the C code that follows */

        CFFI_DLLEXPORT int myfunc(int a, int b)
        {
            int product = a * b;   /* some custom C code */
            return mycb(product);
        }
    """)

and then the Python initialization code needs to contain the lines:

.. code-block:: python

    @ffi.def_extern()
    def mycb(x):
        print("hi, I'm called with x =", x)
        return x * 10

This ``@ffi.def_extern`` is attaching a Python function to the C
callback ``mycb()``, which in this case is not exported from the DLL.
Nevertheless, the automatic initialization of Python occurs when
``mycb()`` is called, if it happens to be the first function called
from C. More precisely, it does not happen when ``myfunc()`` is
called: this is just a C function, with no extra code magically
inserted around it. It only happens when ``myfunc()`` calls
``mycb()``.

As the above explanation hints, this is how ``ffibuilder.embedding_api()``
actually implements function calls that directly invoke Python code;
here, we have merely decomposed it explicitly, in order to add some
custom C code in the middle.

In case you need to force, from C code, Python to be initialized
before the first ``@ffi.def_extern()`` is called, you can do so by
calling the C function ``cffi_start_python()`` with no argument. It
returns an integer, 0 or -1, to tell if the initialization succeeded
or not. Currently there is no way to prevent a failing initialization
from also dumping a traceback and more information to stderr.
Note that the function ``cffi_start_python()`` is static: it must be
called from C source written inside ``ffibuilder.set_source()``. To
call it from somewhere else, you need to make a function (with a
different non-static name) in the ``ffibuilder.set_source()`` that just
calls ``cffi_start_python()``. The reason it is static is to avoid
naming conflicts in case you are ultimately trying to link a large C
program with more than one CFFI embedded module in it.
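
The wrapper described in the last paragraph could look roughly like the
following build-script sketch. It reuses ``plugin.h`` and ``do_stuff()``
from the example in the Usage section. The names ``my_plugin_start``,
``C_SOURCE``, ``INIT_CODE`` and ``build_c_file`` are hypothetical, chosen
only for this illustration, and exporting the wrapper with
``CFFI_DLLEXPORT`` is an assumption that fits the case where it is
called from outside the DLL.

.. code-block:: python

    # C code for ffibuilder.set_source().  `my_plugin_start` is a
    # hypothetical name; pick one that is unique in your program.
    C_SOURCE = r'''
    #include "plugin.h"

    /* cffi_start_python() is static, so it is only visible from this
       generated C file.  This non-static wrapper makes it callable
       from elsewhere.  It returns 0 on success and -1 on failure. */
    CFFI_DLLEXPORT int my_plugin_start(void)
    {
        return cffi_start_python();
    }
    '''

    # Initialization-time Python code, frozen inside the DLL.
    INIT_CODE = """
    from my_plugin import ffi

    @ffi.def_extern()
    def do_stuff(p):
        return p.x + p.y
    """


    def build_c_file(path="my_plugin.c"):
        """Write the generated C file to `path` (hypothetical helper)."""
        import cffi  # imported here so the constants above load without CFFI

        ffibuilder = cffi.FFI()
        ffibuilder.embedding_api("""
            typedef struct { int x, y; } point_t;
            int do_stuff(point_t *);
        """)
        ffibuilder.set_source("my_plugin", C_SOURCE)
        ffibuilder.embedding_init_code(INIT_CODE)
        ffibuilder.emit_c_code(path)
        return path

A main program would then declare ``int my_plugin_start(void);`` itself
and call it once at start-up, checking for a ``-1`` result, before
calling ``do_stuff()``.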