[section Pickle support]
[section Introduction]
Pickle is a Python module for object serialization, also known as persistence, marshalling, or flattening.

It is often necessary to save and restore the contents of an object to a file. One approach to this problem is to write a pair of functions that read and write data from a file in a special format. A powerful alternative approach is to use Python's pickle module. Exploiting Python's ability for introspection, the pickle module recursively converts nearly arbitrary Python objects into a stream of bytes that can be written to a file.

The Boost Python Library supports the pickle module through the interface as described in detail in the [@https://docs.python.org/2/library/pickle.html Python Library Reference for pickle]. This interface involves the special methods `__getinitargs__`, `__getstate__` and `__setstate__` as described in the following. Note that `Boost.Python` is also fully compatible with Python's cPickle module.
[endsect]
[section The Pickle Interface]
At the user level, the Boost.Python pickle interface involves three special methods:
[variablelist
[[__getinitargs__][When an instance of a Boost.Python extension class is pickled, the pickler tests if the instance has a `__getinitargs__` method. This method must return a Python `tuple` (it is most convenient to use a [link object_wrappers.boost_python_tuple_hpp.class_tuple `boost::python::tuple`]). When the instance is restored by the unpickler, the contents of this tuple are used as the arguments for the class constructor.

If `__getinitargs__` is not defined, `pickle.load` will call the constructor (`__init__`) without arguments; i.e., the object must be default-constructible.]]
[[__getstate__][When an instance of a `Boost.Python` extension class is pickled, the pickler tests if the instance has a `__getstate__` method.
This method should return a Python object representing the state of the instance.]]
[[__setstate__][When an instance of a `Boost.Python` extension class is restored by the unpickler (`pickle.load`), it is first constructed using the result of `__getinitargs__` as arguments (see above). Subsequently the unpickler tests if the new instance has a `__setstate__` method. If so, this method is called with the result of `__getstate__` (a Python object) as the argument.]]
]

The three special methods described above may be defined individually by the user with `.def()`. However, `Boost.Python` provides an easy-to-use high-level interface via the `boost::python::pickle_suite` class that also enforces consistency: `__getstate__` and `__setstate__` must be defined as a pair. Use of this interface is demonstrated by the following examples.
[endsect]
[section Example]
There are three files in `python/test` that show how to provide pickle support.
[section pickle1.cpp]
The C++ class in this example can be fully restored by passing the appropriate argument to the constructor. Therefore it is sufficient to define the pickle interface method `__getinitargs__`. This is done in the following way:
Definition of the C++ pickle function:
``
struct world_pickle_suite : boost::python::pickle_suite
{
  static
  boost::python::tuple
  getinitargs(world const& w)
  {
    return boost::python::make_tuple(w.get_country());
  }
};
``
Establishing the Python binding:
``
class_<world>("world", init<const std::string&>())
  // ...
  .def_pickle(world_pickle_suite())
  // ...
``
[endsect]
[section pickle2.cpp]
The C++ class in this example contains member data that cannot be restored by any of the constructors.
Therefore it is necessary to provide the `__getstate__`/`__setstate__` pair of pickle interface methods:

Definition of the C++ pickle functions:
``
struct world_pickle_suite : boost::python::pickle_suite
{
  static
  boost::python::tuple
  getinitargs(const world& w)
  {
    // ...
  }

  static
  boost::python::tuple
  getstate(const world& w)
  {
    // ...
  }

  static
  void
  setstate(world& w, boost::python::tuple state)
  {
    // ...
  }
};
``
Establishing the Python bindings for the entire suite:
``
class_<world>("world", init<const std::string&>())
  // ...
  .def_pickle(world_pickle_suite())
  // ...
``

For simplicity, the `__dict__` is not included in the result of `__getstate__`. This is not generally recommended, but it is a valid approach if it is anticipated that the object's `__dict__` will always be empty. Note that the safety guard described below will catch the cases where this assumption is violated.
[endsect]
[section pickle3.cpp]
This example is similar to pickle2.cpp. However, the object's `__dict__` is included in the result of `__getstate__`. This requires a little more code but is unavoidable if the object's `__dict__` is not always empty.
[endsect]
[endsect]
[section Pitfall and Safety Guard]
The pickle protocol described above has an important pitfall that the end user of a Boost.Python extension module might not be aware of:

[*`__getstate__` is defined and the instance's `__dict__` is not empty.]

The author of a `Boost.Python` extension class might provide a `__getstate__` method without considering the possibilities that:
* the class is used in Python as a base class. Most likely the `__dict__` of instances of the derived class needs to be pickled in order to restore the instances correctly.
* the user adds items to the instance's `__dict__` directly. Again, the `__dict__` of the instance then needs to be pickled.
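
The same pitfall can be reproduced in plain Python, which may help to see what is lost. The following is only a sketch of an analogous situation, not Boost.Python itself: the hypothetical `World` class stands in for a wrapped extension class by keeping its state in `__slots__` (outside any `__dict__`), and the hypothetical `TaggedWorld` subclass plays the role of a Python class derived from it. Because `__getstate__` returns only the state its author knew about, the attribute stored in the instance's `__dict__` does not survive the round trip.

```python
import pickle


class World:
    """Hypothetical stand-in for a wrapped class: state lives outside __dict__."""

    __slots__ = ("country", "secret")

    def __init__(self, country="", secret=0):
        self.country = country
        self.secret = secret

    def __getstate__(self):
        # Returns only the state the class author knew about.
        return (self.country, self.secret)

    def __setstate__(self, state):
        self.country, self.secret = state


class TaggedWorld(World):
    """Hypothetical Python subclass: its instances have a __dict__."""


w = TaggedWorld("Hello", 42)
w.tag = "extra"  # stored in the instance's __dict__

restored = pickle.loads(pickle.dumps(w))

print(restored.country, restored.secret)  # Hello 42
print(hasattr(restored, "tag"))           # False: the __dict__ entry was dropped
```

Plain Python gives no warning here; the extra attribute simply disappears when the instance is restored.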

To alert the user to this far from obvious problem, a safety guard is provided. If `__getstate__` is defined and the instance's `__dict__` is not empty, `Boost.Python` tests if the class has an attribute `__getstate_manages_dict__`. An exception is raised if this attribute is not defined:

``
  RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
``

To resolve this problem, it should first be established that the `__getstate__` and `__setstate__` methods manage the instance's `__dict__` correctly. Note that this can be done either at the C++ or the Python level. Finally, the safety guard should intentionally be overridden. For example, in C++ (from pickle3.cpp):

``
struct world_pickle_suite : boost::python::pickle_suite
{
  // ...

  static bool getstate_manages_dict() { return true; }
};
``

Alternatively in Python:

``
import your_bpl_module
class your_class(your_bpl_module.your_class):
  __getstate_manages_dict__ = 1
  def __getstate__(self):
    # your code here
  def __setstate__(self, state):
    # your code here
``
[endsect]
[section Practical Advice]

* In `Boost.Python` extension modules with many extension classes, providing complete pickle support for all classes would be a significant overhead. In general, complete pickle support should only be implemented for extension classes that will eventually be pickled.
* Avoid using `__getstate__` if the instance can also be reconstructed by way of `__getinitargs__`. This automatically avoids the pitfall described above.
* If `__getstate__` is required, include the instance's `__dict__` in the Python object that is returned.

[endsect]
[section Light-weight alternative: pickle support implemented in Python]
The pickle4.cpp example demonstrates an alternative technique for implementing pickle support.
First we direct Boost.Python via the `class_::enable_pickling()` member function to define only the basic attributes required for pickling:

``
class_<world>("world", init<const std::string&>())
  // ...
  .enable_pickling()
  // ...
``
This enables the standard Python pickle interface as described in the Python documentation. By "injecting" a `__getinitargs__` method into the definition of the wrapped class we make all instances pickleable:

``
# import the wrapped world class
from pickle4_ext import world

# definition of __getinitargs__
def world_getinitargs(self):
  return (self.get_country(),)

# now inject __getinitargs__ (Python is a dynamic language!)
world.__getinitargs__ = world_getinitargs
``
See also the tutorial section on injecting additional methods from Python.
[endsect]
[endsect]