1<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2<html>
3<!--
4(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
5Use, modification and distribution is subject to the Boost Software
6License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
7http://www.boost.org/LICENSE_1_0.txt)
8-->
9<head>
10<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
11<link rel="stylesheet" type="text/css" href="../../../boost.css">
12<link rel="stylesheet" type="text/css" href="style.css">
13<title>Serialization - Tutorial</title>
14</head>
15<body link="#0000ff" vlink="#800080">
16<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
17  <tr>
18    <td valign="top" width="300">
19       <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
20    </td>
21    <td valign="top">
22      <h1 align="center">Serialization</h1>
23      <h2 align="center">Tutorial</h2>
24    </td>
25  </tr>
26</table>
27<hr>
28<dl class="page-index">
29  <dt><a href="#simplecase">A Very Simple Case</a>
30  <dt><a href="#nonintrusiveversion">Non Intrusive Version</a>
31  <dt><a href="#serializablemembers">Serializable Members</a>
32  <dt><a href="#derivedclasses">Derived Classes</a>
33  <dt><a href="#pointers">Pointers</a>
34  <dt><a href="#arrays">Arrays</a>
35  <dt><a href="#stl">STL Collections</a>
36  <dt><a href="#versioning">Class Versioning</a>
37  <dt><a href="#splitting">Splitting <code style="white-space: normal">serialize</code> into <code style="white-space: normal">save/load</code></a>
38  <dt><a href="#archives">Archives</a>
39  <dt><a href="#examples">List of examples</a>
40</dl>
41An output archive is similar to an output data stream. Data can be saved to the archive
42with either the &lt;&lt; or the &amp; operator:
43<pre><code>
44ar &lt;&lt; data;
45ar &amp; data;
46</code></pre>
An input archive is similar to an input data stream.  Data can be loaded from the archive
with either the &gt;&gt; or the &amp; operator:
49<pre><code>
50ar &gt;&gt; data;
51ar &amp; data;
52</code></pre>
53<p>
54When these operators are invoked for primitive data types, the data is simply saved/loaded
55to/from the archive. When invoked for class data types, the class
56<code style="white-space: normal">serialize</code> function is invoked. Each
<code style="white-space: normal">serialize</code> function uses the above operators
58to save/load its data members.  This process will continue in a recursive manner until
59all the data contained in the class is saved/loaded.
60
61<h3><a name="simplecase">A Very Simple Case</a></h3>
62These operators are used inside the <code style="white-space: normal">serialize</code>
63function to save and load class data members.
64<p>
65Included in this library is a program called
66<a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a> which illustrates how
67to use this system. Below we excerpt code from this program to
68illustrate with the simplest possible case how this library is
69intended to be used.
70<pre>
71<code>
72#include &lt;fstream&gt;
73
// include headers that implement an archive in simple text format
75#include &lt;boost/archive/text_oarchive.hpp&gt;
76#include &lt;boost/archive/text_iarchive.hpp&gt;
77
78/////////////////////////////////////////////////////////////
79// gps coordinate
80//
81// illustrates serialization for a simple type
82//
83class gps_position
84{
85private:
86    friend class boost::serialization::access;
87    // When the class Archive corresponds to an output archive, the
88    // &amp; operator is defined similar to &lt;&lt;.  Likewise, when the class Archive
89    // is a type of input archive the &amp; operator is defined similar to &gt;&gt;.
90    template&lt;class Archive&gt;
91    void serialize(Archive &amp; ar, const unsigned int version)
92    {
93        ar &amp; degrees;
94        ar &amp; minutes;
95        ar &amp; seconds;
96    }
97    int degrees;
98    int minutes;
99    float seconds;
100public:
101    gps_position(){};
102    gps_position(int d, int m, float s) :
103        degrees(d), minutes(m), seconds(s)
104    {}
105};
106
107int main() {
108    // create and open a character archive for output
109    std::ofstream ofs("filename");
110
111    // create class instance
112    const gps_position g(35, 59, 24.567f);
113
114    // save data to archive
115    {
116        boost::archive::text_oarchive oa(ofs);
117        // write class instance to archive
118        oa &lt;&lt; g;
119    	// archive and stream closed when destructors are called
120    }
121
    // ... some time later restore the class instance to its original state
123    gps_position newg;
124    {
125        // create and open an archive for input
126        std::ifstream ifs("filename");
127        boost::archive::text_iarchive ia(ifs);
128        // read class state from archive
129        ia &gt;&gt; newg;
130        // archive and stream closed when destructors are called
131    }
132    return 0;
133}
134</code>
135</pre>
136<p>For each class to be saved via serialization, there must exist a function to
137save all the class members which define the state of the class.
138For each class to be loaded via serialization, there must exist a function to
load these class members in the same sequence as they were saved.
140In the above example, these functions are generated by the
141template member function <code style="white-space: normal">serialize</code>.
142
143<h3><a name="nonintrusiveversion">Non Intrusive Version</a></h3>
144<p>The above formulation is intrusive. That is, it requires
145that classes whose instances are to be serialized be
146altered. This can be inconvenient in some cases.
147An equivalent alternative formulation permitted by the
148system would be:
149<pre><code>
150#include &lt;boost/archive/text_oarchive.hpp&gt;
151#include &lt;boost/archive/text_iarchive.hpp&gt;
152
153class gps_position
154{
155public:
156    int degrees;
157    int minutes;
158    float seconds;
159    gps_position(){};
160    gps_position(int d, int m, float s) :
161        degrees(d), minutes(m), seconds(s)
162    {}
163};
164
165namespace boost {
166namespace serialization {
167
168template&lt;class Archive&gt;
169void serialize(Archive &amp; ar, gps_position &amp; g, const unsigned int version)
170{
171    ar &amp; g.degrees;
172    ar &amp; g.minutes;
173    ar &amp; g.seconds;
174}
175
176} // namespace serialization
177} // namespace boost
178</code></pre>
179<p>
180In this case the generated serialize functions are not members of the
181<code style="white-space: normal">gps_position</code> class.  The two formulations function
182in exactly the same way.
183<p>
184The main application of non-intrusive serialization is to permit serialization
185to be implemented for classes without changing the class definition.
186In order for this to be possible, the class must expose enough information
187to reconstruct the class state.  In this example, we presumed that the
class had <code style="white-space: normal">public</code> members - not a common occurrence.  Only
189classes which expose enough information to save and restore the class
190state will be serializable without changing the class definition.
191<h3><a name="serializablemembers">Serializable Members</a></h3>
192<p>
193A serializable class with serializable members would look like this:
194<pre><code>
195class bus_stop
196{
197    friend class boost::serialization::access;
198    template&lt;class Archive&gt;
199    void serialize(Archive &amp; ar, const unsigned int version)
200    {
201        ar &amp; latitude;
202        ar &amp; longitude;
203    }
204    gps_position latitude;
205    gps_position longitude;
206protected:
207    bus_stop(const gps_position &amp; lat_, const gps_position &amp; long_) :
208    latitude(lat_), longitude(long_)
209    {}
210public:
211    bus_stop(){}
212    // See item # 14 in Effective C++ by Scott Meyers.
213    // re non-virtual destructors in base classes.
214    virtual ~bus_stop(){}
215};
216</code></pre>
217<p>That is, members of class type are serialized just as
218members of primitive types are.
219<p>
220Note that saving an instance of the class <code style="white-space: normal">bus_stop</code>
221with one of the archive operators will invoke the
222<code style="white-space: normal">serialize</code> function which saves
223<code style="white-space: normal">latitude</code> and
224<code style="white-space: normal">longitude</code>. Each of these in turn will be saved by invoking
225<code style="white-space: normal">serialize</code> in the definition of
226<code style="white-space: normal">gps_position</code>.  In this manner the whole
227data structure is saved by the application of an archive operator to
228just its root item.
229
230
231<h3><a name="derivedclasses">Derived Classes</a></h3>
232<p>Derived classes should include serializations of their base classes.
233<pre><code>
#include &lt;string&gt;
#include &lt;boost/serialization/base_object.hpp&gt;
235
236class bus_stop_corner : public bus_stop
237{
238    friend class boost::serialization::access;
239    template&lt;class Archive&gt;
240    void serialize(Archive &amp; ar, const unsigned int version)
241    {
242        // serialize base class information
243        ar &amp; boost::serialization::base_object&lt;bus_stop&gt;(*this);
244        ar &amp; street1;
245        ar &amp; street2;
246    }
247    std::string street1;
248    std::string street2;
249    virtual std::string description() const
250    {
251        return street1 + " and " + street2;
252    }
253public:
254    bus_stop_corner(){}
255    bus_stop_corner(const gps_position &amp; lat_, const gps_position &amp; long_,
256        const std::string &amp; s1_, const std::string &amp; s2_
257    ) :
258        bus_stop(lat_, long_), street1(s1_), street2(s2_)
259    {}
260};
261</code>
262</pre>
263<p>
264Note the serialization of the base classes from the derived
265class. Do <b>NOT</b> directly call the base class serialize
266functions. Doing so might seem to work but will bypass the code
267that tracks instances written to storage to eliminate redundancies.
268It will also bypass the writing of class version information into
269the archive. For this reason, it is advisable to always make member
270<code style="white-space: normal">serialize</code> functions private.  The declaration
<code style="white-space: normal">friend class boost::serialization::access</code> grants the
serialization library access to private member variables and functions.
273<p>
274<h3><a name="pointers">Pointers</a></h3>
275Suppose we define a bus route as an array of bus stops.  Given that
276<ol>
277    <li>we might have several types of bus stops (remember bus_stop is
278a base class)
279    <li>a given bus_stop might appear in more than one route.
280</ol>
281it's convenient to represent a bus route with an array of pointers
282to <code style="white-space: normal">bus_stop</code>.
283<pre>
284<code>
285class bus_route
286{
287    friend class boost::serialization::access;
288    bus_stop * stops[10];
289    template&lt;class Archive&gt;
290    void serialize(Archive &amp; ar, const unsigned int version)
291    {
292        int i;
        for(i = 0; i &lt; 10; ++i)
294            ar &amp; stops[i];
295    }
296public:
297    bus_route(){}
298};
299</code>
300</pre>
301Each member of the array <code style="white-space: normal">stops</code> will be serialized.
302But remember each member is a pointer - so what can this really
303mean?  The whole object of this serialization is to permit
304reconstruction of the original data structures at another place
305and time.  In order to accomplish this with a pointer, it is
306not sufficient to save the value of the pointer, rather the
307object it points to must be saved.  When the member is later
308loaded, a new object has to be created and a new pointer has
309to be loaded into the class member.
310<p>
If the same pointer is serialized more than once, only one instance
is added to the archive.  When the archive is read back, no data is loaded for
the repeated occurrence; the second pointer is simply set equal to the first.
314<p>
315Note that, in this example, the array consists of polymorphic pointers.
That is, each array element points to one of several possible
kinds of bus stops.  So when the pointer is saved, some sort of class
identifier must be saved.  When the pointer is loaded, the class
identifier must be read and an instance of the corresponding class
must be constructed. Finally the data can be loaded into the newly created
instance of the correct type.
<p>
As can be seen in
<a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a>,
serialization of pointers to derived classes through a base
class pointer may require explicit enumeration of the derived
327classes to be serialized. This is referred to as "registration" or "export"
328of derived classes.  This requirement and the methods of
329satisfying it are explained in detail
330<a href="serialization.html#derivedpointers">here</a>.
331<p>
332All this is accomplished automatically by the serialization
333library.  The above code is all that is necessary to accomplish
334the saving and loading of objects accessed through pointers.
335<p>
336<h3><a name="arrays">Arrays</a></h3>
337The above formulation is in fact more complex than necessary.
338The serialization library detects when the object being
339serialized is an array and emits code equivalent to the above.
340So the above can be shortened to:
341<pre>
342<code>
343class bus_route
344{
345    friend class boost::serialization::access;
346    bus_stop * stops[10];
347    template&lt;class Archive&gt;
348    void serialize(Archive &amp; ar, const unsigned int version)
349    {
350        ar &amp; stops;
351    }
352public:
353    bus_route(){}
354};
355</code>
356</pre>
357<h3><a name="stl">STL Collections</a></h3>
358The above example uses an array of members.  More likely such
359an application would use an STL collection for such a purpose.
360The serialization library contains code for serialization
361of all STL classes.  Hence, the reformulation below will
362also work as one would expect.
363<pre>
364<code>
365#include &lt;boost/serialization/list.hpp&gt;
366
367class bus_route
368{
369    friend class boost::serialization::access;
370    std::list&lt;bus_stop *&gt; stops;
371    template&lt;class Archive&gt;
372    void serialize(Archive &amp; ar, const unsigned int version)
373    {
374        ar &amp; stops;
375    }
376public:
377    bus_route(){}
378};
379</code>
380</pre>
381<h3><a name="versioning">Class Versioning</a></h3>
382<p>
383Suppose we're satisfied with our <code style="white-space: normal">bus_route</code> class, build a program
384that uses it and ship the product.  Some time later, it's decided
385that the program needs enhancement and the <code style="white-space: normal">bus_route</code> class is
386altered to include the name of the driver of the route. So the
387new version looks like:
388<pre>
389<code>
390#include &lt;boost/serialization/list.hpp&gt;
391#include &lt;boost/serialization/string.hpp&gt;
392
393class bus_route
394{
395    friend class boost::serialization::access;
396    std::list&lt;bus_stop *&gt; stops;
397    std::string driver_name;
398    template&lt;class Archive&gt;
399    void serialize(Archive &amp; ar, const unsigned int version)
400    {
401        ar &amp; driver_name;
402        ar &amp; stops;
403    }
404public:
405    bus_route(){}
406};
407</code>
408</pre>
409Great, we're all done. Except... what about people using our application
410who now have a bunch of files created under the previous program.
411How can these be used with our new program version?
412<p>
413In general, the serialization library stores a version number in the
414archive for each class serialized.  By default this version number is 0.
415When the archive is loaded, the version number under which it was saved
is read.  The above code can be altered to exploit this:
417<pre>
418<code>
419#include &lt;boost/serialization/list.hpp&gt;
420#include &lt;boost/serialization/string.hpp&gt;
421#include &lt;boost/serialization/version.hpp&gt;
422
423class bus_route
424{
425    friend class boost::serialization::access;
426    std::list&lt;bus_stop *&gt; stops;
427    std::string driver_name;
428    template&lt;class Archive&gt;
429    void serialize(Archive &amp; ar, const unsigned int version)
430    {
431        // only save/load driver_name for newer archives
432        if(version &gt; 0)
433            ar &amp; driver_name;
434        ar &amp; stops;
435    }
436public:
437    bus_route(){}
438};
439
440BOOST_CLASS_VERSION(bus_route, 1)
441</code>
442</pre>
443By application of versioning to each class, there is no need to
444try to maintain a versioning of files.  That is, a file version
445is the combination of the versions of all its constituent classes.
446
447This system permits programs to be always compatible with archives
448created by all previous versions of a program with no more
449effort than required by this example.
450
451<h3><a name="splitting">Splitting <code style="white-space: normal">serialize</code>
452into <code style="white-space: normal">save/load</code></a></h3>
453The <code style="white-space: normal">serialize</code> function is simple, concise, and guarantees
454that class members are saved and loaded in the same sequence
455- the key to the serialization system.  However, there are cases
456where the load and save operations are not as similar as the examples
457used here.  For example, this could occur with a class that has evolved through
458multiple versions.  The above class can be reformulated as:
459<pre>
460<code>
461#include &lt;boost/serialization/list.hpp&gt;
462#include &lt;boost/serialization/string.hpp&gt;
463#include &lt;boost/serialization/version.hpp&gt;
464#include &lt;boost/serialization/split_member.hpp&gt;
465
466class bus_route
467{
468    friend class boost::serialization::access;
469    std::list&lt;bus_stop *&gt; stops;
470    std::string driver_name;
471    template&lt;class Archive&gt;
472    void save(Archive &amp; ar, const unsigned int version) const
473    {
474        // note, version is always the latest when saving
475        ar  &amp; driver_name;
476        ar  &amp; stops;
477    }
478    template&lt;class Archive&gt;
479    void load(Archive &amp; ar, const unsigned int version)
480    {
481        if(version &gt; 0)
482            ar &amp; driver_name;
483        ar  &amp; stops;
484    }
485    BOOST_SERIALIZATION_SPLIT_MEMBER()
486public:
487    bus_route(){}
488};
489
490BOOST_CLASS_VERSION(bus_route, 1)
491</code>
492</pre>
493The macro <code style="white-space: normal">BOOST_SERIALIZATION_SPLIT_MEMBER()</code> generates
494code which invokes the <code style="white-space: normal">save</code>
495or <code style="white-space: normal">load</code>
496depending on whether the archive is used for saving or loading.
497<h3><a name="archives">Archives</a></h3>
498Our discussion here has focused on adding serialization
499capability to classes.  The actual rendering of the data to be serialized
500is implemented in the archive class.  Thus the stream of serialized
501data is a product of the serialization of the class and the
502archive selected.  It is a key design decision that these two
503components be independent.  This permits any serialization specification
504to be usable with any archive.
505<p>
506In this tutorial, we have used a particular
507archive class - <code style="white-space: normal">text_oarchive</code> for saving and
508<code style="white-space: normal">text_iarchive</code> for loading.
Text archives render data as text and are portable across platforms.  In addition
to text archives, the library includes archive classes for native binary data
and xml formatted data.  The interfaces of all archive classes are identical.
512Once serialization has been defined for a class, that class can be serialized to
513any type of archive.
514<p>
515If the current set of archive classes doesn't provide the
516attributes, format, or behavior needed for a particular application,
517one can either make a new archive class or derive from an existing one.
518This is described later in the manual.
519
<h3><a name="examples">List of Examples</a></h3>
521<dl>
522    <dt><a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a>
523    <dd>This is the completed example used in this tutorial.
524    It does the following:
525    <ol>
526        <li>Creates a structure of differing kinds of stops, routes and schedules
527        <li>Displays it
528        <li>Serializes it to a file named "testfile.txt" with one
529        statement
530        <li>Restores to another structure
531        <li>Displays the restored structure
532    </ol>
533    <a href="../example/demo_output.txt" target="demo_output">Output of
534    this program</a> is sufficient to verify that all the
535    originally stated requirements for a serialization system
536    are met with this system. The <a href="../example/demofile.txt"
537    target="test_file">contents of the archive file</a> can
538    also be displayed as serialization files are ASCII text.
539
540    <dt><a href="../example/demo_xml.cpp" target="demo_xml_cpp">demo_xml.cpp</a>
    <dd>This is a variation of the original demo which supports xml archives in addition
    to the others. The extra wrapping macro, BOOST_SERIALIZATION_NVP(name), is
    needed to associate a data item name with the corresponding xml
    tag. It is important that 'name' be a valid xml tag, else it
    will be impossible to restore the archive.
546    For more information see
547    <a target="detail" href="wrappers.html#nvp">Name-Value Pairs</a>.
548    <a href="../example/demo_save.xml" target="demo_save_xml">Here</a>
549    is what an xml archive looks like.
550
551    <dt><a href="../example/demo_xml_save.cpp" target="demo_xml_save_cpp">demo_xml_save.cpp</a>
552    and <a href="../example/demo_xml_load.cpp" target="demo_xml_load_cpp">demo_xml_load.cpp</a>
553    <dd>Note also that though our examples save and load the program data
    to an archive within the same program, this is merely a convenience
555    for purposes of illustration.  In general, the archive may or may
556    not be loaded by the same program that created it.
557</dl>
558<p>
559The astute reader might notice that these examples contain a subtle but important flaw.
560They leak memory. The bus stops are created in the <code style="white-space: normal">
561main</code> function.  The bus schedules may refer to these bus stops
562any number of times.  At the end of the main function after the bus schedules are destroyed,
the bus stops are destroyed.  This seems fine.  But what about the
<code style="white-space: normal">new_schedule</code> structure created by the
process of loading from an archive? This contains its own separate set of bus stops
566that are not referenced outside of the bus schedule.  These won't be destroyed
567anywhere in the program - a memory leak.
568<p>
There are a couple of ways of fixing this.  One way is to explicitly manage the bus stops.
However, a more robust and transparent approach is to use
<code style="white-space: normal">shared_ptr</code> rather than raw pointers. Along
with serialization implementations for the Standard Library, the serialization library
includes an implementation of serialization for
<code style="white-space: normal">boost::shared_ptr</code>.  Given this, it should be
easy to alter any of these examples to eliminate the memory leak. This is left
as an exercise for the reader.
577
578<hr>
579<p><i>&copy; Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
580Distributed under the Boost Software License, Version 1.0. (See
581accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
582</i></p>
583</body>
584</html>
585