<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="style.css">
<title>Serialization - Tutorial</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
  <tr>
    <td valign="top" width="300">
      <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
    </td>
    <td valign="top">
      <h1 align="center">Serialization</h1>
      <h2 align="center">Tutorial</h2>
    </td>
  </tr>
</table>
<hr>
<dl class="page-index">
  <dt><a href="#simplecase">A Very Simple Case</a>
  <dt><a href="#nonintrusiveversion">Non Intrusive Version</a>
  <dt><a href="#serializablemembers">Serializable Members</a>
  <dt><a href="#derivedclasses">Derived Classes</a>
  <dt><a href="#pointers">Pointers</a>
  <dt><a href="#arrays">Arrays</a>
  <dt><a href="#stl">STL Collections</a>
  <dt><a href="#versioning">Class Versioning</a>
  <dt><a href="#splitting">Splitting <code style="white-space: normal">serialize</code> into <code style="white-space: normal">save/load</code></a>
  <dt><a href="#archives">Archives</a>
  <dt><a href="#examples">List of Examples</a>
</dl>
An output archive is similar to an output data stream. Data can be saved to the archive
with either the << or the & operator:
<pre><code>
ar << data;
ar & data;
</code></pre>
An input archive is similar to an input data stream.
Data can be loaded from the archive
with either the >> or the & operator:
<pre><code>
ar >> data;
ar & data;
</code></pre>
<p>
When these operators are invoked for primitive data types, the data is simply saved/loaded
to/from the archive. When invoked for class data types, the class
<code style="white-space: normal">serialize</code> function is invoked. Each
<code style="white-space: normal">serialize</code> function uses the above operators
to save/load its data members. This process continues recursively until
all the data contained in the class is saved/loaded.

<h3><a name="simplecase">A Very Simple Case</a></h3>
These operators are used inside the <code style="white-space: normal">serialize</code>
function to save and load class data members.
<p>
Included in this library is a program called
<a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a> which illustrates how
to use this system. Below we excerpt code from this program to
illustrate, with the simplest possible case, how this library is
intended to be used.
<pre>
<code>
#include <fstream>

// include headers that implement an archive in simple text format
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

/////////////////////////////////////////////////////////////
// gps coordinate
//
// illustrates serialization for a simple type
//
class gps_position
{
private:
    friend class boost::serialization::access;
    // When the class Archive corresponds to an output archive, the
    // & operator is defined similar to <<.  Likewise, when the class Archive
    // is a type of input archive the & operator is defined similar to >>.
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & degrees;
        ar & minutes;
        ar & seconds;
    }
    int degrees;
    int minutes;
    float seconds;
public:
    gps_position(){};
    gps_position(int d, int m, float s) :
        degrees(d), minutes(m), seconds(s)
    {}
};

int main() {
    // create class instance
    const gps_position g(35, 59, 24.567f);

    // save data to archive
    {
        // create and open a character archive for output
        std::ofstream ofs("filename");
        boost::archive::text_oarchive oa(ofs);
        // write class instance to archive
        oa << g;
        // archive and stream closed when destructors are called
    }

    // ... some time later restore the class instance to its original state
    gps_position newg;
    {
        // create and open an archive for input
        std::ifstream ifs("filename");
        boost::archive::text_iarchive ia(ifs);
        // read class state from archive
        ia >> newg;
        // archive and stream closed when destructors are called
    }
    return 0;
}
</code>
</pre>
<p>For each class to be saved via serialization, there must exist a function to
save all the class members which define the state of the class.
For each class to be loaded via serialization, there must exist a function to
load these class members in the same sequence as they were saved.
In the above example, these functions are generated by the
template member function <code style="white-space: normal">serialize</code>.

<h3><a name="nonintrusiveversion">Non Intrusive Version</a></h3>
<p>The above formulation is intrusive. That is, it requires
that classes whose instances are to be serialized be
altered. This can be inconvenient in some cases.
An equivalent alternative formulation permitted by the
system would be:
<pre><code>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

class gps_position
{
public:
    int degrees;
    int minutes;
    float seconds;
    gps_position(){};
    gps_position(int d, int m, float s) :
        degrees(d), minutes(m), seconds(s)
    {}
};

namespace boost {
namespace serialization {

template<class Archive>
void serialize(Archive & ar, gps_position & g, const unsigned int version)
{
    ar & g.degrees;
    ar & g.minutes;
    ar & g.seconds;
}

} // namespace serialization
} // namespace boost
</code></pre>
<p>
In this case the generated serialize functions are not members of the
<code style="white-space: normal">gps_position</code> class. The two formulations function
in exactly the same way.
<p>
The main application of non-intrusive serialization is to permit serialization
to be implemented for classes without changing the class definition.
In order for this to be possible, the class must expose enough information
to reconstruct the class state. In this example, we presumed that the
class had <code style="white-space: normal">public</code> members - not a common occurrence. Only
classes which expose enough information to save and restore the class
state will be serializable without changing the class definition.
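<p>
As a usage sketch, the non-intrusive formulation above can be exercised with a round trip
through an in-memory stream. The use of <code style="white-space: normal">std::stringstream</code>
in place of a file, the zero-initializing default constructor, and the
<code style="white-space: normal">round_trip</code> helper are assumptions made for this
illustration; they are not taken from demo.cpp.
<pre><code>
#include <cassert>
#include <iostream>
#include <sstream>

#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

// same public members as the gps_position class above
class gps_position
{
public:
    int degrees;
    int minutes;
    float seconds;
    gps_position() : degrees(0), minutes(0), seconds(0.0f) {}
    gps_position(int d, int m, float s) :
        degrees(d), minutes(m), seconds(s)
    {}
};

namespace boost {
namespace serialization {

// free function: the class definition itself is left untouched
template<class Archive>
void serialize(Archive & ar, gps_position & g, const unsigned int /* version */)
{
    ar & g.degrees;
    ar & g.minutes;
    ar & g.seconds;
}

} // namespace serialization
} // namespace boost

// hypothetical helper: save g to a text archive, then load a copy back
gps_position round_trip(const gps_position & g)
{
    std::stringstream ss;
    {
        boost::archive::text_oarchive oa(ss);
        oa << g;
        // oa is destroyed here, before the stream is read back
    }
    gps_position result;
    {
        boost::archive::text_iarchive ia(ss);
        ia >> result;
    }
    return result;
}

int main()
{
    const gps_position g(35, 59, 24.567f);
    const gps_position copy = round_trip(g);
    assert(copy.degrees == g.degrees);
    assert(copy.minutes == g.minutes);
    std::cout << copy.degrees << ' ' << copy.minutes << ' '
              << copy.seconds << '\n';
    return 0;
}
</code></pre>
The calling code is the same as in the intrusive case: only the location of
<code style="white-space: normal">serialize</code> differs.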
<h3><a name="serializablemembers">Serializable Members</a></h3>
<p>
A serializable class with serializable members would look like this:
<pre><code>
class bus_stop
{
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & latitude;
        ar & longitude;
    }
    gps_position latitude;
    gps_position longitude;
protected:
    bus_stop(const gps_position & lat_, const gps_position & long_) :
        latitude(lat_), longitude(long_)
    {}
public:
    bus_stop(){}
    // See item # 14 in Effective C++ by Scott Meyers.
    // re non-virtual destructors in base classes.
    virtual ~bus_stop(){}
};
</code></pre>
<p>That is, members of class type are serialized just as
members of primitive types are.
<p>
Note that saving an instance of the class <code style="white-space: normal">bus_stop</code>
with one of the archive operators will invoke the
<code style="white-space: normal">serialize</code> function which saves
<code style="white-space: normal">latitude</code> and
<code style="white-space: normal">longitude</code>. Each of these in turn will be saved by invoking
<code style="white-space: normal">serialize</code> in the definition of
<code style="white-space: normal">gps_position</code>. In this manner the whole
data structure is saved by the application of an archive operator to
just its root item.


<h3><a name="derivedclasses">Derived Classes</a></h3>
<p>Derived classes should include serializations of their base classes.
<pre><code>
#include <string>

#include <boost/serialization/base_object.hpp>

class bus_stop_corner : public bus_stop
{
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        // serialize base class information
        ar & boost::serialization::base_object<bus_stop>(*this);
        ar & street1;
        ar & street2;
    }
    std::string street1;
    std::string street2;
    virtual std::string description() const
    {
        return street1 + " and " + street2;
    }
public:
    bus_stop_corner(){}
    bus_stop_corner(const gps_position & lat_, const gps_position & long_,
        const std::string & s1_, const std::string & s2_
    ) :
        bus_stop(lat_, long_), street1(s1_), street2(s2_)
    {}
};
</code>
</pre>
<p>
Note the serialization of the base classes from the derived
class. Do <b>NOT</b> directly call the base class serialize
functions. Doing so might seem to work but will bypass the code
that tracks instances written to storage to eliminate redundancies.
It will also bypass the writing of class version information into
the archive. For this reason, it is advisable to always make member
<code style="white-space: normal">serialize</code> functions private. The declaration
<code style="white-space: normal">friend class boost::serialization::access</code> will grant to the
serialization library access to private member variables and functions.
<p>
<h3><a name="pointers">Pointers</a></h3>
Suppose we define a bus route as an array of bus stops. Given that
<ol>
    <li>we might have several types of bus stops (remember bus_stop is
a base class)
    <li>a given bus_stop might appear in more than one route.
</ol>
it's convenient to represent a bus route with an array of pointers
to <code style="white-space: normal">bus_stop</code>.
<pre>
<code>
class bus_route
{
    friend class boost::serialization::access;
    bus_stop * stops[10];
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        int i;
        for(i = 0; i < 10; ++i)
            ar & stops[i];
    }
public:
    bus_route(){}
};
</code>
</pre>
Each member of the array <code style="white-space: normal">stops</code> will be serialized.
But remember each member is a pointer - so what can this really
mean? The whole object of this serialization is to permit
reconstruction of the original data structures at another place
and time. In order to accomplish this with a pointer, it is
not sufficient to save the value of the pointer; rather, the
object it points to must be saved. When the member is later
loaded, a new object has to be created and a new pointer has
to be loaded into the class member.
<p>
If the same pointer is serialized more than once, only one instance
is added to the archive. When the archive is read back, no object data is read
for the repeated pointer. The only operation that occurs is that the second
pointer is set equal to the first.
<p>
Note that, in this example, the array consists of polymorphic pointers.
That is, each array element points to one of several possible
kinds of bus stops. So when the pointer is saved, some sort of class
identifier must be saved. When the pointer is loaded, the class
identifier must be read and an instance of the corresponding class
must be constructed. Finally the data can be loaded into the newly created
instance of the correct type.
<p>
As can be seen in
<a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a>,
serialization of pointers to derived classes through a base
class pointer may require explicit enumeration of the derived
classes to be serialized. This is referred to as "registration" or "export"
of derived classes.
This requirement and the methods of
satisfying it are explained in detail
<a href="serialization.html#derivedpointers">here</a>.
<p>
All this is accomplished automatically by the serialization
library. The above code is all that is necessary to accomplish
the saving and loading of objects accessed through pointers.
<p>
<h3><a name="arrays">Arrays</a></h3>
The above formulation is in fact more complex than necessary.
The serialization library detects when the object being
serialized is an array and emits code equivalent to the above.
So the above can be shortened to:
<pre>
<code>
class bus_route
{
    friend class boost::serialization::access;
    bus_stop * stops[10];
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & stops;
    }
public:
    bus_route(){}
};
</code>
</pre>
<h3><a name="stl">STL Collections</a></h3>
The above example uses an array of members. More likely such
an application would use an STL collection for such a purpose.
The serialization library contains code for serialization
of all STL classes. Hence, the reformulation below will
also work as one would expect.
<pre>
<code>
#include <boost/serialization/list.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & stops;
    }
public:
    bus_route(){}
};
</code>
</pre>
<h3><a name="versioning">Class Versioning</a></h3>
<p>
Suppose we're satisfied with our <code style="white-space: normal">bus_route</code> class, build a program
that uses it and ship the product. Some time later, it's decided
that the program needs enhancement and the <code style="white-space: normal">bus_route</code> class is
altered to include the name of the driver of the route.
So the
new version looks like:
<pre>
<code>
#include <boost/serialization/list.hpp>
#include <boost/serialization/string.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    std::string driver_name;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & driver_name;
        ar & stops;
    }
public:
    bus_route(){}
};
</code>
</pre>
Great, we're all done. Except... what about people using our application
who now have a bunch of files created under the previous program?
How can these be used with our new program version?
<p>
In general, the serialization library stores a version number in the
archive for each class serialized. By default this version number is 0.
When the archive is loaded, the version number under which it was saved
is read. The above code can be altered to exploit this:
<pre>
<code>
#include <boost/serialization/list.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/version.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    std::string driver_name;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        // only save/load driver_name for newer archives
        if(version > 0)
            ar & driver_name;
        ar & stops;
    }
public:
    bus_route(){}
};

BOOST_CLASS_VERSION(bus_route, 1)
</code>
</pre>
By application of versioning to each class, there is no need to
try to maintain a versioning of files. That is, a file version
is the combination of the versions of all its constituent classes.
<p>
This system permits programs to be always compatible with archives
created by all previous versions of a program with no more
effort than required by this example.
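<p>
To make the role of the <code style="white-space: normal">version</code> argument concrete,
the following minimal sketch records the value the library passes to
<code style="white-space: normal">serialize</code> while loading. The class
<code style="white-space: normal">route</code> is a simplified, hypothetical stand-in for
<code style="white-space: normal">bus_route</code> (the list of stops is replaced by a count),
and the global <code style="white-space: normal">version_seen_on_load</code> and the
<code style="white-space: normal">round_trip</code> helper exist only for this illustration.
<pre><code>
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/version.hpp>

// hypothetical global, used only to observe what the library passes in
unsigned int version_seen_on_load = 0;

class route
{
    friend class boost::serialization::access;
    std::string driver_name;
    int stop_count;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        // remember the version stored in the archive being read
        if(Archive::is_loading::value)
            version_seen_on_load = version;
        // driver_name is present only in archives of version 1 or later
        if(version > 0)
            ar & driver_name;
        ar & stop_count;
    }
public:
    route() : stop_count(0) {}
    route(const std::string & d, int n) : driver_name(d), stop_count(n) {}
    const std::string & driver() const { return driver_name; }
    int stops() const { return stop_count; }
};

BOOST_CLASS_VERSION(route, 1)

// hypothetical helper: save r to a text archive, then load a copy back
route round_trip(const route & r)
{
    std::stringstream ss;
    {
        boost::archive::text_oarchive oa(ss);
        oa << r;
    }
    route result;
    {
        boost::archive::text_iarchive ia(ss);
        ia >> result;
    }
    return result;
}

int main()
{
    const route saved("Alice", 3);
    const route loaded = round_trip(saved);
    assert(loaded.driver() == "Alice");
    assert(loaded.stops() == 3);
    std::cout << "version seen on load: " << version_seen_on_load << '\n';
    return 0;
}
</code></pre>
Because the archive here is written by the same program that reads it, the recorded value
is the one given to <code style="white-space: normal">BOOST_CLASS_VERSION</code>. An archive
written before that macro was added would carry the default version 0, and the
<code style="white-space: normal">if(version > 0)</code> test would skip
<code style="white-space: normal">driver_name</code>.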

<h3><a name="splitting">Splitting <code style="white-space: normal">serialize</code>
into <code style="white-space: normal">save/load</code></a></h3>
The <code style="white-space: normal">serialize</code> function is simple, concise, and guarantees
that class members are saved and loaded in the same sequence
- the key to the serialization system. However, there are cases
where the load and save operations are not as similar as the examples
used here. For example, this could occur with a class that has evolved through
multiple versions. The above class can be reformulated as:
<pre>
<code>
#include <boost/serialization/list.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/version.hpp>
#include <boost/serialization/split_member.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    std::string driver_name;
    template<class Archive>
    void save(Archive & ar, const unsigned int version) const
    {
        // note, version is always the latest when saving
        ar & driver_name;
        ar & stops;
    }
    template<class Archive>
    void load(Archive & ar, const unsigned int version)
    {
        if(version > 0)
            ar & driver_name;
        ar & stops;
    }
    BOOST_SERIALIZATION_SPLIT_MEMBER()
public:
    bus_route(){}
};

BOOST_CLASS_VERSION(bus_route, 1)
</code>
</pre>
The macro <code style="white-space: normal">BOOST_SERIALIZATION_SPLIT_MEMBER()</code> generates
code which invokes <code style="white-space: normal">save</code>
or <code style="white-space: normal">load</code>
depending on whether the archive is used for saving or loading.
<h3><a name="archives">Archives</a></h3>
Our discussion here has focused on adding serialization
capability to classes. The actual rendering of the data to be serialized
is implemented in the archive class.
Thus the stream of serialized
data is a product of the serialization of the class and the
archive selected. It is a key design decision that these two
components be independent. This permits any serialization specification
to be usable with any archive.
<p>
In this tutorial, we have used a particular
archive class - <code style="white-space: normal">text_oarchive</code> for saving and
<code style="white-space: normal">text_iarchive</code> for loading.
Text archives render data as text and are portable across platforms. In addition
to text archives, the library includes archive classes for native binary data
and xml formatted data. The interfaces to all archive classes are identical.
Once serialization has been defined for a class, that class can be serialized to
any type of archive.
<p>
If the current set of archive classes doesn't provide the
attributes, format, or behavior needed for a particular application,
one can either make a new archive class or derive from an existing one.
This is described later in the manual.

<h3><a name="examples">List of Examples</a></h3>
<dl>
    <dt><a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a>
    <dd>This is the completed example used in this tutorial.
    It does the following:
    <ol>
        <li>Creates a structure of differing kinds of stops, routes and schedules
        <li>Displays it
        <li>Serializes it to a file named "testfile.txt" with one
        statement
        <li>Restores to another structure
        <li>Displays the restored structure
    </ol>
    <a href="../example/demo_output.txt" target="demo_output">Output of
    this program</a> is sufficient to verify that all the
    originally stated requirements for a serialization system
    are met with this system. The <a href="../example/demofile.txt"
    target="test_file">contents of the archive file</a> can
    also be displayed, since serialization files are ASCII text.

    <dt><a href="../example/demo_xml.cpp" target="demo_xml_cpp">demo_xml.cpp</a>
    <dd>This is a variation of the original demo which supports xml archives in addition
    to the others. The extra wrapping macro, BOOST_SERIALIZATION_NVP(name), is
    needed to associate a data item name with the corresponding xml
    tag. It is important that 'name' be a valid xml tag, else it
    will be impossible to restore the archive.
    For more information see
    <a target="detail" href="wrappers.html#nvp">Name-Value Pairs</a>.
    <a href="../example/demo_save.xml" target="demo_save_xml">Here</a>
    is what an xml archive looks like.

    <dt><a href="../example/demo_xml_save.cpp" target="demo_xml_save_cpp">demo_xml_save.cpp</a>
    and <a href="../example/demo_xml_load.cpp" target="demo_xml_load_cpp">demo_xml_load.cpp</a>
    <dd>Note also that though our examples save and load the program data
    to an archive within the same program, this is merely a convenience
    for purposes of illustration. In general, the archive may or may
    not be loaded by the same program that created it.
</dl>
<p>
The astute reader might notice that these examples contain a subtle but important flaw.
They leak memory. The bus stops are created in the <code style="white-space: normal">
main</code> function. The bus schedules may refer to these bus stops
any number of times. At the end of the main function after the bus schedules are destroyed,
the bus stops are destroyed. This seems fine. But what about the
<code style="white-space: normal">new_schedule</code> data item created by the
process of loading from an archive? This contains its own separate set of bus stops
that are not referenced outside of the bus schedule. These won't be destroyed
anywhere in the program - a memory leak.
<p>
There are a couple of ways of fixing this. One way is to explicitly manage the bus stops.
However, a more robust and transparent approach is to use
<code style="white-space: normal">shared_ptr</code> rather than raw pointers. Along
with serialization implementations for the Standard Library, the serialization library
includes an implementation of serialization for
<code style="white-space: normal">boost::shared_ptr</code>. Given this, it should be
easy to alter any of these examples to eliminate the memory leak. This is left
as an exercise for the reader.

<hr>
<p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
Distributed under the Boost Software License, Version 1.0. (See
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
</i></p>
</body>
</html>