<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="style.css">
<title>Serialization - Dataflow Iterators</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
  <tr>
    <td valign="top" width="300">
      <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
    </td>
    <td valign="top">
      <h1 align="center">Serialization</h1>
      <h2 align="center">Dataflow Iterators</h2>
    </td>
  </tr>
</table>
<hr>
<h3>Motivation</h3>
Consider the problem of translating an arbitrary length sequence of 8 bit bytes
to base64 text. Such a process can be summarized as:
<p>
source => 8 bit bytes => 6 bit integers => encode to base64 characters => insert line breaks => destination
<p>
We would prefer a solution that is:
<ul>
  <li>Decomposable, so we can code, test, verify and use each (simple) stage of the conversion
  independently.
  <li>Composable, so we can use this composite as a new component somewhere else.
  <li>Efficient, so we're not required to re-implement it.
  <li>Scalable, so that it works well for short and arbitrarily long sequences.
</ul>
The approach that comes closest to meeting these requirements is that described
and implemented with <a href="../../iterator/doc/index.html">Iterator Adaptors</a>.
The fundamental feature of an Iterator Adaptor template that makes it interesting to
us is that it takes as a parameter a base iterator from which it derives its
input. This suggests that something like the following might be possible.
<pre><code>
typedef
    insert_linebreaks&lt;         // insert line breaks every 76 characters
        base64_from_binary&lt;    // convert binary values to base64 characters
            transform_width&lt;   // retrieve 6 bit integers from a sequence of 8 bit bytes
                const char *,
                6,
                8
            &gt;
        &gt;
        ,76
    &gt;
    base64_text; // compose all the above operations into a new iterator

std::copy(
    base64_text(address),
    base64_text(address + count),
    ostream_iterator&lt;CharType&gt;(os)
);
</code></pre>
Indeed, this seems to be exactly the kind of problem that iterator adaptors are
intended to address. The Iterator Adaptor library already includes
modules which can be configured to implement some of the operations above. For example,
included is <a target="transform_iterator" href="../../iterator/doc/transform_iterator.html">
transform_iterator</a>, which can be used to implement 6 bit integer => base64 code.

<h3>Dataflow Iterators</h3>
Unfortunately, not all iterators which inherit from Iterator Adaptors are guaranteed
to meet the composability goals stated above. To accomplish this purpose, they have
to be written with some additional considerations in mind.
<p>
We define a Dataflow Iterator as a class inherited from <code style="white-space: normal">iterator_adaptor</code> which
fulfills a small set of additional requirements.
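<p>
As a rough illustration of the composition idea described above, the following sketch
builds two small iterator wrappers in plain C++ without Boost. The names
<code style="white-space: normal">low6_bits</code>, <code style="white-space: normal">base64_char</code> and
<code style="white-space: normal">encode_low6</code> are hypothetical and are not part of the library. Each wrapper
takes its base iterator as a template parameter and forwards its constructor argument to that base,
so the composite can be built directly from a raw pointer. It only masks each byte to its low 6 bits,
so it is not a real base64 encoder.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical adaptor (not part of Boost): yields the low 6 bits of each
// element produced by the base iterator.
template<class Base>
class low6_bits {
    Base m_base;
public:
    typedef std::input_iterator_tag iterator_category;
    typedef unsigned char value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const unsigned char * pointer;
    typedef unsigned char reference;

    // Accept anything the base iterator can be constructed from.
    template<class T>
    explicit low6_bits(T start) : m_base(start) {}

    reference operator*() const {
        return static_cast<unsigned char>(*m_base & 0x3F);
    }
    low6_bits & operator++() { ++m_base; return *this; }
    low6_bits operator++(int) { low6_bits tmp(*this); ++m_base; return tmp; }
    // Equality is delegated to the base iterators.
    bool operator==(const low6_bits & rhs) const { return m_base == rhs.m_base; }
    bool operator!=(const low6_bits & rhs) const { return !(*this == rhs); }
};

// Hypothetical adaptor (not part of Boost): maps values 0..63 taken from the
// base iterator to characters of the base64 alphabet.
template<class Base>
class base64_char {
    Base m_base;
public:
    typedef std::input_iterator_tag iterator_category;
    typedef char value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const char * pointer;
    typedef char reference;

    // Forwarding the argument lets base64_char(const char *) construct the
    // whole chain of nested iterators.
    template<class T>
    explicit base64_char(T start) : m_base(start) {}

    reference operator*() const {
        static const char alphabet[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/";
        const unsigned char v = *m_base;
        assert(v < 64);
        return alphabet[v];
    }
    base64_char & operator++() { ++m_base; return *this; }
    base64_char operator++(int) { base64_char tmp(*this); ++m_base; return tmp; }
    bool operator==(const base64_char & rhs) const { return m_base == rhs.m_base; }
    bool operator!=(const base64_char & rhs) const { return !(*this == rhs); }
};

// Compose the two stages and run the input through them with std::copy.
std::string encode_low6(const std::string & in) {
    typedef base64_char< low6_bits<const char *> > encoder;
    std::string out;
    const char * first = in.data();
    std::copy(encoder(first), encoder(first + in.size()), std::back_inserter(out));
    return out;
}
```

Calling <code style="white-space: normal">encode_low6</code> on a string maps each byte independently, so the
output has the same length as the input. The library's <code style="white-space: normal">transform_width</code>
does the bit regrouping that this sketch leaves out.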

<h4>Templated Constructors</h4>
<p>
Templated constructors have the form:
<pre><code>
template&lt;class T&gt;
dataflow_iterator(T start) :
    iterator_adaptor(Base(start))
{}
</code></pre>
When these constructors are applied to our example above, the following code is generated:
<pre><code>
std::copy(
    insert_linebreaks(
        base64_from_binary(
            transform_width(
                address
            )
        )
    ),
    insert_linebreaks(
        base64_from_binary(
            transform_width(
                address + count
            )
        )
    ),
    ostream_iterator&lt;char&gt;(os)
);
</code></pre>
The recursive application of this template is what automatically generates the
constructor <code style="white-space: normal">base64_text(const char *)</code> in our example above. The original
Iterator Adaptors include <code style="white-space: normal">make_xxx_iterator</code> functions for this purpose.
However, I believe these are unwieldy to use compared to the above solution using
templated constructors.

<h4>Dereferencing</h4>
Dereferencing some iterators can cause problems. For example, a natural
way to write a <code style="white-space: normal">remove_whitespace</code> iterator is to increment past the initial
whitespace when the iterator is constructed. This will fail if the iterator passed to the
constructor "points" to the end of a string, because skipping whitespace requires dereferencing it. The
<a target="filter_iterator" href="../../iterator/doc/filter_iterator.html">
<code style="white-space: normal">filter_iterator</code></a> is implemented
in this way so it can't be used in our context. So, for the implementation of this iterator,
space removal is deferred until the iterator is actually dereferenced.

<h4>Comparison</h4>
The default implementation of iterator equality of <code style="white-space: normal">iterator_adaptor</code> just
invokes the equality operator on the base iterators. Generally this is satisfactory.
However, this implies that other operations (e.g.
dereference) do not prematurely
increment the base iterator. Avoiding such premature increments can be surprisingly tricky in some cases
(e.g. <code style="white-space: normal">transform_width</code>).

<p>
Iterators which fulfill the above requirements should be composable and the above sample
code should implement our binary to base64 conversion.

<h3>Iterators Included in the Library</h3>
Dataflow iterators for the serialization library are all defined in the namespace
<code style="white-space: normal">boost::archive::iterators</code>. Included here are:
<dl class="index">
  <dt><a target="base64_from_binary" href="../../../boost/archive/iterators/base64_from_binary.hpp">
  base64_from_binary</a></dt>
  <dd>transforms a sequence of integers to base64 text</dd>

  <dt><a target="base64_from_binary" href="../../../boost/archive/iterators/binary_from_base64.hpp">
  binary_from_base64</a></dt>
  <dd>transforms a sequence of base64 characters to a sequence of integers</dd>

  <dt><a target="insert_linebreaks" href="../../../boost/archive/iterators/insert_linebreaks.hpp">
  insert_linebreaks</a></dt>
  <dd>given a sequence, creates a sequence with newline characters inserted</dd>

  <dt><a target="mb_from_wchar" href="../../../boost/archive/iterators/mb_from_wchar.hpp">
  mb_from_wchar</a></dt>
  <dd>transforms a sequence of wide characters to a sequence of multi-byte characters</dd>

  <dt><a target="remove_whitespace" href="../../../boost/archive/iterators/remove_whitespace.hpp">
  remove_whitespace</a></dt>
  <dd>given a sequence of characters, returns a sequence with the whitespace characters
  removed. This is a derivation from the <code style="white-space: normal">boost::filter_iterator</code></dd>

  <dt><a target="transform_width" href="../../../boost/archive/iterators/transform_width.hpp">
  transform_width</a></dt>
  <dd>transforms a sequence of x bit elements into a sequence of y bit elements.
This
  is a key component in iterators which translate to and from base64 text.</dd>

  <dt><a target="wchar_from_mb" href="../../../boost/archive/iterators/wchar_from_mb.hpp">
  wchar_from_mb</a></dt>
  <dd>transforms a sequence of multi-byte characters in the current locale to wide characters.</dd>

  <dt><a target="xml_escape" href="../../../boost/archive/iterators/xml_escape.hpp">
  xml_escape</a></dt>
  <dd>escapes xml meta-characters from xml text</dd>

  <dt><a target="xml_unescape" href="../../../boost/archive/iterators/xml_unescape.hpp">
  xml_unescape</a></dt>
  <dd>unescapes xml escape sequences to create a sequence of normal text</dd>
</dl>
<p>
The standard stream iterators don't quite work for us. On systems which implement <code style="white-space: normal">wchar_t</code>
as unsigned short integers (e.g. VC 6) they didn't function as I expected. I also made some
adjustments to be consistent with our concept of Dataflow Iterators. Like the rest of our
iterators, they are found in the namespace <code style="white-space: normal">boost::archive::iterators</code> to avoid
conflicts with the standard library versions.
<dl class = "index">
  <dt><a target="istream_iterator" href="../../../boost/archive/iterators/istream_iterator.hpp">
  istream_iterator</a></dt>
  <dt><a target="ostream_iterator" href="../../../boost/archive/iterators/ostream_iterator.hpp">
  ostream_iterator</a></dt>
</dl>

<hr>
<p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
Distributed under the Boost Software License, Version 1.0. (See
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
</i></p>
</body>
</html>