• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2<html>
3<head>
4<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
5<title>Appendices</title>
6<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
7<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
8<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
9<link rel="up" href="../xpressive.html" title="Chapter 46. Boost.Xpressive">
10<link rel="prev" href="../boost_xpressive/acknowledgments.html" title="Acknowledgments">
11<link rel="next" href="../yap.html" title="Chapter 47. Boost.YAP">
12</head>
13<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
14<table cellpadding="2" width="100%"><tr>
15<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td>
16<td align="center"><a href="../../../index.html">Home</a></td>
17<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td>
18<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
19<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
20<td align="center"><a href="../../../more/index.htm">More</a></td>
21</tr></table>
22<hr>
23<div class="spirit-nav">
24<a accesskey="p" href="../boost_xpressive/acknowledgments.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../yap.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
25</div>
26<div class="section">
27<div class="titlepage"><div><div><h2 class="title" style="clear: both">
28<a name="xpressive.appendices"></a><a class="link" href="appendices.html" title="Appendices">Appendices</a>
29</h2></div></div></div>
30<div class="toc"><dl class="toc">
31<dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.appendix_1__history">Appendix
32      1: History</a></span></dt>
33<dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.appendix_2__not_yet_implemented">Appendix
34      2: Not Yet Implemented</a></span></dt>
35<dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.appendix_3__differences_from_boost_regex">Appendix
36      3: Differences from Boost.Regex</a></span></dt>
37<dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.perf">Appendix 4: Performance
38      Comparison</a></span></dt>
39<dt><span class="section"><a href="appendices.html#xpressive.appendices.appendix_5__implementation_notes">Appendix
40      5: Implementation Notes</a></span></dt>
41</dl></div>
42<div class="section">
43<div class="titlepage"><div><div><h3 class="title">
44<a name="boost_xpressive.appendices.appendix_1__history"></a><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history" title="Appendix 1: History">Appendix
45      1: History</a>
46</h3></div></div></div>
47<h3>
48<a name="boost_xpressive.appendices.appendix_1__history.h0"></a>
49        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_2_1_0_6_12_2008"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_2_1_0_6_12_2008">Version
50        2.1.0 6/12/2008</a>
51      </h3>
52<p>
53        New Features:
54      </p>
55<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
56<li class="listitem">
57            <code class="computeroutput"><span class="identifier">skip</span><span class="special">()</span></code>
58            primitive for static regexes, which allows you to specify parts of the
59            input string to ignore during regex matching.
60          </li>
61<li class="listitem">
62            Range-based <code class="computeroutput"><span class="identifier">regex_replace</span><span class="special">()</span></code> algorithm interface.
63          </li>
64<li class="listitem">
65            <code class="computeroutput"><span class="identifier">regex_replace</span><span class="special">()</span></code>
66            accepts formatter objects and formatter lambda expressions in addition
67            to format strings.
68          </li>
69</ul></div>
70<p>
71        Bugs Fixed:
72      </p>
73<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">
74            Semantic actions in look-aheads, look-behinds and independent sub-expressions
75            execute eagerly instead of causing a crash.
76          </li></ul></div>
77<h3>
78<a name="boost_xpressive.appendices.appendix_1__history.h1"></a>
79        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_2_0_1_10_23_2007"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_2_0_1_10_23_2007">Version
80        2.0.1 10/23/2007</a>
81      </h3>
82<p>
83        Bugs Fixed:
84      </p>
85<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">
86            <code class="computeroutput"><span class="identifier">sub_match</span><span class="special">&lt;&gt;</span></code>
87            constructor copies singular iterator causing debug assert.
88          </li></ul></div>
89<h3>
90<a name="boost_xpressive.appendices.appendix_1__history.h2"></a>
91        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_2_0_0__10_12_2007"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_2_0_0__10_12_2007">Version
92        2.0.0, 10/12/2007</a>
93      </h3>
94<p>
95        New Features:
96      </p>
97<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
98<li class="listitem">
99            Semantic actions
100          </li>
101<li class="listitem">
102            Custom assertions
103          </li>
104<li class="listitem">
105            Named captures
106          </li>
107<li class="listitem">
108            Dynamic regex grammars
109          </li>
110<li class="listitem">
111            Recursive dynamic regexes with <code class="literal">(?R)</code> construct
112          </li>
113<li class="listitem">
114            Support for searching non-character data
115          </li>
116<li class="listitem">
117            Better errors for invalid static regexes
118          </li>
119<li class="listitem">
120            Range-based regex algorithm interface
121          </li>
122<li class="listitem">
123            <code class="computeroutput"><span class="identifier">match_flag_type</span><span class="special">::</span><span class="identifier">format_perl</span></code>, <code class="computeroutput"><span class="identifier">match_flag_type</span><span class="special">::</span><span class="identifier">format_sed</span></code>,
124            and <code class="computeroutput"><span class="identifier">match_flag_type</span><span class="special">::</span><span class="identifier">format_all</span></code>
125          </li>
126<li class="listitem">
127            <code class="computeroutput"><span class="keyword">operator</span><span class="special">+(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">sub_match</span><span class="special">&lt;&gt;)</span></code>
128            and variants
129          </li>
130<li class="listitem">
131            Version 2 regex traits get <code class="computeroutput"><span class="identifier">tolower</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">toupper</span><span class="special">()</span></code>
132          </li>
133</ul></div>
134<p>
135        Bugs Fixed:
136      </p>
137<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">
138            Complementing single-character sets like <code class="computeroutput"><span class="special">~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'a'</span><span class="special">)</span></code> works.
139          </li></ul></div>
140<h3>
141<a name="boost_xpressive.appendices.appendix_1__history.h3"></a>
142        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_1_0_2__april_27__2007"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_1_0_2__april_27__2007">Version
143        1.0.2, April 27, 2007</a>
144      </h3>
145<p>
146        Bugs Fixed:
147      </p>
148<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">
149            Back-references greater than nine work as advertized.
150          </li></ul></div>
151<p>
152        This is the version that shipped as part of Boost 1.34.
153      </p>
154<h3>
155<a name="boost_xpressive.appendices.appendix_1__history.h4"></a>
156        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_1_0_1__october_2__2006"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_1_0_1__october_2__2006">Version
157        1.0.1, October 2, 2006</a>
158      </h3>
159<p>
160        Bugs Fixed:
161      </p>
162<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">
163            <code class="computeroutput"><span class="identifier">match_results</span><span class="special">::</span><span class="identifier">position</span><span class="special">()</span></code>
164            works for nested results.
165          </li></ul></div>
166<h3>
167<a name="boost_xpressive.appendices.appendix_1__history.h5"></a>
168        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_1_0_0__march_16__2006"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_1_0_0__march_16__2006">Version
169        1.0.0, March 16, 2006</a>
170      </h3>
171<p>
172        Version 1.0!
173      </p>
174<h3>
175<a name="boost_xpressive.appendices.appendix_1__history.h6"></a>
176        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_0_9_6__august_19__2005"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_0_9_6__august_19__2005">Version
177        0.9.6, August 19, 2005</a>
178      </h3>
179<p>
180        The version reviewed for acceptance into Boost. The review began September
181        8, 2005. Xpressive was accepted into Boost on September 28, 2005.
182      </p>
183<h3>
184<a name="boost_xpressive.appendices.appendix_1__history.h7"></a>
185        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_0_9_3__june_30__2005"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_0_9_3__june_30__2005">Version
186        0.9.3, June 30, 2005</a>
187      </h3>
188<p>
189        New Features:
190      </p>
191<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
192<li class="listitem">
193            TR1-style regex_traits interface
194          </li>
195<li class="listitem">
196            Speed enhancements
197          </li>
198<li class="listitem">
199            <code class="computeroutput"><span class="identifier">syntax_option_type</span><span class="special">::</span><span class="identifier">ignore_white_space</span></code>
200          </li>
201</ul></div>
202<h3>
203<a name="boost_xpressive.appendices.appendix_1__history.h8"></a>
204        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_0_9_0__september_2__2004"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_0_9_0__september_2__2004">Version
205        0.9.0, September 2, 2004</a>
206      </h3>
207<p>
208        New Features:
209      </p>
210<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">
211            It sort of works.
212          </li></ul></div>
213<h3>
214<a name="boost_xpressive.appendices.appendix_1__history.h9"></a>
215        <span class="phrase"><a name="boost_xpressive.appendices.appendix_1__history.version_0_0_1__november_16__2003"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_1__history.version_0_0_1__november_16__2003">Version
216        0.0.1, November 16, 2003</a>
217      </h3>
218<p>
219        Announcement of xpressive: <a href="http://lists.boost.org/Archives/boost/2003/11/56312.php" target="_top">http://lists.boost.org/Archives/boost/2003/11/56312.php</a>
220      </p>
221</div>
222<div class="section">
223<div class="titlepage"><div><div><h3 class="title">
224<a name="boost_xpressive.appendices.appendix_2__not_yet_implemented"></a><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_2__not_yet_implemented" title="Appendix 2: Not Yet Implemented">Appendix
225      2: Not Yet Implemented</a>
226</h3></div></div></div>
227<p>
228        The following features are planned for xpressive 2.X:
229      </p>
230<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
231<li class="listitem">
232            <code class="computeroutput"><span class="identifier">syntax_option_type</span><span class="special">::</span><span class="identifier">collate</span></code>
233          </li>
234<li class="listitem">
235            Collation sequences such as <code class="literal">[.a.]</code>
236          </li>
237<li class="listitem">
238            Equivalence classes like <code class="literal">[=a=]</code>
239          </li>
240<li class="listitem">
241            Control of nested results generation with <code class="computeroutput"><span class="identifier">syntax_option_type</span><span class="special">::</span><span class="identifier">nosubs</span></code>,
242            and a <code class="computeroutput"><span class="identifier">nosubs</span><span class="special">()</span></code>
243            modifier for static xpressive.
244          </li>
245</ul></div>
246<p>
247        Here are some wish-list features. You or your company should consider hiring
248        me to implement them!
249      </p>
250<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
251<li class="listitem">
252            Optimized DFA back-end for simple, fast regexing.
253          </li>
254<li class="listitem">
255            Different regex compiler front ends for basic, extended, awk, grep and
256            egrep regex syntax.
257          </li>
258<li class="listitem">
259            Fine-grained control over the dynamic regex syntax
260          </li>
261<li class="listitem">
262            Optional integration with ICU for full Unicode support.
263          </li>
264<li class="listitem">
265            Improved localization support, possibly as a custom facet for <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
266          </li>
267</ul></div>
268</div>
269<div class="section">
270<div class="titlepage"><div><div><h3 class="title">
271<a name="boost_xpressive.appendices.appendix_3__differences_from_boost_regex"></a><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_3__differences_from_boost_regex" title="Appendix 3: Differences from Boost.Regex">Appendix
272      3: Differences from Boost.Regex</a>
273</h3></div></div></div>
274<p>
275        Since many of xpressive's users are likely to be familiar with the <a href="../../../libs/regex" target="_top">Boost.Regex</a> library, I would be remiss if
276        I failed to point out some important differences between xpressive and <a href="../../../libs/regex" target="_top">Boost.Regex</a>. In particular:<br>
277      </p>
278<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
279<li class="listitem">
280            <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
281            is a template on the iterator type, not the character type.
282          </li>
283<li class="listitem">
284            <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
285            cannot be constructed directly from a string; rather, you must use <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">()</span></code>
286            or <code class="computeroutput"><span class="identifier">regex_compiler</span><span class="special">&lt;&gt;</span></code>
287            to build a regex object from a string.
288          </li>
289<li class="listitem">
290            <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
291            does not have an <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> member function; rather, the <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
292            member function is in the <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">regex_compiler</span><span class="special">&lt;&gt;</span></code> factory.
293          </li>
294<li class="listitem">
295            <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
296            has a subset of <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;&gt;</span></code>'s
297            members. <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
298            does not. The members lacking are: <code class="computeroutput"><span class="identifier">assign</span><span class="special">()</span></code>, <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]()</span></code>, <code class="computeroutput"><span class="identifier">max_size</span><span class="special">()</span></code>, <code class="computeroutput"><span class="identifier">begin</span><span class="special">()</span></code>, <code class="computeroutput"><span class="identifier">end</span><span class="special">()</span></code>, <code class="computeroutput"><span class="identifier">size</span><span class="special">()</span></code>, <code class="computeroutput"><span class="identifier">compare</span><span class="special">()</span></code>, and <code class="computeroutput"><span class="keyword">operator</span><span class="special">=(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;&gt;)</span></code>.
299          </li>
300<li class="listitem">
301            Other member functions that exist in <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code> but do not exist in <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
302            are: <code class="computeroutput"><span class="identifier">set_expression</span><span class="special">()</span></code>,
303            <code class="computeroutput"><span class="identifier">get_allocator</span><span class="special">()</span></code>,
304            <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>,
305            <code class="computeroutput"><span class="identifier">getloc</span><span class="special">()</span></code>,
306            <code class="computeroutput"><span class="identifier">getflags</span><span class="special">()</span></code>,
307            and <code class="computeroutput"><span class="identifier">str</span><span class="special">()</span></code>.
308          </li>
309<li class="listitem">
310            <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
311            does not have a RegexTraits template parameter. Customization of regex
312            syntax and localization behavior will be controlled by <code class="computeroutput"><span class="identifier">regex_compiler</span><span class="special">&lt;&gt;</span></code>
313            and a custom regex facet for <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
314          </li>
315<li class="listitem">
316            <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code>
317            and <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">match_results</span><span class="special">&lt;&gt;</span></code>
318            do not have an Allocator template parameter. This is by design.
319          </li>
320<li class="listitem">
321            <code class="computeroutput"><span class="identifier">match_not_dot_null</span></code> and
322            <code class="computeroutput"><span class="identifier">match_not_dot_newline</span></code>
323            have moved from the <code class="computeroutput"><span class="identifier">match_flag_type</span></code>
324            enum to the <code class="computeroutput"><span class="identifier">syntax_option_type</span></code>
325            enum, and they have changed names to <code class="computeroutput"><span class="identifier">not_dot_null</span></code>
326            and <code class="computeroutput"><span class="identifier">not_dot_newline</span></code>.
327          </li>
328<li class="listitem">
329            The following <code class="computeroutput"><span class="identifier">syntax_option_type</span></code>
330            enumeration values are not supported: <code class="computeroutput"><span class="identifier">escape_in_lists</span></code>,
331            <code class="computeroutput"><span class="identifier">char_classes</span></code>, <code class="computeroutput"><span class="identifier">intervals</span></code>, <code class="computeroutput"><span class="identifier">limited_ops</span></code>,
332            <code class="computeroutput"><span class="identifier">newline_alt</span></code>, <code class="computeroutput"><span class="identifier">bk_plus_qm</span></code>, <code class="computeroutput"><span class="identifier">bk_braces</span></code>,
333            <code class="computeroutput"><span class="identifier">bk_parens</span></code>, <code class="computeroutput"><span class="identifier">bk_refs</span></code>, <code class="computeroutput"><span class="identifier">bk_vbar</span></code>,
334            <code class="computeroutput"><span class="identifier">use_except</span></code>, <code class="computeroutput"><span class="identifier">failbit</span></code>, <code class="computeroutput"><span class="identifier">literal</span></code>,
335            <code class="computeroutput"><span class="identifier">perlex</span></code>, <code class="computeroutput"><span class="identifier">basic</span></code>, <code class="computeroutput"><span class="identifier">extended</span></code>,
336            <code class="computeroutput"><span class="identifier">emacs</span></code>, <code class="computeroutput"><span class="identifier">awk</span></code>, <code class="computeroutput"><span class="identifier">grep</span></code>
337            ,<code class="computeroutput"><span class="identifier">egrep</span></code>, <code class="computeroutput"><span class="identifier">sed</span></code>, <code class="computeroutput"><span class="identifier">JavaScript</span></code>,
338            <code class="computeroutput"><span class="identifier">JScript</span></code>.
339          </li>
340<li class="listitem">
341            The following <code class="computeroutput"><span class="identifier">match_flag_type</span></code>
342            enumeration values are not supported: <code class="computeroutput"><span class="identifier">match_not_bob</span></code>,
343            <code class="computeroutput"><span class="identifier">match_not_eob</span></code>, <code class="computeroutput"><span class="identifier">match_perl</span></code>, <code class="computeroutput"><span class="identifier">match_posix</span></code>,
344            and <code class="computeroutput"><span class="identifier">match_extra</span></code>.
345          </li>
346</ul></div>
347<p>
348        Also, in the current implementation, the regex algorithms in xpressive will
349        not detect pathological behavior and abort by throwing an exception. It is
350        up to you to write efficient patterns that do not behave pathologically.
351      </p>
352</div>
353<div class="section">
354<div class="titlepage"><div><div><h3 class="title">
355<a name="boost_xpressive.appendices.perf"></a><a class="link" href="appendices.html#boost_xpressive.appendices.perf" title="Appendix 4: Performance Comparison">Appendix 4: Performance
356      Comparison</a>
357</h3></div></div></div>
358<div class="toc"><dl class="toc">
359<dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.perf.perf_gcc">xpressive
360        vs. Boost.Regex with GCC (Cygwin)</a></span></dt>
361<dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.perf.perf_msvc">xpressive
362        vs. Boost.Regex with Visual C++</a></span></dt>
363</dl></div>
364<p>
365        The performance of xpressive is competitive with <a href="../../../libs/regex" target="_top">Boost.Regex</a>.
366        I have run performance benchmarks comparing static xpressive, dynamic xpressive
367        and <a href="../../../libs/regex" target="_top">Boost.Regex</a> on two platforms: gcc
368        (Cygwin) and Visual C++. The tests include short matches and long searches.
369        For both platforms, xpressive comes off well on short matches and roughly
370        on par with <a href="../../../libs/regex" target="_top">Boost.Regex</a> on long searches.
371      </p>
372<p>
373        &lt;disclaimer&gt; As with all benchmarks, the true test is how xpressive
374        performs with <span class="emphasis"><em>your</em></span> patterns, <span class="emphasis"><em>your</em></span>
375        input, and <span class="emphasis"><em>your</em></span> platform, so if performance matters
376        in your application, it's best to run your own tests. &lt;/disclaimer&gt;
377      </p>
378<div class="section">
379<div class="titlepage"><div><div><h4 class="title">
380<a name="boost_xpressive.appendices.perf.perf_gcc"></a><a class="link" href="appendices.html#boost_xpressive.appendices.perf.perf_gcc" title="xpressive vs. Boost.Regex with GCC (Cygwin)">xpressive
381        vs. Boost.Regex with GCC (Cygwin)</a>
382</h4></div></div></div>
383<p>
384          Below are the results of a performance comparison between:
385        </p>
386<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
387<li class="listitem">
388              static xpressive
389            </li>
390<li class="listitem">
391              dynamic xpressive
392            </li>
393<li class="listitem">
394              <a href="../../../libs/regex" target="_top">Boost.Regex</a>
395            </li>
396</ul></div>
397<div class="variablelist">
398<p class="title"><b>Test Specifications</b></p>
399<dl class="variablelist">
400<dt><span class="term">Hardware:</span></dt>
401<dd><p>
402                hyper-threaded 3GHz Xeon with 1Gb RAM
403              </p></dd>
404<dt><span class="term">Operating System:</span></dt>
405<dd><p>
406                Windows XP Pro + Cygwin
407              </p></dd>
408<dt><span class="term">Compiler:</span></dt>
409<dd><p>
410                GNU C++ version 3.4.4 (Cygwin special)
411              </p></dd>
412<dt><span class="term">C++ Standard Library:</span></dt>
413<dd><p>
414                GNU libstdc++ version 3.4.4
415              </p></dd>
416<dt><span class="term"><a href="../../../libs/regex" target="_top">Boost.Regex</a> Version:</span></dt>
417<dd><p>
418                1.33+, BOOST_REGEX_USE_CPP_LOCALE, BOOST_REGEX_RECURSIVE
419              </p></dd>
420<dt><span class="term">xpressive Version:</span></dt>
421<dd><p>
422                0.9.6a
423              </p></dd>
424</dl>
425</div>
426<h3>
427<a name="boost_xpressive.appendices.perf.perf_gcc.h0"></a>
428          <span class="phrase"><a name="boost_xpressive.appendices.perf.perf_gcc.comparison_1__short_matches"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.perf.perf_gcc.comparison_1__short_matches">Comparison
429          1: Short Matches</a>
430        </h3>
431<p>
432          The following tests evaluate the time taken to match the expression to
433          the input string. For each result, the top number has been normalized relative
434          to the fastest time, so 1.0 is as good as it gets. The bottom number (in
435          parentheses) is the actual time in seconds. The best time has been marked
436          in green.
437        </p>
438<div class="informaltable">
439<h5>
440<a name="id-1.3.47.7.5.4.7.1"></a><span class="table-title">Short Matches</span>
441</h5>
442<table class="table">
443<colgroup>
444<col>
445<col>
446<col>
447<col>
448<col>
449</colgroup>
450<thead><tr>
451<th>static xpressive</th>
452<th>dynamic xpressive</th>
453<th>Boost</th>
454<th>Text</th>
455<th>Expression</th>
456</tr></thead>
457<tbody>
458<tr>
459<td><span class="highlight">1<p></p>(8.79e‑07s)</span></td>
460<td><span class="highlight">1.08<p></p>(9.54e‑07s)</span></td>
461<td>2.51<p></p>(2.2e‑06s)</td>
462<td>100- this is a line of ftp response which contains a message string</td>
463<td><code class="literal">^([0-9]+)(\-| |$)(.*)$</code></td>
464</tr>
465<tr>
466<td><span class="highlight">1.06<p></p>(1.07e‑06s)</span></td>
467<td><span class="highlight">1<p></p>(1.01e‑06s)</span></td>
468<td>4.01<p></p>(4.06e‑06s)</td>
469<td>1234-5678-1234-456</td>
470<td><code class="literal">([[:digit:]]{4}[- ]){3}[[:digit:]]{3,4}</code></td>
471</tr>
472<tr>
473<td><span class="highlight">1<p></p>(1.4e‑06s)</span></td>
474<td>1.13<p></p>(1.58e‑06s)</td>
475<td>2.89<p></p>(4.05e‑06s)</td>
476<td>john_maddock@compuserve.com</td>
477<td><code class="literal">^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$</code></td>
478</tr>
479<tr>
480<td><span class="highlight">1<p></p>(1.28e‑06s)</span></td>
481<td>1.16<p></p>(1.49e‑06s)</td>
482<td>3.07<p></p>(3.94e‑06s)</td>
483<td>foo12@foo.edu</td>
484<td><code class="literal">^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$</code></td>
485</tr>
486<tr>
487<td><span class="highlight">1<p></p>(1.22e‑06s)</span></td>
488<td>1.2<p></p>(1.46e‑06s)</td>
489<td>3.22<p></p>(3.93e‑06s)</td>
490<td>bob.smith@foo.tv</td>
491<td><code class="literal">^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$</code></td>
492</tr>
493<tr>
494<td><span class="highlight">1.04<p></p>(8.64e‑07s)</span></td>
495<td><span class="highlight">1<p></p>(8.34e‑07s)</span></td>
496<td>2.5<p></p>(2.09e‑06s)</td>
497<td>EH10 2QQ</td>
498<td><code class="literal">^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$</code></td>
499</tr>
500<tr>
501<td>1.11<p></p>(9.09e‑07s)</td>
502<td><span class="highlight">1<p></p>(8.19e‑07s)</span></td>
503<td>2.47<p></p>(2.03e‑06s)</td>
504<td>G1 1AA</td>
505<td><code class="literal">^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$</code></td>
506</tr>
507<tr>
508<td>1.12<p></p>(9.38e‑07s)</td>
509<td><span class="highlight">1<p></p>(8.34e‑07s)</span></td>
510<td>2.5<p></p>(2.08e‑06s)</td>
511<td>SW1 1ZZ</td>
512<td><code class="literal">^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$</code></td>
513</tr>
514<tr>
515<td><span class="highlight">1<p></p>(7.9e‑07s)</span></td>
516<td><span class="highlight">1.06<p></p>(8.34e‑07s)</span></td>
517<td>2.49<p></p>(1.96e‑06s)</td>
518<td>4/1/2001</td>
519<td><code class="literal">^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$</code></td>
520</tr>
521<tr>
522<td><span class="highlight">1<p></p>(8.19e‑07s)</span></td>
523<td><span class="highlight">1.04<p></p>(8.49e‑07s)</span></td>
524<td>2.4<p></p>(1.97e‑06s)</td>
525<td>12/12/2001</td>
526<td><code class="literal">^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$</code></td>
527</tr>
528<tr>
529<td><span class="highlight">1.09<p></p>(8.95e‑07s)</span></td>
530<td><span class="highlight">1<p></p>(8.19e‑07s)</span></td>
531<td>2.4<p></p>(1.96e‑06s)</td>
532<td>123</td>
533<td><code class="literal">^[-+]?[[:digit:]]*\.?[[:digit:]]*$</code></td>
534</tr>
535<tr>
536<td>1.11<p></p>(8.79e‑07s)</td>
537<td><span class="highlight">1<p></p>(7.9e‑07s)</span></td>
538<td>2.57<p></p>(2.03e‑06s)</td>
539<td>+3.14159</td>
540<td><code class="literal">^[-+]?[[:digit:]]*\.?[[:digit:]]*$</code></td>
541</tr>
542<tr>
543<td><span class="highlight">1.09<p></p>(8.94e‑07s)</span></td>
544<td><span class="highlight">1<p></p>(8.19e‑07s)</span></td>
545<td>2.47<p></p>(2.03e‑06s)</td>
546<td>-3.14159</td>
547<td><code class="literal">^[-+]?[[:digit:]]*\.?[[:digit:]]*$</code></td>
548</tr>
549</tbody>
550</table>
551</div>
552<h3>
553<a name="boost_xpressive.appendices.perf.perf_gcc.h1"></a>
554          <span class="phrase"><a name="boost_xpressive.appendices.perf.perf_gcc.comparison_2__long_searches"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.perf.perf_gcc.comparison_2__long_searches">Comparison
555          2: Long Searches</a>
556        </h3>
557<p>
558          The next test measures the time to find <span class="emphasis"><em>all</em></span> matches
559          in a long English text. The text is the <a href="http://www.gutenberg.org/dirs/3/2/0/3200/3200.zip" target="_top">complete
560          works of Mark Twain</a>, from <a href="http://promo.net/pg/" target="_top">Project
561          Gutenberg</a>. The text is 19Mb long. As above, the top number is the
562          normalized time and the bottom number is the actual time. The best time
563          is in green.
564        </p>
565<div class="informaltable">
566<h5>
567<a name="id-1.3.47.7.5.4.10.1"></a><span class="table-title">Long Searches</span>
568</h5>
569<table class="table">
570<colgroup>
571<col>
572<col>
573<col>
574<col>
575</colgroup>
576<thead><tr>
577<th>static xpressive</th>
578<th>dynamic xpressive</th>
579<th>Boost</th>
580<th>Expression</th>
581</tr></thead>
582<tbody>
583<tr>
584<td><span class="highlight">1<p></p>(0.0263s)</span></td>
585<td><span class="highlight">1<p></p>(0.0263s)</span></td>
586<td>1.78<p></p>(0.0469s)</td>
587<td><code class="literal">Twain</code></td>
588</tr>
589<tr>
590<td><span class="highlight">1<p></p>(0.0234s)</span></td>
591<td><span class="highlight">1<p></p>(0.0234s)</span></td>
592<td>1.79<p></p>(0.042s)</td>
593<td><code class="literal">Huck[[:alpha:]]+</code></td>
594</tr>
595<tr>
596<td>1.84<p></p>(1.26s)</td>
597<td>2.21<p></p>(1.51s)</td>
598<td><span class="highlight">1<p></p>(0.687s)</span></td>
599<td><code class="literal">[[:alpha:]]+ing</code></td>
600</tr>
601<tr>
602<td><span class="highlight">1.09<p></p>(0.192s)</span></td>
603<td>2<p></p>(0.351s)</td>
604<td><span class="highlight">1<p></p>(0.176s)</span></td>
605<td><code class="literal">^[^
606]*?Twain</code></td>
607</tr>
608<tr>
609<td>1.41<p></p>(0.08s)</td>
610<td>1.21<p></p>(0.0684s)</td>
611<td><span class="highlight">1<p></p>(0.0566s)</span></td>
612<td><code class="literal">Tom|Sawyer|Huckleberry|Finn</code></td>
613</tr>
614<tr>
615<td>1.56<p></p>(0.195s)</td>
616<td>1.12<p></p>(0.141s)</td>
617<td><span class="highlight">1<p></p>(0.125s)</span></td>
618<td><code class="literal">(Tom|Sawyer|Huckleberry|Finn).{0,30}river|river.{0,30}(Tom|Sawyer|Huckleberry|Finn)</code></td>
619</tr>
620</tbody>
621</table>
622</div>
623</div>
624<div class="section">
625<div class="titlepage"><div><div><h4 class="title">
626<a name="boost_xpressive.appendices.perf.perf_msvc"></a><a class="link" href="appendices.html#boost_xpressive.appendices.perf.perf_msvc" title="xpressive vs. Boost.Regex with Visual C++">xpressive
627        vs. Boost.Regex with Visual C++</a>
628</h4></div></div></div>
629<p>
630          Below are the results of a performance comparison between:
631        </p>
632<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
633<li class="listitem">
634              static xpressive
635            </li>
636<li class="listitem">
637              dynamic xpressive
638            </li>
639<li class="listitem">
640              <a href="../../../libs/regex" target="_top">Boost.Regex</a>
641            </li>
642</ul></div>
643<div class="variablelist">
644<p class="title"><b>Test Specifications</b></p>
645<dl class="variablelist">
646<dt><span class="term">Hardware:</span></dt>
647<dd><p>
648                hyper-threaded 3GHz Xeon with 1Gb RAM
649              </p></dd>
650<dt><span class="term">Operating System:</span></dt>
651<dd><p>
652                Windows XP Pro
653              </p></dd>
654<dt><span class="term">Compiler:</span></dt>
655<dd><p>
656                Visual C++ .NET 2003 (7.1)
657              </p></dd>
658<dt><span class="term">C++ Standard Library:</span></dt>
659<dd><p>
660                Dinkumware, version 313
661              </p></dd>
662<dt><span class="term"><a href="../../../libs/regex" target="_top">Boost.Regex</a> Version:</span></dt>
663<dd><p>
664                1.33+, BOOST_REGEX_USE_CPP_LOCALE, BOOST_REGEX_RECURSIVE
665              </p></dd>
666<dt><span class="term">xpressive Version:</span></dt>
667<dd><p>
668                0.9.6a
669              </p></dd>
670</dl>
671</div>
672<h3>
673<a name="boost_xpressive.appendices.perf.perf_msvc.h0"></a>
674          <span class="phrase"><a name="boost_xpressive.appendices.perf.perf_msvc.comparison_1__short_matches"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.perf.perf_msvc.comparison_1__short_matches">Comparison
675          1: Short Matches</a>
676        </h3>
677<p>
678          The following tests evaluate the time taken to match the expression to
679          the input string. For each result, the top number has been normalized relative
680          to the fastest time, so 1.0 is as good as it gets. The bottom number (in
681          parentheses) is the actual time in seconds. The best time has been marked
682          in green.
683        </p>
684<div class="informaltable">
685<h5>
686<a name="id-1.3.47.7.5.5.7.1"></a><span class="table-title">Short Matches</span>
687</h5>
688<table class="table">
689<colgroup>
690<col>
691<col>
692<col>
693<col>
694<col>
695</colgroup>
696<thead><tr>
697<th>static xpressive</th>
698<th>dynamic xpressive</th>
699<th>Boost</th>
700<th>Text</th>
701<th>Expression</th>
702</tr></thead>
703<tbody>
704<tr>
705<td><span class="highlight">1<p></p>(3.2e‑007s)</span></td>
706<td>1.37<p></p>(4.4e‑007s)</td>
707<td>2.38<p></p>(7.6e‑007s)</td>
708<td>100- this is a line of ftp response which contains a message string</td>
709<td><code class="literal">^([0-9]+)(\-| |$)(.*)$</code></td>
710</tr>
711<tr>
712<td><span class="highlight">1<p></p>(6.4e‑007s)</span></td>
713<td>1.12<p></p>(7.15e‑007s)</td>
714<td>1.72<p></p>(1.1e‑006s)</td>
715<td>1234-5678-1234-456</td>
716<td><code class="literal">([[:digit:]]{4}[- ]){3}[[:digit:]]{3,4}</code></td>
717</tr>
718<tr>
719<td><span class="highlight">1<p></p>(9.82e‑007s)</span></td>
720<td>1.3<p></p>(1.28e‑006s)</td>
721<td>1.61<p></p>(1.58e‑006s)</td>
722<td>john_maddock@compuserve.com</td>
723<td><code class="literal">^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$</code></td>
724</tr>
725<tr>
726<td><span class="highlight">1<p></p>(8.94e‑007s)</span></td>
727<td>1.3<p></p>(1.16e‑006s)</td>
728<td>1.7<p></p>(1.52e‑006s)</td>
729<td>foo12@foo.edu</td>
730<td><code class="literal">^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$</code></td>
731</tr>
732<tr>
733<td><span class="highlight">1<p></p>(9.09e‑007s)</span></td>
734<td>1.28<p></p>(1.16e‑006s)</td>
735<td>1.67<p></p>(1.52e‑006s)</td>
736<td>bob.smith@foo.tv</td>
737<td><code class="literal">^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$</code></td>
738</tr>
739<tr>
740<td><span class="highlight">1<p></p>(3.06e‑007s)</span></td>
741<td><span class="highlight">1.07<p></p>(3.28e‑007s)</span></td>
742<td>1.95<p></p>(5.96e‑007s)</td>
743<td>EH10 2QQ</td>
744<td><code class="literal">^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$</code></td>
745</tr>
746<tr>
747<td><span class="highlight">1<p></p>(3.13e‑007s)</span></td>
748<td><span class="highlight">1.09<p></p>(3.42e‑007s)</span></td>
749<td>1.86<p></p>(5.81e‑007s)</td>
750<td>G1 1AA</td>
751<td><code class="literal">^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$</code></td>
752</tr>
753<tr>
754<td><span class="highlight">1<p></p>(3.2e‑007s)</span></td>
755<td><span class="highlight">1.09<p></p>(3.5e‑007s)</span></td>
756<td>1.86<p></p>(5.96e‑007s)</td>
757<td>SW1 1ZZ</td>
758<td><code class="literal">^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$</code></td>
759</tr>
760<tr>
761<td><span class="highlight">1<p></p>(2.68e‑007s)</span></td>
762<td>1.22<p></p>(3.28e‑007s)</td>
763<td>2<p></p>(5.36e‑007s)</td>
764<td>4/1/2001</td>
765<td><code class="literal">^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$</code></td>
766</tr>
767<tr>
768<td><span class="highlight">1<p></p>(2.76e‑007s)</span></td>
769<td>1.16<p></p>(3.2e‑007s)</td>
770<td>1.94<p></p>(5.36e‑007s)</td>
771<td>12/12/2001</td>
772<td><code class="literal">^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$</code></td>
773</tr>
774<tr>
775<td><span class="highlight">1<p></p>(2.98e‑007s)</span></td>
776<td><span class="highlight">1.03<p></p>(3.06e‑007s)</span></td>
777<td>1.85<p></p>(5.51e‑007s)</td>
778<td>123</td>
779<td><code class="literal">^[-+]?[[:digit:]]*\.?[[:digit:]]*$</code></td>
780</tr>
781<tr>
782<td><span class="highlight">1<p></p>(3.2e‑007s)</span></td>
783<td>1.12<p></p>(3.58e‑007s)</td>
784<td>1.81<p></p>(5.81e‑007s)</td>
785<td>+3.14159</td>
786<td><code class="literal">^[-+]?[[:digit:]]*\.?[[:digit:]]*$</code></td>
787</tr>
788<tr>
789<td><span class="highlight">1<p></p>(3.28e‑007s)</span></td>
790<td>1.11<p></p>(3.65e‑007s)</td>
791<td>1.77<p></p>(5.81e‑007s)</td>
792<td>-3.14159</td>
793<td><code class="literal">^[-+]?[[:digit:]]*\.?[[:digit:]]*$</code></td>
794</tr>
795</tbody>
796</table>
797</div>
798<h3>
799<a name="boost_xpressive.appendices.perf.perf_msvc.h1"></a>
800          <span class="phrase"><a name="boost_xpressive.appendices.perf.perf_msvc.comparison_2__long_searches"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.perf.perf_msvc.comparison_2__long_searches">Comparison
801          2: Long Searches</a>
802        </h3>
803<p>
804          The next test measures the time to find <span class="emphasis"><em>all</em></span> matches
805          in a long English text. The text is the <a href="http://www.gutenberg.org/dirs/3/2/0/3200/3200.zip" target="_top">complete
806          works of Mark Twain</a>, from <a href="http://promo.net/pg/" target="_top">Project
807          Gutenberg</a>. The text is 19Mb long. As above, the top number is the
808          normalized time and the bottom number is the actual time. The best time
809          is in green.
810        </p>
811<div class="informaltable">
812<h5>
813<a name="id-1.3.47.7.5.5.10.1"></a><span class="table-title">Long Searches</span>
814</h5>
815<table class="table">
816<colgroup>
817<col>
818<col>
819<col>
820<col>
821</colgroup>
822<thead><tr>
823<th>static xpressive</th>
824<th>dynamic xpressive</th>
825<th>Boost</th>
826<th>Expression</th>
827</tr></thead>
828<tbody>
829<tr>
830<td><span class="highlight">1<p></p>(0.019s)</span></td>
831<td><span class="highlight">1<p></p>(0.019s)</span></td>
832<td>2.98<p></p>(0.0566s)</td>
833<td><code class="literal">Twain</code></td>
834</tr>
835<tr>
836<td><span class="highlight">1<p></p>(0.0176s)</span></td>
837<td><span class="highlight">1<p></p>(0.0176s)</span></td>
838<td>3.17<p></p>(0.0556s)</td>
839<td><code class="literal">Huck[[:alpha:]]+</code></td>
840</tr>
841<tr>
842<td>3.62<p></p>(1.78s)</td>
843<td>3.97<p></p>(1.95s)</td>
844<td><span class="highlight">1<p></p>(0.492s)</span></td>
845<td><code class="literal">[[:alpha:]]+ing</code></td>
846</tr>
847<tr>
848<td>2.32<p></p>(0.344s)</td>
849<td>3.06<p></p>(0.453s)</td>
850<td><span class="highlight">1<p></p>(0.148s)</span></td>
851<td><code class="literal">^[^
852]*?Twain</code></td>
853</tr>
854<tr>
855<td><span class="highlight">1<p></p>(0.0576s)</span></td>
856<td><span class="highlight">1.05<p></p>(0.0606s)</span></td>
857<td>1.15<p></p>(0.0664s)</td>
858<td><code class="literal">Tom|Sawyer|Huckleberry|Finn</code></td>
859</tr>
860<tr>
861<td>1.24<p></p>(0.164s)</td>
862<td>1.44<p></p>(0.191s)</td>
863<td><span class="highlight">1<p></p>(0.133s)</span></td>
864<td><code class="literal">(Tom|Sawyer|Huckleberry|Finn).{0,30}river|river.{0,30}(Tom|Sawyer|Huckleberry|Finn)</code></td>
865</tr>
866</tbody>
867</table>
868</div>
869</div>
870</div>
871<div class="section">
872<div class="titlepage"><div><div><h3 class="title">
873<a name="xpressive.appendices.appendix_5__implementation_notes"></a><a class="link" href="appendices.html#xpressive.appendices.appendix_5__implementation_notes" title="Appendix 5: Implementation Notes">Appendix
874      5: Implementation Notes</a>
875</h3></div></div></div>
876<div class="toc"><dl class="toc"><dt><span class="section"><a href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___">Cycle
877        collection with <code class="literal">tracking_ptr&lt;&gt;</code></a></span></dt></dl></div>
878<div class="section">
879<div class="titlepage"><div><div><h4 class="title">
880<a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___"></a><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___" title="Cycle collection with tracking_ptr&lt;&gt;">Cycle
881        collection with <code class="literal">tracking_ptr&lt;&gt;</code></a>
882</h4></div></div></div>
883<p>
884          In xpressive, regex objects can refer to each other and themselves by value
885          or by reference. In addition, they ref-count their referenced regexes to
886          keep them alive. This creates the possibility for cyclic reference counts,
887          and raises the possibility of memory leaks. xpressive avoids leaks by using
888          a type called <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code>. This doc describes at a high level
889          how <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code>
890          works.
891        </p>
892<h3>
893<a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.h0"></a>
894          <span class="phrase"><a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.constraints"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.constraints">Constraints</a>
895        </h3>
896<p>
897          Our solution must meet the following design constraints:
898        </p>
899<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
900<li class="listitem">
901              No dangling references: All objects referred to directly or indirectly
902              must be kept alive as long as the references are needed.
903            </li>
904<li class="listitem">
905              No leaks: all objects must be freed eventually.
906            </li>
907<li class="listitem">
908              No user intervention: The solution must not require users to explicitly
909              invoke some cycle collection routine.
910            </li>
911<li class="listitem">
912              Clean-up is no-throw: The collection phase will likely be called from
913              a destructor, so it must never throw an exception under any circumstance.
914            </li>
915</ul></div>
916<h3>
917<a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.h1"></a>
918          <span class="phrase"><a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.handle_body_idiom"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.handle_body_idiom">Handle-Body
919          Idiom</a>
920        </h3>
921<p>
922          To use <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code>,
923          you must separate your type into a handle and a body. In the case of xpressive,
924          the handle type is called <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special">&lt;&gt;</span></code> and the body is called <code class="computeroutput"><span class="identifier">regex_impl</span><span class="special">&lt;&gt;</span></code>.
925          The handle will store a <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code> to the body.
926        </p>
927<p>
928          The body type must inherit from <code class="computeroutput"><span class="identifier">enable_reference_tracking</span><span class="special">&lt;&gt;</span></code>. This gives the body the bookkeeping
929          data structures that <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code> will use. In particular, it gives
930          the body:
931        </p>
932<div class="orderedlist"><ol class="orderedlist" type="1">
933<li class="listitem">
934              <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">set</span><span class="special">&lt;</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="identifier">body</span><span class="special">&gt;</span>
935              <span class="special">&gt;</span> <span class="identifier">refs_</span></code>
936              : collection of bodies to which this body refers, and
937            </li>
938<li class="listitem">
939              <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">set</span><span class="special">&lt;</span><span class="identifier">weak_ptr</span><span class="special">&lt;</span><span class="identifier">body</span><span class="special">&gt;</span>
940              <span class="special">&gt;</span> <span class="identifier">deps_</span></code>
941              : collection of bodies which refer to this body.
942            </li>
943</ol></div>
944<h3>
945<a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.h2"></a>
946          <span class="phrase"><a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.references_and_dependencies"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.references_and_dependencies">References
947          and Dependencies</a>
948        </h3>
949<p>
950          We refer to (1) above as the "references" and (2) as the "dependencies".
951          It is crucial to the understanding of <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code> to recognize that the set of references
952          includes both those objects that are referred to directly as well as those
953          that are referred to indirectly (that is, through another reference). The
954          same is true for the set of dependencies. In other words, each body holds
955          a ref-count directly to every other body that it needs.
956        </p>
957<p>
958          Why is this important? Because it means that when a body no longer has
959          a handle referring to it, all its references can be released immediately
960          without fear of creating dangling references.
961        </p>
962<p>
963          References and dependencies cross-pollinate. Here's how it works:
964        </p>
965<div class="orderedlist"><ol class="orderedlist" type="1">
966<li class="listitem">
967              When one object acquires another as a reference, the second object
968              acquires the first as a dependency.
969            </li>
970<li class="listitem">
971              In addition, the first object acquires all of the second object's references,
972              and the second object acquires all of the first object's dependencies.
973            </li>
974<li class="listitem">
975              When an object picks up a new reference, the reference is also added
976              to all dependent objects.
977            </li>
978<li class="listitem">
979              When an object picks up a new dependency, the dependency is also added
980              to all referenced objects.
981            </li>
982<li class="listitem">
983              An object is never allowed to have itself as a dependency. Objects
984              may have themselves as references, and often do.
985            </li>
986</ol></div>
987<p>
988          Consider the following code:
989        </p>
990<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">expr</span><span class="special">;</span>
991<span class="special">{</span>
992    <span class="identifier">sregex</span> <span class="identifier">group</span>  <span class="special">=</span> <span class="char">'('</span> <span class="special">&gt;&gt;</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">expr</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>                 <span class="comment">// (1)</span>
993    <span class="identifier">sregex</span> <span class="identifier">fact</span>   <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">group</span><span class="special">;</span>                                <span class="comment">// (2)</span>
994    <span class="identifier">sregex</span> <span class="identifier">term</span>   <span class="special">=</span> <span class="identifier">fact</span> <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'*'</span> <span class="special">&gt;&gt;</span> <span class="identifier">fact</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'/'</span> <span class="special">&gt;&gt;</span> <span class="identifier">fact</span><span class="special">));</span>   <span class="comment">// (3)</span>
995    <span class="identifier">expr</span>          <span class="special">=</span> <span class="identifier">term</span> <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'+'</span> <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'-'</span> <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">));</span>   <span class="comment">// (4)</span>
996<span class="special">}</span>                                                               <span class="comment">// (5)</span>
997</pre>
998<p>
999          Here is how the references and dependencies propagate, line by line:
1000        </p>
1001<div class="informaltable"><table class="table">
1002<colgroup>
1003<col>
1004<col>
1005</colgroup>
1006<thead><tr>
1007<th>
1008                  <p>
1009                    Expression
1010                  </p>
1011                </th>
1012<th>
1013                  <p>
1014                    Effects
1015                  </p>
1016                </th>
1017</tr></thead>
1018<tbody>
1019<tr>
1020<td>
1021                  <p>
1022                    1) <code class="computeroutput"><span class="identifier">sregex</span> <span class="identifier">group</span>
1023                    <span class="special">=</span> <span class="char">'('</span>
1024                    <span class="special">&gt;&gt;</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">expr</span><span class="special">)</span> <span class="special">&gt;&gt;</span>
1025                    <span class="char">')'</span><span class="special">;</span></code>
1026                  </p>
1027                </td>
1028<td>
1029                  <p>
1030                    <code class="literal">group: cnt=1 refs={expr} deps={}<br> expr: cnt=2 refs={}
1031                    deps={group}</code>
1032                  </p>
1033                </td>
1034</tr>
1035<tr>
1036<td>
1037                  <p>
1038                    2) <code class="computeroutput"><span class="identifier">sregex</span> <span class="identifier">fact</span>
1039                    <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">|</span>
1040                    <span class="identifier">group</span><span class="special">;</span></code>
1041                  </p>
1042                </td>
1043<td>
1044                  <p>
1045                    <code class="literal">group: cnt=2 refs={expr} deps={fact}<br> expr: cnt=3
1046                    refs={} deps={group,fact}<br> fact: cnt=1 refs={expr,group}
1047                    deps={}</code>
1048                  </p>
1049                </td>
1050</tr>
1051<tr>
1052<td>
1053                  <p>
1054                    3) <code class="computeroutput"><span class="identifier">sregex</span> <span class="identifier">term</span>
1055                    <span class="special">=</span> <span class="identifier">fact</span>
1056                    <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'*'</span> <span class="special">&gt;&gt;</span>
1057                    <span class="identifier">fact</span><span class="special">)</span>
1058                    <span class="special">|</span> <span class="special">(</span><span class="char">'/'</span> <span class="special">&gt;&gt;</span>
1059                    <span class="identifier">fact</span><span class="special">));</span></code>
1060                  </p>
1061                </td>
1062<td>
1063                  <p>
1064                    <code class="literal">group: cnt=3 refs={expr} deps={fact,term}<br> expr:
1065                    cnt=4 refs={} deps={group,fact,term}<br> fact: cnt=2 refs={expr,group}
1066                    deps={term}<br> term: cnt=1 refs={expr,group,fact} deps={}</code>
1067                  </p>
1068                </td>
1069</tr>
1070<tr>
1071<td>
1072                  <p>
1073                    4) <code class="computeroutput"><span class="identifier">expr</span> <span class="special">=</span>
1074                    <span class="identifier">term</span> <span class="special">&gt;&gt;</span>
1075                    <span class="special">*((</span><span class="char">'+'</span>
1076                    <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">)</span> <span class="special">|</span>
1077                    <span class="special">(</span><span class="char">'-'</span>
1078                    <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">));</span></code>
1079                  </p>
1080                </td>
1081<td>
1082                  <p>
1083                    <code class="literal">group: cnt=5 refs={expr,group,fact,term} deps={expr,fact,term}<br>
1084                    expr: cnt=5 refs={expr,group,fact,term} deps={group,fact,term}<br>
1085                    fact: cnt=5 refs={expr,group,fact,term} deps={expr,group,term}<br>
1086                    term: cnt=5 refs={expr,group,fact,term} deps={expr,group,fact}</code>
1087                  </p>
1088                </td>
1089</tr>
1090<tr>
1091<td>
1092                  <p>
1093                    5) <code class="computeroutput"><span class="special">}</span></code>
1094                  </p>
1095                </td>
1096<td>
1097                  <p>
1098                    <code class="literal">expr: cnt=2 refs={expr,group,fact,term} deps={group,fact,term}</code>
1099                  </p>
1100                </td>
1101</tr>
1102</tbody>
1103</table></div>
1104<p>
1105          This shows how references and dependencies propagate when creating cycles
1106          of objects. After line (4), which closes the cycle, every object has a
1107          ref-count on every other object, even to itself. So how does this not leak?
1108          Read on.
1109        </p>
1110<h3>
1111<a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.h3"></a>
1112          <span class="phrase"><a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.cycle_breaking"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.cycle_breaking">Cycle
1113          Breaking</a>
1114        </h3>
1115<p>
1116          Now that the bodies have their sets of references and dependencies, the
1117          hard part is done. All that remains is to decide when and where to break
1118          the cycle. That is the job of <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code>, which is part of the handle. The
1119          <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code>
1120          holds 2 <code class="computeroutput"><span class="identifier">shared_ptr</span></code>s. The
1121          first, obviously, is the <code class="computeroutput"><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="identifier">body</span><span class="special">&gt;</span></code> -- the reference to the body to which
1122          this handle refers. The other <code class="computeroutput"><span class="identifier">shared_ptr</span></code>
1123          is used to break the cycle. It ensures that when all the handles to a body
1124          go out of scope, the body's set of references is cleared.
1125        </p>
1126<p>
1127          This suggests that more than one handle can refer to a body. In fact,
1128          <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code>
1129          gives you copy-on-write semantics -- when you copy a handle, the body is
1130          shared. That makes copies very efficient. Eventually, all the handles to
1131          a particular body go out of scope. When that happens, the ref count to
1132          the body might still be greater than 0 because some other body (or this
1133          body itself!) might be holding a reference to it. However, we are certain
1134          that the cycle-breaker's ref-count goes to 0 because the cycle-breaker
1135          only lives in handles. No more handles, no more cycle-breakers.
1136        </p>
1137<p>
1138          What does the cycle-breaker do? Recall that the body has a set of references
1139          of type <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">set</span><span class="special">&lt;</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="identifier">body</span><span class="special">&gt;</span> <span class="special">&gt;</span></code>. Let's call this type "references_type".
1140          The cycle-breaker is a <code class="computeroutput"><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="identifier">references_type</span><span class="special">&gt;</span></code>. It uses a custom deleter, which is
1141          defined as follows:
1142        </p>
1143<pre class="programlisting"><span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">DerivedT</span><span class="special">&gt;</span>
1144<span class="keyword">struct</span> <span class="identifier">reference_deleter</span>
1145<span class="special">{</span>
1146    <span class="keyword">void</span> <span class="keyword">operator</span> <span class="special">()(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">set</span><span class="special">&lt;</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="identifier">DerivedT</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="special">*</span><span class="identifier">refs</span><span class="special">)</span> <span class="keyword">const</span>
1147    <span class="special">{</span>
1148        <span class="identifier">refs</span><span class="special">-&gt;</span><span class="identifier">clear</span><span class="special">();</span>
1149    <span class="special">}</span>
1150<span class="special">};</span>
1151</pre>
1152<p>
1153          The job of to the cycle breaker is to ensure that when the last handle
1154          to a body goes away, the body's set of references is cleared. That's it.
1155        </p>
1156<p>
1157          We can clearly see how this guarantees that all bodies are cleaned up eventually.
1158          Once every handle has gone out of scope, all the bodies' sets of references
1159          will be cleared, leaving none with a non-zero ref-count. No leaks, guaranteed.
1160        </p>
1161<p>
1162          It's a bit harder to see how this guarantees no dangling references. Imagine
1163          that there are 3 bodies: A, B and C. A refers to B which refers to C. Now
1164          all the handles to B go out of scope, so B's set of references is cleared.
1165          Doesn't this mean that C gets deleted, even though it is being used (indirectly)
1166          by A? It doesn't. This situation can never occur because we propagated
1167          the references and dependencies above such that A will be holding a reference
1168          directly to C in addition to B. When B's set of references is cleared,
1169          no bodies get deleted, because they are all still in use by A.
1170        </p>
1171<h3>
1172<a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.h4"></a>
1173          <span class="phrase"><a name="boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.future_work"></a></span><a class="link" href="appendices.html#boost_xpressive.appendices.appendix_5__implementation_notes.cycle_collection_with___tracking_ptr___.future_work">Future
1174          Work</a>
1175        </h3>
1176<p>
1177          All these <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">set</span></code>s and <code class="computeroutput"><span class="identifier">shared_ptr</span></code>s
1178          and <code class="computeroutput"><span class="identifier">weak_ptr</span></code>s! Very inefficient.
1179          I used them because they were handy. I could probably do better.
1180        </p>
1181<p>
1182          Also, some objects stick around longer than they need to. Consider:
1183        </p>
1184<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">b</span><span class="special">;</span>
1185<span class="special">{</span>
1186    <span class="identifier">sregex</span> <span class="identifier">a</span> <span class="special">=</span> <span class="identifier">_</span><span class="special">;</span>
1187    <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">a</span><span class="special">);</span>
1188    <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">_</span><span class="special">;</span>
1189<span class="special">}</span>
1190<span class="comment">// a is still alive here!</span>
1191</pre>
1192<p>
1193          Due to the way references and dependencies are propagated, the <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">set</span></code> of references can only grow. It never
1194          shrinks, even when some references are no longer needed. For xpressive
1195          this isn't an issue. The graphs of referential objects generally stay small
1196          and isolated. If someone were to try to use <code class="computeroutput"><span class="identifier">tracking_ptr</span><span class="special">&lt;&gt;</span></code> as a general ref-count-cycle-collection
1197          mechanism, this problem would have to be addressed.
1198        </p>
1199</div>
1200</div>
1201</div>
1202<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
1203<td align="left"></td>
1204<td align="right"><div class="copyright-footer">Copyright © 2007 Eric Niebler<p>
1205        Distributed under the Boost Software License, Version 1.0. (See accompanying
1206        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
1207      </p>
1208</div></td>
1209</tr></table>
1210<hr>
1211<div class="spirit-nav">
1212<a accesskey="p" href="../boost_xpressive/acknowledgments.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../yap.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
1213</div>
1214</body>
1215</html>
1216