<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>User's Guide</title>
<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
<link rel="up" href="../xpressive.html" title="Chapter 46. Boost.Xpressive">
<link rel="prev" href="../xpressive.html" title="Chapter 46. Boost.Xpressive">
<link rel="next" href="reference.html" title="Reference">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td>
<td align="center"><a href="../../../index.html">Home</a></td>
<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../xpressive.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="reference.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="xpressive.user_s_guide"></a><a class="link" href="user_s_guide.html" title="User's Guide">User's Guide</a>
</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.introduction">Introduction</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive">Installing
      xpressive</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start">Quick Start</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object">Creating
      a Regex Object</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching">Matching
      and Searching</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results">Accessing
      Results</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions">String
      Substitutions</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization">String
      Splitting and Tokenization</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures">Named Captures</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches">Grammars
      and Nested Matches</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions">Semantic
      Actions and User-Defined Assertions</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes">Symbol
      Tables and Attributes</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits">Localization
      and
Regex Traits</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks">Tips 'N Tricks</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.concepts">Concepts</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.examples">Examples</a></span></dt>
</dl></div>
<p>
    This section describes how to use xpressive to accomplish text manipulation
    and parsing tasks. If you are looking for detailed information regarding specific
    components in xpressive, check the <a class="link" href="reference.html" title="Reference">Reference</a>
    section.
  </p>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.introduction"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction" title="Introduction">Introduction</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.introduction.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.introduction.what_is_xpressive_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.what_is_xpressive_">What
      is xpressive?</a>
    </h3>
<p>
      xpressive is a regular expression template library. Regular expressions (regexes)
      can be written as strings that are parsed dynamically at runtime (dynamic
      regexes), or as <span class="emphasis"><em>expression templates</em></span><a href="#ftn.boost_xpressive.user_s_guide.introduction.f0" class="footnote" name="boost_xpressive.user_s_guide.introduction.f0"><sup class="footnote">[36]</sup></a> that are parsed at compile-time (static regexes). Dynamic regexes
      have the advantage that they can be accepted from the user as input at runtime
      or read from an initialization file. Static regexes have several advantages.
      Since they are C++ expressions instead of strings, they can be syntax-checked
      at compile-time. Also, they can naturally refer to code and data elsewhere
      in your program, giving you the ability to call back into your code from
      within a regex match. Finally, since they are statically bound, the compiler
      can generate faster code for static regexes.
    </p>
<p>
      xpressive's dual nature is unique and powerful. Static xpressive is a bit
      like the <a href="http://spirit.sourceforge.net" target="_top">Spirit Parser Framework</a>.
      As with <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>, you can build
      grammars with static regexes using expression templates. (Unlike <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>,
      xpressive does exhaustive backtracking, trying every possibility to find
      a match for your pattern.) Dynamic xpressive is a bit like <a href="../../../libs/regex" target="_top">Boost.Regex</a>.
      In fact, xpressive's interface should be familiar to anyone who has used
      <a href="../../../libs/regex" target="_top">Boost.Regex</a>. xpressive's innovation
      comes from allowing you to mix and match static and dynamic regexes in the
      same program, and even in the same expression! You can embed a dynamic regex
      in a static regex, or <span class="emphasis"><em>vice versa</em></span>, and the embedded regex
      will participate fully in the search, back-tracking as needed to make the
      match succeed.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.introduction.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.introduction.hello__world_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.hello__world_">Hello,
      world!</a>
    </h3>
<p>
      Enough theory.
      Let's have a look at <span class="emphasis"><em>Hello World</em></span>, xpressive
      style:
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program outputs the following:
    </p>
<pre class="programlisting">hello world!
hello
world
</pre>
<p>
      The first thing you'll notice about the code is that all the types in xpressive
      live in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace.
    </p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
        Most of the rest of the examples in this document will leave off the <code class="computeroutput"><span class="keyword">using</span> <span class="keyword">namespace</span>
        <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span></code>
        directive. Just pretend it's there.
      </p></td></tr>
</table></div>
<p>
      Next, you'll notice the type of the regular expression object is <code class="computeroutput"><span class="identifier">sregex</span></code>. If you are familiar with <a href="../../../libs/regex" target="_top">Boost.Regex</a>, this is different from what you
      are used to. The "<code class="computeroutput"><span class="identifier">s</span></code>"
      in "<code class="computeroutput"><span class="identifier">sregex</span></code>" stands
      for "<code class="computeroutput"><span class="identifier">string</span></code>", indicating
      that this regex can be used to find patterns in <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
      objects. I'll discuss this difference and its implications in detail later.
    </p>
<p>
      Notice how the regex object is initialized:
    </p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
</pre>
<p>
      To create a regular expression object from a string, you must call a factory
      method such as <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>.
      This is another area in which xpressive differs from other object-oriented
      regular expression libraries. Other libraries encourage you to think of a
      regular expression as a kind of string on steroids. In xpressive, regular
      expressions are not strings; they are little programs in a domain-specific
      language. Strings are only one <span class="emphasis"><em>representation</em></span> of that
      language. Another representation is an expression template.
      For example,
      the above line of code is equivalent to the following:
    </p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">' '</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">'!'</span><span class="special">;</span>
</pre>
<p>
      This describes the same regular expression, except it uses the domain-specific
      embedded language defined by static xpressive.
    </p>
<p>
      As you can see, static regexes have a syntax that is noticeably different
      from standard Perl syntax. That is because we are constrained by C++'s syntax.
      The biggest difference is the use of <code class="computeroutput"><span class="special">>></span></code>
      to mean "followed by". For instance, in Perl you can just put sub-expressions
      next to each other:
    </p>
<pre class="programlisting"><span class="identifier">abc</span>
</pre>
<p>
      But in C++, there must be an operator separating sub-expressions:
    </p>
<pre class="programlisting"><span class="identifier">a</span> <span class="special">>></span> <span class="identifier">b</span> <span class="special">>></span> <span class="identifier">c</span>
</pre>
<p>
      In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
      special meaning. They group, but as a side-effect they also create back-references
      like <code class="literal">$1</code> and <code class="literal">$2</code>.
      In C++, there is no
      way to overload parentheses to give them side-effects. To get the same effect,
      we use the special <code class="computeroutput"><span class="identifier">s1</span></code>, <code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assign to one to create
      a back-reference (known as a sub-match in xpressive).
    </p>
<p>
      You'll also notice that the one-or-more repetition operator <code class="computeroutput"><span class="special">+</span></code> has moved from postfix to prefix position.
      That's because C++ doesn't have a postfix <code class="computeroutput"><span class="special">+</span></code>
      operator. So:
    </p>
<pre class="programlisting"><span class="string">"\\w+"</span>
</pre>
<p>
      is the same as:
    </p>
<pre class="programlisting"><span class="special">+</span><span class="identifier">_w</span>
</pre>
<p>
      We'll cover all the other differences <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">later</a>.
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.installing_xpressive"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive" title="Installing xpressive">Installing
      xpressive</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.installing_xpressive.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive">Getting
      xpressive</a>
    </h3>
<p>
      There are two ways to get xpressive. The first and simplest is to download
      the latest version of Boost.
      Just go to <a href="http://sf.net/projects/boost" target="_top">http://sf.net/projects/boost</a>
      and follow the <span class="quote">“<span class="quote">Download</span>”</span> link.
    </p>
<p>
      The second way is by directly accessing the Boost Subversion repository.
      Just go to <a href="http://svn.boost.org/trac/boost/" target="_top">http://svn.boost.org/trac/boost/</a>
      and follow the instructions there for anonymous Subversion access. The version
      in Boost Subversion is unstable.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.installing_xpressive.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive">Building
      with xpressive</a>
    </h3>
<p>
      Xpressive is a header-only template library, which means you don't need to
      alter your build scripts or link to any separate lib file to use it. All
      you need to do is <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>.
      If you are only using static regexes, you can improve compile times by only
      including <code class="computeroutput"><span class="identifier">xpressive_static</span><span class="special">.</span><span class="identifier">hpp</span></code>. Likewise,
      you can include <code class="computeroutput"><span class="identifier">xpressive_dynamic</span><span class="special">.</span><span class="identifier">hpp</span></code> if
      you only plan on using dynamic regexes.
    </p>
<p>
      If you would also like to use semantic actions or custom assertions with
      your static regexes, you will need to additionally include <code class="computeroutput"><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span></code>.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.installing_xpressive.h2"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.requirements">Requirements</a>
    </h3>
<p>
      Xpressive requires Boost version 1.34.1 or higher.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.installing_xpressive.h3"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.supported_compilers"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.supported_compilers">Supported
      Compilers</a>
    </h3>
<p>
      Currently, Boost.Xpressive is known to work on the following compilers:
    </p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
          Visual C++ 7.1 and higher
        </li>
<li class="listitem">
          GNU C++ 3.4 and higher
        </li>
<li class="listitem">
          Intel for Linux 8.1 and higher
        </li>
<li class="listitem">
          Intel for Windows 10 and higher
        </li>
<li class="listitem">
          tru64cxx 71 and higher
        </li>
<li class="listitem">
          MinGW 3.4 and higher
        </li>
<li class="listitem">
          HP C/aC++ A.06.14
        </li>
</ul></div>
<p>
      Check the latest test results at Boost's <a href="http://beta.boost.org/development/tests/trunk/developer/xpressive.html" target="_top">Regression
      Results Page</a>.
    </p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
        Please send any questions, comments and bug reports to eric <at>
        boost-consulting <dot> com.
      </p></td></tr>
</table></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.quick_start"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start" title="Quick Start">Quick Start</a>
</h3></div></div></div>
<p>
      You don't need to know much to start being productive with xpressive. Let's
      begin with the nickel tour of the types and algorithms xpressive provides.
    </p>
<div class="table">
<a name="boost_xpressive.user_s_guide.quick_start.t0"></a><p class="title"><b>Table 46.1. xpressive's Tool-Box</b></p>
<div class="table-contents"><table class="table" summary="xpressive's Tool-Box">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
              <p>
                Tool
              </p>
            </th>
<th>
              <p>
                Description
              </p>
            </th>
</tr></thead>
<tbody>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
              </p>
            </td>
<td>
              <p>
                Contains a compiled regular expression. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
                is the most important type in xpressive.
                Everything you do with
                xpressive will begin with creating an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
              </p>
            </td>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
                contains the results of a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
                or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
                operation. It acts like a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
                objects. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
                object contains a marked sub-expression (also known as a back-reference
                in Perl). It is basically just a pair of iterators representing
                the begin and end of the marked sub-expression.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
              </p>
            </td>
<td>
              <p>
                Checks to see if a string matches a regex. For <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
                to succeed, the <span class="emphasis"><em>whole string</em></span> must match the
                regex, from beginning to end. If you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
                a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
                it will write into it any marked sub-expressions it finds.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
              </p>
            </td>
<td>
              <p>
                Searches a string to find a sub-string that matches the regex.
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
                will try to find a match at every position in the string, starting
                at the beginning, and stopping when it finds a match or when the
                string is exhausted.
                As with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
                if you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
                a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
                it will write into it any marked sub-expressions it finds.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
              </p>
            </td>
<td>
              <p>
                Given an input string, a regex, and a substitution string, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
                builds a new string by replacing those parts of the input string
                that match the regex with the substitution string. The substitution
                string can contain references to marked sub-expressions.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
              </p>
            </td>
<td>
              <p>
                An STL-compatible iterator that makes it easy to find all the places
                in a string that match a regex.
                Dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
                returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>.
                Incrementing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
                finds the next match.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
              </p>
            </td>
<td>
              <p>
                Like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>,
                except dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
                returns a string. By default, it will return the whole sub-string
                that the regex matched, but it can be configured to return any
                or all of the marked sub-expressions one at a time, or even the
                parts of the string that <span class="emphasis"><em>didn't</em></span> match the
                regex.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
              </p>
            </td>
<td>
              <p>
                A factory for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
                objects. It "compiles" a string into a regular expression.
                You will not usually have to deal directly with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
                because the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
                class has a factory method that uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
                internally. But if you need to do anything fancy like create a
                <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
                object with a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
                you will need to use a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
                explicitly.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
Now that you know a bit about the tools xpressive provides, you can pick
the right tool for you by answering the following two questions:
</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
What <span class="emphasis"><em>iterator</em></span> type will you use to traverse your
data?
</li>
<li class="listitem">
What do you want to <span class="emphasis"><em>do</em></span> to your data?
</li>
</ol></div>
<h3>
<a name="boost_xpressive.user_s_guide.quick_start.h0"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.quick_start.know_your_iterator_type"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Know
Your Iterator Type</a>
</h3>
<p>
Most of the classes in xpressive are templates that are parameterized on
the iterator type. xpressive defines some common typedefs to make the job
of choosing the right types easier. You can use the table below to find the
right types based on the type of your iterator.
</p>
<div class="table">
<a name="boost_xpressive.user_s_guide.quick_start.t1"></a><p class="title"><b>Table 46.2. xpressive Typedefs vs. Iterator Types</b></p>
<div class="table-contents"><table class="table" summary="xpressive Typedefs vs.
Iterator Types">
<colgroup>
<col>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
</th>
<th>
<p>
std::string::const_iterator
</p>
</th>
<th>
<p>
char const *
</p>
</th>
<th>
<p>
std::wstring::const_iterator
</p>
</th>
<th>
<p>
wchar_t const *
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">smatch</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cmatch</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsmatch</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcmatch</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex_compiler</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex_compiler</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex_compiler</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex_compiler</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex_iterator</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex_token_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex_token_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex_token_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span
class="identifier">wcregex_token_iterator</span></code>
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
You should notice the systematic naming convention. Many of these types are
used together, so the naming convention helps you to use them consistently.
For instance, if you have a <code class="computeroutput"><span class="identifier">sregex</span></code>,
you should also be using a <code class="computeroutput"><span class="identifier">smatch</span></code>.
</p>
<p>
If you are not using one of those four iterator types, then you can use the
templates directly and specify your iterator type.
</p>
<h3>
<a name="boost_xpressive.user_s_guide.quick_start.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.quick_start.know_your_task"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_task">Know Your
Task</a>
</h3>
<p>
Do you want to find a pattern once? Many times? Search and replace? xpressive
has tools for all that and more. Below is a quick reference:
</p>
<div class="table">
<a name="boost_xpressive.user_s_guide.quick_start.t2"></a><p class="title"><b>Table 46.3. Tasks and Tools</b></p>
<div class="table-contents"><table class="table" summary="Tasks and Tools">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
To do this ...
</p>
</th>
<th>
<p>
Use this ...
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
if a whole string matches a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
if a string contains a sub-string that matches a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
algorithm
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
all sub-strings that match a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
algorithm
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
all the sub-strings that match a regex
and step through them one
at a time</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
class
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
a string into tokens that each match a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
class
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
a string using a regex as a delimiter</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
class
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
These algorithms and classes are described in excruciating detail in the
Reference section.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
Try clicking on a task in the table above to see a complete example program
that uses xpressive to solve that particular task.
</p></td></tr>
</table></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="xpressive.user_s_guide.creating_a_regex_object"></a><a class="link" href="user_s_guide.html#xpressive.user_s_guide.creating_a_regex_object" title="Creating a Regex Object">Creating
a Regex Object</a>
</h3></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes">Static
Regexes</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes">Dynamic
Regexes</a></span></dt>
</dl></div>
<p>
When using xpressive, the first thing you'll do is create a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
object. This section goes over the nuts and bolts of building a regular expression
in the two dialects xpressive supports: static and dynamic.
</p>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">Static
Regexes</a>
</h4></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h0"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview">Overview</a>
</h3>
<p>
The feature that really sets xpressive apart from other C/C++ regular expression
libraries is the ability to author a regular expression using C++ expressions.
xpressive achieves this through operator overloading, using a technique
called <span class="emphasis"><em>expression templates</em></span> to embed a mini-language
dedicated to pattern matching within C++. These "static regexes"
have many advantages over their string-based brethren. In particular, static
regexes:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
are syntax-checked at compile-time; they will never fail at run-time
due to a syntax error.
</li>
<li class="listitem">
can naturally refer to other C++ data and code, including other regexes,
making it simple to build grammars out of regular expressions and bind
user-defined actions that execute when parts of your regex match.
</li>
<li class="listitem">
are statically bound for better inlining and optimization. Static regexes
require no state tables, virtual functions, byte-code or calls through
function pointers that cannot be resolved at compile time.
</li>
<li class="listitem">
are not limited to searching for patterns in strings. You can declare
a static regex that finds patterns in an array of integers, for instance.
</li>
</ul></div>
<p>
Since we compose static regexes using C++ expressions, we are constrained
by the rules for legal C++ expressions. Unfortunately, that means that
"classic" regular expression syntax cannot always be mapped cleanly
into C++. Rather, we map the regex <span class="emphasis"><em>constructs</em></span>, picking
new syntax that is legal C++.
</p>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment">Construction
and Assignment</a>
</h3>
<p>
You create a static regex by assigning one to an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
For instance, the following defines a regex that can be used to find patterns
in objects of type <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'$'</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">>></span> <span class="char">'.'</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">;</span>
</pre>
<p>
Assignment works similarly.
</p>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h2"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals">Character
and String Literals</a>
</h3>
<p>
In static regexes, character and string literals match themselves. For
instance, in the regex above, <code class="computeroutput"><span class="char">'$'</span></code>
and <code class="computeroutput"><span class="char">'.'</span></code> match the characters
<code class="computeroutput"><span class="char">'$'</span></code> and <code class="computeroutput"><span class="char">'.'</span></code>
respectively. Don't be confused by the fact that <code class="literal">$</code> and
<code class="literal">.</code> are meta-characters in Perl. In xpressive, literals
always represent themselves.
</p>
<p>
When using literals in static regexes, you must take care that at least
one operand is not a literal.
For instance, the following are <span class="emphasis"><em>not</em></span>
valid regexes:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">>></span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// ERROR!</span>
<span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="char">'a'</span><span class="special">;</span> <span class="comment">// ERROR!</span>
</pre>
<p>
The two operands to the binary <code class="computeroutput"><span class="special">>></span></code>
operator are both literals, and the operand of the unary <code class="computeroutput"><span class="special">+</span></code> operator is also a literal, so these statements
will call the native C++ binary right-shift and unary plus operators, respectively.
That's not what we want. To get operator overloading to kick in, at least
one operand must be a user-defined type. We can use xpressive's <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">()</span></code>
helper function to "taint" an expression with regex-ness, forcing
operator overloading to find the correct operators.
The two regexes above
should be written as:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">>></span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// OK</span>
<span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">);</span> <span class="comment">// OK</span>
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h3"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation">Sequencing
and Alternation</a>
</h3>
<p>
As you've probably already noticed, sub-expressions in static regexes must
be separated by the sequencing operator, <code class="computeroutput"><span class="special">>></span></code>.
You can read this operator as "followed by".
</p>
<pre class="programlisting"><span class="comment">// Match an 'a' followed by a digit</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">;</span>
</pre>
<p>
Alternation works just as it does in Perl with the <code class="computeroutput"><span class="special">|</span></code>
operator. You can read this operator as "or".
For example:
</p>
<pre class="programlisting"><span class="comment">// match a digit character or a word character one or more times</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">_w</span> <span class="special">);</span>
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h4"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures">Grouping
and Captures</a>
</h3>
<p>
In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
special meaning. They group, but as a side-effect they also create back-references
like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, parentheses
only group -- there is no way to give them side-effects. To get the same
effect, we use the special <code class="computeroutput"><span class="identifier">s1</span></code>,
<code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assigning
to one creates a back-reference. You can then use the back-reference later
in your expression, like using <code class="literal">\1</code> and <code class="literal">\2</code>
in Perl.
For example, consider the following regex, which finds matching
HTML tags:
</p>
<pre class="programlisting"><span class="string">"<(\\w+)>.*?</\\1>"</span>
</pre>
<p>
In static xpressive, this would be:
</p>
<pre class="programlisting"><span class="char">'<'</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">'>'</span> <span class="special">>></span> <span class="special">-*</span><span class="identifier">_</span> <span class="special">>></span> <span class="string">"</"</span> <span class="special">>></span> <span class="identifier">s1</span> <span class="special">>></span> <span class="char">'>'</span>
</pre>
<p>
Notice how you capture a back-reference by assigning to <code class="computeroutput"><span class="identifier">s1</span></code>,
and then you use <code class="computeroutput"><span class="identifier">s1</span></code> later
in the pattern to find the matching end tag.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
<span class="bold"><strong>Grouping without capturing a back-reference</strong></span>
<br> <br> In xpressive, if you just want grouping without capturing
a back-reference, you can just use <code class="computeroutput"><span class="special">()</span></code>
without <code class="computeroutput"><span class="identifier">s1</span></code>. That is the
equivalent of Perl's <code class="literal">(?:)</code> non-capturing grouping construct.
</p></td></tr>
</table></div>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h5"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization">Case-Insensitivity
and Internationalization</a>
</h3>
<p>
Perl lets you make part of your regular expression case-insensitive by
using the <code class="literal">(?i:)</code> pattern modifier. xpressive also has
a case-insensitivity pattern modifier, called <code class="computeroutput"><span class="identifier">icase</span></code>.
You can use it as follows:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="string">"this"</span> <span class="special">>></span> <span class="identifier">icase</span><span class="special">(</span> <span class="string">"that"</span> <span class="special">);</span>
</pre>
<p>
In this regular expression, <code class="computeroutput"><span class="string">"this"</span></code>
will be matched exactly, but <code class="computeroutput"><span class="string">"that"</span></code>
will be matched irrespective of case.
</p>
<p>
Case-insensitive regular expressions raise the issue of internationalization:
how should case-insensitive character comparisons be evaluated? Also, many
character classes are locale-specific. Which characters are matched by
<code class="computeroutput"><span class="identifier">digit</span></code> and which are matched
by <code class="computeroutput"><span class="identifier">alpha</span></code>?
The answer depends
on the <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code> object the regular expression
object is using. By default, all regular expression objects use the global
locale. You can override the default by using the <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier, as follows:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize a std::locale object */</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">)(</span> <span class="special">+</span><span class="identifier">alpha</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">digit</span> <span class="special">);</span>
</pre>
<p>
This regular expression will evaluate <code class="computeroutput"><span class="identifier">alpha</span></code>
and <code class="computeroutput"><span class="identifier">digit</span></code> according to
<code class="computeroutput"><span class="identifier">my_locale</span></code>. See the section
on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
and Regex Traits</a> for more information about how to customize the
behavior of your regexes.
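</p>
<p>
To make the role of the locale more concrete, the sketch below uses only the
C++ standard library, not xpressive, to show what it means to classify
characters and compare them case-insensitively "according to" a
<code class="computeroutput">std::locale</code>. The helper names are hypothetical, and the code
is not meant to describe xpressive's internals; it only mirrors the intent of
<code class="computeroutput">icase</code> and of <code class="computeroutput">+alpha >> +digit</code>.
</p>

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: compare two strings, ignoring case as defined by loc.
inline bool equal_ignoring_case(const std::string& a, const std::string& b,
                                const std::locale& loc)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::tolower(a[i], loc) != std::tolower(b[i], loc))
            return false;
    return true;
}

// Hypothetical helper: true if text is one or more alphabetic characters
// followed by one or more digits, with both classes decided by loc.
inline bool alpha_then_digit(const std::string& text, const std::locale& loc)
{
    std::size_t i = 0;
    while (i < text.size() && std::isalpha(text[i], loc))
        ++i;
    const std::size_t alpha_count = i;
    while (i < text.size() && std::isdigit(text[i], loc))
        ++i;
    return alpha_count > 0 && i > alpha_count && i == text.size();
}
```

<p>
Both helpers take the locale as an explicit argument, so passing a different
<code class="computeroutput">std::locale</code> may change which characters count as alphabetic
and how case is folded. That is the same kind of dependency a regex acquires
from the locale it uses.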
</p>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h6"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet">Static
xpressive Syntax Cheat Sheet</a>
</h3>
<p>
The table below lists the familiar regex constructs and their equivalents
in static xpressive.
</p>
<div class="table">
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.t0"></a><p class="title"><b>Table 46.4. Perl syntax vs. Static xpressive syntax</b></p>
<div class="table-contents"><table class="table" summary="Perl syntax vs. Static xpressive syntax">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Perl
</p>
</th>
<th>
<p>
Static xpressive
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal">.</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_.html" title="Global _">_</a></code>
</p>
</td>
<td>
<p>
any character (assuming Perl's /s modifier).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">ab</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">a</span> <span class="special">>></span>
<span class="identifier">b</span></code>
</p>
</td>
<td>
<p>
sequencing of <code class="literal">a</code> and <code class="literal">b</code> sub-expressions.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a|b</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">a</span> <span class="special">|</span>
<span class="identifier">b</span></code>
</p>
</td>
<td>
<p>
alternation of <code class="literal">a</code> and <code class="literal">b</code>
sub-expressions.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(a)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a><span class="special">=</span> <span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
group and capture a back-reference.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?:a)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">(</span><span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
group and do not capture a back-reference.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\1</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>
</p>
</td>
<td>
<p>
a previously captured back-reference.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a*</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">*</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or more times, greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a+</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">+</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
one or more times, greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">!</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or one time, greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a{n,m}</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special"><</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">>(</span><span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
between <code class="literal">n</code> and <code class="literal">m</code> times,
greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a*?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">-*</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or more times, non-greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a+?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">-+</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
one or more times, non-greedy.
1274 </p> 1275 </td> 1276</tr> 1277<tr> 1278<td> 1279 <p> 1280 <code class="literal">a??</code> 1281 </p> 1282 </td> 1283<td> 1284 <p> 1285 <code class="computeroutput"><span class="special">-!</span><span class="identifier">a</span></code> 1286 </p> 1287 </td> 1288<td> 1289 <p> 1290 zero or one time, non-greedy. 1291 </p> 1292 </td> 1293</tr> 1294<tr> 1295<td> 1296 <p> 1297 <code class="literal">a{n,m}?</code> 1298 </p> 1299 </td> 1300<td> 1301 <p> 1302 <code class="computeroutput"><span class="special">-</span><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special"><</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">>(</span><span class="identifier">a</span><span class="special">)</span></code> 1303 </p> 1304 </td> 1305<td> 1306 <p> 1307 between <code class="literal">n</code> and <code class="literal">m</code> times, 1308 non-greedy. 1309 </p> 1310 </td> 1311</tr> 1312<tr> 1313<td> 1314 <p> 1315 <code class="literal">^</code> 1316 </p> 1317 </td> 1318<td> 1319 <p> 1320 <code class="computeroutput"><a class="link" href="../boost/xpressive/bos.html" title="Global bos">bos</a></code> 1321 </p> 1322 </td> 1323<td> 1324 <p> 1325 beginning of sequence assertion. 1326 </p> 1327 </td> 1328</tr> 1329<tr> 1330<td> 1331 <p> 1332 <code class="literal">$</code> 1333 </p> 1334 </td> 1335<td> 1336 <p> 1337 <code class="computeroutput"><a class="link" href="../boost/xpressive/eos.html" title="Global eos">eos</a></code> 1338 </p> 1339 </td> 1340<td> 1341 <p> 1342 end of sequence assertion. 1343 </p> 1344 </td> 1345</tr> 1346<tr> 1347<td> 1348 <p> 1349 <code class="literal">\b</code> 1350 </p> 1351 </td> 1352<td> 1353 <p> 1354 <code class="computeroutput"><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code> 1355 </p> 1356 </td> 1357<td> 1358 <p> 1359 word boundary assertion. 
1360 </p> 1361 </td> 1362</tr> 1363<tr> 1364<td> 1365 <p> 1366 <code class="literal">\B</code> 1367 </p> 1368 </td> 1369<td> 1370 <p> 1371 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code> 1372 </p> 1373 </td> 1374<td> 1375 <p> 1376 not word boundary assertion. 1377 </p> 1378 </td> 1379</tr> 1380<tr> 1381<td> 1382 <p> 1383 <code class="literal">\n</code> 1384 </p> 1385 </td> 1386<td> 1387 <p> 1388 <code class="computeroutput"><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code> 1389 </p> 1390 </td> 1391<td> 1392 <p> 1393 literal newline. 1394 </p> 1395 </td> 1396</tr> 1397<tr> 1398<td> 1399 <p> 1400 <code class="literal">.</code> 1401 </p> 1402 </td> 1403<td> 1404 <p> 1405 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code> 1406 </p> 1407 </td> 1408<td> 1409 <p> 1410 any character except a literal newline (without Perl's /s modifier). 1411 </p> 1412 </td> 1413</tr> 1414<tr> 1415<td> 1416 <p> 1417 <code class="literal">\r?\n|\r</code> 1418 </p> 1419 </td> 1420<td> 1421 <p> 1422 <code class="computeroutput"><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code> 1423 </p> 1424 </td> 1425<td> 1426 <p> 1427 logical newline. 1428 </p> 1429 </td> 1430</tr> 1431<tr> 1432<td> 1433 <p> 1434 <code class="literal">[^\r\n]</code> 1435 </p> 1436 </td> 1437<td> 1438 <p> 1439 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code> 1440 </p> 1441 </td> 1442<td> 1443 <p> 1444 any single character not a logical newline. 
1445 </p> 1446 </td> 1447</tr> 1448<tr> 1449<td> 1450 <p> 1451 <code class="literal">\w</code> 1452 </p> 1453 </td> 1454<td> 1455 <p> 1456 <code class="computeroutput"><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code> 1457 </p> 1458 </td> 1459<td> 1460 <p> 1461 a word character, equivalent to set[alnum | '_']. 1462 </p> 1463 </td> 1464</tr> 1465<tr> 1466<td> 1467 <p> 1468 <code class="literal">\W</code> 1469 </p> 1470 </td> 1471<td> 1472 <p> 1473 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code> 1474 </p> 1475 </td> 1476<td> 1477 <p> 1478 not a word character, equivalent to ~set[alnum | '_']. 1479 </p> 1480 </td> 1481</tr> 1482<tr> 1483<td> 1484 <p> 1485 <code class="literal">\d</code> 1486 </p> 1487 </td> 1488<td> 1489 <p> 1490 <code class="computeroutput"><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code> 1491 </p> 1492 </td> 1493<td> 1494 <p> 1495 a digit character. 1496 </p> 1497 </td> 1498</tr> 1499<tr> 1500<td> 1501 <p> 1502 <code class="literal">\D</code> 1503 </p> 1504 </td> 1505<td> 1506 <p> 1507 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code> 1508 </p> 1509 </td> 1510<td> 1511 <p> 1512 not a digit character. 1513 </p> 1514 </td> 1515</tr> 1516<tr> 1517<td> 1518 <p> 1519 <code class="literal">\s</code> 1520 </p> 1521 </td> 1522<td> 1523 <p> 1524 <code class="computeroutput"><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code> 1525 </p> 1526 </td> 1527<td> 1528 <p> 1529 a space character. 
1530 </p> 1531 </td> 1532</tr> 1533<tr> 1534<td> 1535 <p> 1536 <code class="literal">\S</code> 1537 </p> 1538 </td> 1539<td> 1540 <p> 1541 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code> 1542 </p> 1543 </td> 1544<td> 1545 <p> 1546 not a space character. 1547 </p> 1548 </td> 1549</tr> 1550<tr> 1551<td> 1552 <p> 1553 <code class="literal">[:alnum:]</code> 1554 </p> 1555 </td> 1556<td> 1557 <p> 1558 <code class="computeroutput"><a class="link" href="../boost/xpressive/alnum.html" title="Global alnum">alnum</a></code> 1559 </p> 1560 </td> 1561<td> 1562 <p> 1563 an alpha-numeric character. 1564 </p> 1565 </td> 1566</tr> 1567<tr> 1568<td> 1569 <p> 1570 <code class="literal">[:alpha:]</code> 1571 </p> 1572 </td> 1573<td> 1574 <p> 1575 <code class="computeroutput"><a class="link" href="../boost/xpressive/alpha.html" title="Global alpha">alpha</a></code> 1576 </p> 1577 </td> 1578<td> 1579 <p> 1580 an alphabetic character. 1581 </p> 1582 </td> 1583</tr> 1584<tr> 1585<td> 1586 <p> 1587 <code class="literal">[:blank:]</code> 1588 </p> 1589 </td> 1590<td> 1591 <p> 1592 <code class="computeroutput"><a class="link" href="../boost/xpressive/blank.html" title="Global blank">blank</a></code> 1593 </p> 1594 </td> 1595<td> 1596 <p> 1597 a horizontal white-space character. 1598 </p> 1599 </td> 1600</tr> 1601<tr> 1602<td> 1603 <p> 1604 <code class="literal">[:cntrl:]</code> 1605 </p> 1606 </td> 1607<td> 1608 <p> 1609 <code class="computeroutput"><a class="link" href="../boost/xpressive/cntrl.html" title="Global cntrl">cntrl</a></code> 1610 </p> 1611 </td> 1612<td> 1613 <p> 1614 a control character. 
1615 </p> 1616 </td> 1617</tr> 1618<tr> 1619<td> 1620 <p> 1621 <code class="literal">[:digit:]</code> 1622 </p> 1623 </td> 1624<td> 1625 <p> 1626 <code class="computeroutput"><a class="link" href="../boost/xpressive/digit.html" title="Global digit">digit</a></code> 1627 </p> 1628 </td> 1629<td> 1630 <p> 1631 a digit character. 1632 </p> 1633 </td> 1634</tr> 1635<tr> 1636<td> 1637 <p> 1638 <code class="literal">[:graph:]</code> 1639 </p> 1640 </td> 1641<td> 1642 <p> 1643 <code class="computeroutput"><a class="link" href="../boost/xpressive/graph.html" title="Global graph">graph</a></code> 1644 </p> 1645 </td> 1646<td> 1647 <p> 1648 a graphable character. 1649 </p> 1650 </td> 1651</tr> 1652<tr> 1653<td> 1654 <p> 1655 <code class="literal">[:lower:]</code> 1656 </p> 1657 </td> 1658<td> 1659 <p> 1660 <code class="computeroutput"><a class="link" href="../boost/xpressive/lower.html" title="Global lower">lower</a></code> 1661 </p> 1662 </td> 1663<td> 1664 <p> 1665 a lower-case character. 1666 </p> 1667 </td> 1668</tr> 1669<tr> 1670<td> 1671 <p> 1672 <code class="literal">[:print:]</code> 1673 </p> 1674 </td> 1675<td> 1676 <p> 1677 <code class="computeroutput"><a class="link" href="../boost/xpressive/print.html" title="Global print">print</a></code> 1678 </p> 1679 </td> 1680<td> 1681 <p> 1682 a printing character. 1683 </p> 1684 </td> 1685</tr> 1686<tr> 1687<td> 1688 <p> 1689 <code class="literal">[:punct:]</code> 1690 </p> 1691 </td> 1692<td> 1693 <p> 1694 <code class="computeroutput"><a class="link" href="../boost/xpressive/punct.html" title="Global punct">punct</a></code> 1695 </p> 1696 </td> 1697<td> 1698 <p> 1699 a punctuation character. 1700 </p> 1701 </td> 1702</tr> 1703<tr> 1704<td> 1705 <p> 1706 <code class="literal">[:space:]</code> 1707 </p> 1708 </td> 1709<td> 1710 <p> 1711 <code class="computeroutput"><a class="link" href="../boost/xpressive/space.html" title="Global space">space</a></code> 1712 </p> 1713 </td> 1714<td> 1715 <p> 1716 a white-space character. 
1717 </p> 1718 </td> 1719</tr> 1720<tr> 1721<td> 1722 <p> 1723 <code class="literal">[:upper:]</code> 1724 </p> 1725 </td> 1726<td> 1727 <p> 1728 <code class="computeroutput"><a class="link" href="../boost/xpressive/upper.html" title="Global upper">upper</a></code> 1729 </p> 1730 </td> 1731<td> 1732 <p> 1733 an upper-case character. 1734 </p> 1735 </td> 1736</tr> 1737<tr> 1738<td> 1739 <p> 1740 <code class="literal">[:xdigit:]</code> 1741 </p> 1742 </td> 1743<td> 1744 <p> 1745 <code class="computeroutput"><a class="link" href="../boost/xpressive/xdigit.html" title="Global xdigit">xdigit</a></code> 1746 </p> 1747 </td> 1748<td> 1749 <p> 1750 a hexadecimal digit character. 1751 </p> 1752 </td> 1753</tr> 1754<tr> 1755<td> 1756 <p> 1757 <code class="literal">[0-9]</code> 1758 </p> 1759 </td> 1760<td> 1761 <p> 1762 <code class="computeroutput"><a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span></code> 1763 </p> 1764 </td> 1765<td> 1766 <p> 1767 characters in range <code class="computeroutput"><span class="char">'0'</span></code> 1768 through <code class="computeroutput"><span class="char">'9'</span></code>. 1769 </p> 1770 </td> 1771</tr> 1772<tr> 1773<td> 1774 <p> 1775 <code class="literal">[abc]</code> 1776 </p> 1777 </td> 1778<td> 1779 <p> 1780 <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">|</span> <span class="char">'b'</span> <span class="special">|</span><span class="char">'c'</span></code> 1781 </p> 1782 </td> 1783<td> 1784 <p> 1785 characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>. 
1786 </p> 1787 </td> 1788</tr> 1789<tr> 1790<td> 1791 <p> 1792 <code class="literal">[abc]</code> 1793 </p> 1794 </td> 1795<td> 1796 <p> 1797 <code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code> 1798 </p> 1799 </td> 1800<td> 1801 <p> 1802 <span class="emphasis"><em>same as above</em></span> 1803 </p> 1804 </td> 1805</tr> 1806<tr> 1807<td> 1808 <p> 1809 <code class="literal">[0-9abc]</code> 1810 </p> 1811 </td> 1812<td> 1813 <p> 1814 <code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span> 1815 <span class="char">'a'</span> <span class="special">|</span> 1816 <span class="char">'b'</span> <span class="special">|</span> 1817 <span class="char">'c'</span> <span class="special">]</span></code> 1818 </p> 1819 </td> 1820<td> 1821 <p> 1822 characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, <code class="computeroutput"><span class="char">'c'</span></code> 1823 or in range <code class="computeroutput"><span class="char">'0'</span></code> through 1824 <code class="computeroutput"><span class="char">'9'</span></code>. 
1825 </p> 1826 </td> 1827</tr> 1828<tr> 1829<td> 1830 <p> 1831 <code class="literal">[0-9abc]</code> 1832 </p> 1833 </td> 1834<td> 1835 <p> 1836 <code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span> 1837 <span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span> <span class="special">]</span></code> 1838 </p> 1839 </td> 1840<td> 1841 <p> 1842 <span class="emphasis"><em>same as above</em></span> 1843 </p> 1844 </td> 1845</tr> 1846<tr> 1847<td> 1848 <p> 1849 <code class="literal">[^abc]</code> 1850 </p> 1851 </td> 1852<td> 1853 <p> 1854 <code class="computeroutput"><span class="special">~(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code> 1855 </p> 1856 </td> 1857<td> 1858 <p> 1859 not characters <code class="computeroutput"><span class="char">'a'</span></code>, 1860 <code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>. 
1861 </p> 1862 </td> 1863</tr> 1864<tr> 1865<td> 1866 <p> 1867 <code class="literal">(?i:<span class="emphasis"><em>stuff</em></span>)</code> 1868 </p> 1869 </td> 1870<td> 1871 <p> 1872 <code class="computeroutput"><a class="link" href="../boost/xpressive/icase.html" title="Function template icase">icase</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1873 </p> 1874 </td> 1875<td> 1876 <p> 1877 match <span class="emphasis"><em>stuff</em></span> disregarding case. 1878 </p> 1879 </td> 1880</tr> 1881<tr> 1882<td> 1883 <p> 1884 <code class="literal">(?><span class="emphasis"><em>stuff</em></span>)</code> 1885 </p> 1886 </td> 1887<td> 1888 <p> 1889 <code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1890 </p> 1891 </td> 1892<td> 1893 <p> 1894 independent sub-expression, match <span class="emphasis"><em>stuff</em></span> 1895 and turn off backtracking. 1896 </p> 1897 </td> 1898</tr> 1899<tr> 1900<td> 1901 <p> 1902 <code class="literal">(?=<span class="emphasis"><em>stuff</em></span>)</code> 1903 </p> 1904 </td> 1905<td> 1906 <p> 1907 <code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1908 </p> 1909 </td> 1910<td> 1911 <p> 1912 positive look-ahead assertion, match if before <span class="emphasis"><em>stuff</em></span> 1913 but don't include <span class="emphasis"><em>stuff</em></span> in the match. 
1914 </p> 1915 </td> 1916</tr> 1917<tr> 1918<td> 1919 <p> 1920 <code class="literal">(?!<span class="emphasis"><em>stuff</em></span>)</code> 1921 </p> 1922 </td> 1923<td> 1924 <p> 1925 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1926 </p> 1927 </td> 1928<td> 1929 <p> 1930 negative look-ahead assertion, match if not before <span class="emphasis"><em>stuff</em></span>. 1931 </p> 1932 </td> 1933</tr> 1934<tr> 1935<td> 1936 <p> 1937 <code class="literal">(?<=<span class="emphasis"><em>stuff</em></span>)</code> 1938 </p> 1939 </td> 1940<td> 1941 <p> 1942 <code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1943 </p> 1944 </td> 1945<td> 1946 <p> 1947 positive look-behind assertion, match if after <span class="emphasis"><em>stuff</em></span> 1948 but don't include <span class="emphasis"><em>stuff</em></span> in the match. (<span class="emphasis"><em>stuff</em></span> 1949 must be constant-width.) 
1950 </p> 1951 </td> 1952</tr> 1953<tr> 1954<td> 1955 <p> 1956 <code class="literal">(?<!<span class="emphasis"><em>stuff</em></span>)</code> 1957 </p> 1958 </td> 1959<td> 1960 <p> 1961 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1962 </p> 1963 </td> 1964<td> 1965 <p> 1966 negative look-behind assertion, match if not after <span class="emphasis"><em>stuff</em></span>. 1967 (<span class="emphasis"><em>stuff</em></span> must be constant-width.) 1968 </p> 1969 </td> 1970</tr> 1971<tr> 1972<td> 1973 <p> 1974 <code class="literal">(?P<<span class="emphasis"><em>name</em></span>><span class="emphasis"><em>stuff</em></span>)</code> 1975 </p> 1976 </td> 1977<td> 1978 <p> 1979 <code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code> 1980 </code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="computeroutput"><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">=</span> </code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code> 1981 </p> 1982 </td> 1983<td> 1984 <p> 1985 Create a named capture. 
1986 </p> 1987 </td> 1988</tr> 1989<tr> 1990<td> 1991 <p> 1992 <code class="literal">(?P=<span class="emphasis"><em>name</em></span>)</code> 1993 </p> 1994 </td> 1995<td> 1996 <p> 1997 <code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code> 1998 </code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="literal"><span class="emphasis"><em>name</em></span></code> 1999 </p> 2000 </td> 2001<td> 2002 <p> 2003 Refer back to a previously created named capture. 2004 </p> 2005 </td> 2006</tr> 2007</tbody> 2008</table></div> 2009</div> 2010<br class="table-break"><p> 2011 <br> 2012 </p> 2013</div> 2014<div class="section"> 2015<div class="titlepage"><div><div><h4 class="title"> 2016<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes" title="Dynamic Regexes">Dynamic 2017 Regexes</a> 2018</h4></div></div></div> 2019<h3> 2020<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h0"></a> 2021 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview">Overview</a> 2022 </h3> 2023<p> 2024 Static regexes are dandy, but sometimes you need something a bit more ... 2025 dynamic. Imagine you are developing a text editor with a regex search/replace 2026 feature. You need to accept a regular expression from the end user as input 2027 at run-time. There should be a way to parse a string into a regular expression. 
2028 That's what xpressive's dynamic regexes are for. They are built from the 2029 same core components as their static counterparts, but they are late-bound 2030 so you can specify them at run-time. 2031 </p> 2032<h3> 2033<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h1"></a> 2034 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment">Construction 2035 and Assignment</a> 2036 </h3> 2037<p> 2038 There are two ways to create a dynamic regex: with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code> 2039 function or with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code> 2040 class template. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code> 2041 if you want the default locale. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code> 2042 if you need to specify a different locale. In the section on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">regex 2043 grammars</a>, we'll see another use for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>. 
2044 </p> 2045<p> 2046 Here is an example of using <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">compile</span><span class="special">()</span></code>: 2047 </p> 2048<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span> 2049</pre> 2050<p> 2051 Here is the same example using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>: 2052 </p> 2053<pre class="programlisting"><span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span> 2054<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span> 2055</pre> 2056<p> 2057 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code> 2058 is implemented in terms of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template 
regex_compiler">regex_compiler<></a></code></code>.
      </p>
<h3>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h2"></a>
        <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax">Dynamic
        xpressive Syntax</a>
      </h3>
<p>
        Because dynamic regexes are not constrained by the rules for valid C++
        expressions, they are free to use familiar regex syntax. The syntax
        xpressive uses for dynamic regexes therefore follows the
        lead set by John Maddock's <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
        to add regular expressions to the Standard Library. It is essentially the
        syntax standardized by <a href="http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf" target="_top">ECMAScript</a>,
        with minor changes in support of internationalization.
      </p>
<p>
        Since the syntax is documented exhaustively elsewhere, I will simply refer
        you to the existing standards rather than duplicate the specification
        here.
2078 </p> 2079<h3> 2080<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h3"></a> 2081 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization">Internationalization</a> 2082 </h3> 2083<p> 2084 As with static regexes, dynamic regexes support internationalization by 2085 allowing you to specify a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>. 2086 To do this, you must use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>. 2087 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code> 2088 class has an <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> 2089 function. After you have imbued a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code> 2090 object with a custom <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>, 2091 all regex objects compiled by that <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code> 2092 will use that locale. 
For example: 2093 </p> 2094<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize your locale object here */</span><span class="special">;</span> 2095<span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span> 2096<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">);</span> 2097<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+|\\d+"</span> <span class="special">);</span> 2098</pre> 2099<p> 2100 This regex will use <code class="computeroutput"><span class="identifier">my_locale</span></code> 2101 when evaluating the intrinsic character sets <code class="computeroutput"><span class="string">"\\w"</span></code> 2102 and <code class="computeroutput"><span class="string">"\\d"</span></code>. 
    </p>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.matching_and_searching"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching" title="Matching and Searching">Matching
      and Searching</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.matching_and_searching.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.overview">Overview</a>
    </h3>
<p>
      Once you have created a regex object, you can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
      and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
      algorithms to find patterns in strings. This section covers the basics of regex
      matching and searching. If you already know how <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
      and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
      work in the <a href="../../../libs/regex" target="_top">Boost.Regex</a> library, xpressive's
      versions work the same way.
2123 </p> 2124<h3> 2125<a name="boost_xpressive.user_s_guide.matching_and_searching.h1"></a> 2126 <span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex">Seeing 2127 if a String Matches a Regex</a> 2128 </h3> 2129<p> 2130 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2131 algorithm checks to see if a regex matches a given input. 2132 </p> 2133<div class="warning"><table border="0" summary="Warning"> 2134<tr> 2135<td rowspan="2" align="center" valign="top" width="25"><img alt="[Warning]" src="../../../doc/src/images/warning.png"></td> 2136<th align="left">Warning</th> 2137</tr> 2138<tr><td align="left" valign="top"><p> 2139 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2140 algorithm will only report success if the regex matches the <span class="emphasis"><em>whole 2141 input</em></span>, from beginning to end. If the regex matches only a part 2142 of the input, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2143 will return false. If you want to search through the string looking for 2144 sub-strings that the regex matches, use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 2145 algorithm. 
2146 </p></td></tr> 2147</table></div> 2148<p> 2149 The input can be a bidirectional range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>, 2150 a C-style null-terminated string or a pair of iterators. In all cases, the 2151 type of the iterator used to traverse the input sequence must match the iterator 2152 type used to declare the regex object. (You can use the table in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Quick 2153 Start</a> to find the correct regex type for your iterator.) 2154 </p> 2155<pre class="programlisting"><span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match C-style strings</span> 2156<span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match std::strings</span> 2157 2158<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK</span> 2159 <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span> 2160 2161<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span 
class="string">"hello"</span><span class="special">),</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK</span> 2162 <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span> 2163 2164<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// ERROR! iterator mis-match!</span> 2165 <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span> 2166</pre> 2167<p> 2168 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2169 algorithm optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2170 struct as an out parameter. If given, the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2171 algorithm fills in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2172 struct with information about which parts of the regex matched which parts 2173 of the input. 
2174 </p> 2175<pre class="programlisting"><span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span> 2176<span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_w</span><span class="special">);</span> 2177 2178<span class="comment">// store the results of the regex_match in "what"</span> 2179<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span> 2180<span class="special">{</span> 2181 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// prints "o"</span> 2182<span class="special">}</span> 2183</pre> 2184<p> 2185 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2186 algorithm also optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code> 2187 bitmask. 
With <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>, 2188 you can control certain aspects of how the match is evaluated. See the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code> 2189 reference for a complete list of the flags and their meanings. 2190 </p> 2191<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"hello"</span><span class="special">);</span> 2192<span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="identifier">bol</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> 2193 2194<span class="comment">// match_not_bol means that "bol" should not match at [begin,begin)</span> 2195<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">sre</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_not_bol</span> <span class="special">)</span> <span class="special">)</span> 2196<span class="special">{</span> 2197 <span class="comment">// should never get here!!!</span> 2198<span class="special">}</span> 2199</pre> 
2200<p> 2201 Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">here</a> 2202 to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>. 2203 And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2204 reference to see a complete list of the available overloads. 2205 </p> 2206<h3> 2207<a name="boost_xpressive.user_s_guide.matching_and_searching.h2"></a> 2208 <span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings">Searching 2209 for Matching Sub-Strings</a> 2210 </h3> 2211<p> 2212 Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 2213 when you want to know if an input sequence contains a sub-sequence that a 2214 regex matches. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 2215 will try to match the regex at the beginning of the input sequence and scan 2216 forward in the sequence until it either finds a match or exhausts the sequence. 
2217 </p> 2218<p> 2219 In all other regards, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 2220 behaves like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2221 <span class="emphasis"><em>(see above)</em></span>. In particular, it can operate on a bidirectional 2222 range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>, C-style null-terminated strings 2223 or iterator ranges. The same care must be taken to ensure that the iterator 2224 type of your regex matches the iterator type of your input sequence. As with 2225 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>, 2226 you can optionally provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2227 struct to receive the results of the search, and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code> 2228 bitmask to control how the match is evaluated. 2229 </p> 2230<p> 2231 Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">here</a> 2232 to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>. 
2233 And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 2234 reference to see a complete list of the available overloads. 2235 </p> 2236</div> 2237<div class="section"> 2238<div class="titlepage"><div><div><h3 class="title"> 2239<a name="boost_xpressive.user_s_guide.accessing_results"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results" title="Accessing Results">Accessing 2240 Results</a> 2241</h3></div></div></div> 2242<h3> 2243<a name="boost_xpressive.user_s_guide.accessing_results.h0"></a> 2244 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.overview">Overview</a> 2245 </h3> 2246<p> 2247 Sometimes, it is not enough to know simply whether a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2248 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 2249 was successful or not. 
If you pass an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2250 to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 2251 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>, 2252 then after the algorithm has completed successfully the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2253 will contain extra information about which parts of the regex matched which 2254 parts of the sequence. In Perl, these sub-sequences are called <span class="emphasis"><em>back-references</em></span>, 2255 and they are stored in the variables <code class="literal">$1</code>, <code class="literal">$2</code>, 2256 etc. In xpressive, they are objects of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>, 2257 and they are stored in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2258 structure, which acts as a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2259 objects. 
2260 </p> 2261<h3> 2262<a name="boost_xpressive.user_s_guide.accessing_results.h1"></a> 2263 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.match_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.match_results">match_results</a> 2264 </h3> 2265<p> 2266 So, you've passed a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2267 object to a regex algorithm, and the algorithm has succeeded. Now you want 2268 to examine the results. Most of what you'll be doing with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2269 object is indexing into it to access its internally stored <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2270 objects, but there are a few other things you can do with a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2271 object besides. 2272 </p> 2273<p> 2274 The table below shows how to access the information stored in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2275 object named <code class="computeroutput"><span class="identifier">what</span></code>. 2276 </p> 2277<div class="table"> 2278<a name="boost_xpressive.user_s_guide.accessing_results.t0"></a><p class="title"><b>Table 46.5. 
match_results<> Accessors</b></p> 2279<div class="table-contents"><table class="table" summary="match_results<> Accessors"> 2280<colgroup> 2281<col> 2282<col> 2283</colgroup> 2284<thead><tr> 2285<th> 2286 <p> 2287 Accessor 2288 </p> 2289 </th> 2290<th> 2291 <p> 2292 Effects 2293 </p> 2294 </th> 2295</tr></thead> 2296<tbody> 2297<tr> 2298<td> 2299 <p> 2300 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span></code> 2301 </p> 2302 </td> 2303<td> 2304 <p> 2305 Returns the number of sub-matches, which is always greater than 2306 zero after a successful match because the full match is stored 2307 in the zero-th sub-match. 2308 </p> 2309 </td> 2310</tr> 2311<tr> 2312<td> 2313 <p> 2314 <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">]</span></code> 2315 </p> 2316 </td> 2317<td> 2318 <p> 2319 Returns the <span class="emphasis"><em>n</em></span>-th sub-match. 2320 </p> 2321 </td> 2322</tr> 2323<tr> 2324<td> 2325 <p> 2326 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code> 2327 </p> 2328 </td> 2329<td> 2330 <p> 2331 Returns the length of the <span class="emphasis"><em>n</em></span>-th sub-match. 2332 Same as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">length</span><span class="special">()</span></code>. 
2333 </p> 2334 </td> 2335</tr> 2336<tr> 2337<td> 2338 <p> 2339 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">position</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code> 2340 </p> 2341 </td> 2342<td> 2343 <p> 2344 Returns the offset into the input sequence at which the <span class="emphasis"><em>n</em></span>-th 2345 sub-match begins. 2346 </p> 2347 </td> 2348</tr> 2349<tr> 2350<td> 2351 <p> 2352 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">str</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code> 2353 </p> 2354 </td> 2355<td> 2356 <p> 2357 Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code> 2358 constructed from the <span class="emphasis"><em>n</em></span>-th sub-match. Same 2359 as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">str</span><span class="special">()</span></code>. 2360 </p> 2361 </td> 2362</tr> 2363<tr> 2364<td> 2365 <p> 2366 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">prefix</span><span class="special">()</span></code> 2367 </p> 2368 </td> 2369<td> 2370 <p> 2371 Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2372 object which represents the sub-sequence from the beginning of 2373 the input sequence to the start of the full match. 
2374 </p> 2375 </td> 2376</tr> 2377<tr> 2378<td> 2379 <p> 2380 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">suffix</span><span class="special">()</span></code> 2381 </p> 2382 </td> 2383<td> 2384 <p> 2385 Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2386 object which represents the sub-sequence from the end of the full 2387 match to the end of the input sequence. 2388 </p> 2389 </td> 2390</tr> 2391<tr> 2392<td> 2393 <p> 2394 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span></code> 2395 </p> 2396 </td> 2397<td> 2398 <p> 2399 Returns the <code class="computeroutput"><span class="identifier">regex_id</span></code> 2400 of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code> 2401 object that was last used with this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2402 object. 2403 </p> 2404 </td> 2405</tr> 2406</tbody> 2407</table></div> 2408</div> 2409<br class="table-break"><p> 2410 There is more you can do with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2411 object, but that will be covered when we talk about <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars 2412 and Nested Matches</a>. 
</p>
<h3>
<a name="boost_xpressive.user_s_guide.accessing_results.h2"></a>
 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.sub_match"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.sub_match">sub_match</a>
 </h3>
<p>
 When you index into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
 object, you get back a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
 object. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
 is basically a pair of iterators. It is defined like this:
 </p>
<pre class="programlisting"><span class="keyword">template</span><span class="special"><</span> <span class="keyword">class</span> <span class="identifier">BidirectionalIterator</span> <span class="special">></span>
<span class="keyword">struct</span> <span class="identifier">sub_match</span>
    <span class="special">:</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><</span> <span class="identifier">BidirectionalIterator</span><span class="special">,</span> <span class="identifier">BidirectionalIterator</span> <span class="special">></span>
<span class="special">{</span>
    <span class="keyword">bool</span> <span class="identifier">matched</span><span class="special">;</span>
    <span class="comment">// ...</span>
<span class="special">};</span>
</pre>
<p>
 Since it inherits publicly from <code class="computeroutput"><span class="identifier">std</span><span
class="special">::</span><span class="identifier">pair</span><span class="special"><></span></code>, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2434 has <code class="computeroutput"><span class="identifier">first</span></code> and <code class="computeroutput"><span class="identifier">second</span></code> data members of type <code class="computeroutput"><span class="identifier">BidirectionalIterator</span></code>. These are the beginning 2435 and end of the sub-sequence this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2436 represents. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2437 also has a Boolean <code class="computeroutput"><span class="identifier">matched</span></code> 2438 data member, which is true if this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2439 participated in the full match. 2440 </p> 2441<p> 2442 The following table shows how you might access the information stored in 2443 a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 2444 object called <code class="computeroutput"><span class="identifier">sub</span></code>. 2445 </p> 2446<div class="table"> 2447<a name="boost_xpressive.user_s_guide.accessing_results.t1"></a><p class="title"><b>Table 46.6. 
sub_match<> Accessors</b></p> 2448<div class="table-contents"><table class="table" summary="sub_match<> Accessors"> 2449<colgroup> 2450<col> 2451<col> 2452</colgroup> 2453<thead><tr> 2454<th> 2455 <p> 2456 Accessor 2457 </p> 2458 </th> 2459<th> 2460 <p> 2461 Effects 2462 </p> 2463 </th> 2464</tr></thead> 2465<tbody> 2466<tr> 2467<td> 2468 <p> 2469 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span></code> 2470 </p> 2471 </td> 2472<td> 2473 <p> 2474 Returns the length of the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">distance</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>. 2475 </p> 2476 </td> 2477</tr> 2478<tr> 2479<td> 2480 <p> 2481 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">()</span></code> 2482 </p> 2483 </td> 2484<td> 2485 <p> 2486 Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code> 2487 constructed from the sub-match. 
Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><</span><span class="identifier">char_type</span><span class="special">>(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>. 2488 </p> 2489 </td> 2490</tr> 2491<tr> 2492<td> 2493 <p> 2494 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code> 2495 </p> 2496 </td> 2497<td> 2498 <p> 2499 Performs a string comparison between the sub-match and <code class="computeroutput"><span class="identifier">str</span></code>, where <code class="computeroutput"><span class="identifier">str</span></code> 2500 can be a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>, 2501 C-style null-terminated string, or another sub-match. Same as 2502 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">().</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>. 
2503 </p> 2504 </td> 2505</tr> 2506</tbody> 2507</table></div> 2508</div> 2509<br class="table-break"><h3> 2510<a name="boost_xpressive.user_s_guide.accessing_results.h3"></a> 2511 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"><span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span> Results Invalidation <span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span></a> 2512 </h3> 2513<p> 2514 Results are stored as iterators into the input sequence. Anything which invalidates 2515 the input sequence will invalidate the match results. For instance, if you 2516 match a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> object, the results are only valid 2517 until your next call to a non-const member function of that <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> 2518 object. 
After that, the results held by the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 2519 object are invalid. Don't use them! 2520 </p> 2521</div> 2522<div class="section"> 2523<div class="titlepage"><div><div><h3 class="title"> 2524<a name="boost_xpressive.user_s_guide.string_substitutions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions" title="String Substitutions">String 2525 Substitutions</a> 2526</h3></div></div></div> 2527<p> 2528 Regular expressions are not only good for searching text; they're good at 2529 <span class="emphasis"><em>manipulating</em></span> it. And one of the most common text manipulation 2530 tasks is search-and-replace. xpressive provides the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code> 2531 algorithm for searching and replacing. 2532 </p> 2533<h3> 2534<a name="boost_xpressive.user_s_guide.string_substitutions.h0"></a> 2535 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.regex_replace__"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.regex_replace__">regex_replace()</a> 2536 </h3> 2537<p> 2538 Performing search-and-replace using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code> 2539 is simple. All you need is an input sequence, a regex object, and a format 2540 string or a formatter object. There are several versions of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code> 2541 algorithm. 
Some accept the input sequence as a bidirectional container such
      as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> and return the result in a new
      container of the same type. Others accept the input as a null-terminated
      string and return a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>. Still others accept the input sequence
      as a pair of iterators and write the result into an output iterator. The
      substitution may be specified as a string with format sequences or as a formatter
      object. Below are some simple examples of using string-based substitutions.
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="string">"his"</span><span class="special">);</span>                <span class="comment">// find all occurrences of "his" ...</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span><span class="string">"her"</span><span class="special">);</span>                <span class="comment">// ... and replace them with "her"</span>

<span class="comment">// use the version of regex_replace() that operates on strings</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">input</span><span class="special">,</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>

<span class="comment">// use the version of regex_replace() that operates on iterators</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span> <span class="keyword">char</span> <span class="special">&gt;</span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">);</span>
<span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">out_iter</span><span class="special">,</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
</pre>
<p>
      The above program prints out the following:
    </p>
<pre class="programlisting">Ther is her face
Ther is her face
</pre>
<p>
      Notice that <span class="emphasis"><em>all</em></span> the occurrences of <code class="computeroutput"><span class="string">"his"</span></code>
      have been replaced with <code class="computeroutput"><span class="string">"her"</span></code>, including the one inside <code class="computeroutput"><span class="string">"This"</span></code>.
    </p>
<p>
      Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">here</a>
      to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>.
      And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
      reference to see a complete list of the available overloads.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.string_substitutions.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.replace_options"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.replace_options">Replace
      Options</a>
    </h3>
<p>
      The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
      algorithm takes an optional bitmask parameter to control the formatting.
      The possible values of the bitmask are:
    </p>
<div class="table">
<a name="boost_xpressive.user_s_guide.string_substitutions.t0"></a><p class="title"><b>Table 46.7.
Format Flags</b></p> 2589<div class="table-contents"><table class="table" summary="Format Flags"> 2590<colgroup> 2591<col> 2592<col> 2593</colgroup> 2594<thead><tr> 2595<th> 2596 <p> 2597 Flag 2598 </p> 2599 </th> 2600<th> 2601 <p> 2602 Meaning 2603 </p> 2604 </th> 2605</tr></thead> 2606<tbody> 2607<tr> 2608<td> 2609 <p> 2610 <code class="computeroutput"><span class="identifier">format_default</span></code> 2611 </p> 2612 </td> 2613<td> 2614 <p> 2615 Recognize the ECMA-262 format sequences (see below). 2616 </p> 2617 </td> 2618</tr> 2619<tr> 2620<td> 2621 <p> 2622 <code class="computeroutput"><span class="identifier">format_first_only</span></code> 2623 </p> 2624 </td> 2625<td> 2626 <p> 2627 Only replace the first match, not all of them. 2628 </p> 2629 </td> 2630</tr> 2631<tr> 2632<td> 2633 <p> 2634 <code class="computeroutput"><span class="identifier">format_no_copy</span></code> 2635 </p> 2636 </td> 2637<td> 2638 <p> 2639 Don't copy the parts of the input sequence that didn't match the 2640 regex to the output sequence. 2641 </p> 2642 </td> 2643</tr> 2644<tr> 2645<td> 2646 <p> 2647 <code class="computeroutput"><span class="identifier">format_literal</span></code> 2648 </p> 2649 </td> 2650<td> 2651 <p> 2652 Treat the format string as a literal; that is, don't recognize 2653 any escape sequences. 2654 </p> 2655 </td> 2656</tr> 2657<tr> 2658<td> 2659 <p> 2660 <code class="computeroutput"><span class="identifier">format_perl</span></code> 2661 </p> 2662 </td> 2663<td> 2664 <p> 2665 Recognize the Perl format sequences (see below). 2666 </p> 2667 </td> 2668</tr> 2669<tr> 2670<td> 2671 <p> 2672 <code class="computeroutput"><span class="identifier">format_sed</span></code> 2673 </p> 2674 </td> 2675<td> 2676 <p> 2677 Recognize the sed format sequences (see below). 
2678 </p> 2679 </td> 2680</tr> 2681<tr> 2682<td> 2683 <p> 2684 <code class="computeroutput"><span class="identifier">format_all</span></code> 2685 </p> 2686 </td> 2687<td> 2688 <p> 2689 In addition to the Perl format sequences, recognize some Boost-specific 2690 format sequences. 2691 </p> 2692 </td> 2693</tr> 2694</tbody> 2695</table></div> 2696</div> 2697<br class="table-break"><p> 2698 These flags live in the <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">regex_constants</span></code> 2699 namespace. If the substitution parameter is a function object instead of 2700 a string, the flags <code class="computeroutput"><span class="identifier">format_literal</span></code>, 2701 <code class="computeroutput"><span class="identifier">format_perl</span></code>, <code class="computeroutput"><span class="identifier">format_sed</span></code>, and <code class="computeroutput"><span class="identifier">format_all</span></code> 2702 are ignored. 2703 </p> 2704<h3> 2705<a name="boost_xpressive.user_s_guide.string_substitutions.h2"></a> 2706 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences">The 2707 ECMA-262 Format Sequences</a> 2708 </h3> 2709<p> 2710 When you haven't specified a substitution string dialect with one of the 2711 format flags above, you get the dialect defined by ECMA-262, the standard 2712 for ECMAScript. The table below shows the escape sequences recognized in 2713 ECMA-262 mode. 2714 </p> 2715<div class="table"> 2716<a name="boost_xpressive.user_s_guide.string_substitutions.t1"></a><p class="title"><b>Table 46.8. 
Format Escape Sequences</b></p> 2717<div class="table-contents"><table class="table" summary="Format Escape Sequences"> 2718<colgroup> 2719<col> 2720<col> 2721</colgroup> 2722<thead><tr> 2723<th> 2724 <p> 2725 Escape Sequence 2726 </p> 2727 </th> 2728<th> 2729 <p> 2730 Meaning 2731 </p> 2732 </th> 2733</tr></thead> 2734<tbody> 2735<tr> 2736<td> 2737 <p> 2738 <code class="literal">$1</code>, <code class="literal">$2</code>, etc. 2739 </p> 2740 </td> 2741<td> 2742 <p> 2743 the corresponding sub-match 2744 </p> 2745 </td> 2746</tr> 2747<tr> 2748<td> 2749 <p> 2750 <code class="literal">$&</code> 2751 </p> 2752 </td> 2753<td> 2754 <p> 2755 the full match 2756 </p> 2757 </td> 2758</tr> 2759<tr> 2760<td> 2761 <p> 2762 <code class="literal">$`</code> 2763 </p> 2764 </td> 2765<td> 2766 <p> 2767 the match prefix 2768 </p> 2769 </td> 2770</tr> 2771<tr> 2772<td> 2773 <p> 2774 <code class="literal">$'</code> 2775 </p> 2776 </td> 2777<td> 2778 <p> 2779 the match suffix 2780 </p> 2781 </td> 2782</tr> 2783<tr> 2784<td> 2785 <p> 2786 <code class="literal">$$</code> 2787 </p> 2788 </td> 2789<td> 2790 <p> 2791 a literal <code class="computeroutput"><span class="char">'$'</span></code> character 2792 </p> 2793 </td> 2794</tr> 2795</tbody> 2796</table></div> 2797</div> 2798<br class="table-break"><p> 2799 Any other sequence beginning with <code class="computeroutput"><span class="char">'$'</span></code> 2800 simply represents itself. For example, if the format string were <code class="computeroutput"><span class="string">"$a"</span></code> then <code class="computeroutput"><span class="string">"$a"</span></code> 2801 would be inserted into the output sequence. 
2802 </p> 2803<h3> 2804<a name="boost_xpressive.user_s_guide.string_substitutions.h3"></a> 2805 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences">The 2806 Sed Format Sequences</a> 2807 </h3> 2808<p> 2809 When specifying the <code class="computeroutput"><span class="identifier">format_sed</span></code> 2810 flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>, 2811 the following escape sequences are recognized: 2812 </p> 2813<div class="table"> 2814<a name="boost_xpressive.user_s_guide.string_substitutions.t2"></a><p class="title"><b>Table 46.9. Sed Format Escape Sequences</b></p> 2815<div class="table-contents"><table class="table" summary="Sed Format Escape Sequences"> 2816<colgroup> 2817<col> 2818<col> 2819</colgroup> 2820<thead><tr> 2821<th> 2822 <p> 2823 Escape Sequence 2824 </p> 2825 </th> 2826<th> 2827 <p> 2828 Meaning 2829 </p> 2830 </th> 2831</tr></thead> 2832<tbody> 2833<tr> 2834<td> 2835 <p> 2836 <code class="literal">\1</code>, <code class="literal">\2</code>, etc. 
2837 </p> 2838 </td> 2839<td> 2840 <p> 2841 The corresponding sub-match 2842 </p> 2843 </td> 2844</tr> 2845<tr> 2846<td> 2847 <p> 2848 <code class="literal">&</code> 2849 </p> 2850 </td> 2851<td> 2852 <p> 2853 the full match 2854 </p> 2855 </td> 2856</tr> 2857<tr> 2858<td> 2859 <p> 2860 <code class="literal">\a</code> 2861 </p> 2862 </td> 2863<td> 2864 <p> 2865 A literal <code class="computeroutput"><span class="char">'\a'</span></code> 2866 </p> 2867 </td> 2868</tr> 2869<tr> 2870<td> 2871 <p> 2872 <code class="literal">\e</code> 2873 </p> 2874 </td> 2875<td> 2876 <p> 2877 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code> 2878 </p> 2879 </td> 2880</tr> 2881<tr> 2882<td> 2883 <p> 2884 <code class="literal">\f</code> 2885 </p> 2886 </td> 2887<td> 2888 <p> 2889 A literal <code class="computeroutput"><span class="char">'\f'</span></code> 2890 </p> 2891 </td> 2892</tr> 2893<tr> 2894<td> 2895 <p> 2896 <code class="literal">\n</code> 2897 </p> 2898 </td> 2899<td> 2900 <p> 2901 A literal <code class="computeroutput"><span class="char">'\n'</span></code> 2902 </p> 2903 </td> 2904</tr> 2905<tr> 2906<td> 2907 <p> 2908 <code class="literal">\r</code> 2909 </p> 2910 </td> 2911<td> 2912 <p> 2913 A literal <code class="computeroutput"><span class="char">'\r'</span></code> 2914 </p> 2915 </td> 2916</tr> 2917<tr> 2918<td> 2919 <p> 2920 <code class="literal">\t</code> 2921 </p> 2922 </td> 2923<td> 2924 <p> 2925 A literal <code class="computeroutput"><span class="char">'\t'</span></code> 2926 </p> 2927 </td> 2928</tr> 2929<tr> 2930<td> 2931 <p> 2932 <code class="literal">\v</code> 2933 </p> 2934 </td> 2935<td> 2936 <p> 2937 A literal <code class="computeroutput"><span class="char">'\v'</span></code> 2938 </p> 2939 </td> 2940</tr> 2941<tr> 2942<td> 2943 <p> 2944 <code class="literal">\xFF</code> 2945 </p> 2946 </td> 2947<td> 2948 <p> 2949 A literal <code 
class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code> 2950 is any hex digit 2951 </p> 2952 </td> 2953</tr> 2954<tr> 2955<td> 2956 <p> 2957 <code class="literal">\x{FFFF}</code> 2958 </p> 2959 </td> 2960<td> 2961 <p> 2962 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code> 2963 is any hex digit 2964 </p> 2965 </td> 2966</tr> 2967<tr> 2968<td> 2969 <p> 2970 <code class="literal">\cX</code> 2971 </p> 2972 </td> 2973<td> 2974 <p> 2975 The control character <code class="literal"><span class="emphasis"><em>X</em></span></code> 2976 </p> 2977 </td> 2978</tr> 2979</tbody> 2980</table></div> 2981</div> 2982<br class="table-break"><h3> 2983<a name="boost_xpressive.user_s_guide.string_substitutions.h4"></a> 2984 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences">The 2985 Perl Format Sequences</a> 2986 </h3> 2987<p> 2988 When specifying the <code class="computeroutput"><span class="identifier">format_perl</span></code> 2989 flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>, 2990 the following escape sequences are recognized: 2991 </p> 2992<div class="table"> 2993<a name="boost_xpressive.user_s_guide.string_substitutions.t3"></a><p class="title"><b>Table 46.10. 
Perl Format Escape Sequences</b></p> 2994<div class="table-contents"><table class="table" summary="Perl Format Escape Sequences"> 2995<colgroup> 2996<col> 2997<col> 2998</colgroup> 2999<thead><tr> 3000<th> 3001 <p> 3002 Escape Sequence 3003 </p> 3004 </th> 3005<th> 3006 <p> 3007 Meaning 3008 </p> 3009 </th> 3010</tr></thead> 3011<tbody> 3012<tr> 3013<td> 3014 <p> 3015 <code class="literal">$1</code>, <code class="literal">$2</code>, etc. 3016 </p> 3017 </td> 3018<td> 3019 <p> 3020 the corresponding sub-match 3021 </p> 3022 </td> 3023</tr> 3024<tr> 3025<td> 3026 <p> 3027 <code class="literal">$&</code> 3028 </p> 3029 </td> 3030<td> 3031 <p> 3032 the full match 3033 </p> 3034 </td> 3035</tr> 3036<tr> 3037<td> 3038 <p> 3039 <code class="literal">$`</code> 3040 </p> 3041 </td> 3042<td> 3043 <p> 3044 the match prefix 3045 </p> 3046 </td> 3047</tr> 3048<tr> 3049<td> 3050 <p> 3051 <code class="literal">$'</code> 3052 </p> 3053 </td> 3054<td> 3055 <p> 3056 the match suffix 3057 </p> 3058 </td> 3059</tr> 3060<tr> 3061<td> 3062 <p> 3063 <code class="literal">$$</code> 3064 </p> 3065 </td> 3066<td> 3067 <p> 3068 a literal <code class="computeroutput"><span class="char">'$'</span></code> character 3069 </p> 3070 </td> 3071</tr> 3072<tr> 3073<td> 3074 <p> 3075 <code class="literal">\a</code> 3076 </p> 3077 </td> 3078<td> 3079 <p> 3080 A literal <code class="computeroutput"><span class="char">'\a'</span></code> 3081 </p> 3082 </td> 3083</tr> 3084<tr> 3085<td> 3086 <p> 3087 <code class="literal">\e</code> 3088 </p> 3089 </td> 3090<td> 3091 <p> 3092 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code> 3093 </p> 3094 </td> 3095</tr> 3096<tr> 3097<td> 3098 <p> 3099 <code class="literal">\f</code> 3100 </p> 3101 </td> 3102<td> 3103 <p> 3104 A literal <code class="computeroutput"><span class="char">'\f'</span></code> 3105 </p> 3106 </td> 3107</tr> 3108<tr> 
3109<td> 3110 <p> 3111 <code class="literal">\n</code> 3112 </p> 3113 </td> 3114<td> 3115 <p> 3116 A literal <code class="computeroutput"><span class="char">'\n'</span></code> 3117 </p> 3118 </td> 3119</tr> 3120<tr> 3121<td> 3122 <p> 3123 <code class="literal">\r</code> 3124 </p> 3125 </td> 3126<td> 3127 <p> 3128 A literal <code class="computeroutput"><span class="char">'\r'</span></code> 3129 </p> 3130 </td> 3131</tr> 3132<tr> 3133<td> 3134 <p> 3135 <code class="literal">\t</code> 3136 </p> 3137 </td> 3138<td> 3139 <p> 3140 A literal <code class="computeroutput"><span class="char">'\t'</span></code> 3141 </p> 3142 </td> 3143</tr> 3144<tr> 3145<td> 3146 <p> 3147 <code class="literal">\v</code> 3148 </p> 3149 </td> 3150<td> 3151 <p> 3152 A literal <code class="computeroutput"><span class="char">'\v'</span></code> 3153 </p> 3154 </td> 3155</tr> 3156<tr> 3157<td> 3158 <p> 3159 <code class="literal">\xFF</code> 3160 </p> 3161 </td> 3162<td> 3163 <p> 3164 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code> 3165 is any hex digit 3166 </p> 3167 </td> 3168</tr> 3169<tr> 3170<td> 3171 <p> 3172 <code class="literal">\x{FFFF}</code> 3173 </p> 3174 </td> 3175<td> 3176 <p> 3177 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code> 3178 is any hex digit 3179 </p> 3180 </td> 3181</tr> 3182<tr> 3183<td> 3184 <p> 3185 <code class="literal">\cX</code> 3186 </p> 3187 </td> 3188<td> 3189 <p> 3190 The control character <code class="literal"><span class="emphasis"><em>X</em></span></code> 3191 </p> 3192 </td> 3193</tr> 3194<tr> 3195<td> 3196 <p> 3197 <code class="literal">\l</code> 
3198 </p> 3199 </td> 3200<td> 3201 <p> 3202 Make the next character lowercase 3203 </p> 3204 </td> 3205</tr> 3206<tr> 3207<td> 3208 <p> 3209 <code class="literal">\L</code> 3210 </p> 3211 </td> 3212<td> 3213 <p> 3214 Make the rest of the substitution lowercase until the next <code class="literal">\E</code> 3215 </p> 3216 </td> 3217</tr> 3218<tr> 3219<td> 3220 <p> 3221 <code class="literal">\u</code> 3222 </p> 3223 </td> 3224<td> 3225 <p> 3226 Make the next character uppercase 3227 </p> 3228 </td> 3229</tr> 3230<tr> 3231<td> 3232 <p> 3233 <code class="literal">\U</code> 3234 </p> 3235 </td> 3236<td> 3237 <p> 3238 Make the rest of the substitution uppercase until the next <code class="literal">\E</code> 3239 </p> 3240 </td> 3241</tr> 3242<tr> 3243<td> 3244 <p> 3245 <code class="literal">\E</code> 3246 </p> 3247 </td> 3248<td> 3249 <p> 3250 Terminate <code class="literal">\L</code> or <code class="literal">\U</code> 3251 </p> 3252 </td> 3253</tr> 3254<tr> 3255<td> 3256 <p> 3257 <code class="literal">\1</code>, <code class="literal">\2</code>, etc. 
              </p>
            </td>
<td>
              <p>
                The corresponding sub-match
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="literal">\g&lt;name&gt;</code>
              </p>
            </td>
<td>
              <p>
                The named backreference <span class="emphasis"><em>name</em></span>
              </p>
            </td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><h3>
<a name="boost_xpressive.user_s_guide.string_substitutions.h5"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences">The
      Boost-Specific Format Sequences</a>
    </h3>
<p>
      When specifying the <code class="computeroutput"><span class="identifier">format_all</span></code>
      flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
      the escape sequences recognized are the same as those above for <code class="computeroutput"><span class="identifier">format_perl</span></code>. In addition, conditional expressions
      of the following form are recognized:
    </p>
<pre class="programlisting">?Ntrue-expression:false-expression
</pre>
<p>
      where <span class="emphasis"><em>N</em></span> is a decimal digit representing a sub-match.
      If the corresponding sub-match participated in the full match, then the substitution
      is <span class="emphasis"><em>true-expression</em></span>. Otherwise, it is <span class="emphasis"><em>false-expression</em></span>.
      In this mode, you can use parentheses <code class="literal">()</code> for grouping. If you
      want a literal parenthesis, you must escape it as <code class="literal">\(</code>.
3300 </p> 3301<h3> 3302<a name="boost_xpressive.user_s_guide.string_substitutions.h6"></a> 3303 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_objects"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_objects">Formatter 3304 Objects</a> 3305 </h3> 3306<p> 3307 Format strings are not always expressive enough for all your text substitution 3308 needs. Consider the simple example of wanting to map input strings to output 3309 strings, as you may want to do with environment variables. Rather than a 3310 format <span class="emphasis"><em>string</em></span>, for this you would use a formatter <span class="emphasis"><em>object</em></span>. 3311 Consider the following code, which finds embedded environment variables of 3312 the form <code class="computeroutput"><span class="string">"$(XYZ)"</span></code> and 3313 computes the substitution string by looking up the environment variable in 3314 a map. 
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env</span><span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">format_fun</span><span class="special">(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span><span class="special">)</span>
<span class="special">{</span>
    <span class="keyword">return</span> <span class="identifier">env</span><span class="special">[</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">].</span><span class="identifier">str</span><span class="special">()];</span>
<span class="special">}</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="comment">// replace strings like "$(XYZ)" with the result of env["XYZ"]</span>
    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">format_fun</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      In this case, we use a function, <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code>, to compute the substitution string on the
      fly. It accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
      object which contains the results of the current match. <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code> uses the first sub-match as a key into the
      global <code class="computeroutput"><span class="identifier">env</span></code> map. Because the map's <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]</span></code>
      inserts an empty string for a missing key, an unknown variable is replaced with nothing. The above
      code displays:
    </p>
<pre class="programlisting">"this" has the value "that"
</pre>
<p>
      The formatter need not be an ordinary function. It may be an object of class
      type. And rather than return a string, it may accept an output iterator into
      which it writes the substitution. Consider the following, which is functionally
      equivalent to the above.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span> <span class="comment">// std::copy</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">struct</span> <span class="identifier">formatter</span>
<span class="special">{</span>
    <span class="keyword">typedef</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env_map</span><span class="special">;</span>
    <span class="identifier">env_map</span> <span class="identifier">env</span><span class="special">;</span>

    <span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Out</span><span class="special">&gt;</span>
    <span class="identifier">Out</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span><span class="special">,</span> <span class="identifier">Out</span> <span class="identifier">out</span><span class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="identifier">env_map</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">where</span> <span class="special">=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">find</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]);</span>
        <span class="keyword">if</span><span class="special">(</span><span class="identifier">where</span> <span class="special">!=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">end</span><span class="special">())</span>
        <span class="special">{</span>
            <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">sub</span> <span class="special">=</span> <span class="identifier">where</span><span class="special">-&gt;</span><span class="identifier">second</span><span class="special">;</span>
            <span class="identifier">out</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">out</span><span class="special">);</span>
        <span class="special">}</span>
        <span class="keyword">return</span> <span class="identifier">out</span><span class="special">;</span>
    <span class="special">}</span>

<span class="special">};</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">formatter</span> <span class="identifier">fmt</span><span class="special">;</span>
    <span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
3393 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">fmt</span><span class="special">);</span> 3394 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 3395<span class="special">}</span> 3396</pre> 3397<p> 3398 The formatter must be a callable object -- a function or a function object 3399 -- that has one of three possible signatures, detailed in the table below. 3400 For the table, <code class="computeroutput"><span class="identifier">fmt</span></code> is a function 3401 pointer or function object, <code class="computeroutput"><span class="identifier">what</span></code> 3402 is a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 3403 object, <code class="computeroutput"><span class="identifier">out</span></code> is an OutputIterator, 3404 and <code class="computeroutput"><span class="identifier">flags</span></code> is a value of 3405 <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_flag_type</span></code>: 3406 </p> 3407<div class="table"> 3408<a name="boost_xpressive.user_s_guide.string_substitutions.t4"></a><p class="title"><b>Table 46.11. 
Formatter Signatures</b></p>
<div class="table-contents"><table class="table" summary="Formatter Signatures">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
            <p>
              Formatter Invocation
            </p>
          </th>
<th>
            <p>
              Return Type
            </p>
          </th>
<th>
            <p>
              Semantics
            </p>
          </th>
</tr></thead>
<tbody>
<tr>
<td>
            <p>
              <code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">)</span></code>
            </p>
          </td>
<td>
            <p>
              Range of characters (e.g. <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>)
              or null-terminated string
            </p>
          </td>
<td>
            <p>
              The string matched by the regex is replaced with the string returned
              by the formatter.
            </p>
          </td>
</tr>
<tr>
<td>
            <p>
              <code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span>
              <span class="identifier">out</span><span class="special">)</span></code>
            </p>
          </td>
<td>
            <p>
              OutputIterator
            </p>
          </td>
<td>
            <p>
              The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns <code class="computeroutput"><span class="identifier">out</span></code>.
            </p>
          </td>
</tr>
<tr>
<td>
            <p>
              <code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span>
              <span class="identifier">out</span><span class="special">,</span>
              <span class="identifier">flags</span><span class="special">)</span></code>
            </p>
          </td>
<td>
            <p>
              OutputIterator
            </p>
          </td>
<td>
            <p>
              The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns <code class="computeroutput"><span class="identifier">out</span></code>. The <code class="computeroutput"><span class="identifier">flags</span></code>
              parameter is the value of the match flags passed to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
              algorithm.
            </p>
          </td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><h3>
<a name="boost_xpressive.user_s_guide.string_substitutions.h7"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_expressions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_expressions">Formatter
      Expressions</a>
    </h3>
<p>
      In addition to format <span class="emphasis"><em>strings</em></span> and formatter <span class="emphasis"><em>objects</em></span>,
      <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
      also accepts formatter <span class="emphasis"><em>expressions</em></span>. A formatter expression
      is a lambda expression that generates a string.
It uses the same syntax as 3504 that for <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic 3505 Actions</a>, which are covered later. The above example, which uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code> 3506 to substitute strings for environment variables, is repeated here using a 3507 formatter expression. 3508 </p> 3509<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span> 3510<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span> 3511<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span> 3512<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> 3513<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> 3514<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span> 3515 3516<span 
class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span> 3517<span class="special">{</span> 3518 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span> <span class="identifier">env</span><span class="special">;</span> 3519 <span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span> 3520 <span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span> 3521 3522 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span> 3523 3524 <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span> 3525 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span 
class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]);</span> 3526 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 3527<span class="special">}</span> 3528</pre> 3529<p> 3530 In the above, the formatter expression is <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>. This 3531 means to use the value of the first submatch, <code class="computeroutput"><span class="identifier">s1</span></code>, 3532 as a key into the <code class="computeroutput"><span class="identifier">env</span></code> map. 3533 The purpose of <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> 3534 here is to make the reference to the <code class="computeroutput"><span class="identifier">env</span></code> 3535 local variable <span class="emphasis"><em>lazy</em></span> so that the index operation is deferred 3536 until we know what to replace <code class="computeroutput"><span class="identifier">s1</span></code> 3537 with. 
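    </p>
<p>
      Formatter expressions are compact, but a formatter object may still be the
      better fit when the substitution needs logic that is awkward to write as
      an expression. The following is a minimal sketch, not taken from the xpressive
      sources: the names <code class="computeroutput">env_formatter</code> and
      <code class="computeroutput">expand</code> are hypothetical, and the policy of leaving
      unknown variables untouched is an assumption chosen for illustration. It
      relies on the <code class="computeroutput">fmt(what)</code> signature from the
      table of formatter signatures.
    </p>
<pre class="programlisting">#include <map>
#include <string>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical formatter object (the name is illustrative, not part of xpressive).
// It is invoked as fmt(what) and returns the replacement as a std::string.
struct env_formatter
{
    typedef std::map<std::string, std::string> env_map;
    env_map env;

    std::string operator()(boost::xpressive::smatch const &what) const
    {
        // what[1] is the variable name captured by s1
        env_map::const_iterator where = env.find(what[1].str());
        if(where != env.end())
        {
            return where->second; // known variable: substitute its value
        }
        // Assumed policy: an unknown variable is replaced with the whole
        // match, which leaves the original "$(NAME)" text in place.
        return what[0].str();
    }
};

// Hypothetical helper that expands $(NAME) references found in input.
std::string expand(std::string const &input, env_formatter const &fmt)
{
    using namespace boost::xpressive;
    sregex envar = "$(" >> (s1 = +_w) >> ')';
    return regex_replace(input, envar, fmt);
}
</pre>
<p>
      Returning <code class="computeroutput">what[0]</code> for an unknown variable
      keeps the original text in place; returning an empty string instead would
      delete the reference from the output. Which behavior is appropriate depends
      on the application.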
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization" title="String Splitting and Tokenization">String
      Splitting and Tokenization</a>
</h3></div></div></div>
<p>
      <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      is the Ginsu knife of the text manipulation world. It slices! It dices! This
      section describes how to use the highly configurable <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop up input sequences.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview">Overview</a>
    </h3>
<p>
      You initialize a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      with an input sequence, a regex, and some optional configuration parameters.
      The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      will use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
      to find the first place in the sequence that the regex matches. When dereferenced,
      the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      returns a <span class="emphasis"><em>token</em></span> in the form of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>. Which string it returns depends
      on the configuration parameters. By default it returns a string corresponding
      to the full match, but it could also return a string corresponding to a particular
      marked sub-expression, or even the part of the sequence that <span class="emphasis"><em>didn't</em></span>
      match. When you increment the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>,
      it will move to the next token. Which token is next depends on the configuration
      parameters. It could simply be a different marked sub-expression in the current
      match, or it could be part or all of the next match. Or it could be the part
      that <span class="emphasis"><em>didn't</em></span> match.
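    </p>
<p>
      The dereference and increment steps described above can be sketched with
      an explicit loop. This is a minimal illustration rather than a canonical
      idiom: the helper name <code class="computeroutput">collect_words</code> is
      hypothetical, and it uses the default configuration, so each token is a
      full match.
    </p>
<pre class="programlisting">#include <string>
#include <vector>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical helper: gathers every full match of +_w into a vector.
std::vector<std::string> collect_words(std::string const &input)
{
    using namespace boost::xpressive;
    sregex re = +_w; // find a word

    std::vector<std::string> tokens;

    // A default-constructed token iterator marks the end of the sequence.
    sregex_token_iterator cur(input.begin(), input.end(), re), end;
    for(; cur != end; ++cur)      // incrementing moves to the next token
    {
        tokens.push_back(*cur);   // dereferencing yields the current token
    }
    return tokens;
}
</pre>
<p>
      Both the input string and the regex must remain alive for as long as the
      iterator is in use, which is why the sketch keeps them in the same scope
      as the loop.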
    </p>
<p>
      As you can see, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      can do a lot. That makes it hard to describe, but some examples should make
      it clear.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization">Example
      1: Simple Tokenization</a>
    </h3>
<p>
      This example uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop a sequence into a series of tokens consisting of words.
3585 </p> 3586<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span> 3587<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// find a word</span> 3588 3589<span class="comment">// iterate over all the words in the input</span> 3590<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span> 3591 3592<span class="comment">// write all the words to std::cout</span> 3593<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span> 3594<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span 
class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">This
is
his
face
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h2"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded">Example
      2: Simple Tokenization, Reloaded</a>
    </h3>
<p>
      This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop a sequence into a series of tokens consisting of words, but it uses
      the regex as a delimiter. When we pass a <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      constructor, it instructs the token iterator to consider as tokens those
      parts of the input that <span class="emphasis"><em>didn't</em></span> match the regex.
3615 </p> 3616<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span> 3617<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_s</span><span class="special">;</span> <span class="comment">// find white space</span> 3618 3619<span class="comment">// iterate over all non-white space in the input. Note the -1 below:</span> 3620<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span> 3621 3622<span class="comment">// write all the words to std::cout</span> 3623<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span> 3624<span class="identifier">std</span><span class="special">::</span><span 
class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">This
is
his
face
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h3"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions">Example
      3: Simple Tokenization, Revolutions</a>
    </h3>
<p>
      This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop a sequence containing a bunch of dates into a series of tokens consisting
      of just the years. When we pass a positive integer <code class="literal"><span class="emphasis"><em>N</em></span></code>
      as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      constructor, it instructs the token iterator to consider as tokens only the
      <code class="literal"><span class="emphasis"><em>N</em></span></code>-th marked sub-expression of each
      match.
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date</span>

<span class="comment">// iterate over all the years in the input. Note the 3 below, corresponding to the 3rd sub-expression:</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="number">3</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>

<span class="comment">// write all the years to std::cout</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span>
<span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">2003
1999
1981
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h4"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization">Example
      4: Not-So-Simple Tokenization</a>
    </h3>
<p>
      This example is like the previous one, except that instead of tokenizing
      just the years, this program turns the days, months and years into tokens.
      When we pass an array of integers <code class="literal"><span class="emphasis"><em>{I,J,...}</em></span></code>
      as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      constructor, it instructs the token iterator to consider as tokens the <code class="literal"><span class="emphasis"><em>I</em></span></code>-th,
      <code class="literal"><span class="emphasis"><em>J</em></span></code>-th, etc. marked sub-expression
      of each match.
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date</span>

<span class="comment">// iterate over the days, months and years in the input</span>
<span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">sub_matches</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">2</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">3</span> <span class="special">};</span> <span class="comment">// day, month, year</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">sub_matches</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>

<span class="comment">// write all the tokens to std::cout</span>
<span class="identifier">std</span><span class="special">::</span><span
class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">02
01
2003
23
04
1999
13
11
1981
</pre>
<p>
      The <code class="computeroutput"><span class="identifier">sub_matches</span></code> array instructs
      the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to first take the value of the 2nd sub-match, then the 1st sub-match, and
      finally the 3rd. Incrementing the iterator again instructs it to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
      again to find the next match. At that point, the process repeats -- the token
      iterator takes the value of the 2nd sub-match, then the 1st, et cetera.
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.named_captures"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures" title="Named Captures">Named Captures</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.named_captures.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.overview">Overview</a>
    </h3>
<p>
      For complicated regular expressions, dealing with numbered captures can be
      a pain. Counting left parentheses to figure out which capture to reference
      is no fun. Less fun is the fact that merely editing a regular expression
      could cause a capture to be assigned a new number, invalidating code that refers
      back to it by the old number.
    </p>
<p>
      Other regular expression engines solve this problem with a feature called
      <span class="emphasis"><em>named captures</em></span>. This feature allows you to assign a
      name to a capture, and to refer back to the capture by name rather than by number.
      Xpressive also supports named captures, both in dynamic and in static regexes.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.named_captures.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.dynamic_named_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.dynamic_named_captures">Dynamic
      Named Captures</a>
    </h3>
<p>
      For dynamic regular expressions, xpressive follows the lead of other popular
      regex engines with the syntax of named captures.
You can create a named capture 3741 with <code class="computeroutput"><span class="string">"(?P<xxx>...)"</span></code> 3742 and refer back to that capture with <code class="computeroutput"><span class="string">"(?P=xxx)"</span></code>. 3743 Here, for instance, is a regular expression that creates a named capture 3744 and refers back to it: 3745 </p> 3746<pre class="programlisting"><span class="comment">// Create a named capture called "char" that matches a single</span> 3747<span class="comment">// character and refer back to that capture by name.</span> 3748<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span> 3749</pre> 3750<p> 3751 The effect of the above regular expression is to find the first doubled character. 3752 </p> 3753<p> 3754 Once you have executed a match or search operation using a regex with named 3755 captures, you can access the named capture through the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 3756 object using the capture's name. 
3757 </p> 3758<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span> 3759<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span> 3760<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span> 3761<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span> 3762<span class="special">{</span> 3763 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"char = "</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="string">"char"</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 3764<span class="special">}</span> 3765</pre> 3766<p> 3767 The above code displays: 3768 </p> 3769<pre class="programlisting">char = e 3770</pre> 3771<p> 3772 You can also refer back to a named capture from within a substitution string. 3773 The syntax for that is <code class="computeroutput"><span class="string">"\\g<xxx>"</span></code>. 
3774 Below is some code that demonstrates how to use named captures when doing 3775 string substitution. 3776 </p> 3777<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span> 3778<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span> 3779<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**\\g<char>**"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">format_perl</span><span class="special">);</span> 3780<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 3781</pre> 3782<p> 3783 Notice that you have to specify <code class="computeroutput"><span class="identifier">format_perl</span></code> 3784 when using named captures. Only the perl syntax recognizes the <code class="computeroutput"><span class="string">"\\g<xxx>"</span></code> syntax. 
The above 3785 code displays: 3786 </p> 3787<pre class="programlisting">tw**e**t 3788</pre> 3789<h3> 3790<a name="boost_xpressive.user_s_guide.named_captures.h2"></a> 3791 <span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.static_named_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.static_named_captures">Static 3792 Named Captures</a> 3793 </h3> 3794<p> 3795 If you're using static regular expressions, creating and using named captures 3796 is even easier. You can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code></code> 3797 type to create a variable that you can use like <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>, <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s2</a></code> and friends, but with a 3798 name that is more meaningful. 
Below is how the above example would look using 3799 static regexes: 3800 </p> 3801<pre class="programlisting"><span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span> <span class="comment">// char_ is now a synonym for s1</span> 3802<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span> 3803</pre> 3804<p> 3805 After a match operation, you can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code> 3806 to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 3807 to access the named capture: 3808 </p> 3809<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span> 3810<span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span> 3811<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span> 3812<span class="identifier">smatch</span> <span 
class="identifier">what</span><span class="special">;</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"char = "</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">char_</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      The above code displays:
    </p>
<pre class="programlisting">char = e
</pre>
<p>
      When doing string substitutions with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
      you can use named captures to create <span class="emphasis"><em>format expressions</em></span>
      as below:
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
<span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span
class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span> 3831<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**"</span> <span class="special">+</span> <span class="identifier">char_</span> <span class="special">+</span> <span class="string">"**"</span><span class="special">);</span> 3832<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 3833</pre> 3834<p> 3835 The above code displays: 3836 </p> 3837<pre class="programlisting">tw**e**t 3838</pre> 3839<div class="note"><table border="0" summary="Note"> 3840<tr> 3841<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td> 3842<th align="left">Note</th> 3843</tr> 3844<tr><td align="left" valign="top"><p> 3845 You need to include <code class="literal"><boost/xpressive/regex_actions.hpp></code> 3846 to use format expressions. 
3847 </p></td></tr> 3848</table></div> 3849</div> 3850<div class="section"> 3851<div class="titlepage"><div><div><h3 class="title"> 3852<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars 3853 and Nested Matches</a> 3854</h3></div></div></div> 3855<h3> 3856<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h0"></a> 3857 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.overview">Overview</a> 3858 </h3> 3859<p> 3860 One of the key benefits of representing regexes as C++ expressions is the 3861 ability to easily refer to other C++ code and data from within the regex. 3862 This enables programming idioms that are not possible with other regular 3863 expression libraries. Of particular note is the ability for one regex to 3864 refer to another regex, allowing you to build grammars out of regular expressions. 3865 This section describes how to embed one regex in another by value and by 3866 reference, how regex objects behave when they refer to other regexes, and 3867 how to access the tree of results after a successful parse. 3868 </p> 3869<h3> 3870<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h1"></a> 3871 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value">Embedding 3872 a Regex by Value</a> 3873 </h3> 3874<p> 3875 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code> 3876 object has value semantics. 
When a regex object appears on the right-hand 3877 side in the definition of another regex, it is as if the regex were embedded 3878 by value; that is, a copy of the nested regex is stored by the enclosing 3879 regex. The inner regex is invoked by the outer regex during pattern matching. 3880 The inner regex participates fully in the match, back-tracking as needed 3881 to make the match succeed. 3882 </p> 3883<p> 3884 Consider a text editor that has a regex-find feature with a whole-word option. 3885 You can implement this with xpressive as follows: 3886 </p> 3887<pre class="programlisting"><span class="identifier">find_dialog</span> <span class="identifier">dlg</span><span class="special">;</span> 3888<span class="keyword">if</span><span class="special">(</span> <span class="identifier">dialog_ok</span> <span class="special">==</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">do_modal</span><span class="special">()</span> <span class="special">)</span> 3889<span class="special">{</span> 3890 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">pattern</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">get_text</span><span class="special">();</span> <span class="comment">// the pattern the user entered</span> 3891 <span class="keyword">bool</span> <span class="identifier">whole_word</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">whole_word</span><span class="special">.</span><span class="identifier">is_checked</span><span class="special">();</span> <span class="comment">// did the user select the whole-word option?</span> 3892 3893 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span 
class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="identifier">pattern</span> <span class="special">);</span> <span class="comment">// try to compile the pattern</span> 3894 3895 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">whole_word</span> <span class="special">)</span> 3896 <span class="special">{</span> 3897 <span class="comment">// wrap the regex in begin-word / end-word assertions</span> 3898 <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">>></span> <span class="identifier">re</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">;</span> 3899 <span class="special">}</span> 3900 3901 <span class="comment">// ... use re ...</span> 3902<span class="special">}</span> 3903</pre> 3904<p> 3905 Look closely at this line: 3906 </p> 3907<pre class="programlisting"><span class="comment">// wrap the regex in begin-word / end-word assertions</span> 3908<span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">>></span> <span class="identifier">re</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">;</span> 3909</pre> 3910<p> 3911 This line creates a new regex that embeds the old regex by value. Then, the 3912 new regex is assigned back to the original regex. Since a copy of the old 3913 regex was made on the right-hand side, this works as you might expect: the 3914 new regex has the behavior of the old regex wrapped in begin- and end-word 3915 assertions. 
3916 </p> 3917<div class="note"><table border="0" summary="Note"> 3918<tr> 3919<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td> 3920<th align="left">Note</th> 3921</tr> 3922<tr><td align="left" valign="top"><p> 3923 Note that <code class="computeroutput"><span class="identifier">re</span> <span class="special">=</span> 3924 <span class="identifier">bow</span> <span class="special">>></span> 3925 <span class="identifier">re</span> <span class="special">>></span> 3926 <span class="identifier">eow</span></code> does <span class="emphasis"><em>not</em></span> 3927 define a recursive regular expression, since regex objects embed by value 3928 by default. The next section shows how to define a recursive regular expression 3929 by embedding a regex by reference. 3930 </p></td></tr> 3931</table></div> 3932<h3> 3933<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h2"></a> 3934 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference">Embedding 3935 a Regex by Reference</a> 3936 </h3> 3937<p> 3938 If you want to be able to build recursive regular expressions and context-free 3939 grammars, embedding a regex by value is not enough. You need to be able to 3940 make your regular expressions self-referential. Most regular expression engines 3941 don't give you that power, but xpressive does. 
3942 </p> 3943<div class="tip"><table border="0" summary="Tip"> 3944<tr> 3945<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td> 3946<th align="left">Tip</th> 3947</tr> 3948<tr><td align="left" valign="top"><p> 3949 The theoretical computer scientists out there will correctly point out 3950 that a self-referential regular expression is not "regular", 3951 so in the strict sense, xpressive isn't really a <span class="emphasis"><em>regular</em></span> 3952 expression engine at all. But as Larry Wall once said, "the term [regular expression] has 3953 grown with the capabilities of our pattern matching engines, so I'm not 3954 going to try to fight linguistic necessity here." 3955 </p></td></tr> 3956</table></div> 3957<p> 3958 Consider the following code, which uses the <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code> helper to define a recursive regular expression 3959 that matches balanced, nested parentheses: 3960 </p> 3961<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span> 3962<span class="identifier">parentheses</span> <span class="comment">// A balanced set of parentheses ...</span> 3963 <span class="special">=</span> <span class="char">'('</span> <span class="comment">// is an opening parenthesis ...</span> 3964 <span class="special">>></span> <span class="comment">// followed by ...</span> 3965 <span class="special">*(</span> <span class="comment">// zero or more ...</span> 3966 <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="comment">// of a bunch of things that are not 
parentheses ...</span> 3967 <span class="special">|</span> <span class="comment">// or ...</span> 3968 <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="comment">// a balanced set of parentheses</span> 3969 <span class="special">)</span> <span class="comment">// (ooh, recursion!) ...</span> 3970 <span class="special">>></span> <span class="comment">// followed by ...</span> 3971 <span class="char">')'</span> <span class="comment">// a closing parenthesis</span> 3972 <span class="special">;</span> 3973</pre> 3974<p> 3975 Matching balanced, nested tags is an important text processing task, and 3976 it is one that "classic" regular expressions cannot do. The <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code> 3977 helper makes it possible. It allows one regex object to be embedded in another 3978 <span class="emphasis"><em>by reference</em></span>. Since the right-hand side holds <code class="computeroutput"><span class="identifier">parentheses</span></code> by reference, assigning the 3979 right-hand side back to <code class="computeroutput"><span class="identifier">parentheses</span></code> 3980 creates a cycle, which will execute recursively. 3981 </p> 3982<h3> 3983<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h3"></a> 3984 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar">Building 3985 a Grammar</a> 3986 </h3> 3987<p> 3988 Once we allow self-reference in our regular expressions, the genie is out 3989 of the bottle and all manner of fun things are possible. In particular, we 3990 can now build grammars out of regular expressions. Let's have a look at the 3991 text-book grammar example: the humble calculator. 
3992 </p> 3993<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">group</span><span class="special">,</span> <span class="identifier">factor</span><span class="special">,</span> <span class="identifier">term</span><span class="special">,</span> <span class="identifier">expression</span><span class="special">;</span> 3994 3995<span class="identifier">group</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">>></span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">expression</span><span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span> 3996<span class="identifier">factor</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">group</span><span class="special">;</span> 3997<span class="identifier">term</span> <span class="special">=</span> <span class="identifier">factor</span> <span class="special">>></span> <span class="special">*((</span><span class="char">'*'</span> <span class="special">>></span> <span class="identifier">factor</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'/'</span> <span class="special">>></span> <span class="identifier">factor</span><span class="special">));</span> 3998<span class="identifier">expression</span> <span class="special">=</span> <span class="identifier">term</span> <span class="special">>></span> <span class="special">*((</span><span class="char">'+'</span> <span class="special">>></span> <span class="identifier">term</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'-'</span> <span class="special">>></span> <span class="identifier">term</span><span class="special">));</span> 3999</pre> 4000<p> 4001 The 
regex <code class="computeroutput"><span class="identifier">expression</span></code> defined 4002 above does something rather remarkable for a regular expression: it matches 4003 mathematical expressions. For example, if the input string were <code class="computeroutput"><span class="string">"foo 9*(10+3) bar"</span></code>, this pattern 4004 would match <code class="computeroutput"><span class="string">"9*(10+3)"</span></code>. 4005 It only matches well-formed mathematical expressions, where the parentheses 4006 are balanced and the infix operators have two arguments each. Don't try this 4007 with just any regular expression engine! 4008 </p> 4009<p> 4010 Let's take a closer look at this regular expression grammar. Notice that 4011 it is cyclic: <code class="computeroutput"><span class="identifier">expression</span></code> 4012 is implemented in terms of <code class="computeroutput"><span class="identifier">term</span></code>, 4013 which is implemented in terms of <code class="computeroutput"><span class="identifier">factor</span></code>, 4014 which is implemented in terms of <code class="computeroutput"><span class="identifier">group</span></code>, 4015 which is implemented in terms of <code class="computeroutput"><span class="identifier">expression</span></code>, 4016 closing the loop. In general, the way to define a cyclic grammar is to forward-declare 4017 the regex objects and embed by reference those regular expressions that have 4018 not yet been initialized. In the above grammar, there is only one place where 4019 we need to reference a regex object that has not yet been initialized: the 4020 definition of <code class="computeroutput"><span class="identifier">group</span></code>. In that 4021 place, we use <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code> 4022 to embed <code class="computeroutput"><span class="identifier">expression</span></code> by reference. 
4023 In all other places, it is sufficient to embed the other regex objects by 4024 value, since they have already been initialized and their values will not 4025 change. 4026 </p> 4027<div class="tip"><table border="0" summary="Tip"> 4028<tr> 4029<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td> 4030<th align="left">Tip</th> 4031</tr> 4032<tr><td align="left" valign="top"><p> 4033 <span class="bold"><strong>Embed by value if possible</strong></span> <br> <br> 4034 In general, prefer embedding regular expressions by value rather than by 4035 reference. It involves one less indirection, making your patterns match 4036 a little faster. Besides, value semantics are simpler and will make your 4037 grammars easier to reason about. Don't worry about the expense of "copying" 4038 a regex. Each regex object shares its implementation with all of its copies. 4039 </p></td></tr> 4040</table></div> 4041<h3> 4042<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h4"></a> 4043 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars">Dynamic 4044 Regex Grammars</a> 4045 </h3> 4046<p> 4047 Using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>, 4048 you can also build grammars out of dynamic regular expressions. You do that 4049 by creating named regexes, and referring to other regexes by name. Each 4050 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code> 4051 instance keeps a mapping from names to regexes that have been created with 4052 it. 
    </p>
<p>
      You can create a named dynamic regex by prefacing your regex with <code class="computeroutput"><span class="string">"(?$name=)"</span></code>, where <span class="emphasis"><em>name</em></span>
      is the name of the regex. You can refer to a named regex from another regex
      with <code class="computeroutput"><span class="string">"(?$name)"</span></code>. The
      named regex does not need to exist yet at the time it is referenced in another
      regex, but it must exist by the time you use the regex.
    </p>
<p>
      Below is a code fragment that uses dynamic regex grammars to implement the
      calculator example from above.
    </p>
<pre class="programlisting"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">regex_constants</span><span class="special">;</span>

<span class="identifier">sregex</span> <span class="identifier">expr</span><span class="special">;</span>

<span class="special">{</span>
    <span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
    <span class="identifier">syntax_option_type</span> <span class="identifier">x</span> <span class="special">=</span> <span class="identifier">ignore_white_space</span><span class="special">;</span>

    <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $group = ) \\( (? $expr ) \\) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
    <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $factor = ) \\d+ | (? $group ) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
    <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $term = ) (? $factor )"</span>
                     <span class="string">" ( \\* (? $factor ) | / (? $factor ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
    <span class="identifier">expr</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $expr = ) (? $term )"</span>
                     <span class="string">" ( \\+ (? $term ) | - (? $term ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
<span class="special">}</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"foo 9*(10+3) bar"</span><span class="special">);</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">expr</span><span class="special">))</span>
<span class="special">{</span>
    <span class="comment">// This prints "9*(10+3)":</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      As with static regex grammars, nested regex invocations create nested match
      results (see <span class="emphasis"><em>Nested Results</em></span> below). The result is a
      complete parse tree for the string that matched. Unlike static regexes, dynamic
      regexes are always embedded by reference, not by value.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h5"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_">Cyclic
      Patterns, Copying and Memory Management, Oh My!</a>
    </h3>
<p>
      The calculator examples above raise a number of very complicated memory-management
      issues. Each of the four regex objects refers to the others, some directly
      and some indirectly, some by value and some by reference. What if we were
      to return one of them from a function and let the others go out of scope?
      What becomes of the references? The answer is that the regex objects are
      internally reference counted, such that they keep their referenced regex
      objects alive as long as they need them. So passing a regex object by value
      is never a problem, even if it refers to other regex objects that have gone
      out of scope.
    </p>
<p>
      Those of you who have dealt with reference counting are probably familiar
      with its Achilles' heel: cyclic references. If regex objects are reference
      counted, what happens to cycles like the one created in the calculator examples?
      Are they leaked? The answer is no, they are not leaked. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
      object has some tricky reference tracking code that ensures that even cyclic
      regex grammars are cleaned up when the last external reference goes away.
      So don't worry about it. Create cyclic grammars, pass your regex objects
      around and copy them all you want.
      It is fast and efficient and guaranteed
      not to leak or result in dangling references.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h6"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping">Nested
      Regexes and Sub-Match Scoping</a>
    </h3>
<p>
      Nested regular expressions raise the issue of sub-match scoping. If both
      the inner and outer regex wrote to and read from the same sub-match vector,
      chaos would ensue. The inner regex would stomp on the sub-matches written
      by the outer regex. For example, what does this do?
    </p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">inner</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(.)\\1"</span> <span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">outer</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">inner</span> <span class="special">>></span> <span class="identifier">s1</span><span class="special">;</span>
</pre>
<p>
      The author probably didn't intend for the inner regex to overwrite the sub-match
      written by the outer regex. The problem is particularly acute when the inner
      regex is accepted from the user as input. The author has no way of knowing
      whether the inner regex will stomp the sub-match vector or not. This is clearly
      not acceptable.
    </p>
<p>
      Instead, what actually happens is that each invocation of a nested regex
      gets its own scope. Sub-matches belong to that scope. That is, each nested
      regex invocation gets its own copy of the sub-match vector to play with,
      so there is no way for an inner regex to stomp on the sub-matches of an outer
      regex. So, for example, the regex <code class="computeroutput"><span class="identifier">outer</span></code>
      defined above would match <code class="computeroutput"><span class="string">"ABBA"</span></code>,
      as it should.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h7"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results">Nested
      Results</a>
    </h3>
<p>
      If nested regexes have their own sub-matches, there should be a way to access
      them after a successful match. In fact, there is. After a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
      or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
      the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
      struct behaves like the head of a tree of nested results.
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
      class provides a <code class="computeroutput"><span class="identifier">nested_results</span><span class="special">()</span></code> member function that returns an ordered
      sequence of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
      structures, representing the results of the nested regexes. The order of
      the nested results is the same as the order in which the nested regex objects
      matched.
    </p>
<p>
      Take as an example the regex for balanced, nested parentheses we saw earlier:
    </p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
<span class="identifier">parentheses</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">>></span> <span class="special">*(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="special">|</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>

<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"blah blah( a(b)c (c(e)f (g)h )i (j)6 )blah"</span> <span class="special">);</span>

<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">parentheses</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
    <span class="comment">// display the whole match</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="comment">// display the nested results</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
        <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
        <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
        <span class="identifier">output_nested_results</span><span class="special">()</span> <span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">( a(b)c (c(e)f (g)h )i (j)6 )
    (b)
    (c(e)f (g)h )
        (e)
        (g)
    (j)
</pre>
<p>
      Here you can see how the results are nested and that they are stored in the
      order in which they are found.
    </p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
        See the definition of <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">output_nested_results</a>
        in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
        section.
      </p></td></tr>
</table></div>
<h3>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h8"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results">Filtering
      Nested Results</a>
    </h3>
<p>
      Sometimes a regex will have several nested regex objects, and you want to
      know which result corresponds to which regex object. That's where <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
      and <code class="computeroutput"><span class="identifier">match_results</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
      come in handy. When iterating over the nested results, you can compare the
      regex id from the results to the id of the regex object you're interested
      in.
    </p>
<p>
      To make this easier, xpressive provides a predicate for iterating over
      just the results that correspond to a certain nested regex.
      It is called <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>,
      and it is intended to be used with <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>.
      You can use it as follows:
    </p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">name</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alpha</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">integer</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">*(</span> <span class="special">*</span><span class="identifier">_s</span> <span class="special">>></span> <span class="special">(</span> <span class="identifier">name</span> <span class="special">|</span> <span class="identifier">integer</span> <span class="special">)</span> <span class="special">);</span>

<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"marsha 123 jan 456 cindy 789"</span> <span class="special">);</span>

<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">re</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
    <span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">begin</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">();</span>
    <span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">end</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">();</span>

    <span class="comment">// declare filter predicates to select just the names or the integers</span>
    <span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">name_id</span><span class="special">(</span> <span class="identifier">name</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
    <span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">integer_id</span><span class="special">(</span> <span class="identifier">integer</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>

    <span class="comment">// iterate over only the results from the name regex</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
        <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
        <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
        <span class="identifier">output_result</span>
        <span class="special">);</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="comment">// iterate over only the results from the integer regex</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
        <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
        <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
        <span class="identifier">output_result</span>
        <span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
      where <code class="computeroutput"><span class="identifier">output_result</span></code> is a
      simple function that takes a <code class="computeroutput"><span class="identifier">smatch</span></code>
      and displays the full match. Notice how we use the <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>
      together with <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">()</span></code> from <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>
      to select only those results corresponding to a particular nested regex.
      This program displays the following:
    </p>
<pre class="programlisting">marsha
jan
cindy
123
456
789
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
      Actions and User-Defined Assertions</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview">Overview</a>
    </h3>
<p>
      Imagine you want to parse an input string and build a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>
      from it. For something like that, matching a regular expression isn't enough.
      You want to <span class="emphasis"><em>do something</em></span> when parts of your regular
      expression match. Xpressive lets you attach semantic actions to parts of
      your static regular expressions. This section shows you how.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions">Semantic
      Actions</a>
    </h3>
<p>
      Consider the following code, which uses xpressive's semantic actions to parse
      a string of word/integer pairs and stuffs them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>.
      It is described below.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>

    <span class="comment">// Match a word and an integer, separated by =>,</span>
    <span class="comment">// and then stuff the result into a std::map<></span>
    <span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
        <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>

    <span class="comment">// Match one or more word/integer pairs, separated</span>
    <span class="comment">// by whitespace.</span>
    <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">>></span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">>></span> <span class="identifier">pair</span><span class="special">);</span>

    <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program prints the following:
    </p>
<pre class="programlisting">1
23
456
</pre>
<p>
      The regular expression <code class="computeroutput"><span class="identifier">pair</span></code>
      has two parts: the pattern and the action. The pattern says to match a word,
      capturing it in sub-match 1, and an integer, capturing it in sub-match 2,
      separated by <code class="computeroutput"><span class="string">"=>"</span></code>.
      The action is the part in square brackets: <code class="computeroutput"><span class="special">[</span>
      <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span>
      <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">]</span></code>. It says
      to take sub-match 1 and use it to index into the <code class="computeroutput"><span class="identifier">result</span></code>
      map, and assign to it the result of converting sub-match 2 to an integer.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
        To use semantic actions with your static regexes, you must <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>.
      </p></td></tr>
</table></div>
<p>
      How does this work? Just like the rest of the static regular expression, the
      part between brackets is an expression template. It encodes the action and
      executes it later. The expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span></code> creates a lazy reference to the <code class="computeroutput"><span class="identifier">result</span></code> object. The larger expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>
      is a lazy map index operation. Later, when this action is executed,
      <code class="computeroutput"><span class="identifier">s1</span></code> is replaced with the
      first <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>.
4376 Likewise, when <code class="computeroutput"><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span></code> gets executed, <code class="computeroutput"><span class="identifier">s2</span></code> 4377 is replaced with the second <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>. 4378 The <code class="computeroutput"><span class="identifier">as</span><span class="special"><></span></code> 4379 action converts its argument to the requested type using Boost.Lexical_cast. 4380 The effect of the whole action is to insert a new word/integer pair into 4381 the map. 4382 </p> 4383<div class="note"><table border="0" summary="Note"> 4384<tr> 4385<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td> 4386<th align="left">Note</th> 4387</tr> 4388<tr><td align="left" valign="top"><p> 4389 There is an important difference between the function <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> in <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">ref</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code> 4390 and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> 4391 in <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span 
class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>. The first returns a plain <code class="computeroutput"><span class="identifier">reference_wrapper</span><span class="special"><></span></code> 4392 which behaves in many respects like an ordinary reference. By contrast, 4393 <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> 4394 returns a <span class="emphasis"><em>lazy</em></span> reference that you can use in expressions 4395 that are executed lazily. That is why we can say <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>, even though <code class="computeroutput"><span class="identifier">result</span></code> 4396 doesn't have an <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]</span></code> 4397 that would accept <code class="computeroutput"><span class="identifier">s1</span></code>. 4398 </p></td></tr> 4399</table></div> 4400<p> 4401 In addition to the sub-match placeholders <code class="computeroutput"><span class="identifier">s1</span></code>, 4402 <code class="computeroutput"><span class="identifier">s2</span></code>, etc., you can also use 4403 the placeholder <code class="computeroutput"><span class="identifier">_</span></code> within 4404 an action to refer back to the string matched by the sub-expression to which 4405 the action is attached. 
For instance, you can use the following regex to 4406 match a bunch of digits, interpret them as an integer and assign the result 4407 to a local variable: 4408 </p> 4409<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> 4410<span class="comment">// Here, _ refers back to all the</span> 4411<span class="comment">// characters matched by (+_d)</span> 4412<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">];</span> 4413</pre> 4414<h4> 4415<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h2"></a> 4416 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution">Lazy 4417 Action Execution</a> 4418 </h4> 4419<p> 4420 What does it mean, exactly, to attach an action to part of a regular expression 4421 and perform a match? When does the action execute? If the action is part 4422 of a repeated sub-expression, does the action execute once or many times? 4423 And if the sub-expression initially matches, but ultimately fails because 4424 the rest of the regular expression fails to match, is the action executed 4425 at all? 
</p>
<p>
      The answer is that by default, actions are executed <span class="emphasis"><em>lazily</em></span>.
      When a sub-expression matches a string, its action is placed on a queue,
      along with the current values of any sub-matches to which the action refers.
      If the match algorithm must backtrack, actions are popped off the queue as
      necessary. Only after the entire regex has matched successfully are the actions
      actually executed. They are executed all at once, in the order in which they
      were added to the queue, as the last step before <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
      returns.
    </p>
<p>
      For example, consider the following regex that increments a counter whenever
      it finds a digit.
    </p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
<span class="comment">// count the exciting digits, but not the</span>
<span class="comment">// questionable ones.</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
<span
class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span> 4447<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span> 4448</pre> 4449<p> 4450 The action <code class="computeroutput"><span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span></code> 4451 is queued three times: once for each found digit. But it is only <span class="emphasis"><em>executed</em></span> 4452 twice: once for each digit that precedes a <code class="computeroutput"><span class="char">'!'</span></code> 4453 character. When the <code class="computeroutput"><span class="char">'?'</span></code> character 4454 is encountered, the match algorithm backtracks, removing the final action 4455 from the queue. 4456 </p> 4457<h4> 4458<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h3"></a> 4459 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution">Immediate 4460 Action Execution</a> 4461 </h4> 4462<p> 4463 When you want semantic actions to execute immediately, you can wrap the sub-expression 4464 containing the action in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep()</a></code></code>. 
4465 <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code> 4466 turns off back-tracking for its sub-expression, but it also causes any actions 4467 queued by the sub-expression to execute at the end of the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>. It is as if the sub-expression in the 4468 <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code> 4469 were compiled into an independent regex object, and matching the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code> 4470 is like a separate invocation of <code class="computeroutput"><span class="identifier">regex_search</span><span class="special">()</span></code>. It matches characters and executes actions 4471 but never backtracks or unwinds. For example, imagine the above example had 4472 been written as follows: 4473 </p> 4474<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> 4475<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span> 4476<span class="comment">// count all the digits.</span> 4477<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">)</span> <span 
class="special">>></span> <span class="char">'!'</span> <span class="special">);</span> 4478<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span> 4479<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">3</span> <span class="special">);</span> 4480</pre> 4481<p> 4482 We have wrapped the sub-expression <code class="computeroutput"><span class="identifier">_d</span> 4483 <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span></code> in <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>. 4484 Now, whenever this regex matches a digit, the action will be queued and then 4485 immediately executed before we try to match a <code class="computeroutput"><span class="char">'!'</span></code> 4486 character. In this case, the action executes three times. 
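</p>
<p>
      The difference between the two modes can be modeled with a small hand-written
      matcher in plain standard C++. This is a sketch of the queueing idea only,
      not xpressive's implementation: <code class="computeroutput">match_digits</code>
      and <code class="computeroutput">count_digits</code> are hypothetical helpers,
      the pattern is hard-coded as one or more "digit followed by '!'" groups, and
      the match is assumed to start at the beginning of the input.
    </p>

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hand-written model of the pattern +( _d[action] >> '!' ), anchored at the
// start of the input. NOT xpressive's implementation; names are hypothetical.
// immediate == false: the action is queued and runs only if the whole
//                     pattern matches (the default, lazy behaviour).
// immediate == true:  the action runs as soon as the digit matches and is
//                     never undone (the keep() behaviour).
bool match_digits(const std::string& input, bool immediate,
                  const std::function<void()>& action)
{
    std::vector<std::function<void()>> queue;
    std::size_t pos = 0;
    std::size_t iterations = 0;

    while (pos < input.size() &&
           std::isdigit(static_cast<unsigned char>(input[pos])))
    {
        if (immediate)
            action();                  // executed right away
        else
            queue.push_back(action);   // deferred until overall success

        if (pos + 1 < input.size() && input[pos + 1] == '!')
        {
            pos += 2;                  // consumed "digit!"
            ++iterations;
        }
        else
        {
            if (!immediate)
                queue.pop_back();      // backtrack: drop the last queued action
            break;
        }
    }

    if (iterations == 0)
        return false;                  // no match: queued actions never run

    for (const auto& queued : queue)
        queued();                      // flush in the order they were queued
    return true;
}

// Counts how many times the action ran for the given input and mode.
int count_digits(const std::string& input, bool immediate)
{
    int i = 0;
    match_digits(input, immediate, [&i] { ++i; });
    return i;
}
```

<p>
      With the input "1!2!3?", <code class="computeroutput">count_digits(input, false)</code>
      returns 2 and <code class="computeroutput">count_digits(input, true)</code>
      returns 3, mirroring the two assertions in the examples above.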
4487 </p> 4488<div class="note"><table border="0" summary="Note"> 4489<tr> 4490<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td> 4491<th align="left">Note</th> 4492</tr> 4493<tr><td align="left" valign="top"><p> 4494 Like <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>, 4495 actions within <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before()</a></code></code> 4496 and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after()</a></code></code> 4497 are also executed early when their sub-expressions have matched. 4498 </p></td></tr> 4499</table></div> 4500<h4> 4501<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h4"></a> 4502 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions">Lazy 4503 Functions</a> 4504 </h4> 4505<p> 4506 So far, we've seen how to write semantic actions consisting of variables 4507 and operators. But what if you want to be able to call a function from a 4508 semantic action? Xpressive provides a mechanism to do this. 4509 </p> 4510<p> 4511 The first step is to define a function object type. 
Here, for instance, is 4512 a function object type that calls <code class="computeroutput"><span class="identifier">push</span><span class="special">()</span></code> on its argument: 4513 </p> 4514<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">push_impl</span> 4515<span class="special">{</span> 4516 <span class="comment">// Result type, needed for tr1::result_of</span> 4517 <span class="keyword">typedef</span> <span class="keyword">void</span> <span class="identifier">result_type</span><span class="special">;</span> 4518 4519 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Sequence</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Value</span><span class="special">></span> 4520 <span class="keyword">void</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Sequence</span> <span class="special">&</span><span class="identifier">seq</span><span class="special">,</span> <span class="identifier">Value</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">val</span><span class="special">)</span> <span class="keyword">const</span> 4521 <span class="special">{</span> 4522 <span class="identifier">seq</span><span class="special">.</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">val</span><span class="special">);</span> 4523 <span class="special">}</span> 4524<span class="special">};</span> 4525</pre> 4526<p> 4527 The next step is to use xpressive's <code class="computeroutput"><span class="identifier">function</span><span class="special"><></span></code> template to define a function object 4528 named <code class="computeroutput"><span class="identifier">push</span></code>: 4529 </p> 4530<pre class="programlisting"><span class="comment">// Global "push" function 
object.</span> 4531<span class="identifier">function</span><span class="special"><</span><span class="identifier">push_impl</span><span class="special">>::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">push</span> <span class="special">=</span> <span class="special">{{}};</span> 4532</pre> 4533<p> 4534 The initialization looks a bit odd, but this is because <code class="computeroutput"><span class="identifier">push</span></code> 4535 is being statically initialized. That means it doesn't need to be constructed 4536 at runtime. We can use <code class="computeroutput"><span class="identifier">push</span></code> 4537 in semantic actions as follows: 4538 </p> 4539<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">stack</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ints</span><span class="special">;</span> 4540<span class="comment">// Match digits, cast them to an int</span> 4541<span class="comment">// and push it on the stack.</span> 4542<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">),</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">))];</span> 4543</pre> 4544<p> 4545 You'll notice that doing it this way causes member function invocations to 4546 look like ordinary function invocations. 
You can choose to write your semantic 4547 action in a different way that makes it look a bit more like a member function 4548 call: 4549 </p> 4550<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">)->*</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">))];</span> 4551</pre> 4552<p> 4553 Xpressive recognizes the use of the <code class="computeroutput"><span class="special">->*</span></code> 4554 and treats this expression exactly the same as the one above. 4555 </p> 4556<p> 4557 When your function object must return a type that depends on its arguments, 4558 you can use a <code class="computeroutput"><span class="identifier">result</span><span class="special"><></span></code> 4559 member template instead of the <code class="computeroutput"><span class="identifier">result_type</span></code> 4560 typedef. 
Here, for example, is a <code class="computeroutput"><span class="identifier">first</span></code> 4561 function object that returns the <code class="computeroutput"><span class="identifier">first</span></code> 4562 member of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><></span></code> 4563 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>: 4564 </p> 4565<pre class="programlisting"><span class="comment">// Function object that returns the</span> 4566<span class="comment">// first element of a pair.</span> 4567<span class="keyword">struct</span> <span class="identifier">first_impl</span> 4568<span class="special">{</span> 4569 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Sig</span><span class="special">></span> <span class="keyword">struct</span> <span class="identifier">result</span> <span class="special">{};</span> 4570 4571 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">This</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">></span> 4572 <span class="keyword">struct</span> <span class="identifier">result</span><span class="special"><</span><span class="identifier">This</span><span class="special">(</span><span class="identifier">Pair</span><span class="special">)></span> 4573 <span class="special">{</span> 4574 <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">remove_reference</span><span class="special"><</span><span class="identifier">Pair</span><span class="special">></span> 4575 <span class="special">::</span><span 
class="identifier">type</span><span class="special">::</span><span class="identifier">first_type</span> <span class="identifier">type</span><span class="special">;</span> 4576 <span class="special">};</span> 4577 4578 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">></span> 4579 <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">::</span><span class="identifier">first_type</span> 4580 <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Pair</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">p</span><span class="special">)</span> <span class="keyword">const</span> 4581 <span class="special">{</span> 4582 <span class="keyword">return</span> <span class="identifier">p</span><span class="special">.</span><span class="identifier">first</span><span class="special">;</span> 4583 <span class="special">}</span> 4584<span class="special">};</span> 4585 4586<span class="comment">// OK, use as first(s1) to get the begin iterator</span> 4587<span class="comment">// of the sub-match referred to by s1.</span> 4588<span class="identifier">function</span><span class="special"><</span><span class="identifier">first_impl</span><span class="special">>::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">first</span> <span class="special">=</span> <span class="special">{{}};</span> 4589</pre> 4590<h4> 4591<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h5"></a> 4592 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables"></a></span><a class="link" 
href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables">Referring
      to Local Variables</a>
    </h4>
<p>
      As we've seen in the examples above, we can refer to local variables within
      an action using <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>.
      Any such variables are held by reference by the regular expression, and care
      should be taken to avoid letting those references dangle. For instance, in
      the following code, the reference to <code class="computeroutput"><span class="identifier">i</span></code>
      is left to dangle when <code class="computeroutput"><span class="identifier">bad_voodoo</span><span class="special">()</span></code> returns:
    </p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">bad_voodoo</span><span class="special">()</span>
<span class="special">{</span>
    <span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
    <span class="comment">// ERROR! 
rex refers by reference to a local</span> 4608 <span class="comment">// variable, which will dangle after bad_voodoo()</span> 4609 <span class="comment">// returns.</span> 4610 <span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span> 4611<span class="special">}</span> 4612</pre> 4613<p> 4614 When writing semantic actions, it is your responsibility to make sure that 4615 all the references do not dangle. One way to do that would be to make the 4616 variables shared pointers that are held by the regex by value. 4617 </p> 4618<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">good_voodoo</span><span class="special">(</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">)</span> 4619<span class="special">{</span> 4620 <span class="comment">// Use val() to hold the shared_ptr by value:</span> 4621 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span> 4622 <span class="comment">// OK, rex holds a reference count to the integer.</span> 4623 <span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span> 4624<span class="special">}</span> 4625</pre> 4626<p> 4627 In the above code, we use <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span 
class="identifier">val</span><span class="special">()</span></code> 4628 to hold the shared pointer by value. That's not normally necessary because 4629 local variables appearing in actions are held by value by default, but in 4630 this case, it is necessary. Had we written the action as <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>, it would have executed immediately. 4631 That's because <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code> 4632 is not an expression template, but <code class="computeroutput"><span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span></code> is. 4633 </p> 4634<p> 4635 It can be tedious to wrap all your variables in <code class="computeroutput"><span class="identifier">ref</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">val</span><span class="special">()</span></code> in your semantic actions. Xpressive provides 4636 the <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> 4637 and <code class="computeroutput"><span class="identifier">value</span><span class="special"><></span></code> 4638 templates to make things easier. The following table shows the equivalencies: 4639 </p> 4640<div class="table"> 4641<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.t0"></a><p class="title"><b>Table 46.12. reference<> and value<></b></p> 4642<div class="table-contents"><table class="table" summary="reference<> and value<>"> 4643<colgroup> 4644<col> 4645<col> 4646</colgroup> 4647<thead><tr> 4648<th> 4649 <p> 4650 This ... 4651 </p> 4652 </th> 4653<th> 4654 <p> 4655 ... is equivalent to this ... 
4656 </p> 4657 </th> 4658</tr></thead> 4659<tbody> 4660<tr> 4661<td> 4662 <p> 4663</p> 4664<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> 4665 4666<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre> 4667<p> 4668 </p> 4669 </td> 4670<td> 4671 <p> 4672</p> 4673<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> 4674<span class="identifier">reference</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span> 4675<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre> 4676<p> 4677 </p> 4678 </td> 4679</tr> 4680<tr> 4681<td> 4682 <p> 4683</p> 4684<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span 
class="special">></span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span> 4685 4686<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre> 4687<p> 4688 </p> 4689 </td> 4690<td> 4691 <p> 4692</p> 4693<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span> 4694<span class="identifier">value</span><span class="special"><</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">vpi</span><span class="special">(</span><span class="identifier">pi</span><span class="special">);</span> 4695<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">vpi</span> <span 
class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre> 4696<p> 4697 </p> 4698 </td> 4699</tr> 4700</tbody> 4701</table></div> 4702</div> 4703<br class="table-break"><p> 4704 As you can see, when using <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code>, you need to first declare a local 4705 variable and then declare a <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> to it. These two steps can be combined 4706 into one using <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>. 4707 </p> 4708<div class="table"> 4709<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.t1"></a><p class="title"><b>Table 46.13. local<> vs. reference<></b></p> 4710<div class="table-contents"><table class="table" summary="local<> vs. reference<>"> 4711<colgroup> 4712<col> 4713<col> 4714</colgroup> 4715<thead><tr> 4716<th> 4717 <p> 4718 This ... 4719 </p> 4720 </th> 4721<th> 4722 <p> 4723 ... is equivalent to this ... 
4724 </p> 4725 </th> 4726</tr></thead> 4727<tbody><tr> 4728<td> 4729 <p> 4730</p> 4731<pre class="programlisting"><span class="identifier">local</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span> 4732 4733<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre> 4734<p> 4735 </p> 4736 </td> 4737<td> 4738 <p> 4739</p> 4740<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> 4741<span class="identifier">reference</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span> 4742<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre> 4743<p> 4744 </p> 4745 </td> 4746</tr></tbody> 4747</table></div> 4748</div> 4749<br class="table-break"><p> 4750 We can use <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code> 4751 to rewrite the above example as follows: 4752 </p> 4753<pre 
class="programlisting"><span class="identifier">local</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span> 4754<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span> 4755<span class="comment">// count the exciting digits, but not the</span> 4756<span class="comment">// questionable ones.</span> 4757<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span> 4758<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span> 4759<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span><span class="special">.</span><span class="identifier">get</span><span class="special">()</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span> 4760</pre> 4761<p> 4762 Notice that we use <code class="computeroutput"><span class="identifier">local</span><span class="special"><>::</span><span class="identifier">get</span><span class="special">()</span></code> to access the value of the local variable. 
      Also, beware that <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>
      can be used to create a dangling reference, just as <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> can.
    </p>
<h4>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h6"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables">Referring
      to Non-Local Variables</a>
    </h4>
<p>
      At the beginning of this section, we used a regex with a semantic action
      to parse a string of word/integer pairs and stuff them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>. That required that the map and the
      regex be defined together and used before either could go out of scope. What
      if we wanted to define the regex once and use it to fill lots of different
      maps? We would prefer to pass the map into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
      algorithm instead of embedding a reference to it directly in the regex object.
      To do that, we define a placeholder and use it in the semantic
      action in place of the map itself. Later, when we call one of the regex algorithms,
      we can bind the placeholder to an actual map object. The following code shows
      how.
4782 </p> 4783<pre class="programlisting"><span class="comment">// Define a placeholder for a map object:</span> 4784<span class="identifier">placeholder</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">_map</span><span class="special">;</span> 4785 4786<span class="comment">// Match a word and an integer, separated by =>,</span> 4787<span class="comment">// and then stuff the result into a std::map<></span> 4788<span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span> 4789 <span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span> 4790 4791<span class="comment">// Match one or more word/integer pairs, separated</span> 4792<span class="comment">// by whitespace.</span> 
4793<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">>></span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">>></span> <span class="identifier">pair</span><span class="special">);</span> 4794 4795<span class="comment">// The string to parse</span> 4796<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span> 4797 4798<span class="comment">// Here is the actual map to fill in:</span> 4799<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span> 4800 4801<span class="comment">// Bind the _map placeholder to the actual map</span> 4802<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span> 4803<span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span> 4804 4805<span class="comment">// Execute the match and fill in result map</span> 4806<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span 
class="identifier">rx</span><span class="special">))</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program displays:
    </p>
<pre class="programlisting">1
23
456
</pre>
<p>
      We use <code class="computeroutput"><span class="identifier">placeholder</span><span class="special"><></span></code>
      here to define <code class="computeroutput"><span class="identifier">_map</span></code>, which
      stands in for a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>
      variable. We can use the placeholder in the semantic action as if it were
      a map.
      Then, we define a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
      object and bind an actual map to the placeholder with "<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span></code>". The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
      call behaves as if the placeholder in the semantic action had been replaced
      with a reference to <code class="computeroutput"><span class="identifier">result</span></code>.
    </p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
        Placeholders in semantic actions are not <span class="emphasis"><em>actually</em></span>
        replaced at runtime with references to variables. The regex object is never
        mutated in any way during any of the regex algorithms, so a regex object
        is safe to use from multiple threads.
4840 </p></td></tr> 4841</table></div> 4842<p> 4843 The syntax for late-bound action arguments is a little different if you are 4844 using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code> 4845 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>. 4846 The regex iterators accept an extra constructor parameter for specifying 4847 the argument bindings. There is a <code class="computeroutput"><span class="identifier">let</span><span class="special">()</span></code> function that you can use to bind variables 4848 to their placeholders. The following code demonstrates how. 4849 </p> 4850<pre class="programlisting"><span class="comment">// Define a placeholder for a map object:</span> 4851<span class="identifier">placeholder</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">_map</span><span class="special">;</span> 4852 4853<span class="comment">// Match a word and an integer, separated by =>,</span> 4854<span class="comment">// and then stuff the result into a std::map<></span> 4855<span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span 
class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span> 4856 <span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span> 4857 4858<span class="comment">// The string to parse</span> 4859<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span> 4860 4861<span class="comment">// Here is the actual map to fill in:</span> 4862<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span> 4863 4864<span class="comment">// Create a regex_iterator to find all the matches</span> 4865<span class="identifier">sregex_iterator</span> <span class="identifier">it</span><span class="special">(</span><span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span 
class="special">(),</span> <span class="identifier">pair</span><span class="special">,</span> <span class="identifier">let</span><span class="special">(</span><span class="identifier">_map</span><span class="special">=</span><span class="identifier">result</span><span class="special">));</span> 4866<span class="identifier">sregex_iterator</span> <span class="identifier">end</span><span class="special">;</span> 4867 4868<span class="comment">// step through all the matches, and fill in</span> 4869<span class="comment">// the result map</span> 4870<span class="keyword">while</span><span class="special">(</span><span class="identifier">it</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">)</span> 4871 <span class="special">++</span><span class="identifier">it</span><span class="special">;</span> 4872 4873<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> 4874<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> 4875<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> 4876</pre> 4877<p> 4878 This program displays: 4879 
    </p>
<pre class="programlisting">1
23
456
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h7"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions">User-Defined
      Assertions</a>
    </h3>
<p>
      You are probably already familiar with regular expression <span class="emphasis"><em>assertions</em></span>.
      In Perl, some examples are the <code class="literal">^</code> and <code class="literal">$</code>
      assertions, which you can use to match the beginning and end of a string,
      respectively. Xpressive lets you define your own assertions. A custom assertion
      is a condition that must be true at a point in the match in order for the
      match to succeed. You can check a custom assertion with xpressive's <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/check.html" title="Function template check">check()</a></code></code> function.
    </p>
<p>
      There are a couple of ways to define a custom assertion. The simplest is
      to use a function object. Let's say that you want to ensure that a sub-expression
      matches a sub-string that is either 3 or 6 characters long.
The following 4901 struct defines such a predicate: 4902 </p> 4903<pre class="programlisting"><span class="comment">// A predicate that is true IFF a sub-match is</span> 4904<span class="comment">// either 3 or 6 characters long.</span> 4905<span class="keyword">struct</span> <span class="identifier">three_or_six</span> 4906<span class="special">{</span> 4907 <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">ssub_match</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">sub</span><span class="special">)</span> <span class="keyword">const</span> 4908 <span class="special">{</span> 4909 <span class="keyword">return</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">3</span> <span class="special">||</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">6</span><span class="special">;</span> 4910 <span class="special">}</span> 4911<span class="special">};</span> 4912</pre> 4913<p> 4914 You can use this predicate within a regular expression as follows: 4915 </p> 4916<pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.</span> 4917<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">three_or_six</span><span 
class="special">())</span> <span class="special">]</span> <span class="special">;</span> 4918</pre> 4919<p> 4920 The above regular expression will find whole words that are either 3 or 6 4921 characters long. The <code class="computeroutput"><span class="identifier">three_or_six</span></code> 4922 predicate accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code> 4923 that refers back to the part of the string matched by the sub-expression 4924 to which the custom assertion is attached. 4925 </p> 4926<div class="note"><table border="0" summary="Note"> 4927<tr> 4928<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td> 4929<th align="left">Note</th> 4930</tr> 4931<tr><td align="left" valign="top"><p> 4932 The custom assertion participates in determining whether the match succeeds 4933 or fails. Unlike actions, which execute lazily, custom assertions execute 4934 immediately while the regex engine is searching for a match. 4935 </p></td></tr> 4936</table></div> 4937<p> 4938 Custom assertions can also be defined inline using the same syntax as for 4939 semantic actions. 
Below is the same custom assertion written inline: 4940 </p> 4941<pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.</span> 4942<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">3</span> <span class="special">||</span> <span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">6</span><span class="special">)</span> <span class="special">]</span> <span class="special">;</span> 4943</pre> 4944<p> 4945 In the above, <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code> 4946 is a lazy function that calls the <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code> member function of its argument, and <code class="computeroutput"><span class="identifier">_</span></code> is a placeholder that receives the <code class="computeroutput"><span class="identifier">sub_match</span></code>. 4947 </p> 4948<p> 4949 Once you get the hang of writing custom assertions inline, they can be very 4950 powerful. For example, you can write a regular expression that only matches 4951 valid dates (for some suitably liberal definition of the term <span class="quote">“<span class="quote">valid</span>”</span>). 
    </p>
<pre class="programlisting"><span class="comment">// Maximum number of days in each month, January through December.</span>
<span class="comment">// February is given 29 days so that leap days are accepted.</span>
<span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">days_per_month</span><span class="special">[]</span> <span class="special">=</span>
    <span class="special">{</span><span class="number">31</span><span class="special">,</span> <span class="number">29</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">};</span>

<span class="identifier">mark_tag</span> <span class="identifier">month</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">day</span><span class="special">(</span><span class="number">2</span><span class="special">);</span>
<span class="comment">// find a valid date of the form month/day/year.</span>
<span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span>
    <span class="special">(</span>
        <span class="comment">// Month must be between 1 and 12 inclusive</span>
        <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">>></span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span
class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1</span> 4962 <span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">12</span><span class="special">)</span> <span class="special">]</span> 4963 <span class="special">>></span> <span class="char">'/'</span> 4964 <span class="comment">// Day must be between 1 and 31 inclusive</span> 4965 <span class="special">>></span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">>></span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1</span> 4966 <span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">31</span><span class="special">)</span> <span class="special">]</span> 4967 <span class="special">>></span> <span class="char">'/'</span> 4968 <span class="comment">// Only consider years between 1970 and 2038</span> 4969 <span class="special">>></span> <span class="special">(</span><span class="identifier">_d</span> <span 
class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1970</span> 4970 <span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">2038</span><span class="special">)</span> <span class="special">]</span> 4971 <span class="special">)</span> 4972 <span class="comment">// Ensure the month actually has that many days!</span> 4973 <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">days_per_month</span><span class="special">)[</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">month</span><span class="special">)-</span><span class="number">1</span><span class="special">]</span> <span class="special">>=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">day</span><span class="special">)</span> <span class="special">)</span> <span class="special">]</span> 4974<span class="special">;</span> 4975 4976<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span> 4977<span 
class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"99/99/9999 2/30/2006 2/28/2006"</span><span class="special">);</span> 4978 4979<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span><span class="special">))</span> 4980<span class="special">{</span> 4981 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 4982<span class="special">}</span> 4983</pre> 4984<p> 4985 The above program prints out the following: 4986 </p> 4987<pre class="programlisting">2/28/2006 4988</pre> 4989<p> 4990 Notice how the inline custom assertions are used to range-check the values 4991 for the month, day and year. The regular expression doesn't match <code class="computeroutput"><span class="string">"99/99/9999"</span></code> or <code class="computeroutput"><span class="string">"2/30/2006"</span></code> 4992 because they are not valid dates. (There is no 99th month, and February doesn't 4993 have 30 days.) 
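</p>
<p>
      For comparison, the checks that these assertions encode can be written as an ordinary function. This is only a sketch: <code class="computeroutput">is_plausible_date</code> is a hypothetical name and is not part of xpressive.
    </p>

```cpp
// Hypothetical helper, not part of xpressive: the same checks the
// custom assertions perform, written as an ordinary function.
bool is_plausible_date(int month, int day, int year)
{
    // Most days each month can have. February is given 29, so leap
    // years are not distinguished from other years.
    static int const days_per_month[] =
        {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    if(month < 1 || month > 12) return false;     // month in [1, 12]
    if(day < 1 || day > 31) return false;         // day in [1, 31]
    if(year < 1970 || year > 2038) return false;  // year in [1970, 2038]

    // The final check: the month must actually have that many days.
    return days_per_month[month - 1] >= day;
}
```

<p>
      Called with the three candidates from the string above, the function rejects 99/99/9999 and 2/30/2006 and accepts 2/28/2006, which is the same verdict the regex reaches.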
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes" title="Symbol Tables and Attributes">Symbol
      Tables and Attributes</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview">Overview</a>
    </h3>
<p>
      Symbol tables can be built into xpressive regular expressions with just a
      <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>.
      The map keys are the strings to be matched and the map values are the data
      to be returned to your semantic action. Xpressive attributes, named <code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
      through <code class="computeroutput"><span class="identifier">a9</span></code>, hold the value
      corresponding to a matching key so that it can be used in a semantic action.
      A default value can be specified for an attribute if a symbol is not found.
5013 </p> 5014<h3> 5015<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h1"></a> 5016 <span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables">Symbol 5017 Tables</a> 5018 </h3> 5019<p> 5020 An xpressive symbol table is just a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>, 5021 where the key is a string type and the value can be anything. For example, 5022 the following regular expression matches a key from map1 and assigns the 5023 corresponding value to the attribute <code class="computeroutput"><span class="identifier">a1</span></code>. 5024 Then, in the semantic action, it assigns the value stored in attribute <code class="computeroutput"><span class="identifier">a1</span></code> to an integer result. 5025 </p> 5026<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">result</span><span class="special">;</span> 5027<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">map1</span><span class="special">;</span> 5028<span class="comment">// ... 
(fill the map)</span>
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span> <span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">map1</span> <span class="special">)</span> <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">a1</span> <span class="special">];</span>
</pre>
<p>
      Consider the following example, which translates number names into integers.
      An explanation follows the listing.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span 
class="identifier">main</span><span class="special">()</span> 5042<span class="special">{</span> 5043 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">number_map</span><span class="special">;</span> 5044 <span class="identifier">number_map</span><span class="special">[</span><span class="string">"one"</span><span class="special">]</span> <span class="special">=</span> <span class="number">1</span><span class="special">;</span> 5045 <span class="identifier">number_map</span><span class="special">[</span><span class="string">"two"</span><span class="special">]</span> <span class="special">=</span> <span class="number">2</span><span class="special">;</span> 5046 <span class="identifier">number_map</span><span class="special">[</span><span class="string">"three"</span><span class="special">]</span> <span class="special">=</span> <span class="number">3</span><span class="special">;</span> 5047 <span class="comment">// Match a string from number_map</span> 5048 <span class="comment">// and store the integer value in 'result'</span> 5049 <span class="comment">// if not found, store -1 in 'result'</span> 5050 <span class="keyword">int</span> <span class="identifier">result</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> 5051 <span class="identifier">cregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">((</span><span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">number_map</span> <span class="special">)</span> <span class="special">|</span> <span class="special">*</span><span class="identifier">_</span><span class="special">)</span> 5052 
<span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">a1</span> <span class="special">|</span> <span class="special">-</span><span class="number">1</span><span class="special">)];</span> 5053 5054 <span class="identifier">regex_match</span><span class="special">(</span><span class="string">"three"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span> 5055 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> 5056 <span class="identifier">regex_match</span><span class="special">(</span><span class="string">"two"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span> 5057 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> 5058 <span class="identifier">regex_match</span><span class="special">(</span><span class="string">"stuff"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span> 5059 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> 5060 <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span> 5061<span class="special">}</span> 5062</pre> 5063<p> 5064 This 
program prints the following:
    </p>
<pre class="programlisting">3
2
-1
</pre>
<p>
      First the program builds a number map, with number names as string keys and
      the corresponding integers as values. Then it constructs a static regular
      expression using an attribute <code class="computeroutput"><span class="identifier">a1</span></code>
      to represent the result of the symbol table lookup. In the semantic action,
      the attribute is assigned to an integer variable <code class="computeroutput"><span class="identifier">result</span></code>.
      If the symbol was not found, a default value of <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> is assigned to <code class="computeroutput"><span class="identifier">result</span></code>.
      A wildcard, <code class="computeroutput"><span class="special">*</span><span class="identifier">_</span></code>,
      makes sure the regex matches even if the symbol is not found.
    </p>
<p>
      A more complete version of this example can be found in <code class="literal">libs/xpressive/example/numbers.cpp</code><a href="#ftn.boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="footnote" name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0"><sup class="footnote">[37]</sup></a>. It translates number names up to "nine hundred ninety nine
      million nine hundred ninety nine thousand nine hundred ninety nine"
      along with some special number names like "dozen".
    </p>
<p>
      Symbol table matches are case-sensitive by default, but they can be made
      case-insensitive by enclosing the expression in <code class="computeroutput"><span class="identifier">icase</span><span class="special">()</span></code>.
5088 </p> 5089<h3> 5090<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h2"></a> 5091 <span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes">Attributes</a> 5092 </h3> 5093<p> 5094 Up to nine attributes can be used in a regular expression. They are named 5095 <code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>, 5096 ..., <code class="computeroutput"><span class="identifier">a9</span></code> in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace. The attribute type 5097 is the same as the second component of the map that is assigned to it. A 5098 default value for an attribute can be specified in a semantic action with 5099 the syntax <code class="computeroutput"><span class="special">(</span><span class="identifier">a1</span> 5100 <span class="special">|</span> <em class="replaceable"><code>default-value</code></em><span class="special">)</span></code>. 
5101 </p> 5102<p> 5103 Attributes are properly scoped, so you can do crazy things like: <code class="computeroutput"><span class="special">(</span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym1</span><span class="special">)</span> 5104 <span class="special">>></span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym2</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">x</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span> <span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">y</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span></code>. The 5105 inner semantic action sees the inner <code class="computeroutput"><span class="identifier">a1</span></code>, 5106 and the outer semantic action sees the outer one. They can even have different 5107 types. 5108 </p> 5109<div class="note"><table border="0" summary="Note"> 5110<tr> 5111<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td> 5112<th align="left">Note</th> 5113</tr> 5114<tr><td align="left" valign="top"><p> 5115 Xpressive builds a hidden ternary search trie from the map so it can search 5116 quickly. If BOOST_DISABLE_THREADS is defined, the hidden ternary search 5117 trie "self adjusts", so after each search it restructures itself 5118 to improve the efficiency of future searches based on the frequency of 5119 previous searches. 
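</p>
<p>
        To make the idea concrete, here is a minimal ternary search trie with a longest-prefix lookup, the kind of query a symbol match plausibly needs. It is a sketch of the general data structure only: the class and member names are hypothetical, it is not xpressive's internal implementation, and it does not self-adjust.
      </p>

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Illustrative ternary search trie mapping string keys to int values.
// Hypothetical names throughout; this is not xpressive's internal code.
class symbol_trie
{
    struct node
    {
        char ch;                           // character stored at this node
        bool has_value;                    // true if a key ends here
        int value;                         // value for the key ending here
        std::unique_ptr<node> lo, eq, hi;  // less / next character / greater
        explicit node(char c) : ch(c), has_value(false), value(0) {}
    };

    std::unique_ptr<node> root_;

public:
    // Insert a key with its value. Empty keys are ignored.
    void insert(std::string const &key, int value)
    {
        if(key.empty()) return;
        std::unique_ptr<node> *p = &root_;
        std::size_t i = 0;
        for(;;)
        {
            if(!*p) p->reset(new node(key[i]));
            node &n = **p;
            if(key[i] < n.ch) p = &n.lo;
            else if(key[i] > n.ch) p = &n.hi;
            else if(++i == key.size())
            {
                n.has_value = true;
                n.value = value;
                return;
            }
            else p = &n.eq;
        }
    }

    // Find the longest key that is a prefix of text. Returns that key's
    // length (0 if no key matches) and stores its value in out.
    std::size_t longest_prefix(std::string const &text, int &out) const
    {
        std::size_t best = 0, i = 0;
        node const *n = root_.get();
        while(n && i < text.size())
        {
            if(text[i] < n->ch) n = n->lo.get();
            else if(text[i] > n->ch) n = n->hi.get();
            else
            {
                ++i;
                if(n->has_value) { best = i; out = n->value; }
                n = n->eq.get();
            }
        }
        return best;
    }
};
```

<p>
        With the keys one, two and three inserted, <code class="computeroutput">longest_prefix("three hundred", v)</code> returns 5 and sets <code class="computeroutput">v</code> to 3, while a text that only partly matches a key, such as "tw", returns 0.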
5120 </p></td></tr> 5121</table></div> 5122</div> 5123<div class="section"> 5124<div class="titlepage"><div><div><h3 class="title"> 5125<a name="boost_xpressive.user_s_guide.localization_and_regex_traits"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization 5126 and Regex Traits</a> 5127</h3></div></div></div> 5128<h3> 5129<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h0"></a> 5130 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.overview">Overview</a> 5131 </h3> 5132<p> 5133 Matching a regular expression against a string often requires locale-dependent 5134 information. For example, how are case-insensitive comparisons performed? 5135 The locale-sensitive behavior is captured in a traits class. xpressive provides 5136 three traits class templates: <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special"><></span></code>, <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special"><></span></code> and <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code>. The first wraps a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>, 5137 the second wraps the global C locale, and the third is a stub traits type 5138 for use when searching non-character data. All traits templates conform to 5139 the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex 5140 Traits Concept</a>. 
5141 </p> 5142<h3> 5143<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h1"></a> 5144 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait">Setting 5145 the Default Regex Trait</a> 5146 </h3> 5147<p> 5148 By default, xpressive uses <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special"><></span></code> for all patterns. This causes all 5149 regex objects to use the global <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>. 5150 If you compile with <code class="computeroutput"><span class="identifier">BOOST_XPRESSIVE_USE_C_TRAITS</span></code> 5151 defined, then xpressive will use <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special"><></span></code> by default. 5152 </p> 5153<h3> 5154<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h2"></a> 5155 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes">Using 5156 Custom Traits with Dynamic Regexes</a> 5157 </h3> 5158<p> 5159 To create a dynamic regex that uses a custom traits object, you must use 5160 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>. 
5161 The basic steps are shown in the following example: 5162 </p> 5163<pre class="programlisting"><span class="comment">// Declare a regex_compiler that uses the global C locale</span> 5164<span class="identifier">regex_compiler</span><span class="special"><</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">c_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="special">></span> <span class="identifier">crxcomp</span><span class="special">;</span> 5165<span class="identifier">cregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">crxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span> 5166 5167<span class="comment">// Declare a regex_compiler that uses a custom std::locale</span> 5168<span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... 
*/</span><span class="special">;</span> 5169<span class="identifier">regex_compiler</span><span class="special"><</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">cpp_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="special">></span> <span class="identifier">cpprxcomp</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span> 5170<span class="identifier">cregex</span> <span class="identifier">cpprx</span> <span class="special">=</span> <span class="identifier">cpprxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span> 5171</pre> 5172<p> 5173 The <code class="computeroutput"><span class="identifier">regex_compiler</span></code> objects 5174 act as regex factories. Once they have been imbued with a locale, every regex 5175 object they create will use that locale. 5176 </p> 5177<h3> 5178<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h3"></a> 5179 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes">Using 5180 Custom Traits with Static Regexes</a> 5181 </h3> 5182<p> 5183 If you want a particular static regex to use a different set of traits, you 5184 can use the special <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier. 
For instance: 5185 </p> 5186<pre class="programlisting"><span class="comment">// Define a regex that uses the global C locale</span> 5187<span class="identifier">c_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="identifier">ctraits</span><span class="special">;</span> 5188<span class="identifier">sregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">ctraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span> 5189 5190<span class="comment">// Define a regex that uses a customized std::locale</span> 5191<span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... 
*/</span><span class="special">;</span> 5192<span class="identifier">cpp_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="identifier">cpptraits</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span> 5193<span class="identifier">sregex</span> <span class="identifier">cpprx1</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">cpptraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span> 5194 5195<span class="comment">// A shorthand for above</span> 5196<span class="identifier">sregex</span> <span class="identifier">cpprx2</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span> 5197</pre> 5198<p> 5199 The <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> 5200 pattern modifier must wrap the entire pattern. It is an error to <code class="computeroutput"><span class="identifier">imbue</span></code> only part of a static regex. For 5201 example: 5202 </p> 5203<pre class="programlisting"><span class="comment">// ERROR! 
Cannot imbue() only part of a regex</span>
<span class="identifier">sregex</span> <span class="identifier">error</span> <span class="special">=</span> <span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="identifier">_w</span> <span class="special">);</span>
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h4"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_">Searching
      Non-Character Data With <code class="literal">null_regex_traits</code></a>
    </h3>
<p>
      With xpressive static regexes, you are not limited to searching for patterns
      in character sequences. You can search for patterns in raw bytes, integers,
      or anything that conforms to the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">Char
      Concept</a>. The <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code> template makes this simple. It is a stub implementation
      of the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
      Traits Concept</a>. It recognizes no character classes and performs no case
      mapping.
5219 </p> 5220<p> 5221 For example, with <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code>, you can write a static regex to 5222 find a pattern in a sequence of integers as follows: 5223 </p> 5224<pre class="programlisting"><span class="comment">// some integral data to search</span> 5225<span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">data</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span><span class="number">0</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span><span class="special">,</span> <span class="number">3</span><span class="special">,</span> <span class="number">4</span><span class="special">,</span> <span class="number">5</span><span class="special">,</span> <span class="number">6</span><span class="special">};</span> 5226 5227<span class="comment">// create a null_regex_traits<> object for searching integers ...</span> 5228<span class="identifier">null_regex_traits</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">nul</span><span class="special">;</span> 5229 5230<span class="comment">// imbue a regex object with the null_regex_traits ...</span> 5231<span class="identifier">basic_regex</span><span class="special"><</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*></span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">nul</span><span class="special">)(</span><span class="number">1</span> <span class="special">>></span> <span class="special">+((</span><span class="identifier">set</span><span class="special">=</span> <span class="number">2</span><span class="special">,</span><span 
class="number">3</span><span class="special">)</span> <span class="special">|</span> <span class="number">4</span><span class="special">)</span> <span class="special">>></span> <span class="number">5</span><span class="special">);</span> 5232<span class="identifier">match_results</span><span class="special"><</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*></span> <span class="identifier">what</span><span class="special">;</span> 5233 5234<span class="comment">// search for the pattern in the array of integers ...</span> 5235<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">data</span><span class="special">,</span> <span class="identifier">data</span> <span class="special">+</span> <span class="number">7</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span> 5236 5237<span class="identifier">assert</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">matched</span><span class="special">);</span> 5238<span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">first</span> <span class="special">==</span> <span class="number">1</span><span class="special">);</span> 5239<span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">second</span> <span class="special">==</span> <span class="number">6</span><span class="special">);</span> 5240</pre> 5241</div> 5242<div class="section"> 5243<div 
class="titlepage"><div><div><h3 class="title"> 5244<a name="boost_xpressive.user_s_guide.tips_n_tricks"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks" title="Tips 'N Tricks">Tips 'N Tricks</a> 5245</h3></div></div></div> 5246<p> 5247 Squeeze the most performance out of xpressive with these tips and tricks. 5248 </p> 5249<h3> 5250<a name="boost_xpressive.user_s_guide.tips_n_tricks.h0"></a> 5251 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them">Compile 5252 Patterns Once And Reuse Them</a> 5253 </h3> 5254<p> 5255 Compiling a regex (dynamic or static) is <span class="emphasis"><em>far</em></span> more expensive 5256 than executing a match or search. If you have the option, prefer to compile 5257 a pattern into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code> 5258 object once and reuse it rather than recreating it over and over. 5259 </p> 5260<p> 5261 Since <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code> 5262 objects are not mutated by any of the regex algorithms, they are completely 5263 thread-safe once their initialization (and that of any grammars of which 5264 they are members) completes. The easiest way to reuse your patterns is to 5265 simply make your <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code> 5266 objects "static const". 
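</p>
<p>
      As a minimal sketch of this tip, a function-local <code class="computeroutput">static const</code> regex is built the first time the function runs and is reused on every later call. The helper name <code class="computeroutput">is_word</code> and its pattern are illustrative assumptions, not part of xpressive.
    </p>
<pre class="programlisting">#include <string>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical helper: true if the whole input is a run of word characters.
inline bool is_word(std::string const &input)
{
    using namespace boost::xpressive;
    // Built once, on the first call; later calls reuse the same object.
    static sregex const word = +_w;
    return regex_match(input, word);
}
</pre>
<p>
      Under C++11 and later, the initialization of a function-local static is itself synchronized, so concurrent first calls do not race on building the regex.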
5267 </p> 5268<h3> 5269<a name="boost_xpressive.user_s_guide.tips_n_tricks.h1"></a> 5270 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects">Reuse 5271 match_results<> 5272 Objects</a> 5273 </h3> 5274<p> 5275 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 5276 object caches dynamically allocated memory. For this reason, it is far better 5277 to reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 5278 object if you have to do many regex searches. 5279 </p> 5280<p> 5281 Caveat: <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 5282 objects are not thread-safe, so don't go wild reusing them across threads. 
5283 </p> 5284<h3> 5285<a name="boost_xpressive.user_s_guide.tips_n_tricks.h2"></a> 5286 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object">Prefer 5287 Algorithms That Take A match_results<> 5288 Object</a> 5289 </h3> 5290<p> 5291 This is a corollary to the previous tip. If you are doing multiple searches, 5292 you should prefer the regex algorithms that accept a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 5293 object over the ones that don't, and you should reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 5294 object each time. If you don't provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code> 5295 object, a temporary one will be created for you and discarded when the algorithm 5296 returns. Any memory cached in the object will be deallocated and will have 5297 to be reallocated the next time. 
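</p>
<p>
      The following sketch shows one way to apply this tip together with the previous one: a single <code class="computeroutput">smatch</code> object is declared outside the loop and passed to <code class="computeroutput">regex_search()</code> on each iteration. The helper name <code class="computeroutput">count_lines_with_digits</code> and the pattern are hypothetical.
    </p>
<pre class="programlisting">#include <cstddef>
#include <string>
#include <vector>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical helper: counts how many lines contain at least one digit.
inline std::size_t count_lines_with_digits(std::vector<std::string> const &lines)
{
    using namespace boost::xpressive;
    static sregex const digits = +_d;
    smatch what;            // declared once, outside the loop
    std::size_t hits = 0;
    for (std::size_t i = 0; i < lines.size(); ++i)
    {
        // The overload that takes a match_results<> lets 'what' keep its cached memory.
        if (regex_search(lines[i], what, digits))
            ++hits;
    }
    return hits;
}
</pre>
<p>
      Because the <code class="computeroutput">smatch</code> object is local to the function, each calling thread works with its own instance, which respects the thread-safety caveat given earlier.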
5298 </p> 5299<h3> 5300<a name="boost_xpressive.user_s_guide.tips_n_tricks.h3"></a> 5301 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings">Prefer 5302 Algorithms That Accept Iterator Ranges Over Null-Terminated Strings</a> 5303 </h3> 5304<p> 5305 xpressive provides overloads of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code> 5306 and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code> 5307 algorithms that operate on C-style null-terminated strings. You should prefer 5308 the overloads that take iterator ranges. When you pass a null-terminated 5309 string to a regex algorithm, the end iterator is calculated immediately by 5310 calling <code class="computeroutput"><span class="identifier">strlen</span></code>. If you already 5311 know the length of the string, you can avoid this overhead by calling the 5312 regex algorithms with a <code class="computeroutput"><span class="special">[</span><span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">)</span></code> 5313 pair. 
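</p>
<p>
      For instance, when the length of a buffer is already known, the iterator-range overload can be called directly. This is only a sketch; <code class="computeroutput">contains_digit</code> is a hypothetical helper.
    </p>
<pre class="programlisting">#include <cstddef>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical helper: searches a character buffer whose length is already known.
inline bool contains_digit(char const *buf, std::size_t len)
{
    using namespace boost::xpressive;
    static cregex const digit = _d;
    cmatch what;
    // Passing [buf, buf + len) means the end of the input does not have to be located first.
    return regex_search(buf, buf + len, what, digit);
}
</pre>
<p>
      A further benefit of this form is that the buffer does not need to be null-terminated.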
5314 </p> 5315<h3> 5316<a name="boost_xpressive.user_s_guide.tips_n_tricks.h4"></a> 5317 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes">Use 5318 Static Regexes</a> 5319 </h3> 5320<p> 5321 On average, static regexes execute about 10 to 15% faster than their dynamic 5322 counterparts. It's worth familiarizing yourself with the static regex dialect. 5323 </p> 5324<h3> 5325<a name="boost_xpressive.user_s_guide.tips_n_tricks.h5"></a> 5326 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_">Understand 5327 <code class="literal">syntax_option_type::optimize</code></a> 5328 </h3> 5329<p> 5330 The <code class="computeroutput"><span class="identifier">optimize</span></code> flag tells the 5331 regex compiler to spend some extra time analyzing the pattern. It can cause 5332 some patterns to execute faster, but it increases the time to compile the 5333 pattern, and often increases the amount of memory consumed by the pattern. 5334 If you plan to reuse your pattern, <code class="computeroutput"><span class="identifier">optimize</span></code> 5335 is usually a win. If you will only use the pattern once, don't use <code class="computeroutput"><span class="identifier">optimize</span></code>. 
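</p>
<p>
      A brief sketch of the reuse case, assuming a dynamic regex compiled with <code class="computeroutput">regex_constants::optimize</code> and kept in a function-local static. The helper name <code class="computeroutput">has_number</code> is hypothetical.
    </p>
<pre class="programlisting">#include <string>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical helper: true if the input contains a run of digits.
inline bool has_number(std::string const &input)
{
    using namespace boost::xpressive;
    // Compiled once with 'optimize' and reused on every call, which is the
    // situation where the extra compile-time analysis can pay for itself.
    static sregex const number = sregex::compile("\\d+", regex_constants::optimize);
    return regex_search(input, number);
}
</pre>
<p>
      Whether the flag helps depends on the pattern, so it is worth measuring with representative input before relying on it.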
5336 </p> 5337<h2> 5338<a name="boost_xpressive.user_s_guide.tips_n_tricks.h6"></a> 5339 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls">Common 5340 Pitfalls</a> 5341 </h2> 5342<p> 5343 Keep the following tips in mind to avoid stepping in potholes with xpressive. 5344 </p> 5345<h3> 5346<a name="boost_xpressive.user_s_guide.tips_n_tricks.h7"></a> 5347 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread">Create 5348 Grammars On A Single Thread</a> 5349 </h3> 5350<p> 5351 With static regexes, you can create grammars by nesting regexes inside one 5352 another. When compiling the outer regex, both the outer and inner regex objects, 5353 and all the regex objects to which they refer either directly or indirectly, 5354 are modified. For this reason, it's dangerous for global regex objects to 5355 participate in grammars. It's best to build regex grammars from a single 5356 thread. Once built, the resulting regex grammar can be executed from multiple 5357 threads without problems. 5358 </p> 5359<h3> 5360<a name="boost_xpressive.user_s_guide.tips_n_tricks.h8"></a> 5361 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers">Beware 5362 Nested Quantifiers</a> 5363 </h3> 5364<p> 5365 This is a pitfall common to many regular expression engines. Some patterns 5366 can cause exponentially bad performance. 
Often these patterns involve one
      quantified term nested within another quantifier, such as <code class="computeroutput"><span class="string">"(a*)*"</span></code>, although in many cases,
      the problem is harder to spot. Beware of patterns that have nested quantifiers.
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.concepts"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts" title="Concepts">Concepts</a>
</h3></div></div></div>
<h3>
<a name="boost_xpressive.user_s_guide.concepts.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.chart_requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">CharT
      Requirements</a>
    </h3>
<p>
      If type <code class="computeroutput"><span class="identifier">BidiIterT</span></code> is used
      as a template argument to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>,
      then <code class="computeroutput"><span class="identifier">CharT</span></code> is <code class="computeroutput"><span class="identifier">iterator_traits</span><span class="special"><</span><span class="identifier">BidiIterT</span><span class="special">>::</span><span class="identifier">value_type</span></code>. Type <code class="computeroutput"><span class="identifier">CharT</span></code>
      must have a trivial default constructor, copy constructor, assignment operator,
      and destructor.
In addition, the following requirements must be met for objects
      <code class="computeroutput"><span class="identifier">c</span></code> of type <code class="computeroutput"><span class="identifier">CharT</span></code>,
      <code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code>
      of type <code class="computeroutput"><span class="identifier">CharT</span> <span class="keyword">const</span></code>,
      and <code class="computeroutput"><span class="identifier">i</span></code> of type <code class="computeroutput"><span class="keyword">int</span></code>:
    </p>
<div class="table">
<a name="boost_xpressive.user_s_guide.concepts.t0"></a><p class="title"><b>Table 46.14. CharT Requirements</b></p>
<div class="table-contents"><table class="table" summary="CharT Requirements">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
              <p>
                <span class="bold"><strong>Expression</strong></span>
              </p>
            </th>
<th>
              <p>
                <span class="bold"><strong>Return type</strong></span>
              </p>
            </th>
<th>
              <p>
                <span class="bold"><strong>Assertion / Note / Pre- / Post-condition</strong></span>
              </p>
            </th>
</tr></thead>
<tbody>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">CharT</span></code>
              </p>
            </td>
<td>
              <p>
                Default constructor (must be trivial).
5431 </p> 5432 </td> 5433</tr> 5434<tr> 5435<td> 5436 <p> 5437 <code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">c1</span><span class="special">)</span></code> 5438 </p> 5439 </td> 5440<td> 5441 <p> 5442 <code class="computeroutput"><span class="identifier">CharT</span></code> 5443 </p> 5444 </td> 5445<td> 5446 <p> 5447 Copy constructor (must be trivial). 5448 </p> 5449 </td> 5450</tr> 5451<tr> 5452<td> 5453 <p> 5454 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">=</span> 5455 <span class="identifier">c2</span></code> 5456 </p> 5457 </td> 5458<td> 5459 <p> 5460 <code class="computeroutput"><span class="identifier">CharT</span></code> 5461 </p> 5462 </td> 5463<td> 5464 <p> 5465 Assignment operator (must be trivial). 5466 </p> 5467 </td> 5468</tr> 5469<tr> 5470<td> 5471 <p> 5472 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">==</span> 5473 <span class="identifier">c2</span></code> 5474 </p> 5475 </td> 5476<td> 5477 <p> 5478 <code class="computeroutput"><span class="keyword">bool</span></code> 5479 </p> 5480 </td> 5481<td> 5482 <p> 5483 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> has the same value as <code class="computeroutput"><span class="identifier">c2</span></code>. 
5484 </p> 5485 </td> 5486</tr> 5487<tr> 5488<td> 5489 <p> 5490 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">!=</span> 5491 <span class="identifier">c2</span></code> 5492 </p> 5493 </td> 5494<td> 5495 <p> 5496 <code class="computeroutput"><span class="keyword">bool</span></code> 5497 </p> 5498 </td> 5499<td> 5500 <p> 5501 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code> 5502 are not equal. 5503 </p> 5504 </td> 5505</tr> 5506<tr> 5507<td> 5508 <p> 5509 <code class="computeroutput"><span class="identifier">c1</span> <span class="special"><</span> 5510 <span class="identifier">c2</span></code> 5511 </p> 5512 </td> 5513<td> 5514 <p> 5515 <code class="computeroutput"><span class="keyword">bool</span></code> 5516 </p> 5517 </td> 5518<td> 5519 <p> 5520 <code class="computeroutput"><span class="keyword">true</span></code> if the value 5521 of <code class="computeroutput"><span class="identifier">c1</span></code> is less than 5522 <code class="computeroutput"><span class="identifier">c2</span></code>. 5523 </p> 5524 </td> 5525</tr> 5526<tr> 5527<td> 5528 <p> 5529 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">></span> 5530 <span class="identifier">c2</span></code> 5531 </p> 5532 </td> 5533<td> 5534 <p> 5535 <code class="computeroutput"><span class="keyword">bool</span></code> 5536 </p> 5537 </td> 5538<td> 5539 <p> 5540 <code class="computeroutput"><span class="keyword">true</span></code> if the value 5541 of <code class="computeroutput"><span class="identifier">c1</span></code> is greater 5542 than <code class="computeroutput"><span class="identifier">c2</span></code>. 
5543 </p> 5544 </td> 5545</tr> 5546<tr> 5547<td> 5548 <p> 5549 <code class="computeroutput"><span class="identifier">c1</span> <span class="special"><=</span> 5550 <span class="identifier">c2</span></code> 5551 </p> 5552 </td> 5553<td> 5554 <p> 5555 <code class="computeroutput"><span class="keyword">bool</span></code> 5556 </p> 5557 </td> 5558<td> 5559 <p> 5560 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is less than or equal to 5561 <code class="computeroutput"><span class="identifier">c2</span></code>. 5562 </p> 5563 </td> 5564</tr> 5565<tr> 5566<td> 5567 <p> 5568 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">>=</span> 5569 <span class="identifier">c2</span></code> 5570 </p> 5571 </td> 5572<td> 5573 <p> 5574 <code class="computeroutput"><span class="keyword">bool</span></code> 5575 </p> 5576 </td> 5577<td> 5578 <p> 5579 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is greater than or equal to 5580 <code class="computeroutput"><span class="identifier">c2</span></code>. 5581 </p> 5582 </td> 5583</tr> 5584<tr> 5585<td> 5586 <p> 5587 <code class="computeroutput"><span class="identifier">intmax_t</span> <span class="identifier">i</span> 5588 <span class="special">=</span> <span class="identifier">c1</span></code> 5589 </p> 5590 </td> 5591<td> 5592 <p> 5593 <code class="computeroutput"><span class="keyword">int</span></code> 5594 </p> 5595 </td> 5596<td> 5597 <p> 5598 <code class="computeroutput"><span class="identifier">CharT</span></code> must be convertible 5599 to an integral type. 
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">CharT</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">CharT</span></code> must be constructible
                from an integral type.
              </p>
            </td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><h3>
<a name="boost_xpressive.user_s_guide.concepts.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.traits_requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Traits
      Requirements</a>
    </h3>
<p>
      In the following table <code class="computeroutput"><span class="identifier">X</span></code>
      denotes a traits class defining types and functions for the character container
      type <code class="computeroutput"><span class="identifier">CharT</span></code>; <code class="computeroutput"><span class="identifier">u</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span></code>;
      <code class="computeroutput"><span class="identifier">v</span></code> is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span></code>;
      <code class="computeroutput"><span class="identifier">p</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">I1</span></code>
      and <code class="computeroutput"><span class="identifier">I2</span></code> are <code
class="computeroutput"><span class="identifier">Input</span> <span class="identifier">Iterators</span></code>;
      <code class="computeroutput"><span class="identifier">c</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span></code>;
      <code class="computeroutput"><span class="identifier">s</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
      <code class="computeroutput"><span class="identifier">cs</span></code> is an object of type
      <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
      <code class="computeroutput"><span class="identifier">b</span></code> is a value of type <code class="computeroutput"><span class="keyword">bool</span></code>; <code class="computeroutput"><span class="identifier">i</span></code>
      is a value of type <code class="computeroutput"><span class="keyword">int</span></code>; <code class="computeroutput"><span class="identifier">F1</span></code> and <code class="computeroutput"><span class="identifier">F2</span></code>
      are values of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">loc</span></code>
      is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>; and <code class="computeroutput"><span class="identifier">ch</span></code>
      is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="keyword">char</span></code>.
5645 </p> 5646<div class="table"> 5647<a name="boost_xpressive.user_s_guide.concepts.t1"></a><p class="title"><b>Table 46.15. Traits Requirements</b></p> 5648<div class="table-contents"><table class="table" summary="Traits Requirements"> 5649<colgroup> 5650<col> 5651<col> 5652<col> 5653</colgroup> 5654<thead><tr> 5655<th> 5656 <p> 5657 <span class="bold"><strong>Expression</strong></span> 5658 </p> 5659 </th> 5660<th> 5661 <p> 5662 <span class="bold"><strong>Return type</strong></span> 5663 </p> 5664 </th> 5665<th> 5666 <p> 5667 <span class="bold"><strong>Assertion / Note<br> Pre / Post condition</strong></span> 5668 </p> 5669 </th> 5670</tr></thead> 5671<tbody> 5672<tr> 5673<td> 5674 <p> 5675 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code> 5676 </p> 5677 </td> 5678<td> 5679 <p> 5680 <code class="computeroutput"><span class="identifier">CharT</span></code> 5681 </p> 5682 </td> 5683<td> 5684 <p> 5685 The character container type used in the implementation of class 5686 template <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>. 
5687 </p> 5688 </td> 5689</tr> 5690<tr> 5691<td> 5692 <p> 5693 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code> 5694 </p> 5695 </td> 5696<td> 5697 <p> 5698 <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><</span><span class="identifier">CharT</span><span class="special">></span></code> 5699 or <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">CharT</span><span class="special">></span></code> 5700 </p> 5701 </td> 5702<td> 5703 </td> 5704</tr> 5705<tr> 5706<td> 5707 <p> 5708 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code> 5709 </p> 5710 </td> 5711<td> 5712 <p> 5713 <span class="emphasis"><em>Implementation defined</em></span> 5714 </p> 5715 </td> 5716<td> 5717 <p> 5718 A copy constructible type that represents the locale used by the 5719 traits class. 5720 </p> 5721 </td> 5722</tr> 5723<tr> 5724<td> 5725 <p> 5726 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code> 5727 </p> 5728 </td> 5729<td> 5730 <p> 5731 <span class="emphasis"><em>Implementation defined</em></span> 5732 </p> 5733 </td> 5734<td> 5735 <p> 5736 A bitmask type representing a particular character classification. 5737 Multiple values of this type can be bitwise-or'ed together to obtain 5738 a new valid value. 
5739 </p> 5740 </td> 5741</tr> 5742<tr> 5743<td> 5744 <p> 5745 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">hash</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> 5746 </p> 5747 </td> 5748<td> 5749 <p> 5750 <code class="computeroutput"><span class="keyword">unsigned</span> <span class="keyword">char</span></code> 5751 </p> 5752 </td> 5753<td> 5754 <p> 5755 Yields a value between <code class="computeroutput"><span class="number">0</span></code> 5756 and <code class="computeroutput"><span class="identifier">UCHAR_MAX</span></code> inclusive. 5757 </p> 5758 </td> 5759</tr> 5760<tr> 5761<td> 5762 <p> 5763 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">widen</span><span class="special">(</span><span class="identifier">ch</span><span class="special">)</span></code> 5764 </p> 5765 </td> 5766<td> 5767 <p> 5768 <code class="computeroutput"><span class="identifier">CharT</span></code> 5769 </p> 5770 </td> 5771<td> 5772 <p> 5773 Widens the specified <code class="computeroutput"><span class="keyword">char</span></code> 5774 and returns the resulting <code class="computeroutput"><span class="identifier">CharT</span></code>. 
5775 </p> 5776 </td> 5777</tr> 5778<tr> 5779<td> 5780 <p> 5781 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span> 5782 <span class="identifier">r2</span><span class="special">,</span> 5783 <span class="identifier">c</span><span class="special">)</span></code> 5784 </p> 5785 </td> 5786<td> 5787 <p> 5788 <code class="computeroutput"><span class="keyword">bool</span></code> 5789 </p> 5790 </td> 5791<td> 5792 <p> 5793 For any characters <code class="computeroutput"><span class="identifier">r1</span></code> 5794 and <code class="computeroutput"><span class="identifier">r2</span></code>, returns 5795 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span> 5796 <span class="identifier">c</span> <span class="special">&&</span> 5797 <span class="identifier">c</span> <span class="special"><=</span> 5798 <span class="identifier">r2</span></code>. Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span> 5799 <span class="identifier">r2</span></code>. 
5800 </p> 5801 </td> 5802</tr> 5803<tr> 5804<td> 5805 <p> 5806 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range_nocase</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span> 5807 <span class="identifier">r2</span><span class="special">,</span> 5808 <span class="identifier">c</span><span class="special">)</span></code> 5809 </p> 5810 </td> 5811<td> 5812 <p> 5813 <code class="computeroutput"><span class="keyword">bool</span></code> 5814 </p> 5815 </td> 5816<td> 5817 <p> 5818 For characters <code class="computeroutput"><span class="identifier">r1</span></code> 5819 and <code class="computeroutput"><span class="identifier">r2</span></code>, returns 5820 <code class="computeroutput"><span class="keyword">true</span></code> if there is some 5821 character <code class="computeroutput"><span class="identifier">d</span></code> for 5822 which <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span> 5823 <span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> and <code class="computeroutput"><span class="identifier">r1</span> 5824 <span class="special"><=</span> <span class="identifier">d</span> 5825 <span class="special">&&</span> <span class="identifier">d</span> 5826 <span class="special"><=</span> <span class="identifier">r2</span></code>. 5827 Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span> <span class="identifier">r2</span></code>. 
5828 </p> 5829 </td> 5830</tr> 5831<tr> 5832<td> 5833 <p> 5834 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> 5835 </p> 5836 </td> 5837<td> 5838 <p> 5839 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code> 5840 </p> 5841 </td> 5842<td> 5843 <p> 5844 Returns a character such that for any character <code class="computeroutput"><span class="identifier">d</span></code> 5845 that is to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code> 5846 then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span> 5847 <span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span></code>. 
5848 </p> 5849 </td> 5850</tr> 5851<tr> 5852<td> 5853 <p> 5854 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> 5855 </p> 5856 </td> 5857<td> 5858 <p> 5859 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code> 5860 </p> 5861 </td> 5862<td> 5863 <p> 5864 For all characters <code class="computeroutput"><span class="identifier">C</span></code> 5865 that are to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code> 5866 when comparisons are to be performed without regard to case, then 5867 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span> 5868 <span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">C</span><span class="special">)</span></code>. 
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
                <span class="identifier">F2</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
              </p>
            </td>
<td>
              <p>
                Returns a sort key for the character sequence designated by the
                iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
                <code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
                <code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span> <span class="special">&lt;</span>
                <span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span>
                <span class="identifier">H2</span><span class="special">)</span></code>.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
                <span class="identifier">F2</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
              </p>
            </td>
<td>
              <p>
                Returns a sort key for the character sequence designated by the
                iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
                <code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
                <code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> when character case is not considered
                then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span>
                <span class="identifier">G2</span><span class="special">)</span>
                <span class="special">&lt;</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span
class="special">)</span></code>.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
                <span class="identifier">F2</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
              </p>
            </td>
<td>
              <p>
                Converts the character sequence designated by the iterator range
                <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span><span class="identifier">F2</span><span class="special">)</span></code> into a bitmask type that can subsequently
                be passed to <code class="computeroutput"><span class="identifier">isctype</span></code>.
                Values returned from <code class="computeroutput"><span class="identifier">lookup_classname</span></code>
                can be safely bitwise or'ed together. Returns <code class="computeroutput"><span class="number">0</span></code>
                if the character sequence is not the name of a character class
                recognized by <code class="computeroutput"><span class="identifier">X</span></code>.
                The value returned shall be independent of the case of the characters
                in the sequence.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_collatename</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
                <span class="identifier">F2</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
              </p>
            </td>
<td>
              <p>
                Returns a sequence of characters that represents the collating
                element consisting of the character sequence designated by the
                iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>. Returns an empty string if the
                character sequence is not a valid collating element.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">isctype</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
                <span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
                <span class="identifier">F2</span><span class="special">))</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="keyword">bool</span></code>
              </p>
            </td>
<td>
              <p>
                Returns <code class="computeroutput"><span class="keyword">true</span></code> if character
                <code class="computeroutput"><span class="identifier">c</span></code> is a member of
                the character class designated by the iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>, <code class="computeroutput"><span class="keyword">false</span></code>
                otherwise.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">value</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
                <span class="identifier">i</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="keyword">int</span></code>
              </p>
            </td>
<td>
              <p>
                Returns the value represented by the digit <code class="computeroutput"><span class="identifier">c</span></code>
                in base <code class="computeroutput"><span class="identifier">i</span></code> if the
                character <code class="computeroutput"><span class="identifier">c</span></code> is
                a valid digit in base <code class="computeroutput"><span class="identifier">i</span></code>;
                otherwise returns <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code>.<br> [Note: the value of <code class="computeroutput"><span class="identifier">i</span></code> will only be <code class="computeroutput"><span class="number">8</span></code>, <code class="computeroutput"><span class="number">10</span></code>,
                or <code class="computeroutput"><span class="number">16</span></code>.
                -end note]
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">u</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
              </p>
            </td>
<td>
              <p>
                Imbues <code class="computeroutput"><span class="identifier">u</span></code> with the
                locale <code class="computeroutput"><span class="identifier">loc</span></code>, returns
                the previous locale used by <code class="computeroutput"><span class="identifier">u</span></code>.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">getloc</span><span class="special">()</span></code>
              </p>
            </td>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
              </p>
            </td>
<td>
              <p>
                Returns the current locale used by <code class="computeroutput"><span class="identifier">v</span></code>.
              </p>
            </td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><h3>
<a name="boost_xpressive.user_s_guide.concepts.h2"></a>
        <span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.acknowledgements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.acknowledgements">Acknowledgements</a>
      </h3>
<p>
        This section is adapted from the equivalent page in the <a href="../../../libs/regex" target="_top">Boost.Regex</a>
        documentation and from the <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
        to add regular expressions to the Standard Library.
      </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.examples"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
</h3></div></div></div>
<p>
      Below you can find six complete sample programs. <br>
    </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
      if a whole string matches a regex</a>
    </h5>
<p>
      This is the example from the Introduction. It is reproduced here for your
      convenience.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span
class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program outputs the following:
    </p>
<pre class="programlisting">hello world!
hello
world
</pre>
<p>
      <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
    </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
      if a string contains a sub-string that matches a regex</a>
    </h5>
<p>
      Notice in this example how we use custom <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
      to make the pattern more readable. We can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
      later to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span><span class="identifier">str</span> <span class="special">=</span> <span class="string">"I was born on 5/30/1973 at 7am."</span><span class="special">;</span>

    <span class="comment">// define some custom mark_tags with names more meaningful than s1, s2, etc.</span>
    <span class="identifier">mark_tag</span> <span class="identifier">day</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">month</span><span class="special">(</span><span class="number">2</span><span class="special">),</span> <span class="identifier">year</span><span class="special">(</span><span class="number">3</span><span class="special">),</span> <span class="identifier">delim</span><span class="special">(</span><span class="number">4</span><span class="special">);</span>

    <span class="comment">// this regex finds a date</span>
    <span class="identifier">cregex</span> <span class="identifier">date</span> <span
class="special">=</span> <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special">&lt;</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">&gt;(</span><span class="identifier">_d</span><span class="special">))</span> <span class="comment">// find the month ...</span>
               <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">delim</span><span class="special">=</span> <span class="special">(</span><span class="identifier">set</span><span class="special">=</span> <span class="char">'/'</span><span class="special">,</span><span class="char">'-'</span><span class="special">))</span> <span class="comment">// followed by a delimiter ...</span>
               <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special">&lt;</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">&gt;(</span><span class="identifier">_d</span><span class="special">))</span> <span class="special">&gt;&gt;</span> <span class="identifier">delim</span> <span class="comment">// and a day followed by the same delimiter ...</span>
               <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">year</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special">&lt;</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">&gt;(</span><span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">));</span> <span class="comment">// and the year.</span>

    <span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">day</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the day</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">month</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the month</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">year</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span
class="comment">// the year</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">delim</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the delimiter</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program outputs the following:
    </p>
<pre class="programlisting">5/30/1973
30
5
1973
/
</pre>
<p>
      <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
    </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h2"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
      all sub-strings that match a regex</a>
    </h5>
<p>
      The following program finds dates in a string and marks them up with pseudo-HTML.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"I was born on 5/30/1973 at 7am."</span> <span class="special">);</span>

    <span class="comment">// essentially the same regex as in the previous example, but using a dynamic regex</span>
    <span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d{1,2})([/-])(\\d{1,2})\\2((?:\\d{2}){1,2})"</span> <span class="special">);</span>

    <span class="comment">// As in Perl, $&amp; is a reference to the sub-string that matched the regex</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span> <span class="string">"&lt;date&gt;$&amp;&lt;/date&gt;"</span>
<span class="special">);</span>

    <span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">date</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">str</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program outputs the following:
    </p>
<pre class="programlisting">I was born on &lt;date&gt;5/30/1973&lt;/date&gt; at 7am.
</pre>
<p>
      <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
    </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h3"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
      all the sub-strings that match a regex and step through them one at a time</a>
    </h5>
<p>
      The following program finds the words in a wide-character string. It uses
      <code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>. Notice
      that dereferencing a <code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
      yields a <code class="computeroutput"><span class="identifier">wsmatch</span></code> object.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">wstring</span> <span class="identifier">str</span><span class="special">(</span> <span class="identifier">L</span><span class="string">"This is his face."</span> <span class="special">);</span>

    <span class="comment">// find a whole word</span>
    <span class="identifier">wsregex</span> <span class="identifier">token</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alnum</span><span class="special">;</span>

    <span class="identifier">wsregex_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">token</span> <span class="special">);</span>
    <span class="identifier">wsregex_iterator</span> <span
class="identifier">end</span><span class="special">;</span>

    <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">wsmatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span> <span class="special">=</span> <span class="special">*</span><span class="identifier">cur</span><span class="special">;</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">wcout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="identifier">L</span><span class="char">'\n'</span><span class="special">;</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      This program outputs the following:
    </p>
<pre class="programlisting">This
is
his
face
</pre>
<p>
      <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
    </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h4"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
      a string into tokens that each match a regex</a>
    </h5>
<p>
      The following
program finds race times in a string and displays first the
        minutes and then the seconds. It uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>.
      </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Eric: 4:40, Karl: 3:35, Francesca: 2:32"</span> <span class="special">);</span>

    <span class="comment">// find a race time</span>
    <span class="identifier">sregex</span> <span class="identifier">time</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d):(\\d\\d)"</span> <span class="special">);</span>

    <span class="comment">// for each match, the token iterator should first take the value of</span>
    <span
class="comment">// the first marked sub-expression followed by the value of the second</span>
    <span class="comment">// marked sub-expression</span>
    <span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">subs</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span> <span class="special">};</span>

    <span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">time</span><span class="special">,</span> <span class="identifier">subs</span> <span class="special">);</span>
    <span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>

    <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
        This program outputs the following:
      </p>
<pre class="programlisting">4
40
3
35
2
32
</pre>
<p>
        <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
      </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h5"></a>
        <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
        a string using a regex as a delimiter</a>
      </h5>
<p>
        The following program takes some text that has been marked up with HTML and
        strips out the mark-up. It uses a regex that matches an HTML tag and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
        that returns the parts of the string that do <span class="emphasis"><em>not</em></span> match
        the regex.
      </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Now <bold>is the time <i>for all good men</i> to come to the aid of their</bold> country."</span> <span class="special">);</span>

    <span class="comment">// find an HTML tag</span>
    <span class="identifier">sregex</span> <span class="identifier">html</span> <span class="special">=</span> <span class="char">'<'</span> <span class="special">>></span> <span class="identifier">optional</span><span class="special">(</span><span class="char">'/'</span><span class="special">)</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="char">'>'</span><span class="special">;</span>

    <span class="comment">// the -1 below directs the token iterator to display the parts of</span>
    <span class="comment">// the string that did NOT match the regular expression.</span>
    <span
class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">html</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">);</span>
    <span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>

    <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'{'</span> <span class="special"><<</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special"><<</span> <span class="char">'}'</span><span class="special">;</span>
    <span class="special">}</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
        This program outputs the following:
      </p>
<pre class="programlisting">{Now }{is the time }{for all good men}{ to come to the aid of their}{ country.}
</pre>
<p>
        <br> <a
class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
      </p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h6"></a>
        <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">Display
        a tree of nested results</a>
      </h5>
<p>
        Here is a helper class to demonstrate how you might display a tree of nested
        results:
      </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">algorithm</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iterator</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="comment">// Displays nested results to std::cout with indenting</span>
<span class="keyword">struct</span> <span class="identifier">output_nested_results</span>
<span class="special">{</span>
    <span class="keyword">int</span> <span class="identifier">tabs_</span><span class="special">;</span>

    <span class="identifier">output_nested_results</span><span class="special">(</span> <span class="keyword">int</span> <span class="identifier">tabs</span> <span class="special">=</span> <span class="number">0</span> <span class="special">)</span>
        <span class="special">:</span> <span class="identifier">tabs_</span><span class="special">(</span> <span class="identifier">tabs</span> <span class="special">)</span>
    <span class="special">{</span>
    <span class="special">}</span>

    <span class="keyword">template</span><span class="special"><</span> <span class="keyword">typename</span> <span class="identifier">BidiIterT</span> <span class="special">></span>
    <span class="keyword">void</span> <span class="keyword">operator</span> <span class="special">()(</span> <span class="identifier">match_results</span><span class="special"><</span> <span class="identifier">BidiIterT</span> <span class="special">></span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span> <span
class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="comment">// first, do some indenting</span>
        <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">iterator_traits</span><span class="special"><</span> <span class="identifier">BidiIterT</span> <span class="special">>::</span><span class="identifier">value_type</span> <span class="identifier">char_type</span><span class="special">;</span>
        <span class="identifier">char_type</span> <span class="identifier">space_ch</span> <span class="special">=</span> <span class="identifier">char_type</span><span class="special">(</span><span class="char">' '</span><span class="special">);</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">fill_n</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span><span class="identifier">char_type</span><span class="special">>(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">),</span> <span class="identifier">tabs_</span> <span class="special">*</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">space_ch</span> <span class="special">);</span>

        <span class="comment">// output the match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

        <span class="comment">// output any nested matches</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
            <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
            <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
            <span class="identifier">output_nested_results</span><span class="special">(</span> <span class="identifier">tabs_</span> <span class="special">+</span> <span class="number">1</span> <span class="special">)</span> <span class="special">);</span>
    <span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
        <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
      </p>
</div>
<div class="footnotes">
<br><hr style="width:100; text-align:left;margin-left: 0">
<div id="ftn.boost_xpressive.user_s_guide.introduction.f0" class="footnote"><p><a href="#boost_xpressive.user_s_guide.introduction.f0" class="para"><sup class="para">[36] </sup></a>
        See <a href="http://www.osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html" target="_top">Expression
        Templates</a>
      </p></div>
<div id="ftn.boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="footnote"><p><a href="#boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="para"><sup class="para">[37] </sup></a>
        Many thanks to David Jenkins, who contributed this example.
      </p></div>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2007 Eric Niebler<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../xpressive.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="reference.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>