[/
 / Copyright (c) 2008 Eric Niebler
 /
 / Distributed under the Boost Software License, Version 1.0. (See accompanying
 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 /]

[section Introduction]

[h2 What is xpressive?]

xpressive is a regular expression template library. Regular expressions
(regexes) can be written as strings that are parsed dynamically at runtime
(dynamic regexes), or as ['expression templates][footnote See
[@http://www.osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html
Expression Templates]] that are parsed at compile-time (static regexes).
Dynamic regexes have the advantage that they can be accepted from the user
as input at runtime or read from an initialization file. Static regexes
have several advantages. Since they are C++ expressions instead of
strings, they can be syntax-checked at compile-time. Also, they can naturally
refer to code and data elsewhere in your program, giving you the ability to call
back into your code from within a regex match. Finally, since they are statically
bound, the compiler can generate faster code for static regexes.

xpressive's dual nature is unique and powerful. Static xpressive is a bit
like the _spirit_fx_. Like _spirit_, you can build grammars with
static regexes using expression templates. (Unlike _spirit_, xpressive does
exhaustive backtracking, trying every possibility to find a match for your
pattern.) Dynamic xpressive is a bit like _regexpp_. In fact,
xpressive's interface should be familiar to anyone who has used _regexpp_.
xpressive's innovation comes from allowing you to mix and match static and
dynamic regexes in the same program, and even in the same expression! You
can embed a dynamic regex in a static regex, or /vice versa/, and the embedded
regex will participate fully in the search, back-tracking as needed to make
the match succeed.

[h2 Hello, world!]

Enough theory.
Let's have a look at ['Hello World], xpressive style:

    #include <iostream>
    #include <string>
    #include <boost/xpressive/xpressive.hpp>

    using namespace boost::xpressive;

    int main()
    {
        std::string hello( "hello world!" );

        sregex rex = sregex::compile( "(\\w+) (\\w+)!" );
        smatch what;

        if( regex_match( hello, what, rex ) )
        {
            std::cout << what[0] << '\n'; // whole match
            std::cout << what[1] << '\n'; // first capture
            std::cout << what[2] << '\n'; // second capture
        }

        return 0;
    }

This program outputs the following:

[pre
hello world!
hello
world
]

The first thing you'll notice about the code is that all the types in xpressive live in
the `boost::xpressive` namespace.

[note Most of the rest of the examples in this document will leave off the
`using namespace boost::xpressive;` directive. Just pretend it's there.]

Next, you'll notice the type of the regular expression object is `sregex`. If you are familiar
with _regexpp_, this is different than what you are used to. The "`s`" in "`sregex`" stands for
"`string`", indicating that this regex can be used to find patterns in `std::string` objects.
I'll discuss this difference and its implications in detail later.

Notice how the regex object is initialized:

    sregex rex = sregex::compile( "(\\w+) (\\w+)!" );

To create a regular expression object from a string, you must call a factory method such as
_regex_compile_. This is another area in which xpressive differs from
other object-oriented regular expression libraries. Other libraries encourage you to think of
a regular expression as a kind of string on steroids. In xpressive, regular expressions are not
strings; they are little programs in a domain-specific language. Strings are only one ['representation]
of that language. Another representation is an expression template.
For example, the above line of code
is equivalent to the following:

    sregex rex = (s1= +_w) >> ' ' >> (s2= +_w) >> '!';

This describes the same regular expression, except it uses the domain-specific embedded language
defined by static xpressive.

As you can see, static regexes have a syntax that is noticeably different than standard Perl
syntax. That is because we are constrained by C++'s syntax. The biggest difference is the use
of `>>` to mean "followed by". For instance, in Perl you can just put sub-expressions next
to each other:

    abc

But in C++, there must be an operator separating sub-expressions:

    a >> b >> c

In Perl, parentheses `()` have special meaning. They group, but as a side-effect they also create
back-references like [^$1] and [^$2]. In C++, there is no way to overload parentheses to give them
side-effects. To get the same effect, we use the special `s1`, `s2`, etc. tokens. Assign to
one to create a back-reference (known as a sub-match in xpressive).

You'll also notice that the one-or-more repetition operator `+` has moved from postfix
to prefix position. That's because C++ doesn't have a postfix `+` operator. So:

    "\\w+"

is the same as:

    +_w

We'll cover all the other differences [link boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes later].

[endsect]