[/
  Copyright 2006-2007 John Maddock.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]

[section:faq FAQ]

[*Q.] I can't get regex++ to work with escape characters, what's going on?

[*A.] If you embed regular expressions in C++ code, remember that escape
characters are processed twice: once by the C++ compiler, and once by the
Boost.Regex expression compiler. So to pass the regular expression \d+
to Boost.Regex, you need to embed "\\d+" in your code. Likewise, to match a
literal backslash you will need to embed "\\\\" in your code.

[*Q.] No matter what I do regex_match always returns false, what's going on?

[*A.] The algorithm regex_match only succeeds if the expression matches *all*
of the text. If you want to *find* a sub-string within the text that matches
the expression, use regex_search instead.

[*Q.] Why does using parentheses in a POSIX regular expression change the
result of a match?

[*A.] For POSIX (extended and basic) regular expressions, but not for Perl regexes,
parentheses don't only mark; they determine what the best match is as well.
When the expression is compiled as a POSIX basic or extended regex, Boost.Regex
follows the POSIX standard leftmost longest rule for determining what matched.
So if there is more than one possible match after considering the whole expression,
it looks next at the first sub-expression and then the second sub-expression
and so on. So...

"'''(0*)([0-9]*)'''" against "00123" would produce
$1 = "00"
$2 = "123"

whereas

"0*([0-9]*)" against "00123" would produce
$1 = "00123"

If you think about it, had $1 only matched the "123", this would be "less good"
than the match "00123" which is both further to the left and longer. If you
want $1 to match only the "123" part, then you need to use something like:

"0*([1-9][0-9]*)"

as the expression.

[*Q.] Why don't character ranges work properly (POSIX mode only)?

[*A.] The POSIX standard specifies that character range expressions are
locale sensitive - so for example the expression [A-Z] will match any
collating element that collates between 'A' and 'Z'. That means that for
most locales other than "C" or "POSIX", [A-Z] would match the single
character 't' for example, which is not what most people expect - or
at least not what most people have come to expect from regular
expression engines. For this reason, the default behaviour of Boost.Regex
(Perl mode) is to turn locale sensitive collation off by not setting the
`regex_constants::collate` compile time flag. However, if you set a non-default
compile time flag - for example `regex_constants::extended` or
`regex_constants::basic` - then locale dependent collation will be enabled.
This also applies to the POSIX API functions, which use either
`regex_constants::extended` or `regex_constants::basic` internally.
[Note - when `regex_constants::nocollate` is in effect, the library behaves
"as if" the LC_COLLATE locale category were always "C", regardless of what
it's actually set to - end note].

[*Q.] Why are there no throw specifications on any of the functions?
What exceptions can the library throw?

[*A.] Not all compilers support (or honor) throw specifications; others
support them but with reduced efficiency. Throw specifications may be added
at a later date as compilers begin to handle this better. The library
should throw only three types of exception: [boost::regex_error] can be
thrown by [basic_regex] when compiling a regular expression; `std::runtime_error`
can be thrown when a call to `basic_regex::imbue` tries to open a message
catalogue that doesn't exist, when a call to [regex_search] or [regex_match]
results in an "everlasting" search, or when a call to `RegEx::GrepFiles` or
`RegEx::FindFiles` tries to open a file that cannot be opened; finally,
`std::bad_alloc` can be thrown by just about any of the functions in this library.

[*Q.] Why can't I use the "convenience" versions of regex_match /
regex_search / regex_grep / regex_format / regex_merge?

[*A.] These versions may or may not be available depending upon the
capabilities of your compiler. The rules determining the format of
these functions are quite complex, and only the versions visible to
a standard-compliant compiler are given in the help. To find out
what your compiler supports, run <boost/regex.hpp> through your
C++ pre-processor, and search the output file for the function
that you are interested in. Note, however, that very few current
compilers still have problems with these overloaded functions.

[endsect]