Lines Matching +full:utf +full:- +full:8
19 --------
20 There are readable examples in the `ctest` and `examples` sub-directories.
23 [Rust and Cargo installed](https://www.rust-lang.org/downloads.html)
27 $ git clone https://github.com/rust-lang/regex
28 $ cd regex/regex-capi/examples
35 -----------
45 https://github.com/rust-lang/regex/blob/master/PERFORMANCE.md
49 -------------
50 All regular expressions must be valid UTF-8.
53 approximation, haystacks should be UTF-8. In fact, UTF-8 (and, one
55 library. It is impossible to match UTF-16, UTF-32 or any other encoding
56 without first transcoding it to UTF-8.
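To illustrate the transcoding step, here is a minimal sketch in plain C that converts a UTF-16 buffer to UTF-8 before it is used as a haystack. The function name `utf16_to_utf8` and its signature are hypothetical and not part of this library; a real program would more likely rely on an existing transcoding routine (for example from the platform or ICU).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the library's API.
 * Transcodes `n` UTF-16 code units from `src` into UTF-8 bytes in `dst`
 * (capacity `cap` bytes). Returns the number of bytes written, or
 * (size_t)-1 on an unpaired surrogate or insufficient capacity.
 */
static size_t utf16_to_utf8(const uint16_t *src, size_t n,
                            uint8_t *dst, size_t cap) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= n || src[i + 1] < 0xDC00 || src[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1; /* low surrogate with no preceding high one */
        }
        /* Number of UTF-8 bytes needed for this codepoint. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (cap - out < need)
            return (size_t)-1;
        if (need == 1) {
            dst[out++] = (uint8_t)cp;
        } else if (need == 2) {
            dst[out++] = (uint8_t)(0xC0 | (cp >> 6));
            dst[out++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (uint8_t)(0xE0 | (cp >> 12));
            dst[out++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (uint8_t)(0xF0 | (cp >> 18));
            dst[out++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The returned byte count, together with `dst`, is what would then be passed along as the haystack pointer and length. Note that match offsets reported against the transcoded buffer are UTF-8 byte offsets, not UTF-16 code unit indices.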
58 With that said, haystacks do not need to be valid UTF-8, and if they aren't
59 valid UTF-8, no performance penalty is paid. Whether invalid UTF-8 is
62 single UTF-8 encoding of a Unicode codepoint (sans LF). In particular,
63 it will not match invalid UTF-8 such as `\xFF`, nor will it match surrogate
64 codepoints or "alternate" (i.e., non-minimal) encodings of codepoints.
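The three rejected categories (invalid bytes, surrogates, non-minimal encodings) can be made concrete with a small standalone checker. This is only a sketch of what "a single valid UTF-8 encoding of a codepoint" means; it is not how the library implements matching, the name `utf8_valid_prefix_len` is hypothetical, and it does not apply the LF exclusion mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the library's API.
 * Returns the length in bytes (1 to 4) of the single valid UTF-8 encoded
 * codepoint at the start of `s`, or 0 if the bytes there are not valid UTF-8.
 */
static size_t utf8_valid_prefix_len(const uint8_t *s, size_t len) {
    if (len == 0)
        return 0;
    uint8_t b0 = s[0];
    size_t n;      /* expected sequence length */
    uint32_t cp;   /* decoded codepoint */
    uint32_t min;  /* smallest codepoint that needs n bytes */
    if (b0 < 0x80) {
        return 1; /* ASCII */
    } else if ((b0 & 0xE0) == 0xC0) {
        n = 2; cp = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
        n = 3; cp = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
        n = 4; cp = b0 & 0x07; min = 0x10000;
    } else {
        return 0; /* stray continuation byte, or 0xF8..0xFF such as \xFF */
    }
    if (len < n)
        return 0; /* truncated sequence */
    for (size_t i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0; /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min)
        return 0; /* "alternate" (non-minimal) encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0; /* surrogate codepoint */
    if (cp > 0x10FFFF)
        return 0; /* beyond the Unicode range */
    return n;
}
```

For instance, the bytes `\xC3\xA9` (U+00E9) are accepted with length 2, while `\xFF`, `\xC0\x80` (a non-minimal encoding of U+0000), and `\xED\xA0\x80` (the surrogate U+D800) are all rejected.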
69 corresponding regex is guaranteed to match valid UTF-8. Invalid UTF-8 will
72 which parts of the regular expression must match UTF-8 or not.
77 expression with `(?-u)`.
79 Finally, if you want to match specific invalid UTF-8 bytes, then you can
80 use escape sequences. For example, `(?-u)\\xFF` will match `\xFF`. It's not
82 expressions must be valid UTF-8.
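The difference between the two spellings is easiest to see at the byte level. The sketch below assumes the pattern is written as a C string literal; the variable and function names are hypothetical. With the doubled backslash, the pattern text contains only ASCII characters (a backslash followed by `xFF`), so it remains valid UTF-8 and the regex engine interprets the escape itself. With a single backslash, the C compiler substitutes a raw `0xFF` byte, which makes the pattern text itself invalid UTF-8.

```c
/* Pattern text as the regex engine should receive it: the four ASCII
 * characters '\', 'x', 'F', 'F' follow the "(?-u)" prefix. */
static const char escaped_pattern[] = "(?-u)\\xFF";

/* C literal escape: the compiler emits a single raw 0xFF byte here, so the
 * pattern text is no longer valid UTF-8. */
static const char raw_pattern[] = "(?-u)\xFF";

/* Returns 1 if every byte of the NUL-terminated string is ASCII (< 0x80).
 * All-ASCII text is always valid UTF-8. */
static int is_all_ascii(const char *s) {
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80)
            return 0;
    }
    return 1;
}
```

Here `escaped_pattern` is 9 bytes long and entirely ASCII, whereas `raw_pattern` is 6 bytes long and ends in the non-ASCII byte `0xFF`.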
86 ------
97 -------