UNICODE.md - OpenGrok cross reference for /external/rust/crates/regex/UNICODE.md

Lines Matching +full:emoji +full:- +full:regex
3 This document describes the regex crate's conformance to Unicode's
14    are ASCII-only definitions.
21 undertaking. This is at least partially a result of the fact that this regex
46 fixed-width variants of the same idea.
48 Note that when Unicode mode is disabled, any non-ASCII Unicode codepoint is
50 mode is disabled. That is, the regex `\xFF` matches the Unicode codepoint
51 U+00FF (encoded as `\xC3\xBF` in UTF-8) while the regex `(?-u)\xFF` matches
61 points specified by Unicode. The regex crate does not provide exhaustive
90 The following is a list of all properties supported by the regex crate (starred
111 * `Emoji`
154 The regex crate only provides ASCII definitions of the
163 `[[:word:]]` and `(?-u)\w` are equivalent.
170 The regex crate provides full support for nested character classes, along with
171 union, intersection (`&&`), difference (`--`) and symmetric difference (`~~`)
174 For example, to match all non-ASCII letters, you could use either
175 `[\p{Letter}--\p{Ascii}]` (difference) or `[\p{Letter}&&[^\p{Ascii}]]`
183 The regex crate provides basic Unicode aware word boundary assertions. A word
185 boundary corresponds to a zero-width match, where its adjacent
186 characters correspond to word and non-word, or non-word and word characters.
207 `(?-u)\b` is an ASCII-only word boundary. This can occasionally be beneficial
209 boundaries is currently sub-optimal on non-ASCII text.
216 The regex crate provides full support for case insensitive matching in
221 example, straightforward to implement.
231 The regex crate only provides support for recognizing the `\n` (`END OF LINE`)
244 The regex crate provides full support for Unicode code point matching. Namely,
247 Given Rust's strong ties to UTF-8, the following guarantees are also provided:
249 * All matches are reported on valid UTF-8 code unit boundaries. That is, any
250   match range returned by the public regex API is guaranteed to successfully
253   No support for UTF-16 is provided, so this is never necessary.
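
To make the boundary guarantee above concrete, here is a minimal sketch using only the Rust standard library (it does not call the regex crate). The helper names `slice_on_boundaries` and `utf8_bytes` are hypothetical and exist only for this illustration. The sketch shows that U+00FF occupies two bytes in UTF-8, and that a byte range can be sliced out of a `&str` only when both ends fall on code point boundaries, which is the property the listing says every reported match range has.

```rust
/// Hypothetical helper (not part of the regex crate): returns the substring
/// for the byte range `start..end`, or `None` if either end is out of bounds
/// or does not fall on a UTF-8 code point boundary.
fn slice_on_boundaries(text: &str, start: usize, end: usize) -> Option<&str> {
    // `str::get` performs the boundary check instead of panicking.
    text.get(start..end)
}

/// Hypothetical helper: returns the UTF-8 encoding of a single code point.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4]; // a code point needs at most 4 bytes in UTF-8
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}
```

For instance, `utf8_bytes('\u{FF}')` yields the two bytes `0xC3 0xBF`, and in the string `"ÿa"` the range `0..2` slices successfully while `1..2` does not, because offset 1 lies in the middle of the encoding of U+00FF.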
257 Unicode features are disabled as well. For example, `(?-u)\pL` is not a valid
258 regex but `\pL(?-u)\xFF` (matches any Unicode `Letter` followed by the literal