Lines Matching +full:vm +full:- +full:pointer
17 implementation of the Pike VM (similar to Thompson's construction, but supports
19 --- This library contains such an implementation in src/pikevm.rs.
21 Making it fast is harder. One of the key problems with the Pike VM is that it
23 positions between them. The Pike VM also spends a lot of time following the
25 speed up the Pike VM: extract one or more literal prefixes from the regular
27 prefixes in the search text. The Pike VM can then be avoided for most of the
29 prefixes is in the regex-syntax crate (in this repository). The code to search
31 we fall back to an Aho-Corasick DFA using the aho-corasick crate. For one
32 literal, we use a variant of the Boyer-Moore algorithm. Both Aho-Corasick and
33 Boyer-Moore use `memchr` when appropriate. The Boyer-Moore variant in this
39 to executing the Pike VM: backtracking, whose implementation can be found in
49 It is distinct from the Pike VM in that the DFA is explicitly represented in
63 The following sub-sections describe the rest of the library and how each of the
68 Regular expressions are parsed using the regex-syntax crate, which is
69 maintained in this repository. The regex-syntax crate defines an abstract
75 The regex-syntax crate also provides sophisticated support for extracting
84 non-deterministic finite automaton. In particular, the opcodes explicitly rely
97 The first column is the instruction pointer and the second column is the
115 `goto` pointer embedded into it. This resulted in a small performance boost for
116 the Pike VM, because it was one fewer epsilon transition that it had to follow.
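
To illustrate the point about embedded `goto` pointers, here is a minimal Pike VM sketch. It is not the code in src/pikevm.rs: the `Inst` enum and the functions `add_thread`, `is_match`, and `a_plus_b` are hypothetical names chosen for this example, and the sketch ignores captures, character classes, and unanchored search. Because each `Char` instruction carries its own `goto`, stepping past a character needs no separate jump instruction, so the only epsilon transitions left to follow are the `Split`s.

```rust
// Hypothetical instruction set for illustration only; not the crate's types.
#[derive(Clone, Copy)]
enum Inst {
    // Consume the character `c`, then continue at instruction `goto`.
    Char { c: char, goto: usize },
    // Epsilon transition: continue at both `a` and `b`.
    Split { a: usize, b: usize },
    // The program has matched.
    Match,
}

// Add the thread at `pc` to `list`, following epsilon transitions eagerly.
// `seen` ensures each instruction is added at most once per input position.
fn add_thread(prog: &[Inst], list: &mut Vec<usize>, seen: &mut [bool], pc: usize) {
    if seen[pc] {
        return;
    }
    seen[pc] = true;
    match prog[pc] {
        Inst::Split { a, b } => {
            add_thread(prog, list, seen, a);
            add_thread(prog, list, seen, b);
        }
        _ => list.push(pc),
    }
}

// Returns true if `prog` matches some prefix of `text` (anchored at the start).
fn is_match(prog: &[Inst], text: &str) -> bool {
    let mut clist = Vec::new();
    let mut nlist = Vec::new();
    let mut seen = vec![false; prog.len()];
    add_thread(prog, &mut clist, &mut seen, 0);
    let mut chars = text.chars();
    loop {
        let next = chars.next();
        // Reset the per-position bookkeeping before filling `nlist`.
        seen.iter_mut().for_each(|s| *s = false);
        for &pc in &clist {
            match prog[pc] {
                Inst::Match => return true,
                Inst::Char { c, goto } => {
                    if next == Some(c) {
                        // The embedded `goto` lets us jump directly.
                        add_thread(prog, &mut nlist, &mut seen, goto);
                    }
                }
                // `add_thread` never stores a `Split` in a thread list.
                Inst::Split { .. } => unreachable!(),
            }
        }
        if nlist.is_empty() {
            return false;
        }
        std::mem::swap(&mut clist, &mut nlist);
        nlist.clear();
    }
}

// A hand-assembled program for the regex `a+b`.
fn a_plus_b() -> Vec<Inst> {
    vec![
        Inst::Char { c: 'a', goto: 1 },
        Inst::Split { a: 0, b: 2 },
        Inst::Char { c: 'b', goto: 3 },
        Inst::Match,
    ]
}
```

With the program returned by `a_plus_b()`, `is_match` accepts `"aab"` and rejects `"b"`.
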
125 performing UTF-8 decoding and executing instructions using Unicode codepoints.
126 In the latter case, the program handles UTF-8 decoding implicitly, so that the
131 efficient manner and (2) the Pike VM benefits greatly from inlining Unicode
135 N.B. UTF-8 decoding is built into the compiled program by making use of the
136 utf8-ranges crate. The compiler in this library factors out common suffixes to
150 boundary assertions and (2) the caller never asks for sub-capture locations.
156 1. The Pike VM (supports captures).
158 3. Literal substring or multi-substring search.
172 For the most part, the execution logic is straightforward and follows the
175 a forwards and backwards search, and then falls back to either the Pike VM or
192 matching engine in the crate was the Pike VM. The `regex!` macro was, itself,
193 also a Pike VM. The only advantages it offered over the dynamic Pike VM that
225 located in src/testdata. The scripts/regex-match-tests.py script takes the test suite
238 * `tests/test_default.rs` - tests `Regex::new`
239 * `tests/test_default_bytes.rs` - tests `bytes::Regex::new`
240 * `tests/test_nfa.rs` - tests `Regex::new`, forced to use the NFA
242 * `tests/test_nfa_bytes.rs` - tests `Regex::new`, forced to use the NFA
244 * `tests/test_nfa_utf8bytes.rs` - tests `Regex::new`, forced to use the NFA
245 algorithm on every regex and use *UTF-8* byte-based programs.
246 * `tests/test_backtrack.rs` - tests `Regex::new`, forced to use
248 * `tests/test_backtrack_bytes.rs` - tests `Regex::new`, forced to use
250 * `tests/test_backtrack_utf8bytes.rs` - tests `Regex::new`, forced to use
251 backtracking on every regex and use *UTF-8* byte-based programs.
252 * `tests/test_crates_regex.rs` - tests to make sure that all of the
264 times slightly, try using `cargo test --test default`, which will only use the
276 The benchmarking in this crate is made up of many micro-benchmarks. Currently,
292 * `bench_rust.rs` - benchmarks `Regex::new`
294 * `bench_pcre.rs` - benchmarks PCRE
295 * `bench_onig.rs` - benchmarks Oniguruma
313 $ cargo benchcmp old new --improvements
315 The `cargo-benchcmp` utility is available here:
316 https://github.com/BurntSushi/cargo-benchcmp
319 `./bench/bench --help`.
335 $ rustdoc --crate-name docs src/lib.rs -o target/doc -L target/debug/deps --no-defaults --passes co…
340 See https://github.com/rust-lang/rust/issues/15347 for more info