Lines Matching +full:set +full:- +full:output

6        pcre2test - a program for testing Perl-compatible regular expressions.
10 pcre2test [options] [input file [output file]]
15 of the regular expressions themselves, see the pcre2pattern documenta-
16 tion. For details of the PCRE2 library function calls and their op-
21 setting defaults and controlling some special actions. The output shows
23 command lines, the patterns, and the subject lines specify PCRE2 func-
24 tion options, control how the subject is processed, and what output is
27 There are many obscure modifiers, some of which are specifically de-
34 PCRE2's 8-BIT, 16-BIT AND 32-BIT LIBRARIES
36 Different versions of the PCRE2 library can be built to support charac-
37 ter strings that are encoded in 8-bit, 16-bit, or 32-bit code units.
38 One, two, or all three of these libraries may be simultaneously in-
40 However, its own input and output are always in 8-bit format. When
41 testing the 16-bit or 32-bit libraries, patterns and subject strings
42 are converted to 16-bit or 32-bit format before being passed to the li-
43 brary functions. Results are converted back to 8-bit code units for
44 output.
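
The round trip described above can be illustrated with a rough Python sketch. It is not pcre2test's actual code: the helper names utf8_to_utf16_units and utf16_units_to_utf8 are hypothetical, and the sketch assumes UTF mode with the 16-bit library, where 8-bit input is read as UTF-8 and results are turned back into 8-bit code units for output.

```python
def utf8_to_utf16_units(data):
    """Turn 8-bit (UTF-8) input into a list of 16-bit code units."""
    raw = data.decode("utf-8").encode("utf-16-le")
    # Each pair of bytes is one 16-bit code unit.
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def utf16_units_to_utf8(units):
    """Turn 16-bit code units back into 8-bit (UTF-8) output."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le").encode("utf-8")


# A subject line as it would appear in the 8-bit input file.
subject = "caf\u00e9 \U0001F600".encode("utf-8")
units = utf8_to_utf16_units(subject)
restored = utf16_units_to_utf8(units)
```

In this sketch the ten input bytes become seven 16-bit code units, because the character beyond U+FFFF needs a surrogate pair, and converting back reproduces the original bytes.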
46 In the rest of this document, the names of library functions and struc-
47 tures are given in generic form, for example, pcre2_compile(). The ac-
48 tual names used in the libraries have a suffix _8, _16, or _32, as ap-
61 contain binary zeros, even though in Unix-like environments, fgets()
66 patterns, there is a facility for specifying some or all of the 8-bit
67 input characters as hexadecimal pairs, which makes it possible to in-
70 Input for the 16-bit and 32-bit libraries
72 When testing the 16-bit or 32-bit libraries, there is a need to be able
75 used. In addition, when the utf modifier (see "Setting compilation op-
76 tions" below) is set, the pattern and any following subject lines are
77 interpreted as UTF-8 strings and translated to UTF-16 or UTF-32 as ap-
80 For non-UTF testing of wide characters, the utf8_input modifier can be
82 16-bit or 32-bit mode. It causes the pattern and following subject
83 lines to be treated as UTF-8 according to the original definition (RFC
84 2279), which allows for character values up to 0x7fffffff. Each charac-
85 ter is placed in one 16-bit or 32-bit code unit (in the 16-bit case,
88 UTF-8 (in its original definition) is not capable of encoding values
89 greater than 0x7fffffff, but such values can be handled by the 32-bit
90 library. When testing this library in non-UTF mode with utf8_input set,
92 in UTF-8) 0x80000000 is added to the character's value. This is the
99 -8 If the 8-bit library has been built, this option causes it to
100 be used (this is the default). If the 8-bit library has not
103 -16 If the 16-bit library has been built, this option causes it
104 to be used. If only the 16-bit library has been built, this
105 is the default. If the 16-bit library has not been built,
108 -32 If the 32-bit library has been built, this option causes it
109 to be used. If only the 32-bit library has been built, this
110 is the default. If the 32-bit library has not been built,
113 -ac Behave as if each pattern has the auto_callout modifier, that
114 is, insert automatic callouts into every pattern that is com-
117 -AC As for -ac, but in addition behave as if each subject line
118 has the callout_extra modifier, that is, show additional in-
121 -b Behave as if each pattern has the fullbincode modifier; the
122 full internal binary form of the pattern is output after com-
125 -C Output the version number of the PCRE2 library, and all
127 included, and then exit with zero exit code. All other op-
128 tions are ignored. If both -C and -LM are present, whichever
131 -C option Output information about a specific build-time option, then
133 as RunTest. The following options output the value and set
136 ebcdic-nl the code for LF (= NL) in an EBCDIC environment:
141 exit code is set to the link size
149 The following options output 1 for true or 0 for false, and
150 set the exit code to the same value:
152 backslash-C \C is supported (not locked out)
154 jit just-in-time support is available
155 pcre2-16 the 16-bit library was built
156 pcre2-32 the 32-bit library was built
157 pcre2-8 the 8-bit library was built
160 If an unknown option is given, an error message is output;
163 -d Behave as if each pattern has the debug modifier; the inter-
164 nal form and information about the compiled pattern is output
165 after compilation; -d is equivalent to -b -i.
167 -dfa Behave as if each subject line has the dfa modifier; matching
171 -error number[,number,...]
173 in the comma-separated list, display the resulting messages
174 on the standard output, then exit with zero exit code. The
178 -help Output a brief summary of these options and then exit.
180 -i Behave as if each pattern has the info modifier; information
183 -jit Behave as if each pattern line has the jit modifier; after
184 successful compilation, each pattern is passed to the just-
185 in-time compiler, if available.
187 -jitfast Behave as if each pattern line has the jitfast modifier; af-
189 just-in-time compiler, if available, and each subject line is
192 -jitverify
195 just-in-time compiler, if available, and the use of JIT for
198 -LM List modifiers: write a list of available pattern and subject
199 modifiers to the standard output, then exit with zero exit
200 code. All other options are ignored. If both -C and any -Lx
203 -LP List properties: write a list of recognized Unicode proper-
204 ties to the standard output, then exit with zero exit code.
205 All other options are ignored. If both -C and any -Lx options
208 -LS List scripts: write a list of recognized Unicode script names
209 to the standard output, then exit with zero exit code. All
210 other options are ignored. If both -C and any -Lx options are
213 -pattern modifier-list
216 -q Do not output the version number of pcre2test at the start of
219 -S size On Unix-like systems, set the size of the run-time stack to
222 -subject modifier-list
225 -t Run each compile and match many times with a timer, and out-
229 that are used for timing by following -t with a number (as a
230 separate item on the command line). For example, "-t 1000"
233 -tm This is like -t except that it times only the matching phase,
236 -T -TM These behave like -t and -tm, but in addition, at the end of
237 a run, the total times for all compiles and matches are out-
240 -version Output the PCRE2 version number and then exit.
246 and writes to the second. If the first name is "-", input is taken from
254 function. This provides line-editing and history facilities. The output
255 from the -help option states whether or not readline() will be used.
258 set of input lines. Each set starts with a regular expression pattern,
259 followed by any number of subject lines to be matched against that pat-
263 checking that the behaviour of PCRE2 and Perl is the same. For a speci-
273 to do multi-line matches, you have to use the \n escape sequence (or \r
282 lines for a test, at which point a new pattern or command line is ex-
296 PCRE2_NEVER_UCP options set, which locks out the use of the PCRE2_UTF
300 when PCRE2_UTF is not set, but which require Unicode property support
308 unset, and the automatic options are not displayed in pattern informa-
309 tion, to avoid cluttering up test output.
313 This command is used to load a set of precompiled patterns from a file,
319 This command is used to load a set of binary character tables that can
321 the pcre2_dftables program with the -b option.
323 #newline_default [<newline-list>]
328 be overridden when a pattern is compiled. The standard test files con-
330 tests expect a single linefeed to be recognized as a newline by de-
331 fault. Without special action the tests would fail when PCRE2 is com-
335 acceptable as the default. The types must be one of CR, LF, CRLF, ANY-
340 If the default newline is in the list, this command has no effect. Oth-
342 specifies the first newline convention in the list (LF in the above ex-
347 When the POSIX API is being tested there is no way to override the de-
348 fault newline convention, though it is possible to set the newline con-
350 posix_nosub modifier is used when #newline_default would set a default
351 for the non-POSIX API.
353 #pattern <modifier-list>
355 This command sets a default modifier list that applies to all subse-
360 This line is used in test files that can also be processed by perl-
361 test.sh to confirm that Perl gives the same results as PCRE2. Subse-
362 quent tests are checked for the use of pcre2test features that are in-
367 that set or unset "mark" are recognized and acted on. The #perltest,
384 This command is used to save a set of compiled patterns to a file, as
385 described in the section entitled "Saving and restoring compiled pat-
388 #subject <modifier-list>
390 This command sets a default modifier list that applies to all subse-
391 quent subject lines. Modifiers on a subject line can change these set-
401 one or the other. Each modifier has a long name, for example "an-
403 value, for example, "offset=12". Values cannot contain comma charac-
407 A few of the more common modifiers can also be specified as single let-
417 This is a pattern line whose modifier list starts with two one-letter
418 modifiers (/i and /g). The lower-case abbreviated modifiers are the
425 symbols, excluding pattern meta-characters):
427 / ! " ' ` - = _ : ; , % & @ ~
431 characters are included within it. It is possible to include the delim-
438 but since the delimiters are all non-alphanumeric, the inclusion of the
441 the backslash will itself be interpreted as a literal. If the terminat-
453 causing pcre2test to read the next line as a continuation of the regu-
464 modifier was set for the pattern. The following provide a means of en-
465 coding non-printing characters in a visible way:
476 a byte unless > 255 in UTF-8 or 16-bit or 32-bit mode
482 the pattern. It is recognized always. There may be any number of hexa-
483 decimal digits inside the braces; invalid values provoke error mes-
486 Note that \xhh specifies one byte rather than one character in UTF-8
487 mode; this makes it possible to construct invalid UTF-8 sequences for
488 testing purposes. On the other hand, \x{hh} is interpreted as a UTF-8
489 character in UTF-8 mode, generating more than one byte if the value is
490 greater than 127. When testing the 8-bit library not in UTF-8 mode,
494 In UTF-16 mode, all 4-digit \x{hhhh} values are accepted. This makes it
495 possible to construct invalid UTF-16 sequences for testing purposes.
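
The difference between \xhh and \x{hh} in UTF-8 mode can be sketched in Python as follows. This is only an illustration of the byte sequences involved, not pcre2test's implementation; the function names escape_xhh and escape_x_braces are hypothetical.

```python
def escape_xhh(value):
    """\\xhh: always exactly one byte, even in UTF-8 mode."""
    return bytes([value])


def escape_x_braces(value):
    """\\x{hh} in UTF-8 mode: the character, encoded as UTF-8."""
    return chr(value).encode("utf-8")


one_byte = escape_xhh(0xE9)         # a lone byte, not valid UTF-8 by itself
as_char = escape_x_braces(0xE9)     # two bytes, because the value is > 127
ascii_char = escape_x_braces(0x41)  # one byte, because the value is <= 127

# The lone byte is an invalid UTF-8 sequence, which is what makes \xhh
# useful for constructing invalid input for testing purposes.
try:
    one_byte.decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False
```

Here one_byte holds the single byte 0xE9, while as_char holds the two-byte sequence 0xC3 0xA9 for the same value.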
497 In UTF-32 mode, all 4- to 8-digit \x{...} values are accepted. This
498 makes it possible to construct invalid UTF-32 sequences for testing
526 A backslash followed by any other non-alphanumeric character just es-
533 If the subject_literal modifier is set for a pattern, all subject lines
534 that follow are treated as literals, with no special treatment of back-
536 set as defaults by a #subject command.
544 were set by a previous #pattern command.
548 The following modifiers set options for pcre2_compile(). Most of them
549 set bits in the options argument of that function, but those whose
550 names start with PCRE2_EXTRA are additional options that are set in the
551 compile context. For the main options, there are some single-letter ab-
552 breviations that are the same as Perl options. There is special han-
554 into PCRE2_EXTENDED_MORE as in Perl. A third appearance adds PCRE2_EX-
555 TENDED as well, though this makes no difference to the way pcre2_com-
559 allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
560 allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
561 allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
562 alt_bsux set PCRE2_ALT_BSUX
563 alt_circumflex set PCRE2_ALT_CIRCUMFLEX
564 alt_verbnames set PCRE2_ALT_VERBNAMES
565 anchored set PCRE2_ANCHORED
566 auto_callout set PCRE2_AUTO_CALLOUT
567 bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
568 /i caseless set PCRE2_CASELESS
569 dollar_endonly set PCRE2_DOLLAR_ENDONLY
570 /s dotall set PCRE2_DOTALL
571 dupnames set PCRE2_DUPNAMES
572 endanchored set PCRE2_ENDANCHORED
573 escaped_cr_is_lf set PCRE2_EXTRA_ESCAPED_CR_IS_LF
574 /x extended set PCRE2_EXTENDED
575 /xx extended_more set PCRE2_EXTENDED_MORE
576 extra_alt_bsux set PCRE2_EXTRA_ALT_BSUX
577 firstline set PCRE2_FIRSTLINE
578 literal set PCRE2_LITERAL
579 match_line set PCRE2_EXTRA_MATCH_LINE
580 match_invalid_utf set PCRE2_MATCH_INVALID_UTF
581 match_unset_backref set PCRE2_MATCH_UNSET_BACKREF
582 match_word set PCRE2_EXTRA_MATCH_WORD
583 /m multiline set PCRE2_MULTILINE
584 never_backslash_c set PCRE2_NEVER_BACKSLASH_C
585 never_ucp set PCRE2_NEVER_UCP
586 never_utf set PCRE2_NEVER_UTF
587 /n no_auto_capture set PCRE2_NO_AUTO_CAPTURE
588 no_auto_possess set PCRE2_NO_AUTO_POSSESS
589 no_dotstar_anchor set PCRE2_NO_DOTSTAR_ANCHOR
590 no_start_optimize set PCRE2_NO_START_OPTIMIZE
591 no_utf_check set PCRE2_NO_UTF_CHECK
592 ucp set PCRE2_UCP
593 ungreedy set PCRE2_UNGREEDY
594 use_offset_limit set PCRE2_USE_OFFSET_LIMIT
595 utf set PCRE2_UTF
598 non-printing characters in output strings to be printed using the
599 \x{hh...} notation. Otherwise, those less than 0x100 are output in hex
600 without the curly brackets. Setting utf in 16-bit or 32-bit mode also
601 causes pattern and subject strings to be translated to UTF-16 or
602 UTF-32, respectively, before being passed to library functions.
606 The following modifiers affect the compilation process or request in-
607 formation about the pattern. There are single-letter abbreviations for
614 convert_glob_escape=c set glob escape character
615 convert_glob_separator=c set glob separator character
616 convert_length set convert buffer length
626 max_pattern_length=<n> set the maximum pattern length
628 newline=<type> set newline type
630 parens_nest_limit=<n> set maximum parentheses depth
638 use_length do not zero-terminate the pattern
639 utf8_input treat input as UTF-8
646 set to "anycrlf", \R matches CR, LF, or CRLF only. If it is set to
648 specified when PCRE2 is built; if it is not, the default is set to Uni-
661 output after compilation. This information does not contain length and
662 offset values, which ensures that the same output is generated for dif-
664 bincode, the same regression tests can be used in different environ-
669 code unit widths and link sizes, and is also useful for one-off tests.
693 sets of options are the same, just a single "options" line is output;
700 no_start_optimize is set because the minimum length is not calculated
708 in the pattern. A list of them is output at the end of any other infor-
715 null_context modifier is set, however, NULL is passed. This is for
722 for substrings enclosed in single or double quotes, are to be inter-
724 way of creating patterns that contain binary zeros and other non-print-
731 contains nine characters, only two of which are specified in hexadeci-
736 Either single or double quotes may be used. There is no way of includ-
742 By default, patterns are passed to the compiling functions as zero-ter-
743 minated strings but can be passed by length instead of being zero-ter-
745 happens automatically (whether or not use_length is set) when hex is
746 set, because patterns specified in hexadecimal may contain binary ze-
753 Specifying wide characters in 16-bit and 32-bit modes
755 In 16-bit and 32-bit modes, all input is automatically treated as UTF-8
756 and translated to UTF-16 or UTF-32 when the utf modifier is set. For
757 testing the 16-bit and 32-bit libraries in non-UTF mode, the utf8_input
759 are interpreted as UTF-8 as a means of specifying wide characters. More
764 Some tests use long patterns that are very repetitive. Instead of cre-
772 are expanded before the pattern is passed to pcre2_compile(). For exam-
781 two values in the quantifier. For example, \[AB]{6000,6000} is not rec-
784 If the info modifier is set on an expanded pattern, the result of the
785 expansion is included in the information that is output.
789 Just-in-time (JIT) compiling is a heavyweight optimization that can
793 this to optimized machine code. It needs to know whether the match-time
799 JIT compilation is requested by the jit pattern modifier, which may op-
804 1 compile JIT code for non-partial matching
820 PCRE2_PARTIAL_HARD option set. Note that such a call may return a com-
823 for partial matching (for example, jit=2) but do not set the partial
825 none was compiled for non-partial matching.
827 If JIT compilation is successful, the compiled JIT code will automati-
828 cally be used when an appropriate type of match is run, except when in-
829 compatible run-time options are specified. For more details, see the
834 "fast path" interface, pcre2_jit_match(), which skips some of the san-
841 jitverify is specified without jit, jit=7 is assumed. If JIT compila-
842 tion is successful when jitverify is set, the text "(JIT)" is added to
843 the first output line after a match or non match when JIT-compiled code
852 The given locale is set, pcre2_maketables() is called to build a set of
853 character tables for the locale, and this is then passed to pcre2_com-
857 command if a default is needed. Setting a locale and alternate charac-
863 the compiled pattern to be output. This does not include the size of
864 the pcre2_code block; it is just the actual compiled data. If the pat-
866 compiled code is also output. Here is an example:
876 parentheses in a pattern. Breaching the limit causes a compilation er-
877 ror. The default for the library is set when PCRE2 is built, but
893 wrapper supports only the 8-bit library. Note that it does not imply
894 POSIX matching semantics; for more detail see the pcre2posix documenta-
895 tion. The following pattern modifiers set options for the regcomp()
913 been set, a large buffer is used.
915 The aftertext and allaftertext subject modifiers work as described be-
919 The pattern is passed to regcomp() as a zero-terminated string by de-
920 fault, but if the use_length or hex modifiers are set, the REG_PEND ex-
925 The stackguard modifier is used to test the use of pcre2_set_com-
927 availability to be checked during compilation (see the pcre2api docu-
929 greater than zero, pcre2_set_compile_recursion_guard() is called to set
932 than the value given by the modifier, non-zero is returned, causing the
938 0, 1, 2, or 3. It causes a specific set of built-in character tables to
940 behaviour with different character tables. The digit specifies the ta-
946 2 a set of tables defining ISO 8859 characters
947 3 a set of tables loaded by the #loadtables command
949 In tables 2, some characters whose codes are greater than 128 are iden-
951 a #loadtables command has loaded them from a binary file. Setting al-
958 pattern's modifier list, in which case they are applied to every sub-
969 jitstack=<n> set size of JIT stack
985 as defaults, set them in a #subject command.
989 If the subject_literal modifier is present on a pattern, all the sub-
990 ject lines that it matches are taken as literal strings, with no inter-
991 pretation of backslashes. It is not possible to set subject modifiers
992 on such lines, but any that are set as defaults by a #subject command
1001 described in the section entitled "Saving and restoring compiled pat-
1002 terns" below. If pushcopy is used instead of push, a copy of the com-
1005 pcre2_code_copy() function. The push and pushcopy modifiers are in-
1015 tested by setting the convert modifier. Its argument is a colon-sepa-
1016 rated list of options, which set the equivalent option for the
1026 The "unset" value is useful for turning off a default that has been set
1027 by a #pattern command. When one of these options is set, the input pat-
1028 tern is passed to pcre2_pattern_convert(). If the conversion is suc-
1029 cessful, the result is reflected in the output and then passed to
1030 pcre2_compile(). The normal utf and no_utf_check options, if set, cause
1035 its output. However, if the convert_length modifier is set to a value
1040 used to specify the escape and separator characters for glob process-
1041 ing, overriding the defaults, which are operating-system dependent.
1051 The following modifiers set options for pcre2_match() or
1054 anchored set PCRE2_ANCHORED
1055 endanchored set PCRE2_ENDANCHORED
1056 dfa_restart set PCRE2_DFA_RESTART
1057 dfa_shortest set PCRE2_DFA_SHORTEST
1058 no_jit set PCRE2_NO_JIT
1059 no_utf_check set PCRE2_NO_UTF_CHECK
1060 notbol set PCRE2_NOTBOL
1061 notempty set PCRE2_NOTEMPTY
1062 notempty_atstart set PCRE2_NOTEMPTY_ATSTART
1063 noteol set PCRE2_NOTEOL
1064 partial_hard (or ph) set PCRE2_PARTIAL_HARD
1065 partial_soft (or ps) set PCRE2_PARTIAL_SOFT
1070 If the posix or posix_nosub modifier was present on the pattern, caus-
1071 ing the POSIX wrapper API to be used, the only option-setting modifiers
1072 that have any effect are notbol, notempty, and noteol, causing REG_NOT-
1076 There is one additional modifier that can be used with the POSIX wrap-
1077 per. It is ignored (with a warning) if used for non-POSIX matching.
1084 passed as the end of the subject string. For more detail of REG_STAR-
1087 not support actual binary zeros in its input), you must use posix_star-
1092 The following modifiers affect the matching process or request addi-
1102 allusedtext show all consulted text (non-JIT only)
1105 callout_data=<n> set a value to pass via callouts
1112 depth_limit=<n> set a depth limit
1119 heap_limit=<n> set a limit on heap memory (Kbytes)
1120 jitstack=<n> set size of JIT stack
1122 match_limit=<n> set a match limit
1127 offset=<n> set starting offset
1128 offset_limit=<n> set offset limit
1129 ovector=<n> set size of output vector
1144 zero_terminate pass the subject as zero-terminated
1148 and ovector subject modifiers work as described below. All other modi-
1155 addition output the remainder of the subject string. This is useful for
1157 The allaftertext modifier requests the same action for captured sub-
1158 strings as well as the main matched substring. In each case the remain-
1159 der is output on the following line with a plus character following the
1166 message). Setting this modifier affects the output if there is a look-
1169 follow the start and end of the actual match are indicated in the out-
1181 the preceding and following strings "pqr" and "xyz" having been con-
1188 part of the match. In this situation, the output for the matched string
1190 point, with circumflex characters under the earlier characters. For ex-
1198 Unlike allusedtext, the startchar modifier can be used with JIT. How-
1203 The allcaptures modifier requests that the values of all potential cap-
1204 tured parentheses be output after a match. By default, only those up to
1205 the highest one actually used in the match are output (corresponding to
1207 the match are output as "<unset>". This modifier is not relevant for
1213 The allvector modifier requests that the entire ovector be shown, what-
1216 for a successful complete non-DFA match. This modifier, which acts af-
1220 and if this is found in both elements of a capturing pair, "<un-
1221 changed>" is output. After a successful match, this applies to all
1224 elements are the only ones that should be set. After a DFA match, the
1230 A callout function is supplied when pcre2test calls the library match-
1246 difference to the matching process if the pattern begins with a lookbe-
1250 PCRE2_NOTEMPTY_ATSTART and PCRE2_ANCHORED flags set, in order to search
1251 for another, non-empty, match at the same point in the subject. If this
1252 match fails, the start offset is advanced, and the normal match is re-
1254 modifier or the split() function. Normally, the start offset is ad-
1256 as a newline, and the current character is CR followed by LF, an ad-
1261 The copy and get modifiers can be used to test the pcre2_sub-
1263 given more than once, and each can specify a capture group name or num-
1268 If the #subject command is used to set default copy and/or get lists,
1269 these can be unset by specifying a negative number to cancel all num-
1276 by the convenience functions are output with C, G, or L after the
1284 If the replace modifier is set, the pcre2_substitute() function is
1286 pcre2_match() in the case of PCRE2_SUBSTITUTE_MATCHED). Note that re-
1288 end of a modifier. This is not thought to be an issue in a test pro-
1291 Specifying a completely empty replacement string disables this modi-
1292 fier. However, it is possible to specify an empty replacement by pro-
1293 viding a buffer length, as described below, for an otherwise empty re-
1298 see if it is a valid UTF-8 string. If so, it is correctly converted to
1300 UTF-8 string, the individual code units are copied directly. This pro-
1301 vides a means of passing an invalid UTF-8 string for testing purposes.
1303 The following modifiers set options (in addition to the normal match
1317 After a successful substitution, the modified string is output, pre-
1328 than 256 characters) for substitution tests, as fixed-size buffers are
1331 to pcre2_substitute() as the size of the output buffer, with the re-
1339 Failed: error -47: no more memory
1341 The default action of pcre2_substitute() is to return PCRE2_ER-
1342 ROR_NOMEMORY when the output buffer is too small. However, if the
1343 PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using the substi-
1353 Failed: error -47: no more memory: 10 code units are needed
1361 If the substitute_callout modifier is set, a substitution callout func-
1362 tion is set up. The null_context modifier must not be set, because the
1365 the input and output strings are output. For example:
1374 parenthesized number is the number of pairs that are set in the ovector
1375 (that is, one more than the number of capturing groups that were set).
1379 By default, the substitution callout function returns zero, which ac-
1381 Two further modifiers can be used to test other return values. If sub-
1382 stitute_skip is set to a value greater than zero the callout function
1384 returns -1. These cause the replacement to be rejected, and -1 causes
1385 no further matching to take place. If either of them is set, substi-
1397 If both are set for the same number, stop takes precedence. Only a sin-
1404 that is used by the just-in-time optimization code. It is ignored if
1408 very complicated patterns. If jitstack is set non-zero on a subject
1409 line it overrides any value that was set on the pattern.
1413 The heap_limit, match_limit, and depth_limit modifiers set the appro-
1427 between systems. If JIT is being used, only the match limit is rele-
1430 When using this modifier, the pattern should not contain any limit set-
1434 reduce the value of an in-pattern limit; they cannot increase it.
1436 For non-DFA matching, the minimum depth_limit number is a measure of
1442 For non-DFA matching, the match_limit number is a measure of the amount
1448 calls, both recursive and non-recursive, to the internal matching func-
1461 returned for a match, non-match, or partial match, pcre2test shows it.
1463 it is added to the non-match message.
1467 The memory modifier causes pcre2test to log the sizes of all heap mem-
1470 used only when a match requires more internal workspace than the de-
1471 fault allocation on the stack, so in many cases there will be no out-
1473 modifier to work, the null_context modifier must not be set on both the
1474 pattern and the subject, though it can be set on one or the other.
1486 not characters. When this modifier is used, the use_offset_limit modi-
1487 fier must have been set for the pattern; if not, an error is generated.
1489 Setting the size of the output vector
1491 The ovector modifier applies only to the subject line in which it ap-
1492 pears, though of course it can also be used to set a default in a #sub-
1498 POSIX API, a value of zero is used to cause pcre2_match_data_cre-
1501 match block with a zero-length ovector; there is always at least one
1504 Passing the subject as zero-terminated
1506 By default, the subject string is passed to a native API matching func-
1508 a zero-terminated string, the zero_terminate modifier is provided. It
1513 passing the replacement string as zero-terminated.
1519 null_context modifier is set, however, NULL is passed. This is for
1522 with the find_limits, find_limits_noheap, or substitute_callout modi-
1525 Similarly, for testing purposes, if the null_subject or null_replace-
1526 ment modifier is set, the subject or replacement string pointers are
1533 pcre2_match() to match each subject line. PCRE2 also supports an alter-
1534 native matching function, pcre2_dfa_match(), which operates in a dif-
1538 If the dfa modifier is set, the alternative matching function is used.
1539 This function finds all possible matches at a given point in the sub-
1540 ject. If, however, the dfa_shortest modifier is set, processing stops
1545 DEFAULT OUTPUT FROM pcre2test
1547 This section describes the output when the normal matching function,
1550 When a match succeeds, pcre2test outputs the list of captured sub-
1552 pattern. Otherwise, it outputs "No match" when the return is PCRE2_ER-
1562 also output. Here is an example of an interactive pcre2test run.
1565 PCRE2 version 10.22 2016-07-29
1574 Unset capturing substrings that are not followed by one that is set are
1590 If the strings contain any non-printing characters, they are output as
1591 \xhh escapes if the value is less than 256 and UTF mode is not set.
1592 Otherwise they are output as \x{hh...} escapes. See below for the defi-
1593 nition of non-printing characters. If the aftertext modifier is set,
1594 the output for substring 0 is followed by the rest of the subject
1602 If global matching is requested, the results of successive matching at-
1603 tempts are output in sequence, like this:
1614 "No match" is output only if the first match attempt fails. Here is an
1620 Error -24 (bad offset value)
1628 OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
1631 output consists of a list of all the matches that start at the first
1641 longest matching string is always given first (and numbered zero). Af-
1642 ter a PCRE2_ERROR_PARTIAL return, the output is "Partial match:", fol-
1667 When the alternative matching function has given the PCRE2_ERROR_PAR-
1684 If the pattern contains any callout requests, pcre2test's callout func-
1687 differences in behaviour. The output for callouts with numerical argu-
1696 --->pqrabcdef
1699 This output indicates that callout number 0 occurred for a match at-
1702 was \d. Just one circumflex is output if the start and current posi-
1703 tions are the same, or if the current position precedes the start posi-
1709 plus, is output. For example:
1711 re> /\d?[A-E]\*/auto_callout
1713 --->E*
1715 +3 ^ [A-E]
1720 If a pattern contains (*MARK) items, an additional line is output when-
1721 ever a change of latest mark is passed to the callout function. For ex-
1726 --->abc
1736 the rest of the match, so nothing more is output. If, as a result of
1738 output.
1742 The output for a callout with a string argument is similar, except that
1744 the callout string and its offset in the pattern string are output be-
1751 --->abcdefg
1754 --->abcdefg
1765 If the callout_capture modifier is set, the current captured groups are
1766 output when a callout occurs. This is useful only for non-DFA matching,
1770 The normal callout output, showing the callout number or pattern offset
1772 set.
1775 JIT, setting the callout_extra modifier causes additional output from
1778 attempt" is output. If there has been a backtrack since the last call-
1780 output, followed by "No other matching paths" if the backtrack ended
1786 --->aac
1792 --->aac
1798 --->aac
1806 --->aac
1812 --->aac
1822 the "a+" item is turned into "a++", which reduces the number of back-
1832 numbers. If there is only one number, 1 is returned instead of 0 (caus-
1836 modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, caus-
1838 are set for the same callout number, callout_error takes precedence.
1842 The callout_data modifier can be given an unsigned or a negative num-
1843 ber. This is set as the "user data" that is passed to the matching
1848 Inserting callouts can be helpful when using pcre2test to check compli-
1853 NON-PRINTING CHARACTERS
1856 bytes other than 32-126 are always treated as non-printing characters
1861 set for the pattern (using the locale modifier). In this case, the is-
1862 print() function is used to distinguish printing and non-printing char-
1873 compiled patterns can be saved they must be serialized, that is, con-
1874 verted to a stream of bytes. A single byte stream may contain any num-
1875 ber of compiled patterns, but they must all use the same character ta-
1879 The functions whose names begin with pcre2_serialize_ are used for se-
1880 rializing and de-serializing. They are described in the pcre2serialize
1888 In pcre2test, when a pattern with push modifier is successfully com-
1892 compiled pattern to be stacked, leaving the original available for im-
1909 reads the data in the file, and then arranges for it to be de-serial-
1911 The pattern on the top of the stack can be retrieved by the #pop com-
1916 particular, hex, posix, posix_nosub, push, and pushcopy are not al-
1917 lowed, nor are any option-setting modifiers. The JIT modifiers are,
1918 however, permitted. Here is an example that saves and reloads two pat-
1955 Copyright (c) 1997-2022 University of Cambridge.