<html>
<head>
<title>pcre2convert specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2convert man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a>
<li><a name="TOC2" href="#SEC2">THE CONVERT CONTEXT</a>
<li><a name="TOC3" href="#SEC3">THE CONVERSION FUNCTION</a>
<li><a name="TOC4" href="#SEC4">CONVERTING GLOBS</a>
<li><a name="TOC5" href="#SEC5">CONVERTING POSIX PATTERNS</a>
<li><a name="TOC6" href="#SEC6">AUTHOR</a>
<li><a name="TOC7" href="#SEC7">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a><br>
<P>
This document describes a set of functions that can be used to convert
"foreign" patterns into PCRE2 regular expressions. This facility is currently
experimental, and may be changed in future releases. Two kinds of pattern,
globs and POSIX patterns, are supported.
</P>
<br><a name="SEC2" href="#TOC1">THE CONVERT CONTEXT</a><br>
<P>
<b>pcre2_convert_context *pcre2_convert_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_convert_context *pcre2_convert_context_copy(</b>
<b> pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>void pcre2_convert_context_free(pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>int pcre2_set_glob_escape(pcre2_convert_context *<i>cvcontext</i>,</b>
<b> uint32_t <i>escape_char</i>);</b>
<br>
<br>
<b>int pcre2_set_glob_separator(pcre2_convert_context *<i>cvcontext</i>,</b>
<b> uint32_t <i>separator_char</i>);</b>
<br>
<br>
A convert context is used to hold parameters that affect the way that pattern
conversion works. Like all PCRE2 contexts, you need to use a context only if
you want to override the defaults. There are the usual create, copy, and free
functions. If custom memory management functions are set in a general context
that is passed to <b>pcre2_convert_context_create()</b>, they are used for all
memory management within the conversion functions.
</P>
<P>
There are only two parameters in the convert context at present. Both apply
only to glob conversions. The escape character defaults to grave accent under
Windows, otherwise backslash. It can be set to zero, meaning no escape
character, or to any punctuation character with a code point less than 256.
The separator character defaults to backslash under Windows, otherwise forward
slash. It can be set to forward slash, backslash, or dot.
</P>
<P>
The two setting functions return zero on success, or PCRE2_ERROR_BADDATA if
their second argument is invalid.
</P>
<br><a name="SEC3" href="#TOC1">THE CONVERSION FUNCTION</a><br>
<P>
<b>int pcre2_pattern_convert(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
<b> uint32_t <i>options</i>, PCRE2_UCHAR **<i>buffer</i>,</b>
<b> PCRE2_SIZE *<i>blength</i>, pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>void pcre2_converted_pattern_free(PCRE2_UCHAR *<i>converted_pattern</i>);</b>
<br>
<br>
The first two arguments of <b>pcre2_pattern_convert()</b> define the foreign
pattern that is to be converted. The length may be given as
PCRE2_ZERO_TERMINATED. The <b>options</b> argument defines how the pattern is to
be processed. If the input is UTF, the PCRE2_CONVERT_UTF option should be set.
PCRE2_CONVERT_NO_UTF_CHECK may also be set if you are sure the input is valid.
One or more of the glob options, or one of the following POSIX options, must be
set to define the type of conversion that is required:
<pre>
  PCRE2_CONVERT_GLOB
  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
  PCRE2_CONVERT_GLOB_NO_STARSTAR
  PCRE2_CONVERT_POSIX_BASIC
  PCRE2_CONVERT_POSIX_EXTENDED
</pre>
Details of the conversions are given below. The <b>buffer</b> and <b>blength</b>
arguments define how the output is handled:
</P>
<P>
If <b>buffer</b> is NULL, the function just returns the length of the converted
pattern via <b>blength</b>. This is one less than the length of buffer needed,
because a terminating zero is always added to the output.
</P>
<P>
If <b>buffer</b> points to a NULL pointer, an output buffer is obtained using
the allocator in the context or <b>malloc()</b> if no context is supplied. A
pointer to this buffer is placed in the variable to which <b>buffer</b> points.
When no longer needed the output buffer must be freed by calling
<b>pcre2_converted_pattern_free()</b>. If this function is called with a NULL
argument, it returns immediately without doing anything.
</P>
<P>
If <b>buffer</b> points to a non-NULL pointer, <b>blength</b> must be set to the
actual length of the buffer provided (in code units).
</P>
<P>
In all cases, after successful conversion, the variable pointed to by
<b>blength</b> is updated to the length actually used (in code units), excluding
the terminating zero that is always added.
</P>
<P>
If an error occurs, the length (via <b>blength</b>) is set to the offset
within the input pattern where the error was detected. Only gross syntax errors
are caught; there are plenty of errors that will get passed on for
<b>pcre2_compile()</b> to discover.
</P>
<P>
The return from <b>pcre2_pattern_convert()</b> is zero on success or a non-zero
PCRE2 error code. Note that PCRE2 error codes may be positive or negative:
<b>pcre2_compile()</b> uses mostly positive codes and <b>pcre2_match()</b>
negative ones; <b>pcre2_pattern_convert()</b> uses existing codes of both kinds. A
textual error message can be obtained by calling
<b>pcre2_get_error_message()</b>.
</P>
<br><a name="SEC4" href="#TOC1">CONVERTING GLOBS</a><br>
<P>
Globs are used to match file names, and consequently have the concept of a
"path separator", which defaults to backslash under Windows and forward slash
otherwise. If PCRE2_CONVERT_GLOB is set, the wildcards * and ? are not
permitted to match separator characters, but the double-star (**) feature
(which does match separators) is supported.
</P>
<P>
PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR matches globs with wildcards allowed to
match separator characters. PCRE2_CONVERT_GLOB_NO_STARSTAR matches globs with the
double-star feature disabled. These options may be given together.
</P>
<br><a name="SEC5" href="#TOC1">CONVERTING POSIX PATTERNS</a><br>
<P>
POSIX defines two kinds of regular expression pattern: basic and extended.
These can be processed by setting PCRE2_CONVERT_POSIX_BASIC or
PCRE2_CONVERT_POSIX_EXTENDED, respectively.
</P>
<P>
In POSIX patterns, backslash is not special in a character class. Unmatched
closing parentheses are treated as literals.
</P>
<P>
In basic patterns, ? + | {} and () must be escaped to be recognized
as metacharacters outside a character class. If the first character in the
pattern is * it is treated as a literal. ^ is a metacharacter only at the start
of a branch.
</P>
<P>
In extended patterns, a backslash not in a character class always
makes the next character literal, whatever it is. There are no backreferences.
</P>
<P>
Note: POSIX mandates that the longest possible match at the first matching
position must be found. This is not what <b>pcre2_match()</b> does; it yields
the first match that is found. An application can use <b>pcre2_dfa_match()</b>
to find the longest match, but that does not support backreferences (but then
neither do POSIX extended patterns).
</P>
<br><a name="SEC6" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
University Computing Service
<br>
Cambridge, England.
<br>
</P>
<br><a name="SEC7" href="#TOC1">REVISION</a><br>
<P>
Last updated: 28 June 2018
<br>
Copyright © 1997-2018 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>