News about PCRE2 releases
-------------------------


Version 10.42 11-December-2022
------------------------------

This is an unexpectedly early release to fix a problem that was introduced in
10.41. ChangeLog number 19 (GitHub #139) added the default definition of
PCRE2_CALL_CONVENTION to pcre2posix.c instead of pcre2posix.h, which meant that
programs including pcre2posix.h but not pcre2.h couldn't compile. A new test
that checks this case has been added.

A couple of other minor issues are also fixed, and a patch for an intermittent
JIT fault is included. See ChangeLog and the Git log.


Version 10.41 06-December-2022
------------------------------

This is another mainly bug-fixing and code-tidying release. There is one
significant upgrade to pcre2grep: when matched substrings are being identified
by colour or by offsets, it now behaves like GNU grep if more than one pattern
is being matched and a later pattern matches at an earlier point in the
subject.


Version 10.40 15-April-2022
---------------------------

This is mostly a bug-fixing and code-tidying release. However, there are some
extensions to Unicode property handling:

* Added support for Bidi_Class and a number of binary Unicode properties,
including Bidi_Control.

* A number of changes to script matching for \p and \P:

  (a) Script extensions for a character are now coded as a bitmap instead of
      a list of script numbers, which should be faster and does not need a
      loop.

  (b) Added the syntax \p{script:xxx} and \p{script_extensions:xxx} (synonyms
      sc and scx).

  (c) Changed \p{scriptname} from being the same as \p{sc:scriptname} to being
      the same as \p{scx:scriptname} because this change happened in Perl at
      release 5.26.

  (d) The standard Unicode 4-letter abbreviations for script names are now
      recognized.

  (e) In accordance with Unicode and Perl's "loose matching" rules, spaces,
      hyphens, and underscores are ignored in property names, which are then
      matched independent of case.

As always, see ChangeLog for a list of all changes (also the Git log).
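
Items (a) and (e) of the 10.40 script-matching changes can be illustrated with
a minimal C sketch. This is not PCRE2's own code: the script numbers, the use
of a single 64-bit word, and the function names are assumptions made for the
example (Unicode defines more than 64 scripts, so a real table needs more than
one word).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical script numbers, for illustration only. */
enum { SCRIPT_LATIN = 0, SCRIPT_GREEK = 1, SCRIPT_CYRILLIC = 2 };

/* One bit per script: membership is a shift and a mask, with no loop over a
   list of script numbers. */
bool has_script_extension(uint64_t scx_bitmap, unsigned script)
{
  return script < 64 && ((scx_bitmap >> script) & 1u) != 0;
}

/* Advance past the characters that loose matching ignores. */
const char *skip_ignored(const char *s)
{
  while (*s == ' ' || *s == '-' || *s == '_') s++;
  return s;
}

/* Compare two property names, ignoring spaces, hyphens, underscores, and
   case. */
bool loose_equal(const char *a, const char *b)
{
  assert(a != NULL && b != NULL);
  for (;;)
    {
    a = skip_ignored(a);
    b = skip_ignored(b);
    if (*a == '\0' || *b == '\0') return *a == *b;
    if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return false;
    a++;
    b++;
    }
}
```

With these definitions, loose_equal("Script_Extensions", "scriptextensions")
is true, and a bitmap with only the Latin bit set reports false for
SCRIPT_CYRILLIC.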


Version 10.39 29-October-2021
-----------------------------

This release is happening soon after 10.38 because the bug fix is important.

1. Fix incorrect detection of alternatives in first character search in JIT.

2. Update to Unicode 14.0.0.

3. Some code cleanups (see ChangeLog).


Version 10.38 01-October-2021
-----------------------------

As well as some bug fixes and tidies (as always, see ChangeLog for details),
the documentation is updated to list the new URLs, following the move of the
source repository to GitHub and the mailing list to Google Groups.

* The CMake build system can now build both static and shared libraries in one
go.

* Following Perl's lead, \K is now locked out in lookaround assertions by
default, but an option is provided to re-enable the previous behaviour.


Version 10.37 26-May-2021
-------------------------

A few more bug fixes and tidies. The only change of real note is the removal of
the actual POSIX names regcomp etc. from the POSIX wrapper library because
these have caused issues for some applications (see 10.33 #2 below).


Version 10.36 04-December-2020
------------------------------

Again, mainly bug fixes and tidies. The only enhancements are the addition of
GNU grep's -m (aka --max-count) option to pcre2grep, and also unifying the
handling of substitution strings for both -O and callouts in pcre2grep, with
the addition of $x{...} and $o{...} to allow for characters whose code points
are greater than 255 in Unicode mode.

NOTE: there is an outstanding issue with JIT support for MacOS on arm64
hardware. For details, please see Bugzilla issue #2618.


Version 10.35 15-April-2020
---------------------------

Bugfixes, tidies, and a few new enhancements.

1. Capturing groups that contain recursive backreferences to themselves are no
longer automatically atomic, because the restriction is no longer necessary
as a result of the 10.30 restructuring.

2. Several new options for pcre2_substitute().

3. When Unicode is supported and PCRE2_UCP is set without PCRE2_UTF, Unicode
character properties are used for upper/lower case computations on characters
whose code points are greater than 127.

4. The character tables (for low-valued characters) can now more easily be
saved and restored in binary.

5. Updated to Unicode 13.0.0.


Version 10.34 21-November-2019
------------------------------

Another release with a few enhancements as well as bugfixes and tidies. The
main new features are:

1. There is now some support for matching in invalid UTF strings.

2. Non-atomic positive lookarounds are implemented in the pcre2_match()
interpreter, but not in JIT.

3. Added two new functions: pcre2_get_match_data_size() and
pcre2_maketables_free().

4. Upgraded to Unicode 12.1.0.


Version 10.33 16-April-2019
---------------------------

Yet more bugfixes, tidies, and a few enhancements, summarized here (see
ChangeLog for the full list):

1. Callouts from pcre2_substitute() are now available.

2. The POSIX functions are now all called pcre2_regcomp() etc., with wrapper
functions that use the standard POSIX names. However, in pcre2posix.h the POSIX
names are defined as macros. This should help avoid linking with the wrong
library in some environments, while still exporting the POSIX names for
pre-existing programs that use them.

3. Some new options:

   (a) PCRE2_EXTRA_ESCAPED_CR_IS_LF makes \r behave as \n.

   (b) PCRE2_EXTRA_ALT_BSUX enables support for ECMAScript 6's \u{hh...}
       construct.

   (c) PCRE2_COPY_MATCHED_SUBJECT causes a copy of a matched subject to be
       made, instead of just remembering a pointer.

4. Some new Perl features:

   (a) Perl 5.28's experimental alphabetic names for atomic groups and
       lookaround assertions, for example, (*pla:...) and (*atomic:...).

   (b) The new Perl "script run" features (*script_run:...) and
       (*atomic_script_run:...) aka (*sr:...) and (*asr:...).

   (c) When PCRE2_UTF is set, allow non-ASCII letters and decimal digits in
       capture group names.

5. --disable-percent-zt disables the use of %zu and %td in formatting strings
in pcre2test. They were already automatically disabled for VC and older C
compilers.

6. Some changes related to callouts in pcre2grep:

   (a) Support for running an external program under VMS has been added, in
       addition to Windows and fork() support.

   (b) --disable-pcre2grep-callout-fork restricts the callout support in
       pcre2grep to the inbuilt echo facility.


Version 10.32 10-September-2018
-------------------------------

This is another mainly bugfix and tidying release with a few minor
enhancements. These are the main ones:

1. pcre2grep now supports the inclusion of binary zeros in patterns that are
read from files via the -f option.

2. ./configure now supports --enable-jit=auto, which automatically enables JIT
if the hardware supports it.

3. In pcre2_dfa_match(), internal recursive calls no longer use the stack for
local workspace and local ovectors. Instead, an initial block of stack is
reserved, but if this is insufficient, heap memory is used. The heap limit
parameter now applies to pcre2_dfa_match().

4. Updated to Unicode version 11.0.0.

5. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.

6. Added support for \N{U+dddd}, but only in Unicode mode.

7. Added support for (?^) to unset all imnsx options.
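
The memory strategy described in item 3 of the 10.32 list (a reserved block of
stack first, heap memory only when that is insufficient, subject to a limit)
is a general technique that can be sketched in plain C. The sketch below is
not taken from pcre2_dfa_match(): the block size, the error codes, the
function name, and a limit expressed in bytes are all assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STACK_SLOTS 32  /* hypothetical size of the reserved stack block */

enum { WS_OK = 0, WS_HEAP_LIMIT = -1, WS_NO_MEMORY = -2 };

/* Fills `needed` workspace slots with 0..needed-1 and stores their sum in
   *result. *used_heap reports whether heap memory had to be used. */
int run_with_workspace(size_t needed, size_t heap_limit_bytes,
                       long *result, int *used_heap)
{
  int stack_block[STACK_SLOTS];
  int *ws = stack_block;
  long sum = 0;

  assert(result != NULL && used_heap != NULL);
  *used_heap = 0;

  if (needed > STACK_SLOTS)
    {
    /* Dividing the limit avoids overflow in needed * sizeof(int). */
    if (needed > heap_limit_bytes / sizeof(int)) return WS_HEAP_LIMIT;
    ws = malloc(needed * sizeof(int));
    if (ws == NULL) return WS_NO_MEMORY;
    *used_heap = 1;
    }

  for (size_t i = 0; i < needed; i++) ws[i] = (int)i;
  for (size_t i = 0; i < needed; i++) sum += ws[i];

  if (*used_heap) free(ws);
  *result = sum;
  return WS_OK;
}
```

Small requests never touch the heap, and a request larger than the limit
fails cleanly instead of exhausting the stack.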


Version 10.31 12-February-2018
------------------------------

This is mainly a bugfix and tidying release (see ChangeLog for full details).
However, there are some minor enhancements.

1. New pcre2_config() options: PCRE2_CONFIG_NEVER_BACKSLASH_C and
PCRE2_CONFIG_COMPILED_WIDTHS.

2. New pcre2_pattern_info() option PCRE2_INFO_EXTRAOPTIONS to retrieve the
extra compile time options.

3. There are now public names for all the pcre2_compile() error numbers.

4. Added PCRE2_CALLOUT_STARTMATCH and PCRE2_CALLOUT_BACKTRACK bits to a new
field callout_flags in callout blocks.


Version 10.30 14-August-2017
----------------------------

The full list of changes that includes bugfixes and tidies is, as always, in
ChangeLog. These are the most important new features:

1. The main interpreter, pcre2_match(), has been refactored into a new version
that does not use recursive function calls (and therefore the system stack) for
remembering backtracking positions. This makes --disable-stack-for-recursion a
NOOP. The new implementation allows backtracking into recursive group calls in
patterns, making it more compatible with Perl, and also fixes some other
previously hard-to-do issues. For patterns that have a lot of backtracking, the
heap is now used, and there is an explicit limit on the amount, settable by
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). The "recursion limit" is retained,
but is renamed as "depth limit" (though the old names remain for
compatibility).

There is also a change in the way callouts from pcre2_match() are handled. The
offset_vector field in the callout block is no longer a pointer to the
actual ovector that was passed to the matching function in the match data
block. Instead it points to an internal ovector of a size large enough to hold
all possible captured substrings in the pattern.

2. The new option PCRE2_ENDANCHORED insists that a pattern match must end at
the end of the subject.

3. The new option PCRE2_EXTENDED_MORE implements Perl's /xx feature, and
pcre2test is upgraded to support it. Setting within the pattern by (?xx) is
also supported.

4. (?n) can be used to set PCRE2_NO_AUTO_CAPTURE, because Perl now has this.

5. Additional compile options in the compile context are now available, and the
first two are: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

6. The newline type PCRE2_NEWLINE_NUL is now available.

7. The match limit value now also applies to pcre2_dfa_match() as there are
patterns that can use up a lot of resources without necessarily recursing very
deeply.

8. The option REG_PEND (a GNU extension) is now available for the POSIX
wrapper. Also there is a new option PCRE2_LITERAL which is used to support
REG_NOSPEC.

9. PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD are implemented for the
benefit of pcre2grep, and pcre2grep's -F, -w, and -x options are re-implemented
using PCRE2_LITERAL, PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This
is tidier and also fixes some bugs.

10. The Unicode tables are upgraded from Unicode 8.0.0 to Unicode 10.0.0.

11. There are some experimental functions for converting foreign patterns
(globs and POSIX patterns) into PCRE2 patterns.


Version 10.23 14-February-2017
------------------------------

1. ChangeLog has the details of a lot of bug fixes and tidies.

2. There has been a major re-factoring of the pcre2_compile.c file. Most syntax
checking is now done in the pre-pass that identifies capturing groups. This has
reduced the amount of duplication and made the code tidier. While doing this,
some minor bugs and Perl incompatibilities were fixed (see ChangeLog for
details).

3. Back references are now permitted in lookbehind assertions when there are
no duplicated group numbers (that is, (?| has not been used), and, if the
reference is by name, there is only one group of that name. The referenced
group must, of course, be of fixed length.

4. \g{+<number>} (e.g. \g{+2} ) is now supported. It is a "forward back
reference" and can be useful in repetitions (compare \g{-<number>} ). Perl does
not recognize this syntax.

5. pcre2grep now automatically expands its buffer up to a maximum set by
--max-buffer-size.

6. The -t option (grand total) has been added to pcre2grep.

7. A new function called pcre2_code_copy_with_tables() exists to copy a
compiled pattern along with a private copy of the character tables that it
uses.

8. A user supplied a number of patches to upgrade pcre2grep under Windows and
tidy the code.

9. Several updates have been made to pcre2test and test scripts (see
ChangeLog).


Version 10.22 29-July-2016
--------------------------

1. ChangeLog has the details of a number of bug fixes.

2. The POSIX wrapper function regcomp() did not previously support back
references and subroutine calls if called with the REG_NOSUB option. It now
does.

3. A new function, pcre2_code_copy(), is added, to make a copy of a compiled
pattern.

4. Support for string callouts is added to pcre2grep.

5. Added the PCRE2_NO_JIT option to pcre2_match().

6. The pcre2_get_error_message() function now returns with a negative error
code if the error number it is given is unknown.

7. Several updates have been made to pcre2test and test scripts (see
ChangeLog).


Version 10.21 12-January-2016
-----------------------------

1. Many bugs have been fixed. A large number of them were provoked only by very
strange pattern input, and were discovered by fuzzers. Some others were
discovered by code auditing. See ChangeLog for details.

2. The Unicode tables have been updated to Unicode version 8.0.0.

3. For Perl compatibility in EBCDIC environments, ranges such as a-z in a
class, where both values are literal letters in the same case, omit the
non-letter EBCDIC code points within the range.

4. There have been a number of enhancements to the pcre2_substitute() function,
giving more flexibility to replacement facilities. It is now also possible to
cause the function to return the needed buffer size if the one given is too
small.

5. The PCRE2_ALT_VERBNAMES option causes the "name" parts of special verbs such
as (*THEN:name) to be processed for backslashes and to take note of
PCRE2_EXTENDED.

6. PCRE2_INFO_HASBACKSLASHC makes it possible for a client to find out if a
pattern uses \C, and --never-backslash-C makes it possible to compile a version
of PCRE2 in which the use of \C is always forbidden.

7. A limit to the length of pattern that can be handled can now be set by
calling pcre2_set_max_pattern_length().

8. When matching an unanchored pattern, a match can be required to begin within
a given number of code units after the start of the subject by calling
pcre2_set_offset_limit().

9. The pcre2test program has been extended to test new facilities, and it can
now run the tests when LF on its own is not a valid newline sequence.

10. The RunTest script has also been updated to enable more tests to be run.

11. There have been some minor performance enhancements.


Version 10.20 30-June-2015
--------------------------

1. Callouts with string arguments and the pcre2_callout_enumerate() function
have been implemented.

2. The PCRE2_NEVER_BACKSLASH_C option, which locks out the use of \C, is added.

3. The PCRE2_ALT_CIRCUMFLEX option lets ^ match after a newline at the end of a
subject in multiline mode.

4. The way named subpatterns are handled has been refactored. The previous
approach had several bugs.

5. The handling of \c in EBCDIC environments has been changed to conform to the
perlebcdic document. This is an incompatible change.

6. Bugs have been mended, many of them discovered by fuzzers.


Version 10.10 06-March-2015
---------------------------

1. Serialization and de-serialization functions have been added to the API,
making it possible to save and restore sets of compiled patterns, though
restoration must be done in the same environment that was used for compilation.

2. The (*NO_JIT) feature has been added; this makes it possible for a pattern
creator to specify that JIT is not to be used.

3. A number of bugs have been fixed. In particular, bugs that caused building
on Windows using CMake to fail have been mended.


Version 10.00 05-January-2015
-----------------------------

Version 10.00 is the first release of PCRE2, a revised API for the PCRE
library. Changes prior to 10.00 are logged in the ChangeLog file for the old
API, up to item 20 for release 8.36. New programs are recommended to use the
new library. Programs that use the original (PCRE1) API will need changing
before linking with the new library.

****
