
Perl Compatible Regular Expressions


Perl Compatible Regular Expressions (PCRE) is a library written in C that implements a regular expression engine inspired by the capabilities of the Perl programming language. Philip Hazel started writing PCRE in summer 1997. PCRE's syntax is much more powerful and flexible than both POSIX regular expression flavors (BRE and ERE) and than the syntax of many other regular-expression libraries.

While PCRE originally aimed at feature-equivalence with Perl, the two implementations are not fully equivalent. During the PCRE 7.x and Perl 5.9.x phase, the two projects coordinated development, with features being ported between them in both directions.

In 2015, a fork of PCRE was released with a revised programming interface (API). The original software, now called PCRE1 (the 1.xx–8.xx series), has received bug fixes but no further development. As of 2020, it is considered obsolete, and the current 8.45 release is likely to be the last. The new PCRE2 code (the 10.xx series) has gained a number of extensions and coding improvements and is where development now takes place.

A number of prominent open-source programs, such as the Apache and Nginx HTTP servers, and the PHP and R scripting languages, incorporate the PCRE library; proprietary software can do likewise, as the library is BSD-licensed. As of Perl 5.10, PCRE is also available as a replacement for Perl's default regular-expression engine through the re::engine::PCRE module.

The library can be built on Unix, Windows, and several other environments. PCRE2 is distributed with a POSIX C wrapper, several test programs, and the utility program pcre2grep (pcregrep in PCRE1), which is built in tandem with the library.

The just-in-time compiler can be enabled when the PCRE2 library is built. Large performance benefits are possible when, for example, the calling program uses the feature with compatible patterns that are executed repeatedly. The just-in-time compiler support was written by Zoltan Herczeg and is not available through the POSIX wrapper.

The use of the system stack for backtracking can be problematic in PCRE1, which is why this part of the implementation was changed in PCRE2. The heap is now used for this purpose, and the total amount of memory used can be limited. Stack overflow, which came up regularly with PCRE1, is no longer an issue with PCRE2 from release 10.30 (2017) onward.
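The difference between stack-based and heap-based backtracking can be illustrated with a toy matcher. The Python sketch below is not PCRE2's implementation and does not use its API. It is a minimal, hypothetical full-match engine (literals, `.`, and a postfix `*`) that keeps its pending alternatives in an ordinary list instead of recursing, and refuses to grow that list past a caller-supplied limit. The names `full_match`, `max_frames` and `BacktrackLimitError` are invented for this illustration.

```python
class BacktrackLimitError(Exception):
    """Raised when the explicit backtracking stack would exceed its limit."""


def parse(pattern):
    """Split a pattern into (char, starred) tokens; '.' matches any character."""
    tokens = []
    i = 0
    while i < len(pattern):
        starred = i + 1 < len(pattern) and pattern[i + 1] == "*"
        tokens.append((pattern[i], starred))
        i += 2 if starred else 1
    return tokens


def full_match(pattern, subject, max_frames=1000):
    """Return True if the whole subject matches the whole pattern.

    Backtracking points are stored in a heap-allocated list rather than
    on the call stack, so deep backtracking cannot overflow the stack;
    it can only hit the explicit max_frames limit.
    """
    tokens = parse(pattern)
    stack = [(0, 0)]  # frames of (token index, subject index)
    while stack:
        ti, si = stack.pop()
        if ti == len(tokens):
            if si == len(subject):
                return True
            continue  # pattern exhausted too early: backtrack
        ch, starred = tokens[ti]
        fits = si < len(subject) and ch in (".", subject[si])
        if starred:
            # Alternative 1: stop repeating and move to the next token.
            pending = [(ti + 1, si)]
            if fits:
                # Alternative 2: consume one more character. It is pushed
                # last, so it is tried first (greedy repetition).
                pending.append((ti, si + 1))
        else:
            pending = [(ti + 1, si + 1)] if fits else []
        if len(stack) + len(pending) > max_frames:
            raise BacktrackLimitError("backtracking limit exceeded")
        stack.extend(pending)
    return False
```

With this design a long subject simply makes the list longer: `full_match("a*", "a" * 5000, max_frames=10000)` completes without touching Python's recursion limit, while a small `max_frames` turns runaway backtracking into a reportable error instead of a crash.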

Like Perl, PCRE2 has consistent escaping rules: any non-alphanumeric character may be escaped to mean its literal value by preceding it with a \ (backslash). A backslash before an alphanumeric character typically gives that character a special meaning. If the resulting sequence has not been defined to be special, an error occurs. This differs from Perl, which gives an error only if it is in warning mode (PCRE2 does not have a warning mode). In basic POSIX regular expressions, a backslash sometimes escapes a non-alphanumeric character (e.g. \.) and sometimes introduces a special feature (e.g. \(\)).
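The two escaping rules can be tried out with a short Python sketch. Python's built-in re module is a separate engine, not PCRE2, and serves here only as a stand-in because it follows the same two conventions in Python 3.7 and later: a backslash before ASCII punctuation yields a literal, and a backslash before a letter with no defined meaning is rejected at compile time. The helper names `escape_literal` and `is_valid_pattern` are invented for this illustration, and the sketch is limited to ASCII input.

```python
import re


def escape_literal(text: str) -> str:
    """Backslash-escape every ASCII character that is not a letter or digit.

    Under the rule described above, the result matches `text` literally.
    """
    out = []
    for ch in text:
        if ch.isascii() and not ch.isalnum():
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def is_valid_pattern(pattern: str) -> bool:
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    literal = escape_literal("1+1=2?")
    print(literal)                                  # escaped form of the text
    print(bool(re.fullmatch(literal, "1+1=2?")))    # matches itself literally
    print(is_valid_pattern(r"\d"))                  # defined escape
    print(is_valid_pattern(r"\q"))                  # undefined escape
```

Escaping every non-alphanumeric character, rather than only the known metacharacters, is the design choice the rule makes safe: the caller does not need to know which punctuation characters are special.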
