SAP REGEX PCRE - Syntax




ABAP_REGEX - PCRE Syntax
Regular expressions with PCRE syntax can be specified after the addition PCRE of the statements FIND and REPLACE and the argument pcre of built-in functions for strings. Objects for PCRE regular expressions can be created with the factory method CREATE_PCRE of the system class CL_ABAP_REGEX to be used in statements FIND and REPLACE or with the system class CL_ABAP_MATCHER.
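
The entry points named above can be sketched as follows. This is a minimal illustration rather than an excerpt from the original documentation: the sample text, the pattern, and all variable names are assumptions, and the sketch presumes a release in which the addition PCRE and the method CREATE_PCRE are available.

```abap
DATA(text) = `Order 4711 shipped, order 4712 pending`.

" 1. Statement FIND with the addition PCRE
FIND PCRE `\d+` IN text MATCH OFFSET DATA(off) MATCH LENGTH DATA(len).
IF sy-subrc = 0.
  " First sequence of digits in the text
  DATA(first_number) = substring( val = text off = off len = len ).
ENDIF.

" 2. Built-in string function with the argument pcre
"    occ = 0 replaces all occurrences
DATA(masked) = replace( val = text pcre = `\d+` with = `****` occ = 0 ).

" 3. Regex object created with the factory method CREATE_PCRE
"    and evaluated with a CL_ABAP_MATCHER object
DATA(regex)   = cl_abap_regex=>create_pcre( pattern = `\d+` ).
DATA(matcher) = regex->create_matcher( text = text ).
WHILE matcher->find_next( ).
  " Submatch 0 is the complete current match
  DATA(number) = matcher->get_submatch( 0 ).
ENDWHILE.
```

The object-based variant is mainly useful when the same pattern is applied repeatedly or when the matcher state (offsets, submatches) is needed between matches.
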
Currently, there is no detailed description of the PCRE syntax for regular expressions in the ABAP keyword documentation.
For a short syntax overview, see Special Characters in PCRE Regular Expressions.
For the complete documentation, refer to the Perl documentation perlre.
A regular expression in PCRE syntax can be compiled in normal or extended mode. In extended mode, most unescaped whitespace characters (blanks and line breaks) in the pattern are ignored outside character classes, and comments can be placed after #. In ABAP statements and built-in functions, extended mode is switched on by default and can be switched off with (?-x) in the regular expression. When CL_ABAP_REGEX is used, the mode is set with the parameter EXTENDED of the method CREATE_PCRE.
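
A short sketch of what the default extended mode means for blanks in a pattern. The sample text, the patterns, and the variable names are assumptions chosen for illustration.

```abap
DATA(text) = `Released on 2024-05-17`.

" Blanks in the pattern only structure it visually; in extended mode
" they are ignored, so the pattern matches 2024-05-17
FIND PCRE `(\d{4}) - (\d{2}) - (\d{2})` IN text
  SUBMATCHES DATA(year) DATA(month) DATA(day).

" A blank that has to be matched must not be left unescaped;
" inside a character class it is kept
DATA(found) = xsdbool( contains( val = text pcre = `Released[ ]on` ) ).
```

Writing `Released on` without the character class would look for `Releasedon` in extended mode and would therefore not match the sample text.
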
For regular expressions in PCRE syntax, it can be defined whether valid UTF-16 character strings are expected or not. In ABAP statements and built-in functions, a PCRE regular expression can be introduced with (*UTF) in order to check for valid UTF-16 strings. When using CL_ABAP_REGEX, the parameter UNICODE_HANDLING of method CREATE_PCRE can be used. When the strict mode for working with UTF-16 strings is switched on, a surrogate pair is handled as a single character (see example for counting).
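
The effect on counting can be sketched as follows. This is an assumption-laden illustration, not the example referred to above: it builds a character outside the Basic Multilingual Plane (U+1F600) from its UTF-8 bytes using CL_ABAP_CONV_CODEPAGE, which is assumed to be available in the release at hand, and the variable names are hypothetical.

```abap
" U+1F600 given as UTF-8 bytes; in an ABAP string (UTF-16) it is
" represented by a surrogate pair
DATA(smiley) = cl_abap_conv_codepage=>create_in( )->convert(
                 CONV xstring( `F09F9880` ) ).

" Without (*UTF), each surrogate is expected to be counted separately
DATA(code_units) = count( val = smiley pcre = `.` ).

" With (*UTF), the surrogate pair is expected to be counted
" as a single character
DATA(characters) = count( val = smiley pcre = `(*UTF).` ).
```
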



Latest notes:

The PCRE syntax is more powerful than the obsolete POSIX syntax. Furthermore, PCRE regular expressions generally perform better than the POSIX regular expressions supported by ABAP. Therefore, it is recommended that POSIX regular expressions are migrated to PCRE.
The PCRE syntax supports callouts that call ABAP methods while a regular expression is matched with CL_ABAP_MATCHER.
ABAP SQL and ABAP CDS also support the PCRE syntax with the built-in functions REPLACE_REGEXPR, LIKE_REGEXPR, and OCCURRENCES_REGEXPR. These functions access the PCRE1 library implemented in the SAP HANA database.
The test and demonstration program DEMO_REGEX allows PCRE syntax to be tested by selecting PCRE.

Example
Searching for a PCRE regular expression in a character string. See also the class CL_DEMO_FIND_REGEX.

Example
PCRE regular expressions support non-greedy behavior by placing a question mark (?) after a quantifier such as the asterisk (*). In this example, the asterisk in the first regular expression is greedy and finds everything between the first <i> and the last </i>. In the second regular expression, the asterisk is marked as non-greedy, and only the substring between the first <i> and the following </i> is found. Non-greedy behavior is not supported in the obsolete POSIX syntax, where workarounds such as [^<]* have to be used. For more information, see New features in PCRE compared to POSIX.
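
The difference between the greedy and the non-greedy quantifier can be sketched with the built-in function match. The sample text and the variable names are assumptions; this is not the original example source.

```abap
DATA(text) = `<i>Jack</i> and <i>Jill</i>`.

" Greedy: .* extends as far as possible, up to the last </i>
DATA(greedy) = match( val = text pcre = `<i>.*</i>` ).
" greedy = <i>Jack</i> and <i>Jill</i>

" Non-greedy: .*? stops at the first </i> that follows
DATA(non_greedy) = match( val = text pcre = `<i>.*?</i>` ).
" non_greedy = <i>Jack</i>
```
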

Example
This example shows the effect of extended mode, which is switched on by default and might lead to unexpected results. All replacements yield the same result a-b except the one in r5, where the line break character is not found and instead all empty strings are replaced by the replacement character -. The reason is that the pattern |\n| consists of nothing but the line break character, which is ignored in extended mode. The pattern is therefore in fact an empty string and yields the same result as specifying an empty string directly. When extended mode is switched off with (?-x), the line break character is not ignored and yields the same result as the special PCRE character \n, which is expressed in different ways here.
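
A reduced sketch of this pitfall, assuming the built-in function replace and hypothetical variable names (these do not correspond to the numbered results of the original example):

```abap
DATA(text) = |a\nb|.  " a, line break, b

" Text string literal: the two characters \ and n are passed to PCRE,
" which interprets them as a line break; expected result a-b
DATA(by_escape) = replace( val = text pcre = `\n` with = `-` occ = 0 ).

" String template: |\n| already is a line break character. Extended
" mode ignores it, so the effective pattern is empty and the line
" break itself is not replaced
DATA(by_template) = replace( val = text pcre = |\n| with = `-` occ = 0 ).

" (?-x) switches extended mode off, so the line break character is
" matched literally; expected result a-b
DATA(extended_off) = replace( val = text pcre = |(?-x)\n| with = `-` occ = 0 ).
```

Passing the escape sequence in a text string literal instead of a string template is the simplest way to avoid the problem, because the pattern then contains no whitespace that extended mode could drop.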

Examples
PCRE Regular Expression with Callouts
Parsing with PCRE Regular Expression

Copyright Note
This software uses the PCRE2 library under the PCRE2 LICENCE.