1.3.10.31 RegEx Match
The RegEx Match processor matches the data in an attribute against a regular expression, and outputs the matching data in a new attribute. It also adds an attribute with an array of all the matched groups within the regular expression.
Use RegEx Match as a simple way to extract data that matches a regular expression. It is particularly useful where you want to create an array of groups.
Note that a group in a regular expression is contained between parentheses. A single regular expression may have many groups.
RegEx Match adds two attributes: one containing the value that matched the whole regular expression, and another containing an array of the values that matched each group within the regular expression. If there is no match, both new attributes are null.
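The behavior described above can be sketched in a few lines of Python. This is only an illustration, not the processor's implementation: the function name `regex_match` is hypothetical, Python's `re` syntax may differ in places from the regular expression dialect the processor uses, and the sketch assumes the expression may match anywhere in the value (`re.search`), which the text above does not confirm.

```python
import re
from typing import List, Optional, Tuple


def regex_match(
    value: Optional[str], pattern: str
) -> Tuple[Optional[str], Optional[List[Optional[str]]]]:
    """Return (full_match, groups); both are None when there is no match.

    Hypothetical helper. Assumes the expression may match anywhere
    in the value, which is an assumption of this sketch.
    """
    if value is None:
        return None, None
    m = re.search(pattern, value)
    if m is None:
        return None, None
    # group(0) is the text matched by the whole expression;
    # groups() holds the text matched by each parenthesized group.
    return m.group(0), list(m.groups())


print(regex_match("order 2024-117", r"(\d{4})-(\d+)"))
print(regex_match("no digits here", r"(\d{4})-(\d+)"))
```

The first call returns the full match together with a two-element list, one element per group; the second returns a pair of `None` values, mirroring the two null attributes.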
Regular Expressions
Regular expressions are a standard technique for expressing patterns in Strings and for manipulating Strings. They are very powerful once mastered.
Tutorials and reference material about regular expressions are available on the Internet, and in books, including: Mastering Regular Expressions by Jeffrey E. F. Friedl published by O'Reilly UK; ISBN: 0-596-00289-0.
There are also software packages available to help you master regular expressions, such as RegExBuddy, and online libraries of useful regular expressions, such as RegExLib.
The following table describes the configuration options:
Configuration | Description |
---|---|
Inputs | Specify a single String attribute. |
Options | Specify the following options: |
Outputs | Describes any data attribute or flag attribute outputs. |
Data Attributes | The following data attributes are output: |
Flags | The following flags are output: |
The following table describes the statistics produced by the processor:
Statistic | Description |
---|---|
Matched | The number of records which matched the regular expression. |
Unmatched | The number of records which did not match the regular expression. |
Output Filters
The following output filters are available from the RegEx Match processor:
- Records that matched the regular expression
- Records that did not match the regular expression
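As a rough illustration of how the Matched and Unmatched statistics relate to the two output filters, the following Python sketch partitions a list of values by whether they match an expression and then counts each partition. The function name `split_by_match` and the sample values are hypothetical, and the sketch assumes a match anywhere in the value counts as a match.

```python
import re
from typing import Iterable, List, Optional, Tuple


def split_by_match(
    values: Iterable[Optional[str]], pattern: str
) -> Tuple[List[Optional[str]], List[Optional[str]]]:
    """Partition values into (matched, unmatched) lists.

    Hypothetical helper; null (None) values are treated as unmatched.
    """
    compiled = re.compile(pattern)
    matched: List[Optional[str]] = []
    unmatched: List[Optional[str]] = []
    for value in values:
        if value is not None and compiled.search(value):
            matched.append(value)
        else:
            unmatched.append(value)
    return matched, unmatched


matched, unmatched = split_by_match(["AB12", "hello", "ZX9", ""], r"[A-Z]{2}\d+")
# The two counts correspond to the Matched and Unmatched statistics.
stats = {"Matched": len(matched), "Unmatched": len(unmatched)}
print(stats)
```

Every record falls into exactly one of the two lists, so the two counts always add up to the number of records processed.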
Example
In this example, the values in an ADDRESS3 attribute are matched against the following UK Postcode regular expression:
([A-Z]{1,2}[0-9]{1,2}|[A-Z]{3}|[A-Z]{1,2}[0-9][A-Z]) +([0-9][A-Z]{2})
Matched values | Unmatched values |
---|---|
170 | 1831 |
Drilldown on Matched values:
Where values match, an array is created with the values matching each distinct group; that is, Outcode and Incode:
ADDRESS3 | RegExMatchFull | RegExMatchGroups |
---|---|---|
SP7 9QJ | SP7 9QJ | {SP7}{9QJ} |
BA16 0BB | BA16 0BB | {BA16}{0BB} |
LA9 7BT | LA9 7BT | {LA9}{7BT} |
E16 2AG | E16 2AG | {E16}{2AG} |
SN1 5BB | SN1 5BB | {SN1}{5BB} |
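The example can be approximated outside the product with the short Python sketch below, which applies the same UK Postcode expression to a few of the ADDRESS3 values shown above. The names `match_postcode` and `show_groups` are hypothetical, the value `Dorset` is an invented unmatched input, and the brace formatting merely imitates how the array is displayed in the drilldown. It assumes Python's `re` module treats this particular expression the same way the processor does.

```python
import re
from typing import List, Optional, Tuple

# The UK Postcode expression from the example:
# group 1 captures the Outcode, group 2 captures the Incode.
POSTCODE = re.compile(
    r"([A-Z]{1,2}[0-9]{1,2}|[A-Z]{3}|[A-Z]{1,2}[0-9][A-Z]) +([0-9][A-Z]{2})"
)


def match_postcode(address3: str) -> Tuple[Optional[str], Optional[List[str]]]:
    """Return (full match, [outcode, incode]), or (None, None) if no match."""
    m = POSTCODE.search(address3)
    if m is None:
        return None, None
    return m.group(0), [m.group(1), m.group(2)]


def show_groups(groups: List[str]) -> str:
    """Render a list of groups in the {a}{b} style used in the drilldown."""
    return "".join("{" + g + "}" for g in groups)


for value in ["SP7 9QJ", "BA16 0BB", "E16 2AG", "Dorset"]:
    full, groups = match_postcode(value)
    print(value, "|", full, "|", show_groups(groups) if groups else None)
```

For the three postcodes, the printed rows correspond to the ADDRESS3, RegExMatchFull, and RegExMatchGroups columns; for the value with no postcode, both results are `None`.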