2.4 Regular Expressions to Redact Patterns of Data
Regular expressions redact specific data within a column data value, based on a pattern search. You can use regular expressions to redact a column of strings of different lengths.
For example, you can redact the user name of email addresses, so that only the domain shows (for example, replacing hpreston
in the email address hpreston@example.com
with [redacted]
so that it appears as [redacted]@example.com
). To perform the redaction, set the DBMS_REDACT.ADD_POLICY
or DBMS_REDACT.ALTER_POLICY
procedure function_type
parameter to either DBMS_REDACT.REGEXP
or DBMS_REDACT.REGEXP_WIDTH
, and then use the following parameters to build the regular expression:
-
A string search pattern (that is, the values to search for), such as:
regexp_pattern => '(.+)@(.+\.[A-Za-z]{2,4})'
This setting looks for a pattern of the following form:
one_or_more_characters@one_or_more_characters.2-4_characters_in_range_A-Z_or_a-z
-
A replacement string that replaces the value matched by the
regexp_pattern
setting. The replacement string can include back references to sub-expressions of the main regular expression pattern. The following example replaces the data before the@
symbol (from theregexp_pattern
setting) with the text[redacted]
. The\2
setting refers to the second match group, which is(.+\.[A-Za-z]{2,4})
from theregexp_pattern
setting.regexp_replace_string => '[redacted]@\2'
-
The starting position for the string search string, such as the first character of the data, such as:
regexp_position => DBMS_REDACT.RE_BEGINNING
This value is the default if it is not specified.
-
The kind of search and replace operation to perform, such as the first occurrence, every fifth occurrence, or all of the occurrences, such as:
regexp_occurrence => DBMS_REDACT.RE_ALL
This value is the default if it is not specified.
-
The default matching behavior for the search and replace operation, such as whether the search is case-sensitive (
i
sets it to be not case-sensitive):regexp_match_parameter => 'i'
In addition to the default parameters, you can use a set of predefined formats that enable you to use commonly used regular expressions for telephone numbers, email addresses, and credit card numbers.