3.9 Creating a Regular Expression-Based Redaction Policy
A regular expression-based redaction policy enables you to redact data based on a search-and-replace model.
3.9.1 About Creating Regular Expression-Based Redaction Policies
Regular expression-based redaction enables you to search for patterns of data to redact.
For example, you can use regular expressions to redact email addresses, which can have varying character lengths. It is designed for use with character data only. You can use formats for the search and replace operation, or you can create custom pattern formats.
You cannot use regular expressions to redact a subset of the values in a column. The REGEXP_PATTERN
(regular expression pattern) must match all of the values in order for the REGEXP_REPLACE_STRING
setting to take effect, and the REGEXP_REPLACE_STRING
must change the value.
For rows where the REGEXP_PATTERN
fails to match, Data Redaction
performs DBMS_REDACT.FULL
redaction. This mitigates the risk of showing
the actual data for those rows which it failed to match because of a mistake in the
REGEXP_PATTERN
causing the regular expression to fail to match all
of the values in the column.
In addition, if no change to the value occurs as a result of the REGEXP_REPLACE_STRING
setting during regular expression replacement operation, Data Redaction performs DBMS_REDACT.FULL
redaction.
3.9.2 Syntax for Creating a Regular Expression-Based Redaction Policy
The regexp_*
parameters of the DBMS_REDACT.ADD_POLICY
procedure can create a regular expression-based redaction policy.
The DBMS_REDACT.ADD_POLICY
fields for creating a regular expression-based data redaction policy are as follows:
DBMS_REDACT.ADD_POLICY ( object_schema IN VARCHAR2 := NULL, object_name IN VARCHAR2, column_name IN VARCHAR2 := NULL, policy_name IN VARCHAR2, function_type IN BINARY_INTEGER := NULL, expression IN VARCHAR2, enable IN BOOLEAN := TRUE, regexp_pattern IN VARCHAR2 := NULL, regexp_replace_string IN VARCHAR2 := NULL, regexp_position IN BINARY_INTEGER := 1, regexp_occurrence IN BINARY_INTEGER := 0, regexp_match_parameter IN VARCHAR2 := NULL);
In this specification:
-
object_schema
,object_name
,column_name
,policy_name
,expression
,enable
: See General Syntax of the DBMS_REDACT.ADD_POLICY Procedure. -
function_type
: Specifies the type of redaction. For regular expression based redaction, use eitherDBMS_REDACT.REGEXP
orDBMS_REDACT.REGEXP_WIDTH
.If you use the
DBMS_REDACT.REGEXP
redaction type, then no truncation occurs. This applies even if the redacted value is wider than the column width, and the Oracle Call Interface width attribute (OCI_ATTR_CHAR_SIZE
) of the column is not preserved. (It becomes 4000, just as it does when theREGEXP_REPLACE
SQL operator is used on a column.)Using the
DBMS_REDACT.REGEXP_WIDTH
redaction type truncates any redacted value that exceeds the width of the column, and ensures that the OCI width attribute of the column (OCI_ATTR_CHAR_SIZE
) remains unchanged.Note the following:
-
Use the
DBMS_REDACT.REGEXP_WIDTH
function type if your applications depend on the value of theOCI_ATTR_CHAR_SIZE
attribute. For example, applications that are built using the Oracle OLE DB Provider interface are sensitive to the value of theOCI_ATTR_CHAR_SIZE
attribute. If you useDBMS_REDACT.REGEXP
as the redaction type, then theOCI_ATTR_CHAR_SIZE
always becomes 4000. This setting makes it unsuitable as the redaction type of policies on tables that are used by Oracle OLE DB based applications. See Oracle Call Interface Developer's Guide for more information about Oracle Call Interface parameter attributes. -
When you set the
function_type
parameter toDBMS_REDACT.REGEXP
orDBMS_REDACT.REGEXP_WIDTH
, omit thefunction_parameters
parameter from theDBMS_REDACT.ADD_POLICY
procedure. -
Specify the regular expression parameters in much the same way that you specify the
pattern
,replace
,position
,occurrence
, andmatch_parameter
arguments to theREGEXP_REPLACE
SQL function. See Oracle Database SQL Language Reference for information about theREGEXP_REPLACE
SQL function.
-
-
regexp_pattern
: Describes the search pattern for data that must be matched. If it finds a match, then Oracle Database replaces the data as specified by theregexp_replace_string
setting. See the following sections for more information: -
regexp_replace_string
: Specifies how you want to replace the data to be redacted. See the following sections for more information: -
regexp_position
: Specifies the starting position for the string search. The value that you enter must be a positive integer indicating the character of thecolumn_name
data where Oracle Database should begin the search. The default is1
or theDBMS_REDACT.RE_BEGINNING
format, meaning that Oracle Database begins the search at the first character of thecolumn_name
data. -
regexp_occurrence
: Specifies how to perform the search and replace operation. The value that you enter must be a non-negative integer indicating the occurrence of the replace operation:-
If you specify
0
or theDBMS_REDACT.RE_ALL
format, then Oracle Database replaces all the occurrences of the match. -
If you specify the
DBMS_REDACT.RE_FIRST
format, then Oracle Database replaces the first occurrence of the match. -
If you specify a positive integer
n
, then Oracle Database replaces then
th occurrence of the match.
If the occurrence is greater than 1, then the database searches for the second occurrence beginning with the first character following the first occurrence of pattern, and so forth.
-
-
regexp_match_parameter
: Specifies a text literal that lets you change the default matching behavior of the function. The behavior of this parameter is the same for this function as for theREGEXP_REPLACE
SQL function. See Oracle Database SQL Language Reference for detailed information.To filter the search so that it is not case sensitive, specify the
RE_CASE_INSENSITIVE
format.
3.9.3 Regular Expression-Based Redaction Policies Using Formats
The DBMS_REDACT.ADD_POLICY
procedure supports both
regexp_pattern
and regexp_replace_string
parameters.
3.9.3.1 Regular Expression Formats
The regular expression formats represent commonly used expressions, such as the replacement of digits within a credit card number.
Table 3-10 describes the formats that you can use with the regexp_pattern
parameter in the DBMS_REDACT.ADD_POLICY
procedure.
Table 3-10 Formats for the regexp_pattern Parameter
Format | Description |
---|---|
|
Searches for any digit. Replaces the identified pattern with the characters specified by the regexp_replace_string => DBMS_REDACT.RE_REDACT_WITH_SINGLE_X This setting replaces any matched digit with the The following setting replaces any matched digit with the regexp_replace_string => DBMS_REDACT.RE_REDACT_WITH_SINGLE_1 |
|
Searches for the middle digits of any credit card (other than American Express) that has 6 leading digits and 4 trailing digits. Replaces the identified pattern with the characters specified by the The appropriate |
|
Matches credit card numbers other than American Express credit card numbers. The appropriate |
|
Matches American Express credit card numbers. The appropriate |
|
Searches for any U.S. telephone number. Replaces the identified pattern with the characters specified by the The appropriate |
|
Searches for any email address. Replaces the identified pattern with the characters specified by the The appropriate
|
|
Searches for an IP address. Replaces the identified pattern with the characters specified by the The appropriate |
The table below describes formats that you can use with the
regexp_replace_string
parameter in the
DBMS_REDACT.ADD_POLICY
procedure.
Table 3-11 Formats for the regexp_replace_string Parameter
Format | Description |
---|---|
|
Replaces the data with a single |
|
Replaces the data with a single |
|
Redacts the middle digits in credit card numbers, as specified by setting the |
|
Redacts the first 12 digits of a credit card number other than an American Express card number. For example, |
|
Redacts the first 10 digits of an American Express number. For example, |
|
Redacts the last 7 digits of U.S. telephone numbers, as specified by setting the |
|
Redacts the email name as specified by setting the |
|
Redacts the email domain name as specified by setting the |
|
Redacts the last three digits of the IP address as specified by setting the |
Related Topics
3.9.3.2 Example: Regular Expression Redaction Policy Using Formats
The DBMS_REDACT.ADD_POLICY
procedure can create a regular expression redaction policy that uses formats.
Example 3-7 shows how to use regular expression formats to redact credit card numbers.
Example 3-7 Regular Expression Data Redaction Character Values
BEGIN DBMS_REDACT.ADD_POLICY( object_schema => 'mavis', object_name => 'cust_info', column_name => 'cc_num', policy_name => 'redact_cust_cc_nums', function_type => DBMS_REDACT.REGEXP, expression => '1=1', regexp_pattern => DBMS_REDACT.RE_PATTERN_CC_L6_T4, regexp_replace_string => DBMS_REDACT.RE_REDACT_CC_MIDDLE_DIGITS, regexp_position => DBMS_REDACT.RE_BEGINNING, regexp_occurrence => DBMS_REDACT.RE_FIRST, regexp_match_parameter => DBMS_REDACT.RE_CASE_INSENSITIVE, policy_description => 'Regular expressions to redact credit card numbers', column_description => 'cc_num contains customer credit card numbers'); END; /
Query and redacted result:
SELECT cc_num FROM mavis.cust_info; CC_NUM ------- 401288XXXXXX1881 411111XXXXXX1111 555555XXXXXX1111 511111XXXXXX1118
3.9.4 Custom Regular Expression Redaction Policies
You can customize regular expressions in Data Redaction policies.
3.9.4.1 Settings for Custom Regular Expressions
Oracle Data Redaction provides special settings to configure policies that use regular expressions.
To create custom regular expression redaction policies, you use the following parameters in the DBMS_REDACT.ADD_POLICY
procedure:
-
regexp_pattern
: This pattern is usually a text literal and can be of any of the data typesCHAR
,VARCHAR2
,NCHAR
, orNVARCHAR2
. The pattern can contain up to 512 bytes. For further information about writing the regular expression for theregexp_pattern
parameter, see the description of thepattern
argument of theREGEXP_REPLACE
SQL function in Oracle Database SQL Language Reference, because the support that Data Redaction provides for regular expression matching is similar to that of theREGEXP_REPLACE
SQL function. -
regexp_replace_string
: This data can be of any of the data typesCHAR
,VARCHAR2
,NCHAR
, orNVARCHAR2
. Theregexp_replace_string
can contain up to 500 back references to subexpressions in the form\
n
, wheren
is a number from 1 to 9. If you want to include a backslash (\) in theregexp_replace_string
setting, then you must precede it with the escape character, which is also a backslash. For example, to literally replace the matched pattern with\2
(rather than replace it with the second matched subexpression of the matched pattern), you enter\\2
in theregexp_replace_string
setting. For more information, see Oracle Database SQL Language Reference.
See Also:
General Syntax of the DBMS_REDACT.ADD_POLICY Procedure for information about other DBMS_REDACT.ADD_POLICY
parameters
3.9.4.2 Example: Custom Regular Expression Redaction Policy
The DBMS_REDACT.ADD_POLICY
procedure regexp* parameters can create a custom regular expression redaction policy.
Example 3-8 shows how to use regular expressions to redact the emp_id
column data. In this example, taken together, the regexp_pattern
and regexp_replace_string
parameters do the following: first, find the pattern of 9 digits. For reference, break them into three groups that contain the first 3, the next 2, and then the last 4 digits. Then, replace the first five digits with XXXXX
concatenated with the third group (the last 4 digits) as found in the original pattern.
Example 3-8 Partially Redacted Data Redaction Using Regular Expressions
BEGIN DBMS_REDACT.ADD_POLICY( object_schema => 'mavis', object_name => 'cust_info', column_name => 'emp_id', policy_name => 'redact_cust_ids', function_type => DBMS_REDACT.REGEXP, expression => '1=1', regexp_pattern => '(\d\d\d)(\d\d)(\d\d\d\d)', regexp_replace_string => 'XXXXX\3', regexp_position => 1, regexp_occurrence => 0, regexp_match_parameter => 'i', policy_description => 'Redacts employee IDs using regular expression', column_description => 'emp_id contains employee ID numbers'); END; /
Query and redacted result:
SELECT emp_id FROM mavis.cust_info; EMP_ID ------------ XXXXX1234 XXXXX5678