2.4 Regular Expressions to Redact Patterns of Data

Regular expressions redact specific data within a column data value, based on a pattern search. You can use regular expressions to redact a column of strings of different lengths.

For example, you can redact the user name of email addresses, so that only the domain shows (for example, replacing hpreston in the email address hpreston@example.com with [redacted] so that it appears as [redacted]@example.com). To perform the redaction, set the DBMS_REDACT.ADD_POLICY or DBMS_REDACT.ALTER_POLICY procedure function_type parameter to either DBMS_REDACT.REGEXP or DBMS_REDACT.REGEXP_WIDTH, and then use the following parameters to build the regular expression:

  • A string search pattern (that is, the values to search for), such as:

    regexp_pattern => '(.+)@(.+\.[A-Za-z]{2,4})' 
    

    This setting looks for a pattern of the following form:

    one_or_more_characters@one_or_more_characters.2-4_characters_in_range_A-Z_or_a-z
    
  • A replacement string that replaces the value matched by the regexp_pattern setting. The replacement string can include back references to sub-expressions of the main regular expression pattern. The following example replaces the data before the @ symbol (from the regexp_pattern setting) with the text [redacted]. The \2 setting refers to the second match group, which is (.+\.[A-Za-z]{2,4}) from the regexp_pattern setting.

    regexp_replace_string => '[redacted]@\2'
    
  • The starting position for the string search string, such as the first character of the data, such as:

    regexp_position => DBMS_REDACT.RE_BEGINNING
    

    This value is the default if it is not specified.

  • The kind of search and replace operation to perform, such as the first occurrence, every fifth occurrence, or all of the occurrences, such as:

    regexp_occurrence => DBMS_REDACT.RE_ALL
    

    This value is the default if it is not specified.

  • The default matching behavior for the search and replace operation, such as whether the search is case-sensitive (i sets it to be not case-sensitive):

    regexp_match_parameter => 'i'

In addition to the default parameters, you can use a set of predefined formats that enable you to use commonly used regular expressions for telephone numbers, email addresses, and credit card numbers.