Invalid Character Check
The Invalid Character Check processor provides a quick and easy way to find values that contain odd characters.
Use the Invalid Character Check to check for unusual characters. This is particularly useful when analyzing free text fields, which may contain 'data cheats', where data entry users have worked around mandatory fields by entering dummy characters such as #. The Invalid Character Check is also useful for finding typos.
If the invalid characters do not signify anything, they can simply be removed by adding a Denoise processor.
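As a rough illustration of the idea (not the processor's actual implementation), the check and the follow-up clean-up can be sketched in Python. The character set `INVALID_CHARS` and the function names `find_invalid_chars` and `strip_invalid_chars` are assumptions made for this sketch; the real processor is configured through its options rather than code.

```python
# Hypothetical set of characters to treat as invalid (an assumption for this
# sketch, not the processor's built-in list).
INVALID_CHARS = set('()#%^*$£"!\'')


def find_invalid_chars(value, invalid_chars=INVALID_CHARS):
    """Return the invalid characters found in value, in order of first appearance."""
    found = []
    for ch in value:
        if ch in invalid_chars and ch not in found:
            found.append(ch)
    return found


def strip_invalid_chars(value, invalid_chars=INVALID_CHARS):
    """Remove invalid characters, then collapse any leftover whitespace."""
    cleaned = "".join(ch for ch in value if ch not in invalid_chars)
    return " ".join(cleaned.split())
```

Here `find_invalid_chars("# MCAULEY")` reports the `#` character, and `strip_invalid_chars("# MCAULEY")` leaves `MCAULEY`, which is roughly what removing the characters in a later step would achieve when they carry no meaning.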
The following table describes the configuration options:
Configuration | Description |
---|---|
Inputs | Specify a single attribute or an array to analyze for invalid characters. |
Options | Specify the following options: |
Outputs | Describes any data attribute or flag attribute outputs. |
Data Attributes | None. |
Flags | For each attribute input, a new attribute is created in the following format: A single summary flag is also output: |
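The flag outputs can be pictured with a small Python sketch: one flag per input attribute, plus one summary flag for the record. This is only an illustration under stated assumptions. The function name `flag_record`, the boolean flag values, the character set, and the rule that the summary passes only when every checked attribute passes are all hypothetical and are not taken from the processor's documented flag format.

```python
# Assumed character set, for illustration only.
INVALID_CHARS = set('()#%^*$£"!\'')


def flag_record(record, attributes, invalid_chars=INVALID_CHARS):
    """Return (per_attribute_flags, summary_flag) for one record.

    A flag is True when the attribute value contains no invalid characters.
    The summary flag is True only if every checked attribute passes
    (an assumption made for this sketch).
    """
    flags = {}
    for name in attributes:
        value = record.get(name) or ""  # treat missing or None values as empty
        flags[name] = not any(ch in invalid_chars for ch in value)
    summary = all(flags.values())
    return flags, summary
```

For a record such as `{"NAME": "# RAE", "CITY": "LEEDS"}` checked on both attributes, this sketch flags `NAME` as failing, `CITY` as passing, and the record as a whole as failing.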
The following table describes the statistics produced by the profiler:
Statistic | Description |
---|---|
Valid records | The records that were categorized as valid by the Invalid Character Check. |
Invalid records | The records that were categorized as invalid by the Invalid Character Check. |
Output Filters
The following output filters are available from an Invalid Character Check:
- Valid records
- Invalid records
Example
In this example, a NAME attribute is checked for invalid characters such as ()#%^*$£"!'. A number of records are found containing the # character, and one record contains the ' character.
Valid | Invalid |
---|---|
1988 | 14 |
You can drill down on invalid values:
Name
- # MCAULEY
- # RAE
- # WILLIAM
- # SWAN
- # HAWKES
- # BARKER
- # PALMER
- # SNOWDON
- # DOONAN
- # MCCLEMENTS
- # SHIELDS
- # SEADEN
- {O'CONNAL}
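The split into valid and invalid records shown in this example can be approximated with a short Python sketch. It assumes the same character list as the example, and the names `SMITH` and `JONES` are made-up clean values added only so that both outputs are non-empty; `split_records` is a hypothetical helper, not part of the product.

```python
# Characters treated as invalid, taken from the example text above.
INVALID_CHARS = set('()#%^*$£"!\'')


def split_records(names, invalid_chars=INVALID_CHARS):
    """Partition names into (valid, invalid) lists, preserving input order."""
    valid, invalid = [], []
    for name in names:
        if any(ch in invalid_chars for ch in name):
            invalid.append(name)
        else:
            valid.append(name)
    return valid, invalid


# Three values from the drill-down list plus two hypothetical clean names.
names = ["# MCAULEY", "# RAE", "{O'CONNAL}", "SMITH", "JONES"]
valid, invalid = split_records(names)
print(len(valid), len(invalid))  # prints: 2 3
```

The two returned lists play the role of the Valid records and Invalid records output filters: each can be passed on to a different downstream step.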