1.3.3.9.24 Match Transformation: Strip Words
The Strip Words match transformation removes words listed in Reference Data from String values before they are clustered or compared. It works in exactly the same way as the main Strip Words processor.
The Strip Words transformation is useful when clustering or comparing text values that contain many variant forms of words that are not needed to identify the value. For example, when matching company names, suffixes such as "LIMITED", "LTD", "GRP", "GROUP", and "PLC" may be stripped so that only the meaningful parts of the identifier values are matched.
Example
In this example, the Strip Words transformation is used in a comparison on a company name identifier.
Example configuration
Reference Data used includes the following words in the left-most column:
CORP, CORPORATION, LIMITED, LTD, PLC, GROUP, GRP
Delimiter Reference Data: *Delimiters
Delimiter characters: none
Ignore case?: Yes
Example transformations
The following table shows example transformations using the above configuration of the Strip Words transformation:
Table 1-97 Example Transformations for Strip Words
Value | Transformed Value |
---|---|
ORACLE CORP | ORACLE |
ORACLE CORPORATION | ORACLE |
INTERCHANGE GROUP LIMITED | INTERCHANGE |
INTERCHANGE GROUP | INTERCHANGE |
INTERCHANGE GRP LTD | INTERCHANGE |
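
The behavior shown in the example transformations can be approximated outside the product with a short Python sketch. This is an illustration only, not the product's implementation: the function name `strip_words`, the `DELIMITERS` character set standing in for the *Delimiters Reference Data, and the choice to rejoin the remaining words with a single space are all assumptions made for the example.

```python
# Words to strip, taken from the example Reference Data above.
STRIP_WORDS = {"CORP", "CORPORATION", "LIMITED", "LTD", "PLC", "GROUP", "GRP"}

# Assumption: a stand-in for the *Delimiters Reference Data.
# The actual delimiter set may differ.
DELIMITERS = " \t,.;:/-"


def strip_words(value, words=STRIP_WORDS, delimiters=DELIMITERS, ignore_case=True):
    """Return `value` with every whole word found in `words` removed."""
    # Normalize the lookup set once when case is ignored.
    lookup = {w.upper() for w in words} if ignore_case else set(words)

    # Turn every delimiter character into a space, then split into words.
    to_space = str.maketrans({ch: " " for ch in delimiters})
    tokens = value.translate(to_space).split()

    # Keep only the words that are not in the strip list.
    kept = [t for t in tokens if (t.upper() if ignore_case else t) not in lookup]

    # Assumption: remaining words are rejoined with a single space.
    return " ".join(kept)


if __name__ == "__main__":
    for name in ["ORACLE CORP", "ORACLE CORPORATION", "INTERCHANGE GRP LTD"]:
        print(name, "->", strip_words(name))
```

Because only whole words are compared against the list, a value such as "GROUPON LTD" would keep "GROUPON" and lose only "LTD" in this sketch.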