1.3.10.41 Strip Words
The Strip Words transformation processor removes from attribute values any words that match an entry in a Reference Data list.
Strip Words can be used to remove extraneous words from attributes, often with a view to creating values for matching. For example, when matching companies using a Company Name field, it may be useful to remove less significant words that occur in various forms, or which may occur in some values and not others, such as LTD, LIMITED, UK, PLC and so on.
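To illustrate why stripping such words helps matching, the following is a minimal Python sketch, not the processor's implementation. It assumes words are separated by whitespace and compared case-insensitively; the names `strip_words`, `match_key` and `NOISE_WORDS` are hypothetical and stand in for the processor and its Reference Data list.

```python
def strip_words(value, words):
    """Drop whitespace-separated tokens that appear in `words`.

    Assumption: tokens are compared case-insensitively. The real
    processor's delimiter and case handling may differ.
    """
    blocked = {w.casefold() for w in words}
    kept = [tok for tok in value.split() if tok.casefold() not in blocked]
    return " ".join(kept)


# Hypothetical stand-in for a Reference Data list of less significant words.
NOISE_WORDS = ["LTD", "LIMITED", "UK", "PLC"]


def match_key(company_name):
    """Build a simple comparison key from a company name."""
    return strip_words(company_name, NOISE_WORDS).casefold()


names = ["Acme Tools Ltd", "ACME TOOLS LIMITED", "Acme Tools UK PLC"]
keys = {match_key(n) for n in names}
print(keys)  # all three variants collapse to a single key
```

With the noise words removed, the three spellings of the same company produce one key, which is the kind of value that is typically passed on to a matching step.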
The following table describes the configuration options:
Configuration | Description |
---|---|
Inputs | Specify any String or String Array type attributes from which you want to strip words. Number and Date attributes are not valid inputs. Note that if you input an Array attribute, the transformation will apply to all array elements, and an Array attribute will be output. |
Options | Specify the following options: |
Outputs | Describes any data attribute or flag attribute outputs. |
Data Attributes | The following data attributes are output: |
Flags | None. |
The Strip Words transformer presents no summary statistics on its processing.
In the Data view, each input attribute is shown with its new derived attribute, with words stripped, to its right.
Output Filters
None.
Example
In this example, Strip Words is used to remove less significant words such as 'Limited', 'Ltd.', 'Services' and 'Associates' from a field containing Company Names:
BUSINESS | Business.StrippedWords |
---|---|
Kamke & Ellis Ltd. | Kamke & Ellis |
Sanford Electrical Co | Sanford Electrical |
C T V Services | C T V |
W F Electrical Contractors Limited | W F Electrical Contractors |
Eco-Systems Group | Eco-Systems |
Milbourne Associates | Milbourne |
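The rows above can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the product's logic: words are taken to be whitespace-delimited and matched case-insensitively, and the word list (including 'Co' and 'Group', which the example output implies) is a hypothetical stand-in for the Reference Data actually used. The sketch also mirrors the documented array behavior: a list input yields a list output.

```python
def strip_words(value, words):
    """Remove listed words from a string, or from each element of a list.

    Assumptions: whitespace-delimited tokens, case-insensitive matching.
    """
    blocked = {w.casefold() for w in words}

    def strip_one(text):
        return " ".join(t for t in text.split() if t.casefold() not in blocked)

    if isinstance(value, list):
        # Array input: transform every element and return an array.
        return [strip_one(v) for v in value]
    return strip_one(value)


# Hypothetical word list inferred from the example rows.
WORDS = ["Limited", "Ltd.", "Services", "Associates", "Co", "Group"]

business = [
    "Kamke & Ellis Ltd.",
    "Sanford Electrical Co",
    "C T V Services",
    "W F Electrical Contractors Limited",
    "Eco-Systems Group",
    "Milbourne Associates",
]

for original, stripped in zip(business, strip_words(business, WORDS)):
    print(f"{original} | {stripped}")
```

Because tokens are split only on whitespace in this sketch, 'Ltd.' must be listed with its trailing period, and 'Eco-Systems' is left intact as a single token.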