1.3.3.9.14 Match Transformation: Make Array from String
The Make Array from String transformation splits a single text value into a variable number of distinct values. This is useful when creating clusters for matching, because a cluster is created for each distinct value produced. Any values that share a common word, regardless of where that word appears within the value, are therefore placed in the same cluster for matching purposes. For example, where a Name identifier has the values 'John Simpson' and 'Simpson, J', clustering by making an array using comma and space delimiters ensures that the two records are in the same cluster ('Simpson').
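The Name example above can be illustrated with a minimal Python sketch. This is not the product's implementation; the function names `make_array_from_string` and `cluster_by_tokens` are hypothetical, and the sketch assumes that the delimiters are treated as a set of single characters and that empty elements are discarded.

```python
import re
from collections import defaultdict


def make_array_from_string(value, delimiters=", "):
    """Split a value on any run of the given delimiter characters.

    Hypothetical helper: assumes each character in `delimiters` is a
    separate delimiter and that empty elements are dropped.
    """
    pattern = "[" + re.escape(delimiters) + "]+"
    return [part for part in re.split(pattern, value) if part]


def cluster_by_tokens(values):
    """Place each value in one cluster per distinct array element."""
    clusters = defaultdict(list)
    for value in values:
        # dict.fromkeys removes duplicate elements while keeping their order
        for token in dict.fromkeys(make_array_from_string(value)):
            clusters[token].append(value)
    return dict(clusters)


clusters = cluster_by_tokens(["John Simpson", "Simpson, J"])
print(clusters["Simpson"])
```

Both values end up under the 'Simpson' key, while 'John' and 'J' each form a cluster containing a single value.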
The Make Array from String transformation is functionally the same as the main Make Array from String processor, but is used specifically when clustering to split values into several words to use as cluster keys.
Note that Make Array from String cannot be used within comparisons.
Use the Make Array from String transformation as the last transformation when clustering in order to ensure that records will be brought together into the same cluster if they have any word in common.
The following table describes the configuration options:
Configuration | Description |
---|---|
Options | Specify the following options: |
Example
In this example, the Make Array from String transformation is included in the configuration of a cluster on an Address1 identifier.
Example configuration
The following transformations are added to the Address1 identifier to form a cluster:
- Upper Case
- Strip Numbers
- Strip Words (to remove very common words such as The, House, Road, Street, Avenue, Lane, etc.)
- Normalize Whitespace
- Make Array from String
Example transformations
The following table shows examples of transformations using the above configuration:
Table 1-86 Example Transformations for Make Array from String
Value | Value after first 4 transformations | Value after Make Array from String transformation |
---|---|---|
The Maltings, 14 Appletree Lane | MALTINGS, APPLETREE | 1 - MALTINGS; 2 - APPLETREE |
14 Appletree Lane | APPLETREE | 1 - APPLETREE |
The Maltings | MALTINGS | 1 - MALTINGS |
32 Rushton Road, Coventry | RUSHTON, COVENTRY | 1 - RUSHTON; 2 - COVENTRY |
32 Rushton Rd | RUSHTON | 1 - RUSHTON |
15 Stroud Green Road | STROUD GREEN | 1 - STROUD; 2 - GREEN |
14 Green End Avenue | GREEN END | 1 - GREEN; 2 - END |
All records that share a common value after transformation will be in the same cluster. For example, the first two records above will be in the 'APPLETREE' cluster, and the first and third records will be in the 'MALTINGS' cluster.
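The example configuration can be approximated with the following Python sketch. It is an illustration under stated assumptions, not the product's implementation: the `STRIP_WORDS` list is hypothetical (it includes RD because the table shows '32 Rushton Rd' reduced to 'RUSHTON'), the array is assumed to be split on commas and whitespace, and the intermediate string may differ slightly in spacing from the table even though the resulting array elements are the same.

```python
import re
from collections import defaultdict

# Assumed word list; the real Strip Words reference data is not given in the text.
STRIP_WORDS = ["THE", "HOUSE", "ROAD", "RD", "STREET", "AVENUE", "LANE"]
STRIP_PATTERN = re.compile(r"\b(?:" + "|".join(STRIP_WORDS) + r")\b")


def cluster_keys(address):
    """Apply the five transformations in order and return the array of keys."""
    value = address.upper()                      # Upper Case
    value = re.sub(r"\d", "", value)             # Strip Numbers
    value = STRIP_PATTERN.sub("", value)         # Strip Words
    value = " ".join(value.split())              # Normalize Whitespace
    # Make Array from String (assumed delimiters: comma and whitespace)
    return [part for part in re.split(r"[,\s]+", value) if part]


addresses = [
    "The Maltings, 14 Appletree Lane",
    "14 Appletree Lane",
    "The Maltings",
    "32 Rushton Road, Coventry",
    "32 Rushton Rd",
    "15 Stroud Green Road",
    "14 Green End Avenue",
]

# Each record joins one cluster per distinct array element.
clusters = defaultdict(list)
for address in addresses:
    for key in dict.fromkeys(cluster_keys(address)):
        clusters[key].append(address)

for key, members in clusters.items():
    print(key, members)
```

In this sketch, the sixth and seventh records also share the 'GREEN' cluster, which shows how a common word can bring together addresses that are otherwise unrelated.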