1.3.3.9.15 Match Transformation: Metaphone
The Metaphone transformation generates the same metaphone key for values that sound alike but are spelled differently, for example because of misspellings.
The Metaphone transformation is useful both when clustering and when performing comparisons, especially when matching data that may contain misspellings, such as names.
When clustering, it divides records into cluster groups whose values all share the same phonetic identifier. For example, the same metaphone key ("KLT") is produced from all of the following surnames: "Gold", "Gould", and "Gauld".
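The clustering idea can be sketched in Python as follows. This is a minimal illustration, not the product's implementation: `sketch_key`, `CODES`, and `cluster_by_key` are hypothetical names, and the rule set is a deliberately reduced approximation of Metaphone-style encoding (vowels dropped except a leading one, H and W treated as silent, a few consonants merged). It is only intended to show how records that share a key fall into the same cluster group.

```python
from collections import defaultdict

VOWELS = set("AEIOUY")
# Hypothetical, simplified consonant map; unlisted letters map to themselves.
CODES = {"B": "P", "D": "T", "G": "K", "Q": "K", "V": "F", "Z": "S", "X": "KS"}


def sketch_key(value: str) -> str:
    """Return a simplified phonetic key (an assumption, not the real algorithm)."""
    letters = [ch for ch in value.upper() if "A" <= ch <= "Z"]
    key = []
    for i, ch in enumerate(letters):
        if i > 0 and ch == letters[i - 1]:
            continue  # collapse doubled letters
        if ch in VOWELS:
            if i == 0:
                key.append("A")  # a leading vowel is kept as "A"
            continue  # all other vowels are dropped
        if ch in "HW":
            continue  # treated as silent in this sketch
        if ch == "C":
            nxt = letters[i + 1] if i + 1 < len(letters) else ""
            key.append("S" if nxt in ("E", "I", "Y") else "K")  # soft vs hard C
        else:
            key.append(CODES.get(ch, ch))
    return "".join(key)


def cluster_by_key(values):
    """Group values by their phonetic key, preserving input order."""
    clusters = defaultdict(list)
    for value in values:
        clusters[sketch_key(value)].append(value)
    return dict(clusters)


surnames = ["Gold", "Gould", "Gauld", "Pearce", "Pierce", "Lewis"]
clusters = cluster_by_key(surnames)
print(clusters["KLT"])  # ['Gold', 'Gould', 'Gauld']
```

Only records within the same cluster group then need to be compared with each other, which is what makes a phonetic key a convenient cluster identifier.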
When using comparisons, it is often useful to include a positive metaphone match to strengthen a match rule. An edit distance of 2 or 3 characters on a name field may be quite a weak match on its own, but if both values also sound the same, the match is stronger. For example, "John Clarke" might well be the same person as "Jon Clarke", but is far less likely to be the same person as "John Darke".
This provides a way of finding misspellings that are often due to the person entering the data not hearing the name correctly.
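A match rule of this kind can be sketched as below. The names `sketch_key`, `edit_distance`, and `is_strengthened_match`, and the threshold of 2 edits, are assumptions made for illustration; the phonetic key is a reduced approximation rather than the transformation's actual algorithm. The rule accepts a pair only when the edit distance is small and the phonetic keys agree.

```python
VOWELS = set("AEIOUY")
# Hypothetical, simplified consonant map; unlisted letters map to themselves.
CODES = {"B": "P", "D": "T", "G": "K", "Q": "K", "V": "F", "Z": "S", "X": "KS"}


def sketch_key(value: str) -> str:
    """Simplified phonetic key (an assumption, not the real algorithm)."""
    letters = [ch for ch in value.upper() if "A" <= ch <= "Z"]
    key = []
    for i, ch in enumerate(letters):
        if i > 0 and ch == letters[i - 1]:
            continue  # collapse doubled letters
        if ch in VOWELS:
            if i == 0:
                key.append("A")  # keep a leading vowel as "A"
            continue
        if ch in "HW":
            continue  # treated as silent in this sketch
        if ch == "C":
            nxt = letters[i + 1] if i + 1 < len(letters) else ""
            key.append("S" if nxt in ("E", "I", "Y") else "K")
        else:
            key.append(CODES.get(ch, ch))
    return "".join(key)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def is_strengthened_match(a: str, b: str, max_edits: int = 2) -> bool:
    """Weak edit-distance match, accepted only if the keys also agree."""
    return edit_distance(a, b) <= max_edits and sketch_key(a) == sketch_key(b)


print(is_strengthened_match("John Clarke", "Jon Clarke"))   # True
print(is_strengthened_match("John Clarke", "John Darke"))   # False
```

Both candidate pairs are within 2 edits of "John Clarke", so edit distance alone cannot separate them; the key comparison is what rejects "John Darke".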
The following table describes the configuration options:
Configuration | Description |
---|---|
Options | Specify the following options: |
Example
In this example, the Metaphone transformation is used to strengthen name matches. An Exact String Match comparison (see Comparison: Exact String Match) is performed on the transformed value, effectively forming a comparison that determines whether two values sound the same.
Example transformations
The following table shows examples of transformations using the above configuration:
Table 1-87 Example Transformations for Metaphone
Value | Transformed Value |
---|---|
Ellen Wilson | ALNLSN |
Eileen Wilson | ALNLSN |
Pauline Bedham | PLNPTM |
Pauline Beedham | PLNPTM |
Lewis | LS |
Louis | LS |
Lees | LS |
Pearce | PRS |
Pierce | PRS |