4.5 Reference Data
The EDQ-PDS project is provided with the following pre-installed reference data sets.
| Name | Description |
|---|---|
| | Blank reference data that provides the required reference for the Abbreviate processor used in the Key Generation processor. |
| | Tokens that are stripped from a product description attribute before key values are generated, for example single initials such as 'S', which may be too common to form keys but may be significant in matching. Note that these are tokens that you specifically want to remove from key values but not from matching. See, |
| | Characters used to delimit tokens in key generation. The default is spaces. |
| | Token replacements to perform on product description tokens within key generation. These are applied in addition to the standardization and abbreviation performed in the Standardization process. |
| | Individual characters to strip or standardize in the product description in preparation for key generation and matching. |
| | Tokens to strip from product descriptions for both key generation and matching, for example very common non-identifying words such as 'and' and 'the'. |
| | Tokens to standardize in the product description prior to matching. |
| | Individual characters to strip or standardize in the product name in match preparation. |
| | Tokens to standardize in the product name prior to matching. |
| | Vowels to strip from the shortened product description. |
| | Diacritic characters to remove from the product description and name. |
| | Standardization for accented characters. |
| | Valid formats for custom dates. |
| | Colors for standardization and extraction. |
| | Common retail companies for standardization and extraction. |
| | Dictionary of English words, used for profiling. |
| | Common materials for standardization and extraction. |
| | Number bands used for price profiling. |
| | Sizes for standardization and extraction. |
| | Regular expressions for standardization and extraction of quantified units of measure. |
| | Vowels to strip when profiling. |
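
To make the roles of the key generation reference data more concrete, the following Python sketch shows how delimiter, character-stripping, token-replacement, token-stripping and vowel-stripping lists of this kind could be combined to turn a product description into a key. It is an illustration only and is not how EDQ-PDS implements key generation. Every name and value in it (`DELIMITERS`, `NOISE_TOKENS`, `KEY_ONLY_STRIP_TOKENS`, `TOKEN_REPLACEMENTS`, the sample tokens, and the choice to keep the first letter of each token when stripping vowels) is a hypothetical stand-in and does not reflect the contents of the shipped reference data.

```python
import re
import unicodedata

# Hypothetical stand-ins for reference data; the shipped EDQ-PDS data differs.
DELIMITERS = " "                                   # token delimiters (default: spaces)
STRIP_CHARS = str.maketrans("", "", ".,;:()'\"")   # individual characters to strip
NOISE_TOKENS = {"AND", "THE"}                      # stripped for keys and matching
KEY_ONLY_STRIP_TOKENS = {"S"}                      # stripped from keys only
TOKEN_REPLACEMENTS = {"BLK": "BLACK", "LGE": "LARGE"}
VOWELS = set("AEIOU")


def remove_diacritics(text):
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def prepare_tokens(description):
    """Clean a description and return the tokens that would be used for matching."""
    cleaned = remove_diacritics(description).upper().translate(STRIP_CHARS)
    # Split on any run of delimiter characters and discard empty tokens.
    pattern = "[" + re.escape(DELIMITERS) + "]+"
    tokens = [t for t in re.split(pattern, cleaned) if t]
    tokens = [TOKEN_REPLACEMENTS.get(t, t) for t in tokens]
    return [t for t in tokens if t not in NOISE_TOKENS]


def strip_vowels(token):
    """Drop vowels after the first character (an assumption made for this sketch)."""
    return token[:1] + "".join(ch for ch in token[1:] if ch not in VOWELS)


def generate_key(description):
    """Build a key from the matching tokens, minus the key-only strip tokens."""
    tokens = [t for t in prepare_tokens(description) if t not in KEY_ONLY_STRIP_TOKENS]
    return "".join(strip_vowels(t) for t in tokens)


if __name__ == "__main__":
    sample = "The Café Table, Blk and S Lge"
    print(prepare_tokens(sample))  # ['CAFE', 'TABLE', 'BLACK', 'S', 'LARGE']
    print(generate_key(sample))    # CFTBLBLCKLRG
```

In this sketch the single initial 'S' survives in the matching tokens but is left out of the key, which mirrors the distinction the table draws between tokens stripped from key values only and tokens stripped for both key generation and matching.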