Example custom comparison
Custom comparisons can be added to the match library. They are added to widgets.xml in the same way as processors (widgets). The only limitation is that a comparison must have exactly two inputs and one output. The output must be either a string (for Boolean comparisons) or a number (for comparisons that use Result Bands). Boolean comparisons return "T" for True or "F" for False.
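As a rough illustration of the Boolean contract described above, the following sketch shows a comparison that takes two values and returns "T" or "F". The function names `normalize` and `exactMatch`, and the treatment of null values as empty strings, are assumptions made for illustration; they are not part of the match library.

```javascript
// Hypothetical helper: treat null/undefined as an empty string.
function normalize(value) {
  return (value == null) ? "" : String(value);
}

// Hypothetical Boolean comparison: two inputs, one string output.
// Returns "T" when both values are identical, otherwise "F".
function exactMatch(a, b) {
  return normalize(a) === normalize(b) ? "T" : "F";
}

console.log(exactMatch("Smith", "Smith")); // T
console.log(exactMatch("Smith", "Smyth")); // F
```

A comparison that uses Result Bands would follow the same two-input shape but return a number instead of "T" or "F".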
Each custom comparison must be associated with an identifier type: either an existing type (String, Number or Date) or a custom type (see Example custom identifier type).
Associating comparison gadgets with identifier types
Comparison gadgets must be associated with specific Identifier types before they can be used. To associate new comparisons with the existing system Identifiers, use the following names:
dnm:string for Strings
dnm:number for Numbers
dnm:date for Dates
The following example XML represents a comparison association added to matchlibrary.xml:
<identifierComparison>
  <ident>dnm:string</ident>
  <gadget>dnm:exactstringmatch</gadget>
</identifierComparison>
This associates the identifier "dnm:string" with the comparison "dnm:exactstringmatch".
Setting default result bands for comparisons
The following XML represents the default result bands added to matchlibrary.xml for the 'String Edit Distance' comparison:
<comparisonReturn>
  <widgetId>dnm:stringeditdistance</widgetId>
  <resultBand name="exact" label="Exact Match">0</resultBand>
  <resultBand name="onetypo" label="One Typo">1</resultBand>
  <resultBand name="twotypos" label="Two Typos">2</resultBand>
  <resultBand name="threetypos" label="Three Typos">3</resultBand>
</comparisonReturn>
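To show how a numeric comparison result might relate to bands like those above, here is a minimal sketch. It assumes that the comparison returns a Levenshtein-style edit distance and that a result falls into the first band whose value it does not exceed. Both are assumptions for illustration, and the names `RESULT_BANDS`, `editDistance` and `bandFor` are hypothetical rather than part of the product.

```javascript
// Band names and values copied from the comparisonReturn example.
var RESULT_BANDS = [
  { name: "exact", value: 0 },
  { name: "onetypo", value: 1 },
  { name: "twotypos", value: 2 },
  { name: "threetypos", value: 3 }
];

// Hypothetical numeric comparison: Levenshtein distance, counting
// single-character insertions, deletions and substitutions.
function editDistance(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed band semantics: the first band whose value is not exceeded.
// Returns null when the distance is larger than every band value.
function bandFor(distance) {
  for (var i = 0; i < RESULT_BANDS.length; i++) {
    if (distance <= RESULT_BANDS[i].value) return RESULT_BANDS[i].name;
  }
  return null;
}

console.log(bandFor(editDistance("Smith", "Smyth"))); // onetypo
```

Note that under this plain Levenshtein assumption a swapped pair of characters counts as two edits, which is one reason a dedicated transposition comparison can be useful.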
Complete Example
The following example files may be packaged in a JAR file and used to add a custom 'Character Transposition Match' comparison to the match library. The Character Transposition Match comparison matches strings where character transpositions have occurred. For example, when comparing the values 'Michael' and 'Micheal', a single transposition is counted, so the two values match if the Maximum allowed transpositions option is set to 1 or higher:
Example 2-1 matchlibrary.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Custom Match Library Extension
  Copyright 2008 Oracle Ltd. All rights reserved.
-->
<matchLibrary>
  <identifierComparison>
    <ident>dnm:string</ident>
    <gadget>dn:characterTranspositionMatch</gadget>
  </identifierComparison>
</matchLibrary>
Example 2-2 widgets.xml
<?xml version="1.0" encoding="UTF-8"?>
<widgets>
  <comment>Oracle Match example script widgets</comment>
  <copyright>Copyright 2008 Oracle Ltd. All rights reserved.</copyright>
  <widget id="dn:characterTranspositionMatch" class="com.datanomic.director.match.library.util.JavaScriptGadget">
    <guidata>
      <label>%characterTranspositionMatch.gadget</label>
      <group>compare</group>
      <icon>script</icon>
    </guidata>
    <!-- inputs -->
    <inputs>
      <input id="1" type="string" maxattributes="1">
        <guidata><label>label1</label></guidata>
      </input>
      <input id="2" type="string" maxattributes="1">
        <guidata><label>label1</label></guidata>
      </input>
    </inputs>
    <!-- outputs -->
    <outputs cardinality="1:1">
      <output id="1" type="string" name="result">
        <guidata><label>resultlabel</label></guidata>
      </output>
    </outputs>
    <properties>
      <property name="matchNoDataPairs" type="boolean" required="true">
        <guidata>
          <label>%characterTranspositionMatch.property.matchNoDataPairs.label</label>
        </guidata>
        <default>false</default>
      </property>
      <property name="ignoreCase" type="boolean" required="true">
        <guidata>
          <label>%characterTranspositionMatch.property.ignoreCase.label</label>
        </guidata>
        <default>true</default>
      </property>
      <property name="startsWith" type="boolean" required="true">
        <guidata>
          <label>%characterTranspositionMatch.property.startsWith.label</label>
        </guidata>
        <default>false</default>
      </property>
      <property name="maxAllowedTranspositions" type="number" required="true">
        <guidata>
          <label>%characterTranspositionMatch.property.maxAllowedTranspositions.label</label>
        </guidata>
        <default>1</default>
      </property>
    </properties>
    <parameters>
      <parameter name="script">
        <![CDATA[
          // treat a null value as an empty string
          function S(s) {
            return (s == null) ? "" : s;
          }

          function doit() {
            // no data pairs
            if (S(input1) == "" || S(input2) == "") {
              if (matchNoDataPairs)
                output1 = "T";
              else
                output1 = "F";
              return;
            }
            // unless 'starts with' is set, the lengths must be equal
            if (!startsWith) {
              if (input1.length != input2.length) {
                output1 = "F";
                return;
              }
            }
            var transpositions = 0;
            var longword = input1.length > input2.length ? input1 : input2;
            var shortword = input1.length > input2.length ? input2 : input1;
            if (ignoreCase) {
              // convert to uppercase
              longword = longword.toUpperCase();
              shortword = shortword.toUpperCase();
            }
            for (var i = 0; i < shortword.length; i++) {
              if (shortword[i] != longword[i]) {
                // are we at the end of the string?
                if (i == shortword.length - 1) {
                  output1 = "F";
                  return;
                }
                // not a transposition match?
                if (shortword[i] != longword[i + 1]) {
                  output1 = "F";
                  return;
                }
                // compare the next character
                if (shortword[i + 1] != longword[i]) {
                  output1 = "F";
                  return;
                }
                transpositions++;
                // too many transpositions?
                if (transpositions > maxAllowedTranspositions) {
                  output1 = "F";
                  return;
                }
                // skip over the characters
                i++;
              }
            }
            output1 = "T";
          }
        ]]>
      </parameter>
      <parameter name="function">doit</parameter>
    </parameters>
  </widget>
</widgets>
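The script logic in widgets.xml can be tried outside the match library with a small standalone harness. The sketch below restates the same checks as an ordinary function. The name `transpositionMatch` and the `options` object are hypothetical stand-ins for the widget's inputs and properties, and `charAt` is used instead of index access purely as a portability choice.

```javascript
// Hypothetical standalone version of the comparison script.
// a, b: the two input values; options: the four widget properties.
function transpositionMatch(a, b, options) {
  var s1 = (a == null) ? "" : a;
  var s2 = (b == null) ? "" : b;

  // No data pairs: the result depends on the matchNoDataPairs option.
  if (s1 === "" || s2 === "") {
    return options.matchNoDataPairs ? "T" : "F";
  }
  // Unless startsWith is set, both values must have the same length.
  if (!options.startsWith && s1.length !== s2.length) {
    return "F";
  }

  var longword = s1.length > s2.length ? s1 : s2;
  var shortword = s1.length > s2.length ? s2 : s1;
  if (options.ignoreCase) {
    longword = longword.toUpperCase();
    shortword = shortword.toUpperCase();
  }

  var transpositions = 0;
  for (var i = 0; i < shortword.length; i++) {
    if (shortword.charAt(i) !== longword.charAt(i)) {
      // A mismatch on the last character cannot be a transposition.
      if (i === shortword.length - 1) return "F";
      // Both characters of the pair must be swapped.
      if (shortword.charAt(i) !== longword.charAt(i + 1)) return "F";
      if (shortword.charAt(i + 1) !== longword.charAt(i)) return "F";
      transpositions++;
      if (transpositions > options.maxAllowedTranspositions) return "F";
      i++; // skip over the second character of the swapped pair
    }
  }
  return "T";
}

// Option values mirror the property defaults declared in widgets.xml.
var defaults = {
  matchNoDataPairs: false,
  ignoreCase: true,
  startsWith: false,
  maxAllowedTranspositions: 1
};

console.log(transpositionMatch("Michael", "Micheal", defaults)); // T
```

With `maxAllowedTranspositions` set to 0, the same pair returns "F", which matches the behavior described for the Maximum allowed transpositions option.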
Example 2-3 matchlibrary.properties
[This file was not required in this case as the comparison does not support result bands, and does not require new identifiers.]
Example 2-4 widgets.properties
characterTranspositionMatch.gadget = Character Transposition Match
characterTranspositionMatch.property.matchNoDataPairs.label = Match No Data pairs?
characterTranspositionMatch.property.ignoreCase.label = Ignore case?
characterTranspositionMatch.property.startsWith.label = Starts with?
characterTranspositionMatch.property.maxAllowedTranspositions.label = Maximum allowed transpositions
Example 2-5 version.properties
name=Character Transposition Match
version=v8.1.3.(175)
title=Character Transposition Match
type=GADGET