xf:normalize-unicode

NOTE: This function is being removed because it is no longer supported. See Radar bug 26685 and http://radar.beasys.com/netui/showbug.jsp?bugid=34117

Normalizes the value of $string-var (a string) according to the normalization form specified by $optional-string-var.

Normalization is the process of removing alternate representations of equivalent sequences from text, so that the data is converted into a form that can be binary-compared for equivalence.
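To make the binary-comparison point concrete, here is a minimal sketch that uses Python's standard unicodedata module instead of xf:normalize-unicode. It illustrates general Unicode behavior only and makes no claim about this product's implementation; the sample strings are chosen for illustration.

```python
import unicodedata

# Two representations of the same text "é": one precomposed code point,
# and the letter "e" followed by a combining acute accent.
composed = "\u00e9"
decomposed = "e\u0301"

# The raw strings differ, so a binary comparison fails.
raw_equal = composed == decomposed

# After normalizing both to the same form (NFC here), they compare equal.
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))

print(raw_equal, nfc_equal)  # prints: False True
```

Any of the four Unicode forms would work for the comparison, provided both sides use the same one.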

The following argument table shows the valid strings for $optional-string-var and the normalization that occurs.

If this function is invoked with the normalize-unicode($string-var) signature, then the Unicode Normalization Form C (NFC) is used for normalization.

If $optional-string-var is equal to a zero-length string ("") then no normalization is performed and $string-var is returned.

If $optional-string-var is equal to W3C then the TransformException exception is raised with the SYS_NOTIMPLEMENTED fault code.

TODO: Update after Radar bug: http://radar.beasys.com/netui/showbug.jsp?bugid=29 100 is fixed.

If $optional-string-var is not equal to one of the following strings: NFC, NFD, NFKC, NFKD, or W3C then the TransformException exception is raised with the RT_NORMAL_FORM fault code. In the mapper the following error message is displayed:

Error occurred while executing XQuery: in operator normalize-unicode12 of class com.xqrl.runtime.strings.StringNormalizeUnicode: unsupported normal form: NNN 

If the value of $string-var or $optional-string-var is the empty sequence, the empty sequence is returned. The empty sequence is a sequence containing zero items (), which is similar to null in SQL.
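The rules above can be summarized as a small model. The following Python sketch is an assumption-laden illustration, not the product's code: the normalize_unicode function and the TransformException class are hypothetical stand-ins, the empty sequence is modeled as None, and Python's unicodedata module supplies the actual normalization.

```python
import unicodedata


class TransformException(Exception):
    """Hypothetical stand-in for the exception named in the text."""

    def __init__(self, fault_code, message):
        super().__init__(message)
        self.fault_code = fault_code


def normalize_unicode(string_var, form="NFC"):
    """Model of the documented rules; omitting `form` selects NFC."""
    # The empty sequence (modeled as None) in either argument
    # yields the empty sequence.
    if string_var is None or form is None:
        return None
    # A zero-length form string means no normalization is performed.
    if form == "":
        return string_var
    # The form name may be given in any case.
    name = form.upper()
    if name == "W3C":
        raise TransformException("SYS_NOTIMPLEMENTED",
                                 "W3C form is not implemented")
    if name not in ("NFC", "NFD", "NFKC", "NFKD"):
        raise TransformException("RT_NORMAL_FORM",
                                 "unsupported normal form: " + form)
    return unicodedata.normalize(name, string_var)
```

For instance, under this model normalize_unicode("e\u0301") returns the single code point U+00E9, while normalize_unicode("e\u0301", "") returns its input unchanged.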

Signatures

xf:normalize-unicode(xs:string? $string-var) -> xs:string?

xf:normalize-unicode(xs:string? $string-var, xs:string $optional-string-var) -> xs:string?

Arguments

Data Type     Argument                Description

xs:string?    $string-var             Represents the string to normalize.

xs:string?    $optional-string-var    Defines the normalization form used. Valid values:

                                      NFC     The Unicode Normalization Form C is used.
                                      NFD     The Unicode Normalization Form D is used.
                                      NFKC    The Unicode Normalization Form KC is used.
                                      NFKD    The Unicode Normalization Form KD is used.
                                      W3C     Not implemented. The TransformException exception is raised with the SYS_NOTIMPLEMENTED fault code.

                                      Note: This string can be in any case. For example, nfc, Nfc, and nFC are all legal values.

TODO BETA REFRESH: Add if implemented: The fully normalized form is returned. To learn more see Character Model for the World Wide Web 1.0.
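To show how the four Unicode forms in the table differ, the following sketch applies each one to the same input using Python's standard unicodedata module. This is an illustration of the Unicode forms, not of xf:normalize-unicode itself, and the sample string (a "fi" ligature followed by a precomposed "é") is an assumption chosen because it separates canonical from compatibility normalization.

```python
import unicodedata

# "fi" ligature (U+FB01) followed by precomposed "é" (U+00E9).
sample = "\ufb01\u00e9"

# Apply each of the four Unicode normalization forms to the same input.
results = {form: unicodedata.normalize(form, sample)
           for form in ("NFC", "NFD", "NFKC", "NFKD")}

# Print the code points so that combining marks are visible.
for form, value in results.items():
    print(form, [f"U+{ord(ch):04X}" for ch in value])
```

The canonical forms (NFC, NFD) keep the ligature and only compose or decompose the accented letter. The compatibility forms (NFKC, NFKD) also replace the ligature with the two letters "f" and "i".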

Returns

Returns the string that has been normalized.

Examples

Unicode Normalization Form KD

Invoking normalize-unicode("rÖle", "NFKD") returns the string ro?le as shown in the following example query:

<result>{xf:normalize-unicode("rÖle", "NFKD")}</result> 

The preceding query generates the following result:

<result>ro?le</result> 

No Normalization

Invoking normalize-unicode("rÖle", "") returns the string rÖle (no normalization occurs) as shown in the following example query:

<result>{xf:normalize-unicode("rÖle", "")}</result> 

The preceding query generates the following result:

<result>rÖle</result> 

Error—Not Valid Normalization Form

Invoking normalize-unicode("rÖle", "NNN") raises the TransformException exception with the RT_NORMAL_FORM fault code.

For example, the following query:

<result>{xf:normalize-unicode("rÖle", "NNN")}</result> 

Produces the following error: Error occurred while executing XQuery: in operator normalize-unicode12 of class com.xqrl.runtime.strings.StringNormalizeUnicode: unsupported normal form: NNN

XQuery Compliance

W3C fully normalized form is not supported (the W3C option).

Related Topics

W3C normalize-unicode function description.