CREATE_LANG_DATA
Use the DBMS_VECTOR_CHAIN.CREATE_LANG_DATA chunker helper procedure to load your own language data file into the database.
Purpose
To create custom language data for your chosen language (specified using the language chunking parameter).
A language data file contains language-specific abbreviation tokens. Supplying this data helps the chunker determine the sentence boundaries of chunks accurately, using knowledge of the input language's end-of-sentence (EOS) punctuation, abbreviations, and contextual rules.
Usage Notes
- All supported languages are distributed with default language-specific abbreviation dictionaries. You can create language data based on the abbreviation tokens loaded in schema.table.column, using a user-specified language data name (PREFERENCE_NAME).
- After loading your language data, you can use language-specific chunking by specifying the language chunking parameter with VECTOR_CHUNKS or UTL_TO_CHUNKS.
- You can query these data dictionary views to access existing language data:
  - ALL_VECTOR_LANG displays all available language data.
  - USER_VECTOR_LANG displays language data from the schema of the current user.
  - ALL_VECTOR_ABBREV_TOKENS displays abbreviation tokens from all available language data.
  - USER_VECTOR_ABBREV_TOKENS displays abbreviation tokens from the language data owned by the current user.
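As a minimal sketch, the views listed above could be queried as follows to check which language data already exists. SELECT * is used because the column names of these views are not listed in this section.

```sql
-- Language data in the current user's schema
SELECT * FROM USER_VECTOR_LANG;

-- Abbreviation tokens from the language data owned by the current user
SELECT * FROM USER_VECTOR_ABBREV_TOKENS;

-- Language data across all schemas visible to the current user
SELECT * FROM ALL_VECTOR_LANG;
```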
Syntax
DBMS_VECTOR_CHAIN.CREATE_LANG_DATA (
    PARAMS    IN JSON default NULL
);
PARAMS
{
    table_name,
    column_name,
    language,
    preference_name
}
Table 12-19 Parameter Details
Parameter | Description | Required | Default Value |
---|---|---|---|
table_name | Name of the table (along with the optional table owner) in which you want to load the language data | Yes | No value |
column_name | Column name in the language data table in which you want to load the language data | Yes | No value |
language | Any supported language name, as listed in Supported Languages and Data File Locations | Yes | No value |
preference_name | User-specified preference name for this language data | Yes | No value |
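The table and column named in table_name and column_name hold the abbreviation tokens that the procedure reads. The following is a minimal sketch of preparing such a table. The NVARCHAR2(64) column type and the sample tokens are illustrative assumptions and are not taken from this section. The names eos_data_1 and token match the example in this topic.

```sql
-- Assumed layout: one abbreviation token per row in a single text column
CREATE TABLE eos_data_1 (token NVARCHAR2(64));

-- Illustrative Indonesian abbreviations that end in a period
-- but should not be treated as the end of a sentence
INSERT INTO eos_data_1 VALUES ('dll.');
INSERT INTO eos_data_1 VALUES ('dst.');
INSERT INTO eos_data_1 VALUES ('Prof.');

COMMIT;
```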
Example
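declare
  -- JSON parameters: source table and column, language, and preference name
  params CLOB := '{"table_name"      : "eos_data_1",
                   "column_name"     : "token",
                   "language"        : "indonesian",
                   "preference_name" : "my_lang_1"}';
begin
  -- Convert the CLOB to JSON and create the language data
  DBMS_VECTOR_CHAIN.CREATE_LANG_DATA(
    JSON (params));
end;
/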
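Once the language data exists, language-specific chunking can be requested through the language chunking parameter, as described in the usage notes. The following is a sketch only: the table docs_tab and its text column are hypothetical, and the chunking mode, size, and split settings are arbitrary choices for illustration.

```sql
-- Chunk each document by words, splitting at sentence boundaries,
-- with the language set to the one used when creating the language data
SELECT C.*
FROM docs_tab D,
     VECTOR_CHUNKS(D.text
                   BY words
                   MAX 50
                   OVERLAP 0
                   SPLIT BY sentence
                   LANGUAGE indonesian
                   NORMALIZE all) C;
```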
End-to-end example:
To run an end-to-end example scenario using this procedure, see Create and Use Custom Language Data.