UTL_TO_TEXT
Use the DBMS_VECTOR_CHAIN.UTL_TO_TEXT chainable utility function to convert an input document (for example, PDF, DOC, JSON, XML, or HTML) to plain text.
Purpose
To perform a file-to-text transformation by using the Oracle Text component (CONTEXT) of Oracle Database.
Syntax
DBMS_VECTOR_CHAIN.UTL_TO_TEXT (
DATA IN CLOB | BLOB,
PARAMS IN JSON default NULL
) return CLOB;
DATA
This function accepts the input data type as CLOB or BLOB. It can read documents from a remote location or from files stored locally in the database tables.
It returns a plain text version of the document as CLOB.
Oracle Text supports around 150 file types. For a complete list of all the supported document formats, see Oracle Text Reference.
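As a minimal sketch of the locally stored case, the following assumes a hypothetical table documentation_tab with a BLOB column and a hypothetical directory object VEC_DUMP that contains a file named my_document.pdf. None of these names come from this reference. Because PARAMS defaults to NULL, the second argument is omitted here.

```sql
-- Hypothetical table that holds the source documents as BLOBs.
CREATE TABLE documentation_tab (id NUMBER, data BLOB);

-- Load one file through a directory object.
-- VEC_DUMP and my_document.pdf are assumptions for illustration only.
INSERT INTO documentation_tab
VALUES (1, TO_BLOB(BFILENAME('VEC_DUMP', 'my_document.pdf')));
COMMIT;

-- PARAMS is not passed, so it takes its default value of NULL.
SELECT DBMS_VECTOR_CHAIN.UTL_TO_TEXT(dt.data) AS doc_text
FROM documentation_tab dt
WHERE dt.id = 1;
```

The result column is a CLOB, so client tools may truncate it on display unless their long or LOB display width is increased.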
PARAMS
Specify the following input parameters in JSON format:
{
"plaintext" : "true or false",
"charset" : "UTF8"
}
Table 12-31 Parameter Details
Parameter | Description |
---|---|
plaintext | Plain text output. The default value for this parameter is true, that is, the document is returned as plain text. If you do not want to return the document as plain text, then set this parameter to false. |
charset | Character set encoding. Currently, only UTF8 is supported. |
Example
select DBMS_VECTOR_CHAIN.UTL_TO_TEXT (
t.blobdata,
json('{
"plaintext": "true",
"charset" : "UTF8"
}')
) from tab t;
End-to-end example:
To run an end-to-end example scenario using this function, see Convert File to Text to Chunks to Embeddings Within Oracle Database.
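Because the function is chainable, its CLOB result can be passed directly to another DBMS_VECTOR_CHAIN function. The sketch below is a hedged illustration that feeds the extracted text to UTL_TO_CHUNKS with default chunking parameters. The table documentation_tab and its BLOB column data are hypothetical names, and the referenced end-to-end topic remains the authoritative walkthrough.

```sql
-- documentation_tab (id NUMBER, data BLOB) is a hypothetical table.
-- UTL_TO_TEXT converts each BLOB to a CLOB, and UTL_TO_CHUNKS splits
-- that text into chunks, returned here as one row per chunk.
SELECT ct.*
FROM documentation_tab dt,
     DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
       DBMS_VECTOR_CHAIN.UTL_TO_TEXT(dt.data)
     ) ct;
```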