Overview of Document Classification and Key Value Extraction

In Oracle Cloud Infrastructure (OCI), Document Understanding provides pretrained AI models that can extract text, tables, and other key data from document files. You perform document classification or key value extraction on a document, then use that extracted data as part of your analysis in Oracle Analytics.

Document Understanding also lets you create custom models for key value extraction and document classification.

In Oracle Analytics, you use data flows to apply the Document Understanding AI models to your data.

Oracle Analytics supports several pretrained and custom AI models available from Document Understanding:
  • Pretrained Models Supported in Oracle Analytics
    • Document Classification
    • Key Value Extraction (for receipts, invoices, driver IDs, and passports)
  • Custom Models Supported in Oracle Analytics
    • Custom Document Classification
    • Custom Key Value Extraction

You must set up and build custom models in the OCI Console before you can use them in Oracle Analytics. First, use OCI Data Labeling to create a labeled dataset for training the model. Then, build your custom model. See OCI Document Understanding - Custom Models.

Example Output From a Document Classification Model

In this example, a data flow applies a pretrained document classification model to documents in JPG format to predict whether they are receipts, and outputs the analysis results to a dataset. The dataset includes a RECEIPT value for "Document Type", and a "Confidence" prediction level for each document.


Description of the illustration oci_du_files13.png
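
As a rough illustration of how the output dataset described above might be used downstream, the following Python sketch keeps only the documents classified as RECEIPT with a high confidence. The "Document Type" and "Confidence" columns come from the example. The file names, the INVOICE label, the 0 to 1 confidence scale, the 0.9 threshold, and the helper name confident_receipts are assumptions made for illustration, not part of Oracle Analytics or Document Understanding.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedDocument:
    file_name: str       # hypothetical column identifying the source file
    document_type: str   # the "Document Type" column from the example
    confidence: float    # the "Confidence" column; a 0-1 scale is assumed


def confident_receipts(rows, threshold=0.9):
    """Return rows classified as RECEIPT at or above the given confidence."""
    return [
        row for row in rows
        if row.document_type == "RECEIPT" and row.confidence >= threshold
    ]


# Made-up rows standing in for the dataset that the data flow writes.
rows = [
    ClassifiedDocument("scan_001.jpg", "RECEIPT", 0.98),
    ClassifiedDocument("scan_002.jpg", "RECEIPT", 0.61),
    ClassifiedDocument("scan_003.jpg", "INVOICE", 0.95),
]

for row in confident_receipts(rows):
    print(row.file_name, row.document_type, row.confidence)
```

A suitable threshold depends on how costly a misclassified document is for your analysis. In Oracle Analytics itself, you could apply a comparable filter on the "Confidence" column in a data flow or workbook.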

Before you start: