Extract Content from an Invoice PDF Document with a Document Understanding Action
You can use AI to extract information from invoices, receipts, passports, and drivers licenses with a document understanding action. This section provides a high-level use case that shows how to extract information from an invoice in PDF format and convert it to JSON format in Oracle Integration.
Ensure that you have completed all prerequisites for using the document understanding action. See Extract Document Information with a Document Understanding Action.
- Create an application integration.
- Drag a REST Adapter into the integration canvas.
- Configure the REST Adapter.
- On the Basic Info page, enter a name.
- On the Resource Configuration page, specify the necessary information. For example:
- Enter a relative resource URL: /process.
- Select the POST action.
- Select Configure a request payload for this endpoint.
- On the Configure Request page, select Binary as the request payload format.
- Complete the wizard.
- Click Actions and drag the Document Understanding action into the integration canvas.
- Enter the following information in the wizard.
- Enter a name and optional description.
- From the Compartment Name list, select the compartment in which your instance is running.
- From the Document Type list, select Invoice.
- Open the mapper. The invoice PDF document must be sent in base-64 format.
- Expand the target Document element.
- Right-click Data, and select Create target node.
- In the target section, click Functions.
- In the Expression Builder, click Design View.
- In the Functions section, expand Advanced, and drag encodeReferenceToBase64 into the Expression Builder. This step is required for extracting data from all supported document types (invoice, receipt, drivers license, and passport).
- Map Stream Reference from the Sources section to the encodeReferenceToBase64() function in the Expression Builder.
- Return to the integration canvas. This action validates your expression.
- Assign a business identifier.
- Save and activate the integration.
- From the Actions menu, select Run.
- In the Body tab of the Request section, click File, then Browse.
- Upload the PDF file of your invoice.
- Click Run.
- When processing completes, expand the invoke connection in the activity stream.
- Scroll through the payload and note that the fields from the invoice PDF document have been converted to JSON format.
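To call the activated integration from a client instead of the Run page, the request side of the steps above can be sketched in Python. The sketch builds a POST request that carries the PDF as a binary body, and separately shows base-64 encoding of the same bytes, which is conceptually what encodeReferenceToBase64 does inside the mapper. The endpoint URL, the application/octet-stream media type, and the function names are assumptions for illustration, and authentication is omitted. Use the endpoint URL and media type that apply to your own integration.

```python
import base64
import urllib.request

# Placeholder URL (assumption). Replace it with the endpoint of your
# activated integration; only the /process suffix comes from the steps above.
ENDPOINT_URL = "https://oic.example.com/your-integration-path/process"


def build_invoice_request(pdf_bytes: bytes, url: str = ENDPOINT_URL) -> urllib.request.Request:
    """Build, but do not send, a POST request with the PDF as a binary body."""
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        # Assumed media type for a Binary request payload.
        headers={"Content-Type": "application/octet-stream"},
    )


def to_base64(pdf_bytes: bytes) -> str:
    """Return the base-64 text form of the raw PDF bytes."""
    return base64.b64encode(pdf_bytes).decode("ascii")


if __name__ == "__main__":
    sample = b"%PDF-1.4 sample bytes"  # stand-in for a real invoice file
    request = build_invoice_request(sample)
    print(request.get_method(), request.full_url)
    print(to_base64(sample))
```

The request is only constructed here. Sending it with urllib.request.urlopen also requires the credentials that your instance expects.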
A log action was not added to this integration. However, adding one to the end of the integration also lets you see the fields that the Document Understanding API returns as output. An invoice returns its own specific set of fields. The returned fields differ for each document type that you select in the wizard (receipt, drivers license, or passport) and are visible in the log action. See Analyzing with the API.
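Once the JSON output is available, a client could flatten it into name and value pairs. The following Python sketch assumes a payload shape of pages containing documentFields, each with a fieldLabel name and a fieldValue text. That shape, the sample field names and values, and the collect_fields helper are all assumptions made for illustration. Check the actual payload in the activity stream or log action before relying on any of them.

```python
import json
from typing import Any, Dict, Optional

# Made-up sample for illustration only; the real payload may be shaped differently.
SAMPLE_PAYLOAD = """
{"pages": [{"documentFields": [
  {"fieldLabel": {"name": "InvoiceId"}, "fieldValue": {"text": "INV-001"}},
  {"fieldLabel": {"name": "InvoiceTotal"}, "fieldValue": {"text": "125.00"}}
]}]}
"""


def collect_fields(payload: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Flatten the assumed pages/documentFields structure into {name: text}."""
    fields: Dict[str, Optional[str]] = {}
    for page in payload.get("pages", []):
        for field in page.get("documentFields", []):
            name = (field.get("fieldLabel") or {}).get("name")
            text = (field.get("fieldValue") or {}).get("text")
            if name is not None:  # skip entries that carry no label
                fields[name] = text
    return fields


if __name__ == "__main__":
    for name, text in collect_fields(json.loads(SAMPLE_PAYLOAD)).items():
        print(f"{name}: {text}")
```

Missing keys are tolerated rather than raising errors, because the set of returned fields differs by document type.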