Query External Data with Autonomous AI Database
Describes packages and tools to query and validate data with Autonomous AI Database.
External data is not managed by the database; however, you can use DBMS_CLOUD procedures to query your external data. Although queries on external data will not be as fast as queries on database tables, you can use this approach to quickly start running queries on your external source files and external data. Depending on the type of external table, you can validate external data using the DBMS_CLOUD validation procedures. The data validation procedures let you validate the source files for an external table so that you can identify problems and either correct the data in the external table or exclude invalid data before you use the data.
Note: If you are not using ADMIN user, ensure the user has the necessary privileges for the operations the user needs to perform. See Manage User Privileges on Autonomous AI Database - Connecting with a Client Tool for more information.
-
To query data in files in the Cloud, you need to first store your object storage credentials in your Autonomous AI Database, and then create an external table using the PL/SQL procedure
DBMS_CLOUD.CREATE_EXTERNAL_TABLE. -
Query JSON Data in Object Storage using External Tables
With Autonomous AI Database, you can access JSON documents stored in Oracle Cloud Infrastructure Object Storage using external tables.
-
Query External Data with ORC, Parquet, or Avro Source Files Autonomous AI Database makes it easy to access ORC, Parquet, or Avro data stored in object store using external tables. ORC, Parquet, and Avro sources have metadata embedded in them and the
DBMS_CLOUD.CREATE_EXTERNAL_TABLEprocedure can utilize this metadata to simplify the creation of external tables. -
Query External Tables with Implicit Partitioning
On Autonomous AI Database you can create implicit partitioned external tables from Hive style partitioned data or from simple folder partitioned data stored on your Cloud Object Store.
-
Query External Tables with Partitioning Specified in Source Files If you want to query multiple data files in the Object Store as a single external table and the files can be represented as multiple logical partitions, it is highly recommended to use an external partitioned table. Using an external partitioned table preserves the logical partitioning of your data files for query access.
-
Query External Partitioned Data (with Partitioning Clause)
If you want to query multiple data files in the Object Store as a single external table and the files can be represented as multiple logical partitions, then it is highly recommended to use an external partitioned table. Using an external partitioned table preserves the logical partitioning of your data files for query access. Use the procedure
DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLEto create an external partitioned table. -
Query Data that Uses Table Hyperlink Source
You can create an external table over an Autonomous AI Database Table Hyperlink.
-
Query Hybrid Partitioned Data If you want to query internal data and multiple data files in the Object Store as single logical table you can use a hybrid partitioned table to represent the data as single object. Use the procedure
DBMS_CLOUD.CREATE_HYBRID_PART_TABLEto create a hybrid partitioned table. -
Query External Data Pump Dump Files
You can also query Oracle Data Pump dump files in the Cloud by creating an external table using
DBMS_CLOUD.CREATE_EXTERNAL_TABLE. -
Query Big Data Service Hadoop (HDFS) Data from Autonomous AI Database You can create database links to Oracle Big Data Service from Autonomous AI Database.
-
Query External Data with Data Catalog
Oracle Cloud Infrastructure Data Catalog is the metadata management service for Oracle Cloud that helps you discover data and support data governance. It provides an inventory of assets, a business glossary, and a common metastore for data lakes.
-
Query External Data with AWS Glue Data Catalog
Autonomous AI Database supports a system for synchronizing with an Amazon AWS Glue Data Catalog instance.
-
Autonomous AI Database supports querying Apache Iceberg tables.
-
To validate any external table, you can use the procedure
DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE. -
Validate External Partitioned Data
To validate an external partitioned table, you can use the procedure
DBMS_CLOUD.VALIDATE_EXTERNAL_PART_TABLE. This procedure includes a parameter that lets you specify a specific partition to validate. -
Validate Hybrid Partitioned Data To validate a hybrid partitioned table, you can use the procedure
DBMS_CLOUD.VALIDATE_HYBRID_PART_TABLE. This procedure includes a parameter that lets you specify a specific partition to validate. -
To validate an external table, use the procedures
DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE,DBMS_CLOUD.VALIDATE_EXTERNAL_PART_TABLE, andDBMS_CLOUD.VALIDATE_HYBRID_PART_TABLE.