2 Installation
Oracle Big Data SQL requires installation of components on the Hadoop system where the data resides and on the Oracle Database server that queries the data.
The Oracle Big Data SQL architecture consists of an installation on an Oracle Database system (single node or RAC) that works in conjunction with a parallel installation on a Hadoop (or NoSQL) cluster. The two systems may be networked via either Ethernet or InfiniBand. Hadoop and Hive clients on the compute nodes of the Oracle Database system enable communication between the database and the Oracle Big Data SQL process (known as an Oracle Big Data SQL “cell”) that runs on each DataNode of the Hadoop cluster. Through this mechanism, Oracle Database can query data on the Hadoop cluster. In addition, an Oracle Big Data SQL Query Server can be deployed on an edge node of the cluster, where it also connects to the Oracle Big Data SQL cells. For details, see Work With Query Server.
Because data in the Hadoop HDFS file system and in Object Storage is stored in an undetermined format, SQL queries need constructs that parse and interpret the data so that it can be processed as rows and columns. For HDFS, Oracle Big Data SQL leverages available Hadoop constructs to accomplish this, notably InputFormat and SerDe Java classes, optionally through Hive metadata definitions. For object storage, Big Data SQL uses highly optimized C drivers to access data in text, Parquet, ORC, and Avro file formats. The Oracle Big Data SQL processing cells on the DataNodes are managed by YARN and are integrated into the Hadoop infrastructure. Three key features provided by the cells are Smart Scan, Storage Indexes, and Aggregation Offload. See About Smart Scan for Big Data Sources, About Storage Indexes, and About Aggregation Offload.
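The division of labor between an InputFormat and a SerDe can be pictured with a small sketch. The Python below is only a conceptual illustration, not Hadoop or Oracle Big Data SQL code: the function names `read_records`, `deserialize`, and `scan` are hypothetical, and the sample data assumes a newline-delimited, comma-separated text file.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-ins for the two roles described above.
# These are NOT Hadoop classes; they only illustrate the idea.


def read_records(raw: bytes, delimiter: bytes = b"\n") -> Iterator[bytes]:
    """InputFormat-like role: split a byte stream into individual records."""
    for record in raw.split(delimiter):
        if record:  # skip the empty piece left after a trailing delimiter
            yield record


def deserialize(record: bytes, columns: List[str], sep: str = ",") -> Dict[str, str]:
    """SerDe-like role: turn one raw record into named column values."""
    fields = record.decode("utf-8").split(sep)
    return dict(zip(columns, fields))


def scan(raw: bytes, columns: List[str]) -> List[Dict[str, str]]:
    """Combine both roles to present raw bytes as rows and columns."""
    return [deserialize(record, columns) for record in read_records(raw)]


# Assumed sample data: two comma-separated records.
rows = scan(b"1,alice\n2,bob\n", ["id", "name"])
print(rows)
```

In this sketch the record reader knows nothing about columns and the deserializer knows nothing about file layout, which mirrors why the two constructs are configured separately.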
See the following resources for installation information:
- Introduction in Oracle Big Data SQL Installation Guide.
  This guide describes installation and configuration procedures for supported Hadoop system/Oracle Database server combinations.
- Oracle Big Data SQL Master Compatibility Matrix
  This is Document 2119369.1 in My Oracle Support. Check the matrix for up-to-date information on Big Data SQL compatibility with the following:
  - Oracle Engineered Systems.
  - Other systems.
  - Linux OS distributions and versions.
  - Hadoop distributions.
  - Oracle Database releases, including required patches.