6.3 Specify the Hive Databases to Synchronize With Query Server
Before you can synchronize Query Server with the desired Hive databases in the metastore, you have to specify the list of Hive databases.
- During installation, specify the sync_hive_db_list parameter in the bds-config.json configuration file.
- After installation, you can update the sync_hive_db_list configuration parameter in Cloudera Manager or Apache Ambari.
After installation, Query Server automatically creates schemas and external tables based on the list of Hive metastore databases that you specified. Every subsequent Query Server restart performs a delta synchronization.
6.3.1 Specify the Hive Databases in the bds-config.json Configuration File
You can provide the initial list of Hive databases to synchronize with Query Server as part of the installation process using the bds-config.json configuration file.
In the configuration file, include the sync_hive_db_list configuration parameter followed by a comma-separated list of the Hive databases. The following example specifies two Hive databases for the sync_hive_db_list configuration parameter: htdb0 and htdb1. Only these two databases will be synchronized with Query Server, even if the Hive metastore contains other databases.
"edgedb": {
"node": "<edgenode_host_name>",
"enabled": "true",
"sync_hive_db_list": "htdb0,htdb1"
. . .
}
To synchronize all Hive databases in the metastore with Query Server, use the "*" wildcard character as follows:
"edgedb": {
"node": "EdgeNode_Host_Name",
"enabled": "true",
"sync_hive_db_list": "*"
. . .
}
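The two value forms shown above (a comma-separated list and the "*" wildcard) can be told apart with a few lines of Python. This is only an illustrative sketch: parse_sync_hive_db_list is a hypothetical helper, not part of the product, and the trimming of surrounding whitespace is an assumption rather than documented Query Server behavior.

```python
def parse_sync_hive_db_list(value):
    """Interpret a sync_hive_db_list string (hypothetical helper).

    Returns "*" when the value is the wildcard, otherwise a list of
    database names in the order given.
    """
    # Split on commas and drop empty entries; stripping whitespace is an
    # assumption made for this sketch, not documented product behavior.
    names = [name.strip() for name in value.split(",") if name.strip()]
    if names == ["*"]:
        return "*"
    return names


# The two values used in the examples above.
print(parse_sync_hive_db_list("htdb0,htdb1"))
print(parse_sync_hive_db_list("*"))
```

A check like this can be useful before installation to confirm that the value names exactly the databases you expect to be synchronized.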
If the bds-config.json configuration file does not contain the sync_hive_db_list configuration parameter, then no synchronization will take place between the Hive databases and Query Server. In that case, you must specify the Hive databases using the sync_hive_db_list configuration parameter in Cloudera Manager or Apache Ambari.
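To spot a missing parameter before installing, a small script along the following lines might help. It is a sketch under stated assumptions: the examples above show only a fragment of bds-config.json, so the code assumes the "edgedb" object can be reached as a top-level key of the parsed document, which may not match the layout of your file. The function name sync_setting and the host name edge01.example.com are hypothetical.

```python
import json


def sync_setting(config_text):
    """Return the sync_hive_db_list value, or None if it is absent.

    Assumes "edgedb" is a top-level key of the parsed JSON; adjust the
    lookup if your bds-config.json nests it differently.
    """
    config = json.loads(config_text)
    return config.get("edgedb", {}).get("sync_hive_db_list")


# Hypothetical configuration text without the parameter.
sample = '{"edgedb": {"node": "edge01.example.com", "enabled": "true"}}'

if sync_setting(sample) is None:
    print("sync_hive_db_list is not set; no Hive databases will be synchronized")
```

To check a real file, read its contents and pass the text to sync_setting in place of the sample string.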
Note:
Query Server is not intended to store internal data in Oracle tables. Whenever Query Server is restarted, it is "reset" to its initial, clean state. This eliminates typical database maintenance such as storage management, database configuration, and so on. The goal of Query Server is to provide a SQL front-end for data in Hadoop, Object Store, Kafka, and NoSQL databases, not to serve as a general-purpose RDBMS.
6.3.2 Updating the Hive Databases With the sync_hive_db_list Configuration Parameter
You can update the list of Hive databases to synchronize with Query Server by using Cloudera Manager. Update the sync_hive_db_list configuration parameter in Cloudera Manager as follows: