Configuring a Federated Distributed Database
To deploy a federated distributed database configuration using existing databases, you define the database layout just as you would
for the user-defined data distribution method, using GDSCTL
commands.
The following is a high-level description of the process for creating and deploying a federated distributed database configuration.
-
Run the
GDSCTL CREATE SHARDCATALOG
command with theFOR_FEDERATED_DATABASE
option to create the configuration -
Add shard directors to the configuration.
-
Add a shardspace to the configuration. A shardspace is defined as an existing database and its replica.
-
Add a shard by adding the existing database to the shardspace, then run
DEPLOY
. -
Run
GDSCTL SYNC SCHEMA
to compare the schemas in the federated distributed database configuration and retrieve the common shared schemas. UseSYNC SCHEMA
to inspect and apply the DDLs. -
Use
SQL ALTER TABLE
on the shard catalog to convert tables containing the same data across the federated shards to duplicated tables. -
Prepare the shards in the federated distributed database configuration for multi-shard queries.
Create a Federated Distributed Database Configuration
The GDSCTL
command CREATE SHARDCATALOG
is
used to create the federated distributed database configuration, with the FOR_FEDERATED_DATABASE
option used instead of
selecting a data distribution method in the SHARDING
parameter.
The usage for the GDSCTL
command CREATE
SHARDCATALOG
in creating a federated distributed database is similar to how it is used to create the shard catalog with the user-defined data
distribution method, except that instead of specifying a method in the
SHARDING
parameter, you use the
FOR_FEDERATED_DATABASE
option. That is, the
FOR_FEDERATED_DATABASE
option is mutually exclusive with the
SHARDING
option.
CREATE SHARDCATALOG -DATABASE connect_identifier
[-USER username[/password]]
[-REGION region_name_list]
[-CONFIGNAME config_name]
[-AUTOVNCR ON/OFF]
[-FORCE]
[-SDB distributed_db_name]
[-SHARDSPACE shardspace_name_list]
-FOR_FEDERATED_DATABASE
The CREATE SHARDCATALOG
syntax statement above shows which
parameters are supported. The parameters not shown are not supported when used with the
FOR_FEDERATED_DATABASE
data distribution method, for example,
–AGENT_PASSWORD
, REPFACTOR
, and the Oracle Data
Guard protection mode PROTECTMODE
.
Note:
Only Oracle Data Guard replication is supported for a federated distributed database. Oracle Globally Distributed Database doesn't handle the creation and management of the Data Guard configuration, but
you can use Data Guard parameters with the ADD SHARD
command so
that you can add the primary and standby databases to see the status in
GDSCTL
.
See Also:
The GDSCTL create shardcatalog topic in Oracle Database Global Data Services Concepts and Administration Guide for usage notes and command options.Retrieve, Inspect, and Apply the DDLs
Run the GDSCTL SYNC SCHEMA
command in phases to create the
schema objects common to the existing databases in the shard catalog.
The GDSCTL SYNC SCHEMA
syntax shown here illustrates the
three phases of the operation.
sync[hronize] schema
[-schema [schemalist | all] [-retrieve_only] [-restart [-force]]
| -apply [-skip_first]
| -show [[-ddl ddlnum] [-count n] | [-failed_only]]]
SYNC SCHEMA
should be run in phases, as described here.
-
Retrieve Phase
Run
SYNC SCHEMA
with the-retrieve_only
option to inspect and verify the DDLs before they are run on the shard catalog.sync schema -schema schemalist –retrieve_only
When
SYNC SCHEMA
is run without-retrieve_only
, the DDL is retrieved and applied at the same time. -
Inspection Phase
You can examine the DDL statements and their processing status with the
-show
option. The-ddl ddlnum
option shows the specified DDL, and the-count n
option specifies the maximum number of entries to show.sync schema –show -ddl ddlnum -count n
Or you can use the
-failed_only
option to examine only the errored out statements.sync schema –show -failed_only
-
Apply Phase
In the final phase, you run the DDLs on the shard catalog to create the schemas and their objects.
sync schema –apply
If you get an error in the apply phase, there are a couple of ways to work around it:
-
If you can fix the cause of the error, fix and then retry
SYNC SCHEMA -apply
, which retries the failed DDL. -
If the DDL cannot be fixed or it is not required, you can run
SYNC SCHEMA –apply -skip_first
, which resumes the apply phase from the point of the DDL failure.
For security reasons, Oracle Globally Distributed Database doesn't offer a way to edit the DDLs.
-
-
Import Incremental Changes
If there are changes in the schema at a later point, the previous phases can be run again to import incremental changes. For example, when new objects are added, or a new column is added to a table, which will generate an
ALTER TABLE ADD
statement.
See Also:
The sync schema (synchronize schema) topic in
Oracle Database Global
Data Services Concepts and Administration
Guide for more SYNC
SCHEMA
usage notes and option details.
SYNC SCHEMA Operations for information about the tasks performed by SYNC SCHEMA
Convert Tables to Duplicated Tables
Use ALTER TABLE table_name externally
duplicated
to mark tables as duplicated in a federated distributed database configuration.
Any table created by SYNC SCHEMA
is considered by the
multi-shard query layer as an externally sharded table. If the table contains the
same data on all of the shards, you can alter the table to externally duplicated,
so that the multi-shard query retrieves the data from one shard only, even if it is a
query on a table with no filter predicates on ORA_SHARDSPACE_NAME
.
ALTER TABLE table_name [externally duplicated | externally sharded]
Prepare the Shards For Multi-Shard Queries
Create all shard users and use the
ORA_SHARDSPACE_NAME
pseudo-column to perform queries on specific
shards.
All Shard Users
Before running multi-shard queries from the shard catalog, you must create
all shard users and grant them access to the sharded and duplicated tables.
These users and their privileges should be created in the shard catalog under
shard DDL enabled
.
Create Shardspace-Specific Queries
A shardspace in a federated distributed database is a set consisting of a primary shard and zero or more standby shards. To filter
query results for a particular shard[space], a pseudo-column called
ORA_SHARDSPACE_NAME
is added to every externally
sharded
table. The value of this pseudo column in the tables is the name of
the shardspace.
Depending on the value of MULTISHARD_QUERY_DATA_CONSISTENCY
, the rows
can be fetched from the primary or from any of the standbys in the shardspace. To run a
multi-shard query on a given shard, you can filter the query with the predicate
ORA_SHARDSPACE_NAME = shardspace_name_shard_belongs_to
.
A query like SELECT CUST_NAME, CUST_ID FROM CUSTOMER
, where
the table CUSTOMER is marked as externally sharded
, runs on all of the
shards.
A query like SELECT CUST_NAME, CUST_ID FROM CUSTOMER WHERE
ora_shardspace_name = ‘EUROPE’
runs on the shards belonging to the
shardspace_name
Europe. Depending on the
MULTISHARD_QUERY_DATA_CONSISTENCY
parameter value, the query is run
on either the primary shard of the shardspace Europe or on its standbys.
You can join sharded tables from different shardspaces. For example, to find the customers from shardspace Europe with orders in shardspace NA, write a query similar to the following.
SELECT order_id, customer_name FROM customers c , orders o WHERE c.cust_id = o.cust_id and
c.ora_shardspace_name = ‘Europe’ and o.ora_shardspace_name = ‘NA’
Querying an externally duplicated
table, with or without
the ORA_SHARDSPACE_NAME
predicate, should go to only one of the
shardspaces. The MULTISHARD_QUERY_DATA_CONSISTENCY
parameter value
determines whether to query a primary shard in the shardspace or its replicas.