Federated Distributed Database Reference
SYNC SCHEMA Operations
DDL Synchronization
DDL synchronization is an operation that SYNC SCHEMA
runs
just after the deployment of the shards in a federated distributed database configuration.
The goal of this operation is to import the object definitions from all of the shards, compare the definitions across the shards, and generate DDLs for the objects that exist on all of the shards (common objects). Once the DDLs are run and the objects are created, you can reference these objects in multi-shard queries.
Import Users
A user or schema is a candidate for import by SYNC SCHEMA
if it exists on all of the shards and owns importable schema objects.
You can narrow the list of users to be imported by passing a list of users
in the -SCHEMA
parameter. For example,
gdsctl> sync schema -schema scott
gdsctl> sync schema -schema scott,myschema
For case-sensitive schemas use quoted identifiers.
gdsctl> sync schema -schema "O'Brien",scott
To include all non-Oracle schemas, use the value ALL
in the
SCHEMA
parameter.
gdsctl> sync schema -schema all
Before importing the users, SYNC SCHEMA
verifies that any
discovered users exist on all shards, and no user already exists on the shard catalog
with the same name. The users are then created on the shard catalog as local users and
they are locked. Because these are local users, they only share the same name with
shards and are essentially the same as any other user that may have the same name across
different databases. Note that these users are not able to login and issue queries
because they are not all shard users. To issue multi-shard queries, an all shard
user must be created.
Note:
Only users local to a PDB are imported. Common CDB users are not imported.Grant User Roles and Priviledges
For the imported users, SYNC SCHEMA
compares users'
privileges.
SYNC SCHEMA
grants only the privileges that are granted on
all of the shards (common grants). A user A who has a DBA role on
shard1, but does not have DBA role on shard2, is not granted the DBA role in
the shard catalog.
Import Object Definitions
The objects compared and imported by SYNC SCHEMA
to the
shard catalog are the objects that will be referenced in multi-shard queries or used by
multi-shard query processing.
These objects are:
- Tables
- Views and Materialized Views (exported as tables)
- Check Constraints
- Object Types
- Synonyms
Running SYNC SCHEMA
does not import objects related to
storage, or objects that have no impact on multi-shard query processing, such as
tablespaces, indexes, indextypes, directories, or zone maps.
Schema Object Comparison
The objects, from one shard to another, can have different definitions.
SYNC SCHEMA
compares the different definitions and creates a common
definition to enable multi-shard queries against imported objects.
SYNC SCHEMA
detects the objects' differences at two levels:
number of objects, and object definitions.
First, SYNC SCHEMA
considers the number of objects. It is
likely that, during an application upgrade, some objects are added to the schemas. Only
objects that are on all of the shards will be imported into the shard catalog.
Second, the object definitions from one shard to another can have different
attributes. For the objects that SYNC SCHEMA
imports, the following
differences are noted:
Differences in Tables
When comparing objects in a federated distributed database configuration, some differences in tables have an impact on multi-shard queries and some do not.
Column Differences
Only column differences have an impact on multi-shard queries. SYNC
SCHEMA
addresses only this difference.
-
The number of columns can be different.
-
The data type of a given column can be different.
-
The default value of a given column can be different.
-
The expression of a virtual column can be different
When a table has a different numbers of columns, SYNC
SCHEMA
will opt for the creation of a table that contains the union of all
of the columns. Taking the union of all of the columns, compared to just taking the
intersection, will spare you from re-writing multi-shard queries in case of an
incremental deploy, when the added shard has fewer columns than indicated in the shard
catalog.
When a column has different data types, SYNC SCHEMA
defines
it as the highest (largest) datatype.
When a column has different data types, and one of the columns is a user-defined object type, then that column is not imported into the shard catalog.
When a column has different default values, SYNC SCHEMA
sets NULL
as the default value.
Nested table columns are not imported into the shard catalog.
Example: a Customer table is defined on shard1 and shard2 as shown here.
On shard1:
Customer( Cust_id number, Name varchar(30),
Address varchar(50),Zip_code number)
On shard2:
Customer( Cust_id varchar(20), Name varchar(30),
Address varchar(50),Zip_code number,
Country_code number)
Note that the column Cust_id
is a number on shard1 and
a varchar(20) on shard2. Also, note that Country_code
exists on
shard2 but does not exist on shard1.
The Customer table created by SYNC SCHEMA
in the shard
catalog has all of the columns, including Country_code
, and the
Cust_id
type is varchar(20).
Customer( Cust_id varchar(20), Name varchar(30),
Address varchar(50),Zip_code number,
Country_code number)
SYNC SCHEMA
keeps track of these differences between
schemas in the shard catalog. A query issued on the catalog database that accesses
these heterogeneous columns is rewritten to address the differences before it is
sent to the shards. On the shard, if there is a data type mismatch, the data is CAST
into the "superior" data type as created on the catalog. If the column is missing on
the shard, the default value is returned as set on the catalog.
Partition Scheme Differences
Note that this difference has no impact on multi-shard queries, and is ignored.
- Partitioning column can be different.
- Partition type can be different.
- Number of partitions can be different.
Storage Attribute Differences
Note that this difference has no impact on multi-shard queries, and is ignored.
- Tablespaces, on which the table is created, are different.
- The encryption can be different.
- The
INMEMORY
attribute can be different.
Differences in Views
Views on shards are created and handled as tables in the shard catalog. The same restrictions that apply to tables also apply to views.
Troubleshooting a Federated Distributed Database
Solve common federated distributed database issues with these troubleshooting tips.
ORA-03851: Operation not supported in federated database
ORA-03701: Invalid parameter combination: federated database and ...
Some of the operations and command options that apply to a traditional distributed database are not applicable to a federated distributed database. This is because:
-
There is no concept of a chunk in a federated distributed database. Any chunk-related operation is invalid, for example
SPLIT CHUNK
andMOVE CHUNK
. -
The Data Guard broker configuration is not set up or managed by the system in federated distributed database, because the existing shards may already have been set up with their own high availability configurations. Operations such as
SET DATAGUARD_PROPERTY
orMODIFY SHARDSPACE
are not supported. -
The
CREATE SHARD
command is not supported.
ORA-03885: Some primary shards are undeployed or unavailable
The SYNC SCHEMA
operation requires that all primary
shards be available. Check the output of the CONFIG
SHARD
command, and check the status of all primary
shards. Fix any issues and retry the operations when the shards
become available.
ORA-03871: Some DDL statements are not applied to the catalog
The SYNC SCHEMA
operation cannot import
object definitions from the shards when some statements from the
previous issuance are still not applied on the shard catalog. Run
SYNC SCHEMA
with the
-apply
option to run these statements.
Handling Errors During Multi-Shard Queries
If a multi-shard query fails with this error due to a
mismatch of the object definition on the shard and the catalog, make
sure that the shard catalog has the latest schema changes imported.
Any time there are schema changes in the federated distributed database, you must run SYNC SCHEMA
to import any changes
in the schemas on the shards.
Note that subsequent runs of SYNC SCHEMA
will not drop and recreate the object, but will generate
ALTER
statements to incorporate the
definition changes. This ensures that if there are queries already
running during the SYNC SCHEMA
operation, they
won't fail with invalid object errors.
Handling Errors During DDL Processing Phase
If DDL fails on the shard catalog, the status of each DDL can
be examined with the SYNC SCHEMA
-show
option.
gdsctl> sync schema -show
Note: The SYNC SCHEMA -show
command is different
from the command SHOW DDL
. SHOW
DDL
lists DDL statements run by an all-shard user
that are first run on the catalog and then propagated to the shards,
whereas SYNC SCHEMA -show
DDL statements are
generated from the objects imported from shards.
By default, SYNC SCHEMA -show
lists a fixed number of
the latest DDLs. The -count
and
-ddl
options can be used to inspect
specific range of DDLs. For example,
gdsctl> sync schema -show -count 20
gdsctl> sync schema -show -count 20 -ddl 5
To check the complete DDL text and error message, if any, use the
-ddl
option.
gdsctl> sync schema -show -ddl 5
To list only the failed DDL statements, use the
-failed_only
option.
gdsctl> sync schema –failed_only
Based on the error message of the failed DDL, fix the cause of the error and perform the apply phase.
gdsctl> sync schema -apply
The SYNC SCHEMA
command also has a
-restart
option to perform the complete
operation from the beginning as if it were run for the first time.
This option will DROP
all existing schemas imported
during all previous runs of SYNC SCHEMA
and any
related metadata. Be aware that this will cause any running queries
on these objects to fail.
gdsctl> sync schema -restart