Creating Schema Objects
The following topics show you how to create the schema objects in your Oracle Globally Distributed Database.
Refer back to Schema Objects for conceptual information about these objects.
Create an All-Shards User
Local users that exist only in the shard catalog database do not have the privileges to create schema objects in the Oracle Globally Distributed Database. The first step in creating the distributed database schema is to create an all-shards user.
Create an all-shards user by connecting to the shard catalog database as a privileged user, enabling SHARD DDL, and running the CREATE USER command. When the all-shards user connects to the shard catalog database, SHARD DDL mode is enabled by default.
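For example, a minimal sketch of these steps might look like the following; the user name, password, and privileges granted here are placeholder assumptions, not required values.
-- Connect to the shard catalog as a privileged user, then:
ALTER SESSION ENABLE SHARD DDL;
CREATE USER app_shard_user IDENTIFIED BY app_shard_user_pwd;
GRANT CONNECT, RESOURCE TO app_shard_user;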
Note:
- Local users can create non-schema distributed database objects, such as tablespaces, directories, and contexts, if they enable SHARD DDL mode; however, they cannot create schema objects, such as tables, views, indexes, functions, procedures, and so on.
- Sharded objects cannot have any dependency on local objects. For example, you cannot create an all-shard view on a local table.
- You cannot grant SYS privileges to sharded users using sharded DDL. You must log in to each shard and grant the privilege to the account manually on that shard.
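For illustration only, such a manual grant might look like the following on one shard; the connect string, privilege, and grantee are placeholder assumptions, and the grant must be repeated on every shard.
-- Placeholder example: connect to the first shard as a privileged user.
CONNECT sys@shard1_host:1521/shard1_pdb AS SYSDBA
GRANT SYSBACKUP TO app_shard_user;
-- Repeat the same grant on each remaining shard.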
Creating a Sharded Table Family
Create a sharded table family with the SQL CREATE TABLE statement. You can specify parent-child relationships between tables using reference partitioning or equi-partitioning.
Use Reference Partitioning to Specify Parent-Child Relationships Between Tables
The recommended way to create a sharded table family is to specify parent-child relationships between tables using reference partitioning.
Partitioning by reference simplifies the syntax because the partitioning scheme is only specified for the root table. Also, partition management operations performed on the root table are automatically propagated to its descendants. For example, when adding a partition to the root table, a new partition is created on all of its descendants.
The appropriate CREATE TABLE statements for the Customers–Orders–LineItems schema using a system-managed data distribution methodology are shown below. The first statement creates the root table of the table family, Customers.
CREATE SHARDED TABLE Customers
( CustNo NUMBER NOT NULL
, Name VARCHAR2(50)
, Address VARCHAR2(250)
, CONSTRAINT RootPK PRIMARY KEY(CustNo)
)
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;
The following two statements create the Orders and LineItems tables, which are a child and grandchild of the Customers table.
CREATE SHARDED TABLE Orders
( OrderNo NUMBER NOT NULL
, CustNo NUMBER NOT NULL
, OrderDate DATE
, CONSTRAINT OrderPK PRIMARY KEY (CustNo, OrderNo)
, CONSTRAINT CustFK FOREIGN KEY (CustNo) REFERENCES Customers(CustNo)
)
PARTITION BY REFERENCE (CustFK)
;
CREATE SHARDED TABLE LineItems
( CustNo NUMBER NOT NULL
, LineNo NUMBER(2) NOT NULL
, OrderNo NUMBER(5) NOT NULL
, StockNo NUMBER(4)
, Quantity NUMBER(2)
, CONSTRAINT LinePK PRIMARY KEY (CustNo, OrderNo, LineNo)
, CONSTRAINT LineFK FOREIGN KEY (CustNo, OrderNo) REFERENCES Orders(CustNo, OrderNo)
)
PARTITION BY REFERENCE (LineFK)
;
In the example statements above, corresponding partitions of all tables in the family are stored in the same tablespace set, TS1. However, it is possible to specify separate tablespace sets for each table.
Note that in the example statements above, the partitioning column CustNo used as the sharding key is present in all three tables, even though reference partitioning, in general, allows a child table to be equi-partitioned with the parent table without duplicating the key columns in the child table. The reason is that reference partitioning requires a primary key in the parent table, because the primary key must be specified in the foreign key constraint of a child table to link the child to its parent. However, a primary key on a sharded table must be the same as, or contain, the sharding key. This makes it possible to enforce global uniqueness of a primary key without coordination with other shards, a critical requirement for linear scalability.
To summarize, the use of reference-partitioned tables in a distributed database requires adhering to the following rules:
- A primary key on a sharded table must either be the same as the sharding key, or contain the sharding key. This is required to enforce global uniqueness of a primary key without coordination with other shards.
- Reference partitioning requires a primary key in a parent table, because the primary key must be specified in the foreign key constraint of a child table to link the child to its parent. It is also possible to have a foreign key constraint when the parent table has just a UNIQUE constraint, but no PRIMARY KEY. The sharding key must also be NOT NULL. For example, to link the LineItems (child) table to the Orders (parent) table, you need a primary key in the Orders table. The second rule implies that the primary key in the Orders table contains the CustNo value. (This is an existing partitioning rule not specific to Oracle Globally Distributed Database.)
Use Equi-Partitioning to Specify Parent-Child Relationships Between Tables
In some cases it is impossible or undesirable to create the primary and foreign key constraints that are required for reference partitioning. In such cases, specifying parent-child relationships in a table family requires that all tables be explicitly equi-partitioned. Each child table is created with the PARENT clause in CREATE SHARDED TABLE, which contains the name of its parent. An example of the syntax is shown below.
CREATE SHARDED TABLE Customers
( CustNo NUMBER NOT NULL
, Name VARCHAR2(50)
, Address VARCHAR2(250)
, region VARCHAR2(20)
, class VARCHAR2(3)
, signup DATE
)
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;
CREATE SHARDED TABLE Orders
( OrderNo NUMBER
, CustNo NUMBER NOT NULL
, OrderDate DATE
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;
CREATE SHARDED TABLE LineItems
( LineNo NUMBER
, OrderNo NUMBER
, CustNo NUMBER NOT NULL
, StockNo NUMBER
, Quantity NUMBER
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;
Because the partitioning scheme is fully specified in all of the CREATE SHARDED TABLE statements, any table can be independently subpartitioned. This is not permitted with reference partitioning, where subpartitions can only be specified for the root table and the subpartitioning scheme is the same for all tables in a table family.
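For illustration, a hedged sketch of such independent subpartitioning follows; the range subpartitioning of Orders by OrderDate is an assumption added for this example and is not part of the original schema.
CREATE SHARDED TABLE Orders
( OrderNo NUMBER
, CustNo NUMBER NOT NULL
, OrderDate DATE
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustNo)
SUBPARTITION BY RANGE (OrderDate)
SUBPARTITION TEMPLATE
( SUBPARTITION p_older VALUES LESS THAN (DATE '2024-01-01')
, SUBPARTITION p_recent VALUES LESS THAN (MAXVALUE)
)
PARTITIONS AUTO
TABLESPACE SET ts1
;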
Note that this method only supports two-level table families, that is, all children must have the same parent and grandchildren cannot exist. This is not a limitation as long as the partitioning column from the parent table exists in all of the child tables.
See Also:
Oracle Database VLDB and Partitioning Guide for information about reference partitioning
Designing Schemas With Multiple Table Families
An Oracle Globally Distributed Database schema can have multiple table families. Data from all table families resides in the same chunks, and each chunk contains partitions from the different table families that share the same hash key range.
Note:
Multiple table families are supported in system-managed distributed databases only. Composite and user-defined distributed databases support only one table family.
To create a new table family, create a root sharded table and specify tablespace sets that are not used by existing table families. Each table family is identified by its root table. Tables in different table families should not be related to each other.
Each table family has its own sharding key definition, while the restriction requiring the same sharding key columns in child tables still holds within each table family. This means that all tables from different table families are sharded the same way with consistent hash into the same number of chunks, with each chunk containing data from all of the table families.
Design your table families so that queries between different table families are minimal and are only carried out on the sharding coordinator, because many such joins will affect performance.
The following example shows you how to create multiple table families using the PARENT clause with a system-managed sharding methodology (PARTITION BY CONSISTENT HASH).
CREATE SHARDED TABLE Customers <=== Table Family #1
( CustId NUMBER NOT NULL
, Name VARCHAR2(50)
, Address VARCHAR2(250)
, region VARCHAR2(20)
, class VARCHAR2(3)
, signup DATE
)
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET ts1
;
CREATE SHARDED TABLE Orders
( OrderNo NUMBER
, CustId NUMBER
, OrderDate DATE
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET ts1
;
CREATE SHARDED TABLE LineItems
( LineNo NUMBER
, OrderNo NUMBER
, CustId NUMBER
, StockNo NUMBER
, Quantity NUMBER
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET ts1
;
CREATE SHARDED TABLE Products <=== Table Family #2
( ProdId NUMBER NOT NULL,
CONSTRAINT pk_products PRIMARY KEY (ProdId)
)
PARTITION BY CONSISTENT HASH (ProdId)
PARTITIONS AUTO
TABLESPACE SET ts_2
;
Note:
ORA-3850 is raised if you attempt to use a tablespace set for a table family when that tablespace set is already in use by an existing table family.
Joins across table families may not be efficient, and if you have many such joins, or if they are performance-critical, you should use duplicated tables instead of multiple table families.
Associating Global Services With Multiple Table Families
Each table family should be associated with a different global service. Applications that use different table families each have their own connection pool and service, and use their own sharding key for routing to the correct shard.
When you create the first root table (that is, the first table family), all of the existing global services are automatically associated with it. You can use the GDSCTL MODIFY SERVICE command to change the services associated with a table family after more table families are created, as shown in this example.
GDSCTL> MODIFY SERVICE -GDSPOOL shdpool -TABLE_FAMILY sales.customer -SERVICE sales
Creating Sharded Tables
A sharded table is a table that is partitioned into smaller and more manageable pieces among multiple databases, called shards.
The following topics guide your decisions and provide instructions for creating sharded tables:
Tablespace Set Sizing
When you create a tablespace set on the shard catalog, you must make sure you have enough space for the tablespaces created on the shard catalog and on each of the shards.
This is especially important in a metered usage environment.
For example, with a shard catalog and three shards in the configuration, you issue the following statements.
ALTER SESSION ENABLE SHARD DDL;
CREATE TABLESPACE SET TSP_SET_1 IN SHARDSPACE SHSPC_1 USING TEMPLATE
(DATAFILE SIZE 100M AUTOEXTEND ON NEXT 1M MAXSIZE UNLIMITED);
For example, assuming a default of 120 chunks per shard, the command creates the following objects in these configurations:
- System sharding: 120 tablespaces (for 120 chunks on each shard) + 1 tablespace in the shard catalog = 120 + 1 tablespaces, each with an initial datafile size of 100M
- Raft replication: 360 tablespaces (for 360 chunks on each shard, because there are 3 chunks created for each replication unit) + 1 tablespace in the shard catalog = 360 + 1 tablespaces, each with an initial datafile size of 100M
If the required amount of storage is not planned for, DDL statements can fail, and recovering from such a failure can take significant effort.
To prevent this issue, you must set the database initialization parameter DB_FILES greater than or equal to the total number of chunks and/or tablespace sets required in the shard. A formula for calculating DB_FILES is given in Create the Shard Databases.
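For example, a hedged sketch of setting this parameter follows; the value 1024 is an illustrative assumption, and you should derive the actual value from the formula referenced above.
-- Illustrative value only; compute the real value from your chunk and
-- tablespace-set counts. A restart is required for DB_FILES to take effect.
ALTER SYSTEM SET DB_FILES=1024 SCOPE=SPFILE;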
Also, note that all tablespaces in a tablespace set are bigfile tablespaces. A bigfile tablespace is a tablespace with a single, but potentially very large (up to 4G blocks) data file. See Bigfile Tablespaces in Oracle Database Administrator’s Guide for details.
Sharded Tables for System-Managed Sharding
In a system-managed distributed database, data is automatically distributed across the shards using partitioning by consistent hash.
Before creating a sharded table, create a tablespace set with CREATE TABLESPACE SET to store the table partitions.
CREATE TABLESPACE SET ts1;
If you need to customize the tablespace attributes, add the USING TEMPLATE clause to CREATE TABLESPACE SET, as shown in this example.
CREATE TABLESPACE SET ts1
USING TEMPLATE
( DATAFILE SIZE 10M
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 256K
SEGMENT SPACE MANAGEMENT AUTO
ONLINE
)
;
You create a sharded table with CREATE SHARDED TABLE, horizontally partitioning the table across the shards based on the sharding key cust_id.
CREATE SHARDED TABLE customers
( cust_id NUMBER NOT NULL
, name VARCHAR2(50)
, address VARCHAR2(250)
, region VARCHAR2(20)
, class VARCHAR2(3)
, signup DATE
, CONSTRAINT cust_pk PRIMARY KEY(cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
TABLESPACE SET ts1
;
A system-managed sharded table is partitioned by consistent hash, by specifying PARTITION BY CONSISTENT HASH (primary_key_column). The PARTITIONS AUTO clause specifies that the number of partitions is automatically set to the number of tablespaces in the tablespace set ts1, and each partition is stored in a separate tablespace.
Sharded Tables for User-Defined Sharding
In a user-defined distributed database, you explicitly map data to individual shards. A sharded table in a user-defined distributed database can be partitioned by range or list.
You do not create tablespace sets for user-defined sharded tables; however, you must create each tablespace individually and explicitly associate it with a shardspace deployed in the distributed database configuration, as shown here.
CREATE TABLESPACE tbs1 IN SHARDSPACE west;
CREATE TABLESPACE tbs2 IN SHARDSPACE central;
CREATE TABLESPACE tbs3 IN SHARDSPACE east;
When you create the sharded table, you define the partitions with the ranges or lists of data to be stored in each tablespace, as shown in the following example.
CREATE SHARDED TABLE accounts
( id NUMBER
, account_number NUMBER
, customer_id NUMBER
, branch_id NUMBER
, state VARCHAR(2) NOT NULL
, status VARCHAR2(1)
)
PARTITION BY LIST (state)
( PARTITION p_west VALUES ('OR', 'WA') TABLESPACE tbs1
, PARTITION p_central VALUES ('SD', 'WI') TABLESPACE tbs2
, PARTITION p_east VALUES ('NY', 'VT', 'NJ') TABLESPACE tbs3
)
;
Sharded Tables for Composite Sharding
A distributed database using the composite sharding method allows you to partition subsets of data, corresponding to a range or list of key values, in a table partitioned by consistent hash.
With composite sharding, as with the other sharding methods, tablespaces are used to specify the mapping of partitions to shards. To partition subsets of data in a sharded table, a separate tablespace set must be created for each shardspace deployed in the distributed database configuration as shown in the following example.
CREATE TABLESPACE SET tbs1 IN SHARDSPACE shspace1;
CREATE TABLESPACE SET tbs2 IN SHARDSPACE shspace2;
The statement in the following example partitions a sharded table into two partition sets: gold and silver, based on class of service. Each partition set is stored in a separate tablespace. Then data in each partition set is further partitioned by consistent hash on customer ID.
CREATE SHARDED TABLE customers
( cust_id NUMBER NOT NULL
, name VARCHAR2(50)
, address VARCHAR2(250)
, location_id VARCHAR2(20)
, class VARCHAR2(3)
, signup_date DATE
, CONSTRAINT cust_pk PRIMARY KEY(cust_id, class)
)
PARTITIONSET BY LIST (class)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
(PARTITIONSET gold VALUES ('gld') TABLESPACE SET tbs1,
PARTITIONSET silver VALUES ('slv') TABLESPACE SET tbs2)
;
Sharded Tables for Directory-Based Sharding
Create directory-based sharded tables using PARTITION BY DIRECTORY in the CREATE SHARDED TABLE statement.
For example:
CREATE SHARDED TABLE customers
( id NUMBER NOT NULL
, name VARCHAR2(30)
, address VARCHAR2(30)
, status VARCHAR2(1)
, CONSTRAINT cust_pk PRIMARY KEY(id)
)
PARTITION BY DIRECTORY (id)
( PARTITION p1 TABLESPACE tbs1,
PARTITION p2 TABLESPACE tbs2,
PARTITION p3 TABLESPACE tbs3…);
Note:
- Unlike in user-defined sharding, key values are not specified for the partitions in the CREATE TABLE statement.
- The directory table is automatically created during root table creation. The definition of the directory table is: <shard_user_schema>.<root_table>$SDIR
- If a child table is created with the PARENT clause in a different schema from the root table, an additional privilege is required for the child table's schema owner. (This is only for directory-based sharding and is not required for regular user-defined sharding.) This is because there is a foreign key constraint on the child table to the directory table's sharding key columns, to ensure that no rows can be inserted into the child table without the sharding key value being present in the directory mapping. As a consequence, the child table's schema needs a REFERENCES privilege on the directory table's sharding key columns.
See "Granting References" in Creating Tables Sharded by Directory for the workaround.
Creating Duplicated Tables
The number of database requests handled by a single shard can be maximized by duplicating read-only or read-mostly tables across all shards.
Duplicated tables are a good choice for relatively small tables that are not updated frequently, and that are often accessed together with sharded tables. See Duplicated Tables for more detailed concepts and a diagram.
Types of Duplicated Tables
There are three types of duplicated tables you can create in a distributed database:
- Duplicated tables that refresh at a "refresh interval" set for the table. (See Setting the Duplicated Table Global Refresh Rate and Customizing Duplicated Table Refresh Rates.)
- Duplicated tables that refresh on demand. These tables are not refreshed until you explicitly refresh them. (See Refreshing Duplicated Tables On Demand.)
- Duplicated tables that refresh on commit. These are called synchronous duplicated tables. (See the example below.)
With the first two types of duplicated table, you can connect to any shard and update a duplicated table directly on the shard. Then the update is asynchronously propagated to all other shards.
A synchronous duplicated table is a duplicated table that is synchronized on the shards "on commit" on the shard catalog. The rows in a duplicated table on the shards are automatically synchronized with the rows in the duplicated table on the shard catalog when the active transaction performing DML on the duplicated table is committed.
See Updating Duplicated Tables and Synchronizing Their Contents for more details.
Creating a Duplicated Table
A duplicated table, Products, can be created using the following statement:
CREATE DUPLICATED TABLE Products
( StockNo NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price NUMBER(6,2))
;
Creating a Synchronous Duplicated Table
To create this same table as a synchronous duplicated table, use the SYNCHRONOUS keyword in the statement, as shown here:
CREATE DUPLICATED TABLE Products
( StockNo NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price NUMBER(6,2))
SYNCHRONOUS;
See CREATE TABLE in Oracle Database SQL Language Reference for more information about the DUPLICATED clause.
Updating Duplicated Tables and Synchronizing Their Contents
Oracle Globally Distributed Database synchronizes the contents of duplicated tables using Materialized View Replication.
A duplicated table on each shard is represented by a materialized view. The primary table for the materialized views is located in the shard catalog. The CREATE DUPLICATED TABLE statement automatically creates the primary table, materialized views, and other objects required for materialized view replication.
Synchronous Duplicated Tables
Synchronous duplicated tables refresh automatically on commit from the shard catalog.
When the active transaction is committed on duplicated tables created with SYNCHRONOUS in the shard catalog, a multi-shard DML is initiated for any synchronous duplicated tables that were updated with DML. To minimize the impact of this commit on performance, these synchronization DMLs are performed in parallel.
Note:
All shards must be up and running for a synchronous duplicated table DML to be refreshed on the shards "on commit" on the shard catalog.
Non-Synchronous Duplicated Tables
For duplicated tables that are not created with SYNCHRONOUS, you can connect to any shard and update a duplicated table directly on the shard. What happens after that depends on whether you have set up automated refresh.
The materialized views on all of the shards can be refreshed with one of two options:
- Automatic refresh at a configurable frequency per table
- On-demand refresh by running a stored procedure
For automatic refresh, to get better refresh performance, you can also use a stored procedure interface to create materialized view refresh groups, as sketched below.
On a refresh, the update is first propagated over a database link from the shard to the primary table on the shard catalog. Then the update is asynchronously propagated to all other shards as a result of a materialized view refresh.
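Returning to the refresh groups mentioned above, the generic DBMS_REFRESH package can group the duplicated tables' materialized views so that they refresh together; the group name, member list, and interval below are illustrative assumptions, and the sharded-database-specific procedure interface may differ from this generic sketch.
BEGIN
  -- Group the duplicated tables' materialized views (placeholder names)
  -- so that they are refreshed together at the given interval.
  DBMS_REFRESH.MAKE(
    name      => 'dup_tab_refresh_group',
    list      => 'PRODUCTS, COUNTRIES',
    next_date => SYSDATE,
    interval  => 'SYSDATE + 60/86400');
END;
/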
Setting the Duplicated Table Global Refresh Rate
You can set a global refresh rate for all duplicated tables.
By default, duplicated tables are refreshed every 60 seconds. The example below shows increasing the refresh interval to 100 seconds by setting the database parameter shrd_dupl_table_refresh_rate.
SQL> show parameter refresh
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
shrd_dupl_table_refresh_rate integer 60
SQL> alter system set shrd_dupl_table_refresh_rate=100 scope=both;
System altered.
SQL> show parameter refresh
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
shrd_dupl_table_refresh_rate integer 100
Customizing Duplicated Table Refresh Rates
You can set a finer grained refresh rate for individual duplicated tables.
Table-level refresh rates can be initially set with CREATE TABLE, and can be updated using ALTER TABLE.
The REFRESH clause syntax allows you to specify a refresh interval in seconds, minutes, or hours, or you can set the table to refresh only on demand.
[REFRESH INTERVAL refresh_rate [SECOND|MINUTE|HOUR] | REFRESH ON DEMAND]
For example, to create a duplicated table with customized refresh rate of two minutes:
CREATE DUPLICATED TABLE Products
( StockNo NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price NUMBER(6,2))
REFRESH INTERVAL 2 MINUTE;
(To set on demand refresh, see Refreshing Duplicated Tables On Demand.)
To alter the duplicated table with a customized refresh rate of one hour:
ALTER TABLE table_name MODIFY REFRESH INTERVAL 1 HOUR;
If DEFAULT is specified, the value set in the database parameter shrd_dupl_table_refresh_rate is used.
ALTER TABLE table_name MODIFY REFRESH INTERVAL DEFAULT;
Refreshing Duplicated Tables On Demand
You can set duplicated tables to be refreshed on demand rather than at a refresh interval.
When configured with the REFRESH ON DEMAND clause, duplicated tables are not automatically refreshed. You must refresh these tables manually.
Setting On-Demand Refresh
To create a duplicated table that you can refresh on demand:
CREATE DUPLICATED TABLE Products
( StockNo NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price NUMBER(6,2))
REFRESH ON DEMAND;
To update a duplicated table refresh method to on-demand refresh:
ALTER TABLE table_name MODIFY REFRESH ON DEMAND;
Refreshing the Duplicated Table On Demand
To refresh tables created with the ON DEMAND clause, a utility procedure is provided that can be run on the shard catalog.
exec sys.refreshDuplicatedTable(table_name);
Here, table_name can optionally be qualified with the schema name, as in schema_name.table_name.
Alternatively, you can refresh duplicated table materialized views directly on shards using the DBMS_MVIEW.REFRESH procedure.
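For example, a minimal sketch of this direct approach, run while connected to a shard; the materialized view name PRODUCTS is a placeholder assumption.
-- Refresh the duplicated table's local materialized view on this shard.
EXEC DBMS_MVIEW.REFRESH('PRODUCTS');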
Duplicated Table Support and Limitations
Keep the following considerations in mind when designing your schema with duplicated tables.
The following are supported for duplicated tables:
- ALTER TABLE ADD/DROP CONSTRAINT (one constraint at a time)
- ALTER TABLE ADD/DROP PRIMARY KEY
- Creation and alteration of duplicated tables with the inmemory and parallel options
The following are not supported for duplicated tables:
- System and reference partitioned tables
- LONG data type
- REF data types
- Abstract data types (MDSYS data types are supported)
- More than 999 columns in a table without a primary key
- NOLOGGING options
Note:
A race condition is possible when a transaction run on a shard tries to update a row which was deleted on the shard catalog. In this case, an error is returned and the transaction on the shard is rolled back.
The following use cases are not supported when updating duplicated tables on a shard.
- Updating a LOB or a data type not supported by database links
- Updating or deleting a row inserted by the same transaction