Creating Schema Objects

The following topics show you how to create the schema objects in your Oracle Globally Distributed Database.

Refer back to Schema Objects for conceptual information about these objects.

Create an All-Shards User

Local users that only exist in the shard catalog database do not have the privileges to create schema objects in the Oracle Globally Distributed Database. The first step of creating the distributed database schema is to create an all-shards user.

Create an all-shards user by connecting to the shard catalog database as a privileged user, enabling SHARD DDL, and running the CREATE USER command. When the all-shards user connects to the shard catalog database, the SHARD DDL mode is enabled by default.
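A minimal sequence might look like the following, run while connected to the shard catalog as a privileged user. The user name app_schema, its password placeholder, and the exact privileges granted are illustrative; grant whatever privileges your application schema actually needs.

ALTER SESSION ENABLE SHARD DDL;

CREATE USER app_schema IDENTIFIED BY app_schema_password;
GRANT CONNECT, RESOURCE TO app_schema;
GRANT CREATE TABLE, CREATE VIEW TO app_schema;

Because SHARD DDL is enabled in the session, the CREATE USER and GRANT statements are propagated to all shards, so the account exists everywhere with the same privileges.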

Note:

Local users can create non-schema distributed database objects, such as tablespaces, directories, and contexts, if they enable SHARD DDL mode; however, they cannot create schema objects, such as tables, views, indexes, functions, procedures, and so on.

Sharded objects cannot have any dependency on local objects. For example, you cannot create an all-shard view on a local table.

You cannot grant SYS privileges to sharded users using SHARD DDL. You must log in to each shard and grant the privilege to the account manually on that shard.
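For example, to give the account a SYS-level administrative privilege, you might connect to each shard in turn as SYSDBA and run something like the following (app_schema and the chosen privilege are illustrative):

-- Run while connected directly to an individual shard, on every shard
GRANT SYSDBA TO app_schema;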

Creating a Sharded Table Family

Create a sharded table family with the SQL CREATE TABLE statement. You can specify parent-child relationships between tables using reference partitioning or equi-partitioning.

Use Reference Partitioning to Specify Parent-Child Relationships Between Tables

The recommended way to create a sharded table family is to specify parent-child relationships between tables using reference partitioning.

Partitioning by reference simplifies the syntax because the partitioning scheme is specified only for the root table. Also, partition management operations performed on the root table are automatically propagated to its descendants. For example, when adding a partition to the root table, a new partition is created on all of its descendants.

The appropriate CREATE TABLE statements for the Customers–Orders–LineItems schema, using the system-managed data distribution method, are shown below. The first statement creates the root table of the table family, Customers.

CREATE SHARDED TABLE Customers 
( CustNo      NUMBER NOT NULL
, Name        VARCHAR2(50)
, Address     VARCHAR2(250) 
, CONSTRAINT RootPK PRIMARY KEY(CustNo)
)
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;

The following two statements create the Orders and LineItems tables, which are, respectively, a child and a grandchild of the Customers table.

CREATE SHARDED TABLE Orders 
( OrderNo   NUMBER NOT NULL
, CustNo    NUMBER NOT NULL
, OrderDate DATE
, CONSTRAINT OrderPK PRIMARY KEY (CustNo, OrderNo)
, CONSTRAINT CustFK  FOREIGN KEY (CustNo) REFERENCES Customers(CustNo) 
)
PARTITION BY REFERENCE (CustFK)
;
CREATE SHARDED TABLE LineItems 
( CustNo    NUMBER NOT NULL
, LineNo    NUMBER(2) NOT NULL
, OrderNo   NUMBER(5) NOT NULL
, StockNo   NUMBER(4)
, Quantity  NUMBER(2)
, CONSTRAINT LinePK  PRIMARY KEY (CustNo, OrderNo, LineNo)
, CONSTRAINT LineFK  FOREIGN KEY (CustNo, OrderNo) REFERENCES Orders(CustNo, OrderNo)
)
PARTITION BY REFERENCE (LineFK)
;

In the example statements above, corresponding partitions of all tables in the family are stored in the same tablespace set, TS1. However, it is possible to specify separate tablespace sets for each table.

Note that in the example statements above, the partitioning column CustNo, used as the sharding key, is present in all three tables, even though reference partitioning in general allows a child table to be equi-partitioned with its parent without duplicating the key columns in the child table. The reason is that reference partitioning requires a primary key in the parent table, because the primary key must be specified in the foreign key constraint of the child table that links the child to its parent. At the same time, a primary key on a sharded table must be the same as, or contain, the sharding key. This makes it possible to enforce global uniqueness of a primary key without coordination with other shards, a critical requirement for linear scalability.

To summarize, the use of reference-partitioned tables in a distributed database requires adhering to the following rules:

  • A primary key on a sharded table must either be the same as the sharding key, or contain the sharding key. This is required to enforce global uniqueness of a primary key without coordination with other shards.

  • Reference partitioning requires a primary key in the parent table, because the primary key must be specified in the foreign key constraint of the child table to link the child to its parent. It is also possible to have a foreign key constraint when the parent table has just a UNIQUE constraint, but no PRIMARY KEY; the sharding key must also be NOT NULL.

    For example, to link the LineItems (child) table to the Orders (parent) table, you need a primary key in the Orders table. The first rule implies that the primary key in the Orders table contains the CustNo value. (This is an existing partitioning rule, not specific to Oracle Globally Distributed Database.)

Use Equi-Partitioning to Specify Parent-Child Relationships Between Tables

In some cases it is impossible or undesirable to create primary and foreign key constraints that are required for reference partitioning. For such cases, specifying parent-child relationships in a table family requires that all tables are explicitly equi-partitioned. Each child table is created with the PARENT clause in CREATE SHARDED TABLE that contains the name of its parent. An example of the syntax is shown below.

CREATE SHARDED TABLE Customers 
( CustNo      NUMBER NOT NULL
, Name        VARCHAR2(50)
, Address     VARCHAR2(250) 
, region      VARCHAR2(20)
, class       VARCHAR2(3)
, signup      DATE
)
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;

CREATE SHARDED TABLE Orders 
( OrderNo   NUMBER 
, CustNo    NUMBER NOT NULL
, OrderDate DATE
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;

CREATE SHARDED TABLE LineItems 
( LineNo    NUMBER
, OrderNo   NUMBER
, CustNo    NUMBER NOT NULL
, StockNo   NUMBER
, Quantity  NUMBER
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1
;

Because the partitioning scheme is fully specified in all of the CREATE SHARDED TABLE statements, any table can be independently subpartitioned. This is not permitted with reference partitioning where subpartitions can only be specified for the root table and the subpartitioning scheme is the same for all tables in a table family.

Note that this method only supports two-level table families, that is, all children must have the same parent and grandchildren cannot exist. This is not a limitation as long as the partitioning column from the parent table exists in all of the child tables.

See Also:

Oracle Database VLDB and Partitioning Guide for information about reference partitioning

Designing Schemas With Multiple Table Families

An Oracle Globally Distributed Database schema can have multiple table families. Data from all table families resides in the same chunks; each chunk contains partitions from every table family that share the same hash key range.

Note:

Multiple table families are supported in system-managed distributed databases only. Composite and user-defined distributed databases only support one table family.

To create a new table family, create a root sharded table and specify tablespace sets that are not used by existing table families. Each table family is identified by its root table. Tables in different table families should not be related to each other.

Each table family has its own sharding key definition, while within each table family the restriction on having the sharding key columns in child tables still holds. All tables from different table families are sharded the same way with consistent hash, into the same number of chunks, with each chunk containing data from all of the table families.

Design your table families so that queries between different table families are minimal, because such cross-family joins are carried out on the shard catalog (the query coordinator), and many of them can affect performance.

The following example shows you how to create multiple table families using the PARENT clause with a system-managed sharding methodology (PARTITION BY CONSISTENT HASH).

CREATE SHARDED TABLE Customers <=== Table Family #1
( CustId NUMBER NOT NULL
, Name VARCHAR2(50)
, Address VARCHAR2(250)
, region VARCHAR2(20)
, class VARCHAR2(3)
, signup DATE
)
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET ts1
;

CREATE SHARDED TABLE Orders
( OrderNo NUMBER
, CustId NUMBER
, OrderDate DATE
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET ts1
;

CREATE SHARDED TABLE LineItems
( LineNo NUMBER
, OrderNo NUMBER
, CustId NUMBER
, StockNo NUMBER
, Quantity NUMBER
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET ts1
;

CREATE SHARDED TABLE Products <=== Table Family #2
( ProdId NUMBER NOT NULL,
  CONSTRAINT pk_products PRIMARY KEY (ProdId)
)
PARTITION BY CONSISTENT HASH (ProdId)
PARTITIONS AUTO
TABLESPACE SET ts_2
;

Note:

ORA-3850 is thrown if you attempt to use a tablespace set for a table family when that tablespace set is already in use by an existing table family.

Joins across table families may not be efficient, and if you have many such joins, or if they are performance-critical, you should use duplicated tables instead of multiple table families.

Associating Global Services With Multiple Table Families

Each table family should be associated with a different global service. Applications from different table families each have their own connection pool and service, and use their own sharding key for routing to the correct shard.

When you create the first root table (that is, the first table family) all of the existing global services are automatically associated with it. You can use the GDSCTL MODIFY SERVICE command to change the services associated with a table family after more table families are created, as shown in this example.

GDSCTL> MODIFY SERVICE -GDSPOOL shdpool -TABLE_FAMILY sales.customer -SERVICE sales
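For example, assuming a second table family rooted at sales.products, you might create, start, and associate a dedicated service as follows (the pool name shdpool and the service name products_svc are illustrative):

GDSCTL> ADD SERVICE -GDSPOOL shdpool -SERVICE products_svc
GDSCTL> START SERVICE -GDSPOOL shdpool -SERVICE products_svc
GDSCTL> MODIFY SERVICE -GDSPOOL shdpool -TABLE_FAMILY sales.products -SERVICE products_svc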

Creating Sharded Tables

A sharded table is a table that is partitioned into smaller and more manageable pieces among multiple databases, called shards.

The following topics guide your decisions and provide instructions for creating sharded tables:

Tablespace Set Sizing

When you create a tablespace set on the shard catalog, you must make sure you have enough space for the tablespaces created on the shard catalog and on each of the shards.

This is especially important in a metered usage environment.

For example, with a shard catalog and three shards in the configuration, you issue the following statements.

ALTER SESSION ENABLE SHARD DDL;
CREATE TABLESPACE SET TSP_SET_1 IN SHARDSPACE SHSPC_1 USING TEMPLATE
 (DATAFILE SIZE 100M  AUTOEXTEND ON NEXT 1M MAXSIZE UNLIMITED);

For example, assuming a default of 120 chunks per shard, the command creates the following objects in these configurations:

  • System sharding:

    120 tablespaces (for 120 chunks on each shard)

    + 1 tablespace in the shard catalog

    = 120 + 1 tablespaces, each with an initial data file size of 100M

  • Raft replication:

    360 tablespaces (for 360 chunks on each shard, because there are 3 chunks created for each replication unit)

    + 1 tablespace in the shard catalog

    = 360 + 1 tablespaces, each with an initial data file size of 100M

If the required amount of storage is not planned for, the DDL can fail, and recovering from a failed DDL requires significant effort.

To prevent this issue, you must set the database initialization parameter DB_FILES greater than or equal to the total number of chunks and tablespace sets required on the shard. A formula for calculating DB_FILES is given in Create the Shard Databases.
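For example, on each shard database you might set the parameter before deployment. The value 500 is illustrative; derive the actual value from the formula referenced above.

ALTER SYSTEM SET DB_FILES=500 SCOPE=SPFILE;

DB_FILES is a static parameter, so the new value takes effect after the instance is restarted.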

Also, note that all tablespaces in a tablespace set are bigfile tablespaces. A bigfile tablespace is a tablespace with a single, but potentially very large (up to 4G blocks) data file. See Bigfile Tablespaces in Oracle Database Administrator’s Guide for details.

Sharded Tables for System-Managed Sharding

In a system-managed distributed database, data is automatically distributed across the shards using partitioning by consistent hash.

Before creating a sharded table, create a tablespace set with CREATE TABLESPACE SET to store the table partitions.

CREATE TABLESPACE SET ts1;

If you need to customize the tablespace attributes, add the USING TEMPLATE clause to CREATE TABLESPACE SET as shown in this example.

CREATE TABLESPACE SET ts1
USING TEMPLATE
( DATAFILE SIZE 10M
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 256K
  SEGMENT SPACE MANAGEMENT AUTO
  ONLINE
)
;

You create a sharded table with CREATE SHARDED TABLE, horizontally partitioning the table across the shards based on the sharding key cust_id.

CREATE SHARDED TABLE customers 
( cust_id     NUMBER NOT NULL
, name        VARCHAR2(50)
, address     VARCHAR2(250)
, region      VARCHAR2(20)
, class       VARCHAR2(3)
, signup      DATE
, CONSTRAINT cust_pk PRIMARY KEY(cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
TABLESPACE SET ts1
;

A system-managed sharded table is partitioned by consistent hash by specifying PARTITION BY CONSISTENT HASH (sharding_key_column).

The PARTITIONS AUTO clause specifies that the number of partitions is automatically set to the number of tablespaces in the tablespace set ts1, and each partition is stored in a separate tablespace.

Sharded Tables for User-Defined Sharding

In a user-defined distributed database, you explicitly map data to individual shards. A sharded table in a user-defined distributed database can be partitioned by range or list.

You do not create tablespace sets for user-defined sharded tables; however, you must create each tablespace individually and explicitly associate it with a shardspace deployed in the distributed database configuration, as shown here.

CREATE TABLESPACE tbs1 IN SHARDSPACE west;
CREATE TABLESPACE tbs2 IN SHARDSPACE central;
CREATE TABLESPACE tbs3 IN SHARDSPACE east;

When you create the sharded table, you define the partitions with the ranges or lists of data to be stored in each tablespace, as shown in the following example.

CREATE SHARDED TABLE accounts
( id             NUMBER
, account_number NUMBER
, customer_id    NUMBER
, branch_id      NUMBER
, state          VARCHAR2(2) NOT NULL
, status         VARCHAR2(1)
)
PARTITION BY LIST (state)
( PARTITION p_west VALUES ('OR', 'WA') TABLESPACE tbs1
, PARTITION p_central VALUES ('SD', 'WI') TABLESPACE tbs2
, PARTITION p_east VALUES ('NY', 'VM', 'NJ') TABLESPACE tbs3
)
;

Sharded Tables for Composite Sharding

Composite sharding lets you partition subsets of data that correspond to a range or list of key values in a table partitioned by consistent hash.

With composite sharding, as with the other sharding methods, tablespaces are used to specify the mapping of partitions to shards. To partition subsets of data in a sharded table, a separate tablespace set must be created for each shardspace deployed in the distributed database configuration as shown in the following example.

CREATE TABLESPACE SET tbs1 IN SHARDSPACE shspace1;
CREATE TABLESPACE SET tbs2 IN SHARDSPACE shspace2;

The statement in the following example partitions a sharded table into two partition sets, gold and silver, based on class of service. Each partition set is stored in a separate tablespace set. The data in each partition set is then further partitioned by consistent hash on customer ID.

CREATE SHARDED TABLE customers
( cust_id NUMBER NOT NULL
, name VARCHAR2(50)
, address VARCHAR2(250) 
, location_id VARCHAR2(20) 
, class VARCHAR2(3) 
, signup_date DATE 
, CONSTRAINT cust_pk PRIMARY KEY(cust_id, class) 
)
PARTITIONSET BY LIST (class) 
  PARTITION BY CONSISTENT HASH (cust_id)
  PARTITIONS AUTO
(PARTITIONSET gold VALUES ('gld') TABLESPACE SET tbs1,
 PARTITIONSET silver VALUES ('slv') TABLESPACE SET tbs2)
;

Sharded Tables for Directory-Based Sharding

Create directory-based sharded tables using PARTITION BY DIRECTORY in the CREATE SHARDED TABLE statement.

For example:

CREATE SHARDED TABLE customers
( id             NUMBER NOT NULL
, name           VARCHAR2(30)
, address        VARCHAR2(30)
, status         VARCHAR2(1)
, CONSTRAINT cust_pk PRIMARY KEY(id)
)
PARTITION BY DIRECTORY (id)
( PARTITION p1 TABLESPACE tbs1,
  PARTITION p2 TABLESPACE tbs2,
  PARTITION p3 TABLESPACE tbs3…);

Note:

  • Unlike in user-defined sharding, key values are not specified for the partitions in the CREATE TABLE statement.

  • The directory table is automatically created during root table creation. The directory table is named:

    <shard_user_schema>.<root_table>$SDIR

  • If a child table is created with the PARENT clause in a different schema from the root table, an additional privilege is required for the child table's schema owner. (This applies only to directory-based sharding and is not required for regular user-defined sharding.)

    This is because there is a foreign key constraint on the child table referencing the directory table's sharding key columns, to ensure that no row can be inserted into the child table unless its sharding key value is present in the directory mapping. As a consequence, the child table's schema owner needs the REFERENCES privilege on the directory table's sharding key columns.

    See "Granting References" in Creating Tables Sharded by Directory for the workaround.

Creating Duplicated Tables

You can maximize the number of database requests handled by a single shard by duplicating read-only or read-mostly tables across all shards.

Duplicated tables are a good choice for relatively small tables that are not updated frequently, and that are often accessed together with sharded tables. See Duplicated Tables for more detailed concepts and a diagram.

Types of Duplicated Tables

There are three types of duplicated tables you can create in a distributed database: duplicated tables refreshed automatically at an interval, duplicated tables refreshed on demand, and synchronous duplicated tables.

With the first two types of duplicated table, you can connect to any shard and update a duplicated table directly on the shard. The update is then asynchronously propagated to all other shards.

A synchronous duplicated table is a duplicated table that is synchronized on the shards on commit on the shard catalog. The rows in a duplicated table on the shards are automatically synchronized with the rows in the duplicated table on the shard catalog when the active transaction performing DML on the duplicated table is committed.

See Updating Duplicated Tables and Synchronizing Their Contents for more details.

Creating a Duplicated Table

A duplicated table, Products, can be created using the following statement:

CREATE DUPLICATED TABLE Products 
( StockNo     NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price       NUMBER(6,2))
;

Creating a Synchronous Duplicated Table

To create this same table as a synchronous duplicated table, use the SYNCHRONOUS keyword in the statement, as shown here:

CREATE DUPLICATED TABLE Products 
( StockNo     NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price       NUMBER(6,2))
SYNCHRONOUS;

See CREATE TABLE in Oracle Database SQL Language Reference for more information about the DUPLICATED clause.

Updating Duplicated Tables and Synchronizing Their Contents

Oracle Globally Distributed Database synchronizes the contents of duplicated tables using Materialized View Replication.

A duplicated table on each shard is represented by a materialized view. The primary table for the materialized views is located in the shard catalog. The CREATE DUPLICATED TABLE statement automatically creates the primary table, materialized views, and other objects required for materialized view replication.

Synchronous Duplicated Tables

Synchronous duplicated tables refresh automatically on commit from the shard catalog.

When a transaction that updates duplicated tables created with SYNCHRONOUS is committed on the shard catalog, a multi-shard DML is initiated for each synchronous duplicated table updated by the transaction. To minimize the impact of this synchronization on commit performance, the synchronization DMLs are performed in parallel.

Note:

All shards must be up and running for a synchronous duplicated table DML to be refreshed on the shards on commit on the shard catalog.

Non-Synchronous Duplicated Tables

For duplicated tables that are not created with SYNCHRONOUS, you can connect to any shard and update a duplicated table directly on the shard. What happens after that depends on whether you have set up automated refresh.

The materialized views on all of the shards can be refreshed with one of two options:

  • Automatic refresh at a configurable frequency per table
  • On-demand refresh by running a stored procedure

For automatic refresh, you can also use a stored procedure interface to create materialized view refresh groups for better refresh performance.

On a refresh, the update is first propagated over a database link from the shard to the primary table on the shard catalog. Then the update is asynchronously propagated to all other shards as a result of a materialized view refresh.

Setting the Duplicated Table Global Refresh Rate

You can set a global refresh rate for all duplicated tables.

By default duplicated tables are refreshed every 60 seconds. The example below shows increasing the refresh interval to 100 seconds by setting the database parameter shrd_dupl_table_refresh_rate.

SQL> show parameter refresh
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
shrd_dupl_table_refresh_rate         integer     60
 
SQL> alter system set shrd_dupl_table_refresh_rate=100 scope=both;
 
System altered.
 
SQL> show parameter refresh
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
shrd_dupl_table_refresh_rate         integer     100

Customizing Duplicated Table Refresh Rates

You can set a finer-grained refresh rate for individual duplicated tables.

Table-level refresh rates can be set initially with CREATE DUPLICATED TABLE and updated later using ALTER TABLE.

The REFRESH clause syntax allows you to specify a refresh interval in seconds, minutes, hours, or you can set the table to only refresh on demand.

[REFRESH INTERVAL refresh_rate [SECOND|MINUTE|HOUR] | REFRESH ON DEMAND]

For example, to create a duplicated table with a customized refresh rate of two minutes:

CREATE DUPLICATED TABLE Products
( StockNo NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price NUMBER(6,2))
REFRESH INTERVAL 2 MINUTE;

(To set on demand refresh, see Refreshing Duplicated Tables On Demand.)

To alter the duplicated table with a customized refresh rate of one hour:

ALTER TABLE table_name MODIFY REFRESH INTERVAL 1 HOUR;

If DEFAULT is specified, the value set in database parameter shrd_dupl_table_refresh_rate is used.

ALTER TABLE table_name MODIFY REFRESH INTERVAL DEFAULT;

Refreshing Duplicated Tables On Demand

You can set duplicated tables to be refreshed on demand rather than at a refresh interval.

When configured with the REFRESH ON DEMAND clause, duplicated tables are not automatically refreshed. You need to manually refresh these tables.

Setting On-Demand Refresh

To create a duplicated table that you can refresh on demand:

CREATE DUPLICATED TABLE Products
( StockNo NUMBER PRIMARY KEY
, Description VARCHAR2(20)
, Price NUMBER(6,2))
REFRESH ON DEMAND;

To update a duplicated table refresh method to on-demand refresh:

ALTER TABLE table_name MODIFY REFRESH ON DEMAND;

Refreshing the Duplicated Table On Demand

To refresh the tables created with the ON DEMAND clause, a utility procedure is provided which can be run on the shard catalog.

exec sys.refreshDuplicatedTable(table_name);

Here table_name can optionally be qualified with schema_name, so it would be schema_name.table_name.

Alternatively, you can refresh duplicated table materialized views directly on shards using the DBMS_MVIEW.REFRESH procedure.
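For example, connected directly to a shard, the Products duplicated table from the earlier examples might be refreshed with the statement below. The unqualified name is an assumption; qualify it with the owning schema if the materialized view was created in a different schema.

EXEC DBMS_MVIEW.REFRESH('PRODUCTS');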

Duplicated Table Support and Limitations

Keep the following considerations in mind when designing your schema with duplicated tables.

The following are supported for duplicated tables:

  • ALTER TABLE ADD/DROP CONSTRAINT (one constraint at a time)

  • ALTER TABLE ADD/DROP PRIMARY KEY

  • Creation and alteration of duplicated tables with the INMEMORY and PARALLEL options

The following are not supported for duplicated tables.

  • System-partitioned and reference-partitioned tables
  • LONG data type
  • REF data types
  • Abstract data types (MDSYS data types are supported)
  • Maximum number of columns without a primary key is 999
  • The NOLOGGING option

Note:

A race condition is possible when a transaction run on a shard tries to update a row which was deleted on the shard catalog. In this case, an error is returned and the transaction on the shard is rolled back.

The following use cases are not supported when updating duplicated tables on a shard.

  • Updating a LOB or a data type not supported by database links
  • Updating or deleting a row inserted by the same transaction