Schema Objects
To obtain the benefits of Oracle Globally Distributed Database, the schema of a distributed database should be designed in a way that maximizes the number of database requests processed on a single shard.
Partitions, Tablespaces, and Chunks
Distribution of partitions across shards is achieved by creating partitions in tablespaces that reside on different shards.
Each partition of a sharded table is stored in a separate tablespace, making the tablespace the unit of data distribution in a distributed database.
As described in Sharded Table Family, to minimize the number of multi-shard joins, corresponding partitions of all tables in a table family are always stored in the same shard. This is guaranteed when tables in a table family are created in the same set of distributed tablespaces as shown in the syntax examples where tablespace set ts1
is used for all tables.
However, it is possible to create different tables from a table family in different
tablespace sets, for example the Customers table in tablespace set ts1
and Orders in tablespace set ts2
. In this case, it must be guaranteed
that the tablespace that stores partition 1 of Customers always resides in the same
shard as the tablespace that stores partition 1 of Orders.
To support this functionality, a set of corresponding partitions from all of the tables in a table family, called a chunk, is formed. A chunk contains a single partition from each table of a table family. This guarantees that related data from different sharded tables can be moved together. In other words, a chunk is the unit of data migration between shards. With system-managed and composite data distribution methods, the number of chunks within each shard is specified when the distributed database is created. With user-defined data distribution, the total number of chunks is equal to the number of partitions.
A chunk that contains corresponding partitions from the tables of Cutomers-Orders-LineItems schema is shown in the following figure.
Each shard contains multiple chunks as shown in the following figure.
In addition to sharded tables, a shard can also contain one or more duplicated tables. Duplicated tables cannot be stored in tablespaces that are used for sharded tables.
Tablespace Sets
Oracle Globally Distributed Database creates and manages tablespaces as a unit called a TABLESPACE
SET
.
A distributed database configured with the system-managed and composite data distribution methods use
TABLESPACE SET
, while user-defined data distribution uses regular
tablespaces.
A tablespace is a logical unit of data distribution in a distributed database. The distribution of partitions across shards is achieved by automatically creating partitions in tablespaces that reside on different shards.
To minimize the number of multi-shard joins, the corresponding partitions of related tables are always stored in the same shard. Each partition of a sharded table is stored in a separate tablespace.
The PARTITIONS AUTO
clause specifies that the number of partitions
should be automatically determined. This type of hashing provides more flexibility and
efficiency in migrating data between shards, which is important for elastic
scalability.
The number of tablespaces created per tablespace set is determined based on the number of chunks that were defined for the shardspace during deployment.
Note:
Only Oracle Managed Files are supported by tablespace sets.
Individual tablespaces cannot be dropped or altered independently of the entire tablespace set.
TABLESPACE SET
cannot be used with the user-defined
data distribution method.
Sharded Tables
A database table is split up across the shards, so that each shard contains the table with the same columns, but a different subset of rows. A table split up in this manner is called a sharded table.
The following figure shows how a set of large tables (referred to as a table family), can be horizontally partitioned across the three shards, so that each shard contains a subset of the data, indicated with red, yellow, and blue rows.
Figure 2-4 Horizontal Partitioning of a Table Across Shards
Partitions are distributed across shards at the tablespace level, based on a sharding key. Examples of keys include customer ID, account number, and country ID.
Each partition of a sharded table resides in a separate tablespace, and each tablespace is associated with a specific shard. Depending on the data distribution method, the association can be established automatically or defined by the administrator.
Even though the partitions of a sharded table reside in multiple shards, to the application, the table looks and behaves exactly the same as a partitioned table in a single database. SQL statements issued by an application never have to refer to shards or depend on the number of shards and their configuration.
The familiar SQL syntax for table partitioning specifies how rows should be
partitioned across shards. For example, the following SQL statement creates a sharded
table, horizontally partitioning the table across shards based on the sharding key
cust_id
.
CREATE SHARDED TABLE customers
( cust_id NUMBER NOT NULL
, name VARCHAR2(50)
, address VARCHAR2(250)
, region VARCHAR2(20)
, class VARCHAR2(3)
, signup DATE
,CONSTRAINT cust_pk PRIMARY KEY(cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
TABLESPACE SET ts1
;
The sharded table is partitioned by consistent hash, a special type of hash partitioning commonly used in scalable distributed systems. This technique automatically spreads tablespaces across shards to provide an even distribution of data and workload.
Note:
Global indexes on sharded tables are not supported, but local indexes are supported.Sharded Table Family
A sharded table family is a set of tables that are sharded in the same way. Often there is a parent-child relationship between database tables with a referential constraint in a child table (foreign key) referring to the primary key of the parent table.
Multiple tables linked by such relationships typically form a tree-like structure where every child has a single parent. A set of such tables is referred to as a table family. A table in a table family that has no parent is called the root table. There can be only one root table in a table family.
How a Table Family Is Sharded
Sharding a table family is illustrated here with the Customers–Orders–LineItems schema.
Before sharding, the tables in the schema may look as shown in the examples below. The three tables have a parent-child relationship, with Customers as the root table.
Customers Table (Root) Before Sharding
CustNo Name Address Location Class --------- ---------- -------------- --------- ------ 123 Brown 100 Main St us3 Gold 456 Jones 300 Pine Ave us1 Silver 999 Smith 453 Cherry St us2 Bronze
Orders Table Before Sharding
OrderNo CustNo OrderDate --------- -------- ----------- 4001 123 14-FEB-2013 4002 456 09-MAR-2013 4003 456 05-APR-2013 4004 123 27-MAY-2013 4005 999 01-SEP-2013
LineItems Table Before Sharding
LineNo OrderNo CustNo StockNo Quantity ------ ------- ------ ------- -------- 40011 4001 123 05683022 1 40012 4001 123 45423609 4 40013 4001 123 68584904 1 40021 4002 456 05683022 1 40022 4002 456 45423509 3 40022 4003 456 80345330 16 40041 4004 123 45423509 1 40042 4004 123 68584904 2 40051 4005 999 80345330 12
The tables can be sharded by the customer number, CustNo
,
in the Customers table, which is the root. The shard containing data pertaining to
customer 123 is shown in the following example tables.
Customers Table Shard With Customer 123 Data
CustNo Name Address Location Class --------- ---------- -------------- ---------- ------ 123 Brown 100 Main St us3 Gold
Orders Table Shard With Customer 123 Data
OrderNo CustNo OrderDate --------- -------- ----------- 4001 123 14-FEB-2013 4004 123 27-MAY-2013
LineItems Table Shard With Customer 123 Data
LineNo OrderNo CustNo StockNo Quantity ------ ------- ------ ------- -------- 40011 4001 123 05683022 1 40012 4001 123 45423609 4 40013 4001 123 68584904 1 40041 4004 123 45423509 1 40042 4004 123 68584904 2
Duplicated Tables
In Oracle Globally Distributed Database a table with the same contents in each shard is called a duplicated table.
A distributed database includes both sharded tables that are horizontally partitioned across shards, and duplicated tables that are replicated to all shards.
Duplicated tables are a good choice for relatively small tables that are not updated frequently, and that are often accessed together with sharded tables. For this reason, duplicated tables usually contain reference information, for example, a Stock Items table that is common to each shard.
For many applications, the number of database requests handled by a single shard can be maximized by duplicating read-only or read-mostly tables across all shards. The combination of sharded tables and duplicated tables enables all transactions associated with a sharding key to be processed by a single shard. This technique enables linear scalability and fault isolation.
As an example of the need for a duplicated table, consider the table family that is
described in Sharded Table Family.
This database schema might also include a Products
table which contains
data that is shared by all the customers in the shards that were created for this table
family, and it cannot be sharded by the customer number. To prevent multi-shard queries
during order processing, the entire table can be duplicated on all shards in the distributed database.
The difference between sharded tables (Customers, Orders, and Line Items) and a duplicated table (Products) is shown in the following figure.
Figure 2-5 Sharded Tables and a Duplicated Table

Description of "Figure 2-5 Sharded Tables and a Duplicated Table"
In the figure above, the Customers, Orders, and Line Items tables are all sharded by a Customer ID number into three shards, illustrated by the colors of the rows in each table at the top of the figure, and the corresponding color of the sharded table in each shard. The duplicated table , Products, shown in gray, is replicated to all of the shards in its entirety, shown by the arrow from the table pointing to each of the three shards.
See Creating Duplicated Tables for more information, limitations, and examples.
Non-Table Objects Created on All Shards
In addition to duplicated tables, other schema objects, such as users, roles, views, indexes, synonyms, functions, procedures, and packages, and non-schema database objects, such as tablespaces, tablespace sets, directories, and contexts, can be created on all shards in an Oracle Globally Distributed Database.
Unlike tables, which require an extra keyword in the CREATE
statement—SHARDED
or DUPLICATED
—other objects are created on all shards using existing syntax. The only requirement is that the SHARD DDL
session property must be enabled.
Note that automatic creation on all shards of the following objects is not supported in this release. These objects can be created by connecting to individual shards.
-
Cluster
-
Control file
-
Database link
-
Disk group
-
Edition
-
Flashback archive
-
Materialized zone map
-
Outline
-
Pfile
-
Profile
-
Restore point
-
Rollback segment
-
Summary
Materialized views and view logs are supported starting in Oracle Database 18c, with the following restrictions:
-
Materialized views created on sharded tables remain empty on the catalog database, while the corresponding materialized views on shards contain data from each of the individual shards.
-
Only the
REFRESH COMPLETE ON DEMAND USING TRUSTED CONSTRAINTS
option is supported for materialized views on sharded tables.