Schema Design Considerations
Design of the Oracle Globally Distributed Database schema has a big impact on performance and scalability. An improperly designed schema can lead to unbalanced distribution of data and workload across shards and large percentage of multi-shard operations.
The data model should be a hierarchical tree structure with a single root table. Oracle Globally Distributed Database supports any number of levels within the hierarchy.
To obtain the benefits of a distributed database, the schema of a distributed database should be designed in a way that maximizes the number of database requests processed on a single shard.
A distributed database schema consists of a sharded table family and duplicated tables with the following characteristics.
Sharded table family
- A set of tables which are equi-partitioned by the sharding key.
- Related data is always stored and moved together.
- Joins and integrity constraint checks are done within a shard.
- The data distribution method and sharding key are based on the application's requirements.
- The sharding key must be included in the primary key.
Duplicated tables
- Non-sharded tables which are replicated to all shards.
- Usually contain common reference data.
- Can be read and updated on each shard.
Planning a Distributed Database Schema Design
Once the distributed database is populated with data, it is impossible to change many attributes of the schema, such as whether a table is sharded or duplicated, sharding key, and so on. Therefore, the following points should be carefully considered before deploying a distributed database.
-
Which tables should be sharded?
-
Which tables should be duplicated?
-
Which sharded table should be the root table?
-
What method should be used to link other tables to the root table?
-
Which data distribution method should be used?
-
Which sharding key should be used?
-
Which super sharding key should be used (if the data distribution method is composite)?