Configure the Distributed Database Topology

After the databases for the shard catalog and all of the shards are configured, along with corresponding TNS listeners, you can add the distributed database metadata to the shard catalog database using GDSCTL. The distributed database metadata describes the topology used for the distributed database.

The distributed database topology consists of the data distribution method, replication (high availability) technology, the default number of chunks to be present in the distributed database, the location and number of shard directors, the numbers of shardgroups, shardspaces, regions, and shards in the distributed database, and the global services that will be used to connect to the distributed database.

Keep the Global Data Services Control Utility (GDSCTL) Command Reference in the Oracle Database Global Data Services Concepts and Administration Guide on hand for information about usage and options for the GDSCTL commands used in the configuration procedures.

Follow the procedures listed below, in order, to complete your distributed database topology configuration.

Run the commands from a shard director host, because the GDSCTL command line interface is installed there as part of the shard director (global service manager) installation.

Create the Shard Catalog

Use the GDSCTL CREATE SHARDCATALOG command to create metadata describing the distributed database topology in the shard catalog database.

Note that once you run CREATE SHARDCATALOG, and the rest of the distributed database metadata has been created, there are several metadata properties that cannot be modified without recreating the entire distributed database from scratch. These include the distributed database method (system-managed, user-defined, composite), replication technology (Oracle Data Guard or Raft replication), default number of chunks in the shardspace, and others. Make sure that you consult the GDSCTL reference documentation for the complete list of possible command options and their defaults.

Shard Catalog Connect String

When you run the CREATE SHARDCATALOG command, GDSCTL connects to the shard catalog database with the user name and connect string specified.

If your shard catalog database has an associated standby database for high availability or disaster recovery purposes, the connection string, catalog_connect_string in the examples that follow, should specify all primary and standby databases. If you don't include the standby databases in the connect string, then the shard director processes will not be able to connect to the standby if the primary shard catalog is unavailable.

Note that catalog_connect_string should specify the PDB for the shard catalog database, not the CDB$ROOT.

The following is a simple tnsnames.ora entry.

CATALOG_CONNECT_STRING=
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = tcp)(HOST = primary_catalog)(PORT = 1521))
      (ADDRESS = (PROTOCOL = tcp)(HOST = standby_catalog)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = catpdb.example.com)
    )
  )

Creating the Shard Catalog

Run CREATE SHARDCATALOG with the settings appropriate for your planned distributed database topology.

System-Managed Distribution Method

In the following example, the distributed database metadata is created for a system-managed configuration with two regions named region1 and region2. Because system-managed is the default distribution method, it does not need to be specified with the -sharding parameter.

GDSCTL> create shardcatalog -database catalog_connect_string
 -user mysdbadmin/mysdbadmin_password -repl DG -region region1,region2

Note also that if -shardspace is not specified, a default shardspace named shardspaceora is created. If -region is not specified, the default region named regionora is created. If the single default region is created along with the default shardspace, then a default shardgroup named shardspaceora_regionora is also created in the shardspace.

For replication (-repl) with system-managed distribution, you can choose either Oracle Data Guard (DG) or Raft replication (native).

Composite Distribution Method

The following example shows you how to create shard catalog metadata for a composite distributed database with Data Guard replication in MaxAvailability protection mode, 60 chunks per shardspace, and two shardspaces.

GDSCTL> create shardcatalog -database catalog_connect_string
 -user mysdbadmin/mysdbadmin_password -sharding composite -chunks 60 
 -protectmode maxavailability -shardspace shardspace1,shardspace2

User-Defined Distribution Method

The next example shows you how to create shard catalog metadata for a user-defined distributed database with Data Guard replication.

GDSCTL> create shardcatalog -database catalog_connect_string
 -user mysdbadmin/mysdbadmin_password -sharding user
 -protectmode maxperformance 

Consult the GDSCTL documentation or run GDSCTL HELP CREATE SHARDCATALOG for more details about the command usage.

Replication Settings

Oracle Data Guard can be used with any data distribution method, and is configured in the CREATE SHARDCATALOG command with -repl DG.

Raft replication requires a bit more planning, but it is also enabled in CREATE SHARDCATALOG command with -repl native. See Raft Replication Configuration and Management for additional configurable attributes.

Future Connections to the Shard Catalog

GDSCTL stores the credentials for the shard catalog administrator in a wallet on the local host. However, for subsequent GDSCTL sessions on other hosts, it may be necessary to explicitly connect to the shard catalog in order to perform administrative tasks by running the GDSCTL CONNECT command, as shown here.

GDSCTL> connect mysdbadmin/mysdbadmin_password@catalog_connect_string

Add and Start Shard Directors

Add to the configuration the shard directors, which will monitor the distributed database system and run background tasks in response to GDSCTL commands and other events, and start them.

The following commands must be run on the host where the shard director processes are to run. This can be the shard catalog host or a dedicated host for the shard director processes.

  1. Add and start a shard director (GSM), as shown in the following example.
    GDSCTL> connect mysdbadmin/mysdbadmin_password@catalog_connect_string
    GDSCTL> add gsm -gsm sharddirector1 -catalog catalog_connect_string -pwd gsmcatuser_password
    GDSCTL> start gsm -gsm sharddirector1
    

    The value for the -gsm parameter is the name that you will be using to reference this shard director in later GDSCTL commands. The values for the -catalog and -pwd parameters should be the same used when you created the shard catalog database.

    Use the -listener, -localons, and -remoteons parameters as described in the GDSCTL reference to override the default port numbers of 1522, 6123, and 6234, respectively. Always confirm that the port numbers to be used, whether default or user-specified, are available on the host and do not conflict with other running software or Oracle listeners.

  2. Repeat the ADD GSM and START GSM commands for any additional shard directors on each shard director host.

    Replace the shard director name (that is, sharddirector1 in the example) with an appropriate value for each shard director.

    If more than one shard director is used, then multiple regions must have been created for them in the CREATE SHARDCATALOG command, or you can add them later by running ADD REGION.

    Specify a region for each shard director with the -region parameter on each ADD GSM command, as shown here.

    GDSCTL> add gsm -gsm sharddirector2 -catalog catalog_connect_string -pwd gsmcatuser_password -region dc2

For later GDSCTL sessions, you might need to explicitly specify the shard director to be administered. If an error message is shown referencing the default GSMORA shard director, run GDSCTL SET GSM before continuing, as shown here.

GDSCTL> set gsm -gsm sharddirector1

Add Shardspaces If Needed

If you are using composite or user-defined data distribution, and you need to add more shardspaces to complete your desired distributed database topology, use the ADD SHARDSPACE command to add additional shardspaces.

  • Run ADD SHARDSPACE as shown here.
    GDSCTL> add shardspace -shardspace shardspace2 

    By default, the ADD SHARDSPACE command inherits the -chunks and -protectmode values that you used in the CREATE SHARDCATALOG command. You can specify, on a per-shardspace basis,the number of chunks and the Data Guard protection mode by using the -chunks and -protectmode parameters with ADD SHARDSPACE.

Add Shardgroups If Needed

If your distributed database topology uses the system-managed or composite data distribution method, you can add any necessary additional shardgroups for your application.

Each shardspace must contain at least one primary shardgroup and may contain any number or type of standby shardgroups. Shardgroups are not used in the user-defined distribution method.

  • Run ADD SHARDGROUP to add shardgroups to the configuration.
    GDSCTL> add shardgroup -shardgroup shardgroup_primary -shardspace shardspace1
     -deploy_as primary -region region1
    GDSCTL> add shardgroup -shardgroup shardgroup_standby -shardspace shardspace1
     -deploy_as active_standby -region region2
    

    Note that when you run ADD SHARDGROUP you can specify one of three types of shardgroups: primary, standby (mounted, not open), and active_standby (open, available for queries) using the -deploy_as parameter (the default is standby).

    Any shards subsequently added to the shardgroup must be opened in the mode corresponding to the -deploy_as setting for the shardgroup. For example, read-write for primary shardgroups, mounted for standby shardgroups, or read-only with apply for active standby shardgroups.

    After shards are deployed, their current mode is monitored by the shard directors and communicated to the shard catalog such that it is possible and expected that shards of different open modes may be in the same shardgroup, depending upon subsequent switchover or failover operations.

Verify the Distributed Database Topology

Before adding information about your shard databases to the catalog, verify that your distributed database topology is correct before proceeding by using the various GDSCTL CONFIG commands.

Once shards are added and deployed, it is no longer possible to change much of the shard catalog metadata, so validating your configuration is an important task at this point.

  • Run GDSCTL CONFIG to view overall configuration information.
    GDSCTL> config
    
    Regions
    ------------------------
    region1                       
    region2                       
    
    GSMs
    ------------------------
    sharddirector1                          
    sharddirector2                          
    
    Sharded Database
    ------------------------
    orasdb                     
    
    Databases
    ------------------------ 
    
    Shard Groups
    ------------------------
    shardgroup_primary                         
    shardgroup_standby                         
    
    Shard spaces
    ------------------------
    shardspaceora                         
    
    Services
    ------------------------
    
    GDSCTL pending requests
    ------------------------
    Command                   Object                  Status
    -------                   ------                  ------
    
    Global properties
    ------------------------
    Name: oradbcloud
    Master GSM: sharddirector1
    DDL sequence #: 0

    You can use the various GDSCTL CONFIG commands to display more information about shardspaces, shardgroups, and other shard catalog objects. For a complete list of GDSCTL CONFIG command variants, see the GDSCTL reference documentation or run GDSCTL HELP.

Add the Shard CDBs

Add the CDBs containing the shard PDBs to the distributed database configuration with the ADD CDB command.

  1. Run the ADD CDB command as shown here.
    GDSCTL> add cdb -connect cdb_connect_string -pwd gsmrootuser_password

    This command causes GDSCTL to connect to GSMROOTUSER/gsmrootuser_password@cdb_connect_string as SYSDG to validate settings and to retrieve the DB_UNIQUE_NAME of the CDB, which will become the CDB name in the shard catalog.

  2. Repeat the ADD CDB command for all of the CDBs that contain a shard PDB in the configuration.
  3. When all of the CDBs are added, run GDSCTL CONFIG CDB to display a list of CDBs in the catalog.
    GDSCTL> config cdb

Add the Shard PDBs

Use the ADD SHARD command to add the shard PDB information to the shard catalog, then verify it with the CONFIG SHARD command.

  1. Run ADD SHARD with the usage appropriate to your data distribution method, as shown in the following examples.

    For system-managed or composite distribution, run ADD SHARD with the parameters shown here.

    
    GDSCTL> add shard -connect shard_connect_string -pwd gsmuser_password 
    -shardgroup shardgroup_name -cdb cdb_name
    

    For user-defined distribution, the command usage is slightly different.

    GDSCTL> add shard -connect shard_connect_string -pwd gsmuser_password 
    -shardspace shardspace_name -deploy_as db_mode -cdb cdb_name
    

    The -cdb parameter specifies the name of the CDB in which the shard PDB exists, -shardgroup or -shardspace specifies the location of the shard in your distributed database topology, and -deploy_as specifies the open mode (primary, standby, active_standby) of the shard.

    Note:

    It is highly recommended that you set server=dedicated in the connect string.

    When you run ADD SHARD, GDSCTL connects to GSMUSER/gsmuser_password@shard_connect_string as SYSDG to validate the settings on the shard, re-runs dbms_gsm_fix.validateShard to check for errors, and constructs the shard name using the convention db_unique_name_of_CDB_PDB_name (for example cdb1_pdb1).

    Finally, the metadata that describes the shard is added to the shard catalog.

  2. Run GDSCTL CONFIG SHARD to view the shard metadata on the shard catalog.
    GDSCTL> config shard
    Name      Shard Group          Status    State    Region    Availability
    --------- -------------------  ------    -----    ------    ------------
    cdb1_pdb1 shardgroup_primary   U         none     region1   -
    cdb2_pdb1 shardgroup_standby   U         none     region2   -
    cdb3_pdb2 shardgroup_primary   U         none     region1   -
    cdb4_pdb2 shardgroup_standby   U         none     region2   -
    

    Note that the value for Status is U for “undeployed”, and State and Availability are none and - until the DEPLOY command is successfully run.

Add Host Metadata

Add all of the host names and IP addresses of your shard hosts to the shard catalog.

As part of the deployment process, the shard director contacts the shards and directs them to register with the shard director’s TNS listener process. This listener process only accepts incoming registration requests from trusted sources and will reject registration requests from unknown hosts.

If your shard hosts have multiple host names or network interfaces assigned to them, it is possible that the incoming registration request to the shard director may come from a host that was not automatically added during ADD SHARD. In this case, the registration request is rejected and the shard will not deploy correctly. The visible symptom of this problem will be that CONFIG SHARD shows PENDING for the shard’s Availability after DEPLOY has completed.

To avoid this issue, use the GDSCTL ADD INVITEDNODE command to manually add all host names and IP addresses of your shard hosts to the shard catalog metadata.

  1. View a list of trusted hosts.

    By default, the ADD SHARD command adds the default host name of the shard host to the shard catalog metadata, so that any registration requests from that host to the shard director will be accepted. You can view the list of trusted hosts by running the GDSCTL CONFIG VNCR command.

    GDSCTL> config vncr
  2. Ping from all of the hosts in the configuration to verify successful host name resolution.

    Any hosts listed in the CONFIG VNCR output must be reachable by name from all of the other hosts in the topology. Use the ping command from the shard, shard catalog, and shard director hosts to verify that hostname resolution succeeds for all of the host names listed.

    To resolve any issues, use operating system commands or settings to ensure that all of the host names can be resolved.

  3. Run the REMOVE INVITEDNODE command to manually remove any host names that are not necessary and cannot be resolved from all of the hosts.
  4. Run the ADD INVITEDNODE command to manually add all host names and IP addresses of your shard hosts to the shard catalog metadata.
    GDSCTL> add invitednode 127.0.0.1

Check Free DB_FILES

Verify that there are enough free data files in each shard to make sure there is enough capacity to create the number of chunks and tablespace sets you need for the distributed database.

To check free DB_FILES and parameter setting:

SQL> select count(*) from v$datafile;

  COUNT(*)
----------
XxxXX

SQL> show parameter db_files

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_files			     integer	 200

Formulas to calculate the DB_FILES setting can be found in Create the Shard Databases.