Using SODA ID as the Sharding Key
You can designate the SODA ID as the sharding key when creating the distributed database schema.
The following examples show you how to create a sharded table for the JSON collection, create the SODA mapping, and access the sharded table from an application with Java and Python code samples.
Creating a Sharded Table for the JSON Collection
To create a sharded table that uses the SODA ID as the sharding key:
- Connect to the shard catalog as the Sharding/SODA user.
- Enable
SHARD DDL
. - Create a tablespace set.
- Run
CREATE SHARDED TABLE
, as shown in the example below.
The following example creates a sharded table (Customers) for a JSON collection of customer profile documents (CUSTPROFILE).
A column for the SODA ID (ID) identifies the JSON entries, and is also used as the primary key and sharding key. When creating a JSON entry in the table with SODA, the application populates the ID column with a unique value.
The other columns are the default column names given when SODA creates a table to hold an underlying collection. You can see this for yourself when creating a SODA collection and then examining the created table.
Creating a Sharded Table: System-Managed
/* Enable shard DDL */
ALTER SESSION ENABLE SHARD DDL;
/* Create a tablespace set */
CREATE TABLESPACE SET TSP_SET_1 USING TEMPLATE
(datafile size 100m autoextend on next 10M maxsize unlimited
extent management local segment space management auto);
/* Create the sharded table */
CREATE SHARDED TABLE CUSTOMERS
(
"ID" VARCHAR2(255) NOT NULL,
"CREATED_ON" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"LAST_MODIFIED" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"VERSION" varchar2(255) NOT NULL,
"CUSTPROFILE" JSON,
PRIMARY KEY (ID),
)
TABLESPACE SET TSP_SET_1
PARTITION BY CONSISTENT HASH (ID) PARTITIONS AUTO;
Creating a Sharded Table: User-Defined
If the SODA ID has a meaningful value, then the database can be sharded with the user-defined method, and you can create a sharded table using the example below.
Before creating the sharded table in a user-defined distributed database, ensure that the necessary tablespaces and shardspaces have been created. See User-Defined Data Distribution and Configure the Distributed Database Topology for details about creating distributed database objects.
/* Enable shard DDL */
ALTER SESSION ENABLE SHARD DDL;
/* Create the sharded table */
CREATE SHARDED TABLE CUSTOMERS
(
"ID" VARCHAR2(255) NOT NULL,
"CREATED_ON" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"LAST_MODIFIED" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"VERSION" varchar2(255) NOT NULL,
"CUSTPROFILE" JSON,
PRIMARY KEY (ID),
)
PARTITION BY RANGE (ID)
(PARTITION p1 VALUES LESS THAN ('5000') TABLESPACE ts1,
PARTITION p2 VALUES LESS THAN ('10000') TABLESPACE ts2)
Creating a Mapped SODA Collection on the Sharded Table
Create a mapped SODA collection to let SODA know which columns to use when working with the sharded table.
In this task, you first run a procedure to create the mapped collection, which creates the metadata necessary for SODA to recognize the previously created table as a SODA collection.
Afterwards you run an additional procedure,
sys.exec_shard_plsql()
, which ensures that the map collection
is created on all shards and all future shards.
Creating a SODA Mapped Collection Across All Shards
As the Sharding/SODA user and with SHARD DDL
enabled,
run the following commands on the shard catalog. The shard catalog propagates the
procedure to all of the shards to be processed automatically.
GRANT SODA_APP TO PROCEDURE APP_SCHEMA.COLLECTION_PROC_CUSTOMERS;
create or replace procedure COLLECTION_PROC_CUSTOMERS AS
METADATA varchar2(8000);
COL SODA_COLLECTION_T;
begin METADATA := '{"tableName":"CUSTOMERS",
"keyColumn":{"name":"ID","assignmentMethod" : "CLIENT"},
"contentColumn":{"name":"CUSTPROFILE","sqlType":"JSON"},
"versionColumn":{"name":"VERSION","method":"UUID"},
"lastModifiedColumn":{"name":"LAST_MODIFIED"},
"creationTimeColumn":{"name":"CREATED_ON"},
"readOnly":false}';
-- Create a collection using "map" mode, based on
-- the table you've created above and specified in
-- the custom metadata under "tableName" field.
COL := dbms_soda.create_collection('CUSTOMERS',METADATA,DBMS_SODA.CREATE_MODE_MAP);
end ;
/
exec sys.exec_shard_plsql('app_schema.collection_proc_customers()',4+1);
Note that the keyColumn is mapped as ID, which holds the unique ID of each document. It is designated as CLIENT here because the application will supply a unique key for each document on insert.
At this point, a new collection has been created.
You can run PL/SQL to list the collections. On the shard catalog, run the following commands, and verify that the output lists the CUSTOMERS collection as shown here.
SET SERVEROUTPUT ON
DECLARE
l_coll_list SODA_COLLNAME_LIST_T;
BEGIN
l_coll_list := DBMS_SODA.list_collection_names;
IF l_coll_list.COUNT > 0 THEN
FOR i IN 1 .. l_coll_list.COUNT LOOP
DBMS_OUTPUT.put_line(i || ' : ' || l_coll_list(i));
END LOOP;
END IF;
END;
/
1 : CUSTOMERS
PL/SQL procedure successfully completed.
Code Samples
The following code samples in Java and Python show you how to connect to a shard using the sharding key and insert a new document.
Note that when using SODA in a distributed database environment, new documents should be created by connecting to specific shards, and not using the shard catalog.
Java Code Sample
These Java code samples are created for the "Using SODA ID as the Sharding Key" use case.
The Java sample below shows you how to connect to a shard and insert a JSON document into the collection.
import java.sql.Connection;
import java.util.Properties;
import java.util.List;
// SODA specific imports
import oracle.soda.rdbms.OracleRDBMSClient;
import oracle.soda.OracleDatabase;
import oracle.soda.OracleCursor;
import oracle.soda.OracleCollection;
import oracle.soda.OracleDocument;
import oracle.soda.OracleException;
// Sharding and UCP imports
import oracle.jdbc.OracleShardingKey;
import oracle.jdbc.OracleType;
import oracle.jdbc.pool.OracleDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;
import oracle.ucp.jdbc.PoolDataSource;
/*
* The sample demonstrates connecting to a distributed database using
* Oracle JDBC driver and UCP as a client side connection pool.
*/
public class QuickInsertShard {
public static void main(String args[]) throws Exception {
// TNS_ADMIN - Should be the path where the tnsnames.ora file resides
// dbshard_rw - It is the TNS alias present in tnsnames.ora.
// Note that the connection is to the Shard Director (GSM) and the service name is the shard RW service
final String DB_URL="jdbc:oracle:thin:@dbshard_rw?TNS_ADMIN=/home/opc/dbhome/";
// Update the Database Username and Password to the Shard User
final String DB_USER = "app_schema";
String DB_PASSWORD = "<user_password>" ;
// Get the PoolDataSource for UCP
PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
// Set the connection factory first before all other properties
pds.setConnectionFactoryClassName(OracleDataSource.class.getName());
pds.setURL(DB_URL);
pds.setUser(DB_USER);
pds.setPassword(DB_PASSWORD);
pds.setConnectionPoolName("JDBC_UCP_POOL");
// Default is 0. Set the initial number of connections to be created
// when UCP is started.
pds.setInitialPoolSize(10);
// Default is 0. Set the minimum number of connections
// that is maintained by UCP at runtime.
pds.setMinPoolSize(10);
// Instead of Max Pool Size, we can set the number of max connections per shard
pds.setMaxConnectionsPerShard(20);
// We cannot get the connection until we have the Shard key which is part of the SQL
//We first set the sharding key or document id explicitly
String shardingKeyVal="10";
// Now we build the connection using this shard key
OracleShardingKey sdkey = pds.createShardingKeyBuilder().subkey(shardingKeyVal, OracleType.VARCHAR2).build();
System.out.println("Initiating UCP and Creating Connection...");
Connection conn = pds.createConnectionBuilder().shardingKey(sdkey).build();
// Enable the SODA Shared Metadata cache
Properties props = new Properties();
props.put("oracle.soda.sharedMetadataCache", "true");
OracleRDBMSClient cl = new OracleRDBMSClient(props);
// Get a DB Connection for use in SODA
OracleDatabase db = cl.getDatabase(conn);
// Print all the Collections in this DB
List<String> names = db.admin().getCollectionNames();
for (String name : names)
System.out.println ("Collection name: " + name);
// Open up the CUSTOMERS Collection
OracleCollection col = db.openCollection("CUSTOMERS");
//For a collection configured with client-assigned document keys,
//you must provide the key for the input document. Build a document with JSON.
OracleDocument cKeyDoc = db.createDocumentFromString(shardingKeyVal, "{\"name\": \"Matilda\", \"State\": \"CA\", \"ZIP\":\"94065\"}");
// Insert the document above
//If the key already identifies a document in the collection
//then this will replace the existing doc.
OracleDocument savedDoc = col.saveAndGet(cKeyDoc);
// Get the document back assuming we only know the key
// We are still connected to the same shard
OracleDocument doc = col.find().key(shardingKeyVal).getOne();
String content = doc.getContentAsString();
System.out.println("Retrieved content is: " + content);
// We are done, so close the connection to the shard
conn.close();
// At this point we could open up a new shard connection using a different sharding key
}} // End of QuickInsertShard
This Java sample shows how you would perform a multi-shard query.
import java.sql.Connection;
import java.util.Properties;
import java.util.List;
// SODA specific imports
import oracle.soda.rdbms.OracleRDBMSClient;
import oracle.soda.OracleDatabase;
import oracle.soda.OracleCursor;
import oracle.soda.OracleCollection;
import oracle.soda.OracleDocument;
import oracle.soda.OracleException;
// Sharding and UCP imports
import oracle.jdbc.OracleShardingKey;
import oracle.jdbc.OracleType;
import oracle.jdbc.pool.OracleDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;
import oracle.ucp.jdbc.PoolDataSource;
/*
* The sample demonstrates connecting to a distributed database using
* Oracle JDBC driver and UCP as a client side connection pool.
*/
public class QuickQueryCat {
public static void main(String args[]) throws Exception {
// TNS_ADMIN - Should be the path where the tnsnames.ora file resides
// dbshard_rw - It is the TNS alias present in tnsnames.ora.
// This connection is to the shard director using the catalog service name.
final String DB_URL="jdbc:oracle:thin:@dbcat?TNS_ADMIN=/home/opc/dbhome/";
// Update the Database Username and Password to the Shard User
final String DB_USER = "app_schema";
String DB_PASSWORD = "<user_password>" ;
// Get the PoolDataSource for UCP
PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
// Set the connection factory first before all other properties
pds.setConnectionFactoryClassName(OracleDataSource.class.getName());
pds.setURL(DB_URL);
pds.setUser(DB_USER);
pds.setPassword(DB_PASSWORD);
pds.setConnectionPoolName("JDBC_UCP_POOL");
// Now we get a direct connection to the shard catalog
System.out.println("Initiating UCP and Creating Connection...");
Connection conn = pds.getConnection();
// Enable the SODA Shared Metadata cache
Properties props = new Properties();
props.put("oracle.soda.sharedMetadataCache", "true");
OracleRDBMSClient cl = new OracleRDBMSClient(props);
// Get a DB Connection
OracleDatabase db = cl.getDatabase(conn);
// Print all the Collections in this DB
List<String> names = db.admin().getCollectionNames();
for (String name : names)
System.out.println ("Collection name: " + name);
// Open up the CUSTOMERS Collection
OracleCollection col = db.openCollection("CUSTOMERS");
// Do a search across ALL Shards. In this case all users named Matilda
// Setup the specification and open a cursor
OracleDocument filterSpec = db.createDocumentFromString("{ \"name\" : \"Matilda\"}");
OracleCursor c = col.find().filter(filterSpec).getCursor();
// Print the results of the query
while (c.hasNext()) {
OracleDocument resultDoc = c.next();
// Print the document key and document content
System.out.println ("Document key: " + resultDoc.getKey() + "\n" +
" document content: " + resultDoc.getContentAsString());
}
// Close the cursor
c.close();
// Here, we could initiate another multi-shard query if desired
// We are done, so close the connection
conn.close();
}} // End of QuickQueryCat
Python Code Sample
This Python sample shows how you can actually work with JSON objects using SODA in a distributed database environment.
To use this sample code in your environment, follow the instructions to install the cx_Oracle module for Python: https://cx-oracle.readthedocs.io/en/latest/user_guide/installation.html
This example shows how to connect to a shard using the sharding key and insert a new document.
Note that when using SODA in a distributed database environment, new documents should be created by connecting to specific shards and not using the shard catalog.
# import the cx_Oracle module for Python
import cx_Oracle
# Create a connection pool that will be used for connecting to all shards
# The components of the dsn are hostname (shard director),
# port (usually 1522), global service (created with GDSCTL)
# The pool is then created and SODA metadata caching is enabled.
dsn=cx_Oracle.makedsn("shard_director_host",1522,service_name="service_name")
pool=cx_Oracle.SessionPool("app_schema","password",dsn, soda_metadata_cache=True)
# Connect to a specific shard by using the sharding key, which in this example is
# set explicitly with "sodaid", but this might be passed in or part of a loop
# You must know beforehand if you are creating or working with a document for a specific Customer
#
sodaid="2468"
connection=pool.acquire(shardingkey=[sodaid])
# Set autocommit and open the CUSTOMERS collection
connection.autocommit = True
soda = connection.getSodaDatabase()
collection = soda.openCollection("CUSTOMERS")
# Insert a document
# Because you are specifying the shard key, you must pass that in with the document (key=custid)
# The value can be a UUID for example but it need not have any relation to the JSON Content.
content = {'name': 'Matilda', 'State': 'CA', 'ZIP':'94065'}
idcontent=soda.createDocument(content, key=sodaid)
doc = collection.insertOneAndGet(idcontent)
# Fetch the document back by key
doc = collection.find().key(sodaid).getOne()
content = doc.getContent()
print('Retrieved SODA document dictionary is:')
print(content)
# After you have finished, release this connection back into the pool
pool.release(connection)
# If you want to add or work with more customers, start with another connection
# For example: connection=pool.acquire(shardingkey=["123"]) and so on.
#When you are completely finished working with customers you can shut down the pool
pool.close()
This code sample shows you how to run a multi-shard query to return all customer names whose names begin with an "M".
import cx_Oracle
# Create an unpooled connection to the shard catalog
# In general, pooled connections should be used for all connections. This is shown here only as an example.
# The connect string connects to the shard director, but uses the catalog service, e.g. GD$catalog.oradbcloud
connection = cx_Oracle.connect("app_schema","password","db_connect_string")
# Open the CUSTOMERS collection
connection.autocommit = True
soda = connection.getSodaDatabase()
collection = soda.openCollection("CUSTOMERS")
# Now query the collection
# It is important to note that this is a query across ALL shards
# In other words, you will get ALL users whose names start with M
documents = collection.find().filter({'name': {'$like': 'M%'}}).getDocuments()
for d in documents:
content = d.getContent()
print(content["name"])
# Close the connection
connection.close()