14 Developing Applications with Graph Analytics
In order to run graph algorithms, the graph application connects to the graph server (PGX) in the middle tier, which in turn connects to the Oracle Database.
Note:
You can only load property graph views into the graph server (PGX). Loading a SQL property graph into the graph server (PGX) is not supported.
- About Vertex and Edge IDs
 The graph server (PGX) enforces by default the existence of a unique identifier for each vertex and edge in a graph.
- Graph Management in the Graph Server (PGX)
 You can load a graph into the graph server (PGX) and perform different actions such as publish, store, or delete a graph.
- Keeping the Graph in Oracle Database Synchronized with the Graph Server
 You can use the FlashbackSynchronizer API to automatically apply changes made to a graph in the database to the corresponding PgxGraph object in memory, thus keeping both synchronized.
- Optimizing Graphs for Read Versus Updates in the Graph Server (PGX)
 The graph server (PGX) can store a graph optimized either for reads or for updates. This is only relevant when the updates are made directly to a graph instance in the graph server.
- Executing Built-in Algorithms
 The graph server (PGX) contains a set of built-in algorithms that are available as Java APIs.
- Using Custom PGX Graph Algorithms
 A custom PGX graph algorithm allows you to write a graph algorithm in Java syntax and have it automatically compiled to an efficient parallel implementation.
- Creating Subgraphs
 You can create subgraphs based on a graph that has been loaded into memory. You can use filter expressions or create bipartite subgraphs based on a vertex (node) collection that specifies the left set of the bipartite graph.
- Using Automatic Delta Refresh to Handle Database Changes
 You can automatically refresh (auto-refresh) graphs periodically to keep the in-memory graph synchronized with changes to the property graph stored in the property graph tables in Oracle Database (VT$ and GE$ tables).
- User-Defined Functions (UDFs) in PGX
 User-defined functions (UDFs) allow users of PGX to add custom logic to their PGQL queries or custom graph algorithms, to complement built-in functions with custom requirements.
- Using the Graph Server (PGX) as a Library
 When you utilize PGX as a library in your application, the graph server (PGX) instance runs in the same JVM as the Java application and all requests are translated into direct function calls instead of remote procedure invocations.
Parent topic: Using the Graph Server (PGX)
14.1 About Vertex and Edge IDs
The graph server (PGX) enforces by default the existence of a unique identifier for each vertex and edge in a graph.
Each vertex and edge can be looked up by its unique identifier through the APIs PgxGraph.getVertex(ID id) and PgxGraph.getEdge(ID id), or by PGQL queries using the built-in id() method (see the lookup example after the following list).
The generation of vertex and edge IDs is controlled by the configuration fields vertex_id_strategy and edge_id_strategy, which support the following strategies:
                        - keys_as_ids: This is the default strategy to generate vertex IDs.
- partitioned_ids: This is the recommended strategy for partitioned graphs.
- unstable_generated_ids: This results in system generated vertex or edge IDs.
- no_ids: This strategy disables vertex or edge IDs and therefore prevents you from calling APIs using vertex or edge IDs.
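For example, with IDs enabled, a vertex can be looked up either through the Java API or through a PGQL query. The following is a minimal sketch; the graph variable, the NAME property, and the ID value 128 are illustrative assumptions:
// Assumes a PgxGraph named "graph" is already loaded in the session
PgxVertex<Integer> v = graph.getVertex(128);   // direct lookup by vertex ID
PgxResultSet rs = graph.queryPgql("SELECT id(n), n.NAME FROM MATCH (n) WHERE id(n) = 128");
rs.print();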
Using keys to generate IDs
The default strategy to generate the vertex IDs is to use the keys provided during
        loading of the graph (keys_as_ids). In that case, each vertex should have a
        vertex key that is unique across all providers.
                     
For edges, by default no keys are required in the edge data, and edge IDs will
        be automatically generated by PGX (unstable_generated_ids). This automatic
        ID generation can be applied for vertex IDs also. Note that the generation of vertex or edge
        IDs is not guaranteed to be deterministic. If required, it is also possible to load edge
        keys as IDs.
                     
The partitioned_ids strategy requires keys to be unique only
          within a vertex or edge provider (data source). The keys do not have to be globally
        unique. Globally unique IDs are derived from a combination of the provider name and the key
        inside the provider, as
          <provider_name>(<unique_key_within_provider>).
        For example, Account(1).
                     
The partitioned_ids strategy can be set through the configuration fields vertex_id_strategy and edge_id_strategy. For example:
                     
{
  "name": "bank_graph_analytics",
  "optimized_for": "updates",
  "vertex_id_strategy" : "partitioned_ids",
  "edge_id_strategy" : "partitioned_ids",
  "vertex_providers": [
    {
      "name": "Accounts",
      "format": "rdbms",
      "database_table_name": "BANK_ACCOUNTS",
      "key_column": "ID",
      "key_type": "integer",
      "props": [
        {
          "name": "ID",
          "type": "integer"
        },
        {
          "name": "NAME",
          "type": "string"
        }
      ],
      "loading": {
        "create_key_mapping" : true
      }
    }
  ],
  "edge_providers": [
    {
      "name": "Transfers",
      "format": "rdbms",
      "database_table_name": "BANK_TXNS",
      "key_column": "ID",
      "source_column": "FROM_ACCT_ID",
      "destination_column": "TO_ACCT_ID",
      "source_vertex_provider": "Accounts",
      "destination_vertex_provider": "Accounts",
      "props": [
         {
          "name": "ID",
          "type": "integer"
        },
        {
          "name": "AMOUNT",
          "type": "double"
        }
      ],
      "loading": {
        "create_key_mapping" : true
      }
    }
  ]
}
Note:
All available key types are supported in combination with partitioned IDs.
After the graph is loaded, PGX maintains information about which property of a
        provider corresponds to the key of the provider. In the preceding example, the vertex
        property ID happens to correspond to the vertex key and also the edge
        property ID happens to correspond to the edge key. Each provider can have
        at most one such "key property" and the property can have any name.
                     
Note that key properties cannot be updated; in the preceding example, the vertex key property ID cannot be updated.
Using an auto-incrementer to generate partitioned IDs
It is recommended to always set create_key_mapping to
          true to benefit from performance optimizations. But if there are no
        single-column keys for edges, create_key_mapping can be set to
          false. Similarly, create_key_mapping can be set to
          false for vertex providers also. IDs will be generated via an
        auto-incrementer, for example Accounts(1), Accounts(2),
          Accounts(3).
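For example, an edge provider without a single-column key might disable the key mapping as follows. This is a sketch based on the earlier bank example, showing only the relevant fields:
"edge_providers": [
  {
    "name": "Transfers",
    "format": "rdbms",
    "database_table_name": "BANK_TXNS",
    "source_column": "FROM_ACCT_ID",
    "destination_column": "TO_ACCT_ID",
    "source_vertex_provider": "Accounts",
    "destination_vertex_provider": "Accounts",
    "loading": {
      "create_key_mapping": false
    }
  }
]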
                     
See PGQL Queries with Partitioned IDs for more information on executing PGQL queries with partitioned IDs.
Parent topic: Developing Applications with Graph Analytics
14.2 Graph Management in the Graph Server (PGX)
You can load a graph into the graph server (PGX) and perform different actions such as publish, store, or delete a graph.
- Reading Graphs from Oracle Database into the Graph Server (PGX)
 Once logged into the graph server (PGX), you can read graphs from the database into the graph server.
- Storing a Graph Snapshot on Disk
 After reading a graph into memory, you can make any changes to the graph (such as running the PageRank algorithm and storing the values as vertex properties), and then store this snapshot of the graph on disk.
- Publishing a Graph
 You can publish a graph that can be referenced by other sessions.
- Deleting a Graph
 In order to reduce the memory usage of the graph server (PGX), the session must drop the unused graph objects created through the getGraph() method, by invoking the destroy() method.
Parent topic: Developing Applications with Graph Analytics
14.2.1 Reading Graphs from Oracle Database into the Graph Server (PGX)
Once logged into the graph server (PGX), you can read graphs from the database into the graph server.
Your database user must exist and have read access on the graph data in the database.
There are several ways to load a property graph view into the graph server (PGX) from Oracle Database:
- Using the readGraphByName API - see Loading a PG View Using the readGraphByName API for more details.
- Using the PGQL CREATE PROPERTY GRAPH statement - see Creating a Property Graph Using PGQL for more details.
- Using the PgViewSubgraphReader#fromPgView API to create and load a subgraph - see Loading a Subgraph from Property Graph Views for more details.
- Using a PGX graph configuration file in JSON format - see Loading a Graph Using a JSON Configuration File for more details.
- Using the GraphConfigBuilder class to create Oracle RDBMS graph configurations programmatically through Java methods - see Loading a Graph by Defining a Graph Configuration Object for more details.
- Reading Entity Providers at the Same SCN
 If you have a graph which consists of multiple vertex or edge tables or both, then you can read all the vertices and edges at the same System Change Number (SCN).
- Progress Reporting and Estimation for Graph Loading
 Loading a large graph into the graph server (PGX) can be a long-running operation. However, if you load the graph using an asynchronous action, then you can monitor the progress of the graph loading operation.
- API for Loading Graphs into Memory
 Learn about the APIs used for loading a graph using a JSON configuration file or graph configuration object.
- Graph Configuration Options
 Learn about the graph configuration options.
- Data Loading Security Best Practices
 Loading a graph from the database requires authentication and it is therefore important to adhere to certain security guidelines when configuring access to this kind of data source.
- Data Format Support Matrix
 Learn about the different data formats supported in the graph server (PGX).
- Immutability of Loaded Graphs
 Once the graph is loaded into the graph server (PGX), the graph and its properties are automatically marked as immutable.
Parent topic: Graph Management in the Graph Server (PGX)
14.2.1.1 Reading Entity Providers at the Same SCN
If you have a graph which consists of multiple vertex or edge tables or both, then you can read all the vertices and edges at the same System Change Number (SCN).
This helps to overcome issues such as reading edge providers at a later SCN than the SCN at which the vertices were read, as some edges may reference missing vertices.
Note that reading a graph from the database is still possible even if Flashback is not enabled on Oracle Database. In the case of multiple databases, an SCN can be used to maintain consistency only for entity providers belonging to the same database.
You can use the as_of flag in the graph configuration to specify at what
            SCN an entity provider must be read. The valid values for the as_of
            flag are as follows:
                        
Table 14-1 Valid values for "as_of" Key in Graph Configuration
| Value | Description | 
|---|---|
| A positive long value | This is a parseable SCN value. | 
| "<current-scn>" | The current SCN is determined at the beginning of the graph loading. | 
| "<no-scn>" | This is to disable SCN at the time of graph loading. | 
| null | This defaults to the "<current-scn>" behavior. | 
If "as_of" is omitted for a vertex or an edge provider in
            the graph configuration file, then this follows the same behavior as "as_of":
                null.
                        
Example 14-1 Graph Configuration Using "as_of" for Vertex and Edge Providers in the Same Database
The following example configuration has three vertex providers and one edge provider pointing to the same database.
{
  "name": "employee_graph",
  "vertex_providers": [
    {
      "name": "Department",
      "as_of": "<current-scn>",
      "format": "rdbms",
      "database_table_name": "DEPARTMENTS",
      "key_column": "DEPARTMENT_ID",
      "props": [
        {
          "name": "DEPARTMENT_NAME",
          "type": "string"
        }
      ]
    },
    {
      "name": "Location",
      "as_of": "28924323",
      "format": "rdbms",
      "database_table_name": "LOCATIONS",
      "key_column": "LOCATION_ID",
      "props": [
        {
          "name": "CITY",
          "type": "string"
        }
      ]
    },
    {
      "name": "Region",
      "as_of": "<no-scn>",
      "format": "rdbms",
      "database_table_name": "REGIONS",
      "key_column": "REGION_ID",
      "props": [
        {
          "name": "REGION_NAME",
          "type": "string"
        }
      ]
    }
  ],
  "edge_providers": [
    {
      "name": "LocatedAt",
      "format": "rdbms",
      "database_table_name": "DEPARTMENTS",
      "key_column": "DEPARTMENT_ID",
      "source_column": "DEPARTMENT_ID",
      "destination_column": "LOCATION_ID",
      "source_vertex_provider": "Department",
      "destination_vertex_provider": "Location"
    }
  ]
}
When reading the employee_graph using the preceding
                configuration file, the graph is read at the same SCN for the
                    Department and LocatedAt entity providers.
                This is explained in the following table: 
                           
Table 14-2 Example Scenario Using "as_of"
| Entity Provider | "as_of" | SCN Value | 
|---|---|---|
| Department | "<current-scn>" | SCN determined automatically | 
| Location | "28924323" | "28924323"used as SCN | 
| Region | "<no-scn>" | No SCN used | 
| LocatedAt | "as_of"flag is omitted | SCN determined automatically | 
The current SCN value of the database can be determined using one of the following options:
- Querying the V$DATABASE view: SELECT CURRENT_SCN FROM V$DATABASE;
- Using the DBMS_FLASHBACK package: SELECT DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER FROM DUAL;
If you do not have the required privileges to perform either of the preceding operations, then you can use:
SELECT TIMESTAMP_TO_SCN(SYSDATE) FROM DUAL;
However, note that this option is less precise than the earlier two options.
You can then read the graph into the graph server using the JSON configuration file as shown:
opg4j> var g = session.readGraphWithProperties("employee_graph.json")
PgxGraph g = session.readGraphWithProperties("employee_graph.json");
g = session.read_graph_with_properties("employee_graph.json")
14.2.1.2 Progress Reporting and Estimation for Graph Loading
Loading a large graph into the graph server (PGX) can be a long-running operation. However, if you load the graph using an asynchronous action, then you can monitor the progress of the graph loading operation.
The following table shows the asynchronous graph loading APIs supported for the following formats:
Table 14-3 Asynchronous Graph Loading APIs
| Data Format | API | 
|---|---|
| PG VIEWS | session.readGraphByNameAsync() | 
| PG SCHEMA | session.readGraphWithPropertiesAsync() | 
| CSV | session.readGraphFileAsync() | 
These supported APIs return a PgxFuture object.
                        
You can then use the PgxFuture.getProgress() method to collect the
            following statistics:
                        
- Report on the progress of the graph loading operation
- Estimate of the remaining vertices and edges that need to be loaded into memory
For example, the following code shows the steps to load a PG view graph asynchronously and subsequently obtain the FutureProgress object to report and estimate the loading progress. However, note that the graph loading estimate (for example, the number of loaded entities and providers or the number of total entities and providers) can be obtained only while the graph loading operation is in progress. Also, the system internally computes the graph loading progress for every 10,000 entities loaded into the graph server (PGX).
                        
opg4j>  var graphLoadingFuture = session.readGraphByNameAsync("BANK_GRAPH_VIEW", GraphSource.PG_VIEW)
graphLoadingFuture ==> oracle.pgx.api.PgxFuture@6106dfb6[Not completed]
opg4j> while (!graphLoadingFuture.isDone()) {
...>   var progress = graphLoadingFuture.getProgress();
...>   var graphLoadingProgress = progress.asGraphLoadingProgress();
...>   if (graphLoadingProgress.isPresent()) {
...>     var numLoadedVertices = graphLoadingProgress.get().getNumLoadedVertices();
...>   }
...>   Thread.sleep(1000);
...> }
opg4j> var graph = graphLoadingFuture.get();
graph ==> PgxGraph[name=BANK_GRAPH_VIEW_3,N=999,E=4993,created=1664289985985]
PgxFuture<PgxGraph> graphLoadingFuture = session.readGraphByNameAsync("BANK_GRAPH_VIEW", GraphSource.PG_VIEW);
while (!graphLoadingFuture.isDone()) {
  FutureProgress progress = graphLoadingFuture.getProgress();
  Optional<GraphLoadingProgress> graphLoadingProgress = progress.asGraphLoadingProgress();
  if (graphLoadingProgress.isPresent()) {
    long numLoadedVertices = graphLoadingProgress.get().getNumLoadedVertices();
  }
  Thread.sleep(1000);
}
PgxGraph graph = graphLoadingFuture.get();
It is recommended that you do not use the FutureProgress object in a chain of asynchronous operations.
14.2.1.3 API for Loading Graphs into Memory
Learn about the APIs used for loading a graph using a JSON configuration file or graph configuration object.
The following methods in PgxSession can be used to load graphs into
            the graph server (PGX).
                        
PgxGraph readGraphWithProperties(String path)
PgxGraph readGraphWithProperties(String path, String newGraphName)
PgxGraph readGraphWithProperties(GraphConfig config)
PgxGraph readGraphWithProperties(GraphConfig config, String newGraphName)
PgxGraph readGraphWithProperties(GraphConfig config, boolean forceUpdateIfNotFresh)
PgxGraph readGraphWithProperties(GraphConfig config, boolean forceUpdateIfNotFresh, String newGraphName)
PgxGraph readGraphWithProperties(GraphConfig config, long maxAge, TimeUnit maxAgeTimeUnit)
PgxGraph readGraphWithProperties(GraphConfig config, long maxAge, TimeUnit maxAgeTimeUnit, boolean blockIfFull, String newGraphName)
The corresponding Python API is:
read_graph_with_properties(self, config, max_age=9223372036854775807, max_age_time_unit='days',
                           block_if_full=False, update_if_not_fresh=True, graph_name=None)
The first argument (path to a graph configuration file or a parsed config object) is the meta-data of the graph to be read. The meta-data includes the following information:
                           
- Location of the graph data such as file location and name, DB location and connection information, and so on
- Format of the graph data such as plain text formats, XML-based formats, binary formats, and so on
- Types and names of the properties to be loaded
The forceUpdateIfNotFresh and maxAge arguments can be used to fine-tune the acceptable age of the snapshot to be read. The graph server (PGX) will return an existing graph snapshot if the given graph specification was already loaded into memory by a different session. The maxAge argument therefore becomes important when reading from a database in which the data might change frequently. If neither forceUpdateIfNotFresh nor maxAge is specified, PGX will favor cached data over reading new snapshots into memory.
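For example, the following call is a minimal sketch (assuming config is an already-built GraphConfig object) that reuses a cached snapshot only if it is at most one hour old:
// Read the graph, forcing a refresh if the cached snapshot is older than 1 hour
PgxGraph graph = session.readGraphWithProperties(config, 1L, TimeUnit.HOURS);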
                           
14.2.1.4 Graph Configuration Options
Learn about the graph configuration options.
The following table lists the JSON fields that are common to all graph configurations:
Table 14-4 Graph Config JSON Fields
| Field | Type | Description | Default | 
|---|---|---|---|
| name | string | Name of the graph. | Required | 
| array_compaction_threshold | number | [only relevant if the graph is optimized for updates] Threshold used to determine when to compact the delta-logs into a new array. If lower than the engine min_array_compaction_threshold value, min_array_compaction_threshold will be used instead. | 0.2 | 
| attributes | object | Additional attributes needed to read and write the graph data. | null | 
| data_source_id | string | Data source id to use to connect to an RDBMS instance. | null | 
| edge_id_strategy | enum[no_ids, keys_as_ids, unstable_generated_ids] | Indicates what ID strategy should be used for the edges of this graph. If not specified (or set to null), the strategy will be determined during loading or using a default value. | null | 
| edge_id_type | enum[long] | Type of the edge ID. Setting it to long requires the IDs in the edge providers to be unique across the graphs; those IDs will be used as global IDs. Setting it to null (or omitting it) will allow repeated IDs across different edge providers and PGX will automatically generate globally-unique IDs for the edges. | null | 
| edge_providers | array of object | List of edge providers in this graph. | [] | 
| error_handling | object | Error handling configuration. | null | 
| external_stores | array of object | Specification of the external stores where external string properties reside. | [] | 
| jdbc_url | string | JDBC URL pointing to an RDBMS instance | null | 
| keystore_alias | string | Alias to the keystore to use when connecting to database. | null | 
| loading | object | Loading-specific configuration to use. | null | 
| local_date_format | array of string | Array of local_date formats to use when loading and storing local_date properties. See DateTimeFormatter for more details of the format string. | [] | 
| max_prefetched_rows | integer | Maximum number of rows prefetched during each round trip resultset-database. | 10000 | 
| num_connections | integer | Number of connections to read and write data from or to the RDBMS table. | <no-of-cpus> | 
| optimized_for | enum[read, updates] | Indicates if the graph should use data-structures optimized for read-intensive scenarios or for fast updates. | read | 
| password | string | Password to use when connecting to database. | null | 
| point2d | string | Longitude and latitude as floating point values separated by a space. | 0.0 0.0 | 
| prepared_queries | array of object | An additional list of prepared queries with arguments, working in the same way as 'queries'. Data matching at least one of those queries will also be loaded. | [] | 
| queries | array of string | A list of queries used to determine which data to load from the database. Data matching at least one of the queries will be loaded. Not setting any query will load the entire graph. | [] | 
| redaction_rules | array of object | Array of redaction rules. | [] | 
| rules_mapping | array of object | Mapping for redaction rules to users and roles. | [] | 
| schema | string | Schema to use when reading or writing RDBMS objects | null | 
| source_name | string | Name of the database graph, if the graph is loaded from a database. | null | 
| source_type | enum[pg_view] | Source type for database graphs. | null | 
| time_format | array of string | The time format to use when loading and storing time properties. See DateTimeFormatter for a documentation of the format string. | [] | 
| time_with_timezone_format | array of string | The time with timezone format to use when loading and storing time with timezone properties. See DateTimeFormatter for more information of the format string. | [] | 
| timestamp_format | array of string | The timestamp format to use when loading and storing timestamp properties. See DateTimeFormatter for more information of the format string. | [] | 
| timestamp_with_timezone_format | array of string | The timestamp with timezone format to use when loading and storing timestamp with timezone properties. See DateTimeFormatter for more information of the format string. | [] | 
| username | string | Username to use when connecting to an RDBMS instance. | null | 
| vector_component_delimiter | character | Delimiter for the different components of vector properties. | ; | 
| vertex_id_strategy | enum[no_ids, keys_as_ids, unstable_generated_ids] | Indicates what ID strategy should be used for the vertices of this graph. If not specified (or set to null), the strategy will be automatically detected. | null | 
| vertex_id_type | enum[int, integer, long, string] | Type of the vertex ID. For homogeneous graphs, if not specified (or set to null), it will default to a specific value (depending on the origin of the data). | null | 
| vertex_providers | array of object | List of vertex providers in this graph. | [] | 
Note:
Database connection fields specified in the graph configuration will be used as defaults in case the underlying data provider configuration does not specify them.
Provider Configuration JSON file Options
You can specify the meta-information about each provider's data using provider configurations. Provider configurations include the following information about the provider data:
- Location of the data: a file, multiple files or database providers
- Information about the properties: name and type of the property
Table 14-5 Provider Configuration JSON file Options
| Field | Type | Description | Default | 
|---|---|---|---|
| format | enum[pgb, csv, rdbms] | Provider format. | Required | 
| name | string | Entity provider name. | Required | 
| attributes | object | Additional attributes needed to read and write the graph data. | null | 
| destination_vertex_provider | string | Name of the destination vertex provider to be used for this edge provider. | null | 
| error_handling | object | Error handling configuration. | null | 
| has_keys | boolean | Indicates if the provided entities data have keys. | true | 
| key_type | enum[int, integer, long, string] | Type of the keys. | long | 
| keystore_alias | string | Alias to the keystore to use when connecting to database. | null | 
| label | string | Label for the entities loaded from this provider. | null | 
| loading | object | Loading-specific configuration. | null | 
| local_date_format | array of string | Array of local_date formats to use when loading and storing local_date properties. See DateTimeFormatter for a documentation of the format string. | [] | 
| password | string | Password to use when connecting to database. | null | 
| point2d | string | Longitude and latitude as floating point values separated by a space. | 0.0 0.0 | 
| props | array of object | Specification of the properties associated with this entity provider. | [] | 
| source_vertex_provider | string | Name of the source vertex provider to be used for this edge provider. | null | 
| time_format | array of string | The time format to use when loading and storing time properties. See DateTimeFormatter for a documentation of the format string. | [] | 
| time_with_timezone_format | array of string | The time with timezone format to use when loading and storing time with timezone properties. See DateTimeFormatter for a documentation of the format string. | [] | 
| timestamp_format | array of string | The timestamp format to use when loading and storing timestamp properties. See DateTimeFormatter for a documentation of the format string. | [] | 
| timestamp_with_timezone_format | array of string | The timestamp with timezone format to use when loading and storing timestamp with timezone properties. See DateTimeFormatter for a documentation of the format string. | [] | 
| vector_component_delimiter | character | Delimiter for the different components of vector properties. | ; | 
Provider Labels
The label field in the provider configuration can be used to set a label for the entities loaded from the provider. If no label is specified, all entities from the provider are labeled with the name of the provider. It is only possible to set the same label for two different providers if they have exactly the same properties (same names and same types).
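For example, a vertex provider might override its default label as follows (a sketch; the table and label names are illustrative):
"vertex_providers": [
  {
    "name": "BANK_ACCOUNTS",
    "format": "rdbms",
    "database_table_name": "BANK_ACCOUNTS",
    "key_column": "ID",
    "label": "Account"
  }
]
Without the label field, the vertices loaded from this provider would be labeled BANK_ACCOUNTS, that is, the provider name.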
                           
Property Configuration
The props entry in the Provider configuration is an object with the following JSON fields:
                           
Table 14-6 Property Configuration
| Field | Type | Description | Default | 
|---|---|---|---|
| name | string | Name of the property. | Required | 
| type | enum[boolean, integer, vertex, edge, float, long, double, string, date, local_date, time, timestamp, time_with_timezone, timestamp_with_timezone, point2d] | Type of the property. Note: date is deprecated; use one of local_date, time, timestamp, time_with_timezone, or timestamp_with_timezone instead. | Required | 
| aggregate | enum[identity, group_key, min, max, avg, sum, concat, count] | [currently unsupported] which aggregation function to use, aggregation always happens by vertex key. | null | 
| column | value | Name or index (starting from 0) of the column holding the property data. If it is not specified, the loader will try to use the property name as column name (for CSV format only). | null | 
| default | value | Default value to be assigned to this property if the datasource does not provide it. In case of date type: string is expected to be formatted with yyyy-MM-dd HH:mm:ss. If no default is present (null), non-existent properties will contain default Java types (primitives), empty string (string), or 01.01.1970 00:00 (date). | null | 
| dimension | integer | Dimension of property. | 0 | 
| drop_after_loading | boolean | [currently unsupported] indicating helper properties only used for aggregation, which are dropped after loading | false | 
| field | value | Name of the JSON field holding the property data. Nesting is denoted by dot separation. Field names containing dots are possible; in this case the dots need to be escaped using backslashes to resolve ambiguities. Only the exactly specified objects are loaded; if they are non-existent, the default value is used. | null | 
| format | array of string | Array of formats of property. | [] | 
| group_key | string | [currently unsupported] can only be used if the property / key is part of the grouping expression. | null | 
| max_distinct_strings_per_pool | integer | [only relevant if string_pooling_strategy is indexed] Amount of distinct strings per property after which to stop pooling. If the limit is reached an exception is thrown. If set to null, the default value from the global PGX configuration will be used. | null | 
| stores | array of object | A list of storage identifiers that indicate where this property resides. | [] | 
| string_pooling_strategy | enum[indexed, on_heap, none] | Indicates which string pooling strategy to use. If set to null, the default value from the global PGX configuration will be used. | null | 
Loading Configuration
The loading entry is a JSON object with the following fields:
                           
Table 14-7 Loading Configuration
| Field | Type | Description | Default | 
|---|---|---|---|
| create_key_mapping | boolean | If true, a mapping between entity keys and internal IDs is prepared during loading. | true | 
| filter | string | [currently unsupported] the filter expression | null | 
| grouping_by | array of string | [currently unsupported] array of edge properties used for aggregator. For Vertices, only the ID can be used (default) | [] | 
| load_labels | boolean | Whether or not to load the entity label if it is available. | false | 
| strict_mode | boolean | If true, exceptions are thrown and logged with ERROR level whenever the loader encounters problems with the input file, such as invalid format, repeated keys, missing fields, mismatches and other potential errors. If false, the loader may use less memory during the loading phase, but may behave unexpectedly with erratic input files. | true | 
Error Handling Configuration
The error_handling entry is a JSON object with the following fields:
                           
Table 14-8 Error Handling Configuration
| Field | Type | Description | Default | 
|---|---|---|---|
| on_missed_prop_key | enum[silent, log_warn, log_warn_once, error] | Error handling for a missing property key. | log_warn_once | 
| on_missing_vertex | enum[ignore_edge, ignore_edge_log, ignore_edge_log_once, create_vertex, create_vertex_log, create_vertex_log_once, error] | Error handling for a missing source or destination vertex of an edge in a vertex data source. | error | 
| on_parsing_issue | enum[silent, log_warn, log_warn_once, error] | Error handling for incorrect data parsing. If set to silent, log_warnorlog_warn_once, will attempt to continue loading. Some parsing issues may not be recoverable and provoke the end of loading. | error | 
| on_prop_conversion | enum[silent, log_warn, log_warn_once, error] | Error handling when encountering a different property type other than the one specified, but coercion is possible. | log_warn_once | 
| on_type_mismatch | enum[silent, log_warn, log_warn_once, error] | Error handling when encountering a different property type other than the one specified, but coercion is not possible. | error | 
| on_vector_length_mismatch | enum[silent, log_warn, log_warn_once, error] | Error handling for a vector property that does not have the correct dimension. | error | 
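For example, an error_handling entry in a graph or provider configuration might look as follows (a sketch; the chosen values are illustrative and favor tolerant loading):
"error_handling": {
  "on_missing_vertex": "ignore_edge",
  "on_parsing_issue": "log_warn",
  "on_prop_conversion": "log_warn_once"
}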
Note:
The only supported setting for the on_missing_vertex error handling configuration is ignore_edge.
14.2.1.5 Data Loading Security Best Practices
Loading a graph from the database requires authentication and it is therefore important to adhere to certain security guidelines when configuring access to this kind of data source.
The following guidelines are recommended:
- The user or role used to access the data should be a read-only account that only has access to the required graph data.
- The graph data should be marked as read-only, for example, with non-updateable views in the case of the database.
14.2.1.6 Data Format Support Matrix
Learn about the different data formats supported in the graph server (PGX).
Note:
The table refers to limitations of the PGX implementation of the format and not necessarily to limitations of the format itself.
Table 14-9 Data Format Support Matrix
| Format | Vertex IDs | Edge IDs | Vertex Labels | Edge Labels | Vector properties | 
|---|---|---|---|---|---|
| PGB | int, long, string | long | multiple | single | supported (vectors can be of type integer, long, float or double) | 
| CSV | int, long, string | long | multiple | single | supported (vectors can be of type integer, long, float or double) | 
| ADJ_LIST | int, long, string | Not supported | Not supported | Not supported | supported (vectors can be of type integer, long, float or double) | 
| EDGE_LIST | int, long, string | Not supported | multiple | single | supported (vectors can be of type integer, long, float or double) | 
| GRAPHML | int, long, string | Not supported | Not supported | Not supported | Not supported | 
14.2.1.7 Immutability of Loaded Graphs
Once the graph is loaded into the graph server (PGX), the graph and its properties are automatically marked as immutable.
The immutability of loaded graphs is due to the following design choices:
- Typical graph analyses happen on a snapshot of a graph instance, and therefore they do not require mutations of the graph instance.
- Immutability allows PGX to use an internal graph representation optimized for fast analysis.
- In remote mode, the graph instance might be shared among multiple clients.
However, the graph server (PGX) also provides methods to customize and mutate graph instances for the purpose of analysis. See Graph Mutation and Subgraphs for more information.
14.2.2 Storing a Graph Snapshot on Disk
After reading a graph into memory, you can make any changes to the graph (such as running the PageRank algorithm and storing the values as vertex properties), and then store this snapshot of the graph on disk.
If you want to save the state of the graph in memory, then a snapshot of a graph can be saved as a file in binary format (PGB file).
In general, if you must shut down the graph server, then it is recommended that you store all the graph queries and analytics APIs that have been run on the graph. Once the graph server (PGX) is restarted, you can reload the graph and rerun the APIs.
However, if you must save the state of the graph, then the following example explains how to store the graph snapshot on disk.
As a prerequisite for storing the graph snapshot, you need to explicitly authorize access to the corresponding directories by defining a directory object pointing to the directory (on the graph server) that contains the files to read or write.
CREATE OR REPLACE DIRECTORY pgx_file_location AS '<path_to_dir>';
GRANT READ, WRITE ON DIRECTORY pgx_file_location TO GRAPH_DEVELOPER;
Also, note the following:
- The directory in the CREATE DIRECTORY statement must exist on the graph server (PGX).
- The directory must be readable (and/or writable) at the OS level by the graph server (PGX).
The preceding code grants the privileges on the directory to the
                GRAPH_DEVELOPER role. However, you can also grant permissions to an
            individual user:
                     
GRANT READ, WRITE ON DIRECTORY pgx_file_location TO <graph_user>;
You can then run the following code to load a property graph view into the graph server (PGX) and save the graph snapshot as a file. Note that multiple PGB files will be generated, one for each vertex and edge provider in the graph.
opg4j> var graph = session.readGraphByName("BANK_GRAPH", GraphSource.PG_VIEW)
graph ==> PgxGraph[name=BANK_GRAPH_NEW,N=999,E=4993,created=1676021791568]
opg4j> analyst.pagerank(graph)
$8 ==> VertexProperty[name=pagerank,type=double,graph=BANK_GRAPH]
// Now save the state of this graph
opg4j> var storedPgbConfig = graph.store(ProviderFormat.PGB, "<path_to_dir>")
In a three-tier deployment, the file is written on the server-side file system. You must also ensure that the file location to write is specified in the graph server (PGX). (As explained in Three-Tier Deployments of Oracle Graph with Autonomous Database, in a three-tier deployment, access to the PGX server file system requires a list of allowed locations to be specified.)
Parent topic: Graph Management in the Graph Server (PGX)
14.2.3 Publishing a Graph
You can publish a graph that can be referenced by other sessions.
Publishing a Single Graph Snapshot
The PgxGraph#publish() method can be used to publish the
                current selected snapshot of the graph. The publish operation will move the graph
                name from the session-private namespace to the public namespace. If a graph with the
                same name has been already published, then the publish() method
                will fail with an exception. Graphs published with snapshots and single published
                snapshots share the same namespace.
                        
Table 12-6 describes the grants required to publish a graph.
Note that calling the publish() method without arguments
                publishes the snapshot with its persistent properties only. However, if you want to
                publish specific transient properties, then you must list them within the
                    publish() call as shown:
                        
opg4j> var prop1 = graph.createVertexProperty(PropertyType.INTEGER, "prop1")
opg4j> prop1.fill(0)
opg4j> var cost = graph.createEdgeProperty(PropertyType.DOUBLE, "cost")
opg4j> cost.fill(0d)
opg4j> graph.publish(List.of(prop1), List.of(cost))
VertexProperty<Integer, Integer> prop1 = graph.createVertexProperty(PropertyType.INTEGER, "prop1");
prop1.fill(0);
EdgeProperty<Double> cost = graph.createEdgeProperty(PropertyType.DOUBLE, "cost");
cost.fill(0d);
List<VertexProperty<Integer, Integer>> vertexProps = Arrays.asList(prop1);
List<EdgeProperty<Double>> edgeProps = Arrays.asList(cost);
graph.publish(vertexProps, edgeProps);
prop1 = graph.create_vertex_property("integer", "prop1")
prop1.fill(0)
cost = graph.create_edge_property("double", "cost")
cost.fill(0d)
vertex_props = [prop1]
edge_props = [cost]
graph.publish(vertex_props, edge_props)
Publishing a Graph with Snapshots
If you want to make all snapshots of the graph
                visible to other sessions, then use the publishWithSnapshots()
                method. When a graph is published with snapshots, the GraphMetaData
                information of each snapshot is also made available to the other sessions, with the
                exception of the graph configuration, which is null.
                        
When calling the publishWithSnapshots() method, all
                the persistent properties of all the snapshots are published and made visible to the
                other sessions. Transient properties are session-private and therefore they must be
                published explicitly. Once published, all properties become read-only.
                        
Similar
                to publishing a single graph snapshot, the publishWithSnapshots()
                method will move the graph name from the session-private namespace to the public
                namespace. If a graph with the same name has been already published, then the
                    publishWithSnapshots() method will fail with an exception.
                        
If you want to publish specific transient
                properties, you should list them within the publishWithSnapshots()
                call, as in the following example:
                        
opg4j> var prop1 = graph.createVertexProperty(PropertyType.INTEGER, "prop1")
opg4j> prop1.fill(0)
opg4j> var cost = graph.createEdgeProperty(PropertyType.DOUBLE, "cost")
opg4j> cost.fill(0d)
opg4j> graph.publishWithSnapshots(List.of(prop1), List.of(cost))
VertexProperty<Integer, Integer> prop1 = graph.createVertexProperty(PropertyType.INTEGER, "prop1");
prop1.fill(0);
EdgeProperty<Double> cost = graph.createEdgeProperty(PropertyType.DOUBLE, "cost");
cost.fill(0d);
List<VertexProperty<Integer, Integer>> vertexProps = Arrays.asList(prop1);
List<EdgeProperty<Double>> edgeProps = Arrays.asList(cost);
graph.publishWithSnapshots(vertexProps, edgeProps);
Referencing a Published Graph from Another Session
You can reference a published graph by its name in another session,
                using the PgxSession#getGraph() method.
                        
The following example references a published graph
                    myGraph in a new session (session2):
                        
opg4j> var session2 = instance.createSession("session2")
opg4j> var graph2 = session2.getGraph(Namespace.PUBLIC, "myGraph")
PgxSession session2 = instance.createSession("session2");
PgxGraph graph2 = session2.getGraph(Namespace.PUBLIC, "myGraph");
session2 = pypgx.get_session(session_name="session2")
graph2 = session2.get_graph("myGraph")
                           session2 can access only the published snapshot. If the graph has
                been published without snapshots, calling the
                    getAvailableSnapshots() method will return an empty queue.
                        
If the graph has been published with snapshots, then the call to getGraph() returns the most recent available snapshot. session2 can see all the available snapshots through the getAvailableSnapshots() method. You can then set a specific snapshot using the PgxSession#setSnapshot() method.
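For example, to move the graph2 reference from the preceding example to the latest published snapshot:
session2.setSnapshot(graph2, PgxSession.LATEST_SNAPSHOT);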
                        
Note:
If a referenced graph is not required anymore, then it is important that you release the graph. See Deleting a Graph for more information.
Publishing a Property
After publishing (a single snapshot or all of them), you can still publish transient properties individually. Published properties are associated to a specific snapshot on which they are created, and hence visible only on that snapshot.
opg4j> graph.getVertexProperty("prop1").publish()
opg4j> graph.getEdgeProperty("cost").publish()
graph.getVertexProperty("prop1").publish();
graph.getEdgeProperty("cost").publish();graph.get_vertex_property("prop1").publish()
graph.get_edge_property("cost").publish()Getting a Published Property in Another Session
Sessions referencing a published graph (with or without snapshots) can reference a published property through the PgxGraph#getVertexProperty and PgxGraph#getEdgeProperty methods.
                     opg4j> var session2 = instance.createSession("session2")
opg4j> var graph2 = session2.getGraph(Namespace.PUBLIC, "myGraph")
opg4j> var vertexProperty = graph2.getVertexProperty("prop1")
opg4j> var edgeProperty = graph2.getEdgeProperty("cost")PgxSession session2 = instance.createSession("session2");
PgxGraph graph2 = session2.getGraph(Namespace.PUBLIC, "myGraph");
VertexProperty<Integer, Integer> vertexProperty = graph2.getVertexProperty("prop1");
EdgeProperty<Double> edgeProperty = graph2.getEdgeProperty("cost");session2 = pypgx.get_session(session_name ="session2")
graph2 = session2.get_graph("myGraph")
vertex_property = graph2.get_vertex_property("prop1")
edge_property = graph2.get_edge_property("cost")Pinning a Published Graph
You can pin a published graph so that it remains published even if no session uses it.
opg4j> graph.pin()
graph.pin();
>>> graph.pin()
Unpinning a Published Graph
You can unpin a published graph that was earlier pinned. By doing this, you can remove the graph and all its snapshots, if no other session is using a snapshot of the graph.
opg4j> var graph = session.getGraph("bank_graph_analytics")
graph ==> PgxGraph[name=bank_graph_analytics,N=999,E=4993,created=1660217577201]
opg4j> graph.unpin()
PgxGraph graph = session.getGraph("bank_graph_analytics");
graph.unpin();
>>> graph = session.get_graph("bank_graph_analytics")
>>> graph.unpin()
Parent topic: Graph Management in the Graph Server (PGX)
14.2.4 Deleting a Graph
In order to reduce the memory usage of the graph server (PGX), the session must drop the
  unused graph objects created through the getGraph() method, by invoking the
   destroy() method.
                     
Calling the destroy() method not only destroys the specified graph, but also all of its associated properties, including transient properties. In addition,
   all of the collections related to the graph instance (for example, a VertexSet)
   are also destroyed automatically. If a session holds multiple PgxGraph objects
   referencing the same graph, invoking destroy() on any of them will invalidate
   all the PgxGraph objects referencing that graph, making any operation on those
   objects fail. 
                     
For example:
opg4j> var graph1 = session.getGraph("myGraphName")
opg4j> var graph2 = session.getGraph("myGraphName")
opg4j> graph2.destroy()                               // Delete graph2
opg4j> var properties = graph1.getVertexProperties() // throws an exception as graph1 reference is not valid anymore
opg4j> properties = graph2.getVertexProperties()     // throws an exception as graph2 reference is not valid anymore
PgxGraph graph1 = session.getGraph("myGraphName");
// graph2 references the same graph as graph1
PgxGraph graph2 = session.getGraph("myGraphName");
// Delete graph2
graph2.destroy();
// Both the following calls throw an exception, as both references are not valid anymore
Set<VertexProperty<?, ?>> properties = graph1.getVertexProperties();
properties = graph2.getVertexProperties();
graph1 = session.get_graph("myGraphName")
# graph2 references the same graph as graph1
graph2 = session.get_graph("myGraphName")
# Delete graph2
graph2.destroy()
# Both the following calls throw an exception, as both references are not valid anymore
properties = graph1.get_vertex_properties()
properties = graph2.get_vertex_properties()
The same behavior occurs when multiple PgxGraph objects
    reference the same snapshot. Since a snapshot is effectively a graph, destroying a
     PgxGraph object referencing a certain snapshot invalidates all
     PgxGraph objects referencing the same snapshot, but does not invalidate those
    referencing other snapshots:
                        
// Get a snapshot of "myGraphName"
PgxGraph graph1 = session.getGraph("myGraphName");
// graph2 and graph3 reference the same snapshot as graph1
PgxGraph graph2 = session.getGraph("myGraphName");
PgxGraph graph3 = session.getGraph("myGraphName");
// Assume another snapshot is created ...
// Make graph3 reference the latest available snapshot
session.setSnapshot(graph3, PgxSession.LATEST_SNAPSHOT);
graph2.destroy();
// Both the following calls throw an exception, as both references are not valid anymore
Set<VertexProperty<?, ?>> properties = graph1.getVertexProperties();
properties = graph2.getVertexProperties();
// graph3 is still valid, so the call succeeds
properties = graph3.getVertexProperties();
Note:
Even if a graph is destroyed by a session, the graph data may still remain in the server memory, if the graph is currently shared by other sessions. In such a case, the graph may still be visible among the available graphs through the PgxSession.getGraphs() method.
                        As a safe alternative to the manual removal of each graph, the PGX API supports
    some implicit resource management features which allow developers to safely omit the
     destroy() call. See Resource Management Considerations for more information.
                        
Parent topic: Graph Management in the Graph Server (PGX)
14.3 Keeping the Graph in Oracle Database Synchronized with the Graph Server
You can use the FlashbackSynchronizer API to automatically apply changes made to a graph in the database to the corresponding PgxGraph object in memory, thus keeping both synchronized.
                  
This API uses Oracle's Flashback Technology to fetch the changes in the database since the last fetch and then push those changes into the graph server using the ChangeSet API. After the changes are applied, the usual snapshot semantics of the graph server apply: each delta fetch application creates a new in-memory snapshot. Any queries or algorithms that are executing concurrently to snapshot creation are unaffected by the changes until the corresponding session refreshes its PgxGraph object to the latest state by calling the session.setSnapshot(graph, PgxSession.LATEST_SNAPSHOT) method.
                  
Also, if the changes from the previous fetch operation no longer exist, then the synchronizer will throw an exception. This occurs if the previous fetch duration is longer than the UNDO_RETENTION parameter setting in the database. To avoid this exception, ensure that you fetch the changes at intervals less than the UNDO_RETENTION parameter value. The default setting for the UNDO_RETENTION parameter is 900 seconds. See Oracle Database Reference for more information.
                  
Prerequisites for Synchronizing
The Oracle database must have Flashback enabled and the database user that you use to perform synchronization must have:
- Read access to all tables which need to be kept synchronized.
- Permission to use flashback APIs. For example:
GRANT EXECUTE ON DBMS_FLASHBACK TO <user>
The database must also be configured to retain changes for the amount of time needed by your use case.
Types of graphs that can be synchronized
Not all PgxGraph objects in PGX can be synchronized. The following limitations apply:
                     
- Only the original creator of the graph can synchronize it. That is, the current user must have the MANAGE permission on the graph.
- Only graphs loaded from database tables (PG View graphs) can be synchronized. Graphs created from other formats or graphs created via the graph builder API or PG View graphs created from database views cannot be synchronized.
- Only the latest snapshot of a graph can be synchronized.
Types of changes that can be synchronized
The synchronizer supports keeping the in-memory graph snapshot in sync with the following database-side modifications:
- insertion of new vertices and edges
- removal of existing vertices and edges
- update of property values of any vertex or edge
The synchronizer does not support schema-level changes to the input graph, such as:
- alteration of the list of input vertex or edge tables
- alteration of any columns of any input tables (vertex or edge tables)
Furthermore, the synchronizer does not support updates to vertex and edge keys.
For a detailed example, see the following topic:
- Synchronizing a PG View Graph
 You can synchronize a graph loaded into the graph server (PGX) from a property graph view (PG View) with the changes made to the graph in the database.
- Synchronizing a Published Graph
 You can synchronize a published graph by configuring the Flashback Synchronizer with a PartitionedGraphConfig object containing the graph schema along with the database connection details.
Parent topic: Developing Applications with Graph Analytics
14.3.1 Synchronizing a PG View Graph
You can synchronize a graph loaded into the graph server (PGX) from a property graph view (PG View) with the changes made to the graph in the database.
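The overall flow might look like the following minimal Java sketch. It assumes a PG View graph already loaded into the variable graph (for example, with readGraphByName) and an open java.sql.Connection named connection; see the detailed example referenced above for the exact setup:
// Build a Flashback-based synchronizer for the in-memory graph
Synchronizer synchronizer = new Synchronizer.Builder<FlashbackSynchronizer>()
    .setType(FlashbackSynchronizer.class)
    .setGraph(graph)
    .setConnection(connection)
    .build();
// Fetch the pending database changes and apply them, creating a new snapshot
graph = synchronizer.sync();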
14.3.2 Synchronizing a Published Graph
You can synchronize a published graph by configuring the Flashback
        Synchronizer with a PartitionedGraphConfig object containing the graph
        schema along with the database connection details.
                     
The PartitionedGraphConfig object can be created either through the PartitionedGraphConfigBuilder API or by reading the graph configuration from a JSON file.
Though synchronization of graphs created via graph configuration objects is supported in general, the following limitations apply:
- Only partitioned graph configurations with all providers being database tables are supported.
- Each edge or vertex provider (or both) must specify the owner of the table by setting the username field. For example, if user SCOTT owns the table, then set the user name accordingly for the providers.
- Snapshot source must be set to CHANGE_SET.
- It is highly recommended to optimize the graph for update operations in order to avoid memory exhaustion when creating many snapshots.
The following example shows the sample configuration for creating
                the PartitionedGraphConfig object:
                        
{
  ...
  "optimized_for": "updates",
  "vertex_providers": [
      ...
      "username":"<username>",
      ...
  ],
  "edge_providers": [
      ...
      "username":"<username>",
      ...
  ],
  "loading": {
    "snapshots_source": "change_set"
  }
}
GraphConfig cfg = GraphConfigBuilder.forPartitioned()
                      …
                      .setUsername("<username>")
                      .setSnapshotsSource(SnapshotsSource.CHANGE_SET)
                      .setOptimizedFor(GraphOptimizedFor.UPDATES)
                      ...
                      .build();
opg4j> var graph = session.readGraphWithProperties("<path_to_json_config_file>")
graph ==> PgxGraph[name=bank_graph_analytics_fb,N=999,E=4993,created=1664310157103]
opg4j> graph.publishWithSnapshots()
PgxGraph graph = session.readGraphWithProperties("<path_to_json_config_file>");
graph.publishWithSnapshots();
>>> graph = session.read_graph_with_properties("<path_to_json_config_file>")
>>> graph.publish_with_snapshots()
You can now synchronize the published graph using a graph configuration object which is built from a JSON file.
14.4 Optimizing Graphs for Read Versus Updates in the Graph Server (PGX)
The graph server (PGX) can store a graph optimized either for reads or for updates. This is only relevant when the updates are made directly to a graph instance in the graph server.
Graph Optimized for Reads
Graphs optimized for reads provide the best performance for graph analytics and PGQL queries. In this case there could be higher latencies to update the graph (adding or removing vertices and edges, or updating the property values of existing vertices or edges through the GraphChangeSet API). There could also be higher memory consumption. When using graphs optimized for reads, each updated graph or graph snapshot consumes memory proportional to the size of the graph in terms of vertices and edges.
                     
The optimized_for configuration property can be set to
                    reads when loading the graph into the graph server (PGX) to
                create a graph instance that is optimized for reads.
                     
Graph Optimized for Updates
Graphs optimized for updates use a representation enabling low-latency update of graphs. With this representation, the graph server can reach millisecond-scale latencies when updating graphs with millions of vertices and edges (this is indicative and will vary depending on the hardware configuration).
To achieve faster update operations, the graph server avoids, as much as possible, fully duplicating the previous graph (snapshot) to create a new graph (snapshot). This also reduces memory consumption in typical scenarios: new snapshots (or new graphs) only consume additional memory proportional to the memory required for the changes applied.
With this representation, graph queries and analytics may run with lower performance.
The optimized_for configuration property can be set to updates when loading the graph into the graph server (PGX) to create a graph instance that is optimized for updates.
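For example, a minimal sketch using the configuration builder shown earlier in this chapter (the elided builder calls depend on your graph; GraphOptimizedFor.READS works analogously for read-optimized graphs):

import oracle.pgx.config.*;

GraphConfig config = GraphConfigBuilder.forPartitioned()
    ...
    .setOptimizedFor(GraphOptimizedFor.UPDATES)  // request the update-optimized representation
    .build();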
                     
Parent topic: Developing Applications with Graph Analytics
14.5 Executing Built-in Algorithms
The graph server (PGX) contains a set of built-in algorithms that are available as Java APIs.
The following table provides an overview of the available algorithms, grouped by category. Note that these algorithms can be invoked through the Analyst Class.
Note:
See the supported Built-In Algorithms on GitHub for more details.

Table 14-10 Overview of Built-In Algorithms
| Category | Algorithms | 
|---|---|
| Classic graph algorithms | Prim's Algorithm | 
| Community detection | Conductance Minimization (Soman and Narang Algorithm), Infomap, Label Propagation, Louvain | 
| Connected components | Strongly Connected Components, Weakly Connected Components (WCC) | 
| Link prediction | WTF (Whom To Follow) Algorithm | 
| Matrix factorization | Matrix Factorization | 
| Other | Graph Traversal Algorithms | 
| Path finding | All Vertices and Edges on Filtered Path, Bellman-Ford Algorithms, Bidirectional Dijkstra Algorithms, Compute Distance Index, Compute High-Degree Vertices, Dijkstra Algorithms, Enumerate Simple Paths, Fast Path Finding, Fattest Path, Filtered Fast Path Finding, Hop Distance Algorithms | 
| Ranking and walking | Closeness Centrality Algorithms, Degree Centrality Algorithms, Eigenvector Centrality, Hyperlink-Induced Topic Search (HITS), PageRank Algorithms, Random Walk with Restart, Stochastic Approach for Link-Structure Analysis (SALSA) Algorithms, Vertex Betweenness Centrality Algorithms | 
| Structure evaluation | Adamic-Adar index, Bipartite Check, Conductance, Cycle Detection Algorithms, Degree Distribution Algorithms, Eccentricity Algorithms, K-Core, Local Clustering Coefficient (LCC), Modularity, Partition Conductance, Reachability Algorithms, Topological Ordering Algorithms, Triangle Counting Algorithms | 
The following topics describe the use of the graph server (PGX) using the Triangle Counting and PageRank analytics as examples.
- About Built-In Algorithms in the Graph Server (PGX)
- Running the Triangle Counting Algorithm
- Running the PageRank Algorithm
Parent topic: Developing Applications with Graph Analytics
14.5.1 About Built-In Algorithms in the Graph Server (PGX)
The graph server (PGX) contains a set of built-in algorithms that are available as
         Java APIs. The details of the APIs are documented in the Javadoc that is included in the
         product documentation library. Specifically, see the BuiltinAlgorithms
         interface Method Summary for a list of the supported in-memory analyst methods.
                     
For example, this is the PageRank procedure signature:
/**
   * Classic pagerank algorithm. Time complexity: O(E * K) with E = number of edges, K is a given constant (max
   * iterations)
   *
   * @param graph
   *          graph
   * @param e
   *          maximum error for terminating the iteration
   * @param d
   *          damping factor
   * @param max
   *          maximum number of iterations
   * @return Vertex Property holding the result as a double
   */
  public <ID extends Comparable<ID>> VertexProperty<ID, Double> pagerank(PgxGraph graph, double e, double d, int max);
Parent topic: Executing Built-in Algorithms
14.5.2 Running the Triangle Counting Algorithm
For triangle counting, the sortByDegree boolean parameter of countTriangles() allows you to control whether the graph should first be sorted by degree (true) or not (false). If true, more memory will be used, but the algorithm will run faster; however, if your graph is very large, you might want to turn this optimization off to avoid running out of memory.
                     
opg4j> analyst.countTriangles(graph, true)
==> 1

In Java:

import oracle.pgx.api.*;

Analyst analyst = session.createAnalyst();
long triangles = analyst.countTriangles(graph, true);
The algorithm finds one triangle in the sample graph.
Tip:
When using the graph shell, you can increase the amount of log output during execution by changing the logging level. See information about the :loglevel command with :h :loglevel.

Parent topic: Executing Built-in Algorithms
14.5.3 Running the PageRank Algorithm
PageRank computes a rank value between 0 and 1 for each vertex (node) in the graph and stores the values in a double property. The algorithm therefore creates a vertex property of type double for the output.
                     
In the graph server (PGX), there are two types of vertex and edge properties:
- 
                           Persistent Properties: Properties that are loaded with the graph from a data source are fixed, in-memory copies of the data on disk, and are therefore persistent. Persistent properties are read-only, immutable and shared between sessions. 
- 
Transient Properties: Values can only be written to transient properties, which are private to a session. You can create transient properties by calling createVertexProperty and createEdgeProperty on PgxGraph objects, or by copying existing properties using clone() on Property objects. Transient properties hold the results of computation by algorithms. For example, the PageRank algorithm computes a rank value between 0 and 1 for each vertex in the graph and stores these values in a transient property named pg_rank. Transient properties are destroyed when the Analyst object is destroyed.
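For illustration, here is a minimal sketch of working with transient properties, assuming a loaded PgxGraph named graph with integer vertex IDs (the property names are arbitrary, and the clone(String) overload is assumed to be available on Property objects):

import oracle.pgx.api.*;

// Create a session-private transient vertex property of type double
VertexProperty<Integer, Double> score = graph.createVertexProperty(PropertyType.DOUBLE, "score");

// Copy an existing property into a new transient property
VertexProperty<Integer, Double> scoreCopy = score.clone("score_copy");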
This example obtains the top three vertices with the highest PageRank values. It uses a transient vertex property of type double to hold the computed PageRank values. The PageRank algorithm uses the following default values for the input parameters: error (tolerance = 0.001), damping factor = 0.85, and maximum number of iterations = 100.
                     
opg4j> rank = analyst.pagerank(graph, 0.001, 0.85, 100);
==> ...
opg4j> rank.getTopKValues(3)
==> 128=0.1402019732468347
==> 333=0.12002296283541904
==> 99=0.09708583862990475
import java.util.Map.Entry;
import oracle.pgx.api.*;
 
Analyst analyst = session.createAnalyst();
VertexProperty<Integer, Double> rank = analyst.pagerank(graph, 0.001, 0.85, 100);
for (Entry<Integer, Double> entry : rank.getTopKValues(3)) {
 System.out.println(entry.getKey() + "=" + entry.getValue());
}

Parent topic: Executing Built-in Algorithms
14.6 Using Custom PGX Graph Algorithms
A custom PGX graph algorithm allows you to write a graph algorithm in Java syntax and have it automatically compiled to an efficient parallel implementation.
- Writing a Custom PGX Algorithm
- Compiling and Running a Custom PGX Algorithm
- Example Custom PGX Algorithm: PageRank
Parent topic: Developing Applications with Graph Analytics
14.6.1 Writing a Custom PGX Algorithm
A PGX algorithm is a regular .java file with a single class definition that is
            annotated with @GraphAlgorithm. For example:
                     
import oracle.pgx.algorithm.annotations.GraphAlgorithm;
@GraphAlgorithm
public class MyAlgorithm {
    ...
}
A PGX algorithm class must contain exactly one public method, which is used as the entry point. The class may contain any number of private methods.
For example:
import oracle.pgx.algorithm.PgxGraph;
import oracle.pgx.algorithm.VertexProperty;
import oracle.pgx.algorithm.annotations.GraphAlgorithm;
import oracle.pgx.algorithm.annotations.Out;
@GraphAlgorithm
public class MyAlgorithm {
    public int myAlgorithm(PgxGraph g, @Out VertexProperty<Integer> distance) {
        System.out.println("My first PGX Algorithm program!");
        return 42;
    }
}
As with normal Java methods, a PGX algorithm method only supports primitive data
            types as return values (an integer in this example). More interesting is the
                @Out annotation, which marks the vertex property
                distance as output parameter. The caller passes output parameters
            by reference. This way, the caller has a reference to the modified property after the
            algorithm terminates.
                     
Parent topic: Using Custom PGX Graph Algorithms
14.6.1.1 Collections
To create a collection you call the .create() function. For example, a VertexProperty<Integer> is created as follows:
VertexProperty<Integer> distance = VertexProperty.create();

To get the value of a property at a certain vertex v:

distance.get(v);

Similarly, to set the property of a certain vertex v to a value e:

distance.set(v, e);

You can even create properties of collections:

VertexProperty<VertexSequence> path = VertexProperty.create();

However, PGX Algorithm assignments are always by value (as opposed to by reference). To make this explicit, you must call .clone() when assigning a collection:

VertexSequence sequence = path.get(v).clone();

Another consequence of assignment being by value is that you can check for equality using the == operator instead of the Java method .equals(). For example:

PgxVertex v1 = G.getRandomVertex();
PgxVertex v2 = G.getRandomVertex();
System.out.println(v1 == v2);
Parent topic: Writing a Custom PGX Algorithm
14.6.1.2 Iteration
The most common operations in PGX algorithms are iterations (such as looping over all vertices, and looping over a vertex's neighbors) and graph traversal (such as breadth-first or depth-first traversal). All collections expose forEach and forSequential methods by which you can iterate over the collection in parallel and in sequence, respectively.
                        
For example:
- To iterate over a graph's vertices in parallel:
G.getVertices().forEach(v -> { ... });
- To iterate over a graph's vertices in sequence:
G.getVertices().forSequential(v -> { ... });
- To traverse a graph's vertices from r in breadth-first order:
import oracle.pgx.algorithm.Traversal;

Traversal.inBFS(G, r).forward(n -> { ... });
Inside the forward (or backward) lambda you can access the current level of the BFS (or DFS) traversal by calling currentLevel().
Parent topic: Writing a Custom PGX Algorithm
14.6.1.3 Reductions
Within these parallel blocks it is common to atomically update, or reduce to, a variable defined outside the lambda. These atomic reductions are available as methods on Scalar<T>: reduceAdd, reduceMul, reduceAnd, and so on. For example, to count the number of vertices in a graph:
                        
public int countVertices() {
    Scalar<Integer> count = Scalar.create(0);
    G.getVertices().forEach(n -> {
        count.reduceAdd(1);
    });
    return count.get();
}
Sometimes you want to update multiple values atomically. For example, you might want to find the smallest property value as well as the vertex whose property value attains this smallest value. Due to the parallel execution, two separate reduction statements might get you in an inconsistent state.
To solve this problem the Reductions class provides argMin and argMax functions. The first argument to argMin is the current value and the second argument is the potential new minimum. Additionally, you can chain andUpdate calls on the ArgMinMax object to indicate other variables and the values that they should be updated to (atomically). For example:
                        
VertexProperty<Integer> rank = VertexProperty.create();
int minRank = Integer.MAX_VALUE;
PgxVertex minVertex = PgxVertex.NONE;
G.getVertices().forEach(n ->
    argMin(minRank, rank.get(n)).andUpdate(minVertex, n)
);
Parent topic: Writing a Custom PGX Algorithm
14.6.2 Compiling and Running a Custom PGX Algorithm
Note:
Compiling a custom PGX Algorithm using the PGX Algorithm API is not supported on Oracle JDK 17.
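To compile and run a custom algorithm, you submit its source file to a PGX session, which compiles it into an efficient parallel implementation. A minimal sketch, assuming a session and a loaded graph are in scope (the file path and output property are placeholders; see the PgxSession.compileProgram Javadoc for the exact contract):

import oracle.pgx.api.*;

// Compile the algorithm source on the server
CompiledProgram myAlgorithm = session.compileProgram("/path/to/MyAlgorithm.java");

// Create the output property and run the algorithm against the graph
VertexProperty<Integer, Integer> distance = graph.createVertexProperty(PropertyType.INTEGER);
myAlgorithm.run(graph, distance);

Parent topic: Using Custom PGX Graph Algorithms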
14.6.3 Example Custom PGX Algorithm: PageRank
The following is an implementation of PageRank as a custom PGX algorithm:
                     
import oracle.pgx.algorithm.PgxGraph;
import oracle.pgx.algorithm.Scalar;
import oracle.pgx.algorithm.VertexProperty;
import oracle.pgx.algorithm.annotations.GraphAlgorithm;
import oracle.pgx.algorithm.annotations.Out;
@GraphAlgorithm
public class Pagerank {
  public void pagerank(PgxGraph G, double tol, double damp, int max_iter, boolean norm, @Out VertexProperty<Double> rank) {
    Scalar<Double> diff = Scalar.create();
    int cnt = 0;
    double N = G.getNumVertices();
    rank.setAll(1 / N);
    do {
      diff.set(0.0);
      Scalar<Double> dangling_factor = Scalar.create(0d);
      if (norm) {
        dangling_factor.set(damp / N * G.getVertices().filter(v -> v.getOutDegree() == 0).sum(rank::get));
      }
      G.getVertices().forEach(t -> {
        double in_sum = t.getInNeighbors().sum(w -> rank.get(w) / w.getOutDegree());
        double val = (1 - damp) / N + damp * in_sum + dangling_factor.get();
        diff.reduceAdd(Math.abs(val - rank.get(t)));
        rank.setDeferred(t, val);
      });
      cnt++;
    } while (diff.get() > tol && cnt < max_iter);
  }
}
Parent topic: Using Custom PGX Graph Algorithms
14.7 Creating Subgraphs
You can create subgraphs based on a graph that has been loaded into memory. You can use filter expressions or create bipartite subgraphs based on a vertex (node) collection that specifies the left set of the bipartite graph.
Note:
Starting from Graph Server and Client Release 22.3, creating subgraphs using filter expressions is deprecated. It is recommended that you load a subgraph from property graph views. See Loading a Subgraph from Property Graph Views for more information.

For the various methods to load a graph into the graph server (PGX), see Reading Graphs from Oracle Database into the Graph Server (PGX).
- About Filter Expressions
- Using a Simple Filter to Create a Subgraph
- Using a Complex Filter to Create a Subgraph
- Using a Vertex Set to Create a Bipartite Subgraph
Parent topic: Developing Applications with Graph Analytics
14.7.1 About Filter Expressions
Filter expressions are expressions that are evaluated for each vertex or edge. The expression can define predicates that a vertex or an edge must fulfil in order to be contained in the result, in this case a subgraph.
Consider an example graph that consists of four vertices (nodes) and four edges. For an edge to match the filter expression src.prop == 10, the prop property of its source vertex must equal 10. Two edges match that filter expression, as shown in the following figure.
                     
The following figure shows the graph that results when the filter is applied.
Figure 14-2 Graph Created by the Simple Filter

Description of "Figure 14-2 Graph Created by the Simple Filter"
The edge filter src.prop == 10 filters out the edges associated with vertex 333 and the vertex itself.
                     
Parent topic: Creating Subgraphs
14.7.2 Using a Simple Filter to Create a Subgraph
The following examples create the subgraph described in About Filter Expressions.
In the JShell:

var subgraph = graph.filter(new VertexFilter("vertex.prop == 10"))

In Java:

import oracle.pgx.api.*;
import oracle.pgx.api.filter.*;

PgxGraph graph = session.readGraphWithProperties(...);
PgxGraph subgraph = graph.filter(new VertexFilter("vertex.prop == 10"));
Parent topic: Creating Subgraphs
14.7.3 Using a Complex Filter to Create a Subgraph
This example uses a slightly more complex filter. It uses the outDegree function, which calculates the number of outgoing edges for an identifier (source src or destination dst). The following filter expression matches all edges with a  cost property value greater than 50 and a destination vertex (node) with an outDegree greater than 1.
                     
dst.outDegree() > 1 && edge.cost > 50
One edge in the sample graph matches this filter expression, as shown in the following figure.
Figure 14-3 Edges Matching the outDegree Filter

Description of "Figure 14-3 Edges Matching the outDegree Filter"
The following figure shows the graph that results when the filter is applied. The filter excludes the edges associated with the vertices 99 and 1908, and so excludes those vertices also.
Figure 14-4 Graph Created by the outDegree Filter

Description of "Figure 14-4 Graph Created by the outDegree Filter"
Parent topic: Creating Subgraphs
14.7.4 Using a Vertex Set to Create a Bipartite Subgraph
You can create a bipartite subgraph by specifying a set of vertices (nodes), which are used as the left side. A bipartite subgraph has edges only between the left set of vertices and the right set of vertices. There are no edges within those sets, such as between two nodes on the left side. In the graph server (PGX), vertices that are isolated because all incoming and outgoing edges were deleted are not part of the bipartite subgraph.
The following figure shows a bipartite subgraph. No properties are shown.
The following examples create a bipartite subgraph from a simple graph consisting of
      four vertices and four edges. The vertex ID values for the four vertices are
        99, 128, 1908 and 333
      respectively. See Figure 14-1 in About Filter Expressions for more information on the vertex and edge property values including the edge
      direction between the vertices.
                     
You must first create a vertex collection and fill it with the vertices for the left side. In the example shown, vertices with vertex ID values 333 and 99 are added to the left side of the vertex collection.
Using the Shell to Create a Bipartite Subgraph
opg4j> s = graph.createVertexSet()
==> ...
opg4j> s.addAll([graph.getVertex(333), graph.getVertex(99)])
==> ...
opg4j> s.size()
==> 2
opg4j> bGraph = graph.bipartiteSubGraphFromLeftSet(s)
==> PGX Bipartite Graph named sample-sub-graph-4
Using Java to Create a Bipartite Subgraph
import oracle.pgx.api.*;

VertexSet<Integer> s = graph.createVertexSet();
s.addAll(graph.getVertex(333), graph.getVertex(99));
BipartiteGraph bGraph = graph.bipartiteSubGraphFromLeftSet(s);
When you create a subgraph, the graph server (PGX) automatically creates a Boolean vertex (node) property that indicates whether the vertex is on the left side. You can specify a unique name for the property.
The resulting bipartite subgraph looks like this:
Vertex with ID 1908 is excluded from the bipartite subgraph. The only edge that connected that vertex extended from 128 to 1908. The edge was removed, because it violated the bipartite properties of the subgraph. Vertex 1908 had no other edges, and so was removed as well. Moreover, the edge from the vertex with ID 128 to the vertex with ID 99 is not present in the bipartite subgraph, because edges are only allowed to go from left to right (and not from right to left).
Parent topic: Creating Subgraphs
14.8 Using Automatic Delta Refresh to Handle Database Changes
You can automatically refresh (auto-refresh) graphs periodically to keep the in-memory graph synchronized with changes to the property graph stored in the property graph tables in Oracle Database (VT$ and GE$ tables).
Note that the auto-refresh feature is not supported when loading a graph into PGX in memory directly from relational tables.
- Configuring the Graph Server (PGX) for Auto-Refresh
- Configuring Basic Auto-Refresh
- Reading the Graph Using the Graph Server (PGX) or a Java Application
- Checking Out a Specific Snapshot of the Graph
- Advanced Auto-Refresh Configuration
- Special Considerations When Using Auto-Refresh
Parent topic: Developing Applications with Graph Analytics
14.8.1 Configuring the Graph Server (PGX) for Auto-Refresh
Because auto-refresh can create many snapshots and therefore lead to high memory usage, the option to enable auto-refresh for graphs is available only to administrators by default.
To allow all users to auto-refresh graphs, you must include the following line into
      the graph server (PGX) configuration file (located in
        /etc/oracle/graph/pgx.conf):
                     
{
  "allow_user_auto_refresh": true
}
Parent topic: Using Automatic Delta Refresh to Handle Database Changes
14.8.2 Configuring Basic Auto-Refresh
Auto-refresh is configured in the loading section of the graph configuration. The example in this topic sets up auto-refresh to check for updates every minute, and to create a new snapshot when the data source has changed.
The following block (JSON format) enables the auto-refresh feature in the configuration file of the sample graph:
{
  "format": "pg",
  "jdbc_url": "jdbc:oracle:thin:@mydatabaseserver:1521/dbName",
  "username": "scott",
  "password": "<password>",
  "name": "my_graph",
  "vertex_props": [{
    "name": "prop",
    "type": "integer"
  }],
  "edge_props": [{
    "name": "cost",
    "type": "double"
  }],
  "separator": " ",
  "loading": {
    "auto_refresh": true,
    "update_interval_sec": 60
  }
}
Notice the additional loading section containing the auto-refresh settings. You can also use the Java APIs to construct the same graph configuration programmatically:
                     
GraphConfig config = GraphConfigBuilder.forPropertyGraphRdbms()
  .setJdbcUrl("jdbc:oracle:thin:@mydatabaseserver:1521/dbName")
  .setUsername("scott")
  .setPassword("<password>")
  .setName("my_graph")
  .addVertexProperty("prop", PropertyType.INTEGER)
  .addEdgeProperty("cost", PropertyType.DOUBLE)
  .setAutoRefresh(true)
  .setUpdateIntervalSec(60)
  .build();
Parent topic: Using Automatic Delta Refresh to Handle Database Changes
14.8.3 Reading the Graph Using the Graph Server (PGX) or a Java Application
After creating the graph configuration, you can load the graph into the graph server (PGX) using the regular APIs.
opg4j> G = session.readGraphWithProperties("graphs/my-config.pg.json")After the graph is loaded, a background task is started automatically, and it periodically checks the data source for updates.
Parent topic: Using Automatic Delta Refresh to Handle Database Changes
14.8.4 Checking Out a Specific Snapshot of the Graph
The database is queried every minute for updates. If the graph has changed in the database after the time interval passed, the graph is reloaded and a new snapshot is created in-memory automatically.
You can "check out" (move a pointer to a different version of) the available in-memory snapshots of the graph using the getAvailableSnapshots() method of PgxSession. Example output is as follows:
                     
opg4j> session.getAvailableSnapshots(G)
==> GraphMetaData [getNumVertices()=4, getNumEdges()=4, memoryMb=0, dataSourceVersion=1453315103000, creationRequestTimestamp=1453315122669 (2016-01-20 10:38:42.669), creationTimestamp=1453315122685 (2016-01-20 10:38:42.685), vertexIdType=integer, edgeIdType=long]
==> GraphMetaData [getNumVertices()=5, getNumEdges()=5, memoryMb=3, dataSourceVersion=1452083654000, creationRequestTimestamp=1453314938744 (2016-01-20 10:35:38.744), creationTimestamp=1453314938833 (2016-01-20 10:35:38.833), vertexIdType=integer, edgeIdType=long]
The preceding example output contains two entries, one for the originally loaded graph with 4 vertices and 4 edges, and one for the graph created by auto-refresh with 5 vertices and 5 edges.
To check out a specific snapshot of the graph, use the setSnapshot() method of PgxSession and give it the creationTimestamp of the snapshot you want to load.
For example, if G is pointing to the newer graph with 5 vertices and 5 edges, but you want to analyze the older version of the graph, you need to set the snapshot to 1453315122685. In the graph shell:
opg4j> G.getNumVertices()
==> 5
opg4j> G.getNumEdges()
==> 5
opg4j> session.setSnapshot( G, 1453315122685 )
==> null
opg4j> G.getNumVertices()
==> 4
opg4j> G.getNumEdges()
==> 4
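The same steps can be sketched in a Java application, assuming session and G from the preceding examples (note the long literal for the timestamp; getCreationTimestamp() is assumed here to mirror the creationTimestamp field shown in the output above):

import java.util.Deque;
import oracle.pgx.api.*;

// List the available in-memory snapshots of G
Deque<GraphMetaData> snapshots = session.getAvailableSnapshots(G);
snapshots.forEach(m -> System.out.println(m.getCreationTimestamp()));

// Check out the older snapshot
session.setSnapshot(G, 1453315122685L);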
You can also load a specific snapshot of a graph directly using the readGraphAsOf() method of PgxSession. This is a shortcut for loading a graph with readGraphWithProperties() followed by a setSnapshot(). For example:
                     
opg4j> G = session.readGraphAsOf( config, 1453315122685 )

If you do not know or care about what snapshots are currently available in-memory, you can also specify a time span of how “old” a snapshot is acceptable by specifying a maximum allowed age. For example, to specify a maximum snapshot age of 60 minutes, you can use the following:

opg4j> G = session.readGraphWithProperties( config, 60l, TimeUnit.MINUTES )

If there are one or more snapshots in memory younger (newer) than the specified maximum age, the youngest (newest) of those snapshots will be returned. If all the available snapshots are older than the specified maximum age, or if there is no snapshot available at all, then a new snapshot will be created automatically.
Parent topic: Using Automatic Delta Refresh to Handle Database Changes
14.8.5 Advanced Auto-Refresh Configuration
You can specify advanced options for auto-refresh configuration.
Internally, the graph server (PGX) fetches the changes since the last check from the database and creates a new snapshot by applying the delta (changes) to the previous snapshot. There are two timers: one for fetching and caching the deltas from the database, the other for actually applying the deltas and creating a new snapshot.
Additionally, you can specify a threshold for the number of cached deltas. If the number of cached changes grows above this threshold, a new snapshot is created automatically. The number of cached changes is a simple sum of the number of vertex changes plus the number of edge changes.
The deltas are fetched periodically and cached on the graph server (PGX) for two reasons:
- 
                           To speed up the actual snapshot creation process 
- 
                           To account for the case that the database can "forget" changes after a while 
You can specify both a threshold and an update timer, which means that both conditions will be checked before a new snapshot is created. At least one of these parameters (threshold or update timer) must be specified to prevent the delta cache from becoming too large. The interval at which the source is queried for changes must not be omitted.
The following parameters show a configuration where the data source is queried for new deltas every 5 minutes. New snapshots are created every 20 minutes or if the cached deltas reach a size of 1000 changes.
{
  "format": "pg",
  "jdbc_url": "jdbc:oracle:thin:@mydatabaseserver:1521/dbName",
  "username": "scott",
  "password": "<your_password>",
  "name": "my_graph",
  "loading": {
    "auto_refresh": true,
    "fetch_interval_sec": 300,
    "update_interval_sec": 1200,
    "update_threshold": 1000,
    "create_edge_id_index": true,
    "create_edge_id_mapping": true
  }
}
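The same configuration can presumably be built programmatically. The following sketch assumes builder setters mirroring these JSON keys (setFetchIntervalSec, setUpdateThreshold, and so on), by analogy with the setUpdateIntervalSec call shown earlier in this chapter:

import oracle.pgx.config.*;

GraphConfig config = GraphConfigBuilder.forPropertyGraphRdbms()
  .setJdbcUrl("jdbc:oracle:thin:@mydatabaseserver:1521/dbName")
  .setUsername("scott")
  .setPassword("<your_password>")
  .setName("my_graph")
  .setAutoRefresh(true)
  .setFetchIntervalSec(300)    // fetch deltas every 5 minutes
  .setUpdateIntervalSec(1200)  // create a snapshot at most every 20 minutes
  .setUpdateThreshold(1000)    // or once 1000 changes are cached
  .build();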
Parent topic: Using Automatic Delta Refresh to Handle Database Changes
14.8.6 Special Considerations When Using Auto-Refresh
This section explains a few special considerations when you enable auto-refresh for graphs in the graph server (PGX):
- If you call graph.destroy(), auto-refresh does not immediately stop. It only stops once the graph is actually freed from the server memory. This happens when all the following conditions are true:
  - No other session is referencing that graph.
  - PGX consumes more than release_memory_threshold memory. release_memory_threshold is a pgx.conf option that defaults to 85% of available system memory.
  - The PGX "garbage collector" has been run. memory_cleanup_interval is a pgx.conf option which defaults to once every 10 minutes.
- If you configure the graph to be loaded with auto-refresh, you cannot omit the jdbc_url, username, and keystore parameters from the graph configuration file, since auto-refreshed graphs are not "user bound". You cannot obtain the connection settings from the user who initiated the load.
Parent topic: Using Automatic Delta Refresh to Handle Database Changes
14.9 User-Defined Functions (UDFs) in PGX
User-defined functions (UDFs) allow users of PGX to add custom logic to their PGQL queries or custom graph algorithms, to complement built-in functions with custom requirements.
Caution:
UDFs enable running arbitrary code in the PGX server, possibly accessing sensitive data. Additionally, any PGX session can invoke any of the UDFs that are enabled on the PGX server. The application administrator who enables UDFs is responsible for checking the following:
- All the UDF code can be trusted.
- The UDFs are stored in a secure location that cannot be tampered with.
Furthermore, PGX assumes UDFs to be stateless and side-effect free.
PGX supports two types of UDFs:
- Java UDFs
- JavaScript UDFs
How to Use Java UDFs
The following simple example shows how to register a Java UDF at the PGX server and invoke it.
- Create a class with a public static method. For example:
package my.udfs;

public class MyUdfs {
  public static String concat(String a, String b) {
    return a + b;
  }
}
- Compile the class and compress into a JAR file. For example:
mkdir ./target
javac -d ./target *.java
cd target
jar cvf MyUdfs.jar *
- Copy the JAR file into /opt/oracle/graph/pgx/server/lib.
- Create a UDF JSON configuration file. For example, assume that /path/to/my/udfs/dir/my_udfs.json contains the following:
{
  "user_defined_functions": [
    {
      "namespace": "my",
      "language": "java",
      "implementation_reference": "my.udfs.MyUdfs",
      "function_name": "concat",
      "return_type": "string",
      "arguments": [
        {
          "name": "a",
          "type": "string"
        },
        {
          "name": "b",
          "type": "string"
        }
      ]
    }
  ]
}
- Point to the directory containing the UDF configuration file in /etc/oracle/graph/pgx.conf. For example:
"udf_config_directory": "/path/to/my/udfs/dir/"
- Restart the PGX server. For example:
sudo systemctl restart pgx
- Try to invoke the UDF from within a PGQL query. For example:
graph.queryPgql("SELECT my.concat(my.concat(n.firstName, ' '), n.lastName) FROM MATCH (n:Person)")
- Try to invoke the UDF from within a PGX algorithm. For example:
Note: For each UDF you want to use, you need to create an abstract method with the same schema that gets annotated with the @Udf annotation.
import oracle.pgx.algorithm.annotations.Udf;
....

@GraphAlgorithm
public class MyAlgorithm {
  public void bomAlgorithm(PgxGraph g, VertexProperty<String> firstName, VertexProperty<String> lastName, @Out VertexProperty<String> fullName) {
    ...
    fullName.set(v, concat(firstName.get(v), lastName.get(v)));
    ...
  }

  @Udf(namespace = "my")
  abstract String concat(String a, String b);
}
JavaScript UDFs
The requirements for a JavaScript UDF are as follows:
- The JavaScript source must contain all dependencies.
- The source must contain at least one valid export.
- The language parameter must be set to javascript in the UDF configuration file.
For example, consider a JavaScript source file format.js as shown:
                     
//format.js
const fun = function(name, country) {
  if (country == null) return name;
  else return name + " (" + country + ")";
}
module.exports = {stringFormat: fun};

In order to load the UDF from format.js, the UDF configuration file will appear as follows:
                     
{
  "namespace": "my",
  "function_name": "format",
  "language": "javascript",
  "source_location": "format.js",
  "source_function_name": "stringFormat",
  "return_type": "string",
  "arguments": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "country",
      "type": "string"
    }
  ]
}

Note:
In this case, since the name of the UDF and the implementing method differ, you need to set the name of the UDF in the source_function_name field. Also, you can provide the path of the source code file in the source_location field.
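Once registered, the JavaScript UDF is invoked like any Java UDF. For example, a hypothetical PGQL query over a graph whose vertices carry name and country properties:

graph.queryPgql("SELECT my.format(n.name, n.country) FROM MATCH (n)")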
UDF Configuration File Information

A UDF configuration file is a JSON file containing an array of user_defined_functions. (An example of such a file is in the step to "Create a UDF JSON configuration file" in the preceding How to Use Java UDFs subsection.)
                     
Each user-defined function supports the fields shown in the following table.
Table 14-11 Fields for Each UDF
| Field | Data Type | Description | Required? | 
|---|---|---|---|
| function_name | string | Name of the function used as identifier in PGX | Required | 
| language | enum[java, javascript] | Source language for the function (java or javascript) | Required | 
| return_type | enum[boolean, integer, long, float, double, string] | Return type of the function | Required | 
| arguments | array of object | Array of arguments. For each argument: type, argument name, required? | [] | 
| implementation_reference | string | Reference to the function name on the classpath | null | 
| namespace | string | Namespace of the function in PGX | null | 
| source_code | string | Source code of the function provided inline | null | 
| source_function_name | string | Name of the function in the source language | null | 
| source_location | string | Local file path to the function's source code | null | 
All configured UDFs must be unique with regard to the combination of the following fields:
- namespace
- function_name
- arguments
Parent topic: Developing Applications with Graph Analytics
14.10 Using the Graph Server (PGX) as a Library
When you utilize PGX as a library in your application, the graph server (PGX) instance runs in the same JVM as the Java application and all requests are translated into direct function calls instead of remote procedure invocations.
In this case, you must install the graph server (PGX) using RPM on the same machine as the client applications. The shell executables provided by the graph server installation help you to launch the Java or the Python shell in embedded server mode. See Installing Oracle Graph Server for more information.
You can now start the Java shell without any parameters as shown:
cd /opt/oracle/graph
./bin/opg4j

By default, the graph server (PGX) reads its configuration from /etc/oracle/graph/pgx.conf. You can change the location of the configuration file by passing the --pgx_conf command-line option followed by the path to the configuration file:

# start local PGX instance with custom config
./bin/opg4j --pgx_conf <path_to_pgx.conf>

You can also start the Python shell without any parameters as shown:

cd /opt/oracle/graph/
./bin/opg4py

When using Java, you can obtain a reference to the local PGX instance as shown:
import oracle.pg.rdbms.*;
import oracle.pgx.api.*;
...
ServerInstance instance = GraphServer.getEmbeddedInstance();
In a Python application, you can obtain a reference to the local PGX instance as shown:
import os
os.environ["PGX_CLASSPATH"] = "/opt/oracle/graph/lib/*"
import opg4py.graph_server as graph_server
...
instance = graph_server.get_embedded_instance()
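From the embedded instance you can then create a session and work with graphs as usual. A minimal Java sketch (the session name is arbitrary):

import oracle.pgx.api.*;

ServerInstance instance = GraphServer.getEmbeddedInstance();

// The first createSession() call also starts the PGX engine if it is not yet running
PgxSession session = instance.createSession("embedded-session");
Analyst analyst = session.createAnalyst();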
Starting the PGX Engine
PGX provides a convenience mechanism to start the PGX Engine when using the graph server (PGX) as a library: the graph server (PGX) is automatically initialized and started the first time ServerInstance.createSession() is called, provided the engine is not already running at that time.
                     
For this implicit initialization, PGX will configure itself with the PGX configuration file at the default locations. If the PGX configuration file is not found, PGX will configure itself using default parameter values as shown in Configuration Parameters for the Graph Server (PGX) Engine.
Stopping the PGX Engine
When using the graph server (PGX) as a library, the
                    shutdownEngine() method will be called automatically via a JVM
                shutdown hook on exit. Specifically, the shutdown hook is invoked once all the non-daemon threads of the
                application exit. 
                     
It is recommended that you do not terminate your PGX application
                forcibly with kill -9, as it will not clear the
                    temp directory. See tmp_dir in Configuration Parameters for the Graph Server (PGX) Engine. 
                     
Parent topic: Developing Applications with Graph Analytics