4.3 About Vertex and Edge IDs
Generating vertex and edge IDs when loading from database tables into PGX
By default, PGX enforces the existence of a unique identifier for each vertex and edge in a graph, so that they can be retrieved by using PgxGraph.getVertex(ID id) and PgxGraph.getEdge(ID id), or by PGQL queries using the built-in id() method.
The ID generation strategy is selected through the configuration fields vertex_id_strategy and edge_id_strategy, which accept the following values:
keys_as_ids: This is the default strategy to generate vertex IDs.
partitioned_ids: This is the recommended strategy for partitioned graphs.
unstable_generated_ids: This results in system-generated vertex or edge IDs.
no_ids: This strategy disables vertex or edge IDs and therefore prevents you from calling APIs that use vertex or edge IDs.
Using keys to generate IDs
The default strategy to generate the vertex IDs is to use the keys provided during loading of the graph (keys_as_ids). In that case, each vertex should have a vertex key that is unique across all providers.
For edges, by default no keys are required in the edge data, and edge IDs are automatically generated by PGX (unstable_generated_ids). This automatic ID generation can also be applied to vertex IDs. Note that the generation of vertex or edge IDs is not guaranteed to be deterministic. If required, it is also possible to load edge keys as IDs.
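The uniqueness requirement of keys_as_ids can be checked on the source data before loading. The following is a minimal sketch in plain Java, not part of the PGX API: the class GlobalKeyCheck, the method duplicateKeys, and the second provider Persons are hypothetical names introduced only for illustration. It reports every key that occurs more than once across all providers.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative helper (hypothetical, not a PGX class): finds vertex keys
// that would violate the "unique across all providers" requirement.
public class GlobalKeyCheck {

    /** Returns the keys that occur more than once across all providers. */
    public static Set<Long> duplicateKeys(Map<String, List<Long>> keysByProvider) {
        Set<Long> seen = new HashSet<>();
        Set<Long> duplicates = new TreeSet<>();
        for (List<Long> keys : keysByProvider.values()) {
            for (Long key : keys) {
                // add() returns false when the key was already seen
                if (!seen.add(key)) {
                    duplicates.add(key);
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> keys = new LinkedHashMap<>();
        keys.put("Accounts", List.of(1L, 2L, 3L));
        keys.put("Persons", List.of(3L, 4L)); // hypothetical second provider
        System.out.println(duplicateKeys(keys)); // prints [3]
    }
}
```

An empty result means the keys are globally unique, as keys_as_ids expects.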
The partitioned_ids strategy requires keys to be unique only within a vertex or edge provider (data source). The keys do not have to be globally unique. Globally unique IDs are derived from a combination of the provider name and the key inside the provider, as <provider_name>(<unique_key_within_provider>). For example, Account(1).
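To make the <provider_name>(<unique_key_within_provider>) notation concrete, the sketch below composes and splits such IDs as plain strings. It illustrates the notation only and makes no claim about how PGX represents partitioned IDs internally. The class PartitionedIdSketch, its methods, and the second provider Persons are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only (hypothetical class, not a PGX API): models the textual
// notation <provider_name>(<unique_key_within_provider>).
public class PartitionedIdSketch {

    private static final Pattern FORMAT = Pattern.compile("([^()]+)\\(([^()]+)\\)");

    /** Combines a provider name and a provider-local key into one ID string. */
    public static String compose(String providerName, Object key) {
        return providerName + "(" + key + ")";
    }

    /** Splits an ID string into { providerName, key }. */
    public static String[] split(String id) {
        Matcher m = FORMAT.matcher(id);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a partitioned ID: " + id);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // The same key 1 in two providers yields two distinct IDs.
        String a = compose("Accounts", 1);
        String p = compose("Persons", 1); // hypothetical second provider
        System.out.println(a + " vs " + p + ", equal: " + a.equals(p));
        System.out.println(split(a)[0] + " / " + split(a)[1]);
    }
}
```

Because the provider name is part of the ID, two providers can reuse the same key without a collision.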
The partitioned_ids strategy can be set through the configuration fields vertex_id_strategy and edge_id_strategy. For example:
{
  "name": "bank_graph_analytics",
  "optimized_for": "updates",
  "vertex_id_strategy": "partitioned_ids",
  "edge_id_strategy": "partitioned_ids",
  "vertex_providers": [
    {
      "name": "Accounts",
      "format": "rdbms",
      "database_table_name": "BANK_ACCOUNTS",
      "key_column": "ID",
      "key_type": "integer",
      "props": [
        {
          "name": "ID",
          "type": "integer"
        },
        {
          "name": "NAME",
          "type": "string"
        }
      ],
      "loading": {
        "create_key_mapping": true
      }
    }
  ],
  "edge_providers": [
    {
      "name": "Transfers",
      "format": "rdbms",
      "database_table_name": "BANK_TXNS",
      "key_column": "ID",
      "source_column": "FROM_ACCT_ID",
      "destination_column": "TO_ACCT_ID",
      "source_vertex_provider": "Accounts",
      "destination_vertex_provider": "Accounts",
      "props": [
        {
          "name": "ID",
          "type": "integer"
        },
        {
          "name": "AMOUNT",
          "type": "double"
        }
      ],
      "loading": {
        "create_key_mapping": true
      }
    }
  ]
}
Note: All available key types are supported in combination with partitioned IDs.
After the graph is loaded, PGX maintains information about which property of a provider corresponds to the key of the provider. In the preceding example, the vertex property ID corresponds to the vertex key, and the edge property ID corresponds to the edge key. Each provider can have at most one such "key property", and the property can have any name.
The vertex key property ID cannot be updated.
Using an auto-incrementer to generate partitioned IDs
It is recommended to always set create_key_mapping to true to benefit from performance optimizations. However, if there are no single-column keys for edges, create_key_mapping can be set to false. Similarly, create_key_mapping can also be set to false for vertex providers. In that case, IDs are generated via an auto-incrementer, for example Accounts(1), Accounts(2), Accounts(3).
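The auto-incrementer behavior can be pictured with the following sketch. It assumes one counter per provider, starting at 1, which matches the Accounts(1), Accounts(2), Accounts(3) example but is not confirmed as the way PGX implements it. The class AutoIncrementSketch and the method nextId are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only (hypothetical class, not a PGX API): generates
// provider-scoped IDs from an assumed per-provider counter.
public class AutoIncrementSketch {

    // One counter per provider name (an assumption for this sketch).
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Returns the next ID for the given provider, e.g. Accounts(1). */
    public String nextId(String providerName) {
        long next = counters
                .computeIfAbsent(providerName, name -> new AtomicLong())
                .incrementAndGet(); // first call for a provider yields 1
        return providerName + "(" + next + ")";
    }

    public static void main(String[] args) {
        AutoIncrementSketch ids = new AutoIncrementSketch();
        System.out.println(ids.nextId("Accounts"));  // prints Accounts(1)
        System.out.println(ids.nextId("Accounts"));  // prints Accounts(2)
        System.out.println(ids.nextId("Transfers")); // prints Transfers(1)
    }
}
```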
See PGQL Queries with Partitioned IDs for more information on executing PGQL queries with partitioned IDs.