28.1 Loading Graph Data from Files

You can load graph data from multiple file formats (such as .csv, .xml, and .pgb).

Loading a Graph Using session.readGraphFiles() API

When loading a graph from CSV files, ensure that each file has a header that provides the names of the columns to be loaded as properties. The header must follow a specific format. See Comma-Separated Values (CSV) for an example of a CSV file with header details.

The graph server (PGX) uses this header to determine the names and types of the properties to load, the column to use as the vertex (or edge) ID, the columns that hold the source and destination vertex IDs for edges, and the column to load as the vertex or edge label.
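Because the loader relies on the header line, it can help to confirm that each file actually has one before handing it to the graph server. The following is a minimal sketch that uses only the Java standard library and is not part of the PGX API. The class name CsvHeaderCheck and the method headerFields are hypothetical, and the code only splits the first line on commas without interpreting the header syntax, which is described in the CSV reference mentioned above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of PGX: lists the raw header fields of a CSV file.
public class CsvHeaderCheck {

    // Returns the comma-separated fields of the first line, trimmed.
    // Throws IllegalStateException if the file has no first line or it is blank.
    // Assumption: a plain split on commas is enough (no quoted commas in the header).
    static List<String> headerFields(Path csvFile) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(csvFile, StandardCharsets.UTF_8)) {
            String first = reader.readLine();
            if (first == null || first.trim().isEmpty()) {
                throw new IllegalStateException("No header line in " + csvFile);
            }
            return Arrays.stream(first.split(",", -1))
                    .map(String::trim)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the vertex and edge file paths as command-line arguments.
        for (String arg : args) {
            System.out.println(arg + " -> " + headerFields(Paths.get(arg)));
        }
    }
}
```

Running the class with the vertex and edge file paths as arguments prints the fields found on each first line, so a missing or empty header shows up before the load is attempted.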

You can then use the PgxSession.readGraphFiles() API to load the graph. This method takes the following three arguments:

  • Path to the vertex file
  • Path to the edge file
  • Name of the graph

JShell

opg4j> var loadedGraph = session.readGraphFiles("<path/vertices.csv>", "<path/edges.csv>", "<graph_name>")

Java

import oracle.pgx.api.Pgx;
import oracle.pgx.api.PgxGraph;
import oracle.pgx.api.PgxSession;

PgxSession session = Pgx.createSession("NewSession");
PgxGraph loadedGraph = session.readGraphFiles("<path/vertices.csv>", "<path/edges.csv>", "<graph_name>");

Python

import pypgx

session = pypgx.get_session(session_name="<session_name>")
loaded_graph = session.read_graph_files("<path/vertices.csv>", "<path/edges.csv>", "<graph_name>")

Loading a Graph Using a GraphConfigBuilder Object

You can create a graph configuration object for your graph data files and then load the graph into the graph server (PGX) as shown:

JShell

opg4j> var vertexProviderConfig = new FileEntityProviderConfigBuilder(ProviderFormat.CSV).
...>   addUri("sample.vertices.csv").
...>   setName("vertices").
...>   setKeyType(IdType.INTEGER).
...>   setKeyColumn(1).
...>   build()
vertexProviderConfig ==> {"key_type":"integer","props":[],"key_column":1,"name":"vertices","format":"csv","uris":["sample.vertices.csv"],
"loading":{"create_key_mapping":true},"error_handling":{},"attributes":{}}

opg4j> var edgeProviderConfig = new FileEntityProviderConfigBuilder(ProviderFormat.CSV).
...>   addUri("sample.edges.csv").
...>   setName("edges").
...>   setSourceVertexProvider("vertices").
...>   setDestinationVertexProvider("vertices").
...>   setSourceColumn(1).
...>   setDestinationColumn(2).
...>   build()
edgeProviderConfig ==> {"destination_vertex_provider":"vertices","props":[],"source_column":1,"format":"csv","loading":{"create_key_mapping":false},
"error_handling":{},"attributes":{},"destination_column":2,"key_type":"long","source_vertex_provider":"vertices","name":"edges","uris":["sample.edges.csv"]}

opg4j> var config = GraphConfigBuilder.forPartitioned().
...>   addVertexProvider(vertexProviderConfig).
...>   addEdgeProvider(edgeProviderConfig).
...>   setName("simple graph").
...>   setVertexIdType(IdType.INTEGER).
...>   build()
config ==> {"edge_providers":[{"destination_vertex_provider":"vertices","props":[],"storing":{},"source_column":1,"time_with_timezone_format":
["h[h]:m[m][:s[s]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"format":"csv",
"loading":{"create_key_mapping":false},"has_keys":true,"error_handling":{},"attributes":{},"timestamp_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]",
"yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"destination_column":2,"key_type":"long",
"time_format":["h[h]:m[m][:s[s][.SSS]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"source_vertex_provider":"vertices","local_date_format":["yyyy-M[M]-d[d]","M[M]/d[d]/yyyy","d[d]-MMM-yyyy","d[d]-M[M]-yyyy","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]",
"yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"timestamp_with_timezone_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]",
"yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"header":false,"name":"edges","uris":["sample.edges.csv"]}],"vertex_providers":[{"props":[],"storing":{},"key_column":1,
"time_with_timezone_format":["h[h]:m[m][:s[s]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"format":"csv","loading":{"create_key_mapping":true},"has_keys":true,"error_handling":{},"attributes":{},"timestamp_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]",
"yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"key_type":"integer","time_format":["h[h]:m[m][:s[s][.SSS]] a[ XXX]",
"[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"local_date_format":["yyyy-M[M]-d[d]","M[M]/d[d]/yyyy",
"d[d]-MMM-yyyy","d[d]-M[M]-yyyy","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"timestamp_with_timezone_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]",
"yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],"header":false,"name":"vertices","uris":["sample.vertices.csv"]}],
"error_handling":{},"name":"simple graph","vertex_id_type":"integer","attributes":{},"loading":{}}

opg4j> var graph = session.readGraphWithProperties(config)

Java

import oracle.pgx.api.*;
import oracle.pgx.common.types.IdType;
import oracle.pgx.config.*;

// Vertex provider: reads sample.vertices.csv and uses column 1 as the integer vertex key.
EntityProviderConfig vertexProviderConfig = new FileEntityProviderConfigBuilder(ProviderFormat.CSV)
    .addUri("sample.vertices.csv")
    .setName("vertices")
    .setKeyType(IdType.INTEGER)
    .setKeyColumn(1)
    .build();

// Edge provider: column 1 holds the source vertex key and column 2 the destination vertex key.
EntityProviderConfig edgeProviderConfig = new FileEntityProviderConfigBuilder(ProviderFormat.CSV)
    .addUri("sample.edges.csv")
    .setName("edges")
    .setSourceVertexProvider("vertices")
    .setDestinationVertexProvider("vertices")
    .setSourceColumn(1)
    .setDestinationColumn(2)
    .build();

// Combine both providers into a partitioned graph configuration.
PartitionedGraphConfig config = GraphConfigBuilder.forPartitioned()
    .addVertexProvider(vertexProviderConfig)
    .addEdgeProvider(edgeProviderConfig)
    .setName("simple graph")
    .setVertexIdType(IdType.INTEGER)
    .build();

// Uses an existing PgxSession named session.
PgxGraph graph = session.readGraphWithProperties(config);
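
The configuration above refers to sample.vertices.csv and sample.edges.csv by column position, with no header ("header":false in the generated configuration). To make that layout concrete, the following sketch writes two small files and checks that every edge endpoint matches a vertex key. It uses only the Java standard library. The class SampleCsvFiles and the data values are hypothetical, and the layout is an assumption read from the column settings: column numbers count from 1, and the files need no columns beyond the vertex key (column 1) and the edge source and destination keys (columns 1 and 2). Check the CSV format reference before relying on it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of PGX: creates tiny input files for the example configuration.
public class SampleCsvFiles {

    // Made-up data: one integer vertex key per line (column 1).
    static final List<String> VERTEX_LINES = Arrays.asList("1", "2", "3");

    // Made-up data: source key in column 1, destination key in column 2.
    static final List<String> EDGE_LINES = Arrays.asList("1,2", "2,3", "3,1");

    // Writes both files into dir, using the file names from the configuration.
    static void write(Path dir) throws IOException {
        Files.write(dir.resolve("sample.vertices.csv"), VERTEX_LINES);
        Files.write(dir.resolve("sample.edges.csv"), EDGE_LINES);
    }

    // Returns true if every edge line has two columns and both keys appear in the vertex file.
    static boolean endpointsExist(Path dir) throws IOException {
        Set<String> keys = new HashSet<>();
        for (String line : Files.readAllLines(dir.resolve("sample.vertices.csv"))) {
            keys.add(line.split(",", -1)[0].trim());
        }
        for (String line : Files.readAllLines(dir.resolve("sample.edges.csv"))) {
            String[] cols = line.split(",", -1);
            if (cols.length < 2
                    || !keys.contains(cols[0].trim())
                    || !keys.contains(cols[1].trim())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Target directory defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        write(dir);
        System.out.println("endpoints exist: " + endpointsExist(dir));
    }
}
```

Checking endpoints before loading is a design choice for this sketch only: how the graph server reacts to an edge whose source or destination key is missing from the vertex file depends on its error-handling settings, which this section does not cover.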