Pypgx API¶
Public API for the PGX client.
-
class
pypgx.api.
AllPaths
(graph, java_all_paths)¶ Bases:
object
The paths from one source vertex to all other vertices.
-
destroy
()¶ Destroy this object.
-
get_path
(destination)¶ Get the path.
- Parameters
destination – The destination node
- Returns
The path result to the destination node
-
-
class
pypgx.api.
Analyst
(session, java_analyst)¶ Bases:
object
The Analyst gives access to all built-in algorithms of PGX.
Unlike some of the other classes inside this package, the Analyst is not stateless. It creates session-bound transient data to hold the result of algorithms and keeps track of them.
-
adamic_adar_counting
(graph, aa='adamic_adar')¶ Adamic-Adar counting compares the number of neighbors shared between vertices. This measure can be used with communities.
- Parameters
graph – Input graph
aa – Vertex property holding the computed Adamic-Adar scores. Can be a string or a VertexProperty object.
- Returns
Vertex property holding the computed scores
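As an illustration of the measure itself, the sketch below computes the Adamic-Adar index of one vertex pair in plain Python. It does not call PGX; the function name, the adjacency-dict input and the sample graph are assumptions made for the example.

```python
import math

def adamic_adar(adj, u, v):
    """Adamic-Adar index of the pair (u, v) in an undirected graph.

    adj maps each vertex to the set of its neighbors (hypothetical input format).
    """
    score = 0.0
    for w in adj[u] & adj[v]:          # every shared neighbor contributes
        degree = len(adj[w])
        if degree > 1:                 # guard: log(1) == 0 would divide by zero
            score += 1.0 / math.log(degree)
    return score

adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}
score = adamic_adar(adj, "a", "b")
```

Shared neighbors with few connections weigh more than highly connected ones, which is why `c` (degree 2) contributes more to the score than `d` (degree 3).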
-
approximate_vertex_betweenness_centrality
(graph, seeds, bc='approx_betweenness')¶ - Parameters
graph – Input graph
seeds – The (unique) chosen nodes to be used to compute the approximated betweenness centrality coefficients
bc – Vertex property holding the betweenness centrality value for each vertex
- Returns
Vertex property holding the computed scores
-
center
(graph, center=None)¶ Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.
The center consists of the set of vertices whose eccentricity equals the radius of the graph.
- Parameters
graph – Input graph
center – (Out argument) vertex set holding the vertices from the periphery or center of the graph
-
closeness_centrality
(graph, cc='closeness')¶ - Parameters
graph – Input graph
cc – Vertex property holding the closeness centrality
-
communities_conductance_minimization
(graph, max_iter=100, label='conductance_minimization')¶ Soman and Narang can find communities in a graph taking weighted edges into account.
- Parameters
graph – Input graph
max_iter – Maximum number of iterations that will be performed
label – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.
- Returns
Partition holding the node collections corresponding to the communities found by the algorithm
-
communities_infomap
(graph, rank, weight, tau=0.15, tol=0.0001, max_iter=100, label='infomap')¶ Infomap can find high quality communities in a graph.
- Parameters
graph – Input graph
rank – Vertex property holding the normalized PageRank value for each vertex
weight – Edge property holding the weight of each edge in the graph
tau – Damping factor
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter – Maximum iteration number
label – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.
- Returns
Partition holding the node collections corresponding to the communities found by the algorithm
-
communities_label_propagation
(graph, max_iter=100, label='label_propagation')¶ Label propagation can find communities in a graph relatively fast.
- Parameters
graph – Input graph
max_iter – Maximum number of iterations that will be performed
label – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.
- Returns
Partition holding the node collections corresponding to the communities found by the algorithm
-
conductance
(graph, partition, partition_idx)¶ Conductance assesses the quality of a partition in a graph.
- Parameters
graph – Input graph
partition – Partition of the graph with the corresponding node collections
partition_idx – Number of the component to be used for computing its conductance
-
count_triangles
(graph, sort_vertices_by_degree)¶ Triangle counting gives an overview of the amount of connections between vertices in neighborhoods.
- Parameters
graph – Input graph
sort_vertices_by_degree – Boolean flag for sorting the nodes by their degree as preprocessing step
- Returns
The total number of triangles found
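The following plain-Python sketch shows one common way to count triangles and why ordering the vertices by degree helps: each triangle is found exactly once, from its lowest-ranked vertex. It is not the PGX implementation; the function name and the adjacency-dict input are assumptions.

```python
def count_triangles(adj, sort_vertices_by_degree=True):
    """Count triangles in an undirected simple graph.

    adj maps each vertex to the set of its neighbors (hypothetical input format).
    """
    if sort_vertices_by_degree:
        # Low-degree vertices first keeps the forward sets of hubs small.
        order = sorted(adj, key=lambda v: (len(adj[v]), str(v)))
    else:
        order = list(adj)
    rank = {v: i for i, v in enumerate(order)}
    # Keep only the neighbors that come later in the ordering.
    forward = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}
    total = 0
    for v in adj:
        for w in forward[v]:
            # A shared forward neighbor closes a triangle v - w - x.
            total += len(forward[v] & forward[w])
    return total

# Complete graph on four vertices: every triple forms a triangle.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
triangles = count_triangles(k4)
```

The ordering changes only the amount of work, not the result, so both settings of the flag return the same count.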
-
deepwalk_builder
(min_word_frequency=1, batch_size=128, num_epochs=2, layer_size=200, learning_rate=0.025, min_learning_rate=0.0001, window_size=5, walk_length=5, walks_per_vertex=4, sample_rate=1e-05, negative_sample=10, validation_fraction=0.05, seed=None)¶ Build a Deepwalk model and return it.
- Parameters
min_word_frequency – Minimum word frequency to consider before pruning
batch_size – Batch size for training the model
num_epochs – Number of epochs to train the model
layer_size – Number of dimensions for the output vectors
learning_rate – Initial learning rate
min_learning_rate – Minimum learning rate
window_size – Window size to consider while training the model
walk_length – Length of the walks
walks_per_vertex – Number of walks to consider per vertex
sample_rate – Sample rate
negative_sample – Number of negative samples
validation_fraction – Fraction of training data on which to compute final loss
seed – Random seed for training the model
- Returns
Built deepwalk model
-
degree_centrality
(graph, dc='degree')¶ Measure the centrality of the vertices based on their degree, letting you see how a vertex influences its neighborhood.
- Parameters
graph – Input graph
dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.
- Returns
Vertex property holding the computed scores
-
diameter
(graph, eccentricity='eccentricity')¶ Diameter/radius gives an overview of the distances in a graph.
- Parameters
graph – Input graph
eccentricity – (Out argument) vertex property holding the eccentricity value for each vertex
- Returns
Pair holding the diameter of the graph and a node property with eccentricity value for each node
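To make the relationship between eccentricity, diameter, radius, center and periphery concrete, here is a minimal plain-Python sketch for a connected, undirected, unweighted graph. It does not use PGX; all names and the sample graph are assumptions for the example.

```python
from collections import deque

def eccentricities(adj):
    """Eccentricity of every vertex: the hop distance to its farthest vertex."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # plain breadth-first search
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[source] = max(dist.values())
    return result

# Path graph a - b - c - d - e
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
ecc = eccentricities(adj)
diameter = max(ecc.values())              # largest eccentricity
radius = min(ecc.values())                # smallest eccentricity
center = {v for v, e in ecc.items() if e == radius}
periphery = {v for v, e in ecc.items() if e == diameter}
```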
-
eigenvector_centrality
(graph, tol=0.001, max_iter=100, l2_norm=False, in_edges=False, ec='eigenvector')¶ Eigenvector centrality computes the centrality of each vertex from the centrality of its neighbors, making it possible to find well-connected vertices.
- Parameters
graph – Input graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter – Maximum iteration number
l2_norm – Boolean flag to determine whether the algorithm will use the l2 norm (Euclidean norm) or the l1 norm (absolute value) to normalize the centrality scores
in_edges – Boolean flag to determine whether the algorithm will use the incoming or the outgoing edges in the graph for the computations
ec – Vertex property holding the resulting score for each vertex
- Returns
Vertex property holding the computed scores
-
fattest_path
(graph, root, capacity, distance='fattest_path_distance', parent='fattest_path_parent', parent_edge='fattest_path_parent_edge')¶ Fattest path is a fast algorithm for finding a shortest path adding constraints for flowing related matters.
- Parameters
graph – Input graph
root – Source vertex from which the fattest paths are computed
capacity – Edge property holding the capacity of each edge in the graph
distance – Vertex property holding the capacity value of the fattest path up to the current vertex
parent – Vertex property holding the parent vertex of each vertex in the fattest path
parent_edge – Vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
- Returns
AllPaths object holding the information of the possible fattest paths from the source node
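A minimal plain-Python sketch of the idea follows, assuming that the "fattest" path is the one maximizing the smallest edge capacity along the way (the widest-path problem). It is a Dijkstra variant with a max-heap, not the PGX implementation; the function name and input format are hypothetical.

```python
import heapq

def fattest_paths(adj, root):
    """adj[v] is a list of (neighbor, capacity) pairs of a directed graph."""
    capacity = {v: 0.0 for v in adj}      # best bottleneck found so far
    parent = {v: None for v in adj}
    capacity[root] = float("inf")
    heap = [(-capacity[root], root)]      # negated values turn heapq into a max-heap
    done = set()
    while heap:
        neg_cap, v = heapq.heappop(heap)
        if v in done:
            continue                      # stale heap entry
        done.add(v)
        for w, edge_cap in adj[v]:
            bottleneck = min(-neg_cap, edge_cap)
            if bottleneck > capacity[w]:
                capacity[w] = bottleneck
                parent[w] = v
                heapq.heappush(heap, (-bottleneck, w))
    return capacity, parent

adj = {
    "s": [("a", 4), ("b", 10)],
    "a": [("t", 8)],
    "b": [("a", 6), ("t", 3)],
    "t": [],
}
capacity, parent = fattest_paths(adj, "s")
```

In this sample the fattest path to `t` is `s -> b -> a -> t` with bottleneck 6, even though the direct route through `a` has fewer edges.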
-
filtered_bfs
(graph, root, filter, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')¶ Breadth-first search with an option to filter edges during the traversal of the graph.
- Parameters
graph – Input graph
root – The source vertex from the graph for the path.
filter – GraphFilter object used to filter out undesired nodes
navigator – Navigator expression to be evaluated on the vertices during the graph traversal
init_with_inf – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.
max_depth – Maximum depth limit for the BFS traversal
distance – Vertex property holding the hop distance for each reachable vertex in the graph
parent – Vertex property holding the parent vertex of each reachable vertex in the path
- Returns
Distance and parent vertex properties
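The sketch below mirrors the roles of `init_with_inf`, `max_depth`, `distance` and `parent` in plain Python. It does not use PGX: the `keep` callback is a hypothetical stand-in for the filter and navigator arguments, and the adjacency-dict input is an assumption.

```python
from collections import deque
import math

def filtered_bfs(adj, root, keep, init_with_inf=True, max_depth=2**31 - 1):
    """Breadth-first search that only visits vertices accepted by keep(vertex)."""
    unreached = math.inf if init_with_inf else -1
    distance = {v: unreached for v in adj}
    parent = {v: None for v in adj}
    distance[root] = 0
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if distance[v] >= max_depth:
            continue                      # do not expand beyond the depth limit
        for w in adj[v]:
            if distance[w] == unreached and keep(w):
                distance[w] = distance[v] + 1
                parent[w] = v
                queue.append(w)
    return distance, parent

adj = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
distance, parent = filtered_bfs(adj, 1, keep=lambda v: v != 2)
```

Vertex 2 is rejected by the filter, so it keeps the initial value and vertex 4 is reached through vertex 3 instead.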
-
filtered_dfs
(graph, root, filter, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')¶ Depth-first search with an option to filter edges during the traversal of the graph.
- Parameters
graph – Input graph
root – The source vertex from the graph for the path
filter – GraphFilter object used to filter out undesired nodes
navigator – Navigator expression to be evaluated on the vertices during the graph traversal
init_with_inf – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.
max_depth – Maximum search depth
distance – Vertex property holding the hop distance for each reachable vertex in the graph
parent – Vertex property holding the parent vertex of each reachable vertex in the path
- Returns
Distance and parent vertex properties
-
find_cycle
(graph, src=None, vertex_seq=None, edge_seq=None)¶ Find cycle looks for any loop in the graph.
- Parameters
graph – Input graph
src – Source vertex for the search
vertex_seq – (Out argument) vertex sequence holding the vertices in the cycle
edge_seq – (Out argument) edge sequence holding the edges in the cycle
- Returns
PgxPath representing the cycle as a path, if one exists.
-
get_deepwalk_model_loader
()¶ Return a ModelLoader that can be used for loading a DeepWalkModel.
- Returns
ModelLoader
-
get_pg2vec_model_loader
()¶ Return a ModelLoader that can be used for loading a Pg2vecModel.
- Returns
ModelLoader
-
hits
(graph, max_iter=100, auth='authorities', hubs='hubs')¶ Hyperlink-Induced Topic Search (HITS) assigns ranking scores to the vertices, aimed to assess the quality of information and references in linked structures.
- Parameters
graph – Input graph
max_iter – Number of iterations that will be performed
auth – Vertex property holding the authority score for each vertex
hubs – Vertex property holding the hub score for each vertex
- Returns
Vertex property holding the computed scores
-
in_degree_centrality
(graph, dc='in_degree')¶ Measure the in-degree centrality of the vertices based on their degree, letting you see how a vertex influences its neighborhood.
- Parameters
graph – Input graph
dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.
- Returns
Vertex property holding the computed scores
-
in_degree_distribution
(graph, dist_map=None)¶ - Parameters
graph – Input graph
dist_map – (Out argument) map holding a histogram of the vertex degrees in the graph
- Returns
Map holding a histogram of the vertex degrees in the graph
-
k_core
(graph, min_core=0, max_core=2147483647, kcore='kcore')¶ k-core decomposes a graph into layers revealing subgraphs with particular properties.
- Parameters
graph – Input graph
min_core – Minimum k-core value
max_core – Maximum k-core value
kcore – Vertex property holding the result value
- Returns
Pair holding the maximum core found and a node property with the largest k-core value for each node.
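For intuition, the following plain-Python sketch computes core numbers by repeatedly peeling off the vertex of lowest remaining degree. It is a simple quadratic-time illustration for an undirected simple graph, not the PGX implementation; the names and sample graph are assumptions.

```python
def core_numbers(adj):
    """Largest k such that the vertex belongs to the k-core, for every vertex."""
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda x: degree[x])   # peel the weakest vertex
        k = max(k, degree[v])                         # core values never decrease
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core

# Triangle a-b-c with a pendant vertex d attached to a.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
core = core_numbers(adj)
max_core = max(core.values())
```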
-
load_deepwalk_model
(path, key)¶ Load an encrypted DeepWalk model.
- Parameters
path – Path to model
key – The decryption key, or None if no encryption was used
- Returns
Loaded model
-
load_pg2vec_model
(path, key)¶ Load an encrypted pg2vec model.
- Parameters
path – Path to model
key – The decryption key, or None if no encryption was used
- Returns
Loaded model
-
local_clustering_coefficient
(graph, lcc='lcc')¶ LCC gives information about potential clustering options in a graph.
- Parameters
graph – Input graph
lcc – Vertex property holding the lcc value for each vertex
- Returns
Vertex property holding the lcc value for each vertex
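As a worked illustration of the value stored per vertex, this plain-Python sketch computes the local clustering coefficient for an undirected simple graph: the fraction of neighbor pairs that are themselves connected. It does not call PGX; names and input format are assumptions.

```python
from itertools import combinations

def local_clustering_coefficient(adj):
    """adj maps each vertex to the set of its neighbors."""
    lcc = {}
    for v, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            lcc[v] = 0.0                  # no neighbor pair exists
            continue
        # Count neighbor pairs that are directly linked to each other.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        lcc[v] = 2.0 * links / (k * (k - 1))
    return lcc

# Triangle a-b-c with a pendant vertex d attached to a.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
lcc = local_clustering_coefficient(adj)
```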
-
matrix_factorization_gradient_descent
(bipartite_graph, weight, learning_rate=0.001, change_per_step=1.0, lbd=0.15, max_iter=100, vector_length=10, features='features')¶ - Parameters
bipartite_graph – Input graph
weight – Edge property holding the weight of each edge in the graph. If the weight values are not between 1 and 5, the result will become inaccurate.
learning_rate – Learning rate for the optimization process
change_per_step – Parameter used to modulate the learning rate during the optimization process
lbd – Penalization parameter to avoid over-fitting during optimization process
max_iter – Maximum number of iterations that will be performed
vector_length – Size of the feature vectors to be generated for the factorization
features – Vertex property holding the generated feature vectors for each vertex. This function accepts names and VertexProperty objects.
- Returns
Matrix factorization model holding the feature vectors found by the algorithm
-
out_degree_centrality
(graph, dc='out_degree')¶ Measure the out-degree centrality of the vertices based on their degree, letting you see how a vertex influences its neighborhood.
- Parameters
graph – Input graph
dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.
- Returns
Vertex property holding the computed scores
-
out_degree_distribution
(graph, dist_map=None)¶ - Parameters
graph – Input graph
dist_map – (Out argument) map holding a histogram of the vertex degrees in the graph
- Returns
Map holding a histogram of the vertex degrees in the graph
-
pagerank
(graph, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='pagerank')¶ - Parameters
graph – Input graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores
rank – Vertex property holding the PageRank value for each vertex
- Returns
Vertex property holding the PageRank value for each vertex
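The roles of `tol`, `damping` and `max_iter` can be illustrated with a small power-iteration sketch in plain Python. It is not the PGX implementation: the function name and input format are hypothetical, and spreading the rank of dangling vertices evenly is an assumption of this sketch.

```python
def pagerank(out_edges, tol=0.001, damping=0.85, max_iter=100):
    """out_edges maps each vertex to the list of vertices it points to."""
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(max_iter):
        # Rank held by vertices without outgoing edges is spread evenly (assumption).
        dangling = sum(rank[v] for v in out_edges if not out_edges[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in out_edges}
        for u, targets in out_edges.items():
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        # Stop once the summed change over all vertices drops below tol.
        error = sum(abs(new_rank[v] - rank[v]) for v in out_edges)
        rank = new_rank
        if error < tol:
            break
    return rank

cycle_rank = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
star_rank = pagerank({"a": ["c"], "b": ["c"], "c": []})
```

On the three-vertex cycle every vertex ends up with the same rank, while in the second graph `c` collects rank from both `a` and `b`.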
-
pagerank_approximate
(graph, tol=0.001, damping=0.85, max_iter=100, rank='approx_pagerank')¶ - Parameters
graph – Input graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
rank – Vertex property holding the PageRank value for each vertex
- Returns
Vertex property holding the PageRank value for each vertex
-
partition_conductance
(graph, partition)¶ Partition conductance assesses the quality of many partitions in a graph.
- Parameters
graph – Input graph
partition – Partition of the graph with the corresponding node collections
-
partition_modularity
(graph, partition)¶ Modularity summarizes information about the quality of components in a graph.
- Parameters
graph – Input graph
partition – Partition of the graph with the corresponding node collections
- Returns
Scalar (double) holding the modularity value of the given partition
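For reference, the sketch below evaluates the standard Newman modularity of a partition on an undirected, unweighted graph in plain Python. PGX may use a different formulation (for example for directed graphs); the function name, the edge-list input and the sample graph are assumptions.

```python
def modularity(edges, community):
    """edges is a list of (u, v) pairs; community maps each vertex to a label."""
    m = len(edges)
    degree = {}
    internal = {}                         # edges with both ends in the same community
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            internal[community[u]] = internal.get(community[u], 0) + 1
    total_degree = {}                     # summed degree per community
    for v, d in degree.items():
        total_degree[community[v]] = total_degree.get(community[v], 0) + d
    return sum(
        internal.get(c, 0) / m - (total / (2.0 * m)) ** 2
        for c, total in total_degree.items()
    )

# Two triangles joined by the single edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
split = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
q = modularity(edges, split)
```

Putting every vertex into one community yields a modularity of 0, so a positive value indicates more internal edges than a random assignment would give.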
-
periphery
(graph, periphery=None)¶ Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.
- Parameters
graph – Input graph
periphery – (Out argument) vertex set holding the vertices from the periphery or center of the graph
- Returns
Vertex set holding the vertices from the periphery or center of the graph
-
personalized_pagerank
(graph, v, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_pagerank')¶ Personalized PageRank for a vertex of interest.
Compares and spots out important vertices in a graph.
- Parameters
graph – Input graph
v – The chosen vertex from the graph for personalization
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores.
rank – Vertex property holding the PageRank value for each vertex
- Returns
Vertex property holding the computed scores
-
personalized_salsa
(bipartite_graph, v, tol=0.001, damping=0.85, max_iter=100, rank='personalized_salsa')¶ Personalized SALSA for a vertex of interest.
Assesses the quality of information and references in linked structures.
- Parameters
bipartite_graph – Bipartite graph
v – The chosen vertex from the graph for personalization
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor to modulate the degree of personalization of the scores by the algorithm
max_iter – Maximum number of iterations that will be performed
rank – (Out argument) vertex property holding the normalized authority/hub ranking score for each vertex
- Returns
Vertex property holding the computed scores
-
personalized_weighted_pagerank
(graph, v, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_weighted_pagerank')¶ - Parameters
graph – Input graph
v – The chosen vertex from the graph for personalization
weight – Edge property holding the weight of each edge in the graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores
rank – Vertex property holding the PageRank value for each vertex
-
pg2vec_builder
(graphlet_id_property_name, vertex_property_names, min_word_frequency=1, batch_size=128, num_epochs=5, layer_size=200, learning_rate=0.04, min_learning_rate=0.0001, window_size=4, walk_length=8, walks_per_vertex=5, graphlet_size_property_name='graphletSize-Pg2vec', use_graphlet_size=True, validation_fraction=0.05, seed=None)¶ Build a Pg2vec model and return it.
- Parameters
graphlet_id_property_name – Property name of the graphlet-id in the input graph
vertex_property_names – Property names to consider for pg2vec model training
min_word_frequency – Minimum word frequency to consider before pruning
batch_size – Batch size for training the model
num_epochs – Number of epochs to train the model
layer_size – Number of dimensions for the output vectors
learning_rate – Initial learning rate
min_learning_rate – Minimum learning rate
window_size – Window size to consider while training the model
walk_length – Length of the walks
walks_per_vertex – Number of walks to consider per vertex
graphlet_size_property_name – Property name for graphlet size
use_graphlet_size – Whether to use or not the graphlet size
validation_fraction – Fraction of training data on which to compute final loss
seed – Seed
- Returns
Built Pg2vec model
-
prim
(graph, weight, mst='mst')¶ Prim reveals tree structures with shortest paths in a graph.
- Parameters
graph – Input graph
weight – Edge property holding the weight of each edge in the graph
mst – Edge property holding the edges belonging to the minimum spanning tree of the graph
- Returns
Edge property holding the edges belonging to the minimum spanning tree of the graph (i.e. all the edges with in_mst=true)
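A compact plain-Python sketch of Prim's algorithm shows how the minimum spanning tree grows one cheapest crossing edge at a time. It works on a connected, undirected graph and does not use PGX; the function name and the adjacency format are hypothetical.

```python
import heapq

def prim(adj, start):
    """adj[v] is a list of (weight, neighbor) pairs of an undirected graph."""
    in_tree = {start}
    mst = []
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)     # cheapest edge leaving the tree so far
        if v in in_tree:
            continue                      # both ends already connected
        in_tree.add(v)
        mst.append((u, v, w))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst

adj = {
    "a": [(1, "b"), (4, "c")],
    "b": [(1, "a"), (2, "c"), (5, "d")],
    "c": [(4, "a"), (2, "b"), (3, "d")],
    "d": [(5, "b"), (3, "c")],
}
mst = prim(adj, "a")
total_weight = sum(w for _, _, w in mst)
```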
-
radius
(graph, eccentricity='eccentricity')¶ Radius gives an overview of the distances in a graph. It is computed as the minimum graph eccentricity.
- Parameters
graph – Input graph
eccentricity – (Out argument) vertex property holding the eccentricity value for each vertex
- Returns
Pair holding the radius of the graph and a node property with eccentricity value for each node
-
reachability
(graph, src, dst, max_hops, ignore_edge_direction)¶ Reachability is a fast way to check if two vertices are reachable from each other.
- Parameters
graph – Input graph
src – Source vertex for the search
dst – Destination vertex for the search
max_hops – Maximum hop distance between the source and destination vertices
ignore_edge_direction – Boolean flag for ignoring the direction of the edges during the search
- Returns
The number of hops between the vertices. It will return -1 if the vertices are not connected or are not reachable given the condition of the maximum hop distance allowed.
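The return convention described above (hop count, or -1 when the destination cannot be reached within `max_hops`) can be sketched in plain Python as a bounded breadth-first search. This is an illustration only, not PGX code; the function name and input format are assumptions.

```python
from collections import deque

def reachability(out_edges, src, dst, max_hops, ignore_edge_direction=False):
    """out_edges maps every vertex to the list of vertices it points to."""
    neighbors = {v: set(targets) for v, targets in out_edges.items()}
    if ignore_edge_direction:
        for v, targets in out_edges.items():
            for w in targets:
                neighbors[w].add(v)       # also walk edges backwards
    hops = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return hops[v]
        if hops[v] == max_hops:
            continue                      # hop budget used up on this branch
        for w in neighbors[v]:
            if w not in hops:
                hops[w] = hops[v] + 1
                queue.append(w)
    return -1

out_edges = {"a": ["b"], "b": ["c"], "c": [], "d": ["c"]}
within_limit = reachability(out_edges, "a", "c", max_hops=5)
```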
-
salsa
(bipartite_graph, tol=0.001, max_iter=100, rank='salsa')¶ Stochastic Approach for Link-Structure Analysis (SALSA) computes ranking scores.
It assesses the quality of information and references in linked structures.
- Parameters
bipartite_graph – Bipartite graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter – Maximum number of iterations that will be performed
rank – Vertex property holding the value for each vertex in the graph
- Returns
Vertex property holding the computed scores
-
scc_kosaraju
(graph, label='scc_kosaraju')¶ Kosaraju finds strongly connected components in a graph.
- Parameters
graph – Input graph
label – Vertex property holding the label of the component assigned to each vertex. Can be a string or a VertexProperty object.
- Returns
Partition holding the node collections corresponding to the components found by the algorithm
-
scc_tarjan
(graph, label='scc_tarjan')¶ Tarjan finds strongly connected components in a graph.
- Parameters
graph – Input graph
label – Vertex property holding the label of the component assigned to each vertex. Can be a string or a VertexProperty object.
- Returns
Partition holding the node collections corresponding to the components found by the algorithm
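For readers unfamiliar with the algorithm, here is a minimal recursive sketch of Tarjan's strongly connected components in plain Python. It is not the PGX implementation; the names are hypothetical, and the recursion makes it suitable only for small graphs.

```python
def tarjan_scc(out_edges):
    """out_edges maps every vertex to the list of vertices it points to."""
    index = {}                            # discovery order of each vertex
    lowlink = {}                          # smallest index reachable from the vertex
    stack, on_stack = [], set()
    components = []

    def visit(v):
        index[v] = lowlink[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in out_edges[v]:
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:        # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.remove(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in out_edges:
        if v not in index:
            visit(v)
    return components

out_edges = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
components = tarjan_scc(out_edges)
```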
-
shortest_path_bellman_ford
(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')¶ Bellman-Ford finds multiple shortest paths at the same time.
- Parameters
graph – Input graph
src – Source node
distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
weight – Edge property holding the weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
- Returns
AllPaths holding the information of the possible shortest paths from the source node
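The plain-Python sketch below illustrates how Bellman-Ford fills `distance` and `parent` for all vertices at once, and how a single path can then be read back from the parent entries. It does not use PGX; the function names and the edge-list input are assumptions.

```python
import math

def bellman_ford(vertices, edges, src):
    """edges is a list of (u, v, weight) triples of a directed graph."""
    distance = {v: math.inf for v in vertices}
    parent = {v: None for v in vertices}
    distance[src] = 0.0
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if distance[u] + w < distance[v]:     # relax the edge
                distance[v] = distance[u] + w
                parent[v] = u
                changed = True
        if not changed:
            break                                 # converged early
    for u, v, w in edges:
        if distance[u] + w < distance[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return distance, parent

def path_to(parent, src, dst):
    """Walk the parent entries back from dst; None if dst is unreachable."""
    path = [dst]
    while path[-1] != src:
        if parent[path[-1]] is None:
            return None
        path.append(parent[path[-1]])
    return path[::-1]

vertices = ["s", "a", "b", "t", "x"]
edges = [("s", "a", 4), ("s", "b", 1), ("b", "a", -2), ("a", "t", 3)]
distance, parent = bellman_ford(vertices, edges, "s")
```

Unlike Dijkstra, this approach tolerates negative edge weights as long as no negative cycle is reachable from the source.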
-
shortest_path_bellman_ford_reversed
(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')¶ Reversed Bellman-Ford finds multiple shortest paths at the same time.
- Parameters
graph – Input graph
src – Source node
distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
weight – Edge property holding the weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path.
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path.
- Returns
AllPaths holding the information of the possible shortest paths from the source node.
-
shortest_path_bidirectional_dijkstra
(graph, src, dst, weight, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge')¶ Bidirectional Dijkstra is a fast algorithm for finding a shortest path in a graph.
- Parameters
graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
- Returns
PgxPath holding the information of the shortest path, if it exists
-
shortest_path_dijkstra
(graph, src, dst, weight, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge')¶ - Parameters
graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
- Returns
PgxPath holding the information of the shortest path, if it exists
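As a point of reference, a minimal plain-Python Dijkstra for a single source and destination is sketched below, including the "if it exists" case where no path is found. It is not PGX code; the function name, return shape and adjacency format are assumptions.

```python
import heapq

def dijkstra(adj, src, dst):
    """adj[v] is a list of (neighbor, weight) pairs with positive weights."""
    distance = {src: 0.0}
    parent = {src: None}
    heap = [(0.0, src)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == dst:
            break                         # destination settled
        if d > distance[v]:
            continue                      # stale heap entry
        for w, weight in adj[v]:
            candidate = d + weight
            if candidate < distance.get(w, float("inf")):
                distance[w] = candidate
                parent[w] = v
                heapq.heappush(heap, (candidate, w))
    if dst not in distance:
        return None                       # no path exists
    path = [dst]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return distance[dst], path[::-1]

adj = {
    "s": [("a", 7), ("b", 2)],
    "a": [("t", 1)],
    "b": [("a", 3), ("t", 8)],
    "t": [],
    "x": [],
}
result = dijkstra(adj, "s", "t")
```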
-
shortest_path_filtered_bidirectional_dijkstra
(graph, src, dst, weight, filter_expression, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge')¶ - Parameters
graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
filter_expression – GraphFilter object for filtering
- Returns
PgxPath holding the information of the shortest path, if it exists
-
shortest_path_filtered_dijkstra
(graph, src, dst, weight, filter_expression, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge')¶ - Parameters
graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
filter_expression – GraphFilter object for filtering
- Returns
PgxPath holding the information of the shortest path, if it exists
-
shortest_path_hop_distance
(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')¶ Hop distance can give a relatively fast insight on the distances in a graph.
- Parameters
graph – Input graph
src – Source node
distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
- Returns
AllPaths holding the information of the possible shortest paths from the source node
-
shortest_path_hop_distance_reversed
(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')¶ Backwards hop distance can give a relatively fast insight on the distances in a graph.
- Parameters
graph – Input graph
src – Source node
distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
- Returns
AllPaths holding the information of the possible shortest paths from the source node
-
topological_schedule
(graph, vs, topo_sched='topo_sched')¶ Topological schedule gives an order of visit for the reachable vertices from the source.
- Parameters
graph – Input graph
vs – Set of vertices to be used as the starting points for the scheduling order
topo_sched – (Out argument) vertex property holding the scheduled order of each vertex
- Returns
Vertex property holding the scheduled order of each vertex.
-
topological_sort
(graph, topo_sort='topo_sort')¶ Topological sort gives an order of visit for vertices in directed acyclic graphs.
- Parameters
graph – Input graph
topo_sort – (Out argument) vertex property holding the topological order of each vertex
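A short plain-Python sketch of Kahn's algorithm shows what a topological order per vertex looks like and why the graph must be acyclic. It does not use PGX; the function name and input format are assumptions.

```python
from collections import deque

def topological_sort(out_edges):
    """Return a dict vertex -> position such that every edge goes forward."""
    in_degree = {v: 0 for v in out_edges}
    for targets in out_edges.values():
        for w in targets:
            in_degree[w] += 1
    # Vertices without incoming edges can be visited first.
    queue = deque(v for v in out_edges if in_degree[v] == 0)
    order = {}
    while queue:
        v = queue.popleft()
        order[v] = len(order)
        for w in out_edges[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                queue.append(w)
    if len(order) != len(out_edges):
        raise ValueError("graph has a cycle, so no topological order exists")
    return order

out_edges = {"shirt": ["tie", "belt"], "tie": ["jacket"], "belt": ["jacket"], "jacket": []}
order = topological_sort(out_edges)
```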
-
vertex_betweenness_centrality
(graph, bc='betweenness')¶ - Parameters
graph – Input graph
bc – Vertex property holding the betweenness centrality value for each vertex
- Returns
Vertex property holding the computed scores
-
wcc
(graph, label='wcc')¶ Identify weakly connected components.
This can be useful for clustering graph data.
- Parameters
graph – Input graph
label – Vertex property holding the value for each vertex in the graph. Can be a string or a VertexProperty object.
- Returns
Partition holding the node collections corresponding to the components found by the algorithm.
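Weakly connected components ignore edge direction, which the following plain-Python union-find sketch makes explicit. It is an illustration only, not the PGX implementation; the function name and input format are assumptions.

```python
def weakly_connected_components(out_edges):
    """out_edges maps every vertex to the list of vertices it points to."""
    parent = {v: v for v in out_edges}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving keeps trees shallow
            v = parent[v]
        return v

    for u, targets in out_edges.items():
        for w in targets:
            parent[find(u)] = find(w)       # direction of the edge is irrelevant

    components = {}
    for v in out_edges:
        components.setdefault(find(v), set()).add(v)
    return list(components.values())

out_edges = {"a": ["b"], "b": [], "c": ["b"], "d": ["e"], "e": [], "f": []}
components = weakly_connected_components(out_edges)
```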
-
weighted_closeness_centrality
(graph, weight, cc='weighted_closeness')¶ Measure the centrality of the vertices based on weighted distances, allowing to find well-connected vertices.
- Parameters
graph – Input graph
weight – Edge property holding the weight of each edge in the graph
cc – (Out argument) vertex property holding the closeness centrality value for each vertex
- Returns
Vertex property holding the computed scores
-
weighted_pagerank
(graph, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='weighted_pagerank')¶ - Parameters
graph – Input graph
weight – Edge property holding the weight of each edge in the graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
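max_iter – Maximum number of iterations that will be performed
norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores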
rank – Vertex property holding the PageRank value for each vertex
- Returns
Vertex property holding the computed PageRank value for each vertex
-
whom_to_follow
(graph, v, top_k=100, size_circle_of_trust=500, tol=0.001, damping=0.85, max_iter=100, salsa_tol=0.001, salsa_max_iter=100, hubs=None, auth=None)¶ Whom-to-follow (WTF) is a recommendation algorithm.
It returns two vertex sequences: one of similar users (hubs) and a second one with users to follow (auth).
- Parameters
graph – Input graph
v – The chosen vertex from the graph for personalization of the recommendations
top_k – The maximum number of recommendations that will be returned
size_circle_of_trust – The maximum size of the circle of trust
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor for the PageRank stage
max_iter – Maximum number of iterations that will be performed for the PageRank stage
salsa_tol – Maximum tolerated error value for the SALSA stage
salsa_max_iter – Maximum number of iterations that will be performed for the SALSA stage
hubs – (Out argument) vertex sequence holding the top rated hub vertices (similar users) for the recommendations
auth – (Out argument) vertex sequence holding the top rated authority vertices (users to follow) for the recommendations
- Returns
Vertex properties holding hubs and auth
-
-
class
pypgx.api.
BipartiteGraph
(session, java_graph)¶ Bases:
pypgx.api._pgx_graph.PgxGraph
A bipartite PgxGraph.
-
get_is_left_property
()¶ Get the ‘is Left’ vertex property of the graph.
-
-
class
pypgx.api.
CompiledProgram
(session, java_program)¶ Bases:
object
A compiled Green-Marl program.
Constructor arguments:
session – PGX session
java_program – Java CompiledProgram
-
destroy
()¶ Free resources on the server taken up by this Program.
-
run
(*argv)¶ Run the compiled program with the given parameters.
If the Green-Marl procedure of this compiled program looks like this: procedure pagerank(G: Graph, e double, max int, nodePorp){…}
- Parameters
argv – All the arguments required by specified procedure
- Returns
Result of the analysis (an AnalysisResult) as a dict
-
-
class
pypgx.api.
EdgeCollection
(graph, java_collection)¶ Bases:
pypgx.api._pgx_collection.PgxCollection
A collection of edges.
-
add
(e)¶ Add one or multiple edges to the collection.
- Parameters
e – Edge or edge id. Can also be an iterable of edge/edge ids.
-
add_all
(edges)¶ Add multiple edges to the collection.
- Parameters
edges – Iterable of edges/edge ids
-
contains
(e)¶ Check if the collection contains edge e.
- Parameters
e – PgxEdge object or id
- Returns
Boolean
-
remove
(e)¶ Remove one or multiple edges from the collection.
- Parameters
e – Edge or edge id. Can also be an iterable of edges/edge ids.
-
remove_all
(edges)¶ Remove multiple edges from the collection.
- Parameters
edges – Iterable of edges/edge ids
-
-
class
pypgx.api.
EdgeSequence
(graph, java_collection)¶ Bases:
pypgx.api._pgx_collection.EdgeCollection
-
class
pypgx.api.
EdgeSet
(graph, java_collection)¶ Bases:
pypgx.api._pgx_collection.EdgeCollection
-
class
pypgx.api.
GraphBuilder
(session, java_graph_builder, id_type)¶ Bases:
object
-
add_edge
(src, dst, edge_id=None)¶ - Parameters
src – Source VertexBuilder or ID
dst – Destination VertexBuilder or ID
edge_id – The ID of the new edge
-
add_vertex
(vertex=None)¶ Add the vertex with the given id to the graph builder.
If the vertex doesn’t exist, it is added; if it exists, a builder for that vertex is returned. Throws an UnsupportedOperationException if the vertex ID generation strategy is set to IdGenerationStrategy.AUTO_GENERATED.
- Parameters
vertex – The ID of the new vertex
- Returns
A VertexBuilder instance
-
build
(name=None)¶ - Parameters
name – The new name of the graph. If None, a name is generated.
- Returns
PgxGraph object
-
reset_edge
(edge)¶ Reset any change for the given edge.
- Parameters
edge – The id or the EdgeBuilder object to reset
- Returns
self
-
reset_vertex
(vertex)¶ Reset any change for the given vertex.
- Parameters
vertex – The id or the VertexBuilder object to reset
- Returns
self
-
-
class
pypgx.api.
MatrixFactorizationModel
(graph, java_mfm, features)¶ Bases:
object
Object that holds the state for repeatedly returning estimated ratings.
-
get_estimated_rating
(v)¶ Return estimated ratings for a specific vertex.
- Parameters
v – The vertex to get estimated ratings for.
- Returns
The VertexProperty containing the estimated ratings.
-
-
class
pypgx.api.
Namespace
(java_namespace)¶ Bases:
object
Represents a namespace for objects (e.g. graphs, properties) in PGX.
-
class
pypgx.api.
PgqlResultSet
(graph, java_pgql_result_set)¶ Bases:
object
Result set of a pattern matching query.
Note: retrieving results from the server is not thread-safe.
-
get_row
(row)¶ Get row from result_set.
- Parameters
row – Row index
-
get_slice
(start, stop, step=1)¶ Get slice from result_set.
- Parameters
start – Start index
stop – Stop index
step – Step size
-
print
(file=None, num_results=1000, start=0)¶ Print the result set.
- Parameters
file – File to which results are printed (default is sys.stdout)
num_results – Number of results to be printed
start – Index of the first result to be printed
-
-
class
pypgx.api.
Pgx
(java_pgx_class)¶ Bases:
object
Main entry point for PGX applications.
-
create_session
(source=None, base_url=None)¶ Create and return a session.
- Parameters
source – The session source string. Default value is “pgx_python”.
base_url – The base URL in the format host [ : port][ /path] of the PGX server REST end-point. If base_url is None, the default is used, which points to an embedded PGX instance.
-
get_instance
(base_url=None, token=None)¶ Get a handle to a PGX instance.
- Parameters
base_url – The base URL in the format host [ : port][ /path] of the PGX server REST end-point. If base_url is None, the default is used, which points to an embedded PGX instance.
token – The access token
-
set_default_url
(url)¶ Set the default base URL used by invocations of get_instance().
The new default URL affects subsequent calls of get_instance().
- Parameters
url – New URL
-
-
class
pypgx.api.
PgxCollection
(graph, java_collection)¶ Bases:
object
Superclass for Pgx collections.
-
clone
(name=None)¶ Clone and rename existing collection.
- Parameters
name – New name of the collection. If None, the old name is not changed.
-
property
size
¶ Get the number of elements in this collection.
-
to_mutable
(name=None)¶ Create a mutable copy of an existing collection.
- Parameters
name – New name of the collection. If None, the old name is not changed.
-
-
class
pypgx.api.
PgxEdge
(graph, java_edge)¶ Bases:
pypgx.api._pgx_entity.PgxEntity
-
property
label
¶ Return the edge label.
-
property
vertices
¶ Return the source and the destination vertex.
-
property
-
class
pypgx.api.
PgxEntity
(graph, java_entity)¶ Bases:
object
An abstraction of vertex and edge.
-
get_property
(property_name)¶ Get a property by name.
- Parameters
property_name – Property name
-
set_property
(property_name, value)¶ Set an entity property.
- Parameters
property_name – Property name
value – New value
-
-
class
pypgx.api.
PgxFrameColumn
(java_pgx_frame_column)¶ Bases:
object
-
destroy
()¶ Free resources on the server taken up by this column.
-
get_descriptor
()¶ Return a description of the column.
-
-
class
pypgx.api.
PgxGraph
(session, java_graph)¶ Bases:
object
A reference to a graph on the server side.
Operations on instances of this class are executed on the server side onto the referenced graph. Note that a session can have multiple objects referencing the same graph: the result of any operation mutating the graph on any of those references will be visible on all of them.
-
bipartite_sub_graph_from_in_degree
(vertex_properties=True, edge_properties=True, name=None, is_left_name=None, in_place=False)¶ Create a bipartite version of this graph with all vertices of in-degree = 0 being the left set.
- Parameters
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
name – New graph name
is_left_name – Name of the boolean isLeft vertex property of the new graph. If None, a name will be generated.
in_place – Whether to create a new copy (False) or overwrite this graph (True)
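As a rough illustration of the rule "all vertices of in-degree = 0 form the left set", the pure-Python sketch below derives an is-left flag per vertex from a plain edge list. It does not use PGX; the helper name is_left_from_in_degree and the (src, dst) tuple format are assumptions made for this example.

```python
def is_left_from_in_degree(vertices, edges):
    """Map each vertex to True if it has no incoming edge (in-degree 0)."""
    has_incoming = {dst for _src, dst in edges}
    return {v: v not in has_incoming for v in vertices}


is_left = is_left_from_in_degree(
    ["u1", "u2", "item"],
    [("u1", "item"), ("u2", "item")],
)
```

Here u1 and u2 have no incoming edges and land on the left side, while item ends up on the right.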
-
bipartite_sub_graph_from_left_set
(vset, vertex_properties=True, edge_properties=True, name=None, is_left_name=None)¶ Create a bipartite version of this graph with the given vertex set being the left set.
- Parameters
vset – Vertex set representing the left side
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
name – name of the new graph. If None, a name will be generated.
is_left_name – Name of the boolean isLeft vertex property of the new graph. If None, a name will be generated.
-
clone
(vertex_properties=True, edge_properties=True, name=None)¶ Return a copy of this graph.
- Parameters
vertex_properties – List of vertex properties belonging to graph specified to be cloned as well
edge_properties – List of edge properties belonging to graph specified to be cloned as well
name – Name of the new graph
-
clone_and_execute_pgql
(pgql_query)¶ Create a deep copy of the graph and execute the PGQL query on it.
- Parameters
pgql_query – Query string in PGQL
- Returns
A cloned PgxGraph with the PGQL query executed
- throws InterruptedException if the caller thread gets interrupted while waiting for completion.
- throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.
-
close
()¶ Destroy without waiting for completion.
-
combine_edge_properties_into_vector_property
(properties, name=None)¶ Take a list of scalar edge properties of same type and create a new edge vector property by combining them.
The dimension of the vector property will be equal to the number of properties.
- Parameters
properties – List of scalar edge properties
name – Name for the vector property. If not null, vector property will be named. If that results in a name conflict, the returned future will complete exceptionally.
-
combine_vertex_properties_into_vector_property
(properties, name=None)¶ Take a list of scalar vertex properties of same type and create a new vertex vector property by combining them.
The dimension of the vector property will be equal to the number of properties.
- Parameters
properties – List of scalar vertex properties
name – Name for the vector property. If not null, vector property will be named. If that results in a name conflict, the returned future will complete exceptionally.
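The combination step can be pictured with plain dictionaries: each scalar property maps a vertex to one value, and the vector property maps the same vertex to the list of those values, in the order the properties were given. The sketch below is not PGX code; combine_into_vector_property and the dict-based property representation are assumptions for illustration.

```python
def combine_into_vector_property(properties):
    """properties: list of dicts mapping vertex -> scalar, all over the same vertices."""
    if not properties:
        raise ValueError("at least one property is required")
    vertices = list(properties[0])
    # One vector per vertex; component i comes from the i-th scalar property.
    return {v: [prop[v] for prop in properties] for v in vertices}


x = {"v1": 1.0, "v2": 2.0}
y = {"v1": 10.0, "v2": 20.0}
vector_prop = combine_into_vector_property([x, y])
dimension = len(vector_prop["v1"])
```

Combining two scalar properties yields vectors of dimension 2, matching the statement that the dimension equals the number of properties.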
-
property
config
¶ Get the GraphConfig object.
-
create_components
(components, num_components)¶ Create a Partition object holding a collection of vertex sets, one for each component.
- Parameters
components – Vertex property mapping each vertex to its component ID. Note that only component IDs in the range of [0..numComponents-1] are allowed. The returned future will complete exceptionally with an IllegalArgumentException if an invalid component ID is encountered. Gaps are supported: IDs that are not associated with any vertices will yield empty components.
num_components – How many different components the components property contains
- Returns
The Partition object
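The relationship between the components property and the resulting partition can be sketched in plain Python. This is only an illustration of the documented rules (IDs must lie in [0..num_components-1], gaps give empty components); the function name components_to_partition and the use of a dict for the property and a ValueError for invalid IDs are assumptions, not the PGX API.

```python
def components_to_partition(components, num_components):
    """components: dict mapping vertex -> component ID."""
    partition = [set() for _ in range(num_components)]
    for vertex, component_id in components.items():
        if not 0 <= component_id < num_components:
            raise ValueError(f"invalid component ID {component_id} for vertex {vertex!r}")
        partition[component_id].add(vertex)
    return partition


# Component 1 has no vertices, so it shows up as an empty set.
partition = components_to_partition({1: 0, 2: 0, 3: 2}, num_components=3)
```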
-
create_edge_property
(data_type, name=None)¶ Create a session-bound edge property.
- Parameters
data_type – Type of the property to create
name – Name of the property to be created
-
create_edge_sequence
(name=None)¶ Create a new edge sequence.
- Parameters
name – Sequence name
-
create_edge_set
(name=None)¶ Create a new edge set.
- Parameters
name – Edge set name
-
create_edge_vector_property
(data_type, dim, name=None)¶ Create a session-bound edge vector property.
- Parameters
data_type – Type of the vector property to create
dim – Dimension of vector property to be created
name – Name of vector property to be created
-
create_map
(key_type, val_type, name=None)¶ Create a session-bound map.
Possible types are: [‘integer’, ‘long’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’]
- Parameters
key_type – Property type of the keys that are going to be stored inside the map
val_type – Property type of the values that are going to be stored inside the map
name – Map name
-
create_path
(src, dst, cost, parent, parent_edge)¶ - Parameters
src – Source vertex of the path
dst – Destination vertex of the path
cost – Property holding the edge costs. If null, the resulting cost will equal the hop distance.
parent – Property holding the parent vertices for each vertex of the shortest path. For example, if the shortest path is A -> B -> C, then parent[C] -> B and parent[B] -> A.
parent_edge – Property holding the parent edges for each vertex of the shortest path
- Returns
The PgxPath object
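The parent convention described above (for A -> B -> C, parent[C] is B and parent[B] is A) means a path can be recovered by walking parent pointers backwards from the destination. The following pure-Python sketch shows that walk; it is independent of PGX, and path_from_parents with a dict-based parent mapping is an assumption made for illustration.

```python
def path_from_parents(src, dst, parent):
    """Walk parent pointers from dst back to src and return the vertices src..dst."""
    path = [dst]
    while path[-1] != src:
        current = path[-1]
        # A missing parent or an overly long walk means src is not an ancestor of dst.
        if current not in parent or len(path) > len(parent):
            raise ValueError("no path from src to dst in the parent mapping")
        path.append(parent[current])
    path.reverse()
    return path


parent = {"C": "B", "B": "A"}
path = path_from_parents("A", "C", parent)
hop_distance = len(path) - 1
```

When no cost property is given, the documented cost is the hop distance, which in this sketch is simply the number of edges on the recovered path.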
-
create_scalar
(data_type, name=None)¶ Create a new Scalar.
- Parameters
data_type – Scalar type
name – Name of the scalar to be created
-
create_vector_scalar
(data_type, name=None)¶ Create a new vector scalar.
- Parameters
data_type – Property type
name – Name of the scalar to be created
-
create_vertex_property
(data_type, name=None)¶ Create a new vertex property.
- Parameters
data_type – Property type
name – Name of the property to be created
-
create_vertex_sequence
(name=None)¶ Create a new vertex sequence.
- Parameters
name – Sequence name
-
create_vertex_set
(name=None)¶ Create a new vertex set.
- Parameters
name – Set name
-
create_vertex_vector_property
(data_type, dim, name=None)¶ Create a session-bound vertex vector property.
- Parameters
data_type – Type of the vector property to create
dim – Dimension of vector property to be created
name – Name of vector property to be created
-
destroy_edge_property_if_exists
(name)¶ Destroy a specific edge property if it exists.
- Parameters
name – Property name
-
destroy_vertex_property_if_exists
(name)¶ Destroy a specific vertex property if it exists.
- Parameters
name – Property name
-
execute_pgql
(pgql_query)¶ (BETA) Blocking version of cloneAndExecutePgqlAsync(String).
Calls cloneAndExecutePgqlAsync(String) and waits for the returned PgxFuture to complete.
throws InterruptedException if the caller thread gets interrupted while waiting for completion.
throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.
- Parameters
pgql_query – Query string in PGQL
- Returns
The query result set as PgqlResultSet object
-
explain_pgql
(pgql_query)¶ Explain the execution plan of a pattern matching query.
Note: Different PGX versions may return different execution plans.
- Parameters
pgql_query – Query string in PGQL
- Returns
The query plan
-
filter
(graph_filter, vertex_properties=True, edge_properties=True, name=None)¶ Create a subgraph of this graph.
To create the subgraph, a given filter expression is used to determine which parts of the graph will be part of the subgraph.
- Parameters
graph_filter – Object representing a filter expression that is applied to create the subgraph
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
name – Filtered graph name
-
get_collections
()¶ Retrieve all currently allocated collections associated with the graph.
-
get_edge
(eid)¶ Get an edge with a specified id.
- Parameters
eid – Edge id
-
get_edge_label
()¶ Get the edge labels belonging to this graph.
-
get_edge_properties
()¶ Get the set of edge properties belonging to this graph.
This list might contain transient, private and published properties.
-
get_edge_property
(name)¶ Get an edge property by name.
- Parameters
name – Property name
-
get_edges
(filter_expr=None, name=None)¶ Create a new edge set containing edges according to the given filter expression.
- Parameters
filter_expr – EdgeFilter object with the filter expression. If None, all the edges are returned.
name – The name of the collection to be created. If None, a name will be generated.
-
get_id
()¶ Get the Graph id.
- Returns
A string representation of the id of this graph.
-
get_meta_data
()¶ Get the GraphMetaData object.
- Returns
A ‘GraphMetaData’ object of this graph.
-
get_or_create_edge_property
(name, data_type=None, dim=0)¶ Get or create an edge property.
- Parameters
name – Property name
data_type – Property type
dim – Dimension of vector property to be created
-
get_or_create_edge_vector_property
(data_type, dim, name=None)¶ Get or create a session-bound edge vector property.
- Parameters
data_type – Type of the vector property to create
dim – Dimension of vector property to be created
name – Name of vector property to be created
-
get_or_create_vertex_property
(name, data_type=None, dim=0)¶ Get or create a vertex property.
- Parameters
name – Property name
data_type – Property type
dim – Dimension of vector property to be created
-
get_or_create_vertex_vector_property
(data_type, dim, name=None)¶ Get or create a session-bound vertex vector property.
- Parameters
data_type – Type of the vector property to create
dim – Dimension of vector property to be created
name – Name of vector property to be created
-
get_random_edge
()¶ Get a random edge from the graph.
-
get_random_vertex
()¶ Get a random vertex from the graph.
-
get_vertex
(vid)¶ Get a vertex with a specified id.
- Parameters
vid – Vertex id
- Returns
PgxVertex object
-
get_vertex_labels
()¶ Get the vertex labels belonging to this graph.
-
get_vertex_properties
()¶ Get the set of vertex properties belonging to this graph.
This list might contain transient, private and published properties.
-
get_vertex_property
(name)¶ Get a vertex property by name.
- Parameters
name – Property name
-
get_vertices
(filter_expr=None, name=None)¶ Create a new vertex set containing vertices according to the given filter expression.
- Parameters
filter_expr – VertexFilter object with the filter expression. If None, all the vertices are returned.
name – The name of the collection to be created. If None, a name will be generated.
-
has_edge
(eid)¶ Check if the edge with id eid is in the graph.
- Parameters
eid – Edge id
-
has_edge_label
()¶ Return True if the graph has edge labels, False if not.
-
has_vertex
(vid)¶ Check if the vertex with id vid is in the graph.
- Parameters
vid – Vertex id
-
has_vertex_labels
()¶ Return True if the graph has vertex labels, False if not.
-
is_bipartite
(is_left)¶ Check whether a given graph is a bipartite graph.
A graph is considered a bipartite graph if all nodes can be divided in a ‘left’ and a ‘right’ side where edges only go from nodes on the ‘left’ side to nodes on the ‘right’ side.
- Parameters
is_left – Boolean vertex property that - if the method returns true - will contain for each node whether it is on the ‘left’ side of the bipartite graph. If the method returns False, the content is undefined.
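The bipartite condition stated above can be checked for a given left/right assignment with a few lines of plain Python. Note that this sketch only validates an existing assignment, whereas the method documented here also fills the is_left property; the helper name edges_go_left_to_right and the dict-based is_left mapping are assumptions for illustration and not part of PGX.

```python
def edges_go_left_to_right(edges, is_left):
    """Return True if every edge starts on the 'left' side and ends on the 'right' side."""
    return all(is_left[src] and not is_left[dst] for src, dst in edges)


is_left = {"a": True, "b": True, "x": False}
ok = edges_go_left_to_right([("a", "x"), ("b", "x")], is_left)
bad = edges_go_left_to_right([("a", "b")], is_left)  # edge between two left vertices
```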
-
property
is_fresh
¶ Check whether an in-memory representation of a graph is fresh.
-
property
is_published
¶ Check if this graph is published with snapshots.
-
is_published_with_snapshots
()¶ Check if this graph is published with snapshots.
- Returns
True if this graph is published, false otherwise
-
property
pgx_instance
¶ Get the server instance.
-
pick_random_vertex
()¶ Select a random vertex from the graph.
- Returns
The PgxVertex object
-
prepare_pgql
(pgql_query)¶ Prepare a PGQL query.
- Parameters
pgql_query – Query string in PGQL
- Returns
A prepared statement object
-
publish
(vertex_properties=True, edge_properties=True)¶ Publish the graph so it can be shared between sessions.
This moves the graph name from the private into the public namespace.
- Parameters
vertex_properties – List of vertex properties belonging to graph specified to be published as well
edge_properties – List of edge properties belonging to graph specified to be published as well
-
publish_with_snapshots
()¶ Publish the graph and all its snapshots so they can be shared between sessions.
-
query_pgql
(query)¶ Submit a pattern matching select only query.
- Parameters
query – Query string in PGQL
- Returns
PgqlResultSet with the result
-
rename
(name)¶ Rename this graph.
- Parameters
name – New name
-
simplify
(vertex_properties=True, edge_properties=True, keep_multi_edges=False, keep_self_edges=False, keep_trivial_vertices=False, in_place=False, name=None)¶ Create a simplified version of a graph.
Note that the returned graph and properties are transient and therefore session bound. They can be explicitly destroyed and get automatically freed once the session dies.
- Parameters
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
keep_multi_edges – Defines if multi-edges should be kept in the result
keep_self_edges – Defines if self-edges should be kept in the result
keep_trivial_vertices – Defines if isolated nodes should be kept in the result
in_place – If the operation should be done in place or if a new graph has to be created
name – New graph name
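The effect of the three keep_* flags can be illustrated on a plain edge list. The sketch below is not the PGX implementation; simplify_edges, the (src, dst) tuple format, and the choice to treat a vertex as trivial once all of its edges have been removed are assumptions made for this example.

```python
def simplify_edges(vertices, edges, keep_multi_edges=False,
                   keep_self_edges=False, keep_trivial_vertices=False):
    """Return (kept_vertices, kept_edges) after applying the three flags."""
    kept_edges = []
    seen = set()
    for src, dst in edges:
        if not keep_self_edges and src == dst:
            continue  # drop self-edges
        if not keep_multi_edges:
            if (src, dst) in seen:
                continue  # drop repeated edges between the same pair
            seen.add((src, dst))
        kept_edges.append((src, dst))
    if keep_trivial_vertices:
        kept_vertices = list(vertices)
    else:
        # Assumption: a vertex is trivial if no remaining edge touches it.
        touched = {v for edge in kept_edges for v in edge}
        kept_vertices = [v for v in vertices if v in touched]
    return kept_vertices, kept_edges


kept_vertices, kept_edges = simplify_edges(
    [1, 2, 3, 4],
    [(1, 2), (1, 2), (3, 3)],
)
```

With the default flags, the duplicate edge (1, 2) and the self-edge (3, 3) are removed, which leaves vertices 3 and 4 isolated and therefore dropped.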
-
sort_by_degree
(vertex_properties=True, edge_properties=True, ascending=True, in_degree=True, in_place=False, name=None)¶ Create a sorted version of a graph and all its properties.
The returned graph is sorted such that the node numbering is ordered by the degree of the nodes. Note that the returned graph and properties are transient.
- Parameters
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
ascending – Sorting order
in_degree – If in_degree should be used for sorting. Otherwise use out degree.
in_place – If the sorting should be done in place or a new graph should be created
name – New graph name
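How ascending and in_degree interact can be shown with a small pure-Python ordering of vertices by degree. This is a sketch under stated assumptions, not PGX code: degree_order is a hypothetical helper, edges are (src, dst) tuples, and ties keep their input order because Python's sort is stable.

```python
def degree_order(vertices, edges, ascending=True, in_degree=True):
    """Return the vertices ordered by in-degree (or out-degree)."""
    degree = {v: 0 for v in vertices}
    for src, dst in edges:
        degree[dst if in_degree else src] += 1
    return sorted(vertices, key=lambda v: degree[v], reverse=not ascending)


edges = [("a", "b"), ("c", "b"), ("a", "c")]
by_in_degree = degree_order(["a", "b", "c"], edges)
by_out_degree_desc = degree_order(["a", "b", "c"], edges, ascending=False, in_degree=False)
```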
-
sparsify
(sparsification, vertex_properties=True, edge_properties=True, name=None)¶ Sparsify the given graph and return a new graph with fewer edges.
- Parameters
sparsification – The sparsification coefficient. Must be between 0.0 and 1.0.
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
name – Filtered graph name
-
store
(format, path, num_partitions=None, vertex_properties=True, edge_properties=True, overwrite=False)¶ Store graph in a file.
- Parameters
format – One of [‘pgb’, ‘edge_list’, ‘two_tables’, ‘adj_list’, ‘flat_file’, ‘graphml’, ‘pg’, ‘rdf’, ‘csv’]
path – Path to which graph will be stored
num_partitions – The number of partitions that should be created, when exporting to multiple files
vertex_properties – The collection of vertex properties to store together with the graph data. If not specified, all the vertex properties are stored
edge_properties – The collection of edge properties to store together with the graph data. If not specified, all the edge properties are stored
overwrite – Overwrite if existing
-
transpose
(vertex_properties=True, edge_properties=True, edge_label_mapping=None, in_place=False, name=None)¶ Create a transpose of this graph.
A transpose of a directed graph is another directed graph on the same set of vertices with all of the edges reversed. If this graph contains an edge (u,v) then the return graph will contain an edge (v,u) and vice versa. If this graph is undirected (isDirected() returns false), this operation has no effect and will either return a copy or act as identity function depending on the mode parameter.
- Parameters
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
edge_label_mapping – Can be used to rename edge labels. For example, an edge (John,Mary) labeled “fatherOf” can be transformed to be labeled “hasFather” on the transpose graph’s edge (Mary,John) by passing in a dict-like object {“fatherOf”: “hasFather”}.
in_place – If the transpose should be done in place or a new graph should be created
name – New graph name
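The edge reversal and the label renaming can be sketched on a plain list of labeled edges. This illustrates the described semantics only and does not call PGX; transpose_edges and the (src, dst, label) tuple format are assumptions made for this example.

```python
def transpose_edges(edges, edge_label_mapping=None):
    """Reverse every (src, dst, label) edge and rename labels found in the mapping."""
    mapping = edge_label_mapping or {}
    # Labels without an entry in the mapping are kept unchanged.
    return [(dst, src, mapping.get(label, label)) for src, dst, label in edges]


edges = [("John", "Mary", "fatherOf"), ("Mary", "Ann", "knows")]
transposed = transpose_edges(edges, {"fatherOf": "hasFather"})
```

Transposing twice without a mapping gives back the original edge list, which is a quick sanity check on the reversal.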
-
undirect
(vertex_properties=True, edge_properties=True, keep_multi_edges=True, keep_self_edges=True, keep_trivial_vertices=True, in_place=False, name=None)¶ - Parameters
vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties – List of edge properties belonging to graph specified to be kept in the new graph
keep_multi_edges – Defines if multi-edges should be kept in the result
keep_self_edges – Defines if self-edges should be kept in the result
keep_trivial_vertices – Defines if isolated nodes should be kept in the result
in_place – If the operation should be done in place or if a new graph has to be created
name – New graph name
-
-
class
pypgx.api.
PgxMap
(graph, java_map)¶ Bases:
object
A map is a collection of key-value pairs.
-
contains_key
(key)¶ - Parameters
key – Key of the entry
-
entries
()¶ Return an entry set.
-
get
(key)¶ Get the entry with the specified key.
- Parameters
key – Key of the entry
- Returns
Value
-
keys
()¶ Return a key set.
-
put
(key, value)¶ Set the value for a key in the map specified by the given name.
Returns the old value for the given key or null if there was no entry for this key before.
- Parameters
key – Key of the entry
value – New value
- Returns
Old value for the key or null if there was no entry before
-
remove
(key)¶ Remove the entry specified by the given key from the map with the given name.
Returns true if the map did contain an entry with the given key, false otherwise.
- Parameters
key – Key of the entry
- Returns
True if the map contained the key
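The return values of put and remove follow a familiar pattern that can be mimicked with a plain Python dict. The sketch below is only an analogy for the documented behavior (old value or null on put, boolean on remove); the free functions put and remove operating on a dict are assumptions and not the PgxMap API, and Python's None stands in for null.

```python
def put(mapping, key, value):
    """Store value under key and return the previous value (None if there was none)."""
    old_value = mapping.get(key)
    mapping[key] = value
    return old_value


def remove(mapping, key):
    """Remove key and report whether it was present."""
    if key in mapping:
        del mapping[key]
        return True
    return False


m = {}
first = put(m, "k", 1)
second = put(m, "k", 2)
removed = remove(m, "k")
removed_again = remove(m, "k")
```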
-
property
size
¶ Map size.
-
-
class
pypgx.api.
PgxPartition
(graph, java_partition, property)¶ Bases:
object
-
get_partition_by_index
(idx)¶ Get a partition by index.
- Parameters
idx – The index. Must be between 0 and size() - 1.
- Returns
The set of vertices representing the partition
-
get_partition_by_vertex
(v)¶ Get the partition a particular vertex belongs to.
- Parameters
v – The vertex
- Returns
The set of vertices representing the partition the given vertex belongs to
-
get_partition_index_of_vertex
(v)¶ Get the index of the partition a particular vertex belongs to.
- Parameters
v – The vertex
- Returns
The index of the partition the given vertex belongs to
-
-
class
pypgx.api.
PgxSession
(java_session)¶ Bases:
object
A PGX session represents an active user connected to a ServerInstance.
Every session gets a workspace assigned on the server, which can be used to read graph data, create transient data or custom algorithms for the sake of graph analysis. Once a session gets destroyed, all data in the session workspace is freed.
-
compile_program
(path, overwrite=False)¶ Compile a Green-Marl program for parallel execution with all optimizations enabled.
- Parameters
path – Path to program
overwrite – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise
-
compile_program_code
(code, overwrite=False)¶ Compile a Green-Marl program.
- Parameters
code – The Green-Marl code to compile
overwrite – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise
-
create_analyst
()¶ Create and return a new analyst.
- Returns
An analyst object
-
create_graph_builder
(id_type='integer', vertex_id_generation_strategy='user_ids', edge_id_generation_strategy='auto_generated')¶ Create a graph builder with integer vertex IDs.
- Parameters
id_type – The type of the vertex ID
vertex_id_generation_strategy – The vertex ID generation strategy to be used
edge_id_generation_strategy – The edge ID generation strategy to be used
-
describe_graph_file
(file_path)¶ Describe the graph contained in the file at the given path.
- Parameters
file_path – Graph file path
- Returns
The configuration which can be used to load the graph
-
describe_graph_files
(files_path)¶ Describe the graph contained in the files at the given paths.
- Parameters
files_path – Paths to the files
- Returns
The configuration which can be used to load the graph
-
destroy
()¶ Destroy this session object.
-
explain_pgql
(pgql_query)¶ Explain the execution plan of a pattern matching query.
Note: Different PGX versions may return different execution plans.
- Parameters
pgql_query – Query string in PGQL
- Returns
The query plan
-
get_available_compiled_program_ids
()¶ Get the set of available compiled program IDs.
-
get_available_snapshots
(snapshot)¶ Return a list of all available snapshots of the given input graph.
- Parameters
snapshot – A ‘PgxGraph’ object for which the available snapshots shall be retrieved
- Returns
A list of ‘GraphMetaData’ objects, each corresponding to a snapshot of the input graph
-
get_compiled_program
(id)¶ Get a compiled program by ID.
- Parameters
id – The id of the compiled program
-
get_graph
(name, namespace=None)¶ Find and return a graph with name name within the given namespace loaded inside PGX.
The search for the snapshot to return is done according to the following rules:
if namespace is private, then the search occurs on already referenced snapshots of the graph with name name and the most recent snapshot is returned
if namespace is public, then the search occurs on published graphs and the most recent snapshot of the published graph with name name is returned
if namespace is null, then the private namespace is searched first and, if no snapshot is found, the public namespace is then searched
Multiple calls of this method with the same parameters will return different PgxGraph objects referencing the same graph, with the server keeping track of how many references a session has to each graph.
Therefore, a graph is released within the server either if:
all the references are moved to another graph (e.g. via setSnapshot(PgxGraph, long))
the Destroyable.destroy() method is called on one reference: note that this invalidates all references
- Parameters
name – The name of the graph
namespace – The namespace where to look up the graph
- Returns
The graph with the given name
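The namespace lookup rules listed above amount to a small decision procedure, sketched here in plain Python. The sketch does not touch PGX: lookup_graph is a hypothetical helper, the two dicts stand in for the private and public namespaces (each mapping a graph name to its most recent snapshot), and the string constants for the namespaces are assumptions.

```python
def lookup_graph(name, private_graphs, public_graphs, namespace=None):
    """Resolve a graph name following the private/public/None rules."""
    if namespace == "private":
        return private_graphs.get(name)
    if namespace == "public":
        return public_graphs.get(name)
    if namespace is None:
        # Private namespace first, public namespace as the fallback.
        if name in private_graphs:
            return private_graphs[name]
        return public_graphs.get(name)
    raise ValueError(f"unknown namespace: {namespace!r}")


private_graphs = {"sales": "sales@private"}
public_graphs = {"sales": "sales@public", "geo": "geo@public"}
found_default = lookup_graph("sales", private_graphs, public_graphs)
found_fallback = lookup_graph("geo", private_graphs, public_graphs)
found_public = lookup_graph("sales", private_graphs, public_graphs, namespace="public")
```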
-
get_graphs
(namespace)¶ Return a collection of graph names accessible under the given namespace.
- Parameters
namespace – The namespace where to look up the graphs
-
get_name
()¶ Get the name of the current session.
- Returns
Name of this session
-
get_pattern_matching_semantic
()¶ - Returns
The current pattern matching semantic. If the return value is None, the current session respects the pattern matching configuration of the engine.
-
get_pgql_result_set
(id)¶ Get a PGQL result set by ID.
- Parameters
id – The PGQL result set ID
- Returns
The requested PGQL result set or None if no such result set exists for this session
-
prepare_pgql
(pgql_query)¶ Prepare a pattern matching query with an ON-clause.
The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as getGraph(String).
- Parameters
pgql_query – Query string in PGQL
- Returns
A prepared statement object
-
query_pgql
(pgql_query)¶ Submit a pattern matching query with an ON-clause.
The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraph(String).
- Parameters
pgql_query – Query string in PGQL
- Returns
The query result set
throws InterruptedException if the caller thread gets interrupted while waiting for completion. throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.
-
read_frame
()¶ Create a new frame reader with which it is possible to parameterize the loading of the row frame.
- Returns
A frame reader object with which it is possible to parameterize the loading
-
read_graph_as_of
(config, meta_data=None, creation_timestamp=None, new_graph_name=None)¶ Read a graph and its properties of a specific version (metaData or creationTimestamp) into memory.
The creationTimestamp must be a valid version of the graph.
- Parameters
config – The graph config
meta_data – The metaData object returned by get_available_snapshots(GraphConfig) identifying the version
creation_timestamp – The creation timestamp (milliseconds since Jan 1st 1970) identifying the version to be checked out
new_graph_name – How the graph should be named. If None, a name will be generated.
- Returns
The PgxGraph object
-
read_graph_file
(file_path, file_format=None, graph_name=None)¶ - Parameters
file_path – File path
file_format – File format of graph
graph_name – Name of graph
-
read_graph_files
(file_paths, edge_file_paths=None, file_format=None, graph_name=None)¶ Load the graph contained in the files at the given paths.
- Parameters
file_paths – Paths to the vertex files
edge_file_paths – Path to the edge file
file_format – File format
graph_name – Loaded graph name
-
read_graph_with_properties
(config, max_age=9223372036854775807, max_age_time_unit='days', block_if_full=False, update_if_not_fresh=True, graph_name=None)¶ Read a graph and its properties, specified in the graph config, into memory.
- Parameters
config – The graph config
max_age – If another snapshot of the given graph already exists, the age of the latest existing snapshot will be compared to the given max_age. If the latest snapshot is in the given range, it will be returned, otherwise a new snapshot will be created.
max_age_time_unit – The time unit of the max_age parameter
block_if_full – If true and a new snapshot needs to be created but no more snapshots are allowed by the server configuration, the returned future will not complete until space becomes available. If full and this flag is false, the returned future will complete exceptionally instead.
update_if_not_fresh – If a newer data version exists in the backing data source (see PgxGraph.is_fresh()), this flag tells whether to read it and create another snapshot inside PGX. If the “snapshots_source” field of config is SnapshotsSource.REFRESH, the returned graph may have multiple snapshots, depending on whether previous reads with the same config occurred; otherwise, if the “snapshots_source” field is SnapshotsSource.CHANGE_SET, only the most recent snapshot (either pre-existing or freshly read) will be visible.
graph_name – How the graph should be named. If None, a name will be generated. If a graph with that name already exists, the returned future will complete exceptionally.
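The default max_age of 9223372036854775807 is the largest signed 64-bit integer, so by default any existing snapshot counts as recent enough. The following sketch models the documented reuse rule in plain Python. It is an illustration only, not the server's implementation: the function name is hypothetical, both ages are assumed to be in the same time unit, and treating the boundary as inclusive is an assumption.

```python
# Largest signed 64-bit integer, which is the documented default for max_age.
JAVA_LONG_MAX = 2**63 - 1


def reuse_latest_snapshot(latest_age, max_age=JAVA_LONG_MAX):
    """Return True if the latest snapshot would be reused under the documented rule.

    latest_age and max_age are assumed to use the same time unit.
    The inclusive comparison is an assumption of this sketch.
    """
    return latest_age <= max_age


# With the default, even a very old snapshot is reused.
print(reuse_latest_snapshot(10_000))
# With max_age=0, any snapshot older than zero units triggers a new read.
print(reuse_latest_snapshot(5, max_age=0))
```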
-
register_keystore
(keystore_path, keystore_password)¶ Register a keystore.
- Parameters
keystore_path – The path to the keystore which shall be registered
keystore_password – The password of the provided keystore
-
property
server_instance
¶ Get the server instance.
- Returns
The server instance
-
set_pattern_matching_semantic
(pattern_matching_semantic)¶ Set the pattern matching semantic of the session.
- Parameters
pattern_matching_semantic – Pattern matching semantic. If None is passed, the session respects the pattern matching semantic of the engine. Should be either ‘HOMOMORPHISM’ or ‘ISOMORPHISM’.
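Since only two string values (or None) are documented for this parameter, a caller may want to check the value before passing it on. The sketch below is one way to do that. The helper normalize_semantic is hypothetical and not part of pypgx, and whether the library itself accepts lowercase input is not stated above, which is why the helper upper-cases the value.

```python
# The two values named in the parameter description above.
ALLOWED_SEMANTICS = ("HOMOMORPHISM", "ISOMORPHISM")


def normalize_semantic(value):
    """Return None or one of the documented semantic names; reject anything else."""
    if value is None:
        # None means the session follows the engine's setting.
        return None
    candidate = str(value).strip().upper()
    if candidate not in ALLOWED_SEMANTICS:
        raise ValueError(f"unsupported pattern matching semantic: {value!r}")
    return candidate


semantic = normalize_semantic("isomorphism")
# With an existing session (assumed), the checked value could be applied with:
# session.set_pattern_matching_semantic(semantic)
```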
-
set_snapshot
(graph, meta_data=None, creation_timestamp=None, force_delete_properties=False)¶ Set a graph to a specific snapshot.
You can use this method to jump back and forth in time between various snapshots of the same graph. If successful, the given graph will point to the requested snapshot after the returned future completes.
- Parameters
graph – Input graph
meta_data – A GraphMetaData object used to identify the snapshot
creation_timestamp – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out
force_delete_properties – Graphs with transient properties cannot be checked out to a different version. If this flag is set to true, the checked out graph will no longer contain any transient properties. If false, the returned future will complete exceptionally with an UnsupportedOperationException as its cause.
-
-
class
pypgx.api.
Scalar
(graph, java_scalar)¶ Bases:
object
A scalar value.
-
destroy
()¶ Free resources on the server taken up by this Scalar.
-
get
()¶ Get scalar value.
-
set
(value)¶ Set the scalar value.
- Parameters
value – Value to be assigned
-
-
class
pypgx.api.
ServerInstance
(java_server_instance)¶ Bases:
object
A PGX server instance.
-
create_session
(source, idle_timeout=None, task_timeout=None, time_unit='milliseconds')¶ - Parameters
source – A descriptive string identifying the client
idle_timeout – If not None, tries to overwrite server default idle timeout
task_timeout – If not None, tries to overwrite server default task timeout
time_unit – Time unit of idle_timeout and task_timeout (‘days’, ‘hours’, ‘microseconds’, ‘milliseconds’, ‘minutes’, ‘nanoseconds’, ‘seconds’)
- Returns
PgxSession
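The time_unit argument accepts a fixed set of strings, so a small check before the call can catch a misspelled unit early. The following is a minimal sketch under that assumption. The helper session_arguments is hypothetical and not part of pypgx; it only assembles keyword arguments that match the signature documented above.

```python
# Time units listed in the parameter description above.
TIME_UNITS = (
    "days", "hours", "microseconds", "milliseconds",
    "minutes", "nanoseconds", "seconds",
)


def session_arguments(source, idle_timeout=None, task_timeout=None,
                      time_unit="milliseconds"):
    """Build keyword arguments for create_session after checking the time unit."""
    if time_unit not in TIME_UNITS:
        raise ValueError(f"unsupported time unit: {time_unit!r}")
    return {
        "source": source,
        "idle_timeout": idle_timeout,
        "task_timeout": task_timeout,
        "time_unit": time_unit,
    }


args = session_arguments("my-client", idle_timeout=30, time_unit="minutes")
# With a ServerInstance named `instance` (assumed to exist), the call would be:
# session = instance.create_session(**args)
```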
-
get_pgx_config
()¶ Get the PGX config.
- Returns
Dict containing current config
-
get_server_state
()¶ - Returns
Server state as a dict
-
get_session
(session_id)¶ Get a session by ID.
- Parameters
session_id – Id of the session
- Returns
PgxSession
-
get_version
()¶ Get the PGX extended version of this instance.
- Returns
VersionInfo object
-
kill_session
(session_id)¶ Kill a session.
- Parameters
session_id – Session id
-
-
class
pypgx.api.
VertexCollection
(graph, java_collection)¶ Bases:
pypgx.api._pgx_collection.PgxCollection
A collection of vertices.
-
add
(v)¶ Add one or multiple vertices to the collection.
- Parameters
v – Vertex or vertex id. Can also be an iterable of vertices or vertex ids.
-
add_all
(vertices)¶ Add multiple vertices to the collection.
- Parameters
vertices – Iterable of vertices or vertex ids
-
contains
(v)¶ Check if the collection contains vertex v.
- Parameters
v – PgxVertex object or id
-
remove
(v)¶ Remove one or multiple vertices from the collection.
- Parameters
v – Vertex or vertex id. Can also be an iterable of vertices or vertex ids.
-
remove_all
(vertices)¶ Remove multiple vertices from the collection.
- Parameters
vertices – Iterable of vertices or vertex ids
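The add and remove methods accept either a single vertex (or id) or an iterable of them, while add_all and remove_all expect an iterable. Code that receives either form can normalize its input first. The sketch below shows one way to do that in plain Python. The helper as_vertex_list is hypothetical and does not reflect how pypgx handles this internally; treating strings as single ids rather than as iterables of characters is an assumption of the sketch.

```python
from collections.abc import Iterable


def as_vertex_list(v):
    """Wrap a single vertex or id in a list; copy any other iterable into a list."""
    # A string id is iterable in Python, but it stands for one vertex here.
    if isinstance(v, (str, bytes)) or not isinstance(v, Iterable):
        return [v]
    return list(v)


print(as_vertex_list(128))         # [128]
print(as_vertex_list("v1"))        # ['v1']
print(as_vertex_list((128, 333)))  # [128, 333]
# With an existing VertexCollection named `collection` (assumed), the result
# could be passed to the method documented above:
# collection.add_all(as_vertex_list(vertex_or_vertices))
```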
-
-
class
pypgx.api.
VertexSequence
(graph, java_collection)¶ Bases:
pypgx.api._pgx_collection.VertexCollection
-
class
pypgx.api.
VertexSet
(graph, java_collection)¶ Bases:
pypgx.api._pgx_collection.VertexCollection