5. Graph algorithms¶

The Analyst class can be used to execute algorithms on a given graph.

class pypgx.api.Analyst(session, java_analyst)

The Analyst gives access to all built-in algorithms of PGX.

Unlike some of the other classes inside this package, the Analyst is not stateless. It creates session-bound transient data to hold the result of algorithms and keeps track of them.

adamic_adar_counting(graph, aa='adamic_adar')

Adamic-adar counting compares the amount of neighbors shared between vertices, this measure can be used with communities.

Parameters

graph – Input graph
aa – Edge property holding the Adamic-Adar index for each edge in the graph. Can be a string or an EdgeProperty object.

Returns

Edge property holding the computed scores

all_reachable_vertices_edges(graph, src, dst, k, filter=None)

Find all the vertices and edges on a path between the src and target of length smaller or equal to k.

Parameters

graph – Input graph
src – The source vertex
dst – The destination vertex
k – The dimension of the distances property; i.e. number of high-degree vertices.
filter – The filter to be used on edges when searching for a path

Returns

The vertices on the path, the edges on the path and a map containing the distances from the source vertex for each vertex on the path

approximate_vertex_betweenness_centrality(graph, seeds, bc='approx_betweenness')

Parameters

graph – Input graph
seeds – The (unique) chosen nodes to be used to compute the approximated betweenness centrality coefficients
bc – Vertex property holding the betweenness centrality value for each vertex

Returns

Vertex property holding the computed scores

bipartite_check(graph, is_left='is_left')

Verify whether a graph is bipartite.

Parameters

graph – Input graph
is_left – (out-argument) vertex property holding the side of each vertex in a bipartite graph (true for left, false for right).

Returns

vertex property holding the side of each vertex in a bipartite graph (true for left, false for right).

center(graph, center=None)

Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.

The center is comprised by the set of vertices with eccentricity equal to the radius of the graph.

Parameters

graph – Input graph
center – (Out argument) vertex set holding the vertices from the periphery or center of the graph

close(): Destroy without waiting for completion.

closeness_centrality(graph, cc='closeness')

Parameters

graph – Input graph
cc – Vertex property holding the closeness centrality

communities_conductance_minimization(graph, max_iter=100, label='conductance_minimization')

Soman and Narang can find communities in a graph taking weighted edges into account.

Parameters

graph – Input graph
max_iter – Maximum number of iterations that will be performed
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

communities_infomap(graph, rank, weight, tau=0.15, tol=0.0001, max_iter=100, label='infomap')

Infomap can find high quality communities in a graph.

Parameters

graph – Input graph
rank – Vertex property holding the normalized PageRank value for each vertex
weight – Ridge property holding the weight of each edge in the graph
tau – Damping factor
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter – Maximum iteration number
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

communities_label_propagation(graph, max_iter=100, label='label_propagation')

Label propagation can find communities in a graph relatively fast.

Parameters

graph – Input graph
max_iter – Maximum number of iterations that will be performed
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

compute_high_degree_vertices(graph, k, high_degree_vertex_mapping=None, high_degree_vertices=None)

Compute the k vertices with the highest degrees in the graph.

Parameters

graph – Input graph
k – Number of high-degree vertices to be computed
high_degree_vertex_mapping – (out argument) map with the top k high-degree vertices and their indices
high_degree_vertices – (out argument) the high-degree vertices

Returns

a map with the top k high-degree vertices and their indices and a vertex set containing the same vertices

conductance(graph, partition, partition_idx)

Conductance assesses the quality of a partition in a graph.

Parameters

graph – Input graph
partition – Partition of the graph with the corresponding node collections
partition_idx – Number of the component to be used for computing its conductance

count_triangles(graph, sort_vertices_by_degree)

Triangle counting gives an overview of the amount of connections between vertices in neighborhoods.

Parameters

graph – Input graph
sort_vertices_by_degree – Boolean flag for sorting the nodes by their degree as preprocessing step

Returns

The total number of triangles found

create_distance_index(graph, high_degree_vertex_mapping, high_degree_vertices, index=None)

Compute an index with distances to each high-degree vertex

Parameters

graph – Input graph
high_degree_vertex_mapping – a map with the top k high-degree vertices and their indices and a vertex
high_degree_vertices – the high-degree vertices
index – (out-argument) the index containing the distances to each high-degree vertex for all vertices

Returns

the index containing the distances to each high-degree vertex for all vertices

deepwalk_builder(min_word_frequency=1, batch_size=128, num_epochs=2, layer_size=200, learning_rate=0.025, min_learning_rate=0.0001, window_size=5, walk_length=5, walks_per_vertex=4, sample_rate=1e-05, negative_sample=10, validation_fraction=0.05, seed=None)

Build a DeepWalk model and return it.

Parameters

min_word_frequency – Minimum word frequency to consider before pruning
batch_size – Batch size for training the model
num_epochs – Number of epochs to train the model
layer_size – Number of dimensions for the output vectors
learning_rate – Initial learning rate
min_learning_rate – Minimum learning rate
window_size – Window size to consider while training the model
walk_length – Length of the walks
walks_per_vertex – Number of walks to consider per vertex
sample_rate – Sample rate
negative_sample – Number of negative samples
validation_fraction – Fraction of training data on which to compute final loss
seed – Random seed for training the model

Returns

Built DeepWalk model

degree_centrality(graph, dc='degree')

Measure the centrality of the vertices based on its degree.

This lets you see how a vertex influences its neighborhood.

Parameters

graph – Input graph
dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

destroy(): Destroy with waiting for completion.

diameter(graph, eccentricity='eccentricity')

Diameter/radius gives an overview of the distances in a graph.

Parameters

graph – Input graph
eccentricity – (Out argument) vertex property holding the eccentricity value for each vertex

Returns

Pair holding the diameter of the graph and a node property with eccentricity value for each node

eigenvector_centrality(graph, tol=0.001, max_iter=100, l2_norm=False, in_edges=False, ec='eigenvector')

Eigenvector centrality gets the centrality of the vertices in an intrincated way using neighbors, allowing to find well-connected vertices.

Parameters

graph – Input graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter – Maximum iteration number
l2_norm – Boolean flag to determine whether the algorithm will use the l2 norm (Euclidean norm) or the l1 norm (absolute value) to normalize the centrality scores
in_edges – Boolean flag to determine whether the algorithm will use the incoming or the outgoing edges in the graph for the computations
ec – Vertex property holding the resulting score for each vertex

Returns

Vertex property holding the computed scores

enumerate_simple_paths(graph, src, dst, k, vertices_on_path, edges_on_path, dist)

Enumerate simple paths between the source and destination vertex.

Parameters

graph – Input graph
src – The source vertex
dst – The destination vertex
k – maximum number of iterations
vertices_on_path – VertexSet containing all vertices to be considered while enumerating paths
edges_on_path – EdgeSet containing all edges to be consider while enumerating paths
dist – map containing the hop-distance from the source vertex to each vertex that is to be considered while enumerating the paths

Returns

Triple containing containing the path lengths, a vertex-sequence containing the vertices on the paths and edge-sequence containing the edges on the paths

fattest_path(graph, root, capacity, distance='fattest_path_distance', parent='fattest_path_parent', parent_edge='fattest_path_parent_edge')

Fattest path is a fast algorithm for finding a shortest path adding constraints for flowing related matters.

Parameters

graph – Input graph
root – Fattest path is a fast algorithm for finding a shortest path adding constraints for flowing related matters
capacity – Edge property holding the capacity of each edge in the graph
distance – Vertex property holding the capacity value of the fattest path up to the current vertex
parent – Vertex property holding the parent vertex of the each vertex in the fattest path
parent_edge – Vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths object holding the information of the possible fattest paths from the source node

filtered_bfs(graph, root, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')

Breadth-first search with an option to filter edges during the traversal of the graph.

Parameters

graph – Input graph
root – The source vertex from the graph for the path.
navigator – Navigator expression to be evaluated on the vertices during the graph traversal
init_with_inf – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.
max_depth – Maximum depth limit for the BFS traversal
distance – Vertex property holding the hop distance for each reachable vertex in the graph
parent – Vertex property holding the parent vertex of the each reachable vertex in the path

Returns

Distance and parent vertex properties

filtered_dfs(graph, root, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')

Depth-first search with an option to filter edges during the traversal of the graph.

Parameters

graph – Input graph
root – The source vertex from the graph for the path
navigator – Navigator expression to be evaluated on the vertices during the graph traversal
init_with_inf – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.
max_depth – Maximum search depth
distance – Vertex property holding the hop distance for each reachable vertex in the graph
parent – Vertex property holding the parent vertex of the each reachable vertex in the path

Returns

Distance and parent vertex properties

find_cycle(graph, src=None, vertex_seq=None, edge_seq=None)

Find cycle looks for any loop in the graph.

Parameters

graph – Input graph
src – Source vertex for the search
vertex_seq – (Out argument) vertex sequence holding the vertices in the cycle
edge_seq – (Out argument) edge sequence holding the edges in the cycle

Returns

PgxPath representing the cycle as path, if exists.

get_deepwalk_model_loader()

Return a ModelLoader that can be used for loading a DeepWalkModel.

Returns: ModelLoader

get_pg2vec_model_loader()

Return a ModelLoader that can be used for loading a Pg2vecModel.

Returns: ModelLoader

get_supervised_graphwise_model_loader()

Return a ModelLoader that can be used for loading a SupervisedGraphWiseModel.

Returns: ModelLoader

get_unsupervised_graphwise_model_loader()

Return a ModelLoader that can be used for loading a UnsupervisedGraphWiseModel.

Returns: ModelLoader

graphwise_conv_layer_config(num_sampled_neighbors=10, neighbor_weight_property_name=None, activation_fn='ReLU', weight_init_scheme='XAVIER_UNIFORM')

Build a GraphWise conv layer configuration and return it.

Parameters

num_sampled_neighbors – Number of neighbors to sample
neighbor_weight_property_name – Neighbor weight property name.
activation_fn – Activation function. Supported functions: RELU, LEAKY_RELU, TANH, LINEAR. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g softmax or sigmoid.
weight_init_scheme – Initialization scheme for the weights in the layer. Supported schemes: XAVIER, XAVIER_UNIFORM, ONES, ZEROS. Note that biases are always initialized with zeros.

Returns

Built GraphWiseConvLayerConfig

graphwise_dgi_layer_config(corruption_function=None, readout_function='MEAN', discriminator='BILINEAR')

Build a GraphWise DGI layer configuration and return it.

Parameters

corruption_function(CorruptionFunction) – Corruption Function to use
readout_function(str) – Neighbor weight property name. Supported functions: MEAN
discriminator(str) – discriminator function. Supported functions: BILINEAR

Returns

GraphWiseDgiLayerConfig object

graphwise_pred_layer_config(hidden_dim=None, activation_fn='ReLU', weight_init_scheme='XAVIER_UNIFORM')

Build a GraphWise prediction layer configuration and return it.

Parameters

hidden_dim – Hidden dimension. If this is the last layer, this setting will be ignored and replaced by the number of classes.
activation_fn – Activation function. Supported functions: RELU, LEAKY_RELU, TANH, LINEAR. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g softmax or sigmoid.
weight_init_scheme – Initialization scheme for the weights in the layer. Supportes schemes: XAVIER, XAVIER_UNIFORM, ONES, ZEROS. Note that biases are always initialized with zeros.

Returns

Built GraphWisePredictionLayerConfig

hits(graph, max_iter=100, auth='authorities', hubs='hubs')

Hyperlink-Induced Topic Search (HITS) assigns ranking scores to the vertices, aimed to assess the quality of information and references in linked structures.

Parameters

graph – Input graph
max_iter – Number of iterations that will be performed
auth – Vertex property holding the authority score for each vertex
hubs – Vertex property holding the hub score for each vertex

Returns

Vertex property holding the computed scores

in_degree_centrality(graph, dc='in_degree')

Measure the in-degree centrality of the vertices based on its degree.

This lets you see how a vertex influences its neighborhood.

Parameters

graph – Input graph
dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

in_degree_distribution(graph, dist_map=None)

Calculate the in-degree distribution.

In-degree distribution gives information about the incoming flows in a graph.

Parameters

graph – Input graph
dist_map – (Out argument) map holding a histogram of the vertex degrees in the graph

Returns

Map holding a histogram of the vertex degrees in the graph

k_core(graph, min_core=0, max_core=2147483647, kcore='kcore')

k-core decomposes a graph into layers revealing subgraphs with particular properties.

Parameters

graph – Input graph
min_core – Minimum k-core value
max_core – Maximum k-core value
kcore – Vertex property holding the result value

Returns

Pair holding the maximum core found and a node property with the largest k-core value for each node.

limited_shortest_path_hop_dist(graph, src, dst, max_hops, high_degree_vertex_mapping, high_degree_vertices, index, path_vertices=None, path_edges=None)

Compute the shortest path between the source and destination vertex.

The algorithm only considers paths up to a length of k.

Parameters

graph – Input graph
src – The source vertex
dst – The destination vertex
max_hops – The maximum number of edges to follow when trying to find a path
high_degree_vertex_mapping – Map with the top k high-degree vertices and their indices
high_degree_vertices – The high-degree vertices
index – Index containing distances to high-degree vertices
path_vertices – (out-argument) will contain the vertices on the found path or will be empty if there is none
path_edges – (out-argument) will contain the vertices on the found path or will be empty if there is none

Returns

A tuple containing the vertices in the shortest path from src to dst and the edges on the path. Both will be empty if there is no path within maxHops steps

limited_shortest_path_hop_dist_filtered(graph, src, dst, max_hops, high_degree_vertex_mapping, high_degree_vertices, index, filter, path_vertices=None, path_edges=None)

Compute the shortest path between the source and destination vertex.

The algorithm only considers paths up to a length of k.

Parameters

graph – Input graph
src – The source vertex
dst – The destination vertex
max_hops – The maximum number of edges to follow when trying to find a path
high_degree_vertex_mapping – Map with the top k high-degree vertices and their indices
high_degree_vertices – The high-degree vertices
index – Index containing distances to high-degree vertices
filter – Filter to be evaluated on the edges when searching for a path
path_vertices – (out-argument) will contain the vertices on the found path or will be empty if there is none
path_edges – (out-argument) will contain the vertices on the found path or will be empty if there is none

Returns

A tuple containing the vertices in the shortest path from src to dst and the edges on the path. Both will be empty if there is no path within maxHops steps

load_deepwalk_model(path, key)

Load an encrypted DeepWalk model.

Parameters

path – Path to model
key – The decryption key, or null if no encryption was used

Returns

Loaded model

load_pg2vec_model(path, key)

Load an encrypted pg2vec model.

Parameters

path – Path to model
key – The decryption key, or null if no encryption was used

Returns

Loaded model

load_supervised_graphwise_model(path, key)

Load an encrypted SupervisedGraphWise model.

Parameters

path – Path to model
key – The decryption key, or null if no encryption was used

Returns

Loaded model

load_unsupervised_graphwise_model(path, key)

Load an encrypted UnsupervisedGraphWise model.

Parameters

path – Path to model
key – The decryption key, or null if no encryption was used

Returns

Loaded model

local_clustering_coefficient(graph, lcc='lcc')

LCC gives information about potential clustering options in a graph.

Parameters

graph – Input graph
lcc – Vertex property holding the lcc value for each vertex

Returns

Vertex property holding the lcc value for each vertex

louvain(graph, weight, max_iter=100, nbr_pass=1, tol=0.0001, community='community')

Louvain to detect communities in a graph

Parameters

graph – Input graph.
weight – Weights of the edges of the graph.
max_iter – Maximum number of iterations that will be performed during each pass.
nbr_pass – Number of passes that will be performed.
tol – maximum tolerated error value, the algorithm will stop once the graph’s total modularity gain becomes smaller than this value.
community – Vertex property holding the community ID assigned to each vertex

Returns

Community IDs vertex property

matrix_factorization_gradient_descent(bipartite_graph, weight, learning_rate=0.001, change_per_step=1.0, lbd=0.15, max_iter=100, vector_length=10, features='features')

Parameters

bipartite_graph – Input graph between 1 and 5, the result will become inaccurate.
learning_rate – Learning rate for the optimization process
change_per_step – Parameter used to modulate the learning rate during the optimization process
lbd – Penalization parameter to avoid over-fitting during optimization process
max_iter – Maximum number of iterations that will be performed
vector_length – Size of the feature vectors to be generated for the factorization
features – Vertex property holding the generated feature vectors for each vertex. This function accepts names and VertexProperty objects.

Returns

Matrix factorization model holding the feature vectors found by the algorithm

matrix_factorization_recommendations(bipartite_graph, user, vector_length, feature, estimated_rating=None)

Complement for Matrix Factorization.

The generated feature vectors will be used for making predictions in cases where the given user vertex has not been related to a particular item from the item set. Similarly to the recommendations from matrix factorization, this algorithm will perform dot products between the given user vertex and the rest of vertices in the graph, giving a score of 0 to the items that are already related to the user and to the products with other user vertices, hence returning the results of the dot products for the unrelated item vertices. The scores from those dot products can be interpreted as the predicted scores for the unrelated items given a particular user vertex.

Parameters

bipartite_graph – Bipartite input graph
user – Vertex from the left (user) side of the graph
vector_length – size of the feature vectors
feature – vertex property holding the feature vectors for each vertex
estimated_rating – (out argument) vertex property holding the estimated rating score for each vertex

Returns

vertex property holding the estimated rating score for each vertex

out_degree_centrality(graph, dc='out_degree')

Measure the out-degree centrality of the vertices based on its degree.

This lets you see how a vertex influences its neighborhood.

Parameters

graph – Input graph
dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

out_degree_distribution(graph, dist_map=None)

Parameters

graph – Input graph
dist_map – (Out argument) map holding a histogram of the vertex degrees in the graph

Returns

Map holding a histogram of the vertex degrees in the graph

pagerank(graph, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='pagerank')

Parameters

graph – Input graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
norm – Determine whether the algorithm will take into account dangling vertices for the ranking scores.
rank – Vertex property holding the PageRank value for each vertex, or name for a new property

Returns

Vertex property holding the PageRank value for each vertex

pagerank_approximate(graph, tol=0.001, damping=0.85, max_iter=100, rank='approx_pagerank')

Parameters

graph – Input graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
rank – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the PageRank value for each vertex

partition_conductance(graph, partition)

Partition conductance assesses the quality of many partitions in a graph.

Parameters

graph – Input graph
partition – Partition of the graph with the corresponding node collections

partition_modularity(graph, partition)

Modularity summarizes information about the quality of components in a graph.

Parameters

graph – Input graph
partition – Partition of the graph with the corresponding node collections

Returns

Scalar (double) to store the conductance value of the given cut

periphery(graph, periphery=None)

Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.

Parameters

graph – Input graph
periphery – (Out argument) vertex set holding the vertices from the periphery or center of the graph

Returns

Vertex set holding the vertices from the periphery or center of the graph

personalized_pagerank(graph, v, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_pagerank')

Personalized PageRank for a vertex of interest.

Compares and spots out important vertices in a graph.

Parameters

graph – Input graph
v – The chosen vertex from the graph for personalization
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores.
rank – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the computed scores

personalized_salsa(bipartite_graph, v, tol=0.001, damping=0.85, max_iter=100, rank='personalized_salsa')

Personalized SALSA for a vertex of interest.

Assesses the quality of information and references in linked structures.

Parameters

bipartite_graph – Bipartite graph
v – The chosen vertex from the graph for personalization
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor to modulate the degree of personalization of the scores by the algorithm
max_iter – Maximum number of iterations that will be performed
rank – (Out argument) vertex property holding the normalized authority/hub ranking score for each vertex

Returns

Vertex property holding the computed scores

personalized_weighted_pagerank(graph, v, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_weighted_pagerank')

Parameters

graph – Input graph
v – The chosen vertex from the graph for personalization
weight – Edge property holding the weight of each edge in the graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores
rank – Vertex property holding the PageRank value for each vertex

pg2vec_builder(graphlet_id_property_name, vertex_property_names, min_word_frequency=1, batch_size=128, num_epochs=5, layer_size=200, learning_rate=0.04, min_learning_rate=0.0001, window_size=4, walk_length=8, walks_per_vertex=5, graphlet_size_property_name='graphletSize-Pg2vec', use_graphlet_size=True, validation_fraction=0.05, seed=None)

Build a pg2Vec model and return it.

Parameters

graphlet_id_property_name – Property name of the graphlet-id in the input graph
vertex_property_names – Property names to consider for pg2vec model training
min_word_frequency – Minimum word frequency to consider before pruning
batch_size – Batch size for training the model
num_epochs – Number of epochs to train the model
layer_size – Number of dimensions for the output vectors
learning_rate – Initial learning rate
min_learning_rate – Minimum learning rate
window_size – Window size to consider while training the model
walk_length – Length of the walks
walks_per_vertex – Number of walks to consider per vertex
graphlet_size_property_name – Property name for graphlet size
use_graphlet_size – Whether to use or not the graphlet size
validation_fraction – Fraction of training data on which to compute final loss
seed – Seed

Returns

Build Pg2Vec Model

prim(graph, weight, mst='mst')

Prim reveals tree structures with shortest paths in a graph.

Parameters

graph – Input graph
weight – Edge property holding the weight of each edge in the graph
mst – Edge property holding the edges belonging to the minimum spanning tree of the graph

Returns

Edge property holding the edges belonging to the minimum spanning tree of the graph (i.e. all the edges with in_mst=true)

radius(graph, eccentricity='eccentricity')

Radius gives an overview of the distances in a graph. it is computed as the minimum graph eccentricity.

Parameters

graph – Input graph
eccentricity – (Out argument) vertex property holding the eccentricity value for each vertex

Returns

Pair holding the radius of the graph and a node property with eccentricity value for each node

random_walk_with_restart(graph, source, length, reset_prob, visit_count=None)

Perform a random walk over the graph.

The walk will start at the given source vertex and will randomly visit neighboring vertices in the graph, with a probability equal to the value of reset_probability of going back to the starting point. The random walk will also go back to the starting point every time it reaches a vertex with no outgoing edges. The algorithm will stop once it reaches the specified walk length.

Parameters

graph – Input graph
source – Starting point of the random walk
length – Length (number of steps) of the random walk
reset_prob – Probability value for resetting the random walk
visit_count – (out argument) map holding the number of visits during the random walk for each vertex in the graph

Returns

map holding the number of visits during the random walk for each vertex in the graph

reachability(graph, src, dst, max_hops, ignore_edge_direction)

Reachability is a fast way to check if two vertices are reachable from each other.

Parameters

graph – Input graph
src – Source vertex for the search
dst – Destination vertex for the search
max_hops – Maximum hop distance between the source and destination vertices
ignore_edge_direction – Boolean flag for ignoring the direction of the edges during the search

Returns

The number of hops between the vertices. It will return -1 if the vertices are not connected or are not reachable given the condition of the maximum hop distance allowed.

salsa(bipartite_graph, tol=0.001, max_iter=100, rank='salsa')

Stochastic Approach for Link-Structure Analysis (SALSA) computes ranking scores.

It assesses the quality of information and references in linked structures.

Parameters

bipartite_graph – Bipartite graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter – Maximum number of iterations that will be performed
rank – Vertex property holding the value for each vertex in the graph

Returns

Vertex property holding the computed scores

scc_kosaraju(graph, label='scc_kosaraju')

Kosaraju finds strongly connected components in a graph.

Parameters

graph – Input graph
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm

scc_tarjan(graph, label='scc_tarjan')

Tarjan finds strongly connected components in a graph.

Parameters

graph – Input graph
label – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm

shortest_path_bellman_ford(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')

Bellman-Ford finds multiple shortest paths at the same time.

Parameters

graph – Input graph
src – Source node
distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
weight – Edge property holding the weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

shortest_path_bellman_ford_reversed(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')

Reversed Bellman-Ford finds multiple shortest paths at the same time.

Parameters

graph – Input graph
src – Source node
distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
weight – Edge property holding the weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path.
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path.

Returns

AllPaths holding the information of the possible shortest paths from the source node.

shortest_path_bidirectional_dijkstra(graph, src, dst, weight, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge')

Bidirectional dijkstra is a fast algorithm for finding a shortest path in a graph.

Parameters

graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_dijkstra(graph, src, dst, weight, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge')

Parameters

graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_filtered_bidirectional_dijkstra(graph, src, dst, weight, filter_expression, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge')

Parameters

graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
filter_expression – graphFilter object for filtering

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_filtered_dijkstra(graph, src, dst, weight, filter_expression, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge')

Parameters

graph – Input graph
src – Source node
dst – Destination node
weight – Edge property holding the (positive) weight of each edge in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
filter_expression – GraphFilter object for filtering

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_hop_distance(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Hop distance can give a relatively fast insight on the distances in a graph.

Parameters

graph – Input graph
src – Source node
distance – Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

shortest_path_hop_distance_reversed(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Backwards hop distance can give a relatively fast insight on the distances in a graph.

Parameters

graph – Input graph
src – Source node
distance – Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent – (Out argument) vertex property holding the parent vertex of the each vertex in the shortest path
parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

supervised_graphwise_builder(vertex_target_property_name, vertex_input_property_names=[], edge_input_property_names=[], loss_fn='SOFTMAX_CROSS_ENTROPY', batch_gen='STANDARD', batch_gen_params=[], pred_layer_config=None, conv_layer_config=None, batch_size=128, num_epochs=3, learning_rate=0.01, layer_size=128, class_weights=None, seed=None)

Build a SupervisedGraphWise model and return it.

Parameters

vertex_target_property_name – Target property name
vertex_input_property_names – Vertices Input feature names
edge_input_property_names – Edges Input feature names
loss_fn – Loss function. Supported: String (‘SOFTMAX_CROSS_ENTROPY’, ‘SIGMOID_CROSS_ENTROPY’) or LossFunction object
batch_gen – Batch generator. Supported: ‘STANDARD’, ‘STRATIFIED_OVERSAMPLING’
batch_gen_params – List of parameters passed to the batch generator
pred_layer_config – Prediction layer configuration as list of PredLayerConfig, or default if None
conv_layer_config – Conv layer configuration as list of ConvLayerConfig, or default if None
batch_size – Batch size for training the model
num_epochs – Number of epochs to train the model
learning_rate – Learning rate
layer_size – Number of dimensions for the output vectors
class_weights – Class weights to be used in the loss function. The loss for the corresponding class will be multiplied by the factor given in this map. If null, uniform class weights will be used.
seed – Seed

Returns

Built SupervisedGraphWise model

topological_schedule(graph, vs, topo_sched='topo_sched')

Topological schedule gives an order of visit for the reachable vertices from the source.

Parameters

graph – Input graph
vs – Set of vertices to be used as the starting points for the scheduling order
topo_sched – (Out argument) vertex property holding the scheduled order of each vertex

Returns

Vertex property holding the scheduled order of each vertex.

topological_sort(graph, topo_sort='topo_sort')

Topological sort gives an order of visit for vertices in directed acyclic graphs.

Parameters

graph – Input graph
topo_sort – (Out argument) vertex property holding the topological order of each vertex

unsupervised_graphwise_builder(vertex_input_property_names=[], edge_input_property_names=[], loss_fn='SIGMOID_CROSS_ENTROPY', conv_layer_config=None, batch_size=128, num_epochs=3, learning_rate=0.001, layer_size=128, class_weights=None, seed=None, dgi_layer_config=None)

Build a UnsupervisedGraphWise model and return it.

Parameters

vertex_input_property_names – Vertices Input feature names
edge_input_property_names – Edges Input feature names
loss_fn – Loss function. Supported: SIGMOID_CROSS_ENTROPY
conv_layer_config – Conv layer configuration as list of ConvLayerConfig, or default if None
batch_size – Batch size for training the model
num_epochs – Number of epochs to train the model
learning_rate – Learning rate
layer_size – Number of dimensions for the output vectors
seed – Seed
dgi_layer_config – Dgi layer configuration as DgiLayerConfig object, or default if None

Returns

Built UnsupervisedGraphWise model

vertex_betweenness_centrality(graph, bc='betweenness')

Parameters

graph – Input graph
bc – Vertex property holding the betweenness centrality value for each vertex

Returns

Vertex property holding the computed scores

wcc(graph, label='wcc')

Identify weakly connected components.

This can be useful for clustering graph data.

Parameters

graph – Input graph
label – Vertex property holding the value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm.

weighted_closeness_centrality(graph, weight, cc='weighted_closeness')

Measure the centrality of the vertices based on weighted distances, allowing to find well-connected vertices.

Parameters

graph – Input graph
weight – Edge property holding the weight of each edge in the graph
cc – (Out argument) vertex property holding the closeness centrality value for each vertex

Returns

Vertex property holding the computed scores

weighted_pagerank(graph, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='weighted_pagerank')

Parameters

graph – Input graph
weight – Edge property holding the weight of each edge in the graph
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor
max_iter – Maximum number of iterations that will be performed
rank – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the computed the peageRank value

whom_to_follow(graph, v, top_k=100, size_circle_of_trust=500, tol=0.001, damping=0.85, max_iter=100, salsa_tol=0.001, salsa_max_iter=100, hubs=None, auth=None)

Whom-to-follow (WTF) is a recommendation algorithm.

It returns two vertex sequences: one of similar users (hubs) and a second one with users to follow (auth).

Parameters

graph – Input graph
v – The chosen vertex from the graph for personalization of the recommendations
top_k – The maximum number of recommendations that will be returned
size_circle_of_trust – The maximum size of the circle of trust
tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping – Damping factor for the Pagerank stage
max_iter – Maximum number of iterations that will be performed for the Pagerank stage
salsa_tol – Maximum tolerated error value for the SALSA stage
salsa_max_iter – Maximum number of iterations that will be performed for the SALSA stage
hubs – (Out argument) vertex sequence holding the top rated hub vertices (similar users) for the recommendations
auth – (Out argument) vertex sequence holding the top rated authority vertices (users to follow) for the recommendations

Returns

Vertex properties holding hubs and auth