17.1 Using the DeepWalk Algorithm
DeepWalk is a vertex representation learning algorithm that is widely used in industry.
It consists of two main steps:
- First, the random walk generation step computes random walks for each vertex (with a pre-defined walk length and a pre-defined number of walks per vertex).
- Second, the generated walks are fed to a Word2vec algorithm to generate the vector representation for each vertex (each vertex plays the role of a word in the input provided to the Word2vec algorithm). See the KDD technical brief for more details on the DeepWalk algorithm.
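The first step above can be sketched in a few lines of plain Python. This is only an illustration of uniform random walk generation, not the PGX implementation: the function `generate_walks`, the `toy_graph` adjacency list, and the parameter values are all hypothetical, and the walk strategy PGX actually uses is not described in this section.

```python
import random

def generate_walks(adjacency, walk_length, walks_per_vertex, seed=0):
    """Return uniform random walks that start from every vertex."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_vertex):
        for start in adjacency:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical toy graph as an adjacency list (a triangle plus a tail vertex).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = generate_walks(toy_graph, walk_length=5, walks_per_vertex=3)
# Each walk is a "sentence" whose "words" are vertex ids. The second step
# would pass these sentences to a Word2vec implementation.
```

With 4 vertices and 3 walks per vertex, the sketch produces 12 walks of 5 vertices each, which is the corpus the Word2vec step would train on.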
DeepWalk creates vertex embeddings for a specific graph, and the model cannot be updated to incorporate modifications to that graph. Instead, a new DeepWalk model must be trained on the modified graph. Lastly, note that the memory consumption of the DeepWalk model is O(2n*d), where n is the number of vertices in the graph and d is the embedding length.
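As a rough illustration of what the memory bound means in practice, the following sketch turns 2n*d into a byte count. The helper name `deepwalk_memory_bytes`, the embedding length of 100, and the 4 bytes per stored value are assumptions made for this example; the actual storage layout in PGX may differ.

```python
def deepwalk_memory_bytes(num_vertices, embedding_length, bytes_per_value=4):
    """Estimate model size from the 2n*d bound.

    bytes_per_value=4 is an assumption (32-bit values), not a documented
    property of the PGX implementation.
    """
    return 2 * num_vertices * embedding_length * bytes_per_value

# Vertex count of the DBpedia example graph used in this section.
dbpedia_vertices = 8_637_721

# Embedding length of 100 is a hypothetical value chosen for illustration.
estimate = deepwalk_memory_bytes(dbpedia_vertices, embedding_length=100)
print(f"{estimate / 1024**3:.2f} GiB")
```

Because the bound is linear in both n and d, doubling the embedding length doubles the estimate.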
The following describes a few use cases where the DeepWalk algorithm can be applied:
- Community Detection in Social Networks: Use the vertex embeddings to identify groups of users who interact more frequently with each other than with the rest of the network. This can help in targeted marketing, understanding social dynamics, or improving network moderation strategies.
- New Items Recommendation: Analyze user interactions and content consumption patterns so that recommendation systems can suggest new content (such as videos, articles, or products) based on the embeddings of similar users or items.
- Knowledge Graph Enhancement: Enhance knowledge graphs by generating embeddings for entities (vertices). This helps to infer missing relationships, thereby improving the completeness of the graph and its usability in search engines and question-answering systems.
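All three use cases rely on comparing vertex embeddings. The sketch below shows one common way to do that, ranking vertices by cosine similarity to a query vertex. The choice of cosine similarity, the helper names, and the two-dimensional `embeddings` are assumptions for illustration; they are not values or APIs taken from PGX.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(embeddings, query, k):
    """Return the k vertices whose embeddings are closest to the query's."""
    scores = [
        (other, cosine_similarity(embeddings[query], vec))
        for other, vec in embeddings.items()
        if other != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Hypothetical 2-dimensional embeddings for four users.
embeddings = {
    "alice": [0.9, 0.1],
    "bob": [0.8, 0.2],
    "carol": [-0.7, 0.6],
    "dave": [-0.6, 0.7],
}

neighbors = most_similar(embeddings, "alice", k=2)
```

In a recommendation setting, items consumed by the top-ranked neighbors would be candidates to suggest to the query user.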
The following describes the usage of the main functionalities of DeepWalk in PGX, using the DBpedia graph as an example with 8,637,721 vertices and 165,049,964 edges:
- Loading a Graph
- Building a Minimal DeepWalk Model
- Building a Customized DeepWalk Model
- Training a DeepWalk Model
- Getting the Loss Value For a DeepWalk Model
- Computing Similar Vertices for a Given Vertex
- Computing Similar Vertices for a Vertex Batch
- Getting All Trained Vertex Vectors
- Storing a Trained DeepWalk Model
- Loading a Pre-Trained DeepWalk Model
- Destroying a DeepWalk Model