17.6 Using the Unsupervised Anomaly Detection GraphWise Algorithm (Vertex Embeddings and Anomaly Scores)
UnsupervisedAnomalyDetectionGraphWise is an inductive vertex representation learning and anomaly detection algorithm that can leverage vertex and edge feature information. Although it can be applied to a wide variety of tasks, it is particularly suitable for unsupervised learning of vertex embeddings for anomaly detection. After the model is trained, it can infer anomaly scores or labels for unseen vertices.
Model Structure
An UnsupervisedAnomalyDetectionGraphWise model consists of graph convolutional layers followed by an embedding layer. There are two types of embedding layer: the DGI layer and the Dominant layer. Both perform inductive vertex representation learning, but they use different loss functions. The embedding layer defaults to the DGI layer.
The forward pass through a convolutional layer for a vertex proceeds as follows:
- A set of neighbors of the vertex is sampled.
- The previous layer representations of the neighbors are mean-aggregated, and the aggregated features are concatenated with the previous layer representation of the vertex.
- This concatenated vector is multiplied with weights, and a bias vector is added.
- The result is normalized so that the layer output has unit norm.
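The four steps above can be sketched in plain Python as follows. This is only an illustration of the described forward pass, not the PGX implementation: the function name `conv_layer_forward`, the `num_samples` parameter, sampling with replacement, and the toy weights are all assumptions made for the example.

```python
import math
import random


def conv_layer_forward(vertex, features, neighbors, weights, bias, num_samples, rng):
    """Illustrative forward pass of one convolutional layer for a single vertex."""
    # Step 1: sample a set of neighbors (assumption: sampling with replacement).
    nbrs = neighbors[vertex]
    sampled = [rng.choice(nbrs) for _ in range(num_samples)] if nbrs else []

    # Step 2: mean-aggregate the neighbors' previous-layer representations and
    # concatenate the result with the vertex's own previous-layer representation.
    dim = len(features[vertex])
    if sampled:
        aggregated = [sum(features[n][i] for n in sampled) / len(sampled) for i in range(dim)]
    else:
        aggregated = [0.0] * dim  # assumption: isolated vertices aggregate to zeros
    concatenated = aggregated + list(features[vertex])

    # Step 3: multiply by the weights and add the bias vector.
    out = [sum(w * x for w, x in zip(row, concatenated)) + b for row, b in zip(weights, bias)]

    # Step 4: normalize so that the layer output has unit (L2) norm.
    norm = math.sqrt(sum(v * v for v in out))
    return [v / norm for v in out] if norm > 0 else out


# Toy graph: three vertices with two input features each (hypothetical data).
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
weights = [[0.5, -0.2, 0.1, 0.3], [0.4, 0.6, -0.1, 0.2], [-0.3, 0.1, 0.7, 0.5]]
bias = [0.1, 0.0, -0.1]

h0 = conv_layer_forward(0, features, neighbors, weights, bias, num_samples=2, rng=random.Random(0))
h0_norm = math.sqrt(sum(v * v for v in h0))
```

Each weight row has length 4 because the concatenated vector holds the 2 aggregated neighbor features plus the 2 features of the vertex itself. The output dimension (3 here) is the layer's embedding size.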
The DGI layer, which is based on Deep Graph Infomax (DGI) by Velickovic et al., consists of three parts that enable unsupervised learning using the embeddings produced by the convolutional layers.
- Corruption function: Shuffles the node features while preserving the graph structure to produce negative embedding samples using the convolution layers.
- Readout function: Sigmoid-activated mean of the embeddings, used as a summary of the graph.
- Discriminator: Measures the similarity of the positive (unshuffled) embeddings with the summary, as well as the similarity of the negative samples with the summary. The loss function is computed from these similarities.
Since none of these contains mutable hyperparameters, the default DGI layer is always used and cannot be adjusted.
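The three DGI parts can be illustrated with a short, self-contained Python sketch. It is not the PGX code: the bilinear discriminator and the binary cross-entropy loss follow the Deep Graph Infomax paper and are assumptions with respect to PGX, and `encode` is a hypothetical stand-in for the convolutional layers.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def encode(features):
    """Hypothetical stand-in for the convolutional layers (identity mapping)."""
    return [list(row) for row in features]


def corrupt(features, rng):
    """Corruption: shuffle feature rows across vertices; the graph structure is untouched."""
    shuffled = [list(row) for row in features]
    rng.shuffle(shuffled)
    return shuffled


def readout(embeddings):
    """Readout: sigmoid-activated mean of the embeddings, used as the graph summary."""
    n, dim = len(embeddings), len(embeddings[0])
    return [sigmoid(sum(e[i] for e in embeddings) / n) for i in range(dim)]


def discriminate(embedding, summary, w):
    """Discriminator (assumed bilinear form): sigmoid(h^T W s)."""
    ws = [sum(w_ij * s_j for w_ij, s_j in zip(row, summary)) for row in w]
    return sigmoid(sum(h_i * v for h_i, v in zip(embedding, ws)))


def dgi_loss(positive, negative, summary, w):
    """Binary cross-entropy: positives should score near 1, negatives near 0."""
    eps = 1e-12  # guards against log(0)
    pos_term = sum(math.log(discriminate(h, summary, w) + eps) for h in positive)
    neg_term = sum(math.log(1.0 - discriminate(h, summary, w) + eps) for h in negative)
    return -(pos_term + neg_term) / (len(positive) + len(negative))


# Toy feature matrix for four vertices (hypothetical data).
rng = random.Random(42)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]

positive = encode(features)                # embeddings of the real graph
negative = encode(corrupt(features, rng))  # embeddings of the corrupted graph
summary = readout(positive)
w = [[1.0, 0.0], [0.0, 1.0]]               # discriminator weights (learned in practice)
loss = dgi_loss(positive, negative, summary, w)
```

With real convolutional layers in place of `encode`, shuffling the features changes which neighbors' features are aggregated into each vertex, so the negative embeddings differ from the positive ones even though the same feature rows are present.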
The Dominant layer enables unsupervised learning using a deep autoencoder. It uses graph convolutional networks (GCNs) to reconstruct the vertex features in an autoencoder setting, and it reconstructs the graph structure from the dot products of the embeddings.
The loss function is computed from the feature reconstruction loss and the structure reconstruction loss. The importance given to features or to the structure can be tuned with the alpha hyperparameter.
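As a rough illustration of how such a combined loss could look, the following Python sketch weights a feature reconstruction error against a structure reconstruction error. The convex combination, the choice that `alpha` weights the feature term, the squared-error measure, and the sigmoid applied to the dot products are assumptions taken from the general Dominant formulation. They are not confirmed details of the PGX implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reconstruct_structure(embeddings):
    """Estimate the adjacency matrix from pairwise dot products of the embeddings."""
    return [
        [sigmoid(sum(a * b for a, b in zip(zi, zj))) for zj in embeddings]
        for zi in embeddings
    ]


def squared_error(expected, actual):
    """Sum of squared element-wise differences between two matrices."""
    return sum(
        (x - y) ** 2
        for row_e, row_a in zip(expected, actual)
        for x, y in zip(row_e, row_a)
    )


def dominant_loss(features, reconstructed_features, adjacency, embeddings, alpha):
    """Assumed form: alpha * feature loss + (1 - alpha) * structure loss."""
    feature_loss = squared_error(features, reconstructed_features)
    structure_loss = squared_error(adjacency, reconstruct_structure(embeddings))
    return alpha * feature_loss + (1.0 - alpha) * structure_loss


# Toy two-vertex graph (hypothetical data).
features = [[1.0, 0.0], [0.0, 1.0]]
reconstructed = [[0.9, 0.1], [0.2, 0.7]]  # decoder output for the features
adjacency = [[0.0, 1.0], [1.0, 0.0]]
embeddings = [[0.5, -0.5], [0.3, 0.8]]    # encoder output

features_only = dominant_loss(features, reconstructed, adjacency, embeddings, alpha=1.0)
structure_only = dominant_loss(features, reconstructed, adjacency, embeddings, alpha=0.0)
balanced = dominant_loss(features, reconstructed, adjacency, embeddings, alpha=0.5)
```

Under these assumptions, `alpha=1.0` considers only how well the features are reconstructed, `alpha=0.0` considers only the structure, and intermediate values trade the two off.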
The following describes a few use cases where the UnsupervisedAnomalyDetectionGraphWise algorithm can be applied:
- Cybersecurity: To detect abnormal behavior in network traffic by analyzing the graph of connections between devices. Anomalous patterns might indicate security breaches, malware infections, or insider threats.
- Credit Card Fraud Detection: To identify suspicious credit card transactions by examining the relationships between transactions, users, and vendors, and scoring those that deviate from typical patterns.
- Smart Grid Monitoring: To monitor power grids and detect anomalies in electricity usage patterns that can indicate faults or unauthorized usage. This helps ensure efficient and secure energy distribution.
The following topics describe how to use the main functionalities of the Dominant implementation in PGX. The example scenario detects fraudulent vertices based on their features.
- Loading a Graph
- Building a Minimal Unsupervised Anomaly Detection GraphWise Model
- Advanced Hyperparameter Customization
- Building an Unsupervised Anomaly Detection GraphWise Model Using Partitioned Graphs
- Training an Unsupervised Anomaly Detection GraphWise Model
- Getting the Loss Value for an Unsupervised Anomaly Detection GraphWise Model
- Inferring Embeddings for an Unsupervised Anomaly Detection GraphWise Model
- Inferring Anomalies
- Storing an Unsupervised Anomaly Detection GraphWise Model
- Loading a Pre-Trained Unsupervised Anomaly Detection GraphWise Model
- Destroying an Unsupervised Anomaly Detection GraphWise Model