Graph Neural Networks for Updating Node/Edge Representations

All GNN-based models start by updating node/edge representations. This section introduces the GNN modules implemented in DGL-LifeSci for updating these representations.

AttentiveFP

class dgllife.model.gnn.attentivefp.AttentiveFPGNN(node_feat_size, edge_feat_size, num_layers=2, graph_feat_size=200, dropout=0.0)[source]

AttentiveFP from Pushing the Boundaries of Molecular Representation for Drug Discovery with the Graph Attention Mechanism

This class performs message passing in AttentiveFP and returns the updated node representations.

Parameters
  • node_feat_size (int) – Size for the input node features.

  • edge_feat_size (int) – Size for the input edge features.

  • num_layers (int) – Number of GNN layers. Default to 2.

  • graph_feat_size (int) – Size for the graph representations to be computed. Default to 200.

  • dropout (float) – The probability for performing dropout. Default to 0.

forward(g, node_feats, edge_feats)[source]

Performs message passing and updates node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_feats (float32 tensor of shape (V, node_feat_size)) – Input node features. V for the number of nodes.

  • edge_feats (float32 tensor of shape (E, edge_feat_size)) – Input edge features. E for the number of edges.

Returns

node_feats – Updated node representations.

Return type

float32 tensor of shape (V, graph_feat_size)

reset_parameters()[source]

Reinitialize model parameters.
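
A minimal usage sketch, assuming a toy 3-node graph and made-up feature sizes (74-dim node features, 12-dim edge features), purely to illustrate the forward signature:

    import dgl
    import torch
    from dgllife.model.gnn.attentivefp import AttentiveFPGNN

    # A toy directed cycle: 3 nodes, 3 edges
    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_feats = torch.randn(3, 74)    # (V, node_feat_size)
    edge_feats = torch.randn(3, 12)    # (E, edge_feat_size)

    model = AttentiveFPGNN(node_feat_size=74, edge_feat_size=12)
    out = model(g, node_feats, edge_feats)    # (3, 200), i.e. (V, graph_feat_size)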

GAT

class dgllife.model.gnn.gat.GAT(in_feats, hidden_feats=None, num_heads=None, feat_drops=None, attn_drops=None, alphas=None, residuals=None, agg_modes=None, activations=None, biases=None, allow_zero_in_degree=False)[source]

GAT from Graph Attention Networks

Parameters
  • in_feats (int) – Number of input node features

  • hidden_feats (list of int) – hidden_feats[i] gives the output size of an attention head in the i-th GAT layer. len(hidden_feats) equals the number of GAT layers. By default, we use [32, 32].

  • num_heads (list of int) – num_heads[i] gives the number of attention heads in the i-th GAT layer. len(num_heads) equals the number of GAT layers. By default, we use 4 attention heads for each GAT layer.

  • feat_drops (list of float) – feat_drops[i] gives the dropout applied to the input features in the i-th GAT layer. len(feat_drops) equals the number of GAT layers. By default, this will be zero for all GAT layers.

  • attn_drops (list of float) – attn_drops[i] gives the dropout applied to attention values of edges in the i-th GAT layer. len(attn_drops) equals the number of GAT layers. By default, this will be zero for all GAT layers.

  • alphas (list of float) – Hyperparameters in LeakyReLU, which are the slopes for negative values. alphas[i] gives the slope for negative values in the i-th GAT layer. len(alphas) equals the number of GAT layers. By default, this will be 0.2 for all GAT layers.

  • residuals (list of bool) – residuals[i] decides if a residual connection is used for the i-th GAT layer. len(residuals) equals the number of GAT layers. By default, a residual connection is used for each GAT layer.

  • agg_modes (list of str) – The way to aggregate multi-head attention results for each GAT layer, which can be either ‘flatten’ for concatenating all-head results or ‘mean’ for averaging all-head results. agg_modes[i] gives the way to aggregate multi-head attention results for the i-th GAT layer. len(agg_modes) equals the number of GAT layers. By default, we flatten all-head results for each GAT layer.

  • activations (list of activation function or None) – activations[i] gives the activation function applied to the aggregated multi-head results for the i-th GAT layer. len(activations) equals the number of GAT layers. By default, no activation is applied for each GAT layer.

  • biases (list of bool) – biases[i] gives whether to use bias for the i-th GAT layer. len(biases) equals the number of GAT layers. By default, we use bias for all GAT layers.

  • allow_zero_in_degree (bool) – Whether to allow zero-in-degree nodes in the graph for all layers. By default, zero-in-degree nodes are not allowed.

forward(g, feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which equals in_feats in initialization

Returns

feats

  • N is the total number of nodes in the batch of graphs

  • M2 is the output node representation size, which equals hidden_feats[-1] if agg_modes[-1] == ‘mean’ and hidden_feats[-1] * num_heads[-1] otherwise.

Return type

FloatTensor of shape (N, M2)

reset_parameters()[source]

Reinitialize model parameters.
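
A hedged sketch with agg_modes passed explicitly so the output size is easy to trace; the graph and feature sizes are made up:

    import dgl
    import torch
    from dgllife.model.gnn.gat import GAT

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))    # every node has in-degree 1

    # Layer 1 flattens its 4 heads (32 * 4 = 128 features per node);
    # layer 2 averages its 4 heads (32 features per node)
    model = GAT(in_feats=74, hidden_feats=[32, 32], num_heads=[4, 4],
                agg_modes=['flatten', 'mean'])
    out = model(g, torch.randn(3, 74))    # (3, 32)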

GATv2

GATv2 from How Attentive are Graph Attention Networks?, implemented in the module dgllife.model.gnn.gatv2.

GCN

class dgllife.model.gnn.gcn.GCN(in_feats, hidden_feats=None, gnn_norm=None, activation=None, residual=None, batchnorm=None, dropout=None, allow_zero_in_degree=None)[source]

GCN from Semi-Supervised Classification with Graph Convolutional Networks

Parameters
  • in_feats (int) – Number of input node features.

  • hidden_feats (list of int) – hidden_feats[i] gives the size of node representations after the i-th GCN layer. len(hidden_feats) equals the number of GCN layers. By default, we use [64, 64].

  • gnn_norm (list of str) – gnn_norm[i] gives the message passing normalizer for the i-th GCN layer, which can be ‘right’, ‘both’ or ‘none’. The ‘right’ normalizer divides the aggregated messages by each node’s in-degree. The ‘both’ normalizer corresponds to the symmetric adjacency normalization in the original GCN paper. The ‘none’ normalizer simply sums the messages. len(gnn_norm) equals the number of GCN layers. By default, we use ['none', 'none'].

  • activation (list of activation functions or None) – If not None, activation[i] gives the activation function to be used for the i-th GCN layer. len(activation) equals the number of GCN layers. By default, ReLU is applied for all GCN layers.

  • residual (list of bool) – residual[i] decides if a residual connection is used for the i-th GCN layer. len(residual) equals the number of GCN layers. By default, a residual connection is used for each GCN layer.

  • batchnorm (list of bool) – batchnorm[i] decides if batch normalization is to be applied on the output of the i-th GCN layer. len(batchnorm) equals the number of GCN layers. By default, batch normalization is applied for all GCN layers.

  • dropout (list of float) – dropout[i] decides the dropout probability on the output of the i-th GCN layer. len(dropout) equals the number of GCN layers. By default, no dropout is performed for all layers.

  • allow_zero_in_degree (bool) – Whether to allow zero-in-degree nodes in the graph for all layers. By default, zero-in-degree nodes are not allowed.

forward(g, feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which equals in_feats in initialization

Returns

feats

  • N is the total number of nodes in the batch of graphs

  • M2 is the output node representation size, which equals hidden_feats[-1] in initialization.

Return type

FloatTensor of shape (N, M2)

reset_parameters()[source]

Reinitialize model parameters.
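
A quick sketch relying on the documented defaults; the graph and input size are made up:

    import dgl
    import torch
    from dgllife.model.gnn.gcn import GCN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    model = GCN(in_feats=74)              # defaults to hidden_feats=[64, 64]
    out = model(g, torch.randn(3, 74))    # (3, 64), i.e. (N, hidden_feats[-1])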

MGCN

class dgllife.model.gnn.mgcn.MGCNGNN(feats=128, n_layers=3, num_node_types=100, num_edge_types=3000, cutoff=30.0, gap=0.1)[source]

MGCN.

MGCN is introduced in Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective.

This class performs message passing in MGCN and returns the updated node representations.

Parameters
  • feats (int) – Size for the node and edge embeddings to learn. Default to 128.

  • n_layers (int) – Number of gnn layers to use. Default to 3.

  • num_node_types (int) – Number of node types to embed. Default to 100.

  • num_edge_types (int) – Number of edge types to embed. Default to 3000.

  • cutoff (float) – Largest center in RBF expansion. Default to 30.

  • gap (float) – Difference between two adjacent centers in RBF expansion. Default to 0.1.

forward(g, node_types, edge_dists)[source]

Performs message passing and updates node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_types (int64 tensor of shape (V)) – Node types to embed, V for the number of nodes.

  • edge_dists (float32 tensor of shape (E, 1)) – Distances between end nodes of edges, E for the number of edges.

Returns

Output node representations.

Return type

float32 tensor of shape (V, feats * (n_layers + 1))

reset_parameters()[source]

Reinitialize model parameters.
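
A sketch with made-up node types and distances; the output size feats * (n_layers + 1) matches the return type above:

    import dgl
    import torch
    from dgllife.model.gnn.mgcn import MGCNGNN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_types = torch.LongTensor([0, 1, 2])    # each entry < num_node_types
    edge_dists = torch.rand(3, 1)               # within the RBF cutoff

    model = MGCNGNN(feats=128, n_layers=3)
    out = model(g, node_types, edge_dists)      # (3, 128 * (3 + 1)) = (3, 512)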

MPNN

class dgllife.model.gnn.mpnn.MPNNGNN(node_in_feats, edge_in_feats, node_out_feats=64, edge_hidden_feats=128, num_step_message_passing=6)[source]

MPNN.

MPNN is introduced in Neural Message Passing for Quantum Chemistry.

This class performs message passing in MPNN and returns the updated node representations.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • node_out_feats (int) – Size for the output node representations. Default to 64.

  • edge_in_feats (int) – Size for the input edge features.

  • edge_hidden_feats (int) – Size for the hidden edge representations. Default to 128.

  • num_step_message_passing (int) – Number of message passing steps. Default to 6.

forward(g, node_feats, edge_feats)[source]

Performs message passing and updates node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes in the batch of graphs.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges in the batch of graphs.

Returns

node_feats – Output node representations.

Return type

float32 tensor of shape (V, node_out_feats)

reset_parameters()[source]

Reinitialize model parameters.
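
A sketch with made-up feature sizes:

    import dgl
    import torch
    from dgllife.model.gnn.mpnn import MPNNGNN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_feats = torch.randn(3, 74)    # (V, node_in_feats)
    edge_feats = torch.randn(3, 12)    # (E, edge_in_feats)

    model = MPNNGNN(node_in_feats=74, edge_in_feats=12)
    out = model(g, node_feats, edge_feats)    # (3, 64), i.e. (V, node_out_feats)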

SchNet

class dgllife.model.gnn.schnet.SchNetGNN(node_feats=64, hidden_feats=None, num_node_types=100, cutoff=30.0, gap=0.1)[source]

SchNet.

SchNet is introduced in SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.

This class performs message passing in SchNet and returns the updated node representations.

Parameters
  • node_feats (int) – Size for node representations to learn. Default to 64.

  • hidden_feats (list of int) – hidden_feats[i] gives the size of hidden representations for the i-th interaction layer. len(hidden_feats) equals the number of interaction layers. Default to [64, 64, 64].

  • num_node_types (int) – Number of node types to embed. Default to 100.

  • cutoff (float) – Largest center in RBF expansion. Default to 30.

  • gap (float) – Difference between two adjacent centers in RBF expansion. Default to 0.1.

forward(g, node_types, edge_dists)[source]

Performs message passing and updates node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_types (int64 tensor of shape (V)) – Node types to embed, V for the number of nodes.

  • edge_dists (float32 tensor of shape (E, 1)) – Distances between end nodes of edges, E for the number of edges.

Returns

node_feats – Updated node representations.

Return type

float32 tensor of shape (V, node_feats)

reset_parameters()[source]

Reinitialize model parameters.
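
A sketch mirroring the MGCN interface, with made-up node types and distances:

    import dgl
    import torch
    from dgllife.model.gnn.schnet import SchNetGNN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_types = torch.LongTensor([0, 1, 2])
    edge_dists = torch.rand(3, 1)

    model = SchNetGNN()                       # defaults to node_feats=64
    out = model(g, node_types, edge_dists)    # (3, 64)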

Weave

class dgllife.model.gnn.weave.WeaveGNN(node_in_feats, edge_in_feats, num_layers=2, hidden_feats=50, activation=<function relu>)[source]

The component of Weave for updating node and edge representations.

Weave is introduced in Molecular Graph Convolutions: Moving Beyond Fingerprints.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • edge_in_feats (int) – Size for the input edge features.

  • num_layers (int) – Number of Weave layers to use, which equals the number of message passing rounds. Default to 2.

  • hidden_feats (int) – Size for the hidden node and edge representations. Default to 50.

  • activation (callable) – Activation function to be used. It cannot be None. Default to ReLU.

forward(g, node_feats, edge_feats, node_only=True)[source]

Updates node representations (and edge representations).

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes in the batch of graphs.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges in the batch of graphs.

  • node_only (bool) – Whether to return updated node representations only or to return both node and edge representations. Default to True.

Returns

  • float32 tensor of shape (V, hidden_feats) – Updated node representations.

  • float32 tensor of shape (E, hidden_feats), optional – This is returned only when node_only==False. Updated edge representations.

reset_parameters()[source]

Reinitialize model parameters.
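
A sketch showing both return modes of forward; the sizes are made up:

    import dgl
    import torch
    from dgllife.model.gnn.weave import WeaveGNN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_feats = torch.randn(3, 74)
    edge_feats = torch.randn(3, 12)

    model = WeaveGNN(node_in_feats=74, edge_in_feats=12)
    node_out = model(g, node_feats, edge_feats)    # (3, 50)
    # node_only=False additionally returns the updated edge representations
    node_out, edge_out = model(g, node_feats, edge_feats, node_only=False)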

GIN

class dgllife.model.gnn.gin.GIN(num_node_emb_list, num_edge_emb_list, num_layers=5, emb_dim=300, JK='last', dropout=0.5)[source]

Graph Isomorphism Network from Strategies for Pre-training Graph Neural Networks

This module is for updating node representations only.

Parameters
  • num_node_emb_list (list of int) – num_node_emb_list[i] gives the number of items to embed for the i-th categorical node feature variable. E.g. num_node_emb_list[0] can be the number of atom types and num_node_emb_list[1] can be the number of atom chirality types.

  • num_edge_emb_list (list of int) – num_edge_emb_list[i] gives the number of items to embed for the i-th categorical edge feature variable. E.g. num_edge_emb_list[0] can be the number of bond types and num_edge_emb_list[1] can be the number of bond direction types.

  • num_layers (int) – Number of GIN layers to use. Default to 5.

  • emb_dim (int) – The size of each embedding vector. Default to 300.

  • JK (str) –

    JK for jumping knowledge as in Representation Learning on Graphs with Jumping Knowledge Networks. It decides how to combine the node representations from all layers for the final output. There are four options for this argument: 'concat', 'last', 'max' and 'sum'. Default to ‘last’.

    • 'concat': concatenate the output node representations from all GIN layers

    • 'last': use the node representations from the last GIN layer

    • 'max': apply max pooling to the node representations across all GIN layers

    • 'sum': sum the output node representations from all GIN layers

  • dropout (float) – Dropout to apply to the output of each GIN layer. Default to 0.5.

forward(g, categorical_node_feats, categorical_edge_feats)[source]

Update node representations

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • categorical_node_feats (list of LongTensor of shape (N)) –

    • Input categorical node features

    • len(categorical_node_feats) should be the same as len(num_node_emb_list) in the arguments

    • N is the total number of nodes in the batch of graphs

  • categorical_edge_feats (list of LongTensor of shape (E)) –

    • Input categorical edge features

    • len(categorical_edge_feats) should be the same as len(num_edge_emb_list) in the arguments

    • E is the total number of edges in the batch of graphs

Returns

final_node_feats – Output node representations, N for the number of nodes and M for output size. In particular, M will be emb_dim * (num_layers + 1) if self.JK == ‘concat’ and emb_dim otherwise.

Return type

float32 tensor of shape (N, M)

reset_parameters()[source]

Reinitialize model parameters.
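
A sketch with two made-up categorical variables per node and per edge; the vocabulary sizes and indices are arbitrary:

    import dgl
    import torch
    from dgllife.model.gnn.gin import GIN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    # e.g. 120 atom types + 3 chirality types, 6 bond types + 3 direction types
    model = GIN(num_node_emb_list=[120, 3], num_edge_emb_list=[6, 3])
    node_feats = [torch.LongTensor([0, 1, 2]), torch.LongTensor([0, 0, 1])]
    edge_feats = [torch.LongTensor([0, 1, 2]), torch.LongTensor([0, 0, 0])]
    out = model(g, node_feats, edge_feats)    # (3, 300) since JK='last' by default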

GraphSAGE

class dgllife.model.gnn.graphsage.GraphSAGE(in_feats, hidden_feats=None, activation=None, dropout=None, aggregator_type=None)[source]

GraphSAGE from Inductive Representation Learning on Large Graphs

Parameters
  • in_feats (int) – Number of input node features.

  • hidden_feats (list of int) – hidden_feats[i] gives the size of node representations after the i-th GraphSAGE layer. len(hidden_feats) equals the number of GraphSAGE layers. By default, we use [64, 64].

  • activation (list of activation functions or None) – If not None, activation[i] gives the activation function to be used for the i-th GraphSAGE layer. len(activation) equals the number of GraphSAGE layers. By default, ReLU is applied for all GraphSAGE layers.

  • dropout (list of float or None) – dropout[i] decides the dropout probability on the output of the i-th GraphSAGE layer. len(dropout) equals the number of GraphSAGE layers. By default, no dropout is performed for all layers.

  • aggregator_type (list of str) – aggregator_type[i] decides the aggregator type for the i-th GraphSAGE layer, which can be one of 'mean', 'gcn', 'pool', 'lstm'. By default, we use 'mean' for all layers.

forward(g, feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which equals in_feats in initialization

Returns

feats

  • N is the total number of nodes in the batch of graphs

  • M2 is the output node representation size, which equals hidden_feats[-1] in initialization.

Return type

FloatTensor of shape (N, M2)

reset_parameters()[source]

Reinitialize model parameters.
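
A sketch relying on the documented defaults:

    import dgl
    import torch
    from dgllife.model.gnn.graphsage import GraphSAGE

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    model = GraphSAGE(in_feats=74)        # defaults to hidden_feats=[64, 64], 'mean'
    out = model(g, torch.randn(3, 74))    # (3, 64)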

WLN

class dgllife.model.gnn.wln.WLN(node_in_feats, edge_in_feats, node_out_feats=300, n_layers=3, project_in_feats=True, set_comparison=True)[source]

Weisfeiler-Lehman Network (WLN)

WLN is introduced in Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network.

This class performs message passing and updates node representations.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • edge_in_feats (int) – Size for the input edge features.

  • node_out_feats (int) – Size for the output node representations. Default to 300.

  • n_layers (int) – Number of message passing rounds. Note that the same parameters are shared across all n_layers rounds. Default to 3.

  • project_in_feats (bool) – Whether to project input node features. If this is False, we expect node_in_feats to be the same as node_out_feats. Default to True.

  • set_comparison (bool) – Whether to perform final node representation update mimicking set comparison. Default to True.

forward(g, node_feats, edge_feats)[source]

Performs message passing and updates node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges.

Returns

Updated node representations.

Return type

float32 tensor of shape (V, node_out_feats)

reset_parameters()[source]

Reinitialize model parameters.
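
A sketch with made-up feature sizes:

    import dgl
    import torch
    from dgllife.model.gnn.wln import WLN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_feats = torch.randn(3, 74)
    edge_feats = torch.randn(3, 12)

    model = WLN(node_in_feats=74, edge_in_feats=12)
    out = model(g, node_feats, edge_feats)    # (3, 300), i.e. (V, node_out_feats)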

GNNOGB

class dgllife.model.gnn.gnn_ogb.GNNOGB(in_edge_feats, num_node_types=1, hidden_feats=300, n_layers=5, batchnorm=True, activation=<function relu>, dropout=0.0, gnn_type='gcn', virtual_node=True, residual=False, jk=False)[source]

Variant of GCN/GIN from Open Graph Benchmark: Datasets for Machine Learning on Graphs

Parameters
  • in_edge_feats (int) – Number of input edge features.

  • num_node_types (int) – Number of node types to embed. (Default: 1)

  • hidden_feats (int) – Size for hidden representations. (Default: 300)

  • n_layers (int) – Number of GNN layers to use. (Default: 5)

  • batchnorm (bool) – Whether to apply batch normalization. (Default: True)

  • activation (callable or None) – Activation function to apply to the output of each GNN layer except for the last layer. If None, no activation will be applied. (Default: ReLU)

  • dropout (float) – The probability for dropout. (Default: 0, i.e. no dropout)

  • gnn_type (str) – The GNN type to use, which can be either ‘gcn’ or ‘gin’. (Default: ‘gcn’)

  • virtual_node (bool) – Whether to use virtual node. (Default: True)

  • residual (bool) – Whether to apply residual connections for virtual node embeddings. (Default: False)

  • jk (bool) – Whether to sum over the output of all GNN layers as in JK networks. (Default: False)

forward(g, node_feats, edge_feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • node_feats (LongTensor of shape (N, 1)) – Input categorical node features. N for the number of nodes.

  • edge_feats (FloatTensor of shape (E, in_edge_feats)) – Input edge features. E for the number of edges.

Returns

Output node representations

Return type

FloatTensor of shape (N, hidden_feats)

reset_parameters()[source]

Reinitialize model parameters.
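
A sketch with made-up sizes; note that the node features here are categorical indices, not float vectors:

    import dgl
    import torch
    from dgllife.model.gnn.gnn_ogb import GNNOGB

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_feats = torch.zeros(3, 1, dtype=torch.long)    # one node type by default
    edge_feats = torch.randn(3, 12)                     # (E, in_edge_feats)

    model = GNNOGB(in_edge_feats=12)
    out = model(g, node_feats, edge_feats)    # (3, 300), i.e. (N, hidden_feats)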

NF

class dgllife.model.gnn.nf.NFGNN(in_feats, hidden_feats=None, max_degree=10, activation=None, batchnorm=None, dropout=None)[source]

GNN from Convolutional Networks on Graphs for Learning Molecular Fingerprints

NF stands for neural fingerprint.

Parameters
  • in_feats (int) – Number of input node features.

  • hidden_feats (list of int, optional) – hidden_feats[i] gives the size of node representations after the i-th NF layer. len(hidden_feats) equals the number of NF layers. By default, we use [64, 64].

  • max_degree (int) – The maximum node degree to consider when updating weights. Default to 10.

  • activation (list of activation functions or None) – If not None, activation[i] gives the activation function to be used for the i-th NF layer. len(activation) equals the number of NF layers. By default, ReLU is applied for all NF layers.

  • batchnorm (list of bool, optional) – batchnorm[i] decides if batch normalization is to be applied on the output of the i-th NF layer. len(batchnorm) equals the number of NF layers. By default, batch normalization is applied for all NF layers.

  • dropout (list of float, optional) – dropout[i] decides the dropout to be applied on the output of the i-th NF layer. len(dropout) equals the number of NF layers. By default, dropout is not applied for all NF layers.

forward(g, feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which equals in_feats in initialization

Returns

feats

  • N is the total number of nodes in the batch of graphs

  • M2 is the output node representation size, which equals hidden_feats[-1] in initialization.

Return type

FloatTensor of shape (N, M2)

reset_parameters()[source]

Reinitialize model parameters.
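
A sketch on a toy graph whose node degrees stay below max_degree:

    import dgl
    import torch
    from dgllife.model.gnn.nf import NFGNN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    model = NFGNN(in_feats=74)            # defaults to hidden_feats=[64, 64]
    out = model(g, torch.randn(3, 74))    # (3, 64)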

PAGTN

class dgllife.model.gnn.pagtn.PAGTNGNN(node_in_feats, node_out_feats, node_hid_feats, edge_feats, depth=5, nheads=1, dropout=0.1, activation=LeakyReLU(negative_slope=0.2))[source]

Multilayer PAGTN model for updating node representations. PAGTN is introduced in Path-Augmented Graph Transformer Network.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • node_out_feats (int) – Size for the output node features.

  • node_hid_feats (int) – Size for the hidden node features.

  • edge_feats (int) – Size for the input edge features.

  • depth (int) – Number of PAGTN layers to be applied.

  • nheads (int) – Number of attention heads.

  • dropout (float) – The probability for performing dropout. Default to 0.1.

  • activation (callable) – Activation function to apply. Default to LeakyReLU.

forward(g, node_feats, edge_feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes in the batch of graphs.

  • edge_feats (float32 tensor of shape (E, edge_feats)) – Input edge features. E for the number of edges in the batch of graphs.

Returns

Updated node features.

Return type

float32 tensor of shape (V, node_out_feats)

reset_parameters()[source]

Reinitialize learnable parameters.
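
A sketch with made-up sizes; in practice PAGTN is typically fed path-augmented edge features, so the random edge features below are placeholders only:

    import dgl
    import torch
    from dgllife.model.gnn.pagtn import PAGTNGNN

    g = dgl.graph(([0, 1, 2], [1, 2, 0]))
    node_feats = torch.randn(3, 74)
    edge_feats = torch.randn(3, 12)

    model = PAGTNGNN(node_in_feats=74, node_out_feats=64,
                     node_hid_feats=128, edge_feats=12)
    out = model(g, node_feats, edge_feats)    # (3, 64), i.e. (V, node_out_feats)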