Model Zoo
This section introduces complete models for various downstream tasks.
Building Blocks
MLP Predictor

class dgllife.model.model_zoo.mlp_predictor.MLPPredictor(in_feats, hidden_feats, n_tasks, dropout=0.0)
Two-layer MLP for regression or soft classification over multiple tasks from graph representations.
For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.
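 Parameters
in_feats (int) – Number of input graph features.
hidden_feats (int) – Number of graph features in hidden layers.
n_tasks (int) – Number of tasks, which is also the output size.
dropout (float) – The probability for dropout. Default to 0, i.e. no dropout is performed.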
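Since the predictors in this section return logits for classification tasks, probabilities have to be computed separately. The following is a minimal sketch in plain Python of the two usual conversions; the helper names and the example values are illustrative and are not part of dgllife.

```python
import math

def sigmoid(logit):
    """Map one logit to a probability in (0, 1), e.g. for multi-label tasks."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative logits, use the equivalent form that avoids overflow in exp
    e = math.exp(logit)
    return e / (1.0 + e)

def softmax(logits):
    """Map a list of logits to probabilities that sum to 1, e.g. for multi-class tasks."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs for one graph with n_tasks = 3
logits = [0.0, 2.0, -1.0]
multi_label_probs = [sigmoid(x) for x in logits]  # one independent probability per task
multi_class_probs = softmax(logits)               # one distribution over the three classes
```

Which conversion applies depends on how the tasks are defined: independent binary tasks use the sigmoid, mutually exclusive classes use the softmax.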
Hadamard Link Predictor

class dgllife.model.model_zoo.hadamard_link_predictor.HadamardLinkPredictor(in_feats, hidden_feats=256, num_layers=3, n_tasks=1, dropout=0.0, activation=<function relu>)
Link prediction by taking the elementwise multiplication of two node representations.
The elementwise multiplication is also called the Hadamard product.
 Parameters
in_feats (int) – Number of input node features
hidden_feats (int) – Number of hidden features. Default to 256.
num_layers (int) – Number of linear layers used in total, which should be at least 2, counting the input and output layers. Default to 3.
n_tasks (int) – Number of output tasks. Default to 1.
dropout (float) – Dropout before each linear layer except for the first one. Default to 0., i.e. no dropout is performed.
activation (callable) – Activation function to apply after the output of each linear layer. Default to ReLU.

forward(left_node_feats, right_node_feats)
Link prediction.
Perform link prediction for P pairs of nodes. Note that this model is symmetric and we don’t have separate parameters for the two arguments.
 Parameters
left_node_feats (float32 tensor of shape (P, D1)) – Representations for the first node in P pairs. D1 for the number of input node features.
right_node_feats (float32 tensor of shape (P, D1)) – Representations for the second node in P pairs. D1 for the number of input node features.
 Returns
Pre-softmax/sigmoid logits. D2 equals n_tasks.
 Return type
float32 tensor of shape (P, D2)
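To illustrate why a Hadamard-product link predictor is symmetric in its two arguments, here is a minimal plain-Python sketch. It assumes a single linear scoring layer with hypothetical weights, whereas the class above stacks several linear layers; the function names are illustrative and are not part of dgllife.

```python
def hadamard(left, right):
    """Elementwise (Hadamard) product of two equal-length feature vectors."""
    if len(left) != len(right):
        raise ValueError("both node representations must have the same size")
    return [a * b for a, b in zip(left, right)]

def toy_link_logit(left, right, weights, bias=0.0):
    """Score one node pair: a single linear layer applied to the Hadamard product."""
    combined = hadamard(left, right)
    return sum(w * h for w, h in zip(weights, combined)) + bias

u = [1.0, 2.0, 3.0]    # representation of the first node in a pair
v = [4.0, 0.5, -1.0]   # representation of the second node in a pair
w = [0.1, 0.2, 0.3]    # hypothetical weights of the scoring layer

logit_uv = toy_link_logit(u, v, w)
logit_vu = toy_link_logit(v, u, w)  # swapping the arguments gives the same logit
```

Because the product is commutative, the order of the two nodes in a pair does not affect the score.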
Molecular Property Prediction
AttentiveFP Predictor

class dgllife.model.model_zoo.attentivefp_predictor.AttentiveFPPredictor(node_feat_size, edge_feat_size, num_layers=2, num_timesteps=2, graph_feat_size=200, n_tasks=1, dropout=0.0)
AttentiveFP for regression and classification on graphs.
AttentiveFP is introduced in Pushing the Boundaries of Molecular Representation for Drug Discovery with the Graph Attention Mechanism.
 Parameters
node_feat_size (int) – Size for the input node features.
edge_feat_size (int) – Size for the input edge features.
num_layers (int) – Number of GNN layers. Default to 2.
num_timesteps (int) – Number of times the graph representations are updated with GRU. Default to 2.
graph_feat_size (int) – Size for the learned graph representations. Default to 200.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
dropout (float) – Probability for performing the dropout. Default to 0.

forward(g, node_feats, edge_feats, get_node_weight=False)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs.
node_feats (float32 tensor of shape (V, node_feat_size)) – Input node features. V for the number of nodes.
edge_feats (float32 tensor of shape (E, edge_feat_size)) – Input edge features. E for the number of edges.
get_node_weight (bool) – Whether to get the weights of atoms during readout. Default to False.
 Returns
float32 tensor of shape (G, n_tasks) – Prediction for the graphs in the batch. G for the number of graphs.
node_weights (list of float32 tensor of shape (V, 1), optional) – This is returned when get_node_weight is True. The list has length num_timesteps and node_weights[i] gives the node weights in the i-th update.
GAT Predictor

class dgllife.model.model_zoo.gat_predictor.GATPredictor(in_feats, hidden_feats=None, num_heads=None, feat_drops=None, attn_drops=None, alphas=None, residuals=None, agg_modes=None, activations=None, biases=None, classifier_hidden_feats=128, classifier_dropout=0.0, n_tasks=1, predictor_hidden_feats=128, predictor_dropout=0.0)
GAT-based model for regression and classification on graphs.
GAT is introduced in Graph Attention Networks. This model is based on GAT and can be used for regression and classification on graphs.
After updating node representations, we perform a weighted sum with learnable weights and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.
For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.
 Parameters
in_feats (int) – Number of input node features
hidden_feats (list of int) – hidden_feats[i] gives the output size of an attention head in the i-th GAT layer. len(hidden_feats) equals the number of GAT layers. By default, we use [32, 32].
num_heads (list of int) – num_heads[i] gives the number of attention heads in the i-th GAT layer. len(num_heads) equals the number of GAT layers. By default, we use 4 attention heads for each GAT layer.
feat_drops (list of float) – feat_drops[i] gives the dropout applied to the input features in the i-th GAT layer. len(feat_drops) equals the number of GAT layers. By default, this will be zero for all GAT layers.
attn_drops (list of float) – attn_drops[i] gives the dropout applied to attention values of edges in the i-th GAT layer. len(attn_drops) equals the number of GAT layers. By default, this will be zero for all GAT layers.
alphas (list of float) – Hyperparameters in LeakyReLU, which are the slopes for negative values. alphas[i] gives the slope for negative values in the i-th GAT layer. len(alphas) equals the number of GAT layers. By default, this will be 0.2 for all GAT layers.
residuals (list of bool) – residuals[i] decides if residual connection is to be used for the i-th GAT layer. len(residuals) equals the number of GAT layers. By default, residual connection is performed for each GAT layer.
agg_modes (list of str) – The way to aggregate multi-head attention results for each GAT layer, which can be either ‘flatten’ for concatenating all-head results or ‘mean’ for averaging all-head results. agg_modes[i] gives the way to aggregate multi-head attention results for the i-th GAT layer. len(agg_modes) equals the number of GAT layers. By default, we flatten multi-head results for intermediate GAT layers and compute the mean of multi-head results for the last GAT layer.
activations (list of activation function or None) – activations[i] gives the activation function applied to the aggregated multi-head results for the i-th GAT layer. len(activations) equals the number of GAT layers. By default, ELU is applied for intermediate GAT layers and no activation is applied for the last GAT layer.
biases (list of bool) – biases[i] gives whether to add bias for the i-th GAT layer. len(biases) equals the number of GAT layers. By default, bias is added for all GAT layers.
classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size of hidden graph representations in the classifier. Default to 128.
classifier_dropout (float) – (Deprecated, see predictor_dropout) The probability for dropout in the classifier. Default to 0.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 128.
predictor_dropout (float) – The probability for dropout in the output MLP predictor. Default to 0.

forward(bg, feats)
Graph-level regression/soft classification.
 Parameters
bg (DGLGraph) – DGLGraph for a batch of graphs.
feats (FloatTensor of shape (N, M1)) –
N is the total number of nodes in the batch of graphs
M1 is the input node feature size, which must match in_feats in initialization
 Returns
Predictions on graphs
B for the number of graphs in the batch
 Return type
FloatTensor of shape (B, n_tasks)
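The effect of agg_modes on the node representation size can be worked out with simple arithmetic: ‘flatten’ concatenates the heads, so the size is multiplied by the number of heads, while ‘mean’ averages them and keeps the per-head size. The helper below is a hypothetical plain-Python sketch of that bookkeeping, not a dgllife function.

```python
def gat_layer_output_sizes(hidden_feats, num_heads, agg_modes):
    """Return the node representation size after each GAT layer."""
    if not (len(hidden_feats) == len(num_heads) == len(agg_modes)):
        raise ValueError("all per-layer lists must have the same length")
    sizes = []
    for feats, heads, mode in zip(hidden_feats, num_heads, agg_modes):
        if mode == "flatten":
            sizes.append(feats * heads)  # heads are concatenated
        elif mode == "mean":
            sizes.append(feats)          # heads are averaged
        else:
            raise ValueError(f"unknown aggregation mode: {mode}")
    return sizes

# Defaults described above: two layers of size 32, 4 heads each,
# flatten for the intermediate layer and mean for the last layer
default_sizes = gat_layer_output_sizes([32, 32], [4, 4], ["flatten", "mean"])
```

With these defaults the second layer receives 128 features per node and the readout receives 32.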
GATv2 Predictor

class dgllife.model.model_zoo.gatv2_predictor.GATv2Predictor(in_feats, hidden_feats=None, num_heads=None, feat_drops=None, attn_drops=None, alphas=None, residuals=None, activations=None, allow_zero_in_degree=False, biases=None, share_weights=None, agg_modes=None, n_tasks=1, predictor_out_feats=128, predictor_dropout=0.0)
GATv2-based model for regression and classification on graphs.
GATv2 is introduced in How Attentive Are Graph Attention Networks? <https://arxiv.org/pdf/2105.14491.pdf>. This model is based on GATv2 and can be used for regression and classification on graphs.
After updating node representations, we perform a weighted sum with learnable weights and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.
For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.
 Parameters
in_feats (int) – Number of input node features
hidden_feats (list of int, optional) – hidden_feats[i] gives the output size of an attention head in the i-th GATv2 layer. len(hidden_feats) equals the number of GATv2 layers. By default, we use [32, 32].
num_heads (list of int, optional) – num_heads[i] gives the number of attention heads in the i-th GATv2 layer. len(num_heads) equals the number of GATv2 layers. By default, we use 4 attention heads per GATv2 layer.
feat_drops (list of float, optional) – feat_drops[i] gives the dropout applied to the input features in the i-th GATv2 layer. len(feat_drops) equals the number of GATv2 layers. By default, we use zero for all GATv2 layers.
attn_drops (list of float, optional) – attn_drops[i] gives the dropout applied to the attention values of edges in the i-th GATv2 layer. len(attn_drops) equals the number of GATv2 layers. By default, we use zero for all GATv2 layers.
alphas (list of float, optional) – alphas[i] gives the slope for the negative values in the LeakyReLU function of the i-th GATv2 layer. len(alphas) equals the number of GATv2 layers. By default, we use 0.2 for all GATv2 layers.
residuals (list of bool, optional) – residuals[i] decides if residual connection is to be used for the i-th GATv2 layer. len(residuals) equals the number of GATv2 layers. By default, we use False for all GATv2 layers.
activations (list of callable, optional) – activations[i] gives the activation function applied to the result of the i-th GATv2 layer. len(activations) equals the number of GATv2 layers. By default, we use ELU for all GATv2 layers, except for the last layer.
allow_zero_in_degree (bool, optional) – If there are 0-in-degree nodes in the graph, output for those nodes will be invalid since no messages will be passed to those nodes. This is harmful for some applications, causing silent performance regression. This module will raise a DGLError if it detects 0-in-degree nodes in the input graph. Setting this to True suppresses the check and lets users handle such nodes themselves. Defaults: False.
biases (list of bool, optional) – biases[i] decides if an additive bias is allowed to be learned by the i-th GATv2 layer. len(biases) equals the number of GATv2 layers. By default, additive biases are learned for all GATv2 layers.
share_weights (list of bool, optional) – share_weights[i] decides if the learnable weight matrix for source and destination nodes is the same in the i-th GATv2 layer. len(share_weights) equals the number of GATv2 layers. By default, no weight sharing is used in all GATv2 layers.
agg_modes (list of str, optional) – agg_modes[i] gives the way to aggregate multi-head attention results in the i-th GATv2 layer. len(agg_modes) equals the number of GATv2 layers. By default, we flatten all-head results for each GATv2 layer, except for the last layer.
n_tasks (int, optional) – Number of tasks, which is also the output size. Default to 1.
predictor_out_feats (int, optional) – Size for hidden representations in the output MLP predictor. Default to 128.
predictor_dropout (float, optional) – The probability for dropout in the output MLP predictor. Default to 0.

forward(bg, feats, get_attention=False)
Graph-level regression/soft classification.
 Parameters
bg (DGLGraph) – DGLGraph for a batch of graphs.
feats (FloatTensor of shape (N, M1)) –
N is the total number of nodes in the batch of graphs.
M1 is the input node feature size, which equals in_feats in initialization
get_attention (bool, optional) – Whether to return the attention values. Defaults: False
 Returns
preds (FloatTensor of shape (B, n_tasks)) –
Predictions on graphs
B for the number of graphs in the batch
attentions (list of FloatTensor of shape (E, H, 1), optional) – It is returned when get_attention is True. attentions[i] gives the attention values in the i-th GATv2 layer.
E is the number of edges.
H is the number of attention heads.
GCN Predictor

class dgllife.model.model_zoo.gcn_predictor.GCNPredictor(in_feats, hidden_feats=None, gnn_norm=None, activation=None, residual=None, batchnorm=None, dropout=None, classifier_hidden_feats=128, classifier_dropout=0.0, n_tasks=1, predictor_hidden_feats=128, predictor_dropout=0.0)
GCN-based model for regression and classification on graphs.
GCN is introduced in Semi-Supervised Classification with Graph Convolutional Networks. This model is based on GCN and can be used for regression and classification on graphs.
After updating node representations, we perform a weighted sum with learnable weights and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.
For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.
 Parameters
in_feats (int) – Number of input node features.
hidden_feats (list of int) – hidden_feats[i] gives the size of node representations after the i-th GCN layer. len(hidden_feats) equals the number of GCN layers. By default, we use [64, 64].
gnn_norm (list of str) – gnn_norm[i] gives the message passing normalizer for the i-th GCN layer, which can be ‘right’, ‘both’ or ‘none’. The ‘right’ normalizer divides the aggregated messages by each node’s in-degree. The ‘both’ normalizer corresponds to the symmetric adjacency normalization in the original GCN paper. The ‘none’ normalizer simply sums the messages. len(gnn_norm) equals the number of GCN layers. By default, we use ['none', 'none'].
activation (list of activation functions or None) – If None, no activation will be applied. If not None, activation[i] gives the activation function to be used for the i-th GCN layer. len(activation) equals the number of GCN layers. By default, ReLU is applied for all GCN layers.
residual (list of bool) – residual[i] decides if residual connection is to be used for the i-th GCN layer. len(residual) equals the number of GCN layers. By default, residual connection is performed for each GCN layer.
batchnorm (list of bool) – batchnorm[i] decides if batch normalization is to be applied on the output of the i-th GCN layer. len(batchnorm) equals the number of GCN layers. By default, batch normalization is applied for all GCN layers.
dropout (list of float) – dropout[i] decides the dropout probability on the output of the i-th GCN layer. len(dropout) equals the number of GCN layers. By default, no dropout is performed for all layers.
classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size of hidden graph representations in the classifier. Default to 128.
classifier_dropout (float) – (Deprecated, see predictor_dropout) The probability for dropout in the classifier. Default to 0.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 128.
predictor_dropout (float) – The probability for dropout in the output MLP predictor. Default to 0.

forward(bg, feats)
Graph-level regression/soft classification.
 Parameters
bg (DGLGraph) – DGLGraph for a batch of graphs.
feats (FloatTensor of shape (N, M1)) –
N is the total number of nodes in the batch of graphs
M1 is the input node feature size, which must match in_feats in initialization
 Returns
Predictions on graphs
B for the number of graphs in the batch
 Return type
FloatTensor of shape (B, n_tasks)
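The three gnn_norm options differ only in how incoming messages are scaled. The following plain-Python sketch shows the difference on scalar node features; it leaves out the learnable weight matrix, and it assumes that ‘both’ scales each message by 1 / sqrt(out-degree of the source × in-degree of the destination), which is one common way of writing the symmetric normalization. The function name is illustrative and is not part of dgllife.

```python
import math

def aggregate(edges, feats, norm="none"):
    """Sum scalar messages along directed (src, dst) edges with a chosen normalizer."""
    if norm not in ("none", "right", "both"):
        raise ValueError(f"unknown normalizer: {norm}")
    n = len(feats)
    in_deg = [0] * n
    out_deg = [0] * n
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    out = [0.0] * n
    for src, dst in edges:
        msg = feats[src]
        if norm == "both":
            # Symmetric normalization (assumed form, see the note above)
            msg /= math.sqrt(out_deg[src] * in_deg[dst])
        out[dst] += msg
    if norm == "right":
        # Divide the summed messages by the in-degree of the receiving node
        out = [o / d if d > 0 else 0.0 for o, d in zip(out, in_deg)]
    return out

# Tiny example: nodes 0 and 1 both send a message to node 2
edges = [(0, 2), (1, 2)]
feats = [2.0, 4.0, 0.0]
summed = aggregate(edges, feats, "none")
averaged = aggregate(edges, feats, "right")
symmetric = aggregate(edges, feats, "both")
```

In this example ‘none’ gives node 2 the plain sum, ‘right’ gives the average over its two in-neighbors, and ‘both’ gives a value in between.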
MGCN Predictor

class dgllife.model.model_zoo.mgcn_predictor.MGCNPredictor(feats=128, n_layers=3, classifier_hidden_feats=64, n_tasks=1, num_node_types=100, num_edge_types=3000, cutoff=5.0, gap=1.0, predictor_hidden_feats=64)
MGCN for regression and classification on graphs.
MGCN is introduced in Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective.
 Parameters
feats (int) – Size for the node and edge embeddings to learn. Default to 128.
n_layers (int) – Number of GNN layers to use. Default to 3.
classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size for hidden representations in the classifier. Default to 64.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
num_node_types (int) – Number of node types to embed. Default to 100.
num_edge_types (int) – Number of edge types to embed. Default to 3000.
cutoff (float) – Largest center in RBF expansion. Default to 5.0.
gap (float) – Difference between two adjacent centers in RBF expansion. Default to 1.0.
predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 64.

forward(g, node_types, edge_dists)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs.
node_types (int64 tensor of shape (V)) – Node types to embed, V for the number of nodes.
edge_dists (float32 tensor of shape (E, 1)) – Distances between end nodes of edges, E for the number of edges.
 Returns
Prediction for the graphs in the batch. G for the number of graphs.
 Return type
float32 tensor of shape (G, n_tasks)
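The cutoff and gap parameters control a radial basis function (RBF) expansion of edge distances. As a rough sketch of the idea, the plain-Python code below places centers from 0 up to cutoff in steps of gap and evaluates a Gaussian around each center. The starting point of 0, the width parameter gamma, and the function names are assumptions for illustration; the library's exact center placement and width may differ.

```python
import math

def rbf_centers(cutoff, gap, low=0.0):
    """Evenly spaced centers from low to cutoff, gap apart (assumed layout)."""
    n = int(round((cutoff - low) / gap)) + 1
    return [low + i * gap for i in range(n)]

def rbf_expand(dist, centers, gamma):
    """Expand one distance into a vector of Gaussian responses, one per center."""
    return [math.exp(-gamma * (dist - c) ** 2) for c in centers]

# Values matching the MGCNPredictor defaults: cutoff=5.0, gap=1.0
centers = rbf_centers(5.0, 1.0)
expanded = rbf_expand(2.0, centers, gamma=1.0)  # the response peaks at the center closest to 2.0
```

A smaller gap gives more centers and therefore a finer encoding of distances, at the cost of a longer feature vector per edge.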
MPNN Predictor

class dgllife.model.model_zoo.mpnn_predictor.MPNNPredictor(node_in_feats, edge_in_feats, node_out_feats=64, edge_hidden_feats=128, n_tasks=1, num_step_message_passing=6, num_step_set2set=6, num_layer_set2set=3)
MPNN for regression and classification on graphs.
MPNN is introduced in Neural Message Passing for Quantum Chemistry.
 Parameters
node_in_feats (int) – Size for the input node features.
edge_in_feats (int) – Size for the input edge features.
node_out_feats (int) – Size for the output node representations. Default to 64.
edge_hidden_feats (int) – Size for the hidden edge representations. Default to 128.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
num_step_message_passing (int) – Number of message passing steps. Default to 6.
num_step_set2set (int) – Number of set2set steps. Default to 6.
num_layer_set2set (int) – Number of set2set layers. Default to 3.

forward(g, node_feats, edge_feats)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs.
node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features.
edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features.
 Returns
Prediction for the graphs in the batch. G for the number of graphs.
 Return type
float32 tensor of shape (G, n_tasks)
SchNet Predictor

class dgllife.model.model_zoo.schnet_predictor.SchNetPredictor(node_feats=64, hidden_feats=None, classifier_hidden_feats=64, n_tasks=1, num_node_types=100, cutoff=30.0, gap=0.1, predictor_hidden_feats=64)
SchNet for regression and classification on graphs.
SchNet is introduced in SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.
 Parameters
node_feats (int) – Size for node representations to learn. Default to 64.
hidden_feats (list of int) – hidden_feats[i] gives the size of hidden representations for the i-th interaction (GNN) layer. len(hidden_feats) equals the number of interaction (GNN) layers. Default to [64, 64, 64].
classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size for hidden representations in the classifier. Default to 64.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
num_node_types (int) – Number of node types to embed. Default to 100.
cutoff (float) – Largest center in RBF expansion. Default to 30.
gap (float) – Difference between two adjacent centers in RBF expansion. Default to 0.1.
predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 64.

forward(g, node_types, edge_dists)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs.
node_types (int64 tensor of shape (V)) – Node types to embed, V for the number of nodes.
edge_dists (float32 tensor of shape (E, 1)) – Distances between end nodes of edges, E for the number of edges.
 Returns
Prediction for the graphs in the batch. G for the number of graphs.
 Return type
float32 tensor of shape (G, n_tasks)
Weave Predictor

class dgllife.model.model_zoo.weave_predictor.WeavePredictor(node_in_feats, edge_in_feats, num_gnn_layers=2, gnn_hidden_feats=50, gnn_activation=<function relu>, graph_feats=128, gaussian_expand=True, gaussian_memberships=None, readout_activation=Tanh(), n_tasks=1)
Weave for regression and classification on graphs.
Weave is introduced in Molecular Graph Convolutions: Moving Beyond Fingerprints.
 Parameters
node_in_feats (int) – Size for the input node features.
edge_in_feats (int) – Size for the input edge features.
num_gnn_layers (int) – Number of GNN (Weave) layers to use. Default to 2.
gnn_hidden_feats (int) – Size for the hidden node and edge representations. Default to 50.
gnn_activation (callable) – Activation function to be used in GNN (Weave) layers. Default to ReLU.
graph_feats (int) – Size for the hidden graph representations. Default to 128.
gaussian_expand (bool) – Whether to expand each dimension of node features by gaussian histogram in computing graph representations. Default to True.
gaussian_memberships (list of 2-tuples) – For each tuple, the first and second elements specify the mean and std, respectively, for constructing a normal distribution. This argument comes into effect only when gaussian_expand==True. By default, we set this to be a list consisting of (-1.645, 0.283), (-1.080, 0.170), (-0.739, 0.134), (-0.468, 0.118), (-0.228, 0.114), (0., 0.114), (0.228, 0.114), (0.468, 0.118), (0.739, 0.134), (1.080, 0.170), (1.645, 0.283).
readout_activation (callable) – Activation function to be used in computing graph representations out of node representations. Default to Tanh.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

forward(g, node_feats, edge_feats)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs.
node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes.
edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges.
 Returns
Prediction for the graphs in the batch. G for the number of graphs.
 Return type
float32 tensor of shape (G, n_tasks)
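The gaussian_expand option of WeavePredictor replaces each scalar node feature with its soft membership in a set of normal distributions. The plain-Python sketch below shows one plausible reading of that idea: each density is scaled by its peak value and the scores are then normalized to sum to 1. The normalization scheme and the function name are assumptions for illustration and may not match the library's implementation step by step.

```python
import math

# (mean, std) pairs, symmetric around zero
DEFAULT_MEMBERSHIPS = [
    (-1.645, 0.283), (-1.080, 0.170), (-0.739, 0.134), (-0.468, 0.118),
    (-0.228, 0.114), (0.0, 0.114), (0.228, 0.114), (0.468, 0.118),
    (0.739, 0.134), (1.080, 0.170), (1.645, 0.283),
]

def gaussian_membership(x, memberships=DEFAULT_MEMBERSHIPS):
    """Expand a scalar into normalized memberships, one per normal distribution."""
    # Density at x divided by the peak density, so each raw score lies in [0, 1]
    raw = [math.exp(-0.5 * ((x - mean) / std) ** 2) for mean, std in memberships]
    total = sum(raw)
    if total == 0.0:
        # x is so far from every mean that all scores underflowed to zero
        return raw
    return [r / total for r in raw]

histogram = gaussian_membership(0.0)  # mass concentrates on the distribution centered at 0
```

Applied to every dimension of a node feature vector, this turns a feature of size D into one of size 11 × D before graph-level pooling.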
GIN Predictor

class dgllife.model.model_zoo.gin_predictor.GINPredictor(num_node_emb_list, num_edge_emb_list, num_layers=5, emb_dim=300, JK='last', dropout=0.5, readout='mean', n_tasks=1)
GIN-based model for regression and classification on graphs.
GIN was first introduced in How Powerful Are Graph Neural Networks for general graph property prediction problems. It was further extended in Strategies for Pre-training Graph Neural Networks for pre-training and semi-supervised learning on large-scale datasets.
For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.
 Parameters
num_node_emb_list (list of int) – num_node_emb_list[i] gives the number of items to embed for the i-th categorical node feature variable. E.g. num_node_emb_list[0] can be the number of atom types and num_node_emb_list[1] can be the number of atom chirality types.
num_edge_emb_list (list of int) – num_edge_emb_list[i] gives the number of items to embed for the i-th categorical edge feature variable. E.g. num_edge_emb_list[0] can be the number of bond types and num_edge_emb_list[1] can be the number of bond direction types.
num_layers (int) – Number of GIN layers to use. Default to 5.
emb_dim (int) – The size of each embedding vector. Default to 300.
JK (str) – JK for jumping knowledge as in Representation Learning on Graphs with Jumping Knowledge Networks. It decides how we are going to combine the all-layer node representations for the final output. There can be four options for this argument, 'concat', 'last', 'max' and 'sum'. Default to 'last'.
'concat': concatenate the output node representations from all GIN layers
'last': use the node representations from the last GIN layer
'max': apply max pooling to the node representations across all GIN layers
'sum': sum the output node representations from all GIN layers
dropout (float) – Dropout to apply to the output of each GIN layer. Default to 0.5.
readout (str) – Readout for computing graph representations out of node representations, which can be 'sum', 'mean', 'max', 'attention', or 'set2set'. Default to 'mean'.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

forward(g, categorical_node_feats, categorical_edge_feats)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs
categorical_node_feats (list of LongTensor of shape (N)) –
Input categorical node features
len(categorical_node_feats) should be the same as len(num_node_emb_list)
N is the total number of nodes in the batch of graphs
categorical_edge_feats (list of LongTensor of shape (E)) –
Input categorical edge features
len(categorical_edge_feats) should be the same as len(num_edge_emb_list) in the arguments
E is the total number of edges in the batch of graphs
 Returns
Predictions on graphs
B for the number of graphs in the batch
 Return type
FloatTensor of shape (B, n_tasks)
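The four JK options of GINPredictor can be illustrated on plain Python lists. In this sketch, layer_outputs[l][v] holds the feature list of node v after layer l. The function name is hypothetical, and whether the input embeddings count as a layer is not specified here.

```python
def jumping_knowledge(layer_outputs, mode="last"):
    """Combine per-layer node representations into one representation per node."""
    num_nodes = len(layer_outputs[0])
    if mode == "last":
        return [list(h) for h in layer_outputs[-1]]
    if mode == "concat":
        return [[x for layer in layer_outputs for x in layer[v]]
                for v in range(num_nodes)]
    if mode in ("max", "sum"):
        reduce_fn = max if mode == "max" else sum
        # zip(*...) groups the values of one feature dimension across all layers
        return [[reduce_fn(vals) for vals in zip(*(layer[v] for layer in layer_outputs))]
                for v in range(num_nodes)]
    raise ValueError(f"unknown JK mode: {mode}")

# One node, two layers, two features per layer
layers = [[[1.0, 5.0]], [[3.0, 2.0]]]
combined = {mode: jumping_knowledge(layers, mode)
            for mode in ("last", "concat", "max", "sum")}
```

Only 'concat' changes the feature size: with L layers of size D it yields L × D features per node, while the other modes keep D.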
GNN OGB Predictor

class dgllife.model.model_zoo.gnn_ogb_predictor.GNNOGBPredictor(in_edge_feats, num_node_types=1, hidden_feats=300, n_layers=5, n_tasks=1, batchnorm=True, activation=<function relu>, dropout=0.0, gnn_type='gcn', virtual_node=True, residual=False, jk=False, readout='mean')
Variant of GCN/GIN from Open Graph Benchmark: Datasets for Machine Learning on Graphs for graph property prediction.
 Parameters
in_edge_feats (int) – Number of input edge features.
num_node_types (int) – Number of node types to embed. (Default: 1)
hidden_feats (int) – Size for hidden representations. (Default: 300)
n_layers (int) – Number of GNN layers to use. (Default: 5)
n_tasks (int) – Number of output tasks. (Default: 1)
batchnorm (bool) – Whether to apply batch normalization. (Default: True)
activation (callable or None) – Activation function to apply to the output of each GNN layer except for the last layer. If None, no activation will be applied. (Default: ReLU)
dropout (float) – The probability for dropout. (Default: 0, i.e. no dropout)
gnn_type (str) – The GNN type to use, which can be either ‘gcn’ or ‘gin’. (Default: ‘gcn’)
virtual_node (bool) – Whether to use virtual node. (Default: True)
residual (bool) – Whether to apply residual connections for virtual node embeddings. (Default: False)
jk (bool) – Whether to sum over the output of all GNN layers as in JK networks. (Default: False)
readout (str) – The readout function for computing graph-level representations out of node representations, which can be ‘mean’, ‘sum’ or ‘max’. (Default: ‘mean’)

forward(g, node_feats, edge_feats)
Predict graph properties.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs
node_feats (LongTensor of shape (N, 1)) – Input categorical node features. N for the number of nodes.
edge_feats (FloatTensor of shape (E, in_edge_feats)) – Input edge features. E for the number of edges.
 Returns
Predicted graph properties
 Return type
FloatTensor of shape (B, n_tasks)
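A readout turns the node representations of a batch of graphs into one vector per graph. The plain-Python sketch below shows the ‘mean’, ‘sum’ and ‘max’ variants; it assumes each node carries the index of the graph it belongs to, and the function and argument names are hypothetical rather than taken from dgllife.

```python
def readout(node_feats, graph_ids, num_graphs, mode="mean"):
    """Pool node feature lists into one feature list per graph."""
    if mode not in ("mean", "sum", "max"):
        raise ValueError(f"unknown readout: {mode}")
    # Group the nodes by the graph they belong to
    groups = [[] for _ in range(num_graphs)]
    for feats, gid in zip(node_feats, graph_ids):
        groups[gid].append(feats)
    pooled = []
    for nodes in groups:
        columns = list(zip(*nodes))  # one tuple per feature dimension
        if mode == "sum":
            pooled.append([sum(c) for c in columns])
        elif mode == "mean":
            pooled.append([sum(c) / len(c) for c in columns])
        else:
            pooled.append([max(c) for c in columns])
    return pooled

# Three nodes in a batch of two graphs: nodes 0 and 1 in graph 0, node 2 in graph 1
node_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
graph_ids = [0, 0, 1]
mean_repr = readout(node_feats, graph_ids, 2, "mean")
sum_repr = readout(node_feats, graph_ids, 2, "sum")
max_repr = readout(node_feats, graph_ids, 2, "max")
```

The pooled vectors have the same size as the node representations, so the choice of readout does not change the input size of the final predictor.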
Neural Fingerprint Predictor

class dgllife.model.model_zoo.nf_predictor.NFPredictor(in_feats, n_tasks=1, hidden_feats=None, max_degree=10, activation=None, batchnorm=None, dropout=None, predictor_hidden_size=128, predictor_batchnorm=True, predictor_dropout=0.0, predictor_activation=<builtin method tanh of type object>)
Neural Fingerprint (NF) for regression and classification on graphs.
NF is introduced in Convolutional Networks on Graphs for Learning Molecular Fingerprints. This model can be used for regression and classification on graphs.
After updating node representations, we perform a sum and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.
For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.
 Parameters
in_feats (int) – Number of input node features.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
hidden_feats (list of int, optional) – hidden_feats[i] gives the size of node representations after the i-th NF layer. len(hidden_feats) equals the number of NF layers. By default, we use [64, 64].
max_degree (int) – The maximum node degree to consider when updating weights. Default to be 10.
activation (list of activation functions or None) – If not None, activation[i] gives the activation function to be used for the i-th NF layer. len(activation) equals the number of NF layers. By default, ReLU is applied for all NF layers.
batchnorm (list of bool, optional) – batchnorm[i] decides if batch normalization is to be applied on the output of the i-th NF layer. len(batchnorm) equals the number of NF layers. By default, batch normalization is applied for all NF layers.
dropout (list of float, optional) – dropout[i] decides the dropout to be applied on the output of the i-th NF layer. len(dropout) equals the number of NF layers. By default, dropout is not applied for all NF layers.
predictor_hidden_size (int) – Size for hidden representations in the output MLP predictor. Default to be 128.
predictor_batchnorm (bool) – Whether to apply batch normalization in the output MLP predictor. Default to be True.
predictor_dropout (float) – The dropout probability in the output MLP predictor. Default to be 0.
predictor_activation (activation function) – The activation function in the output MLP predictor. Default to be Tanh.

forward(g, feats)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs
feats (FloatTensor of shape (N, M1)) –
N is the total number of nodes in the batch of graphs
M1 is the input node feature size, which equals in_feats in initialization
 Returns
Predictions on graphs
B for the number of graphs in the batch
 Return type
FloatTensor of shape (B, n_tasks)
Path-Augmented Graph Transformer Predictor

class dgllife.model.model_zoo.pagtn_predictor.PAGTNPredictor(node_in_feats, node_out_feats, node_hid_feats, edge_feats, depth=5, nheads=1, dropout=0.1, activation=LeakyReLU(negative_slope=0.2), n_tasks=1, mode='sum')
PAGTN model for regression and classification on graphs.
PAGTN is introduced in Path-Augmented Graph Transformer Network.
 Parameters
node_in_feats (int) – Size for the input node features.
node_out_feats (int) – Size for the output node features in PAGTN layers.
node_hid_feats (int) – Size for the hidden node features in PAGTN layers.
edge_feats (int) – Size for the input edge features.
depth (int) – Number of PAGTN layers to be applied. Default to 5.
nheads (int) – Number of attention heads. Default to 1.
dropout (float) – The probability for performing dropout. Default to 0.1.
activation (callable) – Activation function to apply. Default to LeakyReLU.
n_tasks (int) – Number of tasks, which is also the output size. Default to 1.
mode ('max' or 'mean' or 'sum') – Whether to compute elementwise maximum, mean or sum of the node representations. Default to 'sum'.

forward(g, node_feats, edge_feats)
Graph-level regression/soft classification.
 Parameters
g (DGLGraph) – DGLGraph for a batch of graphs
node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes in the batch of graphs.
edge_feats (float32 tensor of shape (E, D)) – Input edge features. E for the number of edges in the batch of graphs. D is the input edge feature size, which equals edge_feats in initialization.
 Returns
Predictions on graphs. B for the number of graphs in the batch.
 Return type
float32 tensor of shape (B, n_tasks)
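The effect of the mode argument can be illustrated without DGL. The following is a minimal sketch in plain Python, not the library's implementation: the helper readout and its nodes_per_graph argument are hypothetical names, and node features are assumed to be stored graph by graph in one flat list, as they are in a batched graph.

```python
def readout(node_feats, nodes_per_graph, mode="sum"):
    """Pool per-node feature rows into one feature row per graph.

    Hypothetical helper for illustration only; it is not part of dgllife.
    node_feats holds the rows of all graphs back to back.
    """
    reducers = {
        "sum": sum,
        "max": max,
        "mean": lambda column: sum(column) / len(column),
    }
    if mode not in reducers:
        raise ValueError(f"mode must be one of {sorted(reducers)}, got {mode!r}")
    reduce_column = reducers[mode]

    pooled, start = [], 0
    for count in nodes_per_graph:
        rows = node_feats[start:start + count]
        # zip(*rows) iterates over the feature columns of this graph's nodes.
        pooled.append([reduce_column(column) for column in zip(*rows)])
        start += count
    return pooled


# Two graphs in the batch: the first has 2 nodes, the second has 1 node.
node_feats = [[1.0, 4.0], [3.0, 2.0], [5.0, 6.0]]
nodes_per_graph = [2, 1]

graph_sum = readout(node_feats, nodes_per_graph, "sum")
graph_max = readout(node_feats, nodes_per_graph, "max")
graph_mean = readout(node_feats, nodes_per_graph, "mean")
```

Whatever the mode, the result has one row per graph, which is why a graph-level predictor produces B rows for a batch of B graphs.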
Generative Models¶
DGMG¶

class
dgllife.model.model_zoo.dgmg.
DGMG
(atom_types, bond_types, node_hidden_size=128, num_prop_rounds=2, dropout=0.2)[source]¶ DGMG model
The model is introduced in Learning Deep Generative Models of Graphs.
Users only need to initialize an instance of this class.
 Parameters
atom_types (list) – E.g. ['C', 'N'].
bond_types (list) – E.g. [Chem.rdchem.BondType.SINGLE, Chem.rdchem.BondType.DOUBLE, Chem.rdchem.BondType.TRIPLE, Chem.rdchem.BondType.AROMATIC].
node_hidden_size (int) – Size of atom representation. Default to 128.
num_prop_rounds (int) – Number of message passing rounds performed each time the graph representations are updated. Default to 2.
dropout (float) – Probability for dropout. Default to 0.2.

add_node_and_update
(a=None)[source]¶ Decide whether to add a new atom. If a new atom should be added, update the graph.

choose_dest_and_update
(bond_type, a=None)[source]¶ Choose destination and connect it to the latest atom. Add edges for both directions and update the graph.

forward
(actions=None, rdkit_mol=False, compute_log_prob=False, max_num_steps=400)[source]¶
 Parameters
actions (list of 2-tuples or None) – If actions are not None, generate a molecule according to actions. Otherwise, a molecule will be generated based on sampled actions.
rdkit_mol (bool) – Whether to maintain a Chem.rdchem.Mol object. This brings extra computational cost, but is necessary if we want to retrieve the generated molecule.
compute_log_prob (bool) – Whether to compute log likelihood
max_num_steps (int) – Maximum number of steps allowed. This only takes effect during inference and prevents generation from running indefinitely.
 Returns
torch.tensor consisting of a float only, optional – The log likelihood for the actions taken
str, optional – The generated molecule in the form of SMILES

get_log_prob
()[source]¶ Compute the log likelihood for the decision sequence, typically corresponding to the generation of a molecule.
 Returns
The log likelihood for the decision sequence
 Return type
torch.tensor consisting of a float only

prepare_log_prob
(compute_log_prob)[source]¶ Set up the model for returning log likelihood
 Parameters
compute_log_prob (bool) – Whether to compute log likelihood

teacher_forcing
(actions)[source]¶ Generate a molecule according to a sequence of actions.
 Parameters
actions (list of 2-tuples of int) –
actions[t] gives (i, j), the action for DGMG to execute at timestep t.
If i = 0, j specifies either the type of the atom to add or termination.
If i = 1, j specifies either the type of the bond to add or termination.
If i = 2, j specifies the destination atom id for the bond to add. With the formulation of DGMG, atom j must have been created before the decision.
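The action encoding can be made concrete with a small checker. This is a sketch under a stated assumption, not the library's own validation: the text above does not say how termination is encoded, so the code assumes it is the index equal to the number of atom (or bond) types. The function validate_actions is a hypothetical name.

```python
def validate_actions(actions, num_atom_types, num_bond_types):
    """Check a DGMG-style decision sequence and return the number of atoms added.

    Assumption (not stated in the documentation above): termination is encoded
    as j == num_atom_types for i = 0 and j == num_bond_types for i = 1.
    """
    num_atoms = 0
    for t, (i, j) in enumerate(actions):
        if i == 0 and 0 <= j <= num_atom_types:
            if j < num_atom_types:
                # A real atom type: a new atom is added to the graph.
                num_atoms += 1
        elif i == 1 and 0 <= j <= num_bond_types:
            # Either a bond type or the "stop adding bonds" decision.
            pass
        elif i == 2 and 0 <= j < num_atoms:
            # The destination atom must have been created already.
            pass
        else:
            raise ValueError(f"invalid action {(i, j)} at timestep {t}")
    return num_atoms


# With atom_types = ['C', 'N'] and two bond types (index 0 taken as single):
actions = [
    (0, 0),  # add a C atom (atom 0)
    (1, 2),  # stop adding bonds for atom 0
    (0, 1),  # add an N atom (atom 1)
    (1, 0),  # add a bond of type 0
    (2, 0),  # ... whose destination is atom 0
    (1, 2),  # stop adding bonds for atom 1
    (0, 2),  # stop adding atoms
]
num_atoms = validate_actions(actions, num_atom_types=2, num_bond_types=2)
```

The checker only verifies index ranges and that destinations already exist; it does not enforce the order in which decisions of type 0, 1 and 2 may follow each other.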
JTNNVAE¶

class
dgllife.model.model_zoo.jtvae.
JTNNVAE
(vocab, hidden_size, latent_size, depth, stereo=True)[source]¶ 
forward
(batch_trees, batch_tree_graphs, batch_mol_graphs, stereo_cand_batch_idx, stereo_cand_labels, batch_stereo_cand_graphs, beta=0)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Reaction Prediction
WLN for Reaction Center Prediction¶

class
dgllife.model.model_zoo.wln_reaction_center.
WLNReactionCenter
(node_in_feats, edge_in_feats, node_pair_in_feats, node_out_feats=300, n_layers=3, n_tasks=5)[source]¶ Weisfeiler-Lehman Network (WLN) for Reaction Center Prediction.
The model is introduced in Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network.
The model uses WLN to update atom representations and then predicts the score for each pair of atoms to form a bond.
 Parameters
node_in_feats (int) – Size for the input node features.
edge_in_feats (int) – Size for the input edge features.
node_out_feats (int) – Size for the output node representations. Default to 300.
node_pair_in_feats (int) – Size for the input features of node pairs.
n_layers (int) – Number of rounds of message passing. Note that the same parameters are shared across all n_layers rounds. Default to 3.
n_tasks (int) – Number of tasks for prediction. Default to 5.

forward
(batch_mol_graphs, batch_complete_graphs, node_feats, edge_feats, node_pair_feats)[source]¶ Predict score for each pair of nodes.
 Parameters
batch_mol_graphs (DGLGraph) – A batch of molecular graphs.
batch_complete_graphs (DGLGraph) – A batch of fully connected graphs.
node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes.
edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges.
node_pair_feats (float32 tensor of shape (E_full, node_pair_in_feats)) – Input features for each pair of nodes. E_full for the number of edges in the batch of complete graphs.
 Returns
scores (float32 tensor of shape (E_full, 5)) – Predicted scores for each pair of atoms to perform one of the following 5 actions in reaction:
The bond between them gets broken
Forming a single bond
Forming a double bond
Forming a triple bond
Forming an aromatic bond
biased_scores (float32 tensor of shape (E_full, 5)) – Compared to scores, a bias is added if both entries of the pair are the same atom.
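To see where E_full comes from, the node pairs of a batch of complete graphs can be enumerated in plain Python. This is a sketch under two assumptions that the text above only implies: the complete graphs contain self-pairs (since biased_scores treats pairs of a same atom specially), and node ids are offset per molecule as in a batched graph. The helper complete_graph_pairs and the pair ordering are hypothetical.

```python
def complete_graph_pairs(num_atoms_per_mol):
    """Enumerate (u, v) node pairs for a batch of fully connected graphs.

    Hypothetical helper: self-pairs (u == v) are assumed to be included and
    node ids are shifted so that each molecule gets its own id range.
    """
    pairs, offset = [], 0
    for n in num_atoms_per_mol:
        pairs.extend((offset + u, offset + v) for u in range(n) for v in range(n))
        offset += n
    return pairs


# A batch with a 2-atom molecule and a 3-atom molecule.
pairs = complete_graph_pairs([2, 3])
e_full = len(pairs)  # 2 * 2 + 3 * 3 pairs in total

# Pairs made of the same atom, i.e. the entries that would receive a bias.
self_pair_mask = [u == v for u, v in pairs]
```

Under these assumptions E_full grows quadratically with molecule size, and no pair ever connects atoms from different molecules.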
WLN for Ranking Candidate Products¶

class
dgllife.model.model_zoo.wln_reaction_ranking.
WLNReactionRanking
(node_in_feats, edge_in_feats, node_hidden_feats=500, num_encode_gnn_layers=3)[source]¶ Weisfeiler-Lehman Network (WLN) for Candidate Product Ranking
The model is introduced in Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network and then further improved in A graph-convolutional neural network model for the prediction of chemical reactivity.
The model updates representations of nodes in candidate products with WLN and predicts the score for candidate products to be the real product.
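 Parameters
node_in_feats (int) – Size for the input node features.
edge_in_feats (int) – Size for the input edge features.
node_hidden_feats (int) – Size for the hidden node representations. Default to 500.
num_encode_gnn_layers (int) – Number of WLN layers for encoding node representations. Default to 3.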

forward
(reactant_graph, reactant_node_feats, reactant_edge_feats, product_graphs, product_node_feats, product_edge_feats, candidate_scores, batch_num_candidate_products)[source]¶ Predicts the score for candidate products to be the true product
 Parameters
reactant_graph (DGLGraph) – DGLGraph for a batch of reactants.
reactant_node_feats (float32 tensor of shape (V1, node_in_feats)) – Input node features for the reactants. V1 for the number of nodes.
reactant_edge_feats (float32 tensor of shape (E1, edge_in_feats)) – Input edge features for the reactants. E1 for the number of edges in reactant_graph.
product_graphs (DGLGraph) – DGLGraph for the candidate products in a batch of reactions.
product_node_feats (float32 tensor of shape (V2, node_in_feats)) – Input node features for the candidate products. V2 for the number of nodes.
product_edge_feats (float32 tensor of shape (E2, edge_in_feats)) – Input edge features for the candidate products. E2 for the number of edges in the graphs for candidate products.
candidate_scores (float32 tensor of shape (B, 1)) – Scores for candidate products based on the model for reaction center prediction. B for the total number of candidate products in the batch.
batch_num_candidate_products (list of int) – Number of candidate products for the reactions in the batch
 Returns
Predicted scores for candidate products
 Return type
float32 tensor of shape (B, 1)
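Because the scores for all reactions come back in one tensor, they have to be split per reaction before the best candidate can be picked. The following sketch shows one way to do this in plain Python. It assumes that B is the total number of candidate products in the batch and that scores are laid out reaction by reaction in the order of batch_num_candidate_products. The helper top_candidate_per_reaction is a hypothetical name, and the scores are made-up numbers.

```python
def top_candidate_per_reaction(scores, batch_num_candidate_products):
    """Return, for each reaction, the local index of its best-scoring candidate.

    Hypothetical helper: scores is a flat list with one entry per candidate,
    assumed to be grouped by reaction in batch order.
    """
    if len(scores) != sum(batch_num_candidate_products):
        raise ValueError("expected one score per candidate product")

    best, start = [], 0
    for count in batch_num_candidate_products:
        segment = scores[start:start + count]
        # Index of the maximum within this reaction's own candidates.
        best.append(max(range(count), key=segment.__getitem__))
        start += count
    return best


# Two reactions with 3 and 2 candidate products respectively.
scores = [0.1, 2.0, -1.0, 0.5, 0.3]
batch_num_candidate_products = [3, 2]
best = top_candidate_per_reaction(scores, batch_num_candidate_products)
```

With a real output tensor of shape (B, 1), the same splitting would apply after flattening the tensor into a list of B values.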
Protein-Ligand Binding Affinity Prediction
ACNN¶

class
dgllife.model.model_zoo.acnn.
ACNN
(hidden_sizes=None, weight_init_stddevs=None, dropouts=None, features_to_use=None, radial=None, num_tasks=1)[source]¶ Atomic Convolutional Networks.
The model was proposed in Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity.
The prediction proceeds as follows:
Perform message passing to update atom representations for the ligand, protein and protein-ligand complex.
Predict the energy of atoms from their representations with an MLP.
Take the sum of predicted energy of atoms within each molecule for predicted energy of the ligand, protein and protein-ligand complex.
Make the final prediction by subtracting the predicted ligand and protein energy from the predicted complex energy.
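The last two steps amount to simple arithmetic on per-atom energies. The sketch below uses made-up energy values and a hypothetical helper named binding_energy. In the model itself the per-atom energies come from the MLP.

```python
def binding_energy(ligand_atom_energies, protein_atom_energies, complex_atom_energies):
    """Combine per-atom energies into a single binding prediction.

    Hypothetical helper for illustration: each argument is a list of predicted
    per-atom energies for one ligand, one protein and their complex.
    """
    ligand_energy = sum(ligand_atom_energies)
    protein_energy = sum(protein_atom_energies)
    complex_energy = sum(complex_atom_energies)
    # Final prediction: complex energy minus the energies of the two parts.
    return complex_energy - ligand_energy - protein_energy


# Made-up per-atom energies for a 2-atom ligand and a 2-atom protein.
prediction = binding_energy(
    ligand_atom_energies=[-1.0, -2.0],
    protein_atom_energies=[-3.0, -0.5],
    complex_atom_energies=[-1.5, -2.5, -3.5, -1.0],
)
```

Here the complex sums to -8.5, the ligand to -3.0 and the protein to -3.5, so the prediction is -2.0.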
 Parameters
hidden_sizes (list of int) – hidden_sizes[i] gives the size of hidden representations in the i-th hidden layer of the MLP. By default, [32, 32, 16] will be used.
weight_init_stddevs (list of float) – weight_init_stddevs[i] gives the std to initialize parameters in the i-th layer of the MLP. Note that len(weight_init_stddevs) == len(hidden_sizes) + 1 due to the output layer. By default, we use 1 / sqrt(hidden_sizes[i]) for hidden layers and 0.01 for the output layer.
dropouts (list of float) – dropouts[i] gives the dropout in the i-th hidden layer of the MLP. By default, no dropout is used.
features_to_use (None or float tensor of shape (T)) – In the original paper, these are atomic numbers to consider, representing the types of atoms. T for the number of types of atomic numbers. If None, we use same parameters for all atoms regardless of their type. Default to None.
radial (list) – The list consists of 3 sublists of floats, separately for the options of interaction cutoff, the options of rbf kernel mean and the options of rbf kernel scaling. By default, [[12.0], [0.0, 2.0, 4.0, 6.0, 8.0], [4.0]] will be used.
num_tasks (int) – Number of output tasks. Default to 1.
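The default rule for weight_init_stddevs can be written out as a short computation. This is a sketch that follows the rule as described above. The helper default_weight_init_stddevs is a hypothetical name, not a function exposed by the library.

```python
import math


def default_weight_init_stddevs(hidden_sizes, output_std=0.01):
    """Build one std per MLP layer: hidden layers first, output layer last.

    Hypothetical helper mirroring the documented default: 1 / sqrt(size) for
    each hidden layer and 0.01 for the output layer.
    """
    return [1.0 / math.sqrt(size) for size in hidden_sizes] + [output_std]


# The documented default hidden sizes.
hidden_sizes = [32, 32, 16]
stddevs = default_weight_init_stddevs(hidden_sizes)
```

The resulting list has len(hidden_sizes) + 1 entries, matching the length constraint stated for weight_init_stddevs.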

forward
(graph)[source]¶ Apply the model for prediction.
 Parameters
graph (DGLHeteroGraph) – DGLHeteroGraph consisting of the ligand graph, the protein graph and the complex graph, along with preprocessed features. For a batch of protein-ligand pairs, we assume zero padding is performed so that the number of ligand and protein atoms is the same in all pairs.
 Returns
Predicted protein-ligand binding affinity. B for the number of protein-ligand pairs in the batch and O for the number of tasks.
 Return type
Float32 tensor of shape (B, O)
PotentialNet¶

class
dgllife.model.model_zoo.potentialnet.
PotentialNet
(f_in, f_bond, f_spatial, f_gather, n_etypes, n_bond_conv_steps, n_spatial_conv_steps, n_rows_fc, dropouts)[source]¶ Protein-ligand binding affinity prediction using a 'staged gated graph neural network' introduced in PotentialNet for Molecular Property Prediction.
 Parameters
f_in (int) – The dimension size of input features to GatedGraphConv, equivalent to the dimension size of atomic features in the molecular graph.
f_bond (int) – The dimension size of the output from GatedGraphConv in stage 1, equivalent to the dimension size of input to the linear layer at the end of stage 1.
f_spatial (int) – The dimension size of the output from GatedGraphConv in stage 2, equivalent to the dimension size of input to the linear layer at the end of stage 2.
f_gather (int) – The dimension size of the output from stage 1 & 2, equivalent to the dimension size of output from the linear layer at the end of stage 1 & 2.
n_etypes (int) – The number of heterogeneous edge types for stage 2. This includes the number of covalent bond types from stage 1 and the number of spatial edge types based on distances. Default to 9 (5 covalent bond types in stage 1 plus 4 distance bins in stage 2).
n_bond_conv_steps (int) – The number of bond convolution layers (steps) of GatedGraphConv in stage 1.
n_spatial_conv_steps (int) – The number of spatial convolution layers (steps) of GatedGraphConv in stage 2.
n_rows_fc (list of int) – The widths of the fully connected neural networks at each layer in stage 3.
dropouts (list of 3 floats) – The amount of dropout applied at the end of each stage.

forward
(bigraph, knn_graph)[source]¶ Compute the prediction on graphs using PotentialNet model.
 Parameters
bigraph (DGLGraph) – The molecular graph for stage 1 of PotentialNet, with bigraph.ndata['h'] being the input node features and bigraph.edata['e'] being the one-hot encoding of the edge types.
knn_graph (DGLGraph) – The k-nearest-neighbor graph for stage 2 of PotentialNet, with no initial node features and knn_graph.edata['e'] being the one-hot encoding of the edge types.
 Returns
x – The prediction based on the input features and graphs. For the task of binding affinity prediction, the shape is (B, 1), where B is the batch size.
 Return type
torch.Tensor