Model Zoo

This section introduces complete models for various downstream tasks.

Building Blocks

MLP Predictor

class dgllife.model.model_zoo.mlp_predictor.MLPPredictor(in_feats, hidden_feats, n_tasks, dropout=0.0)[source]

Two-layer MLP for regression or soft classification over multiple tasks from graph representations.

For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.

Parameters
  • in_feats (int) – Number of input graph features

  • hidden_feats (int) – Number of graph features in hidden layers

  • n_tasks (int) – Number of tasks, which is also the output size.

  • dropout (float) – The probability for dropout. Default to be 0., i.e. no dropout is performed.

forward(feats)[source]

Make prediction.

Parameters

feats (FloatTensor of shape (B, M3)) –

  • B is the number of graphs in a batch

  • M3 is the input graph feature size, which must match in_feats in initialization

Returns

  • Predictions on graphs

  • B for the number of graphs in the batch

Return type

FloatTensor of shape (B, n_tasks)
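
For illustration, a minimal usage sketch (the feature size, batch size and number of tasks below are arbitrary, not prescribed by the API):

    import torch
    from dgllife.model.model_zoo.mlp_predictor import MLPPredictor

    # 64-dim graph representations, 2 prediction tasks (arbitrary sizes)
    model = MLPPredictor(in_feats=64, hidden_feats=128, n_tasks=2, dropout=0.1)
    graph_feats = torch.randn(4, 64)   # a batch of 4 graph representations
    logits = model(graph_feats)        # FloatTensor of shape (4, 2)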

Molecular Property Prediction

AttentiveFP Predictor

class dgllife.model.model_zoo.attentivefp_predictor.AttentiveFPPredictor(node_feat_size, edge_feat_size, num_layers=2, num_timesteps=2, graph_feat_size=200, n_tasks=1, dropout=0.0)[source]

AttentiveFP for regression and classification on graphs.

AttentiveFP is introduced in Pushing the Boundaries of Molecular Representation for Drug Discovery with the Graph Attention Mechanism.

Parameters
  • node_feat_size (int) – Size for the input node features.

  • edge_feat_size (int) – Size for the input edge features.

  • num_layers (int) – Number of GNN layers. Default to 2.

  • num_timesteps (int) – Times of updating the graph representations with GRU. Default to 2.

  • graph_feat_size (int) – Size for the learned graph representations. Default to 200.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • dropout (float) – Probability for performing the dropout. Default to 0.

forward(g, node_feats, edge_feats, get_node_weight=False)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_feats (float32 tensor of shape (V, node_feat_size)) – Input node features. V for the number of nodes.

  • edge_feats (float32 tensor of shape (E, edge_feat_size)) – Input edge features. E for the number of edges.

  • get_node_weight (bool) – Whether to get the weights of atoms during readout. Default to False.

Returns

  • float32 tensor of shape (G, n_tasks) – Prediction for the graphs in the batch. G for the number of graphs.

  • node_weights (list of float32 tensor of shape (V, 1), optional) – This is returned when get_node_weight is True. The list has length num_timesteps and node_weights[i] gives the node weights in the i-th update.
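
For illustration, a minimal sketch on a toy batched graph; in practice the graphs and features come from a molecular featurization pipeline, and the 39/10 feature sizes below are arbitrary:

    import dgl
    import torch
    from dgllife.model.model_zoo.attentivefp_predictor import AttentiveFPPredictor

    # toy bidirected triangle standing in for a molecular graph
    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_feats = torch.randn(bg.num_nodes(), 39)
    edge_feats = torch.randn(bg.num_edges(), 10)

    model = AttentiveFPPredictor(node_feat_size=39, edge_feat_size=10, n_tasks=1)
    preds = model(bg, node_feats, edge_feats)               # shape (2, 1)
    preds, node_weights = model(bg, node_feats, edge_feats,
                                get_node_weight=True)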

GAT Predictor

class dgllife.model.model_zoo.gat_predictor.GATPredictor(in_feats, hidden_feats=None, num_heads=None, feat_drops=None, attn_drops=None, alphas=None, residuals=None, agg_modes=None, activations=None, biases=None, classifier_hidden_feats=128, classifier_dropout=0.0, n_tasks=1, predictor_hidden_feats=128, predictor_dropout=0.0)[source]

GAT-based model for regression and classification on graphs.

GAT is introduced in Graph Attention Networks. This model is based on GAT and can be used for regression and classification on graphs.

After updating node representations, we perform a weighted sum with learnable weights and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.

For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.

Parameters
  • in_feats (int) – Number of input node features

  • hidden_feats (list of int) – hidden_feats[i] gives the output size of an attention head in the i-th GAT layer. len(hidden_feats) equals the number of GAT layers. By default, we use [32, 32].

  • num_heads (list of int) – num_heads[i] gives the number of attention heads in the i-th GAT layer. len(num_heads) equals the number of GAT layers. By default, we use 4 attention heads for each GAT layer.

  • feat_drops (list of float) – feat_drops[i] gives the dropout applied to the input features in the i-th GAT layer. len(feat_drops) equals the number of GAT layers. By default, this will be zero for all GAT layers.

  • attn_drops (list of float) – attn_drops[i] gives the dropout applied to attention values of edges in the i-th GAT layer. len(attn_drops) equals the number of GAT layers. By default, this will be zero for all GAT layers.

  • alphas (list of float) – Hyperparameters in LeakyReLU, which are the slopes for negative values. alphas[i] gives the slope for negative value in the i-th GAT layer. len(alphas) equals the number of GAT layers. By default, this will be 0.2 for all GAT layers.

  • residuals (list of bool) – residuals[i] decides if residual connection is to be used for the i-th GAT layer. len(residuals) equals the number of GAT layers. By default, residual connection is performed for each GAT layer.

  • agg_modes (list of str) – The way to aggregate multi-head attention results for each GAT layer, which can be either ‘flatten’ for concatenating all-head results or ‘mean’ for averaging all-head results. agg_modes[i] gives the way to aggregate multi-head attention results for the i-th GAT layer. len(agg_modes) equals the number of GAT layers. By default, we flatten multi-head results for intermediate GAT layers and compute mean of multi-head results for the last GAT layer.

  • activations (list of activation function or None) – activations[i] gives the activation function applied to the aggregated multi-head results for the i-th GAT layer. len(activations) equals the number of GAT layers. By default, ELU is applied for intermediate GAT layers and no activation is applied for the last GAT layer.

  • biases (list of bool) – biases[i] gives whether to add bias for the i-th GAT layer. len(biases) equals the number of GAT layers. By default, bias is added for all GAT layers.

  • classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size of hidden graph representations in the classifier. Default to 128.

  • classifier_dropout (float) – (Deprecated, see predictor_dropout) The probability for dropout in the classifier. Default to 0.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 128.

  • predictor_dropout (float) – The probability for dropout in the output MLP predictor. Default to 0.

forward(bg, feats)[source]

Graph-level regression/soft classification.

Parameters
  • bg (DGLGraph) – DGLGraph for a batch of graphs.

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which must match in_feats in initialization

Returns

  • Predictions on graphs

  • B for the number of graphs in the batch

Return type

FloatTensor of shape (B, n_tasks)
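
For illustration, a minimal sketch with arbitrary sizes; the toy graph is bidirected so that every node has incoming edges, which the underlying GAT layers require by default:

    import dgl
    import torch
    from dgllife.model.model_zoo.gat_predictor import GATPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g, g])
    feats = torch.randn(bg.num_nodes(), 74)   # 74 = in_feats (arbitrary)

    model = GATPredictor(in_feats=74, hidden_feats=[32, 32],
                         num_heads=[4, 4], n_tasks=1)
    logits = model(bg, feats)                 # FloatTensor of shape (3, 1)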

GATv2 Predictor

class dgllife.model.model_zoo.gatv2_predictor.GATv2Predictor(in_feats, hidden_feats=None, num_heads=None, feat_drops=None, attn_drops=None, alphas=None, residuals=None, activations=None, allow_zero_in_degree=False, biases=None, share_weights=None, agg_modes=None, n_tasks=1, predictor_out_feats=128, predictor_dropout=0.0)[source]

GATv2-based model for regression and classification on graphs.

GATv2 is introduced in How Attentive Are Graph Attention Networks? (https://arxiv.org/pdf/2105.14491.pdf). This model is based on GATv2 and can be used for regression and classification on graphs.

After updating node representations, we perform a weighted sum with learnable weights and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.

For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.

Parameters
  • in_feats (int) – Number of input node features

  • hidden_feats (list of int, optional) – hidden_feats[i] gives the output size of an attention head in the i-th GATv2 layer. len(hidden_feats) equals the number of GATv2 layers. By default, we use [32, 32].

  • num_heads (list of int, optional) – num_heads[i] gives the number of attention heads in the i-th GATv2 layer. len(num_heads) equals the number of GATv2 layers. By default, we use 4 attention heads per GATv2 layer.

  • feat_drops (list of float, optional) – feat_drops[i] gives the dropout applied to the input features in the i-th GATv2 layer. len(feat_drops) equals the number of GATv2 layers. By default, we use zero for all GATv2 layers.

  • attn_drops (list of float, optional) – attn_drops[i] gives the dropout applied to the attention values of edges in the i-th GATv2 layer. len(attn_drops) equals the number of GATv2 layers. By default, we use zero for all GATv2 layers.

  • alphas (list of float, optional) – alphas[i] gives the slope for the negative values in the LeakyReLU function of the i-th GATv2 layer. len(alphas) equals the number of GATv2 layers. By default, we use 0.2 for all GATv2 layers.

  • residuals (list of bool, optional) – residuals[i] decides if residual connection is to be used for the i-th GATv2 layer. len(residuals) equals the number of GATv2 layers. By default, we use False for all GATv2 layers.

  • activations (list of callable, optional) – activations[i] gives the activation function applied to the result of the i-th GATv2 layer. len(activations) equals the number of GATv2 layers. By default, we use ELU for all GATv2 layers, except for the last layer.

  • allow_zero_in_degree (bool, optional) – If the graph contains 0-in-degree nodes, the outputs for those nodes will be invalid since no messages are passed to them, which can cause a silent performance regression. By default, this module raises a DGLError when it detects 0-in-degree nodes in the input graph. Setting this to True suppresses the check and leaves the handling to the user. Default to False.

  • biases (list of bool, optional) – biases[i] decides if an additive bias is allowed to be learned by the i-th GATv2 layer. len(biases) equals the number of GATv2 layers. By default, additive biases are learned for all GATv2 layers.

  • share_weights (list of bool, optional) – share_weights[i] decides if the learnable weight matrix for source and destination nodes is the same in the i-th GATv2 layer. len(share_weights) equals the number of GATv2 Layers. By default, no weight sharing is used in all GATv2 layers.

  • agg_modes (list of str, optional) – agg_modes[i] gives the way to aggregate multi-head attention results in the i-th GATv2 layer. len(agg_modes) equals the number of GATv2 Layers. By default, we flatten all-head results for each GATv2 layer, except for the last layer.

  • n_tasks (int, optional) – Number of tasks, which is also the output size. Default to 1.

  • predictor_out_feats (int, optional) – Size for hidden representations in the output MLP predictor. Default to 128.

  • predictor_dropout (float, optional) – The probability for dropout in the output MLP predictor. Default to 0.

forward(bg, feats, get_attention=False)[source]

Graph-level regression/soft classification.

Parameters
  • bg (DGLGraph) – DGLGraph for a batch of graphs.

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs.

    • M1 is the input node feature size, which equals in_feats in initialization

  • get_attention (bool, optional) – Whether to return the attention values. Defaults: False

Returns

  • preds (FloatTensor of shape (B, n_tasks)) –

    • Predictions on graphs

    • B for the number of graphs in the batch

  • attentions (list of FloatTensor of shape (E, H, 1), optional) – It is returned when get_attention is True. attentions[i] gives the attention values in the i-th GATv2 layer.

    • E is the number of edges.

    • H is the number of attention heads.
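
For illustration, a minimal sketch with arbitrary sizes, also showing the optional attention output:

    import dgl
    import torch
    from dgllife.model.model_zoo.gatv2_predictor import GATv2Predictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    feats = torch.randn(bg.num_nodes(), 74)   # 74 = in_feats (arbitrary)

    model = GATv2Predictor(in_feats=74, hidden_feats=[32, 32],
                           num_heads=[4, 4], n_tasks=2)
    preds = model(bg, feats)                                  # shape (2, 2)
    preds, attentions = model(bg, feats, get_attention=True)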

GCN Predictor

class dgllife.model.model_zoo.gcn_predictor.GCNPredictor(in_feats, hidden_feats=None, gnn_norm=None, activation=None, residual=None, batchnorm=None, dropout=None, classifier_hidden_feats=128, classifier_dropout=0.0, n_tasks=1, predictor_hidden_feats=128, predictor_dropout=0.0)[source]

GCN-based model for regression and classification on graphs.

GCN is introduced in Semi-Supervised Classification with Graph Convolutional Networks. This model is based on GCN and can be used for regression and classification on graphs.

After updating node representations, we perform a weighted sum with learnable weights and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.

For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.

Parameters
  • in_feats (int) – Number of input node features.

  • hidden_feats (list of int) – hidden_feats[i] gives the size of node representations after the i-th GCN layer. len(hidden_feats) equals the number of GCN layers. By default, we use [64, 64].

  • gnn_norm (list of str) – gnn_norm[i] gives the message passing normalizer for the i-th GCN layer, which can be ‘right’, ‘both’ or ‘none’. The ‘right’ normalizer divides the aggregated messages by each node’s in-degree. The ‘both’ normalizer corresponds to the symmetric adjacency normalization in the original GCN paper. The ‘none’ normalizer simply sums the messages. len(gnn_norm) equals the number of GCN layers. By default, we use ['none', 'none'].

  • activation (list of activation functions or None) – If None, no activation will be applied. If not None, activation[i] gives the activation function to be used for the i-th GCN layer. len(activation) equals the number of GCN layers. By default, ReLU is applied for all GCN layers.

  • residual (list of bool) – residual[i] decides if residual connection is to be used for the i-th GCN layer. len(residual) equals the number of GCN layers. By default, residual connection is performed for each GCN layer.

  • batchnorm (list of bool) – batchnorm[i] decides if batch normalization is to be applied on the output of the i-th GCN layer. len(batchnorm) equals the number of GCN layers. By default, batch normalization is applied for all GCN layers.

  • dropout (list of float) – dropout[i] decides the dropout probability on the output of the i-th GCN layer. len(dropout) equals the number of GCN layers. By default, no dropout is performed for all layers.

  • classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size of hidden graph representations in the classifier. Default to 128.

  • classifier_dropout (float) – (Deprecated, see predictor_dropout) The probability for dropout in the classifier. Default to 0.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 128.

  • predictor_dropout (float) – The probability for dropout in the output MLP predictor. Default to 0.

forward(bg, feats)[source]

Graph-level regression/soft classification.

Parameters
  • bg (DGLGraph) – DGLGraph for a batch of graphs.

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which must match in_feats in initialization

Returns

  • Predictions on graphs

  • B for the number of graphs in the batch

Return type

FloatTensor of shape (B, n_tasks)
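
For illustration, a minimal sketch with arbitrary sizes on a toy bidirected graph:

    import dgl
    import torch
    from dgllife.model.model_zoo.gcn_predictor import GCNPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    feats = torch.randn(bg.num_nodes(), 74)   # 74 = in_feats (arbitrary)

    model = GCNPredictor(in_feats=74, hidden_feats=[64, 64], n_tasks=1)
    preds = model(bg, feats)                  # FloatTensor of shape (2, 1)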

MGCN Predictor

class dgllife.model.model_zoo.mgcn_predictor.MGCNPredictor(feats=128, n_layers=3, classifier_hidden_feats=64, n_tasks=1, num_node_types=100, num_edge_types=3000, cutoff=5.0, gap=1.0, predictor_hidden_feats=64)[source]

MGCN for regression and classification on graphs.

MGCN is introduced in Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective.

Parameters
  • feats (int) – Size for the node and edge embeddings to learn. Default to 128.

  • n_layers (int) – Number of gnn layers to use. Default to 3.

  • classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size for hidden representations in the classifier. Default to 64.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • num_node_types (int) – Number of node types to embed. Default to 100.

  • num_edge_types (int) – Number of edge types to embed. Default to 3000.

  • cutoff (float) – Largest center in RBF expansion. Default to 5.0

  • gap (float) – Difference between two adjacent centers in RBF expansion. Default to 1.0

  • predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 64.

forward(g, node_types, edge_dists)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_types (int64 tensor of shape (V)) – Node types to embed, V for the number of nodes.

  • edge_dists (float32 tensor of shape (E, 1)) – Distances between end nodes of edges, E for the number of edges.

Returns

Prediction for the graphs in the batch. G for the number of graphs.

Return type

float32 tensor of shape (G, n_tasks)
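
For illustration, a minimal sketch; the node types and distances below are random toy values, whereas in practice they come from 3D molecular conformations:

    import dgl
    import torch
    from dgllife.model.model_zoo.mgcn_predictor import MGCNPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_types = torch.randint(1, 10, (bg.num_nodes(),))   # toy atomic numbers
    edge_dists = torch.rand(bg.num_edges(), 1) * 5.0       # toy pairwise distances

    model = MGCNPredictor(feats=128, n_layers=3, n_tasks=1)
    preds = model(bg, node_types, edge_dists)               # shape (2, 1)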

MPNN Predictor

class dgllife.model.model_zoo.mpnn_predictor.MPNNPredictor(node_in_feats, edge_in_feats, node_out_feats=64, edge_hidden_feats=128, n_tasks=1, num_step_message_passing=6, num_step_set2set=6, num_layer_set2set=3)[source]

MPNN for regression and classification on graphs.

MPNN is introduced in Neural Message Passing for Quantum Chemistry.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • edge_in_feats (int) – Size for the input edge features.

  • node_out_feats (int) – Size for the output node representations. Default to 64.

  • edge_hidden_feats (int) – Size for the hidden edge representations. Default to 128.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • num_step_message_passing (int) – Number of message passing steps. Default to 6.

  • num_step_set2set (int) – Number of set2set steps. Default to 6.

  • num_layer_set2set (int) – Number of set2set layers. Default to 3.

forward(g, node_feats, edge_feats)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features.

Returns

Prediction for the graphs in the batch. G for the number of graphs.

Return type

float32 tensor of shape (G, n_tasks)
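
For illustration, a minimal sketch with arbitrary node/edge feature sizes on a toy graph:

    import dgl
    import torch
    from dgllife.model.model_zoo.mpnn_predictor import MPNNPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_feats = torch.randn(bg.num_nodes(), 39)
    edge_feats = torch.randn(bg.num_edges(), 10)

    model = MPNNPredictor(node_in_feats=39, edge_in_feats=10, n_tasks=1)
    preds = model(bg, node_feats, edge_feats)               # shape (2, 1)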

SchNet Predictor

class dgllife.model.model_zoo.schnet_predictor.SchNetPredictor(node_feats=64, hidden_feats=None, classifier_hidden_feats=64, n_tasks=1, num_node_types=100, cutoff=30.0, gap=0.1, predictor_hidden_feats=64)[source]

SchNet for regression and classification on graphs.

SchNet is introduced in SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.

Parameters
  • node_feats (int) – Size for node representations to learn. Default to 64.

  • hidden_feats (list of int) – hidden_feats[i] gives the size of hidden representations for the i-th interaction (gnn) layer. len(hidden_feats) equals the number of interaction (gnn) layers. Default to [64, 64, 64].

  • classifier_hidden_feats (int) – (Deprecated, see predictor_hidden_feats) Size for hidden representations in the classifier. Default to 64.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • num_node_types (int) – Number of node types to embed. Default to 100.

  • cutoff (float) – Largest center in RBF expansion. Default to 30.

  • gap (float) – Difference between two adjacent centers in RBF expansion. Default to 0.1.

  • predictor_hidden_feats (int) – Size for hidden representations in the output MLP predictor. Default to 64.

forward(g, node_types, edge_dists)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_types (int64 tensor of shape (V)) – Node types to embed, V for the number of nodes.

  • edge_dists (float32 tensor of shape (E, 1)) – Distances between end nodes of edges, E for the number of edges.

Returns

Prediction for the graphs in the batch. G for the number of graphs.

Return type

float32 tensor of shape (G, n_tasks)
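
For illustration, a minimal sketch mirroring the MGCN example above, with random toy node types and distances:

    import dgl
    import torch
    from dgllife.model.model_zoo.schnet_predictor import SchNetPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_types = torch.randint(1, 10, (bg.num_nodes(),))   # toy atomic numbers
    edge_dists = torch.rand(bg.num_edges(), 1) * 5.0       # toy pairwise distances

    model = SchNetPredictor(node_feats=64, n_tasks=1)
    preds = model(bg, node_types, edge_dists)               # shape (2, 1)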

Weave Predictor

class dgllife.model.model_zoo.weave_predictor.WeavePredictor(node_in_feats, edge_in_feats, num_gnn_layers=2, gnn_hidden_feats=50, gnn_activation=<function relu>, graph_feats=128, gaussian_expand=True, gaussian_memberships=None, readout_activation=Tanh(), n_tasks=1)[source]

Weave for regression and classification on graphs.

Weave is introduced in Molecular Graph Convolutions: Moving Beyond Fingerprints

Parameters
  • node_in_feats (int) – Size for the input node features.

  • edge_in_feats (int) – Size for the input edge features.

  • num_gnn_layers (int) – Number of GNN (Weave) layers to use. Default to 2.

  • gnn_hidden_feats (int) – Size for the hidden node and edge representations. Default to 50.

  • gnn_activation (callable) – Activation function to be used in GNN (Weave) layers. Default to ReLU.

  • graph_feats (int) – Size for the hidden graph representations. Default to 128.

  • gaussian_expand (bool) – Whether to expand each dimension of node features by gaussian histogram in computing graph representations. Default to True.

  • gaussian_memberships (list of 2-tuples) – For each tuple, the first and second element separately specifies the mean and std for constructing a normal distribution. This argument comes into effect only when gaussian_expand==True. By default, we set this to be a list consisting of (-1.645, 0.283), (-1.080, 0.170), (-0.739, 0.134), (-0.468, 0.118), (-0.228, 0.114), (0., 0.114), (0.228, 0.114), (0.468, 0.118), (0.739, 0.134), (1.080, 0.170), (1.645, 0.283).

  • readout_activation (callable) – Activation function to be used in computing graph representations out of node representations. Default to Tanh.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

forward(g, node_feats, edge_feats)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs.

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges.

Returns

Prediction for the graphs in the batch. G for the number of graphs.

Return type

float32 tensor of shape (G, n_tasks)
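
For illustration, a minimal sketch with arbitrary feature sizes on a toy graph:

    import dgl
    import torch
    from dgllife.model.model_zoo.weave_predictor import WeavePredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_feats = torch.randn(bg.num_nodes(), 27)
    edge_feats = torch.randn(bg.num_edges(), 12)

    model = WeavePredictor(node_in_feats=27, edge_in_feats=12, n_tasks=1)
    preds = model(bg, node_feats, edge_feats)               # shape (2, 1)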

GIN Predictor

class dgllife.model.model_zoo.gin_predictor.GINPredictor(num_node_emb_list, num_edge_emb_list, num_layers=5, emb_dim=300, JK='last', dropout=0.5, readout='mean', n_tasks=1)[source]

GIN-based model for regression and classification on graphs.

GIN was first introduced in How Powerful Are Graph Neural Networks for general graph property prediction problems. It was further extended in Strategies for Pre-training Graph Neural Networks for pre-training and semi-supervised learning on large-scale datasets.

For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.

Parameters
  • num_node_emb_list (list of int) – num_node_emb_list[i] gives the number of items to embed for the i-th categorical node feature variables. E.g. num_node_emb_list[0] can be the number of atom types and num_node_emb_list[1] can be the number of atom chirality types.

  • num_edge_emb_list (list of int) – num_edge_emb_list[i] gives the number of items to embed for the i-th categorical edge feature variables. E.g. num_edge_emb_list[0] can be the number of bond types and num_edge_emb_list[1] can be the number of bond direction types.

  • num_layers (int) – Number of GIN layers to use. Default to 5.

  • emb_dim (int) – The size of each embedding vector. Default to 300.

  • JK (str) –

    JK for jumping knowledge as in Representation Learning on Graphs with Jumping Knowledge Networks. It decides how we are going to combine the all-layer node representations for the final output. There can be four options for this argument, 'concat', 'last', 'max' and 'sum'. Default to ‘last’.

    • 'concat': concatenate the output node representations from all GIN layers

    • 'last': use the node representations from the last GIN layer

    • 'max': apply max pooling to the node representations across all GIN layers

    • 'sum': sum the output node representations from all GIN layers

  • dropout (float) – Dropout to apply to the output of each GIN layer. Default to 0.5.

  • readout (str) – Readout for computing graph representations out of node representations, which can be 'sum', 'mean', 'max', 'attention', or 'set2set'. Default to ‘mean’.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

forward(g, categorical_node_feats, categorical_edge_feats)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • categorical_node_feats (list of LongTensor of shape (N)) –

    • Input categorical node features

    • len(categorical_node_feats) should be the same as len(num_node_emb_list)

    • N is the total number of nodes in the batch of graphs

  • categorical_edge_feats (list of LongTensor of shape (E)) –

    • Input categorical edge features

    • len(categorical_edge_feats) should be the same as len(num_edge_emb_list) in the arguments

    • E is the total number of edges in the batch of graphs

Returns

  • Predictions on graphs

  • B for the number of graphs in the batch

Return type

FloatTensor of shape (B, n_tasks)
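
For illustration, a minimal sketch; the embedding table sizes and the random categorical features below are arbitrary choices, not fixed by the API:

    import dgl
    import torch
    from dgllife.model.model_zoo.gin_predictor import GINPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    # two categorical node feature variables and two categorical edge feature variables
    categorical_node_feats = [torch.randint(0, 120, (bg.num_nodes(),)),
                              torch.randint(0, 3, (bg.num_nodes(),))]
    categorical_edge_feats = [torch.randint(0, 6, (bg.num_edges(),)),
                              torch.randint(0, 3, (bg.num_edges(),))]

    model = GINPredictor(num_node_emb_list=[120, 3], num_edge_emb_list=[6, 3],
                         n_tasks=1)
    preds = model(bg, categorical_node_feats, categorical_edge_feats)   # (2, 1)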

GNN OGB Predictor

class dgllife.model.model_zoo.gnn_ogb_predictor.GNNOGBPredictor(in_edge_feats, num_node_types=1, hidden_feats=300, n_layers=5, n_tasks=1, batchnorm=True, activation=<function relu>, dropout=0.0, gnn_type='gcn', virtual_node=True, residual=False, jk=False, readout='mean')[source]

Variant of GCN/GIN from Open Graph Benchmark: Datasets for Machine Learning on Graphs for graph property prediction

Parameters
  • in_edge_feats (int) – Number of input edge features.

  • num_node_types (int) – Number of node types to embed. (Default: 1)

  • hidden_feats (int) – Size for hidden representations. (Default: 300)

  • n_layers (int) – Number of GNN layers to use. (Default: 5)

  • n_tasks (int) – Number of output tasks. (Default: 1)

  • batchnorm (bool) – Whether to apply batch normalization. (Default: True)

  • activation (callable or None) – Activation function to apply to the output of each GNN layer except for the last layer. If None, no activation will be applied. (Default: ReLU)

  • dropout (float) – The probability for dropout. (Default: 0, i.e. no dropout)

  • gnn_type (str) – The GNN type to use, which can be either ‘gcn’ or ‘gin’. (Default: ‘gcn’)

  • virtual_node (bool) – Whether to use virtual node. (Default: True)

  • residual (bool) – Whether to apply residual connections for virtual node embeddings. (Default: False)

  • jk (bool) – Whether to sum over the output of all GNN layers as in JK networks. (Default: False)

  • readout (str) – The readout function for computing graph-level representations out of node representations, which can be ‘mean’, ‘sum’ or ‘max’. (Default: ‘mean’)

forward(g, node_feats, edge_feats)[source]

Predict graph properties.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • node_feats (LongTensor of shape (N, 1)) – Input categorical node features. N for the number of nodes.

  • edge_feats (FloatTensor of shape (E, in_edge_feats)) – Input edge features. E for the number of edges.

Returns

Predicted graph properties

Return type

FloatTensor of shape (B, n_tasks)

reset_parameters()[source]

Reinitialize model parameters.
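
For illustration, a minimal sketch with arbitrary sizes (categorical node types and continuous edge features):

    import dgl
    import torch
    from dgllife.model.model_zoo.gnn_ogb_predictor import GNNOGBPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_feats = torch.randint(0, 5, (bg.num_nodes(), 1))   # categorical node types
    edge_feats = torch.randn(bg.num_edges(), 7)

    model = GNNOGBPredictor(in_edge_feats=7, num_node_types=5,
                            hidden_feats=300, n_layers=3, n_tasks=1)
    preds = model(bg, node_feats, edge_feats)               # shape (2, 1)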

Neural Fingerprint Predictor

class dgllife.model.model_zoo.nf_predictor.NFPredictor(in_feats, n_tasks=1, hidden_feats=None, max_degree=10, activation=None, batchnorm=None, dropout=None, predictor_hidden_size=128, predictor_batchnorm=True, predictor_dropout=0.0, predictor_activation=torch.tanh)[source]

Neural Fingerprint (NF) for regression and classification on graphs.

NF is introduced in Convolutional Networks on Graphs for Learning Molecular Fingerprints. This model can be used for regression and classification on graphs.

After updating node representations, we perform a sum and max pooling on them and concatenate the output of the two operations, which is then fed into an MLP for final prediction.

For classification tasks, the output will be logits, i.e. values before sigmoid or softmax.

Parameters
  • in_feats (int) – Number of input node features.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • hidden_feats (list of int, optional) – hidden_feats[i] gives the size of node representations after the i-th NF layer. len(hidden_feats) equals the number of NF layers. By default, we use [64, 64].

  • max_degree (int) – The maximum node degree to consider when updating weights. Default to be 10.

  • activation (list of activation functions or None) – If not None, activation[i] gives the activation function to be used for the i-th NF layer. len(activation) equals the number of NF layers. By default, ReLU is applied for all NF layers.

  • batchnorm (list of bool, optional) – batchnorm[i] decides if batch normalization is to be applied on the output of the i-th NF layer. len(batchnorm) equals the number of NF layers. By default, batch normalization is applied for all NF layers.

  • dropout (list of float, optional) – dropout[i] decides the dropout to be applied on the output of the i-th NF layer. len(dropout) equals the number of NF layers. By default, dropout is not applied for all NF layers.

  • predictor_hidden_size (int) – Size for hidden representations in the output MLP predictor. Default to be 128.

  • predictor_batchnorm (bool) – Whether to apply batch normalization in the output MLP predictor. Default to be True.

  • predictor_dropout (float) – The dropout probability in the output MLP predictor. Default to be 0.

  • predictor_activation (activation function) – The activation function in the output MLP predictor. Default to be Tanh.

forward(g, feats)[source]

Update node representations.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • feats (FloatTensor of shape (N, M1)) –

    • N is the total number of nodes in the batch of graphs

    • M1 is the input node feature size, which equals in_feats in initialization

Returns

  • Predictions on graphs

  • B for the number of graphs in the batch

Return type

FloatTensor of shape (B, n_tasks)

reset_parameters()[source]

Reinitialize model parameters.
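
For illustration, a minimal sketch with arbitrary sizes on a toy graph:

    import dgl
    import torch
    from dgllife.model.model_zoo.nf_predictor import NFPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    feats = torch.randn(bg.num_nodes(), 74)   # 74 = in_feats (arbitrary)

    model = NFPredictor(in_feats=74, n_tasks=1, hidden_feats=[64, 64])
    preds = model(bg, feats)                  # FloatTensor of shape (2, 1)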

Path-Augmented Graph Transformer Predictor

class dgllife.model.model_zoo.pagtn_predictor.PAGTNPredictor(node_in_feats, node_out_feats, node_hid_feats, edge_feats, depth=5, nheads=1, dropout=0.1, activation=LeakyReLU(negative_slope=0.2), n_tasks=1, mode='sum')[source]

PAGTN model for regression and classification on graphs.

PAGTN is introduced in Path-Augmented Graph Transformer Network.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • node_out_feats (int) – Size for the output node features in PAGTN layers.

  • node_hid_feats (int) – Size for the hidden node features in PAGTN layers.

  • edge_feats (int) – Size for the input edge features.

  • depth (int) – Number of PAGTN layers to be applied.

  • nheads (int) – Number of attention heads.

  • dropout (float) – The probability for performing dropout. Default to 0.1

  • activation (callable) – Activation function to apply. Default to LeakyReLU.

  • n_tasks (int) – Number of tasks, which is also the output size. Default to 1.

  • mode ('max' or 'mean' or 'sum') – Whether to compute elementwise maximum, mean or sum of the node representations.

forward(g, node_feats, edge_feats)[source]

Graph-level regression/soft classification.

Parameters
  • g (DGLGraph) – DGLGraph for a batch of graphs

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes in the batch of graphs.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges in the batch of graphs.

Returns

Prediction for the graphs in the batch. G for the number of graphs.

Return type

float32 tensor of shape (G, n_tasks)
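
For illustration, a minimal sketch; in practice PAGTN operates on path-augmented graphs whose edge features encode path information, so the toy graph and random features below only demonstrate the call signature:

    import dgl
    import torch
    from dgllife.model.model_zoo.pagtn_predictor import PAGTNPredictor

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    bg = dgl.batch([g, g])
    node_feats = torch.randn(bg.num_nodes(), 39)
    edge_feats = torch.randn(bg.num_edges(), 14)

    model = PAGTNPredictor(node_in_feats=39, node_out_feats=64,
                           node_hid_feats=32, edge_feats=14, n_tasks=1)
    preds = model(bg, node_feats, edge_feats)   # graph-level predictions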

Generative Models

DGMG

class dgllife.model.model_zoo.dgmg.DGMG(atom_types, bond_types, node_hidden_size=128, num_prop_rounds=2, dropout=0.2)[source]

DGMG model

Learning Deep Generative Models of Graphs

Users only need to initialize an instance of this class.

Parameters
  • atom_types (list) – E.g. [‘C’, ‘N’].

  • bond_types (list) – E.g. [Chem.rdchem.BondType.SINGLE, Chem.rdchem.BondType.DOUBLE, Chem.rdchem.BondType.TRIPLE, Chem.rdchem.BondType.AROMATIC].

  • node_hidden_size (int) – Size of atom representation. Default to 128.

  • num_prop_rounds (int) – Number of message passing rounds for each time. Default to 2.

  • dropout (float) – Probability for dropout. Default to 0.2.

add_edge_or_not(a=None)[source]

Decide whether to add a new bond.

Parameters

a (None or int) – If None, a new action will be sampled. If not None, teacher forcing will be used to enforce the decision of the corresponding action.

add_node_and_update(a=None)[source]

Decide whether to add a new atom. If a new atom should be added, update the graph.

Parameters

a (None or int) – If None, a new action will be sampled. If not None, teacher forcing will be used to enforce the decision of the corresponding action.

choose_dest_and_update(bond_type, a=None)[source]

Choose destination and connect it to the latest atom. Add edges for both directions and update the graph.

Parameters
  • bond_type (int) – The type of the new bond to add

  • a (None or int) – If None, a new action will be sampled. If not None, teacher forcing will be used to enforce the decision of the corresponding action.

count_step()[source]

Increment the step by 1.

forward(actions=None, rdkit_mol=False, compute_log_prob=False, max_num_steps=400)[source]
Parameters
  • actions (list of 2-tuples or None.) – If actions are not None, generate a molecule according to actions. Otherwise, a molecule will be generated based on sampled actions.

  • rdkit_mol (bool) – Whether to maintain a Chem.rdchem.Mol object. This brings extra computational cost, but is necessary if we are interested in learning the generated molecule.

  • compute_log_prob (bool) – Whether to compute log likelihood

  • max_num_steps (int) – Maximum number of steps allowed. This only comes into effect during inference and prevents the model from generating indefinitely.

Returns

  • torch.tensor consisting of a float only, optional – The log likelihood for the actions taken

  • str, optional – The generated molecule in the form of SMILES

get_log_prob()[source]

Compute the log likelihood for the decision sequence, typically corresponding to the generation of a molecule.

Returns

Return type

torch.tensor consisting of a float only

init_weights()[source]

Initialize model weights

prepare_log_prob(compute_log_prob)[source]

Setup for returning log likelihood

Parameters

compute_log_prob (bool) – Whether to compute log likelihood

rollout(max_num_steps)[source]

Sample a molecule from the distribution learned by DGMG.

teacher_forcing(actions)[source]

Generate a molecule according to a sequence of actions.

Parameters

actions (list of 2-tuples of int) –

actions[t] gives (i, j), the action to execute by DGMG at timestep t.

  • If i = 0, j specifies either the type of the atom to add or termination.

  • If i = 1, j specifies either the type of the bond to add or termination.

  • If i = 2, j specifies the destination atom id for the bond to add. With the formulation of DGMG, atom j must have been created before this decision.
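
For illustration, a minimal sampling sketch, assuming RDKit is installed; with actions left as None the model samples a molecule, and rdkit_mol=True returns it as a SMILES string:

    import torch
    from rdkit import Chem
    from dgllife.model.model_zoo.dgmg import DGMG

    model = DGMG(atom_types=['C', 'N', 'O'],
                 bond_types=[Chem.rdchem.BondType.SINGLE,
                             Chem.rdchem.BondType.DOUBLE,
                             Chem.rdchem.BondType.TRIPLE])
    model.eval()
    with torch.no_grad():
        smiles = model(rdkit_mol=True, max_num_steps=400)   # SMILES of a sampled molecule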

JTNNVAE

class dgllife.model.model_zoo.jtvae.JTNNVAE(vocab, hidden_size, latent_size, depth, stereo=True)[source]
forward(batch_trees, batch_tree_graphs, batch_mol_graphs, stereo_cand_batch_idx, stereo_cand_labels, batch_stereo_cand_graphs, beta=0)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Reaction Prediction

WLN for Reaction Center Prediction

class dgllife.model.model_zoo.wln_reaction_center.WLNReactionCenter(node_in_feats, edge_in_feats, node_pair_in_feats, node_out_feats=300, n_layers=3, n_tasks=5)[source]

Weisfeiler-Lehman Network (WLN) for Reaction Center Prediction.

The model is introduced in Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network.

The model uses WLN to update atom representations and then predicts the score for each pair of atoms to form a bond.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • edge_in_feats (int) – Size for the input edge features.

  • node_out_feats (int) – Size for the output node representations. Default to 300.

  • node_pair_in_feats (int) – Size for the input features of node pairs.

  • n_layers (int) – Number of rounds of message passing. Note that the same parameters are shared across the n_layers rounds of message passing. Default to 3.

  • n_tasks (int) – Number of tasks for prediction. Default to 5.

forward(batch_mol_graphs, batch_complete_graphs, node_feats, edge_feats, node_pair_feats)[source]

Predict score for each pair of nodes.

Parameters
  • batch_mol_graphs (DGLGraph) – A batch of molecular graphs.

  • batch_complete_graphs (DGLGraph) – A batch of fully connected graphs.

  • node_feats (float32 tensor of shape (V, node_in_feats)) – Input node features. V for the number of nodes.

  • edge_feats (float32 tensor of shape (E, edge_in_feats)) – Input edge features. E for the number of edges.

  • node_pair_feats (float32 tensor of shape (E_full, node_pair_in_feats)) – Input features for each pair of nodes. E_full for the number of edges in the batch of complete graphs.

Returns

  • scores (float32 tensor of shape (E_full, 5)) – Predicted scores for each pair of atoms to perform one of the following 5 actions in reaction:

    • The bond between them gets broken

    • Forming a single bond

    • Forming a double bond

    • Forming a triple bond

    • Forming an aromatic bond

  • biased_scores (float32 tensor of shape (E_full, 5)) – Compared to scores, a bias is added if the pair corresponds to the same atom.
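
For illustration, a shape-level sketch only; in practice the molecular graphs, complete graphs and features are produced by the reaction-center data pipeline, and the feature sizes below are arbitrary:

    import dgl
    import torch
    from dgllife.model.model_zoo.wln_reaction_center import WLNReactionCenter

    # a 3-atom molecule (bidirected edges) and the complete graph over its atoms
    mol_g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))
    src = torch.tensor([i for i in range(3) for j in range(3) if i != j])
    dst = torch.tensor([j for i in range(3) for j in range(3) if i != j])
    complete_g = dgl.graph((src, dst))

    batch_mol_graphs = dgl.batch([mol_g, mol_g])
    batch_complete_graphs = dgl.batch([complete_g, complete_g])
    node_feats = torch.randn(batch_mol_graphs.num_nodes(), 82)
    edge_feats = torch.randn(batch_mol_graphs.num_edges(), 6)
    node_pair_feats = torch.randn(batch_complete_graphs.num_edges(), 10)

    model = WLNReactionCenter(node_in_feats=82, edge_in_feats=6,
                              node_pair_in_feats=10)
    scores, biased_scores = model(batch_mol_graphs, batch_complete_graphs,
                                  node_feats, edge_feats, node_pair_feats)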

WLN for Ranking Candidate Products

class dgllife.model.model_zoo.wln_reaction_ranking.WLNReactionRanking(node_in_feats, edge_in_feats, node_hidden_feats=500, num_encode_gnn_layers=3)[source]

Weisfeiler-Lehman Network (WLN) for Candidate Product Ranking

The model is introduced in Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network and then further improved in A graph-convolutional neural network model for the prediction of chemical reactivity

The model updates representations of nodes in candidate products with WLN and predicts the score for candidate products to be the real product.

Parameters
  • node_in_feats (int) – Size for the input node features.

  • edge_in_feats (int) – Size for the input edge features.

  • node_hidden_feats (int) – Size for the hidden node representations. Default to 500.

  • num_encode_gnn_layers (int) – Number of WLN layers for updating node representations.

forward(reactant_graph, reactant_node_feats, reactant_edge_feats, product_graphs, product_node_feats, product_edge_feats, candidate_scores, batch_num_candidate_products)[source]

Predicts the score for candidate products to be the true product

Parameters
  • reactant_graph (DGLGraph) – DGLGraph for a batch of reactants.

  • reactant_node_feats (float32 tensor of shape (V1, node_in_feats)) – Input node features for the reactants. V1 for the number of nodes.

  • reactant_edge_feats (float32 tensor of shape (E1, edge_in_feats)) – Input edge features for the reactants. E1 for the number of edges in reactant_graph.

  • product_graphs (DGLGraph) – DGLGraph for the candidate products in a batch of reactions.

  • product_node_feats (float32 tensor of shape (V2, node_in_feats)) – Input node features for the candidate products. V2 for the number of nodes.

  • product_edge_feats (float32 tensor of shape (E2, edge_in_feats)) – Input edge features for the candidate products. E2 for the number of edges in the graphs for candidate products.

  • candidate_scores (float32 tensor of shape (B, 1)) – Scores for candidate products based on the model for reaction center prediction

  • batch_num_candidate_products (list of int) – Number of candidate products for the reactions in the batch

Returns

Predicted scores for candidate products

Return type

float32 tensor of shape (B, 1)
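
For illustration, a shape-level sketch only; in practice these inputs come from the candidate-ranking data pipeline. Each candidate product graph must contain the same atoms as its reactant graph, which the toy example respects:

    import dgl
    import torch
    from dgllife.model.model_zoo.wln_reaction_ranking import WLNReactionRanking

    g = dgl.graph(([0, 1, 1, 2, 2, 0], [1, 0, 2, 1, 0, 2]))   # 3-atom toy graph
    reactant_graph = dgl.batch([g])              # one reaction
    product_graphs = dgl.batch([g, g])           # two candidate products for it

    reactant_node_feats = torch.randn(reactant_graph.num_nodes(), 82)
    reactant_edge_feats = torch.randn(reactant_graph.num_edges(), 6)
    product_node_feats = torch.randn(product_graphs.num_nodes(), 82)
    product_edge_feats = torch.randn(product_graphs.num_edges(), 6)
    candidate_scores = torch.randn(2, 1)         # from the reaction center model
    batch_num_candidate_products = [2]

    model = WLNReactionRanking(node_in_feats=82, edge_in_feats=6)
    scores = model(reactant_graph, reactant_node_feats, reactant_edge_feats,
                   product_graphs, product_node_feats, product_edge_feats,
                   candidate_scores, batch_num_candidate_products)   # (2, 1)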

Protein-Ligand Binding Affinity Prediction

ACNN

class dgllife.model.model_zoo.acnn.ACNN(hidden_sizes=None, weight_init_stddevs=None, dropouts=None, features_to_use=None, radial=None, num_tasks=1)[source]

Atomic Convolutional Networks.

The model was proposed in Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity.

The prediction proceeds as follows:

  1. Perform message passing to update atom representations for the ligand, protein and protein-ligand complex.

  2. Predict the energy of atoms from their representations with an MLP.

  3. Take the sum of predicted energy of atoms within each molecule for predicted energy of the ligand, protein and protein-ligand complex.

  4. Make the final prediction by subtracting the predicted ligand and protein energy from the predicted complex energy.

Parameters
  • hidden_sizes (list of int) – hidden_sizes[i] gives the size of hidden representations in the i-th hidden layer of the MLP. By default, [32, 32, 16] will be used.

  • weight_init_stddevs (list of float) – weight_init_stddevs[i] gives the std to initialize parameters in the i-th layer of the MLP. Note that len(weight_init_stddevs) == len(hidden_sizes) + 1 due to the output layer. By default, we use 1 / sqrt(hidden_sizes[i]) for hidden layers and 0.01 for the output layer.

  • dropouts (list of float) – dropouts[i] gives the dropout in the i-th hidden layer of the MLP. By default, no dropout is used.

  • features_to_use (None or float tensor of shape (T)) – In the original paper, these are atomic numbers to consider, representing the types of atoms. T for the number of types of atomic numbers. If None, we use the same parameters for all atoms regardless of their type. Default to None.

  • radial (list) – The list consists of 3 sublists of floats, separately for the options of interaction cutoff, the options of rbf kernel mean and the options of rbf kernel scaling. By default, [[12.0], [0.0, 2.0, 4.0, 6.0, 8.0], [4.0]] will be used.

  • num_tasks (int) – Number of output tasks. Default to 1.

forward(graph)[source]

Apply the model for prediction.

Parameters

graph (DGLHeteroGraph) – DGLHeteroGraph consisting of the ligand graph, the protein graph and the complex graph, along with preprocessed features. For a batch of protein-ligand pairs, we assume zero padding is performed so that the number of ligand and protein atoms is the same in all pairs.

Returns

Predicted protein-ligand binding affinity. B for the number of protein-ligand pairs in the batch and O for the number of tasks.

Return type

Float32 tensor of shape (B, O)
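
For illustration, a construction-only sketch; the input DGLHeteroGraph (ligand, protein and complex components with preprocessed features) is normally produced by a protein-ligand data preparation pipeline rather than built by hand, and the atomic numbers below are illustrative:

    import torch
    from dgllife.model.model_zoo.acnn import ACNN

    model = ACNN(hidden_sizes=[32, 32, 16],
                 features_to_use=torch.tensor([6., 7., 8., 16.]),   # C, N, O, S
                 num_tasks=1)
    # affinity = model(graph)   # graph: DGLHeteroGraph from the data pipeline, output (B, 1)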

PotentialNet

class dgllife.model.model_zoo.potentialnet.PotentialNet(f_in, f_bond, f_spatial, f_gather, n_etypes, n_bond_conv_steps, n_spatial_conv_steps, n_rows_fc, dropouts)[source]

Protein-ligand binding affinity prediction using a ‘staged gated graph neural network’ introduced in PotentialNet for Molecular Property Prediction.

Parameters
  • f_in (int) – The dimension size of input features to GatedGraphConv, equivalent to the dimension size of atomic features in the molecular graph.

  • f_bond (int) – The dimension size of the output from GatedGraphConv in stage 1, equivalent to the dimension size of input to the linear layer at the end of stage 1.

  • f_spatial (int) – The dimension size of the output from GatedGraphConv in stage 2, equivalent to the dimension size of input to the linear layer at the end of stage 2.

  • f_gather (int) – The dimension size of the output from stage 1 & 2, equivalent to the dimension size of output from the linear layer at the end of stage 1 & 2.

  • n_etypes (int) – The number of heterogeneous edge types for stage 2. This includes the number of covalent bond types from stage 1 and the number of spatial edge types based on distances. Default to 9 (5 covalent bond types in stage 1 plus 4 distance bins in stage 2).

  • n_bond_conv_steps (int) – The number of bond convolution layers (steps) of GatedGraphConv in stage 1.

  • n_spatial_conv_steps (int) – The number of spatial convolution layers (steps) of GatedGraphConv in stage 2.

  • n_rows_fc (list of int) – The widths of the fully connected neural networks at each layer in stage 3.

  • dropouts (list of 3 floats) – The amount of dropout applied at the end of each stage.

forward(bigraph, knn_graph)[source]

Compute the prediction on graphs using PotentialNet model.

Parameters
  • bigraph (DGLGraph) – The molecular graph for stage 1 of PotentialNet, with bigraph.ndata[‘h’] being the input node features and bigraph.edata[‘e’] being the one-hot encoding of the edge types.

  • knn_graph (DGLGraph) – The k-nearest-neighbor graph for stage 2 of PotentialNet, with no initial node features and knn_graph.edata[‘e’] being the one-hot encoding of the edge types.

Returns

x – The prediction based on the input features and graphs. For the task of binding affinity prediction, the shape is (B, 1), where B is the batch size.

Return type

torch.Tensor
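
For illustration, a construction-only sketch with arbitrary sizes; the stage-1 molecular graph and stage-2 k-nearest-neighbor graph (with one-hot edge types in edata[‘e’]) are normally produced by a data preparation pipeline, so the forward call is only indicated:

    from dgllife.model.model_zoo.potentialnet import PotentialNet

    # illustrative sizes only; in practice they are tied to the featurization used
    model = PotentialNet(f_in=40, f_bond=48, f_spatial=48, f_gather=48,
                         n_etypes=9, n_bond_conv_steps=2, n_spatial_conv_steps=1,
                         n_rows_fc=[32, 16], dropouts=[0.25, 0.25, 0.25])
    # preds = model(bigraph, knn_graph)   # graphs built by the data pipeline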