dgllife.utils.PN_graph_construction_and_featurization

dgllife.utils.PN_graph_construction_and_featurization(ligand_mol, protein_mol, ligand_coordinates, protein_coordinates, max_num_ligand_atoms=None, max_num_protein_atoms=None, max_num_neighbors=4, distance_bins=[1.5, 2.5, 3.5, 4.5], strip_hydrogens=False)
Graph construction and featurization for `PotentialNet for Molecular Property Prediction <https://pubs.acs.org/doi/10.1021/acscentsci.8b00507>`__.

Parameters
  • ligand_mol (rdkit.Chem.rdchem.Mol) – RDKit molecule instance for the ligand.

  • protein_mol (rdkit.Chem.rdchem.Mol) – RDKit molecule instance for the protein.

  • ligand_coordinates (Float Tensor of shape (V1, 3)) – Atom coordinates in a ligand.

  • protein_coordinates (Float Tensor of shape (V2, 3)) – Atom coordinates in a protein.

  • max_num_ligand_atoms (int or None) – Maximum number of atoms in ligands for zero padding. If not None, it should be no smaller than ligand_mol.GetNumAtoms(). If None, no zero padding is performed. Defaults to None.

  • max_num_protein_atoms (int or None) – Maximum number of atoms in proteins for zero padding. If not None, it should be no smaller than protein_mol.GetNumAtoms(). If None, no zero padding is performed. Defaults to None.

  • max_num_neighbors (int) – Maximum number of neighbors allowed for each atom when constructing the KNN graph. Defaults to 4.

  • distance_bins (list of float) – Distance bins that determine the edge types. Edges of the first edge type are added between pairs of atoms whose distances are less than distance_bins[0]. The length of the list matches the number of edge types to be constructed. Defaults to [1.5, 2.5, 3.5, 4.5].

  • strip_hydrogens (bool) – Whether to exclude hydrogen atoms. Defaults to False.
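
The way distance_bins maps a pairwise distance to an edge type can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name edge_type_for_distance is hypothetical, and the sketch assumes half-open intervals (a distance exactly equal to a bin boundary falls into the next bin) and that distances at or beyond the last bin produce no edge.

```python
from bisect import bisect_right


def edge_type_for_distance(distance, distance_bins=(1.5, 2.5, 3.5, 4.5)):
    """Hypothetical helper: return a 0-based edge type, or None if no edge.

    Assumption: edge type i covers distance_bins[i - 1] <= d < distance_bins[i],
    with the first type covering d < distance_bins[0].
    """
    # Number of bin boundaries that are <= distance.
    idx = bisect_right(distance_bins, distance)
    # Distances at or beyond the last boundary are treated as "no edge".
    return idx if idx < len(distance_bins) else None


edge_types = [edge_type_for_distance(d) for d in (1.0, 1.5, 2.0, 4.4, 5.0)]
print(edge_types)
```

With the default bins, four boundaries yield four edge types, which matches the statement that the length of distance_bins equals the number of edge types.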

Returns

  • complex_bigraph (DGLGraph) – Bigraph with the ligand and the protein (pocket) combined and canonical features extracted. The atom features are stored as DGLGraph.ndata['h']. The edge types are stored as DGLGraph.edata['e']. The bigraphs of the ligand and the protein are batched together as one complex graph.

  • complex_knn_graph (DGLGraph) – K-nearest-neighbor graph with the ligand and the protein (pocket) combined and edge features extracted based on distances. The edge types are stored as DGLGraph.edata['e']. The KNN graphs of the ligand and the protein are batched together as one complex graph.
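
To make the role of max_num_neighbors in the K-nearest-neighbor graph more concrete, the following sketch selects neighbors from raw coordinates using only the standard library. It is an illustration under stated assumptions, not the code dgllife runs: the helper knn_edges is hypothetical, the cutoff is assumed to be the last distance bin (4.5), and the edge direction (from each atom to its neighbors) is chosen arbitrarily.

```python
import math


def knn_edges(coordinates, max_num_neighbors=4, cutoff=4.5):
    """Hypothetical helper: list (src, dst, distance) edges for a KNN graph.

    Assumption: an atom is linked to at most max_num_neighbors other atoms,
    chosen nearest-first among those closer than the cutoff.
    """
    edges = []
    for i, ci in enumerate(coordinates):
        # Collect every other atom that lies within the cutoff.
        candidates = []
        for j, cj in enumerate(coordinates):
            if i == j:
                continue
            d = math.dist(ci, cj)
            if d < cutoff:
                candidates.append((d, j))
        # Keep only the closest max_num_neighbors candidates.
        candidates.sort()
        for d, j in candidates[:max_num_neighbors]:
            edges.append((i, j, d))
    return edges


# Toy coordinates: three nearby atoms and one far away from the rest.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (10.0, 0.0, 0.0)]
edges = knn_edges(coords, max_num_neighbors=1)
print(edges)
```

In this toy input the last atom is farther than the cutoff from every other atom, so it contributes no edges. Each remaining distance could then be binned into an edge type according to distance_bins.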