Utils for Molecules¶
Utilities in DGL-LifeSci for working with molecules.
IO Utils¶
Utils for loading molecules and accessing their information.
Get 3D coordinates of the molecule. |
|
|
Load a molecule from a file of format |
Load molecules from files with multiprocessing, which can be of format |
|
Load SMILES from a txt file. |
Analysis¶
Utils for analyzing a collection of molecules.
|
Analyze a collection of molecules |
Graph Construction¶
The modeling of graph neural networks starts with constructing appropriate graph topologies. We provide three common graph constructions:
bigraph
: Bi-directed graphs corresponding exactly to molecular graphs
complete_graph
: Graphs with all pairs of atoms connected
nearest_neighbor_graph
: Graphs where each atom is connected to its closest k atoms based on molecule coordinates
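To make the three topologies concrete, here is a minimal pure-Python sketch that builds their edge lists for a toy molecule. It is an illustration only, not the library's implementation: the helper names (`bigraph_edges`, `complete_graph_edges`, `nearest_neighbor_edges`), the toy bonds and coordinates, and the choice to point each nearest-neighbor edge from the neighbor to the atom are all assumptions made for this example.

```python
import math
from itertools import permutations


def bigraph_edges(bonds):
    """Turn each undirected bond (u, v) into two directed edges."""
    edges = []
    for u, v in bonds:
        edges.append((u, v))
        edges.append((v, u))
    return edges


def complete_graph_edges(num_atoms, add_self_loop=False):
    """Connect every ordered pair of distinct atoms."""
    edges = list(permutations(range(num_atoms), 2))
    if add_self_loop:
        edges.extend((i, i) for i in range(num_atoms))
    return edges


def nearest_neighbor_edges(coords, k):
    """For each atom, add an edge from each of its k closest atoms to it."""
    edges = []
    for dst, point in enumerate(coords):
        others = [src for src in range(len(coords)) if src != dst]
        # Sort candidate neighbors by Euclidean distance to the current atom.
        others.sort(key=lambda src: math.dist(coords[src], point))
        edges.extend((src, dst) for src in others[:k])
    return edges


# Hypothetical 3-atom chain 0-1-2 placed on the x axis.
bonds = [(0, 1), (1, 2)]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]

bi = bigraph_edges(bonds)
full = complete_graph_edges(3)
knn = nearest_neighbor_edges(coords, k=1)
```

Note that the nearest-neighbor topology depends only on coordinates, so it can connect atoms that share no bond and omit edges between atoms that do.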
|
Convert an RDKit molecule object into a DGLGraph and featurize for it. |
|
Convert a SMILES into a bi-directed DGLGraph and featurize for it. |
|
Convert an RDKit molecule object into a bi-directed DGLGraph and featurize for it. |
Convert a SMILES into a complete DGLGraph and featurize for it. |
|
|
Convert an RDKit molecule into a complete DGLGraph and featurize for it. |
Find k nearest neighbors for each atom |
|
Convert an RDKit molecule into a nearest neighbor graph and featurize for it. |
|
Convert a SMILES into a nearest neighbor graph and featurize for it. |
|
An abstract class for writing graph constructors. |
|
|
Convert RDKit molecule objects into bi-directed DGLGraphs and featurize for them. |
Convert SMILES strings into bi-directed DGLGraphs and featurize for them. |
Featurization for Molecules¶
To apply graph neural networks, we need to prepare node and edge features for molecules. Intuitively,
these can be built from various descriptors (features) of atoms, bonds, and molecules. In particular, we can
use numerical descriptors directly or apply one_hot_encoding to categorical descriptors. When using
multiple descriptors together, we can simply concatenate them with ConcatFeaturizer.
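The idea of one-hot encoding categorical descriptors and concatenating several descriptors can be sketched in plain Python as follows. This is a stand-in written for illustration, not the library's code: the functions `one_hot` and `concat_features`, the `encode_unknown` flag as used here, and the dictionary-based atom record are assumptions of this example (the library's utilities operate on RDKit objects).

```python
def one_hot(value, allowable_set, encode_unknown=False):
    """Return a list of booleans with True at the position of `value`.

    With encode_unknown=True, an extra final slot marks values outside
    allowable_set; otherwise such values give an all-False vector.
    """
    encoding = [value == candidate for candidate in allowable_set]
    if encode_unknown:
        encoding.append(value not in allowable_set)
    return encoding


def concat_features(funcs, item):
    """Evaluate each function on `item` and join the results into one list."""
    features = []
    for func in funcs:
        features.extend(func(item))
    return features


# Hypothetical atom record standing in for an RDKit atom object.
atom = {"symbol": "N", "degree": 3, "formal_charge": 0}

featurizers = [
    lambda a: one_hot(a["symbol"], ["C", "N", "O"], encode_unknown=True),
    lambda a: one_hot(a["degree"], [0, 1, 2, 3, 4]),
    lambda a: [a["formal_charge"]],  # numerical descriptor used directly
]
feature = [float(x) for x in concat_features(featurizers, atom)]
```

Here the final feature has 4 + 5 + 1 = 10 entries: the symbol encoding (with one slot reserved for unknown symbols), the degree encoding, and the raw formal charge.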
General Utils¶
|
One-hot encoding. |
|
Concatenate the evaluation results of multiple functions as a single feature. |
Featurization for Nodes¶
We consider the following atom descriptors:
type/atomic number
degree (excluding neighboring hydrogen atoms)
total degree (including neighboring hydrogen atoms)
explicit valence
implicit valence
hybridization
total number of neighboring hydrogen atoms
formal charge
number of radical electrons
aromatic atom
ring membership
chirality
mass
We can employ their numerical values directly or with one-hot encoding.
|
One hot encoding for the type of an atom. |
|
One hot encoding for the atomic number of an atom. |
Get the atomic number for an atom. |
|
|
One hot encoding for the degree of an atom. |
Get the degree of an atom. |
|
One hot encoding for the degree of an atom including Hs. |
|
The degree of an atom including Hs. |
|
One hot encoding for the explicit valence of an atom. |
|
Get the explicit valence of an atom. |
|
One hot encoding for the implicit valence of an atom. |
|
Get the implicit valence of an atom. |
|
One hot encoding for the hybridization of an atom. |
|
One hot encoding for the total number of Hs of an atom. |
|
Get the total number of Hs of an atom. |
|
One hot encoding for the formal charge of an atom. |
|
Get formal charge for an atom. |
|
One hot encoding for the number of radical electrons of an atom. |
|
Get the number of radical electrons for an atom. |
|
One hot encoding for whether the atom is aromatic. |
|
Get whether the atom is aromatic. |
|
One hot encoding for whether the atom is in ring. |
|
Get whether the atom is in ring. |
|
One hot encoding for the chiral tag of an atom. |
|
One hot encoding for the chirality type of an atom. |
|
|
Get the mass of an atom and scale it. |
Get whether the atom is a chiral center. |
To use featurization functions like the ones above for creating node features:
|
An abstract class for atom featurizers. |
Get the feature size for |
|
A default featurizer for atoms. |
|
Get the feature size for |
|
AtomFeaturizer in Strategies for Pre-training Graph Neural Networks. |
|
The atom featurizer used in AttentiveFP |
|
Get the feature size for |
|
The atom featurizer used in PAGTN |
|
Get the feature size for |
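As a rough picture of what an atom featurizer does, the sketch below maps feature names to per-atom functions, stacks one row per atom, and reports the feature size. It is a simplified, hypothetical stand-in: the class `ToyAtomFeaturizer`, the `probe_atom` argument, and the dictionary-based atom records are assumptions of this example and do not reproduce the library's classes, which work on RDKit molecules and return tensors.

```python
class ToyAtomFeaturizer:
    """Map each feature name to a per-atom function and stack the results."""

    def __init__(self, featurizer_funcs):
        self.featurizer_funcs = featurizer_funcs

    def __call__(self, atoms):
        # One row per atom, one column per feature entry.
        return {
            name: [[float(x) for x in func(atom)] for atom in atoms]
            for name, func in self.featurizer_funcs.items()
        }

    def feat_size(self, name, probe_atom):
        """Infer the feature length by featurizing a single probe atom."""
        return len(self.featurizer_funcs[name](probe_atom))


def degree_one_hot(atom):
    """One-hot encode the degree over the assumed range 0..4."""
    return [atom["degree"] == d for d in range(5)]


def total_num_h(atom):
    """Use the number of attached hydrogens as a numerical descriptor."""
    return [atom["num_hs"]]


# Hypothetical heavy-atom records for ethanol (C-C-O).
atoms = [
    {"symbol": "C", "degree": 1, "num_hs": 3},
    {"symbol": "C", "degree": 2, "num_hs": 2},
    {"symbol": "O", "degree": 1, "num_hs": 1},
]

featurizer = ToyAtomFeaturizer(
    {"h": lambda a: degree_one_hot(a) + total_num_h(a)}
)
node_feats = featurizer(atoms)
size = featurizer.feat_size("h", atoms[0])
```

Knowing the feature size up front is useful because it fixes the input dimension of the first layer of a graph neural network.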
Featurization for Edges¶
We consider the following bond descriptors:
type
conjugated bond
ring membership
stereo configuration
|
One hot encoding for the type of a bond. |
One hot encoding for whether the bond is conjugated. |
|
Get whether the bond is conjugated. |
|
One hot encoding for whether the bond is in a ring of any size. |
|
Get whether the bond is in a ring of any size. |
|
|
One hot encoding for the stereo configuration of a bond. |
|
One hot encoding for the direction of a bond. |
To use featurization functions like the ones above for creating edge features:
|
An abstract class for bond featurizers. |
Get the feature size for |
|
A default featurizer for bonds. |
|
Get the feature size for |
|
BondFeaturizer in Strategies for Pre-training Graph Neural Networks. |
|
The bond featurizer used in AttentiveFP |
|
Get the feature size for |
|
The edge featurizer used in PAGTN |
|
Get the feature size. |
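When bond features are attached to a bi-directed graph, each bond corresponds to two directed edges, so its feature row is needed once per direction. The sketch below illustrates this under stated assumptions: the helpers `bond_features` and `edge_features`, the `BOND_TYPES` list, and the dictionary-based bond records are hypothetical and only mimic the descriptors listed above (type, conjugation, ring membership).

```python
BOND_TYPES = ["SINGLE", "DOUBLE", "TRIPLE", "AROMATIC"]


def bond_features(bond):
    """One-hot bond type followed by conjugation and ring flags."""
    type_one_hot = [bond["type"] == t for t in BOND_TYPES]
    flags = [bond["conjugated"], bond["in_ring"]]
    return [float(x) for x in type_one_hot + flags]


def edge_features(bonds):
    """Emit one feature row per directed edge: (u, v) and then (v, u)."""
    edges, feats = [], []
    for bond in bonds:
        row = bond_features(bond)
        u, v = bond["begin"], bond["end"]
        for src, dst in ((u, v), (v, u)):
            edges.append((src, dst))
            feats.append(list(row))  # copy so rows can be edited independently
    return edges, feats


# Hypothetical bond records for an acetaldehyde-like skeleton C-C=O.
bonds = [
    {"begin": 0, "end": 1, "type": "SINGLE", "conjugated": False, "in_ring": False},
    {"begin": 1, "end": 2, "type": "DOUBLE", "conjugated": False, "in_ring": False},
]
edges, feats = edge_features(bonds)
```

Keeping the edge list and the feature rows in the same order is what lets the features be assigned to the graph's edges by position.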