Utils for Molecules

Utilities in DGL-LifeSci for working with molecules.

IO Utils

Utils for loading molecules and accessing their information.

dgllife.utils.get_mol_3d_coordinates(mol)

Get 3D coordinates of the molecule.

dgllife.utils.load_molecule(molecule_file[, …])

Load a molecule from a file in .mol2, .sdf, .pdbqt, or .pdb format.

dgllife.utils.multiprocess_load_molecules(files)

Load molecules in parallel with multiprocessing from files in .mol2, .sdf, .pdbqt, or .pdb format.

dgllife.utils.load_smiles_from_txt(file)

Load SMILES from a txt file.
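
As a rough illustration of what loading SMILES from a text file involves, the sketch below reads one SMILES string per line using only the standard library. The one-per-line layout is an assumption (the file format is not specified here), and `load_smiles_sketch` is a hypothetical helper, not the DGL-LifeSci function.

```python
import os
import tempfile

def load_smiles_sketch(path):
    """Read one SMILES string per line, skipping blank lines (assumed layout)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

# Write a small example file, then read it back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("CCO\nc1ccccc1\n\nCC(=O)O\n")
smiles = load_smiles_sketch(tmp.name)
os.remove(tmp.name)
print(smiles)  # ['CCO', 'c1ccccc1', 'CC(=O)O']
```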

Analysis

Utils for analyzing a collection of molecules.

dgllife.utils.analyze_mols([smiles, mols, …])

Analyze a collection of molecules.

Graph Construction

Modeling with graph neural networks starts with constructing an appropriate graph topology. We provide three common graph constructions:

  • bigraph: Bi-directed graphs corresponding exactly to molecular graphs

  • complete_graph: Graphs with all pairs of atoms connected

  • nearest_neighbor_graph: Graphs where each atom is connected to its k closest atoms based on molecule coordinates
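
The first two constructions can be illustrated with nothing more than a bond list. The sketch below is plain Python, not DGL-LifeSci code: the helpers `bidirected_edges` and `complete_edges` are hypothetical, and leaving out self loops is an assumption made for this example.

```python
from itertools import permutations

def bidirected_edges(bonds):
    """Return both directions (u, v) and (v, u) for every bond."""
    edges = []
    for u, v in bonds:
        edges.append((u, v))
        edges.append((v, u))
    return edges

def complete_edges(num_atoms):
    """Return every ordered pair of distinct atoms (no self loops)."""
    return list(permutations(range(num_atoms), 2))

# Heavy atoms of ethanol: C0-C1-O2
bonds = [(0, 1), (1, 2)]
bi_edges = bidirected_edges(bonds)
full_edges = complete_edges(3)
print(bi_edges)         # [(0, 1), (1, 0), (1, 2), (2, 1)]
print(len(full_edges))  # 6
```

The bi-directed graph keeps only bonded pairs, while the complete graph also links atoms 0 and 2, which share no bond.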

dgllife.utils.mol_to_graph(mol, …[, …])

Convert an RDKit molecule object into a DGLGraph and featurize it.

dgllife.utils.smiles_to_bigraph(smiles[, …])

Convert a SMILES string into a bi-directed DGLGraph and featurize it.

dgllife.utils.mol_to_bigraph(mol[, …])

Convert an RDKit molecule object into a bi-directed DGLGraph and featurize it.

dgllife.utils.smiles_to_complete_graph(smiles)

Convert a SMILES string into a complete DGLGraph and featurize it.

dgllife.utils.mol_to_complete_graph(mol[, …])

Convert an RDKit molecule into a complete DGLGraph and featurize it.

dgllife.utils.k_nearest_neighbors(…[, …])

Find the k nearest neighbors of each atom.
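
To make the idea of a nearest neighbor search concrete, here is a minimal brute-force sketch over 3D coordinates. The function name, its parameters, and its return format are hypothetical; the actual arguments of `k_nearest_neighbors` are elided above and may differ.

```python
import math

def k_nearest_neighbors_sketch(coords, k):
    """For each atom, return the indices of its k closest other atoms."""
    neighbors = {}
    for i, ci in enumerate(coords):
        # Distance from atom i to every other atom j
        dists = [(math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i]
        dists.sort()
        neighbors[i] = [j for _, j in dists[:k]]
    return neighbors

# Four atoms placed along the x axis
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
nbrs = k_nearest_neighbors_sketch(coords, k=2)
print(nbrs[0])  # [1, 2]
print(nbrs[3])  # [2, 1]
```

Note that the relation is not symmetric: atom 3 lists atom 1 as a neighbor, but atom 1 does not list atom 3.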

dgllife.utils.mol_to_nearest_neighbor_graph(…)

Convert an RDKit molecule into a nearest neighbor graph and featurize it.

dgllife.utils.smiles_to_nearest_neighbor_graph(…)

Convert a SMILES string into a nearest neighbor graph and featurize it.

Featurization for Molecules

To apply graph neural networks, we need to prepare node and edge features for molecules. These features are built from descriptors of atoms, bonds, and molecules. Numerical descriptors can be used directly, while categorical descriptors can be encoded with one_hot_encoding. When using multiple descriptors together, we can concatenate them with ConcatFeaturizer.

General Utils

dgllife.utils.one_hot_encoding(x, allowable_set)

One-hot encoding.
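
For readers new to one-hot encoding, the following plain-Python sketch shows the idea for a categorical descriptor such as the atom type. `one_hot_sketch` is a hypothetical stand-in, not the library's implementation, and returning all `False` for a value outside `allowable_set` is an assumption of this example.

```python
def one_hot_sketch(x, allowable_set):
    """Return a list of booleans marking which entry of allowable_set equals x."""
    return [x == s for s in allowable_set]

carbon = one_hot_sketch("C", ["C", "N", "O"])
sulfur = one_hot_sketch("S", ["C", "N", "O"])
print(carbon)  # [True, False, False]
print(sulfur)  # [False, False, False]
```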

dgllife.utils.ConcatFeaturizer(func_list)

Concatenate the evaluation results of multiple functions as a single feature.
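
Concatenating the results of several functions can be sketched as below. The code is a plain-Python illustration with a hypothetical helper `concat_features` and a toy atom stored as a dict; real code would pass an RDKit atom to functions such as those listed in this section.

```python
from itertools import chain

def concat_features(func_list, x):
    """Apply each function to x and join the resulting lists into one."""
    return list(chain.from_iterable(func(x) for func in func_list))

# Toy "atom" represented as a dict (an assumption for this example)
atom = {"symbol": "N", "degree": 3, "charge": 1}

funcs = [
    lambda a: [a["symbol"] == s for s in ["C", "N", "O"]],  # categorical, one-hot
    lambda a: [a["degree"]],                                # numerical, used directly
    lambda a: [a["charge"]],                                # numerical, used directly
]
features = concat_features(funcs, atom)
print(features)  # [False, True, False, 3, 1]
```

Each function returns a list, so the final feature length is the sum of the individual lengths (3 + 1 + 1 here).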

Featurization for Nodes

We consider the following atom descriptors:

  • type/atomic number

  • degree (excluding neighboring hydrogen atoms)

  • total degree (including neighboring hydrogen atoms)

  • explicit valence

  • implicit valence

  • hybridization

  • total number of neighboring hydrogen atoms

  • formal charge

  • number of radical electrons

  • aromatic atom

  • ring membership

  • chirality

  • mass

We can use their numerical values directly or apply one-hot encoding.

dgllife.utils.atom_type_one_hot(atom[, …])

One hot encoding for the type of an atom.

dgllife.utils.atomic_number_one_hot(atom[, …])

One hot encoding for the atomic number of an atom.

dgllife.utils.atomic_number(atom)

Get the atomic number for an atom.

dgllife.utils.atom_degree_one_hot(atom[, …])

One hot encoding for the degree of an atom.

dgllife.utils.atom_degree(atom)

Get the degree of an atom.

dgllife.utils.atom_total_degree_one_hot(atom)

One hot encoding for the degree of an atom including Hs.

dgllife.utils.atom_total_degree(atom)

Get the degree of an atom including Hs.

dgllife.utils.atom_explicit_valence_one_hot(atom)

One hot encoding for the explicit valence of an atom.

dgllife.utils.atom_explicit_valence(atom)

Get the explicit valence of an atom.

dgllife.utils.atom_implicit_valence_one_hot(atom)

One hot encoding for the implicit valence of an atom.

dgllife.utils.atom_implicit_valence(atom)

Get the implicit valence of an atom.

dgllife.utils.atom_hybridization_one_hot(atom)

One hot encoding for the hybridization of an atom.

dgllife.utils.atom_total_num_H_one_hot(atom)

One hot encoding for the total number of Hs of an atom.

dgllife.utils.atom_total_num_H(atom)

Get the total number of Hs of an atom.

dgllife.utils.atom_formal_charge_one_hot(atom)

One hot encoding for the formal charge of an atom.

dgllife.utils.atom_formal_charge(atom)

Get the formal charge of an atom.

dgllife.utils.atom_num_radical_electrons_one_hot(atom)

One hot encoding for the number of radical electrons of an atom.

dgllife.utils.atom_num_radical_electrons(atom)

Get the number of radical electrons for an atom.

dgllife.utils.atom_is_aromatic_one_hot(atom)

One hot encoding for whether the atom is aromatic.

dgllife.utils.atom_is_aromatic(atom)

Get whether the atom is aromatic.

dgllife.utils.atom_is_in_ring_one_hot(atom)

One hot encoding for whether the atom is in a ring.

dgllife.utils.atom_is_in_ring(atom)

Get whether the atom is in a ring.

dgllife.utils.atom_chiral_tag_one_hot(atom)

One hot encoding for the chiral tag of an atom.

dgllife.utils.atom_chirality_type_one_hot(atom)

One hot encoding for the chirality type of an atom.

dgllife.utils.atom_mass(atom[, coef])

Get the mass of an atom and scale it.

dgllife.utils.atom_is_chiral_center(atom)

Get whether the atom is a chiral center.

To create node features with featurization functions like those above, use the following featurizer classes:

dgllife.utils.BaseAtomFeaturizer(…[, …])

An abstract class for atom featurizers.

dgllife.utils.BaseAtomFeaturizer.feat_size([…])

Get the feature size for feat_name.
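
The relationship between a featurizer and its feature size can be sketched in plain Python. Everything in this example is hypothetical (the class `AtomFeaturizerSketch`, the dict-based toy atoms, and the way the size is measured on a dummy atom); it only illustrates the pattern of mapping feature names to per-atom functions and is not the BaseAtomFeaturizer implementation.

```python
class AtomFeaturizerSketch:
    """Map feature names to per-atom functions and collect results per name."""

    def __init__(self, featurizer_funcs):
        self.featurizer_funcs = featurizer_funcs

    def __call__(self, atoms):
        # One feature row per atom, grouped under each feature name
        return {
            name: [func(atom) for atom in atoms]
            for name, func in self.featurizer_funcs.items()
        }

    def feat_size(self, feat_name):
        # Featurize a dummy atom to measure the feature vector length
        dummy = {"symbol": "C", "degree": 0}
        return len(self.featurizer_funcs[feat_name](dummy))

def atom_feats(atom):
    """One-hot atom type followed by the numerical degree."""
    return [atom["symbol"] == s for s in ["C", "N", "O"]] + [atom["degree"]]

featurizer = AtomFeaturizerSketch({"h": atom_feats})
atoms = [{"symbol": "C", "degree": 1}, {"symbol": "O", "degree": 1}]
node_feats = featurizer(atoms)
print(node_feats["h"])            # [[True, False, False, 1], [False, False, True, 1]]
print(featurizer.feat_size("h"))  # 4
```

Knowing the feature size up front is useful when setting the input dimension of the first network layer.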

dgllife.utils.CanonicalAtomFeaturizer([…])

A default featurizer for atoms.

dgllife.utils.CanonicalAtomFeaturizer.feat_size([…])

Get the feature size for feat_name.

dgllife.utils.PretrainAtomFeaturizer([…])

The atom featurizer used in "Strategies for Pre-training Graph Neural Networks".

dgllife.utils.AttentiveFPAtomFeaturizer([…])

The atom featurizer used in AttentiveFP.

dgllife.utils.AttentiveFPAtomFeaturizer.feat_size([…])

Get the feature size for feat_name.

Featurization for Edges

We consider the following bond descriptors:

  • type

  • conjugated bond

  • ring membership

  • stereo configuration

dgllife.utils.bond_type_one_hot(bond[, …])

One hot encoding for the type of a bond.

dgllife.utils.bond_is_conjugated_one_hot(bond)

One hot encoding for whether the bond is conjugated.

dgllife.utils.bond_is_conjugated(bond)

Get whether the bond is conjugated.

dgllife.utils.bond_is_in_ring_one_hot(bond)

One hot encoding for whether the bond is in a ring of any size.

dgllife.utils.bond_is_in_ring(bond)

Get whether the bond is in a ring of any size.

dgllife.utils.bond_stereo_one_hot(bond[, …])

One hot encoding for the stereo configuration of a bond.

dgllife.utils.bond_direction_one_hot(bond[, …])

One hot encoding for the direction of a bond.

To create edge features with featurization functions like those above, use the following featurizer classes:

dgllife.utils.BaseBondFeaturizer(…[, …])

An abstract class for bond featurizers.
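
Because a bi-directed graph stores each bond as two directed edges, edge features need one row per direction. The sketch below illustrates this in plain Python; the helper names and the dict-based toy bonds are hypothetical, and repeating the same features for both directions is an assumption about how such a featurizer would typically behave.

```python
def featurize_bonds_sketch(bonds, bond_func):
    """Return directed edges and one feature row per directed edge."""
    edges, feats = [], []
    for u, v, bond in bonds:
        row = bond_func(bond)
        # Each bond yields two directed edges that share the same features
        edges.extend([(u, v), (v, u)])
        feats.extend([row, list(row)])
    return edges, feats

def bond_feats(bond):
    """One-hot bond type followed by a ring membership flag."""
    types = ["SINGLE", "DOUBLE", "TRIPLE", "AROMATIC"]
    return [bond["type"] == t for t in types] + [bond["in_ring"]]

# Heavy atoms of acetaldehyde: C0-C1=O2
bonds = [
    (0, 1, {"type": "SINGLE", "in_ring": False}),
    (1, 2, {"type": "DOUBLE", "in_ring": False}),
]
edges, edge_feats = featurize_bonds_sketch(bonds, bond_feats)
print(len(edges), len(edge_feats))  # 4 4
print(edge_feats[2])                # [False, True, False, False, False]
```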

dgllife.utils.BaseBondFeaturizer.feat_size([…])

Get the feature size for feat_name.

dgllife.utils.CanonicalBondFeaturizer([…])

A default featurizer for bonds.

dgllife.utils.CanonicalBondFeaturizer.feat_size([…])

Get the feature size for feat_name.

dgllife.utils.PretrainBondFeaturizer([…])

The bond featurizer used in "Strategies for Pre-training Graph Neural Networks".

dgllife.utils.AttentiveFPBondFeaturizer([…])

The bond featurizer used in AttentiveFP.

dgllife.utils.AttentiveFPBondFeaturizer.feat_size([…])

Get the feature size for feat_name.