biobuild base#
The lowest level of the biobuild base classes.
The base class for classes that store and manipulate biopython structures.
It houses most of the functionality that typical users of the library need.
The Molecule class adds further features on top.
- class biobuild.core.entity.BaseEntity(structure, model: int = 0)[source]#
Bases: object
The base class for all classes that store and handle biopython structures, namely the Molecule class.
- Parameters:
structure (Structure or Bio.PDB.Structure) – The biopython structure
model (int) – The index of the model to use (default: 0)
- add_atoms(*atoms: Atom, residue=None, _copy: bool = False)[source]#
Add atoms to the structure. This will automatically adjust the atom’s serial number to fit into the structure.
- Parameters:
atoms (base_classes.Atom) – The atoms to add
residue (int or str) – The residue to which the atoms should be added, given either as the seqid or as the residue name. If None, the atoms are added to the last residue. Note that if multiple identically named residues are present, the first one is chosen, so using the seqid is the safer option.
_copy (bool) – If True, the atoms are copied and then added to the structure. This will leave the original atoms (and their parent structures) untouched.
- add_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom, order: int = 1)[source]#
Add a bond between two atoms
- Parameters:
atom1 – The first atom to bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
atom2 – The second atom to bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
order (int) – The order of the bond, i.e. 1 for single, 2 for double, 3 for triple, etc.
- add_bonds(*bonds)[source]#
Add multiple bonds at once
- Parameters:
bonds – The bonds to add, each bond is a tuple of two atoms. Each atom may be specified directly (biopython object) or by providing the serial number, the full_id or the id of the atoms.
- add_chains(*chains: Chain, adjust_seqid: bool = True, _copy: bool = False)[source]#
Add chains to the structure
- Parameters:
chains (base_classes.Chain) – The chains to add
adjust_seqid (bool) – If True, the seqid of the chains is adjusted to match the current number of chains in the structure (i.e. a new chain can be given seqid A, and it will be adjusted to the correct value of C if there are already two other chains in the molecule).
_copy (bool) – If True, the chains are copied before adding them to the molecule. This is useful if you want to add the same chain to multiple molecules, while leaving them and their original parent structures intact.
- add_residues(*residues: Residue, adjust_seqid: bool = True, _copy: bool = False)[source]#
Add residues to the structure
- Parameters:
residues (base_classes.Residue) – The residues to add
adjust_seqid (bool) – If True, the seqid of the residues is adjusted to match the current number of residues in the structure (i.e. a new residue can be given seqid 1, and it will be adjusted to the correct value of 3 if there are already two other residues in the molecule).
_copy (bool) – If True, the residues are copied before adding them to the molecule. This is useful if you want to add the same residue to multiple molecules, while leaving them and their original parent structures intact.
- adjust_indexing(mol)[source]#
Adjust the indexing of a molecule to match the scaffold index
- Parameters:
mol (Molecule) – The molecule to adjust the indexing of
- apply_standard_bonds(_compounds=None) list[source]#
Get the standard bonds for the structure
- Parameters:
_compounds – The compounds to use for the standard bonds. If None, the default compounds are used.
- Returns:
A list of tuples of atom pairs that are bonded
- Return type:
list
- property atoms#
A sorted list of all atoms in the structure
- property attach_residue#
The residue at which to attach other molecules to this one.
- autolabel()[source]#
Automatically label atoms in the structure to match the CHARMM force field atom nomenclature. This is useful if you want to use some pre-generated PDB file that may have used a different labelling scheme for atoms.
Note
The labels are inferred and therefore may occasionally not be “correct”. It is advisable to check the labels after using this method.
- property bonds#
All bonds in the structure
- property chains#
A sorted list of all chains in the structure
- compute_angle(atom1: str | int | Atom, atom2: str | int | Atom, atom3: str | int | Atom)[source]#
Compute the angle between three atoms where atom2 is the middle atom.
- Parameters:
atom1 – The first atom
atom2 – The second atom
atom3 – The third atom
- Returns:
The angle in degrees
- Return type:
float
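The geometry behind this measurement can be sketched without the library. The following is a minimal illustration, assuming plain coordinate tuples instead of Atom objects; `angle_between` is a hypothetical helper and not part of biobuild.

```python
import math


def angle_between(p1, p2, p3):
    """Return the angle (in degrees) at p2 that is spanned by p1 and p3."""
    # Vectors pointing from the middle atom to its two neighbors
    v1 = [a - b for a, b in zip(p1, p2)]
    v2 = [a - b for a, b in zip(p3, p2)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Illustrative coordinates, not taken from a real structure
right_angle = angle_between((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
straight = angle_between((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (-2.0, 0.0, 0.0))
```

As in the method above, the second point plays the role of the middle atom.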
- compute_angles()[source]#
Compute all angles of consecutively bonded atom triplets within the molecule.
- Returns:
angles – A dictionary of the form {atom_triplet: angle}
- Return type:
dict
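What "consecutively bonded atom triplets" means can be shown on a plain bond list. This is a sketch only, assuming atoms are represented by unique string labels; `bonded_triplets` is a hypothetical helper, and the library may order or store its triplets differently.

```python
def bonded_triplets(bonds):
    """Return all (first, center, last) triplets in which both outer atoms
    are bonded to the center atom."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    triplets = set()
    for center, partners in neighbors.items():
        for first in partners:
            for last in partners:
                # Keep only one orientation of each triplet
                if first < last:
                    triplets.add((first, center, last))
    return triplets


# Illustrative bonds: C2 is bonded to C1, C3 and O2
example_triplets = bonded_triplets([("C1", "C2"), ("C2", "C3"), ("C2", "O2")])
```

Each triplet could then be passed to an angle computation to build a `{atom_triplet: angle}` dictionary of the form described above.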
- compute_dihedral(atom1: str | int | Atom, atom2: str | int | Atom, atom3: str | int | Atom, atom4: str | int | Atom)[source]#
Compute the dihedral angle between four atoms
- Parameters:
atom1 – The first atom
atom2 – The second atom
atom3 – The third atom
atom4 – The fourth atom
- Returns:
The dihedral angle in degrees
- Return type:
float
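For reference, a dihedral angle can be computed from four points as follows. This is a minimal sketch assuming plain coordinate tuples and one common sign convention; `dihedral` and its helpers are hypothetical names, and the sign convention used by the library may differ.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dihedral(p1, p2, p3, p4):
    """Return the dihedral angle (in degrees) around the p2-p3 axis."""
    b0 = _sub(p1, p2)
    b1 = _sub(p3, p2)
    b2 = _sub(p4, p3)
    # Normalize the central bond vector
    length = math.sqrt(_dot(b1, b1))
    b1 = [x / length for x in b1]
    # Project the outer bonds onto the plane perpendicular to the central bond
    v = _sub(b0, [_dot(b0, b1) * x for x in b1])
    w = _sub(b2, [_dot(b2, b1) * x for x in b1])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Illustrative coordinates: a cis (eclipsed) and a trans arrangement
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```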
- compute_dihedrals()[source]#
Compute all dihedrals of consecutively bonded atom quartets within the molecule.
- Returns:
dihedrals – A dictionary of the form {atom_quartet: dihedral}
- Return type:
dict
- count_atoms() int[source]#
Count the number of atoms in the structure
- Returns:
The number of atoms
- Return type:
int
- count_bonds() int[source]#
Count the number of bonds in the structure
- Returns:
The number of bonds
- Return type:
int
- count_chains() int[source]#
Count the number of chains in the structure
- Returns:
The number of chains
- Return type:
int
- count_clashes(clash_threshold: float = 0.9) int[source]#
Count all clashes in the molecule.
- Parameters:
clash_threshold (float, optional) – The minimal allowed distance between two atoms (in Angstrom).
- Returns:
The number of clashes.
- Return type:
int
- count_residues() int[source]#
Count the number of residues in the structure
- Returns:
The number of residues
- Return type:
int
- draw(residue_graph: bool = False)[source]#
Prepare a view of the molecule in 3D using Plotly but do not open a browser window.
- Parameters:
residue_graph (bool) – If True, a residue graph is shown instead of the full structure.
- Returns:
viewer – The viewer object
- Return type:
- find_clashes(clash_threshold: float = 0.9) list[source]#
Find all clashes in the molecule.
- Parameters:
clash_threshold (float, optional) – The minimal allowed distance between two atoms (in Angstrom).
- Returns:
A list of tuples of atoms that clash.
- Return type:
list
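The idea of a distance-based clash check can be illustrated independently of the library. The sketch below assumes a dict that maps atom labels to coordinates and treats any pair closer than the threshold as a clash; `clashing_pairs` is a hypothetical helper, and whether the library compares with `<` or `<=` is not stated here.

```python
import math
from itertools import combinations


def clashing_pairs(coords, clash_threshold=0.9):
    """Return all pairs of atom labels whose distance is below the threshold
    (in Angstrom)."""
    return [
        (a, b)
        for (a, pos_a), (b, pos_b) in combinations(coords.items(), 2)
        if math.dist(pos_a, pos_b) < clash_threshold
    ]


# Illustrative coordinates: H1 sits too close to C1, O1 is far away
coords = {"C1": (0.0, 0.0, 0.0), "H1": (0.5, 0.0, 0.0), "O1": (3.0, 0.0, 0.0)}
clashes = clashing_pairs(coords)
```

Counting clashes then amounts to taking the length of the returned list.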
- classmethod from_cif(filename: str, id: str = None)[source]#
Load a Molecule from a CIF file
- Parameters:
filename (str) – Path to the CIF file
id (str) – The id of the Molecule. By default an id is inferred from the filename.
- classmethod from_json(filename: str)[source]#
Make a Molecule from a JSON file
- Parameters:
filename (str) – Path to the JSON file
- classmethod from_molfile(filename: str)[source]#
Make a Molecule from a molfile
- Parameters:
filename (str) – Path to the molfile
- classmethod from_openmm(topology, positions)[source]#
Load a Molecule from an OpenMM topology and positions
- Parameters:
topology (simtk.openmm.app.Topology) – The OpenMM topology
positions (simtk.unit.Quantity) – The OpenMM positions
- classmethod from_pdb(filename: str, id: str = None)[source]#
Read a Molecule from a PDB file
- Parameters:
filename (str) – Path to the PDB file
root_atom (str or int) – The id or the serial number of the root atom (optional)
id (str) – The id of the Molecule. By default an id is inferred from the filename.
- classmethod from_pybel(mol)[source]#
Load a Molecule from a Pybel molecule
- Parameters:
mol (pybel.Molecule) – The Pybel molecule
- classmethod from_rdkit(mol, id: str = None)[source]#
Load a Molecule from an RDKit molecule
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule
id (str) – The id of the Molecule. By default an id is inferred from the “_Name” property of the mol object (if present).
- get_ancestors(atom1: str | int | Atom, atom2: str | int | Atom)[source]#
Get the atoms upstream of a bond. This will return the set of all atoms that are connected before the bond atom1-atom2 in the direction of atom1, the selection can be reversed by reversing the order of atoms (atom2-atom1).
- Parameters:
atom1 – The first atom
atom2 – The second atom
- Returns:
A set of atoms
- Return type:
set
Examples
```
                OH
               /
(1)CH3 --- CH
               \
                CH2 --- (2)CH3
```
```
>>> mol.get_ancestors("(1)CH3", "CH")
set()
>>> mol.get_ancestors("CH", "CH2")
{"(1)CH3", "OH"}
>>> mol.get_ancestors("CH2", "CH")
{"(2)CH3"}
```
- get_atom(atom: int | str | tuple, by: str = None, residue: int | Residue = None)[source]#
Get an atom from the structure either based on its id, serial number or full_id. Note, if multiple atoms match the requested criteria, for instance there are multiple ‘C1’ from different residues, only the first one is returned. To get all atoms matching the criteria, use the get_atoms method.
- Parameters:
atom – The atom id, serial number, full_id tuple, or element symbol.
by (str) – The type of parameter to search for. Can be either ‘id’, ‘serial’, ‘full_id’, or ‘element’. Because this looks for one specific atom, this parameter can be inferred from the datatype of the atom parameter. If it is an integer, it is assumed to be the serial number, if it is a string, it is assumed to be the atom id and if it is a tuple, it is assumed to be the full_id.
residue (int or Residue) – A specific residue to search in. If None, the entire structure is searched.
- Returns:
atom – The atom
- Return type:
- get_atom_graph(_copy: bool = True)[source]#
Get an AtomGraph for the Molecule
- Parameters:
_copy (bool) – If True, a new AtomGraph is returned instead of the “original” AtomGraph object that the Molecule relies on. However, the molecule will still be linked to the new graph. This is useful if you want to make changes to the graph itself (not including changes to the graph nodes, i.e. the atoms themselves, such as rotations).
- Returns:
The generated graph
- Return type:
- get_atom_quartets() list[source]#
Compute quartets of four consecutively bonded atoms
- Returns:
atom_quartets – A list of atom quartets
- Return type:
list
- get_atoms(*atoms: int | str | tuple, by: str = None) list[source]#
Get one or more atoms from the structure either based on their id, serial number or full_id. Note, if multiple atoms match the requested criteria, for instance there are multiple ‘C1’ from different residues all of them are returned in a list. It is a safer option to use the full_id or serial number to retrieve a specific atom. If no search parameters are provided, the underlying iterator of the structure is returned.
Note
This does not support mixed queries. I.e. you cannot query for an atom with id ‘C1’ and serial number 1 at the same time. Each call can only query for one type of parameter.
- Parameters:
atoms – The atom id, serial number, full_id tuple, or element string symbol. This supports multiple atoms to search for. However, only one type of parameter is supported per call. If left empty, the underlying generator is returned.
by (str) – The type of parameter to search for. Can be either ‘id’, ‘serial’ or ‘full_id’. If None is given, the parameter is inferred from the datatype of the atoms argument: ‘serial’ in case of int, ‘id’ in case of str, and ‘full_id’ in case of a tuple.
- Returns:
atom – The atom(s)
- Return type:
list or generator
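The datatype-based inference of the `by` parameter described above can be sketched as a small dispatch function. `infer_query_type` is a hypothetical name used for illustration; the library's internal logic may differ.

```python
def infer_query_type(query):
    """Map the datatype of a query to the kind of identifier it represents."""
    if isinstance(query, int):
        return "serial"   # integers are read as serial numbers
    if isinstance(query, str):
        return "id"       # strings are read as atom ids
    if isinstance(query, tuple):
        return "full_id"  # tuples are read as full ids
    raise TypeError(f"cannot infer query type from {type(query).__name__}")


query_types = [infer_query_type(q) for q in (12, "C1", ("mol", 0, "A", 1, "C1"))]
```

Because the inference looks at one datatype at a time, mixing ids and serial numbers in a single call is ambiguous, which is consistent with the note above that mixed queries are not supported.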
- get_attach_residue()[source]#
Get the residue that is used for attaching other molecules to this one.
- get_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom, add_if_not_present: bool = True) Bond[source]#
Get/make a bond between two atoms.
- get_bonds(atom1: int | str | tuple | Atom | Residue = None, atom2: int | str | tuple | Atom = None, residue_internal: bool = True, either_way: bool = True)[source]#
Get one or multiple bonds from the molecule. If only one atom is provided, all bonds that are connected to that atom are returned.
- Parameters:
atom1 – The atom id, serial number or full_id tuple of the first atom. This may also be a residue, in which case all bonds between atoms in that residue are returned.
atom2 – The atom id, serial number or full_id tuple of the second atom
residue_internal (bool) – If True, only bonds where both atoms are in the given residue (if atom1 is a residue) are returned. If False, all bonds where either atom is in the given residue are returned.
either_way (bool) – If True, the order of the atoms does not matter; if False, it does. By setting this to False, it is possible to also search for bonds that have a specific atom in position 1 or 2, depending on which argument was set, while leaving the other input as None.
- Returns:
bond – The bond(s). If no input is given, all bonds are returned as a generator.
- Return type:
list or generator
- get_chain(chain: str)[source]#
Get a chain from the structure based on its id.
- Parameters:
chain – The chain id
- Returns:
chain – The chain
- Return type:
- get_descendants(atom1: str | int | Atom, atom2: str | int | Atom)[source]#
Get the atoms downstream of a bond. This will return the set of all atoms that are connected after the bond atom1-atom2 in the direction of atom2, the selection can be reversed by reversing the order of atoms (atom2-atom1).
- Parameters:
atom1 – The first atom
atom2 – The second atom
- Returns:
A set of atoms
- Return type:
set
Examples
```
                OH
               /
(1)CH3 --- CH
               \
                CH2 --- (2)CH3
```
```
>>> mol.get_descendants("(1)CH3", "CH")
{"OH", "CH2", "(2)CH3"}
>>> mol.get_descendants("CH", "CH2")
{"(2)CH3"}
>>> mol.get_descendants("CH2", "CH")
{"OH", "(1)CH3"}
```
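The upstream and downstream selections can be pictured as a graph traversal that is not allowed to cross the bond in question. The following is a minimal sketch, assuming an acyclic molecule stored as a plain adjacency dict; `atoms_beyond`, `descendants` and `ancestors` are hypothetical helpers and not the library's implementation.

```python
from collections import deque


def atoms_beyond(adjacency, start, blocked):
    """Return all atoms reachable from `start` without passing through
    `blocked`, excluding `start` itself."""
    seen = {start, blocked}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start, blocked}


def descendants(adjacency, atom1, atom2):
    # Atoms after the bond atom1-atom2, in the direction of atom2
    return atoms_beyond(adjacency, atom2, atom1)


def ancestors(adjacency, atom1, atom2):
    # Atoms before the bond atom1-atom2, in the direction of atom1
    return atoms_beyond(adjacency, atom1, atom2)


# The example molecule from this section as an adjacency dict
adjacency = {
    "(1)CH3": {"CH"},
    "CH": {"(1)CH3", "OH", "CH2"},
    "OH": {"CH"},
    "CH2": {"CH", "(2)CH3"},
    "(2)CH3": {"CH2"},
}
```

For bonds inside a ring both sides remain connected through the rest of the ring, so this simple traversal no longer separates the molecule into two parts.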
- get_linkage()[source]#
Get the linkage that is currently set as the default attachment specification for this molecule
- get_neighbors(atom: int | str | tuple | Atom, n: int = 1, mode: str = 'upto')[source]#
Get the neighbors of an atom.
- Parameters:
atom – The atom
n – The number of bonds that may separate the atom from its neighbors.
mode – The mode to use. Can be “upto” or “at”. If upto, all neighbors that are at most n bonds away are returned. If at, only neighbors that are exactly n bonds away are returned.
- Returns:
A set of atoms
- Return type:
set
Examples
```
               O --- (2)CH2
              /         \
(1)CH3 --- CH            OH
              \
               (1)CH2 --- (2)CH3
```
```
>>> mol.get_neighbors("(2)CH2", n=1)
{"O", "OH"}
>>> mol.get_neighbors("(2)CH2", n=2, mode="upto")
{"O", "OH", "CH"}
>>> mol.get_neighbors("(2)CH2", n=2, mode="at")
{"CH"}
```
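The difference between the "upto" and "at" modes can be sketched as a breadth-first search that records how many bonds separate each atom from the starting atom. This is an illustration on a plain adjacency dict; `neighbors_within` is a hypothetical helper and not the library's implementation.

```python
def neighbors_within(adjacency, atom, n=1, mode="upto"):
    """Return atoms that are at most ("upto") or exactly ("at") n bonds away."""
    if mode not in ("upto", "at"):
        raise ValueError(f"unknown mode: {mode!r}")
    distance = {atom: 0}
    frontier = [atom]
    # Expand the search one bond at a time, up to n bonds
    for depth in range(1, n + 1):
        next_frontier = []
        for current in frontier:
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = depth
                    next_frontier.append(neighbor)
        frontier = next_frontier
    if mode == "at":
        return {a for a, d in distance.items() if d == n and a != atom}
    return {a for a, d in distance.items() if 0 < d <= n}


# The example molecule from this section as an adjacency dict
adjacency = {
    "(1)CH3": {"CH"},
    "CH": {"(1)CH3", "O", "(1)CH2"},
    "O": {"CH", "(2)CH2"},
    "(2)CH2": {"O", "OH"},
    "OH": {"(2)CH2"},
    "(1)CH2": {"CH", "(2)CH3"},
    "(2)CH3": {"(1)CH2"},
}
```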
- get_residue(residue: int | str | tuple | Residue, by: str = None, chain=None)[source]#
Get a residue from the structure either based on its name, serial number or full_id. Note, if multiple residues match the requested criteria, for instance there are multiple ‘MAN’ from different chains, only the first one is returned.
- Parameters:
residue – The residue id, seqid or full_id tuple
by (str) – The type of parameter to search for. Can be either ‘name’, ‘seqid’ or ‘full_id’. By default, this is inferred from the datatype of the residue parameter: an integer is assumed to be the sequence identifying number, a string is assumed to be the residue name, and a tuple is assumed to be the full_id.
chain (str) – Further restrict to a residue from a specific chain.
- Returns:
residue – The residue
- Return type:
- get_residue_connections(residue_a=None, residue_b=None, triplet: bool = True, rotatable_only: bool = False)[source]#
Get bonds between atoms that connect different residues in the structure. This method differs from infer_residue_connections in that it works with the bonds already present in the molecule instead of computing new ones.
- Parameters:
residue_a (Union[int, str, tuple, base_classes.Residue]) – The first residue to consider. If None, all residues are considered. Otherwise, only connections between the specified residues are considered.
residue_b (Union[int, str, tuple, base_classes.Residue]) – The second residue to consider. If None, all residues are considered. Otherwise, only connections between the specified residues are considered.
triplet (bool) – Whether to include bonds between atoms that are in the same residue but neighboring a bond that connects different residues. This is useful for residues that have a side chain that is connected to the main chain. This is mostly useful if you intend to use the returned list for some purpose, because the additionally returned bonds are already present in the structure from inference or standard-bond applying and therefore do not actually add any particular information to the Molecule object itself.
rotatable_only (bool) – Whether to only return bonds that are rotatable. This is useful if you want to use the returned bonds for optimization.
- Returns:
A set of tuples of atom pairs that are bonded and connect different residues
- Return type:
list
- get_residue_graph(detailed: bool = False, locked: bool = True)#
Generate a ResidueGraph for the Molecule
- Parameters:
detailed (bool) – If True the graph will include the residues and all atoms that form bonds connecting different residues. If False, the graph will only include the residues and their connections without factual bonds between any existing atoms.
locked (bool) – If True, the graph will also migrate the information on any locked bonds into the graph. This is only relevant if detailed is True.
- get_residues(*residues: int | str | tuple | Residue, by: str = None, chain=None)[source]#
Get residues from the structure either based on their name, serial number or full_id.
- Parameters:
residues – The residues’ id, seqid or full_id tuple. If None is passed, the iterator over all residues is returned.
by (str) – The type of parameter to search for. Can be either ‘name’, ‘seqid’ or ‘full_id’. By default, this is inferred from the datatype of the residue parameter: an integer is assumed to be the sequence identifying number, a string is assumed to be the residue name, and a tuple is assumed to be the full_id.
chain (str) – Further restrict to residues from a specific chain.
- Returns:
The residue(s)
- Return type:
list or generator
- get_root() Atom[source]#
Get the root atom of the molecule. The root atom is the atom at which it is attached to another molecule.
- property id#
- infer_bonds(max_bond_length: float = None, restrict_residues: bool = True) list[source]#
Infer bonds between atoms in the structure
- Parameters:
max_bond_length (float) – The maximum distance between atoms to consider them bonded. If None, the default value is 1.6 Angstroms.
restrict_residues (bool) – Whether to restrict bonds to only those in the same residue. If False, bonds between atoms in different residues are also inferred.
- Returns:
A list of tuples of atom pairs that are bonded
- Return type:
list
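Distance-based bond inference, including the effect of `restrict_residues`, can be sketched as follows. The example assumes atoms given as `(label, residue, coordinates)` tuples; `infer_bonds_by_distance` is a hypothetical helper, and the library may apply additional criteria.

```python
import math
from itertools import combinations


def infer_bonds_by_distance(atoms, max_bond_length=1.6, restrict_residues=True):
    """Return pairs of atom labels that lie within max_bond_length (Angstrom)."""
    bonds = []
    for (name_a, res_a, xyz_a), (name_b, res_b, xyz_b) in combinations(atoms, 2):
        # Optionally skip atom pairs that belong to different residues
        if restrict_residues and res_a != res_b:
            continue
        if math.dist(xyz_a, xyz_b) <= max_bond_length:
            bonds.append((name_a, name_b))
    return bonds


# Illustrative atoms: C1 and O1 in residue 1, C4 in residue 2
atoms = [
    ("C1", 1, (0.0, 0.0, 0.0)),
    ("O1", 1, (1.4, 0.0, 0.0)),
    ("C4", 2, (2.8, 0.0, 0.0)),
]
within_residues = infer_bonds_by_distance(atoms)
across_residues = infer_bonds_by_distance(atoms, restrict_residues=False)
```

With the restriction lifted, the O1 and C4 pair is also reported because its atoms sit in different residues but within bonding distance.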
- infer_residue_connections(bond_length: float | tuple = None, triplet: bool = True) list[source]#
Infer bonds between atoms that connect different residues in the structure
- Parameters:
bond_length (float or tuple) – If a float is given, the maximum distance between atoms to consider them bonded. If a tuple, the minimal and maximal distance between atoms. If None, the default value is min 0.8 Angstrom, max 1.6 Angstroms.
triplet (bool) – Whether to include bonds between atoms that are in the same residue but neighboring a bond that connects different residues. This is useful for residues that have a side chain that is connected to the main chain. This is mostly useful if you intend to use the returned list for some purpose, because the additionally returned bonds are already present in the structure from inference or standard-bond applying and therefore do not actually add any particular information to the Molecule object itself.
- Returns:
A list of tuples of atom pairs that are bonded and considered residue connections.
- Return type:
list
Examples
For a molecule with the following structure:
```
  connection -->  OA      OB --- H
                 /  \    /
 (1)CA --- (2)CA     (1)CB
  /           \         \
(6)CA        (3)CA      (2)CB --- (3)CB
  \           /
 (5)CA --- (4)CA
```
The circular residue A and the linear residue B are connected through the oxygen OA, which is bonded to both (2)CA and (1)CB. With triplet=False, because OA is originally associated with residue A, only the bond OA --- (1)CB is returned. With triplet=True (the default), the bond OA --- (2)CA is also returned, because the entire connecting “bridge” between residues A and B spans both bonds around OA.
```
>>> mol.infer_residue_connections(triplet=False)
[("OA", "(1)CB")]
>>> mol.infer_residue_connections(triplet=True)
[("OA", "(1)CB"), ("OA", "(2)CA")]
```
- is_locked(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#
Check if a bond is locked
- Parameters:
atom1 – The first atom of the bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
atom2 – The second atom of the bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
- Returns:
True if the bond is locked, False otherwise
- Return type:
bool
- property linkage#
The patch or recipe to use for attaching other molecules to this one
- classmethod load(filename: str)[source]#
Load a Molecule from a pickle file
- Parameters:
filename (str) – Path to the file
- lock_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#
Lock a bond between two atoms
- Parameters:
atom1 – The first atom of the bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
atom2 – The second atom of the bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
- property locked_bonds#
All bonds that are locked and cannot be rotated around.
- make_atom_graph(_copy: bool = True)#
Get an AtomGraph for the Molecule
- Parameters:
_copy (bool) – If True, a new AtomGraph is returned instead of the “original” AtomGraph object that the Molecule relies on. However, the molecule will still be linked to the new graph. This is useful if you want to make changes to the graph itself (not including changes to the graph nodes, i.e. the atoms themselves, such as rotations).
- Returns:
The generated graph
- Return type:
- make_residue_graph(detailed: bool = False, locked: bool = True)[source]#
Generate a ResidueGraph for the Molecule
- Parameters:
detailed (bool) – If True the graph will include the residues and all atoms that form bonds connecting different residues. If False, the graph will only include the residues and their connections without factual bonds between any existing atoms.
locked (bool) – If True, the graph will also migrate the information on any locked bonds into the graph. This is only relevant if detailed is True.
- property model#
The biopython model
- property patch#
The patch to use for attaching other molecules to this one (synonym for recipe)
- purge_bonds(atom: int | str | Atom = None)[source]#
Remove all bonds connected to an atom
- Parameters:
atom – The atom to remove the bonds from, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms. If None, all bonds are removed.
- quartet(atom1: str | int | Atom, atom2: str | int | Atom, atom3: str | int | Atom, atom4: str | int | Atom)[source]#
Make an atom quartet from four atoms.
- Parameters:
atom1 – The four atoms that make up the quartet.
atom2 – The four atoms that make up the quartet.
atom3 – The four atoms that make up the quartet.
atom4 – The four atoms that make up the quartet.
- property recipe#
The recipe to use for stitching other molecules to this one (synonym for patch)
- reindex(start_chainid: int = 1, start_resid: int = 1, start_atomid: int = 1)[source]#
Reindex the atoms and residues in the structure. You can use this method if you made substantial changes to the molecule and want to be sure that there are no gaps in the atom and residue numbering.
- Parameters:
start_chainid (int) – The starting chain id (default: 1=A, 2=B, …, 26=Z, 27=AA, 28=AB, …)
start_resid (int) – The starting residue id
start_atomid (int) – The starting atom id
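The numbering scheme for chain ids given above (1=A, 26=Z, 27=AA, 28=AB) corresponds to a bijective base-26 conversion. A minimal sketch, where `chain_label` is a hypothetical helper rather than a library function:

```python
def chain_label(index):
    """Convert a 1-based chain index into a letter label (1=A, 27=AA, ...)."""
    if index < 1:
        raise ValueError("chain index must be >= 1")
    label = ""
    while index > 0:
        # Shift by one because there is no "zero" letter in this scheme
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


labels = [chain_label(i) for i in (1, 2, 26, 27, 28)]
```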
- relabel_hydrogens()[source]#
Relabel hydrogen atoms in the structure to match the standard labelling according to the CHARMM force field. This is useful if you want to use some pre-generated PDB file that may have used a different labelling scheme for atoms.
- remove_atoms(*atoms: int | str | tuple | Atom) list[source]#
Remove one or more atoms from the structure
- Parameters:
atoms – The atoms to remove, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.
- Returns:
The removed atoms
- Return type:
list
- remove_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#
Remove a bond between two atoms
- Parameters:
atom1 – The first atom of the bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
atom2 – The second atom of the bond, which can be provided either directly (biopython object) or by its serial number, full_id, or id.
- remove_residues(*residues: int | Residue) list[source]#
Remove residues from the structure
- Parameters:
residues (int or base_classes.Residue) – The residues to remove, either the object itself or its seqid
- Returns:
The removed residues
- Return type:
list
- rename_atom(atom: int | Atom, name: str, residue: int | Residue = None)[source]#
Rename an atom
- Parameters:
atom (int or base_classes.Atom) – The atom to rename, either the object itself or its serial number
name (str) – The new name (id)
residue (int or base_classes.Residue) – The residue to which the atom belongs, either the object itself or its seqid. Useful when giving a possibly redundant id as identifier in multi-residue molecules.
- rename_chain(chain: str | Chain, name: str)[source]#
Rename a chain
- Parameters:
chain (str or Chain) – The chain to rename, either the object itself or its id
name (str) – The new name
- rename_residue(residue: int | Residue, name: str)[source]#
Rename a residue
- Parameters:
residue (int or Residue) – The residue to rename, either the object itself or its seqid
name (str) – The new name
- property residues#
A sorted list of all residues in the structure
- property root_atom#
The root atom of this molecule/scaffold at which it is attached to another molecule/scaffold
- property root_residue#
The residue of the root atom
- rotate_ancestors(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, angle_is_degrees: bool = True)[source]#
Rotate all ancestor atoms (atoms before atom1) of a bond
- Parameters:
atom1 (Union[str, int, base_classes.Atom]) – The first atom (whose upstream neighbors are rotated)
atom2 (Union[str, int, base_classes.Atom]) – The second atom
angle (float) – The angle to rotate by
angle_is_degrees (bool) – Whether the angle is given in degrees (default) or radians
- rotate_around_bond(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, descendants_only: bool = False, angle_is_degrees: bool = True)[source]#
Rotate the structure around a bond
- Parameters:
atom1 – The first atom
atom2 – The second atom
angle – The angle to rotate by (in degrees unless angle_is_degrees is False)
descendants_only – Whether to only rotate the descendants of the bond, i.e. only atoms that come after atom2 (sensible only for linear molecules, or bonds that are not part of a circular structure).
angle_is_degrees – Whether the angle is given in degrees (default) or radians
Examples
For a molecule starting as:
```
                OH
               /
(1)CH3 --- CH
               \
                CH2 --- (2)CH3
```
we can rotate around the bond (1)CH3 --- CH by 180° using
>>> angle = 180
>>> mol.rotate_around_bond("(1)CH3", "CH", angle)
and thus achieve the following:
```
                CH2 --- (2)CH3
               /
(1)CH3 --- CH
               \
                OH
```
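Geometrically, a rotation around a bond moves each affected atom around the axis that passes through the two bonded atoms. The sketch below uses Rodrigues' rotation formula on plain coordinate tuples; `rotate_about_axis` is a hypothetical helper, and the library's own implementation is not shown here.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle, angle_is_degrees=True):
    """Rotate `point` around the axis from axis_start to axis_end."""
    theta = math.radians(angle) if angle_is_degrees else angle
    # Unit vector along the rotation axis (the bond)
    k = [e - s for s, e in zip(axis_start, axis_end)]
    length = math.sqrt(sum(c * c for c in k))
    k = [c / length for c in k]
    # Position of the point relative to a point on the axis
    v = [p - s for p, s in zip(point, axis_start)]
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rodrigues' formula: v cos(t) + (k x v) sin(t) + k (k . v) (1 - cos(t))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return tuple(r + s for r, s in zip(rotated, axis_start))


# Illustrative rotation of a point by 180 degrees around the z-axis
flipped = rotate_about_axis((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 180)
```

Applying such a rotation only to the atoms on one side of the bond corresponds to the descendants_only behavior described above.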
- rotate_descendants(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, angle_is_degrees: bool = True)[source]#
Rotate all descendant atoms (atoms after atom2) of a bond.
- Parameters:
atom1 (Union[str, int, base_classes.Atom]) – The first atom
atom2 (Union[str, int, base_classes.Atom]) – The second atom (whose downstream neighbors are rotated)
angle (float) – The angle to rotate by
angle_is_degrees (bool) – Whether the angle is given in degrees (default) or radians
- save(filename: str)[source]#
Save the object to a pickle file
- Parameters:
filename (str) – Path to the pickle file
- set_attach_residue(residue: int | Residue = None)[source]#
Set the residue that is used for attaching other molecules to this one.
- Parameters:
residue – The residue to be used for attaching other molecules to this one
- set_linkage(link: str | Linkage = None, _topology=None)[source]#
Set a linkage to be used for attaching other molecules to this one
- Parameters:
link (str or Linkage) – The linkage to be used. Can be either a string with the name of a known Linkage in the loaded topology, or an instance of the Linkage class. If None is given, the currently loaded default linkage is removed.
_topology – The topology to use for referencing the link.
- set_root(atom)[source]#
Set the root atom of the molecule
- Parameters:
atom (Atom or int or str or tuple) – The atom to be used as the root atom. This may be an Atom object, an atom serial number, an atom id (must be unique), or the full-id tuple.
- show(residue_graph: bool = False)[source]#
Open a browser window to view the molecule in 3D using Plotly
- Parameters:
residue_graph (bool) – If True, a residue graph is shown instead of the full structure.
- property structure#
The biopython structure
- to_biopython()[source]#
Convert the molecule to a Biopython structure
- Returns:
The Biopython structure
- Return type:
Bio.PDB.Structure.Structure
- to_cif(filename: str)[source]#
Write the molecule to a CIF file
- Parameters:
filename (str) – Path to the CIF file
- to_json(filename: str, type: str = None, names: list = None, identifiers: list = None, one_letter_code: str = None, three_letter_code: str = None)[source]#
Write the molecule to a JSON file
- Parameters:
filename (str) – Path to the JSON file
type (str) – The type of the molecule to be written to the JSON file (e.g. “protein”, “ligand”, etc.).
names (list) – A list of names of the molecules to be written to the JSON file.
identifiers (list) – A list of identifiers of the molecules to be written to the JSON file (e.g. SMILES, InChI, etc.).
one_letter_code (str) – A one-letter code for the molecule to be written to the JSON file.
three_letter_code (str) – A three-letter code for the molecule to be written to the JSON file.
- to_molfile(filename: str)[source]#
Write the molecule to a Molfile
- Parameters:
filename (str) – Path to the Mol file
- to_openmm()[source]#
Convert the molecule to an OpenMM Topology
- Returns:
The OpenMM topology
- Return type:
openmm.app.Topology
- to_pdb(filename: str, symmetric: bool = True)[source]#
Write the molecule to a PDB file
- Parameters:
filename (str) – Path to the PDB file
symmetric (bool) – If True, bonds are written symmetrically - i.e. if atom A is bonded to atom B, then atom B is also bonded to atom A, and both atoms will get an entry in the “CONECT” section. If False, only one of the atoms will get an entry in the “CONECT” section.
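The effect of the symmetric option on the “CONECT” section can be illustrated with a small formatter. This is a sketch only, assuming bonds given as pairs of atom serial numbers and the standard fixed-width CONECT layout; `conect_lines` is a hypothetical helper, the PDB limit of four bonded atoms per record is not handled, and the exact output of the library may differ.

```python
def conect_lines(bonds, symmetric=True):
    """Build CONECT records from (serial, serial) pairs."""
    partners = {}
    for a, b in bonds:
        partners.setdefault(a, []).append(b)
        if symmetric:
            # Also list the bond from the point of view of the second atom
            partners.setdefault(b, []).append(a)
    return [
        "CONECT" + f"{serial:5d}" + "".join(f"{p:5d}" for p in bonded)
        for serial, bonded in sorted(partners.items())
    ]


# Illustrative bonds: atom 1 is bonded to atoms 2 and 3
symmetric_records = conect_lines([(1, 2), (1, 3)], symmetric=True)
one_sided_records = conect_lines([(1, 2), (1, 3)], symmetric=False)
```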
- to_pybel()[source]#
Convert the molecule to a Pybel molecule
- Returns:
The Pybel molecule
- Return type:
pybel.Molecule
- to_rdkit()[source]#
Convert the molecule to an RDKit molecule
- Returns:
The RDKit molecule
- Return type:
rdkit.Chem.rdchem.Mol
- unlock_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#
Unlock a bond between two atoms
- Parameters:
atom1 – The first atom of the bond to unlock, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atom.
atom2 – The second atom of the bond to unlock, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atom.
- update_atom_graph()[source]#
Generate a new, up-to-date AtomGraph after manual changes have been made to the Molecule’s underlying biopython structure.
- vet(clash_range: tuple = (0.6, 1.7), angle_range: tuple = (90, 180)) bool[source]#
Vet the structural integrity of a molecule. This will return True if there are no clashes and all angles of adjacent atom triplets are within a tolerable range, False otherwise.
- Parameters:
clash_range (tuple, optional) – The minimal and maximal allowed distances for two bonded atoms (in Angstrom). The minimal distance is also used for non-bonded atoms.
angle_range (tuple, optional) – The minimal and maximal allowed angles between three adjacent bonded atoms (in degrees).
- Returns:
True if the structure is alright, False otherwise.
- Return type:
bool
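The kind of check that vet describes can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: vet_sketch and angle_deg are hypothetical names, coordinates are passed as a dict of tuples, and bonds as pairs of atom names.

```python
import math
from itertools import combinations


def angle_deg(a, b, c):
    """Angle at point b (in degrees) spanned by the points a-b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def vet_sketch(coords, bonds, clash_range=(0.6, 1.7), angle_range=(90, 180)):
    """coords: {name: (x, y, z)}; bonds: list of (name, name) pairs."""
    min_dist, max_dist = clash_range
    bonded = {frozenset(b) for b in bonds}
    for a, b in combinations(coords, 2):
        d = math.dist(coords[a], coords[b])
        if d < min_dist:
            return False  # clash, whether the atoms are bonded or not
        if frozenset((a, b)) in bonded and d > max_dist:
            return False  # bonded atoms are too far apart
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    for center, nbrs in neighbors.items():
        for a, c in combinations(nbrs, 2):
            angle = angle_deg(coords[a], coords[center], coords[c])
            if not angle_range[0] <= angle <= angle_range[1]:
                return False  # angle of a bonded triplet is out of range
    return True


bonds = [("O", "H1"), ("O", "H2")]
water = {"O": (0.0, 0.0, 0.0), "H1": (0.96, 0.0, 0.0), "H2": (-0.24, 0.93, 0.0)}
bent = {"O": (0.0, 0.0, 0.0), "H1": (0.96, 0.0, 0.0), "H2": (0.68, 0.68, 0.0)}
clash = {"O": (0.0, 0.0, 0.0), "H1": (0.96, 0.0, 0.0), "H2": (0.3, 0.0, 0.0)}
print(vet_sketch(water, bonds), vet_sketch(bent, bonds), vet_sketch(clash, bonds))
```

The water-like geometry passes, the second geometry fails on its 45 degree H-O-H angle, and the third fails because one O-H distance is below the minimal distance.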
- biobuild.core.entity.should_invert(bond, direct_connecting_atoms)[source]#
Check if a given bond should be inverted during bond direction
- Parameters:
bond (tuple) – A tuple of two atoms that are bonded
direct_connecting_atoms (set) – A set of atoms that directly participate in bonds connecting different residues
- Returns:
Whether the bond should be inverted
- Return type:
bool
biobuild base_classes#
The base_classes are derivatives of the original Biopython classes, but with the change that they use a UUID4 as their identifier (full_id) instead of a hierarchical tuple. This makes each object unique and allows for easy comparison where a == b is akin to a is b. Consequently, the __hash__ method is overridden to use the UUID4 as the hash.
Warning
Each class has its own copy method that returns a deep copy of the object with a new UUID4. So a.copy() == a is False, while a standard deepcopy(a) == a is True since the UUID4 will not have been updated automatically.
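The difference between copy() and deepcopy described in the warning can be reproduced with a small standalone class. Node below is a hypothetical stand-in that only mimics the UUID4-based identity; it is not a biobuild class, and the actual implementation may differ.

```python
import copy
import uuid


class Node:
    """Hypothetical stand-in for an object identified by a UUID4."""

    def __init__(self, name):
        self.name = name
        self._uuid = uuid.uuid4()

    def __eq__(self, other):
        # Equality is decided by the identifier alone, not by the contents.
        return isinstance(other, Node) and self._uuid == other._uuid

    def __hash__(self):
        return hash(self._uuid)

    def copy(self):
        new = copy.deepcopy(self)
        new._uuid = uuid.uuid4()  # the copy gets a fresh identity
        return new


a = Node("CA")
same_id = copy.deepcopy(a)  # keeps the original UUID4
new_id = a.copy()           # receives a new UUID4

print(same_id == a)  # True
print(new_id == a)   # False
```

Because the hash follows the identifier, a set containing a, same_id and new_id holds only two elements in this sketch.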
Converting to and from biopython#
Each biobuild class can be generated from a biopython class using the from_biopython class method, and each biobuild class has a to_biopython method that returns the pure-biopython equivalent. For most purposes, however, the biobuild classes should work fine as drop-in replacements for the original biopython classes.
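import Bio.PDB as bio
from biobuild.core.base_classes import Atom
# Bio.PDB.Atom is a module; the class is Bio.PDB.Atom.Atom, which also requires
# bfactor, occupancy, altloc, fullname and serial_number
bio_atom = bio.Atom.Atom("CA", (0, 0, 0), 0.0, 1.0, " ", "CA", 1, element="C")
atom = Atom.from_biopython(bio_atom)
assert not (atom == bio_atom) # not equal since atom uses a UUID4 as its identifier
assert atom.to_biopython() == bio_atom # True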
The conversion from and to biopython works hierarchically, so if an entire biopython structure is converted to biobuild then all atoms, residues, chains and models will be converted to their biobuild equivalents.
import Bio.PDB as bio
from biobuild.core.base_classes import Structure
bio_structure = bio.PDBParser().get_structure("test", "test.pdb")
structure = Structure.from_biopython(bio_structure)
atoms = list(structure.get_atoms())
bio_atoms = list(bio_structure.get_atoms())
assert len(atoms) == len(bio_atoms) # True
- class biobuild.core.base_classes.Atom(id: str, coord: ndarray, serial_number: int = 1, bfactor: float = 0.0, occupancy: float = 1.0, fullname: str = None, element: str = None, altloc=' ', pqr_charge=None, radius=None)[source]#
Bases:
ID, Atom
An Atom object that inherits from Biopython’s Atom class.
- Parameters:
id (str) – The atom identifier
coord (ndarray) – The atom coordinates
serial_number (int, optional) – The atom serial number. The default is 1.
bfactor (float, optional) – The atom bfactor. The default is 0.0.
occupancy (float, optional) – The atom occupancy. The default is 1.0.
fullname (str, optional) – The atom fullname. The default is None, in which case the id is used again.
element (str, optional) – The atom element. The default is None, in which case it is inferred based on the id.
altloc (str, optional) – The atom altloc. The default is “ “.
pqr_charge (float, optional) – The atom pqr_charge. The default is None.
radius (float, optional) – The atom radius. The default is None.
- altloc#
- anisou_array#
- bfactor#
- coord#
- disordered_flag#
- element#
- classmethod from_biopython(atom) Atom[source]#
Convert a Biopython atom to an Atom object
- Parameters:
atom – The Biopython atom
- Returns:
The Atom object
- Return type:
Atom
- property full_id#
A self-adjusting full_id for a Biopython Atom
- fullname#
- id#
- level#
- mass#
- name#
- occupancy#
- parent#
- pqr_charge#
- radius#
- serial_number#
- sigatm_array#
- siguij_array#
- to_biopython()[source]#
Convert the Atom object to a Biopython atom
- Returns:
The Biopython atom
- Return type:
bio.Atom.Atom
- xtra#
- class biobuild.core.base_classes.Bond(*atoms)[source]#
Bases:
object
A class representing a bond between two atoms.
- atom1#
- atom2#
- compute_length() float[source]#
Compute the bond length.
- Returns:
The bond length.
- Return type:
float
- is_double() bool[source]#
Check if the bond is a double bond.
- Returns:
True if the bond is a double bond, False otherwise.
- Return type:
bool
- is_single() bool[source]#
Check if the bond is a single bond.
- Returns:
True if the bond is a single bond, False otherwise.
- Return type:
bool
- is_triple() bool[source]#
Check if the bond is a triple bond.
- Returns:
True if the bond is a triple bond, False otherwise.
- Return type:
bool
- order#
- class biobuild.core.base_classes.Chain(id)[source]#
Bases:
ID, Chain
A Chain object that inherits from Biopython’s Chain class.
- Parameters:
id (str) – The chain identifier
- child_dict#
- child_list#
- classmethod from_biopython(chain) Chain[source]#
Convert a BioPython Chain object to a Chain object.
- Parameters:
chain (BioPython Chain object) – The chain to convert.
- Returns:
The converted chain.
- Return type:
Chain
- property full_id#
A self-adjusting full_id for a Biopython Chain
- internal_coord#
- level#
- parent#
- to_biopython() Chain[source]#
Convert a Chain object to a pure BioPython Chain object.
- Parameters:
with_children (bool, optional) – Whether to convert the residues of the chain as well. The default is True.
- Returns:
The converted chain.
- Return type:
bio.Chain.Chain
- xtra#
- class biobuild.core.base_classes.Model(id)[source]#
Bases:
Model, ID
A Model object that inherits from Biopython’s Model class.
- Parameters:
id (int or str) – The model identifier
- child_dict#
- child_list#
- classmethod from_biopython(model)[source]#
Convert a BioPython Model object to a Model object.
- Parameters:
model (BioPython Model object) – The model to convert.
- Returns:
The converted model.
- Return type:
Model
- property full_id#
A self-adjusting full_id for a Biopython Model
- level#
- parent#
- property serial_num#
- property serial_number#
- to_biopython()[source]#
Convert a Model object to a pure BioPython Model object.
- Returns:
The converted model.
- Return type:
bio.Model.Model
- xtra#
- class biobuild.core.base_classes.Residue(resname, segid, icode)[source]#
Bases:
ID, Residue
A Residue object that inherits from Biopython’s Residue class.
- Parameters:
resname (str) – The residue name
segid (str) – The residue segid.
icode (int) – The residue icode. This is the residue serial number.
- add(atom)[source]#
Add an Atom object.
Checks for adding duplicate atoms, and raises a PDBConstructionException if so.
- child_dict#
- child_list#
- property coord#
- disordered#
- classmethod from_biopython(residue) Residue[source]#
Convert a BioPython Residue object to a Residue object.
- Parameters:
residue (BioPython Residue object) – The residue to convert.
- Returns:
The converted residue
- Return type:
Residue
- property full_id#
A self-adjusting full_id for a Biopython Residue
- property id#
Return identifier.
- internal_coord#
- level#
- parent#
- resname#
- segid#
- to_biopython() Residue[source]#
Convert a Residue object to a pure BioPython Residue object.
- Returns:
The converted residue.
- Return type:
bio.Residue.Residue
- xtra#
- class biobuild.core.base_classes.Structure(id)[source]#
Bases:
ID, Structure
A Structure object that inherits from Biopython’s Structure class.
- Parameters:
id (str) – The structure identifier
- child_dict#
- child_list#
- classmethod from_biopython(structure: Structure) Structure[source]#
Convert a BioPython Structure object to a Structure object.
- Parameters:
structure (BioPython Structure object) – The structure to convert.
- Returns:
The converted structure.
- Return type:
Structure
- property full_id#
- level#
- parent#
- to_biopython() Structure[source]#
Convert a Structure object to a pure BioPython Structure object.
- Returns:
The converted structure.
- Return type:
bio.Structure.Structure
- xtra#