buildamol base#

The lowest level of buildamol base classes.

The base class for classes storing and manipulating molecular structures. This houses most of the essential functionality of the library for most users. The Molecule class adds additional features on top.

class buildamol.core.entity.BaseEntity(structure, model: int = 0)[source]#

Bases: object

The Base class for all classes that store and handle molecular structures. This class is not meant to be used directly but serves as the base for the Molecule class.

Parameters:
  • structure (Structure or Bio.PDB.Structure) – A BuildAMol or Biopython structure

  • model (int) – The index of the model to use (default: 0)

add_atoms(*atoms: Atom, residue=None, _copy: bool = False)[source]#

Add atoms to the structure. This will automatically adjust the atom’s serial number to fit into the structure.

Parameters:
  • atoms (base_classes.Atom) – The atoms to add

  • residue (int or str) – The residue to which the atoms should be added, this may be either the seqid or the residue name, if None the atoms are added to the last residue. Note, that if multiple identically named residues are present, the first one is chosen, so using the seqid is a safer option!

  • _copy (bool) – If True, the atoms are copied and then added to the structure. This will leave the original atoms (and their parent structures) untouched.

add_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom, order: int = 1)[source]#

Add a bond between two atoms

Parameters:
  • atom1 – The first atom to bond, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atom.

  • atom2 – The second atom to bond, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atom.

  • order (int) – The order of the bond, i.e. 1 for single, 2 for double, 3 for triple, etc.

add_bonds(*bonds)[source]#

Add multiple bonds at once.

Parameters:

bonds – The bonds to add, each bond is a tuple of two atoms. Each atom may be specified directly (BuildAMol object) or by providing the serial number, the full_id or the id of the atoms.

add_chains(*chains: Chain, adjust_seqid: bool = True, _copy: bool = False)[source]#

Add chains to the structure

Parameters:
  • chains (base_classes.Chain) – The chains to add

  • adjust_seqid (bool) – If True, the seqid of the chains is adjusted to match the current number of chains in the structure (i.e. a new chain can be given seqid A, and it will be adjusted to the correct value of C if there are already two other chains in the molecule).

  • _copy (bool) – If True, the chains are copied before adding them to the molecule. This is useful if you want to add the same chain to multiple molecules, while leaving them and their original parent structures intact.

add_hydrogens(*atoms: int | str | Atom)[source]#

Infer missing hydrogens in the structure.

Parameters:

atoms – The atoms to infer hydrogens for. If no atoms are given, all atoms are considered.

add_model(model: int | Model = None)[source]#

Add a new model to the molecule’s structure

Parameters:

model (int or Model) – If not given, a new completely blank model is created. If an integer is given, an existing model is copied and added to the molecule. If a Model object is given, it is added to the molecule.

add_residues(*residues: Residue, adjust_seqid: bool = True, _copy: bool = False)[source]#

Add residues to the structure

Parameters:
  • residues (base_classes.Residue) – The residues to add

  • adjust_seqid (bool) – If True, the seqid of the residues is adjusted to match the current number of residues in the structure (i.e. a new residue can be given seqid 1, and it will be adjusted to the correct value of 3 if there are already two other residues in the molecule).

  • _copy (bool) – If True, the residues are copied before adding them to the molecule. This is useful if you want to add the same residue to multiple molecules, while leaving them and their original parent structures intact.

adjust_bond_length(atom1, atom2, length: float, move_descendants: bool = False)[source]#

Adjust the bond length between two atoms

Parameters:
  • atom1 – The first atom of the bond, which can either be directly provided (Atom object) or by providing the serial number, the full_id or the id of the atom.

  • atom2 – The second atom of the bond, which can either be directly provided (Atom object) or by providing the serial number, the full_id or the id of the atom.

  • length (float) – The new bond length

  • move_descendants (bool) – If True, this method will infer all descendant atoms and move them accordingly to preserve the overall geometry of the molecule. It will make things slower, however!
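
The geometric idea behind changing a bond length can be sketched without the library. The sketch below assumes that atom1 stays fixed and atom2 is moved along the bond direction, which the description above does not state. The helper `set_bond_length` and the coordinates are hypothetical and not part of BuildAMol.

```python
import math


def set_bond_length(p1, p2, length):
    """Return a new position for p2 so that the distance p1-p2 equals `length`.

    Assumption: p1 stays fixed and p2 slides along the existing bond direction.
    """
    direction = [b - a for a, b in zip(p1, p2)]
    current = math.sqrt(sum(d * d for d in direction))
    if current == 0:
        raise ValueError("atoms coincide; the bond direction is undefined")
    scale = length / current
    return tuple(a + d * scale for a, d in zip(p1, direction))


# Two hypothetical atoms 1.2 Angstrom apart along the x-axis
c1 = (0.0, 0.0, 0.0)
c2 = (1.2, 0.0, 0.0)
new_c2 = set_bond_length(c1, c2, 1.5)

# The displacement of atom2; moving all descendants of atom2 by this same
# vector is one way to preserve the geometry of the rest of the molecule.
shift = tuple(n - o for n, o in zip(new_c2, c2))
```

Applying `shift` to every atom downstream of atom2 is the extra work that makes `move_descendants=True` slower.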

adjust_indexing(mol)[source]#

Adjust the indexing of a molecule to match the scaffold index

Parameters:

mol (Molecule) – The molecule to adjust the indexing of

adjust_to_ph(ph: float | int | tuple, inplace: bool = True, **kwargs)[source]#

Adjust the protonation state and charges to match a certain pH

Note

This requires rdkit and molscrub packages to be installed!

Parameters:
  • ph (float or tuple) – The pH value to adjust the structure to. If a tuple is given, a pH range can be specified as (low, high).

  • inplace (bool) – If True, the structure is modified in place, otherwise a new structure is returned.

  • **kwargs – Additional keyword arguments to pass to the scrub class of the molscrub package.

align_to(axis: str | ndarray)[source]#

Align the structure (via its primary axis, i.e. the axis perpendicular to the main plane) to some other axis. This will rotate the molecule so that the primary axis is aligned with the given axis. This only works for (more or less) planar molecules.

Parameters:

axis (str or np.ndarray) – The axis to align to. This can be either a unit vector or one of the strings “x”, “y”, or “z” to align to the respective axes.

apply_standard_bonds(_compounds=None) list[source]#

Use reference compounds to infer bonds in the structure. This will be exclusively based on the residue and atom ids and not on the actual distances between atoms.

Parameters:

_compounds – The compounds to use for the standard bonds. If None, the default compounds are used.

Returns:

A list of tuples of atom pairs that are bonded

Return type:

list

apply_standard_bonds_for(*residues, _compounds=None) list[source]#

Use reference compounds to infer bonds in the structure for specific residues. This will be exclusively based on the residue and atom ids and not on the actual distances between atoms.

Parameters:
  • residues – The residues to consider

  • _compounds – The compounds to use for the standard bonds. If None, the default compounds are used.

Returns:

A list of tuples of atom pairs that are bonded

Return type:

list

property atoms#

A sorted list of all atoms in the structure

property attach_residue#

The residue at which to attach other molecules to this one.

autolabel(atoms: list = None)[source]#

Automatically label atoms in the structure to match the CHARMM force field atom nomenclature. This is useful if you want to use some pre-generated PDB file that may have used a different labelling scheme for atoms.

Parameters:

atoms (list) – Optionally restrict the autolabelling to a specific set of atoms. If None, all atoms are considered.

Returns:

The molecule with the autolabelled atoms (in-place modification).

Return type:

Molecule

Note

The labels are inferred and therefore may occasionally not be “correct”. It is advisable to check the labels after using this method.

bend_at_bond(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, neighbor: str | int | Atom = None, angle_is_degrees: bool = True)[source]#

Bend the molecule at a specific bond. This will rotate the atoms downstream of the bond in direction atom1->atom2 by the given angle. The axis of rotation will be the plane vector specified by the two atoms and one neighboring atom. A specific neighbor can be provided to ensure a specific plane is used (recommended), otherwise a random neighbor of atom1 will be used (preference is given to non-Hydrogens but a Hydrogen will be used if no other neighbor is found).

Parameters:
  • atom1 (Union[str, int, base_classes.Atom]) – The first atom of the bond

  • atom2 (Union[str, int, base_classes.Atom]) – The second atom of the bond

  • angle (float) – The angle to bend by

  • neighbor (Union[str, int, base_classes.Atom], optional) – The atom to use as a neighbor for the plane vector, by default None, in which case a random neighbor of atom1 will be used. It is recommended to specify this to ensure a specific plane is used.

  • angle_is_degrees (bool, optional) – Whether the angle is given in degrees (default) or radians

property bonds#

All bonds in the molecule

property center_of_geometry#

The center of geometry of the molecule

property center_of_mass#

The center of mass of the molecule
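
The difference between the two centers can be illustrated with plain Python. This is a sketch of the standard definitions (unweighted mean versus mass-weighted mean of the coordinates), not the library's code. The helper names and the carbon monoxide coordinates are illustrative assumptions.

```python
def center_of_geometry(coords):
    """Unweighted mean of all atom coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def center_of_mass(coords, masses):
    """Mean of all atom coordinates, weighted by atomic mass."""
    total = sum(masses)
    return tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)
    )


# Carbon monoxide placed along the x-axis (C at the origin, O at 1.128 Angstrom)
coords = [(0.0, 0.0, 0.0), (1.128, 0.0, 0.0)]
masses = [12.011, 15.999]  # standard atomic weights of C and O

cog = center_of_geometry(coords)
com = center_of_mass(coords, masses)  # shifted toward the heavier oxygen
```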

property chains#

A sorted list of all chains in the molecule

change_element(atom: int | Atom, element: str, adjust_bond_length: bool = True)[source]#

Change the element of an atom. This will automatically add or remove hydrogens if the new element has a different valency.

Parameters:
  • atom (int or base_classes.Atom) – The atom to change, either the object itself or its serial number

  • element (str) – The new element

  • adjust_bond_length (bool) – If True, adjust the bond length to match the new element. This may slow down the process if the atom is central in a very large molecule.

property charge#

The total charge of the molecule

chem2dview(linewidth: float = None, atoms: str = None, highlight_color: str = None, **kwargs)[source]#

View the molecule in 2D through RDKit

Parameters:
  • linewidth (float) – The linewidth of the bonds.

  • atoms (str) – The label to use for the atoms. This can be any of the following:
      - None (default, element symbols, except for carbon)
      - “element” (force element symbols, even for carbon)
      - “serial” (the atom serial number)
      - “id” (the atom id / name)
      - “resid” (the residue serial number + atom id)
      - “off” (no labels)
      - any function that takes an (rdkit) atom and returns a string

  • highlight_color (str) – The color to use for highlighting atoms

cis(*bond: Atom | tuple | Bond)[source]#

Rotate the molecule such that the atoms in the bond are in a cis configuration.

Parameters:

*bond (Atom or tuple or Bond) – The bond to rotate

cleanup(remove_empty_models: bool = True, remove_empty_chains: bool = True, remove_empty_residues: bool = True, reindex: bool = True, remove_hydrogens: bool = False, add_hydrogens: bool = False, apply_standard_bonds: bool = False, infer_bonds: bool = False)[source]#

Clean up the molecule by removing empty models, chains, and residues. This can optionally also reindex the atoms and residues, remove or add hydrogen atoms, and apply standard bonds or infer bonds.

Parameters:
  • remove_empty_models (bool) – Whether to remove empty models

  • remove_empty_chains (bool) – Whether to remove empty chains

  • remove_empty_residues (bool) – Whether to remove empty residues

  • reindex (bool) – Whether to reindex the atoms and residues after cleaning up

  • remove_hydrogens (bool) – Whether to remove all hydrogen atoms

  • add_hydrogens (bool) – Whether to add all hydrogen atoms

  • apply_standard_bonds (bool) – Whether to apply standard connectivity based on loaded compounds (see load_compounds)

  • infer_bonds (bool) – Whether to infer bonds from the atom positions and element types

clear()[source]#

Clear the molecule of all models, chains, residues, and atoms.

collapse_chains(resnames: list = None)[source]#

Turn each chain of the molecule into a single residue but preserve the chains.

Parameters:

resnames (list, optional) – A list of residue names to use for the residues. If None, the residue names are taken from the first residue in each chain. A string can also be given to use the same name for all residues.

compute_angle(atom1: str | int | Atom, atom2: str | int | Atom, atom3: str | int | Atom)[source]#

Compute the angle between three atoms where atom2 is the middle atom.

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • atom3 – The third atom

Returns:

The angle in degrees

Return type:

float
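
As a reference for what this angle means, the following sketch computes the angle at the middle point from raw coordinates. It uses the usual dot-product formula. The function name `angle_degrees` is hypothetical, and the library's own implementation may differ.

```python
import math


def angle_degrees(p1, p2, p3):
    """Angle (in degrees) at p2 between the vectors p2->p1 and p2->p3."""
    v1 = [a - b for a, b in zip(p1, p2)]
    v2 = [a - b for a, b in zip(p3, p2)]
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))


right = angle_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
straight = angle_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (-1.0, 0.0, 0.0))
```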

compute_angles()[source]#

Compute all angles of consecutively bonded atom triplets within the molecule.

Returns:

angles – A dictionary of the form {atom_triplet: angle}

Return type:

dict

compute_dihedral(atom1: str | int | Atom, atom2: str | int | Atom, atom3: str | int | Atom, atom4: str | int | Atom)[source]#

Compute the dihedral angle between four atoms

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • atom3 – The third atom

  • atom4 – The fourth atom

Returns:

The dihedral angle in degrees

Return type:

float
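
A dihedral is the angle between the plane through atoms 1, 2, 3 and the plane through atoms 2, 3, 4. The sketch below uses the common atan2 formulation on raw coordinates. The function names are hypothetical, and the sign convention is an assumption that may differ from the one BuildAMol uses.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p1, p2, p3, p4):
    """Signed dihedral (degrees) around the p2-p3 axis."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane (p1, p2, p3)
    n2 = _cross(b2, b3)  # normal of the plane (p2, p3, p4)
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


cis = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
perpendicular = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
```

Here `cis` is 0 degrees and `trans` has a magnitude of 180 degrees, which matches the configurations that the cis method (and its trans counterpart, if available) would aim for.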

compute_dihedrals()[source]#

Compute all dihedrals of consecutively bonded atom quartets within the molecule.

Returns:

dihedrals – A dictionary of the form {atom_quartet: dihedral}

Return type:

dict

compute_length_along_axis(axis: str | ndarray) float[source]#

Compute the length of the molecule along a specific axis. This can be computed on any molecule but may not be meaningful in all cases (e.g. circular or branched molecules).

Parameters:

axis (str or np.ndarray) – The axis to compute the length along. This can be either a unit vector or one of the strings “x”, “y”, or “z” to align to the respective axes.
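
One plausible reading of "length along an axis" is the extent of the atom coordinates after projecting them onto that axis. The sketch below implements this reading. It is an assumption about the meaning, not a copy of the library's computation, and the names `AXES` and `length_along_axis` are hypothetical.

```python
import math

# Unit vectors for the string shortcuts "x", "y" and "z"
AXES = {"x": (1.0, 0.0, 0.0), "y": (0.0, 1.0, 0.0), "z": (0.0, 0.0, 1.0)}


def length_along_axis(coords, axis):
    """Extent (max minus min) of the coordinates projected onto `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]  # normalise in case no unit vector was given
    projections = [sum(c * u for c, u in zip(p, unit)) for p in coords]
    return max(projections) - min(projections)


# A short, roughly linear set of hypothetical atom positions
coords = [(0.0, 0.0, 0.0), (1.5, 0.2, 0.0), (3.0, -0.1, 0.4)]
length_x = length_along_axis(coords, AXES["x"])
length_y = length_along_axis(coords, AXES["y"])
```

For a circular or heavily branched molecule the same number only describes the bounding extent, which is why the result may not be meaningful there.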

compute_normal_axis() ndarray[source]#

Compute the normal axis of the molecule. This is the axis that is perpendicular to the main plane of the molecule. This can be computed on any molecule but will only be meaningful for (more or less) planar molecules.

compute_principal_axis() ndarray[source]#

Compute the principal axis of the molecule. This is the axis that shows the most variance in the coordinates. This can be computed on any molecule but will only be meaningful for (more or less) linear molecules.
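
The "axis that shows the most variance" is the dominant eigenvector of the coordinate covariance matrix. The sketch below finds it with a simple power iteration in plain Python. The library very likely uses a different numerical routine, so this only illustrates the concept. The function name and test coordinates are hypothetical.

```python
import math


def principal_axis(coords, iterations=100):
    """Approximate the direction of largest variance by power iteration."""
    n = len(coords)
    mean = [sum(p[i] for p in coords) / n for i in range(3)]
    centered = [[p[i] - mean[i] for i in range(3)] for p in coords]
    # 3x3 covariance matrix of the centered coordinates
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(3)]
        for i in range(3)
    ]
    # Start vector; it must not be orthogonal to the dominant eigenvector
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("no variance along the current direction")
        v = [x / norm for x in w]
    return tuple(v)


# A zig-zag chain that extends mainly along the x-axis
chain = [(float(i), 0.1 * (i % 2), 0.0) for i in range(6)]
axis = principal_axis(chain)
```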

copy(n: int = 1) list[source]#

Create one or more deep copies of the molecule

Parameters:

n (int, optional) – The number of copies to make, by default 1

Returns:

The copied molecule(s)

Return type:

Molecule or list

count_atoms() int[source]#

Count the number of atoms in the structure

Returns:

The number of atoms

Return type:

int

count_bonds() int[source]#

Count the number of bonds in the structure

Returns:

The number of bonds

Return type:

int

count_chains() int[source]#

Count the number of chains in the structure

Returns:

The number of chains

Return type:

int

count_clashes(clash_threshold: float = 1.0, ignore_hydrogens: bool = True, coarse_precheck: bool = True) int[source]#

Count all clashes in the molecule.

Parameters:
  • clash_threshold (float, optional) – The minimal allowed distance between two atoms (in Angstrom).

  • ignore_hydrogens (bool, optional) – Whether to ignore clashes with hydrogen atoms (default: True)

  • coarse_precheck (bool, optional) – If set to True a coarse-grained pre-screening on residue-level is done to speed up the computation. This may cause the system to overlook clashes if individual residues are particularly large, however (e.g. lipids with long carbon chains).

Returns:

The number of clashes.

Return type:

int

count_models() int[source]#

Count the number of models in the structure

Returns:

The number of models

Return type:

int

count_residues() int[source]#

Count the number of residues in the structure

Returns:

The number of residues

Return type:

int

double(atom1, atom2, adjust_hydrogens: bool = False)[source]#

Set a double bond between two atoms

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • adjust_hydrogens (bool) – Whether to adjust the number of hydrogens on the atoms based on the bond order

draw(*args, **kwargs)[source]#

draw2d(linewidth: float = None, atoms: str = None, highlight_color: str = None, **kwargs)#

View the molecule in 2D through RDKit

Parameters:
  • linewidth (float) – The linewidth of the bonds.

  • atoms (str) – The label to use for the atoms. This can be any of the following:
      - None (default, element symbols, except for carbon)
      - “element” (force element symbols, even for carbon)
      - “serial” (the atom serial number)
      - “id” (the atom id / name)
      - “resid” (the residue serial number + atom id)
      - “off” (no labels)
      - any function that takes an (rdkit) atom and returns a string

  • highlight_color (str) – The color to use for highlighting atoms

draw3d(*args, **kwargs)#

drop_atom_names()[source]#

Turn all atom ids (e.g. “CA”) into element symbols (e.g. “C”)

drop_atoms(*atoms: int | str | tuple | Atom)[source]#

Remove one or more atoms from the structure. This method returns the Molecule object itself rather than the removed atoms. Use remove_atoms if you need the removed atoms.

Parameters:

atoms – The atoms to remove, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

drop_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#

Remove a bond between two atoms

Parameters:
  • atom1 – The first atom of the bond to remove, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atom.

  • atom2 – The second atom of the bond to remove, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atom.

drop_chains(*chains: int | Chain)[source]#

Remove chains from the structure. This method returns the structure itself rather than the removed chains. If you want to get the removed chains, use remove_chains.

Parameters:

chains (int or Chain) – The chains to remove, either the object itself or its id

drop_empty_chains()[source]#

Remove all empty chains from the molecule

drop_empty_models()[source]#

Remove all empty models from the molecule

drop_empty_residues()[source]#

Remove all empty residues from the molecule

drop_hydrogens(*atoms: int | str | Atom)[source]#

Remove all hydrogens in the molecule.

Parameters:

atoms – The atoms to remove hydrogens from. If no atoms are given, all atoms are considered.

drop_model(model: int | Model)[source]#

Drop a model from the molecule without removing its chains from the molecule. The chains of the dropped model will be removed from the model but remain in the molecule.

Parameters:

model (int or Model) – The model to drop

drop_residues(*residues: int | Residue)[source]#

Remove residues from the molecule. This method returns the molecule itself rather than the removed residues. If you want to get the removed residues, use remove_residues.

Parameters:

residues (int or base_classes.Residue) – The residues to remove, either the object itself or its seqid

find_clashes(clash_threshold: float = 1.0, ignore_hydrogens: bool = True, coarse_precheck: bool = True) list[source]#

Find all clashes in the molecule.

Parameters:
  • clash_threshold (float, optional) – The minimal allowed distance between two atoms (in Angstrom).

  • ignore_hydrogens (bool, optional) – Whether to ignore clashes with hydrogen atoms (default: True)

  • coarse_precheck (bool, optional) – If set to True a coarse-grained pre-screening on residue-level is done to speed up the computation. This may cause the system to overlook clashes if individual residues are particularly large, however (e.g. lipids with long carbon chains).

Returns:

A list of tuples of atoms that clash.

Return type:

list
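
The following is a naive sketch of distance-based clash detection, to show what the threshold and the hydrogen option mean. It compares every atom pair and does not implement the residue-level precheck. It also makes two assumptions the description leaves open: ignoring hydrogens is done by dropping them before the comparison, and bonded pairs get no special treatment. The function `naive_find_clashes` and the atom tuples are hypothetical.

```python
import math
from itertools import combinations


def naive_find_clashes(atoms, clash_threshold=1.0, ignore_hydrogens=True):
    """Return pairs of atom names that are closer than `clash_threshold`.

    `atoms` is a list of (name, element, (x, y, z)) tuples.
    """
    if ignore_hydrogens:
        atoms = [a for a in atoms if a[1] != "H"]
    clashes = []
    for a, b in combinations(atoms, 2):
        if math.dist(a[2], b[2]) < clash_threshold:
            clashes.append((a[0], b[0]))
    return clashes


atoms = [
    ("C1", "C", (0.0, 0.0, 0.0)),
    ("C2", "C", (0.8, 0.0, 0.0)),  # too close to C1
    ("O1", "O", (3.0, 0.0, 0.0)),
    ("H1", "H", (0.5, 0.0, 0.0)),  # close to both carbons
]
heavy_clashes = naive_find_clashes(atoms)
all_clashes = naive_find_clashes(atoms, ignore_hydrogens=False)
```

The all-pairs loop scales quadratically with the number of atoms, which is the cost a coarse residue-level prescreen is meant to reduce.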

find_clashes_with(other, clash_threshold: float = 1.0, ignore_hydrogens: bool = True, coarse_precheck: bool = True) list[source]#

Find all clashes between this molecule and another one.

Parameters:
  • other (Molecule) – The other molecule to compare with

  • clash_threshold (float, optional) – The minimal allowed distance between two atoms (in Angstrom).

  • ignore_hydrogens (bool, optional) – Whether to ignore clashes with hydrogen atoms (default: True)

  • coarse_precheck (bool, optional) – If set to True a coarse-grained pre-screening on residue-level is done to speed up the computation. This may cause the system to overlook clashes if individual residues are particularly large, however (e.g. lipids with long carbon chains).

Returns:

A list of tuples of atoms that clash.

Return type:

list

flip(plane_vector: ndarray, center: ndarray = None)[source]#

Flip the molecule around an axis

Parameters:
  • plane_vector (np.ndarray or str) – The vector defining the plane to flip around. This must be a unit vector. It may also be one of the strings “xy”, “xz”, or “yz” to flip around the respective planes.

  • center (np.ndarray) – The center of the flip

classmethod from_cif(filename: str, id: str = None)[source]#

Load a Molecule from a CIF file

Parameters:
  • filename (str) – Path to the CIF file

  • id (str) – The id of the Molecule. By default an id is inferred from the filename.

classmethod from_json(filename: str)[source]#

Make a Molecule from a JSON file

Parameters:

filename (str) – Path to the JSON file

classmethod from_molfile(filename: str)[source]#

Make a Molecule from a molfile

Parameters:

filename (str) – Path to the molfile

classmethod from_openmm(topology, positions)[source]#

Load a Molecule from an OpenMM topology and positions

Parameters:
  • topology (simtk.openmm.app.Topology) – The OpenMM topology

  • positions (simtk.unit.Quantity) – The OpenMM positions

classmethod from_pdb(filename: str, id: str = None, model: int = 0, has_atom_ids: bool = True)[source]#

Read a Molecule from a PDB file

Parameters:
  • filename (str) – Path to the PDB file

  • root_atom (str or int) – The id or the serial number of the root atom (optional)

  • id (str) – The id of the Molecule. By default an id is inferred from the filename.

  • model (int) – The index of the model to use (default: 0)

  • has_atom_ids (bool) – If the PDB file provides no atom ids, set this to False in order to autolabel the atoms.

classmethod from_pdbqt(filename: str)[source]#

Make a Molecule from a PDBQT file

Parameters:

filename (str) – Path to the PDBQT file

classmethod from_pybel(mol)[source]#

Load a Molecule from a Pybel molecule

Parameters:

mol (pybel.Molecule) – The Pybel molecule

classmethod from_rdkit(mol, id: str = None)[source]#

Load a Molecule from an RDKit molecule

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule

  • id (str) – The id of the Molecule. By default an id is inferred from the “_Name” property of the mol object (if present).

classmethod from_stk(obj)[source]#

Load a Molecule from an stk ConstructedMolecule

Parameters:

obj (stk.ConstructedMolecule) – The stk ConstructedMolecule

classmethod from_xml(filename: str)[source]#

Make a Molecule from an XML file

Parameters:

filename (str) – Path to the XML file

classmethod from_xyz(filename: str)[source]#

Make a Molecule from an XYZ file

Parameters:

filename (str) – Path to the XYZ file

get_ancestors(atom1: str | int | Atom, atom2: str | int | Atom) set[source]#

Get the atoms upstream of a bond. This will return the set of all atoms that are connected before the bond atom1-atom2 in the direction of atom1, the selection can be reversed by reversing the order of atoms (atom2-atom1).

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

Returns:

A set of atoms

Return type:

set

Examples

For a molecule

```
         OH
        /
(1)CH3 -- CH
        \
         CH2 -- (2)CH3
```

>>> mol.get_ancestors("(1)CH3", "CH")
set()
>>> mol.get_ancestors("CH", "CH2")
{"(1)CH3", "OH"}
>>> mol.get_ancestors("CH2", "CH")
{"(2)CH3"}

get_atom(atom: int | str | tuple, by: str = None, residue: int | Residue = None)[source]#

Get an atom from the structure either based on its id, serial number or full_id. Note, if multiple atoms match the requested criteria, for instance there are multiple ‘C1’ from different residues, only the first one is returned. To get all atoms matching the criteria, use the get_atoms method.

Parameters:
  • atom – The atom id, serial number, full_id tuple, or element symbol.

  • by (str) – The type of parameter to search for. Can be either ‘id’, ‘serial’, ‘full_id’, or ‘element’. Because this looks for one specific atom, this parameter can be inferred from the datatype of the atom parameter. If it is an integer, it is assumed to be the serial number, if it is a string, it is assumed to be the atom id and if it is a tuple, it is assumed to be the full_id.

  • residue (int or Residue) – A specific residue to search in. If None, the entire structure is searched.

Returns:

atom – The atom

Return type:

base_classes.Atom

get_atom_graph(_copy: bool = True) AtomGraph[source]#

Get an AtomGraph for the Molecule

Parameters:

_copy (bool) – If True, a new AtomGraph is returned rather than the “original” AtomGraph object that the Molecule relies on. However, the molecule will still be linked to the new graph. This is useful if you want to make changes to the graph itself (not including changes to the graph nodes, i.e. the atoms themselves, such as rotations).

Returns:

The generated graph

Return type:

AtomGraph

get_atom_quartets() list[source]#

Compute quartets of four consecutively bonded atoms

Returns:

atom_quartets – A list of atom quartets

Return type:

list

get_atom_triplets()[source]#

Compute triplets of three consecutively bonded atoms

get_atoms(*atoms: int | str | tuple, by: str = None, keep_order: bool = False, residue: int | Residue = None, filter: callable = None) list[source]#

Get one or more atoms from the structure either based on their id, serial number or full_id. Note, if multiple atoms match the requested criteria, for instance there are multiple ‘C1’ from different residues, all of them are returned in a list. It is a safer option to use the full_id or serial number to retrieve a specific atom. If no search parameters are provided, the underlying atom-generator of the structure is returned.

Note

This does not support mixed queries. I.e. you cannot query for an atom with id ‘C1’ and serial number 1 at the same time. Each call can only query for one type of parameter.

Parameters:
  • atoms – The atom id, serial number, full_id tuple, or element string symbol. This supports multiple atoms to search for. However, only one type of parameter is supported per call. If left empty, the underlying generator is returned.

  • by (str) – The type of parameter to search for. Can be either ‘id’, ‘serial’, ‘full_id’, or ‘element’. If None is given, the parameter is inferred from the datatype of the atoms argument: ‘serial’ in case of an int, ‘id’ in case of a str, and ‘full_id’ in case of a tuple.

  • keep_order (bool) – Whether to return the atoms in the order they were queried. If False, the atoms are returned in the order they appear in the structure.

  • residue (int or Residue) – A specific residue to search in. If None, the entire structure is searched.

  • filter (callable) – A filter function that is applied to the atoms. If the filter returns True, the atom is included in the result. The filter function must take an atom as its only argument and return a boolean.

Returns:

atom – The atom(s)

Return type:

list or generator
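
The type-based inference of the search key described above, together with the restriction on mixed queries, can be sketched as follows. The helpers `infer_by` and `infer_by_for_all` are hypothetical and only mirror the documented rules. They are not functions of the library.

```python
def infer_by(query):
    """Guess the search key from the Python type of a single query value."""
    if isinstance(query, int) and not isinstance(query, bool):
        return "serial"
    if isinstance(query, str):
        return "id"
    if isinstance(query, tuple):
        return "full_id"
    raise TypeError(f"cannot infer a search key from {type(query).__name__}")


def infer_by_for_all(*queries):
    """All queries of one call must resolve to the same key (no mixed queries).

    This sketch expects at least one query; the documented method instead
    returns the atom generator when called without arguments.
    """
    keys = {infer_by(q) for q in queries}
    if len(keys) != 1:
        raise ValueError(f"mixed query types are not supported: {sorted(keys)}")
    return keys.pop()


by_for_names = infer_by_for_all("C1", "O4")
by_for_serials = infer_by_for_all(1, 2, 3)
```

Because a string is always read as an atom id under these rules, a search by element symbol presumably needs by='element' to be passed explicitly.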

get_atoms_within(anchor: Atom | ndarray, distance: float) set[source]#

Get all atoms within a certain distance from an anchor point.

Parameters:
  • anchor (Atom or np.ndarray) – The anchor point. This can be either an Atom object or a 3D coordinate as a numpy array.

  • distance (float) – The distance threshold.

Returns:

A set of atoms within the specified distance from the anchor point.

Return type:

set

get_attach_residue()[source]#

Get the residue that is used for attaching other molecules to this one.

get_axial_hydrogen(atom: int | str | tuple | Atom) Atom[source]#

Get the axial hydrogen neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The axial hydrogen, if it exists, None otherwise

Return type:

Atom

get_axial_neighbor(atom: int | str | tuple | Atom) Atom[source]#

Get the axial neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The axial neighbor, if it exists, None otherwise

Return type:

Atom

get_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom, add_if_not_present: bool = True) Bond[source]#

Get/make a bond between two atoms.

Parameters:
  • atom1 (str or int or tuple or Atom) – The first atom

  • atom2 (str or int or tuple or Atom) – The second atom

  • add_if_not_present (bool) – Whether to add the bond if it is not present

Returns:

bond – The bond object. If the bond is not present and add_if_not_present is False, None is returned.

Return type:

Bond

get_bond_array() ndarray[source]#

Get the bonds of the atoms in the molecule as an array of atom1, atom2, bond_order

Returns:

The bonds

Return type:

np.ndarray

get_bond_mask() ndarray[source]#

Get the bonds of the atoms in the molecule as a 2D mask where fields with 1 indicate a bond between the atoms of row and column.

Returns:

The bond mask

Return type:

np.ndarray
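
To illustrate the relation between a bond list and a bond mask, the sketch below builds a square 0/1 matrix from (atom, atom, order) rows using nested lists. It assumes zero-based atom indices and a symmetric mask. Whether the library's array uses serial numbers or a symmetric layout is not stated above. The function name `bond_mask` is hypothetical.

```python
def bond_mask(n_atoms, bonds):
    """Build an n_atoms x n_atoms matrix with 1 where two atoms are bonded."""
    mask = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j, _order in bonds:
        mask[i][j] = 1
        mask[j][i] = 1  # assumption: the mask is symmetric
    return mask


# Rows of (atom index, atom index, bond order), as in a bond array
bonds = [(0, 1, 1), (1, 2, 2)]
mask = bond_mask(3, bonds)
```

Note that the mask drops the bond order, while the bond array keeps it in the third column.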

get_bonds(atom1: int | str | tuple | Atom | Residue = None, atom2: int | str | tuple | Atom = None, residue_internal: bool = True, either_way: bool = True)[source]#

Get one or multiple bonds from the molecule. If only one atom is provided, all bonds that are connected to that atom are returned.

Parameters:
  • atom1 – The atom id, serial number or full_id tuple of the first atom. This may also be a residue, in which case all bonds between atoms in that residue are returned.

  • atom2 – The atom id, serial number or full_id tuple of the second atom

  • residue_internal (bool) – If True, only bonds where both atoms are in the given residue (if atom1 is a residue) are returned. If False, all bonds where either atom is in the given residue are returned.

  • either_way (bool) – If True, the order of the atoms does not matter, if False, the order of the atoms does matter. By setting this to false, it is possible to also search for bonds that have a specific atom in position 1 or 2 depending on which argument was set, while leaving the other input as none.

Returns:

bond – The bond(s). If no input is given, all bonds are returned as a generator.

Return type:

list or generator

get_chain(chain: str)[source]#

Get a chain from the structure based on its id.

Parameters:

chain – The chain id

Returns:

chain – The chain

Return type:

base_classes.Chain

get_chains()[source]#

get_coords(*atom_selector, **atom_selectors) ndarray[source]#

Get the coordinates of the atoms in the molecule

Parameters:

atom_selectors – Arguments or keyword arguments to pass to get_atoms(). If None, all atoms are selected.

Returns:

The coordinates

Return type:

np.ndarray

get_degree(atom: int | str | Atom)[source]#

Get the degree of an atom in the structure

Parameters:

atom – The atom to get the degree of, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

Returns:

The degree of the atom’s connectivity as the sum of the bond orders that connect it to its neighbors

Return type:

int
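
The degree as defined here (the sum of the bond orders around an atom) can be checked on a small hand-written bond list. The sketch uses formaldehyde and plain tuples. The function `degree` and the atom labels are illustrative and not part of the library.

```python
def degree(atom, bonds):
    """Sum of the orders of all bonds that involve `atom`."""
    return sum(order for a, b, order in bonds if atom in (a, b))


# Formaldehyde, H2C=O, as (atom, atom, bond order) tuples
bonds = [("C", "O", 2), ("C", "H1", 1), ("C", "H2", 1)]

carbon_degree = degree("C", bonds)
oxygen_degree = degree("O", bonds)
hydrogen_degree = degree("H1", bonds)
```

The carbon has only three neighbors but a degree of 4, because the double bond counts twice.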

get_descendants(atom1: str | int | Atom, atom2: str | int | Atom) set[source]#

Get the atoms downstream of a bond. This will return the set of all atoms that are connected after the bond atom1-atom2 in the direction of atom2, the selection can be reversed by reversing the order of atoms (atom2-atom1).

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

Returns:

A set of atoms

Return type:

set

Examples

For a molecule

```
         OH
        /
(1)CH3 -- CH
        \
         CH2 -- (2)CH3
```

>>> mol.get_descendants("(1)CH3", "CH")
{"OH", "CH2", "(2)CH3"}
>>> mol.get_descendants("CH", "CH2")
{"(2)CH3"}
>>> mol.get_descendants("CH2", "CH")
{"OH", "(1)CH3"}
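
The upstream and downstream selections can be reproduced with a breadth-first search on a plain adjacency dictionary. The sketch below rebuilds the example molecule from strings and treats ancestors as descendants with the atom order swapped, which is consistent with the documented example outputs. The function names and the data layout are hypothetical and do not reflect how BuildAMol stores its graph.

```python
from collections import deque

# The example molecule as a list of bonded atom labels
bonds = [
    ("(1)CH3", "CH"),
    ("CH", "OH"),
    ("CH", "CH2"),
    ("CH2", "(2)CH3"),
]


def build_adjacency(bonds):
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def descendants(adjacency, atom1, atom2):
    """Atoms reachable from atom2 without passing through atom1.

    Neither atom1 nor atom2 is part of the result.
    """
    seen = {atom1, atom2}
    found = set()
    queue = deque([atom2])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                found.add(neighbor)
                queue.append(neighbor)
    return found


def ancestors(adjacency, atom1, atom2):
    """Atoms on the atom1 side of the bond: descendants in reverse direction."""
    return descendants(adjacency, atom2, atom1)


graph = build_adjacency(bonds)
downstream = descendants(graph, "CH", "CH2")
upstream = ancestors(graph, "CH", "CH2")
```

If the bond is part of a ring, this simple search walks around the ring and returns almost every atom. How the library handles that case is not covered here.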

get_double_bonds()[source]#

Get all double bonds in the molecule

get_equatorial_hydrogen(atom: int | str | tuple | Atom) Atom[source]#

Get the equatorial hydrogen neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The equatorial hydrogen, if it exists, None otherwise

Return type:

Atom

get_equatorial_neighbor(atom: int | str | tuple | Atom) Atom[source]#

Get the equatorial neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The equatorial neighbor, if it exists, None otherwise

Return type:

Atom

get_hydrogen(atom: int | str | tuple | Atom) Atom[source]#

Get any hydrogen neighbor of an atom.

Parameters:

atom – The atom

Returns:

The hydrogen, if it exists, None otherwise

Return type:

Atom

get_hydrogens(atom: int | str | tuple | Atom = None) set[source]#

Get multiple hydrogen atoms

Parameters:

atom – A specific atom whose hydrogen neighbors should be returned. If None, all hydrogen atoms in the molecule are returned.

Returns:

A set of hydrogen atoms

Return type:

set

get_left_hydrogen(atom: int | str | tuple | Atom) Atom[source]#

Get the “left-protruding” hydrogen neighbor of an atom with two hydrogens and two non-hydrogen neighbors.

Parameters:

atom – The atom

Returns:

The left hydrogen, if it exists, None otherwise

Return type:

Atom

Example

In a molecule:

```
      H_B
       |
CH3 -- C -- CH2 -- OH
       |
      H_A
```

We want to get the left and right hydrogens of the central C atom (labeled only C). Using part of the logic behind R/S nomenclature for chiral centers, we prioritize the non-H neighbors and then rotate the molecule such that the highest order non-H neighbor points toward the user and the other non-H neighbor points away. The left and right hydrogens are then determined based on their orientation in this view.

In this case, the left hydrogen is H_A and the right hydrogen is H_B.

get_linkage()[source]#

Get the linkage that is currently set as the default attachment specification for this molecule

get_model(model: int = None) Model[source]#

Get a model from the molecule.

Parameters:

model (int) – The id of the model to get. If not provided, the current working model is returned.

Returns:

The model

Return type:

Model

get_models()[source]#

get_neighbors(atom: int | str | tuple | Atom, n: int = 1, mode: str = 'upto', filter: callable = None) set[source]#

Get the neighbors of an atom.

Parameters:
  • atom – The atom

  • n – The number of bonds that may separate the atom from its neighbors.

  • mode – The mode to use. Can be “upto” or “at”. If upto, all neighbors that are at most n bonds away are returned. If at, only neighbors that are exactly n bonds away are returned.

  • filter – A filter function that is applied to the neighbors. If the filter returns True, the atom is included in the result.

Returns:

A set of atoms

Return type:

set

Examples

For a molecule

```
               O --- (2)CH2
              /            \
(1)CH3 --- CH               OH
              \
               (1)CH2 --- (2)CH3
```

>>> mol.get_neighbors("(2)CH2", n=1)
{"O", "OH"}
>>> mol.get_neighbors("(2)CH2", n=2, mode="upto")
{"O", "OH", "CH"}
>>> mol.get_neighbors("(2)CH2", n=2, mode="at")
{"CH"}
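The difference between the "upto" and "at" modes can be sketched as a breadth-first search that records how many bonds separate each atom from the start atom. The code below uses a plain adjacency dictionary for the example molecule above; the `neighbors` function and its `accept` argument (standing in for `filter`) are hypothetical and only meant to mirror the documented behavior.

```python
# Bonds of the example molecule, written as an undirected adjacency map.
adjacency = {
    "(1)CH3": {"CH"},
    "CH": {"(1)CH3", "O", "(1)CH2"},
    "O": {"CH", "(2)CH2"},
    "(2)CH2": {"O", "OH"},
    "OH": {"(2)CH2"},
    "(1)CH2": {"CH", "(2)CH3"},
    "(2)CH3": {"(1)CH2"},
}


def neighbors(adjacency, atom, n=1, mode="upto", accept=None):
    """Return atoms within n bonds ("upto") or exactly n bonds ("at") of `atom`."""
    if mode not in ("upto", "at"):
        raise ValueError("mode must be 'upto' or 'at'")
    distance = {atom: 0}
    frontier = [atom]
    for depth in range(1, n + 1):
        next_frontier = []
        for current in frontier:
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = depth  # first visit gives the bond count
                    next_frontier.append(neighbor)
        frontier = next_frontier
    del distance[atom]  # the start atom is not its own neighbor
    if mode == "upto":
        hits = set(distance)
    else:
        hits = {a for a, d in distance.items() if d == n}
    if accept is not None:
        hits = {a for a in hits if accept(a)}
    return hits


print(neighbors(adjacency, "(2)CH2", n=2, mode="upto"))
print(neighbors(adjacency, "(2)CH2", n=2, mode="at"))
```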

get_quartets()[source]#

A generator for all atom quartets in the structure

get_residue(residue: int | str | tuple | Residue, by: str = None, chain=None)[source]#

Get a residue from the structure either based on its name, serial number or full_id. Note, if multiple residues match the requested criteria, for instance there are multiple ‘MAN’ from different chains, only the first one is returned.

Parameters:
  • residue – The residue id, seqid or full_id tuple

  • by (str) – The type of parameter to search for. Can be either ‘name’, ‘serial’ (or ‘seqid’) or ‘full_id’. By default, this is inferred from the data type of the residue parameter: an integer is assumed to be the sequence identifying number (serial number), a string is assumed to be the residue name, and a tuple is assumed to be the full_id.

  • chain (str) – Further restrict to a residue from a specific chain.

Returns:

residue – The residue

Return type:

base_classes.Residue

get_residue_connections(residue_a=None, residue_b=None, triplet: bool = True, rotatable_only: bool = False)[source]#

Get bonds between atoms that connect different residues in the structure. This method differs from infer_residue_connections in that it works with the bonds already present in the molecule instead of computing new ones.

Parameters:
  • residue_a (Union[int, str, tuple, base_classes.Residue]) – The residues to consider. If None, all residues are considered. Otherwise, only connections between the specified residues are considered.

  • residue_b (Union[int, str, tuple, base_classes.Residue]) – The residues to consider. If None, all residues are considered. Otherwise, only connections between the specified residues are considered.

  • triplet (bool) – Whether to include bonds between atoms that are in the same residue but neighboring a bond that connects different residues. This is useful for residues that have a side chain that is connected to the main chain. This is mostly useful if you intend to use the returned list for some purpose, because the additionally returned bonds are already present in the structure from inference or standard-bond applying and therefore do not actually add any particular information to the Molecule object itself.

  • rotatable_only (bool) – Whether to only return bonds that are rotatable. This is useful if you want to use the returned bonds for optimization.

Returns:

A list of tuples of atom pairs that are bonded and connect different residues

Return type:

list
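As a rough picture of "working with the already present bonds", the following sketch filters an existing bond list down to the bonds whose atoms sit in different residues. The data layout (`bonds`, `residue_of`) and the function are hypothetical, and the triplet and rotatable_only options are deliberately left out.

```python
# Hypothetical data: existing bonds and the residue (seqid) each atom belongs to.
bonds = [("C1A", "O1A"), ("O1A", "C1B"), ("C1B", "C2B")]
residue_of = {"C1A": 1, "O1A": 1, "C1B": 2, "C2B": 2}


def residue_connections(bonds, residue_of, residue_a=None, residue_b=None):
    """Keep only bonds whose two atoms belong to different residues."""
    selected = []
    for atom1, atom2 in bonds:
        pair = {residue_of[atom1], residue_of[atom2]}
        if len(pair) < 2:
            continue  # both atoms are in the same residue
        if residue_a is not None and residue_a not in pair:
            continue  # bond does not touch the first requested residue
        if residue_b is not None and residue_b not in pair:
            continue  # bond does not touch the second requested residue
        selected.append((atom1, atom2))
    return selected


print(residue_connections(bonds, residue_of))
```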

get_residue_graph(detailed: bool = False, locked: bool = True) ResidueGraph#

Generate a ResidueGraph for the Molecule

Parameters:
  • detailed (bool) – If True the graph will include the residues and all atoms that form bonds connecting different residues. If False, the graph will only include the residues and their connections without factual bonds between any existing atoms.

  • locked (bool) – If True, the graph will also migrate the information on any locked bonds into the graph. This is only relevant if detailed is True.

get_residues(*residues: int | str | tuple | Residue, by: str = None, chain=None, filter: callable = None)[source]#

Get residues from the structure either based on their name, serial number or full_id.

Parameters:
  • residues – The residues’ id, seqid or full_id tuple. If None is passed, the iterator over all residues is returned.

  • by (str) – The type of parameter to search for. Can be either ‘name’, ‘seqid’ (or ‘serial’) or ‘full_id’. By default, this is inferred from the data type of the residue parameter: an integer is assumed to be the sequence identifying number (serial number), a string is assumed to be the residue name, and a tuple is assumed to be the full_id.

  • chain (str) – Further restrict to residues from a specific chain.

Returns:

The residue(s)

Return type:

list or generator

get_right_hydrogen(atom: int | str | tuple | Atom) Atom[source]#

Get the “right-protruding” hydrogen neighbor of an atom with two hydrogens and two non-hydrogen neighbors.

Parameters:

atom – The atom

Returns:

The right hydrogen, if it exists, None otherwise

Return type:

Atom

Example

In a molecule:

```
      H_B
       |
CH3 -- C -- CH2 -- OH
       |
      H_A
```

We want to get the left and right hydrogens of the central C atom (labeled only C). Using part of the logic behind R/S nomenclature for chiral centers, we prioritize the non-H neighbors and then rotate the molecule such that the highest order non-H neighbor points toward the user and the other non-H neighbor points away. The left and right hydrogens are then determined based on their orientation in this view.

In this case, the left hydrogen is H_A and the right hydrogen is H_B.

get_root() Atom[source]#

Get the root atom of the molecule. The root atom is the atom at which the molecule is attached to another molecule.

get_single_bonds()[source]#

Get all single bonds in the molecule

get_structure() Structure[source]#

get_triple_bonds()[source]#

Get all triple bonds in the molecule

has_clashes(clash_threshold: float = 1.0, ignore_hydrogens: bool = True, coarse_precheck: bool = True) bool[source]#

Check if the molecule has any clashes.

Parameters:
  • clash_threshold (float, optional) – The minimal allowed distance between two atoms (in Angstrom).

  • ignore_hydrogens (bool, optional) – Whether to ignore clashes with hydrogen atoms (default: True)

  • coarse_precheck (bool, optional) – If set to True, a coarse-grained pre-screening at residue level is done to speed up the computation. However, this may cause the system to overlook clashes if individual residues are particularly large (e.g. lipids with long carbon chains).

Returns:

True if there are clashes, False otherwise.

Return type:

bool
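The core of a clash check is a pairwise distance test against the threshold. The sketch below shows that idea on plain tuples; it is a simplified stand-in with hypothetical names, it treats "ignoring hydrogens" as dropping them before the comparison (an assumption), and it omits the coarse residue-level precheck described above.

```python
import math
from itertools import combinations


def has_clashes(atoms, clash_threshold=1.0, ignore_hydrogens=True):
    """atoms: list of (element, (x, y, z)) tuples with coordinates in Angstrom."""
    if ignore_hydrogens:
        atoms = [atom for atom in atoms if atom[0] != "H"]
    # A clash is any pair of atoms closer than the threshold.
    return any(
        math.dist(first[1], second[1]) < clash_threshold
        for first, second in combinations(atoms, 2)
    )


atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("O", (1.4, 0.0, 0.0)),
    ("H", (0.5, 0.0, 0.0)),  # deliberately too close to the carbon
]
print(has_clashes(atoms))                          # hydrogens ignored
print(has_clashes(atoms, ignore_hydrogens=False))  # hydrogens included
```

The all-pairs loop scales quadratically with the number of atoms, which is the cost a residue-level prescreening is meant to reduce.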

has_hydrogens() bool[source]#

Check if the structure has hydrogen atoms

Returns:

True if the structure has hydrogen atoms, False otherwise

Return type:

bool

property id#

index_by_chain()[source]#

Reindex the residues in the structure by chain. This will let each chain start with a residue 1. This will not reindex the atoms, only the residues.
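The effect of per-chain indexing can be shown on a toy data structure: each chain restarts its residue numbering at 1, independent of the other chains. The dictionary layout and the helper below are hypothetical and only illustrate the numbering scheme.

```python
def index_by_chain(chains):
    """chains: dict mapping chain id -> ordered list of residue names.

    Returns the same chains with (seqid, name) pairs, numbered from 1 per chain.
    """
    return {
        chain_id: list(enumerate(residues, start=1))
        for chain_id, residues in chains.items()
    }


chains = {"A": ["GLC", "MAN", "MAN"], "B": ["GAL"]}
print(index_by_chain(chains))
```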

infer_bonds(max_bond_length: float = None, restrict_residues: bool = True, infer_bond_orders: bool = False) list[source]#

Infer bonds between atoms in the structure

Parameters:
  • max_bond_length (float) – The maximum distance between atoms to consider them bonded. If None, the default value is 1.6 Angstroms.

  • restrict_residues (bool) – Whether to restrict bonds to only those in the same residue. If False, bonds between atoms in different residues are also inferred.

  • infer_bond_orders (bool) – Whether to infer the bond orders (double and triple bonds) based on registered functional groups. This will slow the inference down, however.

Returns:

A list of tuples of atom pairs that are bonded

Return type:

list
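Distance-based bond inference can be sketched in a few lines: every pair of atoms closer than the cutoff is treated as bonded, optionally restricted to pairs within the same residue. The record layout and function below are hypothetical; whether the library compares with `<` or `<=`, and how it handles hydrogens or minimum distances, is not specified here.

```python
import math
from itertools import combinations


def infer_bonds(atoms, max_bond_length=1.6, restrict_residues=True):
    """atoms: list of dicts with the keys "id", "residue" and "coord"."""
    bonds = []
    for first, second in combinations(atoms, 2):
        if restrict_residues and first["residue"] != second["residue"]:
            continue  # skip pairs that span two residues
        if math.dist(first["coord"], second["coord"]) <= max_bond_length:
            bonds.append((first["id"], second["id"]))
    return bonds


atoms = [
    {"id": "C1", "residue": 1, "coord": (0.0, 0.0, 0.0)},
    {"id": "O1", "residue": 1, "coord": (1.43, 0.0, 0.0)},
    {"id": "C2", "residue": 2, "coord": (2.86, 0.0, 0.0)},
]
print(infer_bonds(atoms))                           # intra-residue only
print(infer_bonds(atoms, restrict_residues=False))  # also across residues
```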

infer_bonds_for(*residues_or_atoms: Residue | Atom, max_bond_length: float = None, infer_bond_orders: bool = False)[source]#

Infer bonds between atoms in the structure for a specific set of residues or atoms

Parameters:
  • residues_or_atoms – The residues or atoms to consider

  • max_bond_length (float) – The maximum distance between atoms to consider them bonded. If None, the default value is 1.6 Angstroms.

  • infer_bond_orders (bool) – Whether to infer the bond orders (double and triple bonds) based on registered functional groups. This will slow the inference down, however.

Returns:

A list of tuples of atom pairs that are bonded

Return type:

list

Changed in version 1.2.10: infer_bonds_for now works with both residues and individual atoms but only accepts Residue and Atom objects as input and cannot search for them via serial numbers or ids. To keep using the old behavior where only residues were supported via any identifier use the infer_bonds_for_residues method instead.

infer_bonds_for_atoms(*atoms: Atom, max_bond_length: float = None, infer_bond_orders: bool = False)[source]#

Infer bonds between atoms in the structure for a specific set of atoms

Parameters:
  • atoms – The atoms to consider

  • max_bond_length (float) – The maximum distance between atoms to consider them bonded. If None, the default value is 1.6 Angstroms.

  • infer_bond_orders (bool) – Whether to infer the bond orders (double and triple bonds) based on registered functional groups. This will slow the inference down, however.

Returns:

A list of tuples of atom pairs that are bonded

Return type:

list

infer_bonds_for_residues(*residues, max_bond_length: float = None, infer_bond_orders: bool = False)[source]#

Infer bonds between atoms in the structure for a specific set of residues

Parameters:
  • residues – The residues to consider

  • max_bond_length (float) – The maximum distance between atoms to consider them bonded. If None, the default value is 1.6 Angstroms.

  • infer_bond_orders (bool) – Whether to infer the bond orders (double and triple bonds) based on registered functional groups. This will slow the inference down, however.

Returns:

A list of tuples of atom pairs that are bonded

Return type:

list

infer_residue_connections(bond_length: float | tuple = None, triplet: bool = True) list[source]#

Infer bonds between atoms that connect different residues in the structure

Parameters:
  • bond_length (float or tuple) – If a float is given, the maximum distance between atoms to consider them bonded. If a tuple, the minimal and maximal distance between atoms. If None, the default value is min 0.8 Angstrom, max 1.6 Angstroms.

  • triplet (bool) – Whether to include bonds between atoms that are in the same residue but neighboring a bond that connects different residues. This is useful for residues that have a side chain that is connected to the main chain. This is mostly useful if you intend to use the returned list for some purpose, because the additionally returned bonds are already present in the structure from inference or standard-bond applying and therefore do not actually add any particular information to the Molecule object itself.

Returns:

A list of bonds that link atoms from different residues.

Return type:

list

Examples

For a molecule with the following structure:

```
      connection -->  OA      OB --- H
                     /  \    /
      (1)CA --- (2)CA    (1)CB
     /               \        \
(6)CA                 (3)CA    (2)CB --- (3)CB
     \               /
      (5)CA --- (4)CA
```

The circular residue A and linear residue B are connected via the oxygen OA, which is bonded to both (2)CA and (1)CB. Because OA is originally associated with residue A, only the bond OA --- (1)CB is returned with triplet=False. With triplet=True (the default), the bond OA --- (2)CA is also returned, because the entire connecting “bridge” between residues A and B spans both bonds around OA.

>>> mol.infer_residue_connections(triplet=False)
[("OA", "(1)CB")]
>>> mol.infer_residue_connections(triplet=True)
[("OA", "(1)CB"), ("OA", "(2)CA")]

is_cis(*bond: Atom | tuple | Bond) bool[source]#

Check if the atoms in the bond are in a cis configuration.

Parameters:

*bond (Atom or tuple or Bond) – The bond to check

Returns:

Whether the bond is in a cis configuration

Return type:

bool

is_locked(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#

Check if a bond is locked

Parameters:
  • atom1 – The atoms of the bond to check, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

  • atom2 – The atoms of the bond to check, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

Returns:

True if the bond is locked, False otherwise

Return type:

bool

is_trans(*bond: Atom | tuple | Bond) bool[source]#

Check if the atoms in the bond are in trans configuration

Parameters:

*bond (Atom or tuple or Bond) – The bond to check

Returns:

Whether the bond is in a trans configuration

Return type:

bool
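One common way to decide between cis and trans is to look at the dihedral angle spanned by the two bond atoms and one substituent on each side. The sketch below assumes that convention (|dihedral| below 90° counts as cis, above 90° as trans); the criterion actually used by is_cis and is_trans may differ, and all function names here are illustrative.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees around the p1-p2 axis."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def is_cis(p0, p1, p2, p3):
    return abs(dihedral(p0, p1, p2, p3)) < 90.0


def is_trans(p0, p1, p2, p3):
    return abs(dihedral(p0, p1, p2, p3)) > 90.0


# Substituents on the same side of the central bond (cis) ...
cis_points = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
# ... and on opposite sides (trans).
trans_points = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0)]
print(is_cis(*cis_points), is_trans(*trans_points))
```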

Softlink atoms to the structure. This will add the atoms to the index of the maintained structure but it will not adjust the atoms’ own parent references. This is useful if you want to have atoms be accessible from multiple Molecule objects.

Parameters:

atoms (base_classes.Atom) – The atoms to link

Softlink chains to the structure. This will add the chains to the index of the maintained structure but it will not adjust the chains’ own parent references. This is useful if you want to have chains be accessible from multiple Molecule objects.

Parameters:

chains (base_classes.Chain) – The chains to link

Softlink residues to the structure. This will add the residues to the index of the maintained structure but it will not adjust the residues’ own parent references. This is useful if you want to have residues be accessible from multiple Molecule objects.

Parameters:
  • residues (base_classes.Residue) – The residues to link

  • chain (str or base_classes.Chain) – The chain to which the residues should be linked. If None, the residues are linked to the current working chain.

property linkage#

The patch or recipe to use for attaching other molecules to this one

classmethod load(filename: str)[source]#

Load a Molecule from a pickle file

Parameters:

filename (str) – Path to the file

lock_all()[source]#

Lock all bonds in the structure so they cannot be rotated around

lock_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#

Lock a bond between two atoms

Parameters:
  • atom1 – The atoms of the bond to lock, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

  • atom2 – The atoms of the bond to lock, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

property locked_bonds#

All bonds that are locked and cannot be rotated around.

make_atom_graph(_copy: bool = True) AtomGraph#

Get an AtomGraph for the Molecule

Parameters:

_copy (bool) – If True, a new AtomGraph is returned instead of the “original” AtomGraph object that the Molecule relies on. However, the molecule will still be linked to the new graph. This is useful if you want to make changes to the graph itself (not including changes to the graph nodes, i.e. the atoms themselves, such as rotations).

Returns:

The generated graph

Return type:

AtomGraph

make_residue_graph(detailed: bool = False, locked: bool = True) ResidueGraph[source]#

Generate a ResidueGraph for the Molecule

Parameters:
  • detailed (bool) – If True the graph will include the residues and all atoms that form bonds connecting different residues. If False, the graph will only include the residues and their connections without factual bonds between any existing atoms.

  • locked (bool) – If True, the graph will also migrate the information on any locked bonds into the graph. This is only relevant if detailed is True.

property mass#

The total mass of the molecule

merge(other, adjust_indexing: bool = True)[source]#

Merge another molecule into this one. This will simply add all chains, residues, and atoms of the other molecule to this one. It will NOT perform any kind of geometrical alignment.

Parameters:
  • other (Molecule) – The other molecule to merge into this one

  • adjust_indexing (bool) – Whether to adjust the indexing of the atoms and residues in the merged molecule

property model#

The working model of the structure

property models#

A list of all models in the base-structure

move(vector: ndarray)[source]#

Move the molecule in 3D space

Parameters:

vector (np.ndarray) – The vector to move the molecule by

move_to(pos: ndarray)[source]#

Move the molecule to a specific position in 3D space

Parameters:

pos (np.ndarray) – The position to move the molecule to. This will be the new center of geometry.
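The difference between move (shift by a vector) and move_to (put the center of geometry at a position) can be sketched on a bare list of coordinates. The helpers below are hypothetical and use plain tuples rather than NumPy arrays to stay self-contained.

```python
def center_of_geometry(coords):
    """Unweighted mean position of all atoms."""
    count = len(coords)
    return tuple(sum(point[i] for point in coords) / count for i in range(3))


def move(coords, vector):
    """Translate every atom by the same vector."""
    return [tuple(point[i] + vector[i] for i in range(3)) for point in coords]


def move_to(coords, pos):
    """Translate so that the center of geometry ends up at `pos`."""
    center = center_of_geometry(coords)
    shift = tuple(pos[i] - center[i] for i in range(3))
    return move(coords, shift)


coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
moved = move_to(coords, (5.0, 5.0, 5.0))
print(moved)
print(center_of_geometry(moved))
```

In this picture move_to is simply move with a vector computed from the current center, so relative atom positions are unchanged by either call.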

nglview()[source]#

View the molecule in 3D through nglview

property patch#

The patch to use for attaching other molecules to this one (synonym for recipe)

place(pos: ndarray)#

Move the molecule to a specific position in 3D space

Parameters:

pos (np.ndarray) – The position to move the molecule to. This will be the new center of geometry.

plotly(residue_graph: bool = False, atoms: bool = True, line_color: str = 'black')[source]#

Prepare a view of the molecule in 3D using Plotly but do not open a browser window.

Parameters:
  • residue_graph (bool) – If True, a residue graph is shown instead of the full structure.

  • atoms (bool) – Whether to draw the atoms (default: True)

  • line_color (str) – The color of the lines connecting the atoms

Returns:

viewer – The viewer object

Return type:

MoleculeViewer3D

purge_bonds(atom: int | str | Atom = None)[source]#

Remove all bonds connected to an atom

Parameters:

atom – The atom to remove the bonds from, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms. If None, all bonds are removed.

py3dmol(style: str = 'stick', color: str = None, size: tuple = None)[source]#

View the molecule in 3D through py3Dmol

Parameters:
  • style (str) – The style to use for the visualization. Can be “line”, “stick”, “sphere”, “cartoon”, “surface”, or “label”

  • color (str) – A specific color to use for the visualization

  • size (tuple) – The size of the view as a tuple of (width, height) in pixels.

quartet(atom1: str | int | Atom, atom2: str | int | Atom, atom3: str | int | Atom, atom4: str | int | Atom)[source]#

Make an atom quartet from four atoms.

Parameters:
  • atom1 – The four atoms that make up the quartet.

  • atom2 – The four atoms that make up the quartet.

  • atom3 – The four atoms that make up the quartet.

  • atom4 – The four atoms that make up the quartet.

property recipe#

The recipe to use for stitching other molecules to this one (synonym for patch)

reindex(start_chainid: int = 1, start_resid: int = 1, start_atomid: int = 1)[source]#

Reindex the atoms and residues in the structure. You can use this method if you made substantial changes to the molecule and want to be sure that there are no gaps in the atom and residue numbering.

Parameters:
  • start_chainid (int) – The starting chain id (default: 1=A, 2=B, …, 26=Z, 27=AA, 28=AB, …)

  • start_resid (int) – The starting residue id

  • start_atomid (int) – The starting atom id
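The mapping from an integer chain id to a letter label (1=A, 26=Z, 27=AA, 28=AB) matches spreadsheet-style column naming. The following sketch implements that scheme; `chain_label` is a hypothetical helper, and how the library continues the sequence beyond the values listed above is an assumption.

```python
def chain_label(index):
    """Convert a 1-based chain index into a letter label (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("chain index must be >= 1")
    label = ""
    while index > 0:
        # Shift to 0-based before dividing so that 26 maps to "Z", not "A0".
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


print([chain_label(i) for i in (1, 2, 26, 27, 28)])
```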

relabel_hydrogens()[source]#

Relabel hydrogen atoms in the structure to match the standard labelling according to the CHARMM force field. This is useful if you want to use some pre-generated PDB file that may have used a different labelling scheme for atoms.

remove_atoms(*atoms: int | str | tuple | Atom) list[source]#

Remove one or more atoms from the structure and return them.

Parameters:

atoms – The atoms to remove, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

Returns:

The removed atoms

Return type:

list

remove_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#

remove_chains(*chains: int | Chain) list[source]#

Remove chains from the structure and return them.

Parameters:

chains (int or Chain) – The chains to remove, either the object itself or its id

Returns:

The removed chains

Return type:

list

remove_empty_chains()[source]#

remove_empty_models()[source]#

remove_empty_residues()[source]#

remove_hydrogens(*atoms: int | str | Atom)[source]#

remove_model(model: int | Model)[source]#

Remove a model and all its chains from the molecule and return the removed model.

Parameters:

model (int or Model) – The model to remove

Returns:

The removed model

Return type:

Model

remove_residues(*residues: int | Residue) list[source]#

Remove residues from the molecule and return them.

Parameters:

residues (int or base_classes.Residue) – The residues to remove, either the object itself or its seqid

Returns:

The removed residues

Return type:

list

rename_atom(atom: int | Atom, name: str, residue: int | Residue = None)[source]#

Rename an atom

Parameters:
  • atom (int or base_classes.Atom) – The atom to rename, either the object itself or its serial number

  • name (str) – The new name (id)

  • residue (int or base_classes.Residue) – The residue to which the atom belongs, either the object itself or its seqid. Useful when giving a possibly redundant id as identifier in multi-residue molecules.

rename_atoms(old_name: str, new_name: str, residue_name: str = None)[source]#

Rename multiple atoms to the same name

Parameters:
  • old_name (str) – The name of the atoms to rename

  • new_name (str) – The new name

  • residue_name (str) – The name of the residue of the atoms to rename (if only atoms from a specific type of residue should be renamed).

rename_chain(chain: str | Chain, name: str)[source]#

Rename a chain

Parameters:
  • chain (str or Chain) – The chain to rename, either the object itself or its id

  • name (str) – The new name

rename_residue(residue: int | Residue, name: str)[source]#

Rename a residue

Parameters:
  • residue (int or Residue) – The residue to rename, either the object itself or its seqid

  • name (str) – The new name

rename_residues(old_name: str, new_name: str)[source]#

Rename multiple residues to the same name

Parameters:
  • old_name (str) – The name of the residues to rename

  • new_name (str) – The new name

property residues#

A sorted list of all residues in the molecule

property root_atom#

The root atom of this molecule/scaffold at which it is attached to another molecule/scaffold

property root_residue#

The residue of the root atom

rotate(angle: float, axis: ndarray, center: ndarray = None, angle_is_degrees: bool = True)[source]#

Rotate the molecule around an axis

Parameters:
  • angle (float) – The angle to rotate by

  • axis (np.ndarray or str) – The axis to rotate around. This must be a unit vector. Alternatively, it may be one of the strings “x”, “y”, or “z” to rotate around the respective axes.

  • center (np.ndarray) – The center of the rotation. By default the center of geometry is used to achieve relative rotations (i.e. without translation). Use “absolute” if you want to rotate around the literal axes.

  • angle_is_degrees (bool) – Whether the angle is given in degrees (default) or radians
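A rotation around a unit-vector axis through a chosen center can be written with Rodrigues' rotation formula. The sketch below applies it to a list of coordinate tuples and defaults to the center of geometry, as described for the center parameter; it is a simplified illustration with hypothetical names, and the "absolute" option is not covered.

```python
import math

AXES = {"x": (1.0, 0.0, 0.0), "y": (0.0, 1.0, 0.0), "z": (0.0, 0.0, 1.0)}


def rotate(coords, angle, axis, center=None, angle_is_degrees=True):
    """Rotate points around a unit-vector axis that passes through `center`."""
    kx, ky, kz = AXES[axis] if isinstance(axis, str) else axis
    if angle_is_degrees:
        angle = math.radians(angle)
    if center is None:
        # Default: the center of geometry, so the molecule turns in place.
        count = len(coords)
        center = tuple(sum(p[i] for p in coords) / count for i in range(3))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    rotated = []
    for point in coords:
        vx, vy, vz = (point[i] - center[i] for i in range(3))
        k_dot_v = kx * vx + ky * vy + kz * vz
        # Cross product k x v
        cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        # Rodrigues: v*cos(a) + (k x v)*sin(a) + k*(k.v)*(1 - cos(a))
        rotated.append((
            center[0] + vx * cos_a + cx * sin_a + kx * k_dot_v * (1 - cos_a),
            center[1] + vy * cos_a + cy * sin_a + ky * k_dot_v * (1 - cos_a),
            center[2] + vz * cos_a + cz * sin_a + kz * k_dot_v * (1 - cos_a),
        ))
    return rotated


print(rotate([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], 90, "z"))
```

The formula only gives a pure rotation when the axis has length 1, which is why the axis is required to be a unit vector.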

rotate_ancestors(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, angle_is_degrees: bool = True)[source]#

Rotate all ancestor atoms (atoms before atom1) of a bond

Parameters:
  • atom1 (Union[str, int, base_classes.Atom]) – The first atom (whose upstream neighbors are rotated)

  • atom2 (Union[str, int, base_classes.Atom]) – The second atom

  • angle (float) – The angle to rotate by

  • angle_is_degrees (bool) – Whether the angle is given in degrees (default) or radians

rotate_around_bond(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, descendants_only: bool = False, angle_is_degrees: bool = True)[source]#

Rotate the structure around a bond

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • angle – The angle to rotate by (in degrees by default, see angle_is_degrees)

  • descendants_only – Whether to only rotate the descendants of the bond, i.e. only atoms that come after atom2 (sensible only for linear molecules, or bonds that are not part of a circular structure).

  • angle_is_degrees – Whether the angle is given in degrees (default) or radians

Examples

For a molecule starting as:

```
              OH
             /
(1)CH3 --- CH
             \
              CH2 --- (2)CH3
```

we can rotate around the bond (1)CH3 --- CH by 180° using

>>> import numpy as np
>>> angle = 180
>>> mol.rotate_around_bond("(1)CH3", "CH", angle)

and thus achieve the following:

```
              CH2 --- (2)CH3
             /
(1)CH3 --- CH
             \
              OH
```

rotate_descendants(atom1: str | int | Atom, atom2: str | int | Atom, angle: float, angle_is_degrees: bool = True)[source]#

Rotate all descendant atoms (atoms after atom2) of a bond.

Parameters:
  • atom1 (Union[str, int, base_classes.Atom]) – The first atom

  • atom2 (Union[str, int, base_classes.Atom]) – The second atom (whose downstream neighbors are rotated)

  • angle (float) – The angle to rotate by

  • angle_is_degrees (bool) – Whether the angle is given in degrees (default) or radians

save(filename: str)[source]#

Save the object to a pickle file

Parameters:

filename (str) – Path to the pickle file

search_by_constraints(constraints: list) list[source]#

Search for atoms based on a list of constraints. The constraints must be constraint functions from structural.neighbors.constraints. Each entry in the constraints list represents the constraints for one specific atom. Constraints apply to atom neighborhoods not the atom graph as a whole! This means that constraints are applied to the neighbors of the atoms when searching!

Parameters:

constraints (list) – A list of constraint functions

Returns:

A list of matching atoms. Each entry in this list will be a dictionary mapping the atoms (values) to the constraint function index for which they match (key).

Return type:

list

Examples

For a molecule

```
              OH
             /
(1)CH3 --- CH
             \
              CH2 --- (2)CH3
```

we can search for the methyl groups by using the following constraints:

>>> from buildamol.core.structural.neighbors import constraints
>>> constraints = [
...     # the first atom must be a carbon and have three hydrogen neighbors
...     # we only search for the methyl-carbons...
...     constraints.multi_constraint(
...         constraints.has_element("C"),
...         constraints.has_neighbor_hist({"H": 3}),
...     ),
...     ]
>>> mol.search_by_constraints(constraints)
[{0: (1)C}, {0: (2)C}]

set_attach_residue(residue: int | Residue = None)[source]#

Set the residue that is used for attaching other molecules to this one.

Parameters:

residue – The residue to be used for attaching other molecules to this one

set_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom, order: int = 1)[source]#

Specify a bond between two atoms. The difference between this method and add_bond is that the latter can be used to incrementally add bond orders (i.e. make a double bond out of a single bond by calling the method twice). This method will always set the bond order to the provided value.

Parameters:
  • atom1 – The atoms to bond, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

  • atom2 – The atoms to bond, which can either be directly provided (biopython object) or by providing the serial number, the full_id or the id of the atoms.

  • order (int) – The order of the bond, i.e. 1 for single, 2 for double, 3 for triple, etc.
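The contrast between incrementing a bond order (add_bond) and overwriting it (set_bond) can be shown with a small dictionary-based sketch. The two functions below are hypothetical stand-ins that follow the behavior described above, not the library's implementation.

```python
def add_bond(bonds, atom1, atom2, order=1):
    """Incremental: calling this twice on the same pair yields a double bond."""
    key = frozenset((atom1, atom2))  # order of the two atoms does not matter
    bonds[key] = bonds.get(key, 0) + order


def set_bond(bonds, atom1, atom2, order=1):
    """Absolute: the bond order is always replaced by the given value."""
    bonds[frozenset((atom1, atom2))] = order


bonds = {}
add_bond(bonds, "C1", "C2")
add_bond(bonds, "C2", "C1")   # same bond again, now order 2
print(bonds[frozenset(("C1", "C2"))])

set_bond(bonds, "C1", "C2", order=1)  # back to a single bond
print(bonds[frozenset(("C1", "C2"))])
```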

set_bond_order(atom1, atom2, order: int, adjust_hydrogens: bool = False)[source]#

Set the order of a bond between two atoms

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • order (int) – The order of the bond

  • adjust_hydrogens (bool) – Whether to adjust the number of hydrogens on the atoms based on the bond order

set_bonds(*bonds)[source]#

Specify multiple bonds at once. The difference between this method and add_bonds is that the latter can be used to incrementally add bond orders (i.e. make a double bond out of a single bond by calling the method twice or certain bonds are specified multiple times in the arguments). This method will always set the bond order to the provided value.

Parameters:

bonds – The bonds to add, each bond is a tuple of two atoms. Each atom may be specified directly (BuildAMol object) or by providing the serial number, the full_id or the id of the atoms.

set_charge(atom: str | int | tuple | Atom, charge: int, adjust_protonation: bool = True)[source]#

Set the charge of an atom. By default, this also adjusts the number of protons on the atom to match the new charge (see adjust_protonation).

Parameters:
  • atom (str or int or tuple or Atom) – The atom whose charge should be changed

  • charge (int) – The new charge. This is NOT the charge difference to apply but the final charge of the atom.

  • adjust_protonation (bool) – If True, adjust the number of protons on the atom to match the charge.

set_coords(coords: ndarray, *atom_selector, **atom_selectors)[source]#

Set the coordinates of the atoms in the molecule

Parameters:
  • coords (np.ndarray) – The new coordinates

  • atom_selectors – Arguments or keyword arguments to pass to get_atoms(). If no selectors are given, all atoms are selected. The number and order of atoms in the selection must match the number and order of coordinates.

set_linkage(link: str | Linkage = None, _topology=None)[source]#

Set a linkage to be used for attaching other molecules to this one

Parameters:
  • link (str or Linkage) – The linkage to be used. Can be either a string with the name of a known Linkage in the loaded topology, or an instance of the Linkage class. If None is given, the currently loaded default linkage is removed.

  • _topology – The topology to use for referencing the link.

set_model(model: int)[source]#

Set the current working model of the molecule

Parameters:

model (int) – The id of the model to set as active

set_parent(obj: Atom | Residue | Chain | Model, parent: Residue | Chain | Model)[source]#

Reassign a structural component like an Atom to a new parent object.

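Parameters:
  • obj (Atom or Residue or Chain or Model) – The structural component to reassign

  • parent (Residue or Chain or Model) – The new parent object
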
set_root(atom)[source]#

Set the root atom of the molecule

Parameters:

atom (Atom or int or str or tuple) – The atom to be used as the root atom. This may be an Atom object, an atom serial number, an atom id (must be unique), or the full-id tuple.

show(*args, **kwargs)[source]#
show2d(*args, **kwargs)[source]#

View the molecule in 2D

show3d(*args, **kwargs)#
single(atom1, atom2, adjust_hydrogens: bool = False)[source]#

Set a single bond between two atoms

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • adjust_hydrogens (bool) – Whether to adjust the number of hydrogens on the atoms based on the bond order

split_contiguous(target_residues: list = None)[source]#

Split residues that contain multiple contiguous atom groups into separate residues. Residues that are split will be removed from the molecule and replaced with the new residues labeled “UNL_X” where X is a counter. The indexing is not affected by this operation (i.e. atom serials are not changed).

Parameters:

target_residues (list) – A list of residues to split. If None, all residues are split.
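Finding "contiguous atom groups" amounts to finding connected components of the bond graph within a residue. The sketch below shows that idea with plain Python containers; `contiguous_groups` is a hypothetical helper, and starting the "UNL_X" counter at 1 is an assumption.

```python
def contiguous_groups(atoms, bonds):
    """Return the connected components of `atoms` under `bonds` (atom pairs)."""
    neighbors = {a: set() for a in atoms}
    for a, b in bonds:
        if a in neighbors and b in neighbors:
            neighbors[a].add(b)
            neighbors[b].add(a)
    groups, seen = [], set()
    for start in atoms:
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:  # depth-first walk over bonded atoms
            current = stack.pop()
            group.append(current)
            for nxt in sorted(neighbors[current]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups

residue_atoms = ["C1", "C2", "O1", "N1", "H1"]
residue_bonds = [("C1", "C2"), ("C2", "O1"), ("N1", "H1")]
groups = contiguous_groups(residue_atoms, residue_bonds)
# One new residue label per group (counter start is an assumption)
new_names = [f"UNL_{i}" for i, _ in enumerate(groups, start=1)]
```

Here the residue falls apart into two groups, so it would be replaced by two new residues.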

split_models(_copy: bool = False) list[source]#

Split the molecule into multiple molecules, each containing one of the models.

split_residues()[source]#

Split the molecule into separate residues, creating a list of new molecules, each with a single residue.

squash(chain_id: str = 'A', resname: str = 'UNK')[source]#

Turn the entire molecule into a single chain with a single residue.

squash_chains(chain_id: str = 'A')[source]#

Turn all chains of the molecule into a single chain but preserve the residues.

stack(axis: str | ndarray, n: int, pad: float = 0)[source]#

Stack the molecule along an axis. This will create n copies of the molecule along the axis with a padding of pad between them. This method is a convenience wrapper for move and merge and will not perform any kind of alignment or rotation.

Parameters:
  • axis (str or np.ndarray) – The axis to stack along. This can be either a unit vector or one of the strings “x”, “y”, or “z” to stack along the respective axes.

  • n (int) – The number of copies to stack

  • pad (float) – The padding between the copies
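A rough picture of stacking is a repeated translation along the axis. The following sketch works on bare coordinate tuples and is not the library's code: `stack_copies` is a hypothetical helper, the spacing (molecule extent along the axis plus `pad`) is an assumption, and so is counting the unmoved original as the first of the n copies.

```python
def stack_copies(coords, axis, n, pad=0.0):
    """Return n shifted copies of `coords`, laid out along the unit vector `axis`."""
    # Extent of the molecule along the axis (projection of each atom onto it)
    projections = [sum(c * a for c, a in zip(xyz, axis)) for xyz in coords]
    step = (max(projections) - min(projections)) + pad
    copies = []
    for i in range(n):
        shift = [a * step * i for a in axis]
        copies.append([tuple(c + s for c, s in zip(xyz, shift)) for xyz in coords])
    return copies

# The strings "x", "y", "z" map onto the respective unit vectors
AXES = {"x": (1.0, 0.0, 0.0), "y": (0.0, 1.0, 0.0), "z": (0.0, 0.0, 1.0)}
molecule = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
stacked = stack_copies(molecule, AXES["x"], n=3, pad=1.0)
```

No rotation or alignment takes place; each copy is only shifted, which matches the description of the method as a wrapper around moving and merging.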

property structure#

The buildamol base-structure

superimpose_to_atom(ref_atom: Atom | int | str, other_atom: Atom | ndarray)[source]#

Superimpose the molecule to another molecule based on two atoms. This will move this molecule so that the atom in ref_atom is superimposed to the atom in other_atom.

Parameters:
  • ref_atom (Atom or int or str) – The atom to superimpose in this molecule

  • other_atom (Atom or np.ndarray) – The atom to superimpose to in the other molecule or an arbitrary coordinate
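Superimposing on a single atom can be thought of as a translation by the difference between the two positions. The sketch below assumes a pure translation without rotation and uses a hypothetical helper, `superimpose_to_point`, on plain coordinate tuples.

```python
def superimpose_to_point(coords, ref_index, target):
    """Translate all coordinates so that coords[ref_index] lands on `target`."""
    ref = coords[ref_index]
    shift = tuple(t - r for t, r in zip(target, ref))
    return [tuple(c + s for c, s in zip(xyz, shift)) for xyz in coords]

mol = [(1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
# The target may be another atom's position or any arbitrary coordinate
moved = superimpose_to_point(mol, ref_index=0, target=(0.0, 0.0, 5.0))
```

All atoms receive the same shift, so distances within the molecule are unchanged.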

superimpose_to_bond(ref_bond: tuple | Bond, other_bond: tuple | Bond)[source]#

Superimpose the molecule to another molecule based on two bonds. This will move this molecule so that the atoms in ref_bond are superimposed to the atoms in other_bond.

Parameters:
  • ref_bond (tuple or Bond) – The bond to reference in this molecule

  • other_bond (tuple or Bond) – The bond to superimpose to in the other molecule

superimpose_to_pair(pair1, pair2)[source]#

Superimpose the molecule to another molecule based on two atom pairs (they do not need to be bonded). This will move this molecule so that the atoms in pair1 are superimposed to the atoms in pair2.

Parameters:
  • pair1 (tuple) – The pair to superimpose in this molecule. These may either be Atom objects or any input which can be used to get atoms in this molecule.

  • pair2 (tuple) – The pair to superimpose to. These must be either Atom objects or arbitrary coordinates (np.ndarray).

superimpose_to_residue(ref_residue, other_residue)[source]#

Superimpose the molecule to another molecule based on two residues. This will move this molecule so that the residues are superimposed.

Parameters:
  • ref_residue (Residue or int or str) – The residue to superimpose in this molecule

  • other_residue (Residue) – The residue to superimpose to in the other molecule

superimpose_to_triplet(ref_triplet: tuple, other_triplet: tuple)[source]#

Superimpose the molecule to another molecule based on two atom triplets. This will move this molecule so that the atoms in ref_triplet are superimposed to the atoms in other_triplet.

Parameters:
  • ref_triplet (tuple) – The triplet to superimpose in this molecule. These may either be Atom objects or any input which can be used to get atoms in this molecule.

  • other_triplet (tuple) – The triplet to superimpose to. These must be either Atom objects or arbitrary coordinates (np.ndarray).

to_biopython()[source]#

Convert the molecule to a Biopython structure

Returns:

The Biopython structure

Return type:

Bio.PDB.Structure.Structure

to_cif(filename: str)[source]#

Write the molecule to a CIF file

Parameters:

filename (str) – Path to the CIF file

to_json(filename: str, type: str = None, names: list = None, identifiers: list = None, one_letter_code: str = None, three_letter_code: str = None)[source]#

Write the molecule to a JSON file

Parameters:
  • filename (str) – Path to the JSON file

  • type (str) – The type of the molecule to be written to the JSON file (e.g. “protein”, “ligand”, etc.).

  • names (list) – A list of names of the molecules to be written to the JSON file.

  • identifiers (list) – A list of identifiers of the molecules to be written to the JSON file (e.g. SMILES, InChI, etc.).

  • one_letter_code (str) – A one-letter code for the molecule to be written to the JSON file.

  • three_letter_code (str) – A three-letter code for the molecule to be written to the JSON file.

to_molfile(filename: str)[source]#

Write the molecule to a Molfile

Parameters:

filename (str) – Path to the Mol file

to_numpy(export_bonds: bool = True)[source]#

Convert the molecule to numpy arrays

Parameters:

export_bonds (bool) – If True, the bonds are also exported. If False, the bond array will remain empty.

Returns:

The atomic numbers and atomic coordinates in one array and the bonds with atom serial numbers and bond order in a second array

Return type:

tuple

to_openmm()[source]#

Convert the molecule to an OpenMM Topology

Return type:

openmm.app.PDBFile

to_pdb(filename: str, symmetric: bool = True)[source]#

Write the molecule to a PDB file

Parameters:
  • filename (str) – Path to the PDB file

  • symmetric (bool) – If True, bonds are written symmetrically - i.e. if atom A is bonded to atom B, then atom B is also bonded to atom A, and both atoms will get an entry in the “CONECT” section. If False, only one of the atoms will get an entry in the “CONECT” section.
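The effect of the symmetric flag on the "CONECT" section can be sketched with a small record builder. This is an illustration only: `conect_records` is a hypothetical helper, it uses the fixed-width PDB column layout (a 6-character record name followed by 5-character serial fields), and it ignores the PDB limit of four bonded atoms per record.

```python
def conect_records(bonds, symmetric=True):
    """Build PDB CONECT lines from (serial_a, serial_b) pairs."""
    partners = {}
    for a, b in bonds:
        partners.setdefault(a, []).append(b)
        if symmetric:
            # Also list the bond from the second atom's point of view
            partners.setdefault(b, []).append(a)
    lines = []
    for serial in sorted(partners):
        bonded = "".join(f"{p:>5d}" for p in sorted(partners[serial]))
        lines.append(f"CONECT{serial:>5d}{bonded}")
    return lines

bonds = [(1, 2), (2, 3)]
symmetric_lines = conect_records(bonds, symmetric=True)
one_sided_lines = conect_records(bonds, symmetric=False)
```

With symmetric output all three atoms get an entry; with one-sided output only the first atom of each bond does.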

to_pdbqt(filename: str)[source]#

Write the molecule to a PDBQT file

Parameters:

filename (str) – Path to the PDBQT file

to_pybel()[source]#

Convert the molecule to a Pybel molecule

Returns:

The Pybel molecule

Return type:

pybel.Molecule

to_rdkit()[source]#

Convert the molecule to an RDKit molecule

Returns:

The RDKit molecule

Return type:

rdkit.Chem.rdchem.Mol

to_stk()[source]#

Convert the molecule to an STK molecule

Returns:

The STK molecule

Return type:

stk.BuildingBlock

to_xml(filename: str, atom_attributes: list = None)[source]#

Write the molecule to an XML file

Parameters:
  • filename (str) – Path to the XML file

  • atom_attributes (list) –

    A list of attributes to include in the XML file. Always included are:
    • serial_number

    • id

    • element

to_xyz(filename: str)[source]#

Write the molecule to an XYZ file

Parameters:

filename (str) – Path to the XYZ file

trans(*bond: Atom | tuple | Bond)[source]#

Rotate the molecule such that the atoms in the bond are in a trans configuration.

Parameters:

*bond (Atom or tuple or Bond) – The bond to rotate

transpose(vector: ndarray, angle: float, axis: ndarray, center: ndarray = None, angle_is_degrees: bool = True)[source]#

Transpose the molecule in 3D space

Parameters:
  • vector (np.ndarray) – The vector to move the molecule by

  • angle (float) – The angle to rotate by

  • axis (np.ndarray) – The axis to rotate around. This must be a unit vector.

  • center (np.ndarray) – The center of the rotation

  • angle_is_degrees (bool) – Whether the angle is given in degrees (default) or radians
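The combination of a rotation and a translation can be written out with Rodrigues' rotation formula. The sketch below is not the library's implementation: `rotate_then_translate` is a hypothetical helper, applying the rotation before the translation is an assumption, and so is using the origin when no center is given.

```python
import math

def rotate_then_translate(coords, vector, angle, axis, center=None, angle_is_degrees=True):
    """Rotate points about `axis` through `center`, then shift them by `vector`."""
    theta = math.radians(angle) if angle_is_degrees else angle
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Assumption of this sketch: rotate about the origin if no center is given
    cx, cy, cz = center if center is not None else (0.0, 0.0, 0.0)
    kx, ky, kz = axis  # must be a unit vector
    moved = []
    for x, y, z in coords:
        px, py, pz = x - cx, y - cy, z - cz
        dot = kx * px + ky * py + kz * pz
        crx, cry, crz = ky * pz - kz * py, kz * px - kx * pz, kx * py - ky * px
        # Rodrigues: p*cos(t) + (k x p)*sin(t) + k*(k . p)*(1 - cos(t))
        rx = px * cos_t + crx * sin_t + kx * dot * (1 - cos_t)
        ry = py * cos_t + cry * sin_t + ky * dot * (1 - cos_t)
        rz = pz * cos_t + crz * sin_t + kz * dot * (1 - cos_t)
        moved.append((rx + cx + vector[0], ry + cy + vector[1], rz + cz + vector[2]))
    return moved

# Rotate (1, 0, 0) by 90 degrees about z, then lift it by 2 along z
point = rotate_then_translate(
    [(1.0, 0.0, 0.0)], vector=(0.0, 0.0, 2.0), angle=90, axis=(0.0, 0.0, 1.0)
)[0]
```

Up to floating-point rounding the point ends up at (0, 1, 2). Passing `angle=math.pi / 2` with `angle_is_degrees=False` gives the same result.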

triple(atom1, atom2, adjust_hydrogens: bool = False)[source]#

Set a triple bond between two atoms

Parameters:
  • atom1 – The first atom

  • atom2 – The second atom

  • adjust_hydrogens (bool) – Whether to adjust the number of hydrogens on the atoms based on the bond order

Unlink atoms from the structure. This will remove the atoms from the index of the maintained structure but it will not adjust the atoms’ own parent references. This is useful if you want to have atoms be accessible from multiple Molecule objects.

Parameters:

atoms (base_classes.Atom) – The atoms to unlink

Unlink chains from the structure. This will remove the chains from the index of the maintained structure but it will not adjust the chains’ own parent references. This is useful if you want to have chains be accessible from multiple Molecule objects.

Parameters:

chains (base_classes.Chain) – The chains to unlink

Unlink residues from the structure. This will remove the residues from the index of the maintained structure but it will not adjust the residues’ own parent references. This is useful if you want to have residues be accessible from multiple Molecule objects.

Parameters:

residues (base_classes.Residue) – The residues to unlink

unlock_all()[source]#

Unlock all bonds in the structure

unlock_bond(atom1: int | str | tuple | Atom, atom2: int | str | tuple | Atom)[source]#

Unlock a bond between two atoms

Parameters:
  • atom1 – The first atom of the bond to unlock, which can either be provided directly (biopython object) or by its serial number, full_id, or id.

  • atom2 – The second atom of the bond to unlock, which can either be provided directly (biopython object) or by its serial number, full_id, or id.

update_atom_graph()[source]#

Generate a new up-to-date AtomGraph after manual changes have been made to the Molecule’s underlying biopython structure.

buildamol.core.entity.infer_search_param(input)[source]#

Infer the search parameter ‘by’ for the get_atoms/residues etc. methods

buildamol.core.entity.should_invert(bond, direct_connecting_atoms)[source]#

Check if a given bond should be inverted during bond direction

Parameters:
  • bond (tuple) – A tuple of two atoms that are bonded

  • direct_connecting_atoms (set) – A set of atoms that directly participate in bonds connecting different residues

Returns:

Whether the bond should be inverted

Return type:

bool

Base Classes#

The base_classes are derivatives of the original Biopython classes, but with the change that they use a UUID4 as their identifier (full_id) instead of a hierarchical tuple. This makes each object unique and allows for easy comparison where a == b is akin to a is b. Consequently, the __hash__ method is overridden to use the UUID4 as the hash.

Warning

Each class has its own copy method that returns a deep copy of the object with a new UUID4. So a.copy() == a is False, while a standard deepcopy(a) == a is True since the UUID4 will not have been updated automatically.
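The difference between the classes' own copy method and a standard deepcopy can be reproduced with a toy class. This is a minimal sketch using only the standard library; the class `Identified` and its private `_uuid` attribute are hypothetical stand-ins, not the actual BuildAMol classes.

```python
import copy
import uuid

class Identified:
    """Toy stand-in for a base class whose identity is a UUID4."""

    def __init__(self, name):
        self.name = name
        self._uuid = uuid.uuid4()

    def __eq__(self, other):
        # Equality is decided by the UUID4 alone
        return isinstance(other, Identified) and self._uuid == other._uuid

    def __hash__(self):
        return hash(self._uuid)

    def copy(self):
        """Deep copy that also receives a fresh UUID4."""
        new = copy.deepcopy(self)
        new._uuid = uuid.uuid4()
        return new

a = Identified("CA")
own_copy = a.copy()            # new UUID4: compares unequal to a
plain_copy = copy.deepcopy(a)  # UUID4 carried over: compares equal to a
```

The deep-copied object is a distinct Python object that still compares and hashes as the original, so it collapses with the original in a set or as a dictionary key.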

Converting to and from biopython#

Each BuildAMol class can be generated from a biopython class using the from_biopython class method, and each BuildAMol class has a to_biopython method that returns the pure-biopython equivalent. Note, however, that for most purposes the BuildAMol classes should work fine as drop-in replacements for the original biopython classes.

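import numpy as np
import Bio.PDB as bio
from buildamol.base_classes import Atom

# Biopython's Atom class lives in the Bio.PDB.Atom module and takes
# name, coord, bfactor, occupancy, altloc, fullname, serial_number, element
bio_atom = bio.Atom.Atom("CA", np.zeros(3), 0.0, 1.0, " ", " CA ", 1, "C")
atom = Atom.from_biopython(bio_atom)

assert not (atom == bio_atom)  # not equal, since atom uses a UUID4 as its identifier
assert atom.to_biopython() == bio_atom  # the pure-biopython equivalent compares equal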

The conversion from and to biopython works hierarchically, so if an entire biopython structure is converted to BuildAMol then all atoms, residues, chains and models will be converted to their BuildAMol equivalents.

import Bio.PDB as bio
from buildamol.base_classes import Structure

bio_structure = bio.PDBParser().get_structure("test", "test.pdb")
structure = Structure.from_biopython(bio_structure)

atoms = list(structure.get_atoms())
bio_atoms = list(bio_structure.get_atoms())
assert len(atoms) == len(bio_atoms) # True
class buildamol.base_classes.Atom(id: str, coord: ndarray, serial_number: int = 1, bfactor: float = 0.0, occupancy: float = 1.0, fullname: str = None, element: str = None, altloc=' ', pqr_charge=None, radius=None)[source]#

Bases: ID, Atom

An Atom object that inherits from Biopython’s Atom class.

Parameters:
  • id (str) – The atom identifier

  • coord (ndarray) – The atom coordinates

  • serial_number (int, optional) – The atom serial number. The default is 1.

  • bfactor (float, optional) – The atom bfactor. The default is 0.0.

  • occupancy (float, optional) – The atom occupancy. The default is 1.0.

  • fullname (str, optional) – The atom fullname. The default is None, in which case the id is used again.

  • element (str, optional) – The atom element. The default is None, in which case it is inferred based on the id.

  • altloc (str, optional) – The atom altloc. The default is “ “.

  • pqr_charge (float, optional) – The atom pqr_charge. The default is None.

  • radius (float, optional) – The atom radius. The default is None.

altloc#
anisou_array#
property atomic_number#

The atomic number of the atom’s element.

bfactor#
property charge#

The atom charge.

coord#
disordered_flag#
element#
equals(other, include_coord: bool = False) bool[source]#

Check if the atom is equal to another atom. This will return True if the two atoms match and have the same parent serial number.

classmethod from_biopython(atom) Atom[source]#

Convert a Biopython atom to an Atom object

Parameters:

atom – The Biopython atom

Returns:

The Atom object

Return type:

Atom

classmethod from_element(element: str, **kwargs)[source]#

Create a blank atom with a given element and coordinates.

Parameters:
  • element (str) – The atom element.

  • **kwargs – Additional keyword arguments to pass to the new Atom initializer.

property full_id#

A self-adjusting full_id for a Biopython Atom

fullname#
get_axial_hydrogen()[source]#

Get the axial hydrogen neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The axial hydrogen, if it exists, None otherwise

Return type:

Atom

get_axial_neighbor()[source]#

Get the axial neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The axial neighbor, if it exists, None otherwise

Return type:

Atom

get_bonds() list[source]#

Get a list of all bonds that this atom is part of.

get_equatorial_hydrogen()[source]#

Get the equatorial hydrogen neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The equatorial hydrogen, if it exists, None otherwise

Return type:

Atom

get_equatorial_neighbor()[source]#

Get the equatorial neighbor of an atom, if the atom is in a ring structure.

Parameters:

atom – The atom

Returns:

The equatorial neighbor, if it exists, None otherwise

Return type:

Atom

get_hydrogens() set[source]#

Get all hydrogen neighbors of an atom.

Parameters:

atom – The atom

Returns:

A set of hydrogen neighbors, if they exist, an empty set otherwise

Return type:

set

get_left_hydrogen()[source]#

Get the “left-protruding” hydrogen neighbor of an atom with two hydrogens and two non-hydrogen neighbors.

Parameters:

atom – The atom

Returns:

The left hydrogen, if it exists, None otherwise

Return type:

Atom

Example

In a molecule:

```
     H_B
      |
CH3 – C – CH2 – OH
      |
     H_A
```

We want to get the left and right hydrogens of the central C atom (labeled only C). Using part of the logic behind R/S nomenclature for chiral centers, we prioritize the non-H neighbors and then rotate the molecule such that the highest order non-H neighbor points toward the user and the other non-H neighbor points away. The left and right hydrogens are then determined based on their orientation in this view.

In this case, the left hydrogen is H_A and the right hydrogen is H_B.

get_neighbors(n: int = 1, mode: str = 'upto', filter: callable = None) set[source]#

Get the neighboring atoms of this atom within n bonds.

Parameters:
  • n (int) – The number of bonds to search for neighbors.

  • mode (str, optional) – The mode to use for searching for neighbors. The default is “upto”, which will return all neighbors within n bonds. Other options are “at” which will return only neighbors that are exactly n bonds away.

  • filter (callable, optional) – A function that takes an atom as input and returns True if the atom should be included in the output. The default is None.

Returns:

A set of neighboring atoms.

Return type:

set
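The two search modes can be illustrated with a breadth-first search over a small bond graph. This is a sketch on a plain adjacency dictionary, not the library's code; `neighbors_within` is a hypothetical helper whose parameters mirror the documented ones.

```python
def neighbors_within(graph, start, n=1, mode="upto", filter=None):
    """Breadth-first search over a bond graph given as {atom: set_of_bonded_atoms}."""
    distance = {start: 0}
    frontier = [start]
    for depth in range(1, n + 1):
        next_frontier = []
        for atom in frontier:
            for other in graph[atom]:
                if other not in distance:
                    distance[other] = depth  # first visit gives the bond distance
                    next_frontier.append(other)
        frontier = next_frontier
    if mode == "upto":
        found = {a for a, d in distance.items() if 1 <= d <= n}
    elif mode == "at":
        found = {a for a, d in distance.items() if d == n}
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    if filter is not None:
        found = {a for a in found if filter(a)}
    return found

# Chain of four atoms: C1 - C2 - O1 - HO
graph = {"C1": {"C2"}, "C2": {"C1", "O1"}, "O1": {"C2", "HO"}, "HO": {"O1"}}
upto_two = neighbors_within(graph, "C1", n=2)
at_two = neighbors_within(graph, "C1", n=2, mode="at")
hydrogens = neighbors_within(graph, "C1", n=3, filter=lambda a: a.startswith("H"))
```

With n=2, "upto" collects C2 and O1, while "at" keeps only O1, the atom exactly two bonds away.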

get_right_hydrogen()[source]#

Get the “right-protruding” hydrogen neighbor of an atom with two hydrogens and two non-hydrogen neighbors.

Parameters:

atom – The atom

Returns:

The right hydrogen, if it exists, None otherwise

Return type:

Atom

Example

In a molecule:

```
     H_B
      |
CH3 – C – CH2 – OH
      |
     H_A
```

We want to get the left and right hydrogens of the central C atom (labeled only C). Using part of the logic behind R/S nomenclature for chiral centers, we prioritize the non-H neighbors and then rotate the molecule such that the highest order non-H neighbor points toward the user and the other non-H neighbor points away. The left and right hydrogens are then determined based on their orientation in this view.

In this case, the left hydrogen is H_A and the right hydrogen is H_B.

id#
level#
mass#
matches(other, include_id: bool = True, include_coord: bool = False) bool[source]#

Check if the atom matches another atom. This will return True if the two atoms have the same element, id, parent-residue name, and altloc.

property molecule#
move(vector)[source]#

Move the atom by a vector.

Parameters:

vector (ndarray) – The vector to move the atom by.

property name#

Synonym for id.

classmethod new(element_or_id: str, coord: ndarray = None, generate_id: bool = True, **kwargs) Atom[source]#

Create a blank atom with a given element and coordinates.

Parameters:
  • element_or_id (str) –

    The atom element. If the element is not found in the periodic table, it will be used as the atom id. To ensure the correct element is assigned to the atom, either pass it directly as a keyword argument or, to let BuildAMol infer the element correctly, use one of the following patterns for your atom id:

    • <element+><number> -> C1, O2, CA2 (calcium id=CA2)

    • <number><element+> -> 1C, 2O, 1CA (calcium id=1CA)

    • <space><element><string|number+> -> “ CA” (Carbon id=CA), “ ND2 “ (Nitrogen id=ND2), “ OXT” (Oxygen id=OXT)

    • <element+><space+> -> “CA “ (Calcium id=CA), “FE “ (Iron id=FE)

    • <element+>_<string|number+> -> “CA_” (Calcium id=CA), “C_A” (Carbon id=CA)

    (the + indicates multiple characters)

  • coord (ndarray, optional) – The atom coordinates. The default is None.

  • generate_id (bool, optional) – Whether to automatically generate a new id for the atom to avoid identically named atoms. The default is True.

  • **kwargs – Additional keyword arguments to pass to the Atom initializer

Returns:

The blank atom.

Return type:

Atom
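The id patterns listed for element_or_id can be mimicked with a few string rules. The sketch below is a hypothetical re-implementation that reproduces the documented examples; it is not BuildAMol's parser, the `ELEMENTS` set is a small excerpt chosen for illustration, and elements are returned in upper case as written in the labels.

```python
import re

# Small excerpt of the periodic table, enough for the examples below
ELEMENTS = {"H", "C", "N", "O", "S", "CA", "FE"}

def infer_element_and_id(text):
    """Return (element, id) for an atom label, following the listed patterns."""
    if text.startswith(" "):
        # <space><element><string|number+>: one-letter element after the space
        stripped = text.strip()
        return stripped[0], stripped
    if "_" in text:
        # <element+>_<string|number+>: the element is everything before the underscore
        element, _, rest = text.partition("_")
        return element, element + rest
    core = text.strip()                # <element+><space+>: trailing spaces dropped
    letters = re.sub(r"\d", "", core)  # <element+><number> or <number><element+>
    if letters in ELEMENTS:
        return letters, core
    return None, core                  # not recognised: the label is only an id

examples = {label: infer_element_and_id(label)
            for label in ["C1", "CA2", "1CA", " CA", " ND2 ", "CA ", "CA_", "C_A"]}
```

The leading space is what separates an alpha carbon (" CA") from calcium ("CA "), which is why passing the element explicitly as a keyword argument remains the safest option.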

occupancy#
parent: Residue | None#
pqr_charge#
radius#
serial_number#
set_element(element, adjust_id: bool = True)[source]#

Set the atom element.

Parameters:
  • element (str) – The element to set.

  • adjust_id (bool, optional) – Whether to adjust the atom id to the new element. The default is True.

set_id(id)[source]#

Set the atom identifier.

Parameters:

id (str) – The identifier to set.

sigatm_array#
siguij_array#
to_biopython()[source]#

Convert the Atom object to a Biopython atom

Returns:

The Biopython atom

Return type:

Atom

property weight#

The atom mass (synonym for mass).

xtra: dict#
class buildamol.base_classes.Bond(*atoms)[source]#

Bases: object

A class representing a bond between two atoms.

atom1#

The first atom in the bond.

Type:

Atom

atom2#

The second atom in the bond.

Type:

Atom

cis()[source]#

Make the bond a cis bond.

Returns:

The bond itself.

Return type:

Bond

compute_length() float[source]#

Compute the bond length.

Returns:

The bond length.

Return type:

float

double()[source]#

Make the bond a double bond.

get_other_atom(atom: Atom) Atom[source]#

Get the other atom in the bond.

Parameters:

atom (Atom) – The atom to get the other atom for.

Returns:

The other atom in the bond.

Return type:

Atom

Raises:

ValueError – If the atom is not in the bond.

invert()[source]#

Invert the bond, i.e. swap the two atoms.

is_cis() bool[source]#

Check if the bond is a cis bond.

Returns:

True if the bond is a cis bond, False otherwise.

Return type:

bool

is_double() bool[source]#

Check if the bond is a double bond.

Returns:

True if the bond is a double bond, False otherwise.

Return type:

bool

is_single() bool[source]#

Check if the bond is a single bond.

Returns:

True if the bond is a single bond, False otherwise.

Return type:

bool

is_trans() bool[source]#

Check if the bond is a trans bond.

Returns:

True if the bond is a trans bond, False otherwise.

Return type:

bool

is_triple() bool[source]#

Check if the bond is a triple bond.

Returns:

True if the bond is a triple bond, False otherwise.

Return type:

bool

property length: float#
order#
single()[source]#

Make the bond a single bond.

to_list() list[source]#

Convert the bond to a list of atom1, atom2, bond_order.

Returns:

The bond as a list.

Return type:

list

to_tuple() tuple[source]#

Convert the bond to a tuple of atom1, atom2, bond_order.

Returns:

The bond as a tuple.

Return type:

tuple

to_vector() ndarray[source]#

Convert the bond to a vector.

Returns:

The bond as a vector from atom1 to atom2.

Return type:

ndarray
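Since the vector is defined from atom1 to atom2, inverting a bond flips the sign of its vector while the length stays the same. The toy class below illustrates this with coordinate tuples in place of Atom objects; `SimpleBond` is a hypothetical stand-in, not the library's Bond class.

```python
import math
from dataclasses import dataclass

@dataclass
class SimpleBond:
    """Toy bond between two coordinate tuples."""
    atom1: tuple
    atom2: tuple
    order: int = 1

    def to_vector(self):
        # Vector pointing from atom1 to atom2
        return tuple(b - a for a, b in zip(self.atom1, self.atom2))

    def compute_length(self):
        return math.sqrt(sum(c * c for c in self.to_vector()))

    def invert(self):
        # Swap the two atoms in place
        self.atom1, self.atom2 = self.atom2, self.atom1

bond = SimpleBond((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
forward = bond.to_vector()
length = bond.compute_length()
bond.invert()
backward = bond.to_vector()
```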

trans()[source]#

Make the bond a trans bond.

Returns:

The bond itself.

Return type:

Bond

triple()[source]#

Make the bond a triple bond.

class buildamol.base_classes.Chain(id)[source]#

Bases: ID, Chain

A Chain object that inherits from Biopython’s Chain class.

Parameters:

id (str) – The chain identifier

add(residue)[source]#

Add a child to the Entity.

property atoms#
child_dict: dict[Any, _Child]#
child_list: list[_Child]#
copy()[source]#

Return a deep copy of the chain with a new UUID4.

Returns:

The copied chain.

Return type:

Chain

count_residues() int[source]#

Count the number of residues in the chain.

equals(other) bool[source]#

Check if the chain is equal to another chain. This will check if the two chains have the same id, the same parent-model id, and have equal residues.

classmethod from_biopython(chain) Chain[source]#

Convert a BioPython Chain object to a Chain object.

Parameters:

chain (BioPython Chain object) – The chain to convert.

Returns:

The converted chain.

Return type:

Chain

property full_id#

A self-adjusting full_id for a Biopython Chain

get_coords() ndarray[source]#

Get the coordinates of all atoms in the chain.

get_residue(residue: str | int) Residue[source]#

Get a residue by its name or serial number.

Note

If there are multiple residues with the same name, the first one will be returned.

Parameters:

residue (str or int) – The residue name or serial number.

Returns:

The residue.

Return type:

Residue

get_residues(*residues: str | int) List[Residue][source]#

Get all residues in the chain.

Parameters:

residues (str or int, optional) – The residue name or serial number to filter by.

Returns:

The list of residues. If no residues argument is specified the default generator is returned.

Return type:

List[Residue]

internal_coord#
level: str#

Softlink a residue into this chain’s child_list without touching the residue’s own parent references.

matches(other) bool[source]#

Check if the chain matches another chain. This will return True if the two chains have matching residues.

property molecule#
move(vector)[source]#

Move the chain by a vector.

Parameters:

vector (ndarray) – The vector to move the chain by.

property name#

Synonym for id.

classmethod new(id: str) Chain[source]#

Create a blank chain with a given id.

Parameters:

id (str, optional) – The chain identifier. The default is None.

Returns:

The blank chain.

Return type:

Chain

parent: _Parent | None#
property residues#
to_biopython() Chain[source]#

Convert a Chain object to a pure BioPython Chain object.

Parameters:

with_children (bool, optional) – Whether to convert the residues of the chain as well. The default is True.

Returns:

The converted chain.

Return type:

bio.Chain.Chain

Unlink a residue from this chain’s child_list without touching the residue’s own parent references.

xtra#
class buildamol.base_classes.Model(id)[source]#

Bases: Model, ID

A Model object that inherits from Biopython’s Model class.

Parameters:

id (int or str) – The model identifier

add(chain)[source]#

Add a child to the Entity.

property chains#

Get the chains in the model.

child_dict: dict[Any, _Child]#
child_list: list[_Child]#
copy()[source]#

Return a deep copy of the model with a new UUID4.

Returns:

The copied model.

Return type:

Model

count_chains() int[source]#

Count the number of chains in the model.

equals(other) bool[source]#

Check if the model is equal to another model. This will return True if the two models have the same id, same parent-structure id, and have matching chains.

classmethod from_biopython(model)[source]#

Convert a BioPython Model object to a Model object.

Parameters:

model (BioPython Model object) – The model to convert.

Returns:

The converted model.

Return type:

Model

property full_id#

A self-adjusting full_id for a Biopython Model

get_chain(chain: str | int) Chain[source]#

Get a chain by its id.

Parameters:

chain (str or int) – The chain id.

Returns:

The chain.

Return type:

Chain

get_chains(*chains: str | int) List[Chain][source]#

Get all chains in the model.

Parameters:

chains (str or int, optional) – The chain id to filter by.

Returns:

The list of chains. If no chains argument is specified the default generator is returned.

Return type:

List[Chain]

get_coords() ndarray[source]#

Get the coordinates of all atoms in the model.

level: str#

Softlink a chain into this model’s child_list without touching the chain’s own parent references.

matches(other) bool[source]#

Check if the model matches another model. This will return True if the two models have matching chains.

property molecule#
move(vector)[source]#

Move the model by a vector.

Parameters:

vector (ndarray) – The vector to move the model by.

classmethod new(id: int = None) Model[source]#

Create a blank model with a given id.

Parameters:

id (int) – The model identifier.

Returns:

The blank model.

Return type:

Model

parent: _Parent | None#
property serial_num#
property serial_number#
to_biopython()[source]#

Convert a Model object to a pure BioPython Model object.

Returns:

The converted model.

Return type:

bio.Model.Model

Unlink a chain from this model’s child_list without touching the chain’s own parent references.

xtra#
class buildamol.base_classes.Residue(resname, segid=' ', icode=1)[source]#

Bases: ID, Residue

A Residue object that inherits from Biopython’s Residue class.

Parameters:
  • resname (str) – The residue name

  • segid (str) – The residue segid.

  • icode (int) – The residue icode. This is the residue serial number.

add(atom)[source]#

Add an Atom object.

Checks for adding duplicate atoms, and raises a PDBConstructionException if so.

property atoms#
child_dict: dict[Any, _Child]#
child_list: list[_Child]#
property coord#
copy() Residue[source]#

Return a deep copy of the residue with a new UUID4.

Returns:

The copied residue.

Return type:

Residue

count_atoms() int[source]#

Count the number of atoms in the residue.

disordered#
equals(other, include_serial: bool = False) bool[source]#

Check if the residue is equal to another residue. This will check if the two residues are in the same parent and if all atoms are matching.

classmethod from_biopython(residue) Residue[source]#

Convert a BioPython Residue object to a Residue object.

Parameters:

residue (BioPython Residue object) – The residue to convert.

Returns:

The converted residue

Return type:

Residue

property full_id#

A self-adjusting full_id for a Biopython Residue

get_atom(atom: str | int) Atom[source]#

Get an atom by its name or serial number.

Parameters:

atom (str or int) – The atom name or serial number.

Returns:

The atom.

Return type:

Atom

get_atoms(*atoms: str | int) List[Atom][source]#

Get all atoms in the residue.

Parameters:

atoms (str or int, optional) – The atom name or serial number to filter by.

Returns:

The list of atoms. If no atoms argument is specified the default generator is returned.

Return type:

List[Atom]

get_bonds(residue_internal: bool = True) list[source]#

Get a list of all bonds with participating atoms that belong to this residue.

Parameters:

residue_internal (bool, optional) – If True, only bonds between two atoms of this residue are returned. If False, bonds to atoms outside of this residue are included as well. The default is True.
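The effect of residue_internal can be sketched as a filter on bond membership. The helper `residue_bonds` below is hypothetical and operates on plain atom labels; it is not the library's implementation.

```python
def residue_bonds(bonds, residue_atoms, residue_internal=True):
    """Select bonds that involve atoms of one residue."""
    members = set(residue_atoms)
    selected = []
    for a, b in bonds:
        inside = (a in members) + (b in members)  # 0, 1 or 2 atoms in the residue
        if inside == 2 or (inside == 1 and not residue_internal):
            selected.append((a, b))
    return selected

# O1 of the first residue is bonded to C1' of a neighbouring residue
all_bonds = [("C1", "O1"), ("O1", "C1'"), ("C1'", "C2'")]
first_residue = ["C1", "O1"]
internal = residue_bonds(all_bonds, first_residue)
with_external = residue_bonds(all_bonds, first_residue, residue_internal=False)
```

The bond between O1 and C1' crosses the residue boundary, so it only shows up when residue_internal is False.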

get_coord() ndarray[source]#

Get the center of mass of the residue.

get_coords() ndarray[source]#

Get the coordinates of all atoms in the residue.

get_neighbors(n: int = 1, mode: str = 'upto', filter: callable = None) set[source]#

Get the neighboring residues of this residue as they appear in the topology ResidueGraph.

Parameters:
  • n (int) – The number of bonds to search for neighbors.

  • mode (str, optional) – The mode to use for searching for neighbors. The default is “upto”, which will return all neighbors within n bonds. Other options are “at” which will return only neighbors that are exactly n bonds away.

  • filter (callable, optional) – A function that takes a residue as input and returns True if the residue should be included in the output. The default is None.

property id#

Return identifier.

internal_coord#
level: str#

Softlink an atom into this residue’s child_list without touching the atom’s own parent references.

matches(other) bool[source]#

Check if the residue matches another residue. This will return True if the two residues have the same resname, segid, and parent-chain id.

property molecule#
move(vector)[source]#

Move the residue by a vector.

Parameters:

vector (ndarray) – The vector to move the residue by.

property name#

Synonym for resname.

classmethod new(resname: str, segid: str = ' ', icode: int = None) Residue[source]#

Create a blank residue with a given name and segid.

Parameters:
  • resname (str) – The residue name.

  • segid (str, optional) – The residue segid. The default is “ “.

  • icode (int, optional) – The residue icode. The default is None.

Returns:

The blank residue.

Return type:

Residue

parent: _Parent | None#
resname#
segid#
set_coord(value)[source]#

Set the center of mass of the residue.

to_biopython() Residue[source]#

Convert a Residue object to a pure BioPython Residue object.

Returns:

The converted residue.

Return type:

bio.Residue.Residue

Unlink an atom from this residue’s child_list without touching the atom’s own parent references.

xtra#
class buildamol.base_classes.Structure(id)[source]#

Bases: ID, Structure

A Structure object that inherits from Biopython’s Structure class.

Parameters:

id (str) – The structure identifier

add(model)[source]#

Add a child to the Entity.

child_dict: dict[Any, _Child]#
child_list: list[_Child]#
copy()[source]#

Return a deep copy of the structure with a new UUID4.

Returns:

The copied structure.

Return type:

Structure

equals(other) bool[source]#

Check if the structure is equal to another structure. This will return True if the two structures have the same id and have equal models.

classmethod from_biopython(structure: Structure) Structure[source]#

Convert a BioPython Structure object to a Structure object.

Parameters:

structure (BioPython Structure object) – The structure to convert.

Returns:

The converted structure.

Return type:

Structure

property full_id#
get_coords() ndarray[source]#

Get the coordinates of all atoms in the structure.

level: str#

Softlink a model into this structure’s child_list without touching the model’s own parent references.

matches(other) bool[source]#

Check if the structure matches another structure. This will return True if the two structures have the same id.

property molecule#
move(vector)[source]#

Move the structure by a vector.

Parameters:

vector (ndarray) – The vector to move the structure by.

classmethod new(id: str) Structure[source]#

Create a blank structure with a given id.

Parameters:

id (str) – The structure identifier.

Returns:

The blank structure.

Return type:

Structure

parent: _Parent | None#
to_biopython() Structure[source]#

Convert a Structure object to a pure BioPython Structure object.

Returns:

The converted structure.

Return type:

bio.Structure.Structure

Unlink a model from this structure’s child_list without touching the model’s own parent references.

xtra#