"""
Linkage definitions
===================
A linkage is a connection between two _molecules_. At its core each linkage simply defines two atoms that should be connected,
and what atoms to remove in the process. It is a "pseudo" chemical reaction, so to speak.
Building on the CHARMM force field, biobuild distinguishes two kinds of linkages: patches and recipies.
A **patch** is a linkage that can be applied purely geometrically and does not require numeric optimization. This is because a
patch includes geometric data in form of _internal coordinates_ of the atoms in the immediate vicinity of the newly formed bond.
Using this data, biobuild is able to attach molecule to one another through simple matrix transformations. Conesquently, patches
are the most efficient way to connect molecules and are preferable to **recipes** - the other type of linkage.
A **recipe** on the other hand is a linkage that requires numeric optimization. This is because a recipe does not include any
geometric data, but only the atoms that should be connected. The numeric optimization is then used to find the optimal (or at least suitable) conformation.
This is useful for most users who wish to define their own linkage types, but who will likely not wish to painstakingly define the detailed geometry of angles and dihedrals of the atom neighborhood.
The distinction between patches and recipies is purely nominal, as both are represented by the `Linkage` class. However, there are functional
wrappers available to create either a patch or recipe, respectively, which require different arguments (to make sure they are not forgotten and to make the code more readable).
.. code-block:: python
from biobuild import recipe
# Create a custom recipe
my_link = recipe(
atom1 = "C1",
atom2 = "O4",
delete_in_target = ["O1", "HO1"],
delete_in_source = ["HO4"],
id = "my_link"
)
Pre-defined patches
-------------------
biobuild comes with a number of pre-defined patches from the CHARMM force field. These can be accessed through the `resources` module:
.. code-block:: python
from biobuild import resources
# Get a list of all pre-defined patches
patches = resources.available_patches()
# Check for a specific patch
resources.has_patch("some_patch")
# Get a specific patch
my_patch = resources.get_patch("some_patch")
A custom linkage can be added to the list of pre-defined patches by using the `add_patch` function:
.. code-block:: python
# add the above defined my_link to the list of pre-defined patches
resources.add_patch(my_link)
.. note::
Despite the use of "patch" in the function nomenclature, there is no difference between a patch and a recipe in terms of
how they are used. Patches and Recipies are represented by the same data class and thus behave identically.
Hence, there are also functional wrappers with the "linkage" available
that can be used instead (if a user feels more comfortable with this) - they perform the same function.
.. code-block:: python
resources.add_linkage(my_link)
# performs the same as
resources.add_patch(my_link)
# check for a specific linkage
resources.has_linkage("my_link")
# performs the same as
resources.has_patch("my_link")
# etc.
Pre-defined patches can be accessed directly by their `id` and need not be obtained first through the `resources` module. They can be directly passed
to the ``Molecule``'s ``attach`` method or any other function that requires a linkage:
.. code-block:: python
import biobuild as bb
mol1 = bb.read_pdb("my_molecule.pdb")
mol2 = bb.read_pdb("my_other_molecule.pdb")
# Attach mol2 to mol1 using the pre-defined patch "some_patch"
mol1.attach(mol2, "some_patch")
# works the same as doing
some_patch = bb.get_patch("some_patch")
mol1.attach(mol2, some_patch)
"""
import biobuild.utils as utils
import biobuild.structural.neighbors as neighbors
__all__ = ["Linkage", "recipe", "patch", "linkage"]
[docs]
def patch(
atom1,
atom2,
delete_in_target,
delete_in_source,
internal_coordinates: dict,
id: str = None,
description: str = None,
) -> "Linkage":
"""
Make a new `Linkage` instance that describes a "patch" between two molecules.
A patch is a linkage that can be applied purely geometrically and does not require numeric optimization.
As such, it requires the internal coordinates of the atoms in the immediate vicinity of the newly formed bond.
Parameters
----------
atom1 : str or tuple of str
The atom in the first (target) molecule to connect.
atom2 : str or tuple of str
The atom in the second (source) molecule to connect.
delete_in_target : str or tuple of str
The atom(s) in the first molecule to delete.
delete_in_source : str or tuple of str
The atom(s) in the second molecule to delete.
internal_coordinates : dict, optional
The internal coordinates of the atoms in the immediate vicinity of the newly formed bond.
If provided, the link can be applied purely geometrically and will not require numeric optimization.
If provided, this must be a dictionary where keys are tuples of four atoms ids and values tuples containing (in order):
- the bond length between the first and second atom (first and third in case of an improper)
- the bond length between the third and fourth atom
- the bond angle between the first, second and third atom
- the bond angle between the second, third and fourth atom
- the dihedral angle between the first, second, third and fourth atom
- True if the internal coordinate is improper, False otherwise
id : str, optional
The id of the linkage.
description : str, optional
A description of the linkage.
Returns
-------
Linkage
The new linkage.
"""
return linkage(
atom1,
atom2,
delete_in_target=delete_in_target,
delete_in_source=delete_in_source,
internal_coordinates=internal_coordinates,
id=id,
description=description,
)
[docs]
def recipe(
atom1,
atom2,
delete_in_target=None,
delete_in_source=None,
id: str = None,
description: str = None,
) -> "Linkage":
"""
Make a new `Linkage` instance that describes a "recipe" to connect two molecules.
A recipe is a linkage that can be applied numerically and requires numeric optimization as it does not
have the internal coordinates of the atoms in the immediate vicinity of the newly formed bond.
Parameters
----------
atom1 : str or tuple of str
The atom in the first (target) molecule to connect.
atom2 : str or tuple of str
The atom in the second (source) molecule to connect.
delete_in_target : str or tuple of str
The atom(s) in the first molecule to delete.
If not provided, any Hydrogen atom bound to atom1 will be deleted.
delete_in_source : str or tuple of str
The atom(s) in the second molecule to delete.
If not provided, any Hydrogen atom bound to atom2 will be deleted.
id : str, optional
The id of the linkage.
description : str, optional
A description of the linkage.
Returns
-------
Linkage
The new linkage.
"""
return linkage(
atom1,
atom2,
delete_in_target=delete_in_target,
delete_in_source=delete_in_source,
id=id,
description=description,
)
[docs]
def linkage(
atom1,
atom2,
delete_in_target=None,
delete_in_source=None,
internal_coordinates: dict = None,
id: str = None,
description: str = None,
) -> "Linkage":
"""
Make a new `Linkage` instance to connect two molecules together.
Parameters
----------
atom1 : str or tuple of str
The atom in the first (target) molecule to connect.
atom2 : str or tuple of str
The atom in the second (source) molecule to connect.
delete_in_target : str or tuple of str, optional
The atom(s) in the first molecule to delete.
If not provided, any Hydrogen atom bound to atom1 will be deleted.
delete_in_source : str or tuple of str, optional
The atom(s) in the second molecule to delete.
If not provided, any Hydrogen atom bound to atom2 will be deleted.
internal_coordinates : dict, optional
The internal coordinates of the atoms in the immediate vicinity of the newly formed bond.
If provided, the link can be applied purely geometrically and will not require numeric optimization.
If provided, this must be a dictionary where keys are tuples of four atoms ids and values tuples containing (in order):
- the bond length between the first and second atom (first and third in case of an improper)
- the bond length between the third and fourth atom
- the bond angle between the first, second and third atom
- the bond angle between the second, third and fourth atom
- the dihedral angle between the first, second, third and fourth atom
- True if the internal coordinate is improper, False otherwise
id : str, optional
The ID of the linkage.
description : str, optional
A description of the linkage.
Returns
-------
Linkage
The new linkage instance.
"""
# make a new linkage
new_linkage = Linkage(id=id, description=description)
# add the bond
new_linkage.add_bond(utils.abstract.AbstractBond(atom1, atom2))
# add the atoms to delete
if delete_in_target is not None:
for i in delete_in_target:
new_linkage.add_delete(i, "target")
if delete_in_source is not None:
for i in delete_in_source:
new_linkage.add_delete(i, "source")
# add the internal coordinates
if internal_coordinates is not None:
if not isinstance(internal_coordinates, dict):
raise TypeError(
"The internal coordinates must be provided as a dictionary."
)
for ic in _dict_to_ics(internal_coordinates):
new_linkage.add_internal_coordinates(ic)
# return the linkage
return new_linkage
[docs]
class Linkage(utils.abstract.AbstractEntity_with_IC):
"""
Using the `Linkage` class, a template reaction instruction is stored for attaching molecules to one another.
Parameters
----------
id : str, optional
The ID of the linkage.
description : str, optional
An additional description of the linkage.
Attributes
----------
id : str
The ID of the linkage.
bond : tuple of str
The bond to form between the two molecules.
internal_coordinates : list of InternalCoordinate
The internal coordinates of the atoms in the immediate vicinity of the newly formed bond.
deletes : tuple of list of str
The atom IDs to delete
in a tuple of lists where the first list
contains the atom IDs to delete from the
first structure (target) and the second one from the second structure (source)
atoms : list of str
The atom IDs of the atoms in the linkage.
"""
def __init__(self, id=None, description: str = None) -> None:
super().__init__(id)
self._delete_ids = []
self.description = description
@property
def atom1(self) -> str:
"""
The atom ID of the first atom in the bond.
"""
return self.bond[0]
@atom1.setter
def atom1(self, value: str) -> None:
if hasattr(value, "id"):
value = value.id
self.bond = (value, self.bond[1])
@property
def atom2(self) -> str:
"""
The atom ID of the second atom in the bond.
"""
return self.bond[1]
@atom2.setter
def atom2(self, value: str) -> None:
if hasattr(value, "id"):
value = value.id
self.bond = (self.bond[0], value)
@property
def bond(self) -> tuple:
"""
The bond to form between the two molecules.
"""
if len(self.bonds) == 0:
return None
return self.bonds[0]
@bond.setter
def bond(self, value: tuple) -> None:
self.bonds = [value]
@property
def _ref_atoms(self) -> tuple:
"""
Reference atoms with 1,2 prefix for the patcher
"""
if not self.bond:
return None, None
a = f"1{self.bond[0]}" if not self.bond[0].startswith("1") else self.bond[0]
b = f"2{self.bond[1]}" if not self.bond[1].startswith("2") else self.bond[1]
return a, b
@property
def _stitch_ref_atoms(self) -> tuple:
"""
Reference atoms without 1,2 prefix for the stitcher
"""
if not self.bond:
return None, None
a = f"{self.bond[0][1:]}" if self.bond[0].startswith("1") else self.bond[0]
b = f"{self.bond[1][1:]}" if self.bond[1].startswith("2") else self.bond[1]
return a, b
[docs]
@classmethod
def from_json(cls, filename: str):
"""
Make a new `Linkage` instance from a JSON file.
Parameters
----------
filename : str
The JSON filename.
"""
_dict = utils.json.read(filename)
return cls._from_dict(_dict)
@classmethod
def _from_dict(cls, _dict):
"""
Make a new `Linkage` instance from a JSON dictionary.
Parameters
----------
_dict : dict
The JSON dictionary.
"""
new = cls(id=_dict["id"], description=_dict["description"])
new.add_bond(
utils.abstract.AbstractBond(
_dict["bond"]["target"], _dict["bond"]["source"]
),
)
for i in _dict["to_delete"]["target"]:
new.add_delete(i, "target")
for i in _dict["to_delete"]["source"]:
new.add_delete(i, "source")
for i in _dict["ics"]:
new.add_internal_coordinates(
utils.ic.InternalCoordinates._from_dict(i),
)
if len(new.internal_coordinates) != 0:
has_two_residues = False
for i in new.internal_coordinates:
if any([j.startswith("2") for j in i.atoms]):
has_two_residues = True
if not has_two_residues:
raise ValueError(
f"The linkage '{new.id}' contains only internal coordinates for one residue. It must contain internal coordinates spanning both residues!"
)
return new
@property
def deletes(self):
"""
Returns the atom IDs to delete
in a tuple of lists where the first list
contains the atom IDs to delete from the
first structure (target) and the second one from the second structure (source)
"""
deletes = (
[i[1:] for i in self._delete_ids if i[0] == "1"],
[i[1:] for i in self._delete_ids if i[0] == "2"],
)
return deletes
[docs]
def add_delete(self, id, _from: str = None):
"""
Add an atom ID to delete
Parameters
----------
id : str
The atom ID to delete.
_from : str, optional
The structure from which to delete the atom.
Can be either "source" or "target". If not provided,
the structure is inferred from the atom ID, in which case either `1` (target) or `2` (source) must be the first character of the ID.
"""
if _from is None:
if not id[0] in ["1", "2"]:
raise ValueError(
"The atom ID must start with either 1 or 2 to indicate from which structure it is to be deleted."
)
else:
if _from == "source":
if isinstance(id, str):
id = "2" + id
else:
id = ("2", *id)
elif _from == "target":
if isinstance(id, str):
id = "1" + id
else:
id = ("1", *id)
else:
raise ValueError(
"The _from argument must be either 'source' or 'target'."
)
self._delete_ids.append(id)
add_id_to_delete = add_delete
[docs]
def add_internal_coordinates(self, ic):
if isinstance(ic, dict):
ic = utils.ic.InternalCoordinates._from_dict(ic)
elif isinstance(ic, neighbors.Quartet):
ic = utils.ic.InternalCoordinates.from_quartet(ic)
elif isinstance(ic, (tuple, list)) and len(ic) == 11:
ic = utils.ic.InternalCoordinates(*ic)
elif not isinstance(ic, utils.ic.InternalCoordinates):
raise TypeError(
"The internal coordinate must be an instance of the InternalCoordinates class."
)
# now we need to vet that the internal coordinates span both molecules (by spanning different residues)
if isinstance(ic.atom1, str):
_residues = list(set(a[0] for a in ic.atoms))
use_bare_strings = True
else:
_residues = list(set(a.get_parent() for a in ic.atoms))
use_bare_strings = False
if use_bare_strings:
_residues.sort()
else:
_residues.sort(key=lambda x: x.id[1])
_residues_dict = {_residues[0]: "1"}
if len(_residues) == 2:
_residues_dict[_residues[1]] = "2"
if not use_bare_strings:
prefix = (
lambda x: _residues_dict[x.get_parent()] + x.id
if not x.id[0] in ("1", "2")
else x.id
)
ic.atom1 = prefix(ic.atom1)
ic.atom2 = prefix(ic.atom2)
ic.atom3 = prefix(ic.atom3)
ic.atom4 = prefix(ic.atom4)
return super().add_internal_coordinates(ic)
[docs]
def to_json(self, filename: str):
"""
Write the `Linkage` instance to a JSON file.
Parameters
----------
filename : str
The JSON filename.
"""
utils.json.write_linkage(self, filename)
def __getitem__(self, index):
if len(self.bonds) == 0:
raise IndexError("The linkage does not contain a bond.")
return self.bonds[0][index]
def __iter__(self):
if len(self.bonds) == 0:
raise StopIteration
return iter(self.bonds[0])
def __hash__(self) -> int:
return hash(self.id) + hash(tuple(self.bonds))
def _dict_to_ics(_dict):
"""
Convert a dictionary of internal coordinates to a list of `InternalCoordinate` instances.
"""
ics = []
for key, value in _dict.items():
if len(key) == 4 and len(value) == 6:
improper = value[-1]
ic = utils.ic.InternalCoordinates(
*key,
bond_length_12=value[0] if not improper else None,
bond_length_13=value[0] if improper else None,
bond_length_34=value[1],
bond_angle_123=value[2],
bond_angle_234=value[3],
dihedral=value[4],
improper=improper,
)
elif len(key) != 4:
raise ValueError(
"The internal coordinate must be provided as a tuple of four atom IDs."
)
elif len(value) != 6:
raise ValueError(
"The internal coordinate must be provided as a tuple of six values."
)
ics.append(ic)
return ics
if __name__ == "__main__":
link = linkage(
"C1",
"O4",
["H1"],
["HO4"],
internal_coordinates={
("1C1", "1C2", "2O4", "2C4"): [1.1, 1.2, 1.3, 1.4, 1.5, False]
},
)
link.to_json("link.json")
link2 = Linkage.from_json("link.json")
pass