Custom Modifiers
In this tutorial we will cover:
how we can define our own modifier functions like carboxylate
how we can define other modifier functions
Building molecules is a straightforward task in BuildAMol. A “modifier” function is essentially any function that takes a molecule as an argument and returns a modified version of it. There is technically nothing special about this process: any function that performs some action on a molecule is a working modifier. Why this tutorial then? You may have been using the available modifiers such as hydroxylate or carboxylate, which add functional groups to one or more positions in a molecule. This tutorial explains how they work so that you can write your own modifiers more efficiently.
Let’s dive in!
[12]:
import plotly
plotly.offline.init_notebook_mode()
[13]:
import buildamol as bam
Functional Group Modifiers
All the default modifiers share the same architecture. They take a molecule as an argument along with information on where to modify it, obtain a reference molecule for the functional group they want to add, and then call the _modify function, which provides a generic implementation for attaching one molecule (the functional group) to multiple locations of a target molecule.
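To make these three steps concrete, here is a minimal, self-contained sketch in plain Python. It does not use BuildAMol: the dictionary-based molecule and the helpers toy_modify and toy_hydroxylate are hypothetical stand-ins that only mimic the division of labour described above, not the library's actual implementation. In particular, the toy always expects an explicit atom to delete.

```python
from copy import deepcopy

# Toy "molecule": atom id -> set of bonded atom ids.
# This is a hypothetical stand-in, not BuildAMol's Molecule class.
def make_methane():
    return {
        "C1": {"H1", "H2", "H3", "H4"},
        "H1": {"C1"}, "H2": {"C1"}, "H3": {"C1"}, "H4": {"C1"},
    }

def make_hydroxyl_reference():
    # water serves as the reference molecule for a hydroxyl group
    return {"O": {"HA", "HB"}, "HA": {"O"}, "HB": {"O"}}

def remove_atom(mol, atom):
    # drop the atom and every bond that points to it
    for neighbor in mol.pop(atom):
        mol[neighbor].discard(atom)

def toy_modify(mol, modifier, at_atom, delete, modifier_at_atom, modifier_deletes):
    """Generic step: attach a copy of `modifier` to `mol` at `at_atom`."""
    mol = deepcopy(mol)
    group = deepcopy(modifier)
    # make room on both sides
    remove_atom(mol, delete)
    for atom in modifier_deletes:
        remove_atom(group, atom)
    # merge the group under unique ids, then create the new bond
    suffix = f"@{at_atom}"
    for atom, neighbors in group.items():
        mol[atom + suffix] = {n + suffix for n in neighbors}
    mol[at_atom].add(modifier_at_atom + suffix)
    mol[modifier_at_atom + suffix].add(at_atom)
    return mol

def toy_hydroxylate(mol, at_atom, delete):
    """Specific modifier: fetch the reference, then delegate to toy_modify."""
    reference = make_hydroxyl_reference()
    return toy_modify(
        mol, reference, at_atom, delete,
        modifier_at_atom="O", modifier_deletes=["HA"],
    )

methanol = toy_hydroxylate(make_methane(), at_atom="C1", delete="H1")
```

The specific modifier stays only a few lines long because all of the attachment logic lives in the generic helper.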
To illustrate this, let’s remake our own carboxylate function!
[14]:
# first we need to get a suitable reference molecule for the carboxyl group.
# Formic Acid should be the best choice. Let's see if we have that in the database
bam.load_small_molecules()
bam.get_compound("formic acid")
[14]:
[Molecule(CBX), Molecule(FMT)]
So we even have two! CBX is officially called carboxy group in the database while FMT is formic acid, but structurally they are identical. Let’s use CBX for our function. Now that we know which reference molecule to use for our carboxyl group, we can do additional preprocessing steps. For instance, it is common to call the carbonyl carbon C and the hydroxyl oxygen OXT. If we want our carboxyl group to adhere to these conventions, we have to rename the atoms accordingly.
[15]:
# do some atom renaming
# (call the `show` method to see
# the molecule before renaming if
# you want to check the atom names beforehand)
carboxyl = bam.Molecule.from_compound("CBX")
carboxyl.rename_atom("O1", "O")\
    .rename_atom("O2", "OXT")\
    .rename_atom("HO2", "HXT")
carboxyl.show()
Great! Now that we have our “final” carboxyl group molecule we can think about making our actual modifier function. To do that we simply define a function with the following signature:
def add_carboxyl(mol, at_atom, delete, inplace):
    # get and preprocess the carboxyl group molecule
    carboxyl = ...
    # then call _modify with the following arguments
    return _modify(
        mol=mol,
        modifier=carboxyl,
        at_atom=at_atom,
        delete=delete,
        modifier_at_atom=<the atom in carboxyl to use for attaching>,
        modifier_deletes=<the atom(s) to delete from carboxyl>,
        inplace=inplace
    )
Note:
_modify will create linkages automatically, and the default rule of removing a hydrogen neighbor if no atoms to delete are specified still applies. So we don’t necessarily have to specify deleter atoms!
Or fully implemented:
[16]:
# import the _modify function
from buildamol.core.Molecule import _modify

def add_carboxyl(mol, at_atom, delete=None, inplace=True):
    # prepare our carboxyl group
    carboxyl = bam.Molecule.from_compound("CBX")
    carboxyl.rename_atom("O1", "O")\
        .rename_atom("O2", "OXT")\
        .rename_atom("HO2", "HXT")
    # we want to attach the carboxyl group at the "C" atom (carbonyl carbon)
    # and remove the "H" atom (hydrogen from the "C" atom)
    return _modify(
        mol, carboxyl,
        at_atom=at_atom, delete=delete, inplace=inplace,
        modifier_at_atom="C", modifier_deletes=["H"]
    )
And that’s it! Let’s test our function:
[17]:
# let's carboxylate a benzene molecule
benzene = bam.get_compound("benzene")
for carbon in benzene.get_atoms("C", by="element"):
    # since we are removing a hydrogen atom,
    # we don't have to specify the delete parameter
    benzene = add_carboxyl(benzene, carbon)
benzene.py3dmol().show()
Since we are using _modify, the automatic casting also works, so we don’t have to write our own loop to iterate over multiple atoms. We can simply pass a list of target atoms to attach to, like so:
[18]:
benzene = bam.get_compound("benzene")
# we pass a list of atom identifiers (ids in this case)
benzene = add_carboxyl(benzene, ["C1", "C3", "C5"])
benzene.py3dmol().show()
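The casting of a single target or a list of targets can be pictured as a small normalisation step in front of a loop. The sketch below is plain Python with hypothetical helper names (as_list, apply_to_targets); it only illustrates the idea and is not necessarily how _modify is written.

```python
def as_list(targets):
    """Wrap a single target in a list; turn collections into a list."""
    if isinstance(targets, (list, tuple, set)):
        return list(targets)
    return [targets]


def apply_to_targets(modifier, mol, targets):
    """Apply a one-target modifier to every target in turn."""
    for target in as_list(targets):
        mol = modifier(mol, target)
    return mol


# toy "molecule": a list of labels; the toy modifier only records its target
def tag(mol, target):
    return mol + [f"COOH@{target}"]


single = apply_to_targets(tag, [], "C1")
several = apply_to_targets(tag, [], ["C1", "C3", "C5"])
```

With this step in place, the caller can pass one atom or many through the same argument.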
Other Modifiers
Of course, “modifiers” don’t necessarily have to add functional groups to a molecule. Any function that alters something in a molecule’s structure is a modifier, after all!
Let’s make a second example that will invert all double bonds in a molecule from cis to trans or vice versa!
[19]:
# let's start by making a testing molecule
# using the polymers extension
from buildamol.extensions.polymers import linear_alkene
alkene = linear_alkene(10)
# and make two double bonds into cis double bonds
alkene.cis("C3", "C4").cis("C7", "C8")
alkene.py3dmol().show()
Cool, so you may have noticed that I called alkene.cis(...) just now. We can modify the stereochemistry of double bonds using the methods Molecule.cis and Molecule.trans and we can check if a specific bond is in cis or trans configuration using Molecule.is_cis and Molecule.is_trans, respectively.
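Geometrically, cis and trans can be told apart by the dihedral angle between one substituent on each side of the double bond: close to 0° means both substituents sit on the same side (cis), close to 180° means opposite sides (trans). The following is a plain-Python sketch of that textbook criterion. The function names are illustrative, and BuildAMol's own Molecule.is_cis and Molecule.is_trans may be implemented differently.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for the atom chain p0-p1-p2-p3."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def looks_cis(p0, p1, p2, p3):
    """True if the substituents p0 and p3 lie on the same side of bond p1-p2."""
    return abs(dihedral(p0, p1, p2, p3)) < 90.0

# double bond along the x axis, one substituent on each bond atom
a, b = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
left = (-0.5, 1.0, 0.0)
right_same_side = (1.5, 1.0, 0.0)
right_other_side = (1.5, -1.0, 0.0)

cis_angle = dihedral(left, a, b, right_same_side)
trans_angle = dihedral(left, a, b, right_other_side)
```

Using a 90° threshold keeps the check tolerant of structures that are not perfectly planar.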
Hopefully, by this point you already have an idea of what our “bond inversion” modifier function could look like. Here’s the architecture:
get all double bonds
determine for each whether it is in cis or trans configuration
apply the inverse operation (cis->trans / trans->cis) to each double bond
There is no pre-made function like _modify that we can (or need to) use here. It’s just a standard function like so:
[20]:
def invert_stereo(mol):
    # get all double bonds
    double_bonds = (i for i in mol.get_bonds() if i.order == 2)
    # now split them up in cis and trans
    cis_bonds = []
    trans_bonds = []
    for bond in double_bonds:
        if mol.is_cis(bond):
            cis_bonds.append(bond)
        elif mol.is_trans(bond):
            trans_bonds.append(bond)
    # now we can invert the stereochemistry
    for bond in cis_bonds:
        mol.trans(bond)
    for bond in trans_bonds:
        mol.cis(bond)
    return mol
Great! Let’s see if it works:
[21]:
alkene_inverted = invert_stereo(alkene.copy())
alkene_inverted.py3dmol().show()
Cool, there we have the middle bond in cis and the two others in trans!
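One property worth checking for a modifier like this is that applying it twice restores the original configuration. The toy below replaces the molecule with a dictionary of bond labels (a hypothetical stand-in, no BuildAMol involved; the bond names are assumed for illustration) and uses the same classify-first, flip-afterwards structure as invert_stereo.

```python
def toy_invert(labels):
    """Return a copy of `labels` with every 'cis' and 'trans' entry swapped."""
    # classify first ...
    cis = [bond for bond, state in labels.items() if state == "cis"]
    trans = [bond for bond, state in labels.items() if state == "trans"]
    # ... then flip, so no bond is classified after it has been changed
    inverted = dict(labels)
    for bond in cis:
        inverted[bond] = "trans"
    for bond in trans:
        inverted[bond] = "cis"
    return inverted


# assumed labels mirroring the example: two cis bonds around a trans bond
bonds = {("C3", "C4"): "cis", ("C5", "C6"): "trans", ("C7", "C8"): "cis"}
once = toy_invert(bonds)
twice = toy_invert(once)
```

The same round-trip idea can be applied to a real molecule by comparing is_cis results before and after two calls to invert_stereo.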
That’s it for this tutorial! We saw that we can quite easily write functions that add specific functional groups in a versatile way. So, if your projects often involve the same kinds of steps, try collecting some modifiers in a separate Python module and thereby develop your own little BuildAMol extension!
Thanks for checking out this tutorial and good luck with your projects using BuildAMol!