Custom Modifiers

In this tutorial we will cover:

  • how we can define our own modifier functions like carboxylate

  • how we can define other modifier functions

Building molecules is a straightforward task in BuildAMol. A “modifier” function is essentially any function that takes a molecule as an argument and returns a modified version of it. There is technically nothing special about this process: write any function that performs some action on a molecule and you have a working modifier. Why this tutorial, then? You may have been using the available modifiers such as hydroxylate or carboxylate, which add functional groups to one or more positions in a molecule. This tutorial explains how they work so that you can write your own modifiers more efficiently.
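To make the idea of “a function that takes a molecule and returns a modified version” concrete, here is a minimal sketch that does not use BuildAMol at all. ToyMolecule and add_atom are hypothetical stand-ins invented for this illustration, and the inplace behavior shown is an assumption about how such a flag is typically handled, not a description of BuildAMol's internals.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMolecule:
    """Hypothetical stand-in for a molecule: a name plus a list of atom ids."""
    name: str
    atoms: list = field(default_factory=list)

    def copy(self):
        # Copy the atom list so the original is not affected by later changes
        return ToyMolecule(self.name, list(self.atoms))


def add_atom(mol, atom_id, inplace=True):
    """A toy modifier: takes a molecule, changes it, and returns it."""
    target = mol if inplace else mol.copy()
    target.atoms.append(atom_id)
    return target


methane = ToyMolecule("methane", ["C1", "H1", "H2", "H3", "H4"])
modified = add_atom(methane, "X1", inplace=False)  # works on a copy
print(methane.atoms)
print(modified.atoms)
```

Returning the molecule in both cases means the caller can always write mol = modifier(mol), regardless of whether the change happened in place.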

Let’s dive in!

[12]:
import plotly
plotly.offline.init_notebook_mode()
[13]:
import buildamol as bam

Functional Group Modifiers

All the default modifiers share the same architecture. They take a molecule as an argument along with information on where to modify it, obtain a reference molecule for the functional group they want to add, and then call the _modify function, which provides a generic implementation for attaching one molecule (the functional group) to multiple locations of a target molecule.
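The split between a thin functional-group wrapper and a generic attach step can be sketched with plain dictionaries. This is only a toy model: toy_modify and toy_carboxylate are hypothetical names, molecules are reduced to atom ids and bonds, and hydrogens, atom deletion and 3D geometry are ignored entirely. It is not how _modify is implemented.

```python
import copy


def toy_modify(mol, modifier, at_atom, modifier_at_atom, inplace=True):
    """Toy version of a generic attach step (not BuildAMol's _modify)."""
    target = mol if inplace else copy.deepcopy(mol)
    # Suffix incoming atom ids so repeated attachments never clash
    tag = f"g{len(target['atoms'])}"
    renamed = {atom: f"{atom}_{tag}" for atom in modifier["atoms"]}
    target["atoms"].extend(renamed.values())
    target["bonds"].extend((renamed[a], renamed[b]) for a, b in modifier["bonds"])
    # The new linkage between the target atom and the group's anchor atom
    target["bonds"].append((at_atom, renamed[modifier_at_atom]))
    return target


def toy_carboxylate(mol, at_atom, inplace=True):
    """Functional-group wrapper: build the reference group, then delegate."""
    carboxyl = {
        "atoms": ["C", "O", "OXT", "HXT"],
        "bonds": [("C", "O"), ("C", "OXT"), ("OXT", "HXT")],
    }
    return toy_modify(mol, carboxyl, at_atom, modifier_at_atom="C", inplace=inplace)


backbone = {"atoms": ["C1", "C2"], "bonds": [("C1", "C2")]}
result = toy_carboxylate(backbone, "C1", inplace=False)
print(result["bonds"][-1])
```

The wrapper only knows which group to add and which of its atoms is the anchor; everything else lives in the generic step, so a second functional group would need just another small wrapper.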

To illustrate this, let’s remake our own carboxylate function!

[14]:
# first we need to get a suitable reference molecule for the carboxyl group.
# Formic Acid should be the best choice. Let's see if we have that in the database
bam.load_small_molecules()
bam.get_compound("formic acid")
[14]:
[Molecule(CBX), Molecule(FMT)]

So we even have two! CBX is officially called “carboxy group” in the database while FMT is formic acid, but structurally they are identical. Let’s use CBX for our function. Now that we know which reference molecule to use for our carboxyl group, we can do additional preprocessing steps. For instance, it is common to call the carbonyl carbon C and the hydroxyl oxygen OXT. If we want our carboxyl group to adhere to these conventions, we have to rename the atoms accordingly.

[15]:
# do some atom renaming
# (call the `show` method to see
# the molecule before renaming if
# you want to check the atom names beforehand)
carboxyl = bam.Molecule.from_compound("CBX")
carboxyl.rename_atom("O1", "O")\
        .rename_atom("O2", "OXT")\
        .rename_atom("HO2", "HXT")

carboxyl.show()

Great! Now that we have our “final” carboxyl group molecule we can think about making our actual modifier function. To do that we simply define a function with the following signature:

def add_carboxyl(mol, at_atom, delete, inplace):
    # get and preprocess the carboxyl group molecule
    carboxyl = ...
    # then call _modify with the following arguments
    return _modify(
        mol=mol,
        modifier=carboxyl,
        at_atom=at_atom,
        delete=delete,
        modifier_at_atom=<the atom in carboxyl to use for attaching>,
        modifier_deletes=<the atom(s) to delete from carboxyl>,
        inplace=inplace
        )

Note: _modify creates linkages automatically, and the default rule of removing a hydrogen neighbor when no atoms to delete are specified still applies. So we don’t necessarily have to specify the atoms to delete!

Or fully implemented:

[16]:
# import the _modify function
from buildamol.core.Molecule import _modify

def add_carboxyl(mol, at_atom, delete=None, inplace=True):
    # prepare our carboxyl group
    carboxyl = bam.Molecule.from_compound("CBX")
    carboxyl.rename_atom("O1", "O")\
            .rename_atom("O2", "OXT")\
            .rename_atom("HO2", "HXT")
    # we want to attach the carboxyl group at the "C" atom (carbonyl carbon)
    # and remove the "H" atom (hydrogen from the "C" atom)
    return _modify(
        mol, carboxyl,
        at_atom=at_atom, delete=delete, inplace=inplace,
        modifier_at_atom="C", modifier_deletes=["H"]
    )

And that’s it! Let’s test our function:

[17]:
# let's carboxylate a benzene molecule
benzene = bam.get_compound("benzene")

for carbon in benzene.get_atoms("C", by="element"):
    # since we are removing a hydrogen atom,
    # we don't have to specify the delete parameter
    benzene = add_carboxyl(benzene, carbon)

benzene.py3dmol().show()
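The “remove a hydrogen neighbor unless told otherwise” default that lets us skip the delete parameter can be pictured with a small helper. This is a sketch under loud assumptions: choose_atom_to_delete is a hypothetical function, neighbors are given as a plain dictionary, hydrogens are recognized by their id starting with "H", and raising an error when no hydrogen is found is our own choice for the toy, not documented BuildAMol behavior.

```python
def choose_atom_to_delete(neighbors, at_atom, delete=None):
    """Return the atom to remove from `at_atom` before attaching a group."""
    if delete is not None:
        return delete  # an explicit choice always wins
    for neighbor in neighbors[at_atom]:
        # Toy convention: hydrogen ids start with "H"
        if neighbor.startswith("H"):
            return neighbor
    raise ValueError(f"{at_atom} has no hydrogen neighbor; pass `delete` explicitly")


neighbors = {"C1": ["C2", "C6", "H1"], "C7": ["C1", "O1", "O2"]}
print(choose_atom_to_delete(neighbors, "C1"))        # falls back to the hydrogen
print(choose_atom_to_delete(neighbors, "C1", "C2"))  # explicit choice is kept
```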

Because we are using _modify, automatic casting also works, so we don’t have to write our own loop to iterate over multiple atoms. We can simply pass a list of target atoms to attach to, like so:

[18]:
benzene = bam.get_compound("benzene")
# we pass a list of atom identifiers (ids in this case)
benzene = add_carboxyl(benzene, ["C1", "C3", "C5"])

benzene.py3dmol().show()
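If you write a modifier that does not go through _modify, you can get the same “one atom or many” convenience with a small normalization step. The following is a minimal sketch; as_list and toy_add_group are hypothetical helpers, and the dictionary that records attached groups merely stands in for a real molecule.

```python
def as_list(at_atom):
    """Wrap a single atom identifier in a list; pass collections through."""
    # Strings are iterable too, so we check for containers explicitly
    if isinstance(at_atom, (list, tuple, set)):
        return list(at_atom)
    return [at_atom]


def toy_add_group(attached, at_atom, group="COOH"):
    """Record `group` on every requested atom of a toy 'molecule' dictionary."""
    for atom in as_list(at_atom):
        attached.setdefault(atom, []).append(group)
    return attached


print(toy_add_group({}, "C1"))
print(toy_add_group({}, ["C1", "C3", "C5"]))
```

Checking for list, tuple and set rather than “anything iterable” matters here: a bare identifier such as "C1" is itself iterable and would otherwise be split into "C" and "1".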

Other Modifiers

Of course, “modifiers” don’t necessarily have to add functional groups onto a molecule. Any function that alters something in a molecule’s structure is a modifier, after all!

Let’s make a second example that will invert all double bonds in a molecule from cis to trans or vice versa!

[19]:
# let's start by making a testing molecule
# using the polymers extension
from buildamol.extensions.polymers import linear_alkene

alkene = linear_alkene(10)

# and make two double bonds into cis double bonds
alkene.cis("C3", "C4").cis("C7", "C8")

alkene.py3dmol().show()

Cool, so you may have noticed that I called alkene.cis(...) just now. We can modify the stereochemistry of double bonds using the methods Molecule.cis and Molecule.trans, and we can check whether a specific bond is in cis or trans configuration using Molecule.is_cis and Molecule.is_trans, respectively.

Hopefully, by this point you already have an idea of what our “bond inversion” modifier function could look like. Here’s the architecture:

  • get all double bonds

  • determine for each whether it is in cis or trans configuration

  • apply the inverse operation cis->trans / trans->cis on each double bond

There is no pre-made function like _modify that we can (or need to) use here. It’s just a standard function like so:

[20]:
def invert_stereo(mol):
    # get all double bonds
    double_bonds = (i for i in mol.get_bonds() if i.order == 2)
    # now split them up in cis and trans
    cis_bonds = []
    trans_bonds = []
    for bond in double_bonds:
        if mol.is_cis(bond):
            cis_bonds.append(bond)
        elif mol.is_trans(bond):
            trans_bonds.append(bond)

    # now we can invert the stereochemistry
    for bond in cis_bonds:
        mol.trans(bond)
    for bond in trans_bonds:
        mol.cis(bond)

    return mol

Great! Let’s see if it works:

[21]:
alkene_inverted = invert_stereo(alkene.copy())

alkene_inverted.py3dmol().show()

Cool, there we have the middle bond in cis and the two others in trans!
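If you prefer to check the logic without looking at a rendered structure, you can run the same function against a small stand-in object. In this sketch ToyBond and ToyMol are hypothetical classes that only mimic the method names used in this tutorial (get_bonds, is_cis, is_trans, cis, trans) and the order attribute; they store a label per bond instead of changing any geometry. The body of invert_stereo is repeated so that the block runs on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyBond:
    """Hypothetical bond: two atom ids and a bond order."""
    atom1: str
    atom2: str
    order: int = 1


class ToyMol:
    """Hypothetical molecule that stores a 'cis'/'trans' label per bond."""

    def __init__(self, bonds, stereo):
        self._bonds = list(bonds)
        self.stereo = dict(stereo)  # bonds without an entry are undefined

    def get_bonds(self):
        return list(self._bonds)

    def is_cis(self, bond):
        return self.stereo.get(bond) == "cis"

    def is_trans(self, bond):
        return self.stereo.get(bond) == "trans"

    def cis(self, bond):
        self.stereo[bond] = "cis"
        return self

    def trans(self, bond):
        self.stereo[bond] = "trans"
        return self


def invert_stereo(mol):
    # Same logic as in the tutorial: classify first, then flip
    double_bonds = (b for b in mol.get_bonds() if b.order == 2)
    cis_bonds, trans_bonds = [], []
    for bond in double_bonds:
        if mol.is_cis(bond):
            cis_bonds.append(bond)
        elif mol.is_trans(bond):
            trans_bonds.append(bond)
    for bond in cis_bonds:
        mol.trans(bond)
    for bond in trans_bonds:
        mol.cis(bond)
    return mol


terminal = ToyBond("C1", "C2", order=2)  # neither cis nor trans
first = ToyBond("C3", "C4", order=2)
single = ToyBond("C4", "C5")
middle = ToyBond("C5", "C6", order=2)
toy = ToyMol([terminal, first, single, middle], {first: "cis", middle: "trans"})

invert_stereo(toy)
print(toy.stereo[first], toy.stereo[middle])
```

The terminal double bond is neither cis nor trans in this toy, so it falls through both branches and is left untouched, which is the behavior we would want for bonds without a defined configuration.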

That’s it for this tutorial! We saw that we can quite easily write functions that add specific functional groups in a versatile way. So, if your projects often involve repeating the same kinds of steps, maybe try collecting some modifiers in a separate Python module and thereby develop your own little BuildAMol extension!

Thanks for checking out this tutorial and good luck with your projects using BuildAMol!