Biomolecules and BuildAMol#

In this tutorial we will cover:

  • how we can use the packages in the bio extension to make

    • peptides

    • lipids

    • glycans

    • oligonucleotides

The Bio Extension#

BuildAMol comes with a bio extension that contains functions to quickly model small biomolecules. Currently available are functions to model peptides, glycans, different types of lipids, as well as small stretches of DNA or RNA.

We can use these by importing the respective packages from the extension. The sections below walk through each of them.

Making a Peptide#

We can use the peptide function from the bio.proteins package to build a model from an amino acid sequence in single-letter code. The implementation is very low-level: although it accepts long sequences, it cannot model secondary or tertiary structure. Use it for small peptides, not for modeling protein structures!

[12]:
import buildamol as bam

# import the proteins extension
from buildamol.extensions.bio import proteins

# make a peptide
peptide_seq = "MAARGRRAWLSVLLGLVLGF"
peptide = proteins.peptide(peptide_seq)

# now optimize using rdkit's forcefield
peptide.optimize(algorithm="rdkit")

# show the peptide
peptide.py3dmol().show()

And there we have a peptide molecule that we can work with. Of course, it does not have any notable structure, since BuildAMol does not do any kind of folding.
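
Since the peptide function takes a plain string, it can help to check the sequence before building. The following is a minimal sketch in plain Python, not part of BuildAMol: the helper name `invalid_residues` is hypothetical, and it assumes that only the 20 standard single-letter codes are intended.

```python
# The 20 standard amino acids in single-letter code.
# Assumption: anything outside this set is treated as a typo.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def invalid_residues(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, letter) for every non-standard letter."""
    return [
        (position, letter)
        for position, letter in enumerate(sequence.upper(), start=1)
        if letter not in STANDARD_AMINO_ACIDS
    ]


# The tutorial sequence contains only standard letters.
print(invalid_residues("MAARGRRAWLSVLLGLVLGF"))

# X and B are reported together with their positions.
print(invalid_residues("MAXB"))
```

An empty result means the string can be handed on to the peptide function as shown in the cell above.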

Making a Glycan#

In glycobiology the IUPAC nomenclature has been widely used to represent glycans in textual form. This is because glycans tend to produce very long and complex SMILES strings, which is also why they are often difficult to compute from SMILES. Different flavors of the IUPAC nomenclature exist. BuildAMol supports the condensed version and can read text inputs in that format.

To create a glycan model from an IUPAC string we can import the glycans extension like so:

[13]:
# import the glycans extension
from buildamol.extensions.bio import glycans

# make a glycan
iupac = "Gal(b1-3)[Fuc(a1-4)]Man(b1-4)GalNAc(b1-4)GlcNAc"
glycan = glycans.glycan(iupac)

# now optimize using rdkit's forcefield
glycan.optimize(algorithm="rdkit")

# show the glycan
glycan.py3dmol().show()

And there we have a small glycan model that we can use further…
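
To get a feel for how a condensed IUPAC string is put together, here is a small sketch that splits one into residues, linkages, and branch brackets. It is plain Python with a deliberately simplified grammar (hypothetical helpers `tokenize` and `residues`), and it is not how BuildAMol parses the string internally.

```python
import re

# One token is a linkage such as "(b1-4)", a branch bracket, or a residue name.
# Assumption: linkages always look like (a1-4), (b1-3), or (?1-6).
TOKEN = re.compile(r"\([ab?]\d-\d\)|\[|\]|[A-Za-z0-9]+")


def tokenize(iupac: str) -> list[str]:
    """Split a condensed IUPAC string into residues, linkages, and brackets."""
    return TOKEN.findall(iupac)


def residues(iupac: str) -> list[str]:
    """Return only the residue names, in the order they are written."""
    return [token for token in tokenize(iupac) if token[0] not in "([]"]


iupac = "Gal(b1-3)[Fuc(a1-4)]Man(b1-4)GalNAc(b1-4)GlcNAc"
print(tokenize(iupac))
print(residues(iupac))
```

In the condensed notation the reducing-end residue is written last, and the square brackets mark a branch: here Fuc branches off the Man residue.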

Making Lipids#

Lipids are a little more diverse than peptides or glycans. For one, the length and saturation of fatty acids can vary widely. Furthermore, different types of lipids have different backbones or head groups. In the lipids extension, BuildAMol offers functions to create:

  • fatty acids

  • mono-, di-, and triacylglycerols (all in the triacylglycerol function)

  • phospholipids

  • sphingolipids

Let’s explore a bit how we can work with these functions:

[1]:
# import the lipids extension
from buildamol.extensions.bio import lipids

Making Fatty Acids#

Using the fatty_acid function we can control the length as well as the saturation of the fatty acids we produce. The saturation can be controlled either by specifying exactly where we want double bonds and whether each should be in cis configuration, or by simply providing the number of double bonds together with the probability of a bond being in cis configuration. This way we can quickly generate either specific fatty acids or a population of random ones. Here’s how:

[4]:
# make a fatty acid with 18 carbons and 2 double bonds, with a 50% chance for cis configuration
fa1 = lipids.fatty_acid(18, 2, cis=0.5)
fa1.py3dmol().show()

And there we have one fatty acid! Let’s make some more:

[5]:
# make a fatty acid with 16 carbons and one double bond at the 9th position in trans configuration
# to specify individual double bonds, we use a tuple of positions rather than an integer
fa2 = lipids.fatty_acid(16, (9,), cis=False)
fa2.py3dmol().show()

[8]:
# make a fatty acid with 20 carbons and 2 double bonds at the 5th and 8th positions in cis and trans configuration
fa3 = lipids.fatty_acid(20, (5, 8), cis=(True, False))
fa3.py3dmol().show()

Feel free to play around a little more if you like. Now let’s make some larger lipids using these fatty acids!
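
Fatty acids are often written as lipid numbers such as 18:2 (18 carbons, 2 double bonds). A small sketch of how such a shorthand could be turned into the two leading arguments used in the cells above; `parse_lipid_number` is a hypothetical helper in plain Python and not part of BuildAMol.

```python
def parse_lipid_number(shorthand: str) -> tuple[int, int]:
    """Convert a C:D lipid number such as "18:2" into (carbons, double bonds)."""
    text = shorthand.strip().upper()
    if text.startswith("C"):
        # accept the "C18:2" spelling as well
        text = text[1:]
    carbons, double_bonds = (int(part) for part in text.split(":"))
    if double_bonds < 0 or double_bonds >= carbons:
        raise ValueError(f"implausible lipid number: {shorthand!r}")
    return carbons, double_bonds


print(parse_lipid_number("18:2"))
print(parse_lipid_number("C16:1"))

# Usage with the tutorial's call pattern (requires BuildAMol):
# fa = lipids.fatty_acid(*parse_lipid_number("18:2"), cis=0.5)
```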

Making Acylglycerols#

Using the triacylglycerol function we can make mono-, di-, and triacylglycerols by passing one, two, or three fatty acid molecules as arguments. Positions that should carry no fatty acid are specified by passing None.

[9]:
# make a triacylglycerol
tag = lipids.triacylglycerol(fa1, fa2, fa3)
tag.py3dmol().show()

[10]:
# make a diacylglycerol with the middle position empty
dag = lipids.triacylglycerol(fa1, None, fa3)
dag.py3dmol().show()
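
Because one function covers all three acylglycerol classes, the class of the result depends only on how many positions are not None. The sketch below illustrates that bookkeeping in plain Python; `acylglycerol_class` is a hypothetical helper, and strings stand in for the fatty acid molecules.

```python
def acylglycerol_class(first, middle, last) -> str:
    """Name the acylglycerol class from which positions carry a fatty acid."""
    occupied = sum(chain is not None for chain in (first, middle, last))
    if occupied == 0:
        raise ValueError("at least one position must carry a fatty acid")
    names = {1: "monoacylglycerol", 2: "diacylglycerol", 3: "triacylglycerol"}
    return names[occupied]


# Strings are placeholders for fatty acid molecules such as fa1, fa2, fa3.
print(acylglycerol_class("fa1", "fa2", "fa3"))
print(acylglycerol_class("fa1", None, "fa3"))
print(acylglycerol_class(None, "fa2", None))
```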

Making Phospho- and Sphingolipids#

Phospholipids have two fatty acid chains and one headgroup; sphingolipids have one fatty acid chain attached to a sphingosine backbone, plus a headgroup. The headgroups are more diverse in structure, which is why the phospholipid and sphingolipid functions require, in addition to the molecules themselves, a Linkage that defines how the headgroup should be connected. The Linkage only needs to specify the source atom and the atoms to delete in the source (the headgroup); the target settings are applied automatically.

[14]:
# make a phosphatidylserine
ser = proteins.amino_acids.serine

# define the linkage with which to attach the serine to the glycerol
# (we do not need to specify the atom1 because the position of where the headgroup is attached
# is already known in the glycerol, the only uncertainty is which headgroup atoms to use)
link = bam.linkage(atom1=None, atom2="CB", delete_in_source=["OG", "HG"])

# make the phosphatidylserine
ps = lipids.phospholipid(fa1, fa2, headgroup=ser, headgroup_link=link)
ps.py3dmol().show()

[15]:
# make a sphingoglycolipid
# let's actually just attach the glycan we made before as a headgroup

# make sure we attach the glycan via its first (i.e. root) residue
glycan.set_attach_residue(1)

# define the linkage with which to attach the glycan to the sphingosine
link = bam.linkage(None, "C1", delete_in_source=["O1", "HO1"])

# make the sphingoglycolipid
sgl = lipids.sphingolipid(fa1, headgroup=glycan, headgroup_link=link)
sgl.py3dmol().show()
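
A linkage for a headgroup names one bonding atom and the atoms to delete, and all of these must actually exist in the headgroup. The following plain-Python sketch shows one way to sanity-check such a specification before building. `HeadgroupLinkSpec` and `missing_atoms` are hypothetical and not BuildAMol objects, and the serine atom names are assumed to follow the usual PDB naming.

```python
from dataclasses import dataclass, field


@dataclass
class HeadgroupLinkSpec:
    """Hypothetical plain-Python mirror of the headgroup side of a linkage."""

    bond_atom: str
    delete: list[str] = field(default_factory=list)


def missing_atoms(spec: HeadgroupLinkSpec, atom_names: set[str]) -> list[str]:
    """Return every atom named in the spec that the headgroup does not have."""
    wanted = [spec.bond_atom, *spec.delete]
    return [name for name in wanted if name not in atom_names]


# Assumed PDB-style atom names of a free serine.
SERINE_ATOMS = {
    "N", "CA", "C", "O", "OXT", "CB", "OG",
    "H", "H2", "HA", "HB2", "HB3", "HG", "HXT",
}

# The serine linkage from the phosphatidylserine cell: nothing is missing.
print(missing_atoms(HeadgroupLinkSpec("CB", ["OG", "HG"]), SERINE_ATOMS))

# The glycan linkage does not fit serine: all three names are reported.
print(missing_atoms(HeadgroupLinkSpec("C1", ["O1", "HO1"]), SERINE_ATOMS))
```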

Making Nucleic Acids#

Oligonucleotides are again a simpler case, very similar to peptides. We can create small oligonucleotides from a sequence using the dna or rna functions like so:

[16]:
from buildamol.extensions.bio import nucleic_acids

# make an RNA strand
seq = "ACCUCAAGAGACUAC"
rna = nucleic_acids.rna(seq)

# now optimize using rdkit's forcefield
rna.optimize(algorithm="rdkit")
rna.py3dmol().show()
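
When a sequence is only available in DNA letters, it has to be rewritten in the RNA alphabet before it matches the input used above. A minimal sketch in plain Python; `dna_to_rna` is a hypothetical helper that simply replaces T with U on the same strand and makes no claim about which letters BuildAMol itself accepts.

```python
def dna_to_rna(sequence: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet (T becomes U, same strand)."""
    seq = sequence.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"not a DNA sequence, found: {sorted(unexpected)}")
    return seq.replace("T", "U")


# The DNA spelling of the tutorial's RNA strand.
print(dna_to_rna("ACCTCAAGAGACTAC"))
```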

So that just about wraps up the bio extension! We saw how we can build simple peptides, glycans, various lipids, and nucleic acids using easy top-level functions. The lipids in particular might require some additional post-processing to align the fatty acid chains properly (for instance, if one wanted to construct membranes).

Thanks for checking out this tutorial and good luck in your next project using BuildAMol!