Using Molecular Geometries
In this tutorial we will cover:
how we can use BuildAMol’s molecular geometries to build fragment molecules from scratch
The core business of BuildAMol is fragment-based assembly. This requires that we have suitable fragment molecules available. In 99.9% of all cases it should be no trouble whatsoever to either use the built-in resources, generate templates from SMILES, or query PubChem for molecules. However, there may be cases where this approach fails. For instance, think of phosphorus pentafluoride (PF5), a small inorganic molecule.
[1]:
import plotly
plotly.offline.init_notebook_mode()
Let us first check if we have the molecule available in the built-in resources:
[2]:
import buildamol as bam
bam.load_small_molecules()
bam.has_compound("PF5", search_by="formula")
[2]:
False
Well, if the compound is not available, let us query PubChem for it…
[3]:
bam.query_pubchem("PF5")
No matches… Let’s try again, but this time with the name:
[4]:
bam.query_pubchem("phosphorus pentafluoride")
Nope… Well, the SMILES should definitely work, though:
[5]:
mol = bam.read_smiles("P(F)(F)(F)(F)F")
mol.show()
Mhmm… Somehow that does not look like the PF5 we were expecting. For starters, there appear to be only four F atoms, and secondly PF5 is supposed to have a trigonal bipyramidal geometry and not whatever this is…
So it seems like we hit a particularly tricky case with this tiny molecule. Can we simply not build it at all? Well, we wouldn’t be having this tutorial if that were the case! In fact, to account for exactly such rare use cases where a particular molecule is not available through conventional means, BuildAMol comes with molecular geometries that can be used to define valid coordinates for manual atom placement.
Let’s see how this works as we construct PF5!
Building PF5 from scratch - the long way
The geometries can be found in the structural.geometry submodule. They come with methods that accept one or more atoms to use for referencing the available coordinates and constructing the remaining ones. For our use case we need to use a trigonal_bipyramidal geometry. But first, let’s just make some atoms to build a PF5 molecule.
[6]:
# make some atoms
P = bam.Atom.new("P")
Fs = [bam.Atom.new("F") for _ in range(5)]
# let's assemble a molecule for our PF5
pf5 = bam.Molecule.new("PF5", "PF5")
pf5.add_atoms(P, *Fs)
# also add bonds
for F in Fs:
    pf5.add_bond(P, F)
pf5.show()
Now, we have a PF5 molecule with the right connectivity but all of the coordinates are (0,0,0). Let’s change that by applying the trigonal bipyramidal geometry to the atoms. Here’s how:
[7]:
# use trigonal bipyramidal geometry to arrange the atoms
# we just pass the central P atom as only reference point
# and let the other coordinates be generated automatically
# they will be automatically applied onto the atoms that we pass
# as the second list
bam.structural.geometry.trigonal_bipyramidal.make_and_apply([P], [P, *Fs], length=1.5)
# let's visualize again
pf5.show()
And there we have it, a correctly arranged PF5 molecule! So, we were able to compute valid atom coordinates using the trigonal bipyramidal geometry object and a list of atoms. The same can be done for any other available geometry. The five main VSEPR geometries are available:
linear
trigonal planar
tetrahedral
trigonal bipyramidal
octahedral
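To make the idea behind these geometries more concrete, here is a minimal sketch in plain Python of what "ideal" ligand positions look like for each of the five arrangements. This is not BuildAMol's implementation: the names `IDEAL_DIRECTIONS`, `place_ligands`, and `angle_deg` are hypothetical helpers introduced only for illustration, and the direction vectors are the textbook VSEPR arrangements rather than values read from the library.

```python
import math

S3 = math.sqrt(3) / 2   # sin(60 degrees), used for the 120-degree spacing in a plane
T = 1 / math.sqrt(3)    # component of a unit vector along a cube diagonal

# Hypothetical lookup table: unit vectors pointing from the central atom
# to each ligand slot (textbook VSEPR arrangements, not BuildAMol data).
IDEAL_DIRECTIONS = {
    "linear": [(0, 0, 1), (0, 0, -1)],
    "trigonal_planar": [(1, 0, 0), (-0.5, S3, 0), (-0.5, -S3, 0)],
    "tetrahedral": [(T, T, T), (T, -T, -T), (-T, T, -T), (-T, -T, T)],
    "trigonal_bipyramidal": [
        (0, 0, 1), (0, 0, -1),                      # two axial slots
        (1, 0, 0), (-0.5, S3, 0), (-0.5, -S3, 0),   # three equatorial slots
    ],
    "octahedral": [
        (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1),
    ],
}


def place_ligands(geometry, center=(0.0, 0.0, 0.0), length=1.5):
    """Return one coordinate per ligand slot, `length` away from `center`."""
    return [
        tuple(c + length * d for c, d in zip(center, direction))
        for direction in IDEAL_DIRECTIONS[geometry]
    ]


def angle_deg(center, a, b):
    """Angle a-center-b in degrees."""
    va = [x - c for x, c in zip(a, center)]
    vb = [x - c for x, c in zip(b, center)]
    dot = sum(x * y for x, y in zip(va, vb))
    cos = dot / (math.hypot(*va) * math.hypot(*vb))
    # clamp to guard against tiny floating point overshoots
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


origin = (0.0, 0.0, 0.0)
ax1, ax2, eq1, eq2, eq3 = place_ligands("trigonal_bipyramidal", length=1.5)
print(round(angle_deg(origin, ax1, ax2)))  # 180 (axial to axial)
print(round(angle_deg(origin, ax1, eq1)))  # 90  (axial to equatorial)
print(round(angle_deg(origin, eq1, eq2)))  # 120 (equatorial to equatorial)
```

The three angles printed at the end are the ones we expect for a trigonal bipyramidal molecule such as PF5, and the same check can be used to sanity-check coordinates produced by any other route.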
Building PF5 from scratch - the short way
We have seen how we can use geometries to make coordinates for existing atoms. This was not too much code to write, but still required us to first define all atoms, make a molecule, and then generate coordinates for them. We can, fortunately, speed this up using the classmethod Molecule.from_geometry, which will perform exactly the above in one go. Let’s see how…
[8]:
# reset the coordinates of all F atoms to 0,0,0
for F in Fs:
    F.coord = bam.structural.origin
# now use the trigonal bipyramidal geometry to arrange the atoms
pf5_again = bam.Molecule.from_geometry(
    geometry=bam.structural.geometry.trigonal_bipyramidal,
    atoms=[P, *Fs],
)
pf5_again.show()
And with this we again have a PF5 molecule. In fact, this method will automatically fill any vacant atom slots that the geometry provides with hydrogen atoms.
So, in cases where we only care about part of a molecule, we can specify only some of the atoms and have the remaining slots filled with hydrogens, which we can then use to connect to other molecules or simply remove if we don’t need them.
[9]:
# now we make a molecule with only two F atoms
# (actually, these will be in the plane, since we already inferred
# their coordinates before). The remaining 3 atom slots
# will be filled with hydrogen atoms.
pf5_partial = bam.Molecule.from_geometry(
    geometry=bam.structural.geometry.trigonal_bipyramidal,
    atoms=[P, Fs[0], Fs[1]],
)
pf5_partial.show()
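The slot-filling behaviour described above can be pictured with a small stand-alone sketch. It works on element symbols only and does not touch BuildAMol at all: `fill_slots` is a hypothetical helper, and the order in which the real method assigns atoms to slots is an assumption here, not something taken from the library.

```python
def fill_slots(center, ligands, n_slots, filler="H"):
    """Pad a list of ligand symbols with `filler` until every slot is used.

    Hypothetical illustration only; it mimics the idea of filling vacant
    geometry slots with hydrogens and says nothing about coordinates.
    """
    if len(ligands) > n_slots:
        raise ValueError(
            f"geometry offers {n_slots} slots but {len(ligands)} ligands were given"
        )
    vacant = n_slots - len(ligands)
    return [center, *ligands, *[filler] * vacant]


# a trigonal bipyramidal geometry has five ligand slots around the center
print(fill_slots("P", ["F", "F"], n_slots=5))
# ['P', 'F', 'F', 'H', 'H', 'H']
```

With two fluorines supplied, three slots remain vacant, which matches the three hydrogens that appear in the partial molecule above.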
And there we have it. With this we have reached the end of this tutorial. We saw how we can use geometries to manually create small molecules with relatively little effort.
Thanks for checking out this tutorial and good luck with your research using BuildAMol!