Conformational sampling#

In this tutorial we will cover:

  • how we may use BuildAMol to obtain multiple conformations for a molecule

We sometimes want to test out different conformations of a molecule. To achieve this we can use molecular dynamics (MD) simulations. BuildAMol does not perform MD, however. We can nontheless use BuildAMol’s optimization algorithms to obtain multiple conformations for a molecule. The algorithms:

  • swarm_optimize,

  • anneal_optimize,

  • genetic_optimize

offer the functionality of returning multiple solutions using the argument n_best. We can make use of this to obtain multiple conformations for one molecule during optimization by simply passing this argument to the optimization function or the optimize wrapper.

Let’s start by loading some molecule from a file

[1]:
import buildamol as bam
bam.visual.set_backend("py3dmol")

mol = bam.read_pdb("files/x_wing.pdb")
mol.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

Preparing the environment#

If you have already checked out the other tutorials on conformational optimization then you know that we first have to export our molecule into a graph object and set up an optimization environment. If all of that is news to you, go check out the other tutorials to learn more. For now, let’s just make an environment and subsample some bonds to optimize:

[ ]:
# make a graph (standard optimization procedure)
graph = mol.get_atom_graph()

# get some edges to rotate
# (here we take a random subset of edges)
# all edges are directed to be pointing "outward" from the central node
edges = graph.find_rotatable_edges(graph.central_node)
edges = graph.sample_edges(edges, n=2, m=5) # two clusters with 5 sampled edges each

# make an environment to optimize
env = bam.optimizers.DistanceRotatron(graph, edges, pushback=1.5)

Sampling with n_best#

Now we can optimize the molecule in order to obtain multiple conformer samples by passing n_best as an argument to the optimize wrapper function like so:

[5]:
# optimize and get the conformer samples
# we pass n_best=10 to get the 10 best solutions back
conformers = bam.optimizers.optimize(mol, env, algorithm="swarm", n_best=10, n_particles=100)
print(conformers)
[Molecule(x_wing), Molecule(x_wing), Molecule(x_wing), Molecule(x_wing), Molecule(x_wing)]

Now we have a list of ten conformers for the X-Wing molecule. We can look at them in one 3D plot using the code below

[3]:
# make a 3D molecule representation of the original
view = mol.draw(color="gray")

# now draw all conformers
for c in conformers:
    view.add(c.draw(color="cyan"))

view.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

Each of the generated conformers is a Molecule in its own right so we can do with these now whatever. Of course, since they are all products of the same optimization they look very much alike. That is not quite what “sampling” is supposed to be, though.

Sampling with independent optimizations#

To diversify the conformers that are sampled, we could either extend the optimization scope (e.g. more particles for particle-swarm optimization) or we simply use multiple independent optimizations like so:

[4]:
# make some conformers from individual optimizations
conformers = [bam.optimizers.optimize(mol.copy(), env, algorithm="swarm", n_particles=5) for i in range(5)]

view = mol.draw(color="gray")
for c in conformers:
    view.add(c.draw(color="orange"))
view.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

That looks much more like what we are hoping to see. However, notice how we reduced the scale of the optimization markedly so as not to wait too long. When doing separate optimizations we can leverage the power of parallel computation, however, to get a significant speedup! For convenience BuildAMol has a parallel_optimize function that we can use to run multiple optimizations, well, in parallel! Here’s how to use it:

[8]:
# run twenty optimizations in parallel
# by passing a list of environments
# since we want to do sampling we set unify_final=False
conformers = bam.optimizers.parallel_optimize(mol, [env] * 20, algorithm="swarm", n_particles=5, unify_final=False)

view = mol.draw(color="gray")
for c in conformers:
    view.add(c.draw(color="limegreen"))
view.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

And there we have some more conformers. Of course if we want to do it a little more rigorous, we should repeat the procedure multiple times while sampling different edges to get more diversity. Hopefully you feel like you could already do this on your own at this point, but let’s demonstrate it anyways:

[9]:
conformers = []
for i in range(5):
    # make a new graph each time
    edges = graph.sample_edges(graph.find_rotatable_edges(graph.central_node), n=2, m=5)
    env = bam.optimizers.DistanceRotatron(graph, edges, pushback=1.5)

    # optimize and get the conformer samples
    incoming = bam.optimizers.parallel_optimize(mol, [env] * 10, algorithm="swarm", n_particles=5, unify_final=False)
    conformers.extend(incoming)

view = mol.draw(color="gray")
for c in conformers:
    view.add(c.draw(color="purple"))
view.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

And there we have a total of 50 conformations overlaid! Out of interest, let’s also check how many of our conformers have clashes in them:

[12]:
v = mol.draw(color="gray")
for c in conformers:
    if c.count_clashes() == 0:
        color = "limegreen"
    else:
        color = "red"

    v.add(c.draw(color=color))

v.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

What if we want to use these conformers in a deep-learning dataset for instance? Well, most of the available deep-learning tools work with RDKit so let’s simply generate a “dataset” in the form of a pickled list of dictionaries that contain RDKit molecules and some other data. We can then use this to maybe retrain an existing tool to better accomodate our needs, or do whatever else our research may entail…

[13]:
import pickle

dataset = []

for idx, c in enumerate(conformers):
    if c.count_clashes() > 0:
        continue
    new = {
        "conformer_id" : c.id + str(idx),
        # any additional features that would be of interest
        # ...
        "mol" : c.to_rdkit()
    }
    dataset.append(new)

with open("./files/x_wing_conformers.pkl", "wb") as f:
    pickle.dump(dataset, f)

With this we have reached the end of this tutorial. Hopefully you found it useful for your search for conformers!