The optimizers package#

These are the optimizers of buildamol. Whenever you want to improve a molecule’s conformation or generate new conformers for a molecule, you will want to use the optimizers.

The Rotatron Environments#

Buildamol implements a torsional optimization system. That is, instead of “wiggling” atoms around until an energetically favorable structure is obtained, Buildamol rotates around bonds within a structure to find the most favorable conformation. This is done using the Rotatron environments. These are OpenAI Gym environments that store an evaluation function and simulate rotating a molecule around a given set of bonds by a given set of angles. There is a base “Rotatron” environment and three subclasses thereof that can be used for optimization directly. These are:

The DistanceRotatron environment evaluates conformations based on the pairwise distances between nodes in the optimized graph.

It uses two forces: a global “unfolding” force to maximize spatial separation between nodes, and a local “pushback” force to maximize distances between the closest nodes.

The evaluation is computed as:

\[e_i = \sum_{j \neq i} d_{ij}^{\text{unfold}} + \text{pushback} \cdot \sum_{k=1}^{N} \text{sorted}(d)_{ik}\]

There are multiple variations of this basic formulation available (see the functions below).
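The formula above can be sketched in a few lines. This is only an illustrative reading, not buildamol's implementation: it assumes that d_ij is the Euclidean distance between nodes i and j, that N corresponds to n_smallest, and that sorted(d) is ascending. The function name node_evaluations is hypothetical, and plain Python lists are used to keep the sketch dependency-free.

```python
import math


def node_evaluations(coords, unfold=2.0, pushback=3.0, n_smallest=2):
    """Per-node score e_i: sum of d_ij**unfold over all other nodes, plus
    pushback times the sum of the n_smallest shortest distances (assumed reading)."""
    scores = []
    for i, a in enumerate(coords):
        # ascending distances from node i to every other node
        dists = sorted(math.dist(a, b) for j, b in enumerate(coords) if j != i)
        unfold_term = sum(d ** unfold for d in dists)
        pushback_term = pushback * sum(dists[:n_smallest])
        scores.append(unfold_term + pushback_term)
    return scores


compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

print(sum(node_evaluations(compact)))
print(sum(node_evaluations(extended)))
```

Under this reading the extended arrangement receives the larger total, which matches the description of both forces as favoring larger distances.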

class buildamol.optimizers.distance_rotatron.DistanceRotatron(graph: BaseGraph, rotatable_edges: list = None, radius: float = 20, pushback: float = 3, unfold: float = 2, clash_distance: float = 1.2, crop_nodes_further_than: float = -1, n_smallest: int = 10, concatenation_function: callable = None, bounds: tuple = (-3.141592653589793, 3.141592653589793), n_processes: int = 1, **kwargs)[source]#

Bases: Rotatron

A distance-based Rotatron environment.

Parameters:
  • graph (AtomGraph or ResidueGraph) – The graph to optimize

  • rotatable_edges (list) – A list of edges that can be rotated during optimization. If None, all non-locked edges are used.

  • radius (float) – The radius around rotatable edges to include in the distance calculation. Set to -1 to disable.

  • pushback (float) – Short distances between atoms are given higher weight in the evaluation using this factor.

  • unfold (float) – The exponent to use when computing the mean distance to others for each node. Higher values give more weight to the global unfolding of the graph.

  • clash_distance (float) – The distance at which atoms are considered to be clashing.

  • crop_nodes_further_than (float) – Nodes that are further away than this factor times the radius from any rotatable edge at the beginning of the optimization are removed from the graph and not considered during optimization. This speeds up computation. Set to -1 to disable.

  • n_smallest (int) – The number of smallest distances to use when computing the evaluation for each node.

  • concatenation_function (callable) – A custom function to use when computing the evaluation for each node. This function should take the state array as first argument and may take any additional arguments. These additional arguments must be passed as keyword arguments to the environment during setup. The function must return a float.

  • bounds (tuple) – The bounds for the minimal and maximal rotation angles.

  • n_processes (int) – The number of processes to use for parallel computation during edge mask generation.

concatenation_function(x)[source]#

is_done(state)[source]#

Check whether the environment is done

Parameters:

state (np.ndarray) – The state of the environment

Returns:

Whether the environment is done

Return type:

bool

buildamol.optimizers.distance_rotatron.concatenation_function_linear(x, unfold, pushback, n_smallest, clash_distance)[source]#

A concatenation function that computes the evaluation as:

Mean distance * unfold + (mean of n smallest distances) * pushback

buildamol.optimizers.distance_rotatron.concatenation_function_no_pushback(x, unfold, pushback, n_smallest, clash_distance)[source]#

A concatenation function that computes the evaluation as:

Mean distance ** unfold

buildamol.optimizers.distance_rotatron.concatenation_function_no_unfold(x, unfold, pushback, n_smallest, clash_distance)[source]#

A concatenation function that computes the evaluation as:

(Mean of n smallest distances) ** pushback

buildamol.optimizers.distance_rotatron.concatenation_function_with_penalty(x, unfold, pushback, n_smallest, clash_distance)[source]#

A concatenation function that computes the evaluation as:

[(Mean distance ** unfold + (mean of n smallest distances) ** pushback)] / clash penalty

buildamol.optimizers.distance_rotatron.simple_concatenation_function(x, unfold, pushback, n_smallest, clash_distance)[source]#

A simple concatenation function that computes the evaluation as:

Mean distance ** unfold + (mean of n smallest distances) ** pushback
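To make the difference between the exponent-based and the linear variants concrete, here is a minimal sketch of the two formulas. It assumes the input is simply the list of distances from one node to all others; the actual layout of x in buildamol is not specified here, the real functions also take clash_distance, and the names simple_variant and linear_variant are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def simple_variant(dists, unfold, pushback, n_smallest):
    """Mean distance ** unfold + (mean of n smallest distances) ** pushback."""
    closest = sorted(dists)[:n_smallest]
    return _mean(dists) ** unfold + _mean(closest) ** pushback


def linear_variant(dists, unfold, pushback, n_smallest):
    """Mean distance * unfold + (mean of n smallest distances) * pushback."""
    closest = sorted(dists)[:n_smallest]
    return _mean(dists) * unfold + _mean(closest) * pushback


dists = [1.0, 2.0, 3.0, 6.0]  # mean is 3.0, mean of the two smallest is 1.5
print(simple_variant(dists, unfold=2, pushback=3, n_smallest=2))  # 3.0**2 + 1.5**3
print(linear_variant(dists, unfold=2, pushback=3, n_smallest=2))  # 3.0*2 + 1.5*3
```

With exponents, large distances dominate the score much more strongly than in the linear form, which is one reason to pick one variant over the other.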

The OverlapRotatron is an environment that approximates molecular graphs using multivariate Gaussian distributions. The overlap between the distributions is used as the evaluation function for the environment. Hence, this environment tries to minimize the overlap between distributions in order to find favorable conformations.

As a measure for the overlap between two distributions, the Jensen-Shannon divergence is used by default. Custom overlap functions can be passed to the environment.
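To give a feeling for how the Jensen-Shannon divergence behaves, the following sketch estimates it numerically for two one-dimensional Gaussians. This is a simplification for illustration only: buildamol works with three-dimensional multivariate normals, and how it converts the divergence into an overlap score is not shown here. The function names and the integration grid are assumptions.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def js_divergence_1d(mean1, std1, mean2, std2, lo=-30.0, hi=30.0, steps=6000):
    """Midpoint-rule estimate of the Jensen-Shannon divergence in nats."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        p = gaussian_pdf(x, mean1, std1)
        q = gaussian_pdf(x, mean2, std2)
        m = 0.5 * (p + q)  # mixture density
        if m == 0.0:
            continue  # both densities underflowed, nothing to add
        if p > 0.0:
            total += 0.5 * p * math.log(p / m) * dx
        if q > 0.0:
            total += 0.5 * q * math.log(q / m) * dx
    return total


identical = js_divergence_1d(0.0, 1.0, 0.0, 1.0)
near = js_divergence_1d(0.0, 1.0, 1.0, 1.0)
far = js_divergence_1d(0.0, 1.0, 10.0, 1.0)
print(identical, near, far)
```

The divergence is zero for identical distributions and approaches ln 2 as they separate, so a large divergence corresponds to a small overlap.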

buildamol.optimizers.overlap_rotatron.MVN(points, spread: float = 1.0)[source]#

Compute a multi-variate normal distribution for a given set of points.

Parameters:

points (np.ndarray) – The points to compute the mean and covariance matrix for.

Returns:

mvn – The multi-variate normal distribution for the points.

Return type:

scipy.stats.multivariate_normal

class buildamol.optimizers.overlap_rotatron.OverlapRotatron(graph: BaseGraph, rotatable_edges: list = None, artificial_spread: float = 2.0, clash_distance: float = 1.2, crop_nodes_further_than: float = -1, distance_function: callable = None, ignore_further_than: float = -1, n_processes: int = 1, bounds: tuple = (-3.141592653589793, 3.141592653589793), **kwargs)[source]#

Bases: Rotatron

A distribution overlap-based Rotatron environment.

Parameters:
  • graph (AtomGraph or ResidueGraph) – The graph to optimize

  • rotatable_edges (list) – A list of edges that can be rotated during optimization. If None, all non-locked edges are used.

  • artificial_spread (float) – The spread to use for the multi-variate normal distributions. This is used to artificially increase the spread of the distributions. This is useful for cases where the distributions may be too tight and far apart which makes it difficult for the overlap to be computed.

  • clash_distance (float) – The distance at which two atoms are considered to be clashing.

  • crop_nodes_further_than (float) – If greater than 0, crop nodes that are further than this distance from the rotatable edges so that they are not considered in the overlap calculation.

  • distance_function (callable) – A specific distance function to use for calculating the overlap. This function should take two arrays of shape (1, 3) (centers) and two arrays of shape (3, 3) (covariances) and return a scalar.

  • ignore_further_than (float) – If greater than 0, centroids that are further than this distance from each other are evaluated as 0 overlap automatically.

  • n_processes (int) – The number of parallel processes to use when computing edge masks.

  • bounds (tuple) – The bounds for the minimal and maximal rotation angles.

eval(state)[source]#

Calculate the evaluation score for a given state

Parameters:

state (np.ndarray) – The state of the environment

Returns:

The evaluation for the state

Return type:

float

buildamol.optimizers.overlap_rotatron.jensen_shannon_overlap(mvn1, mvn2)[source]#

Compute the overlap between two gaussians using the Jensen-Shannon divergence.

Parameters:
  • mvn1 (scipy.stats.multivariate_normal) – The first of the two gaussians to compute the overlap for.

  • mvn2 (scipy.stats.multivariate_normal) – The second of the two gaussians to compute the overlap for.

Returns:

overlap – The overlap between the two gaussians.

Return type:

float

The ForceFieldRotatron is a rotatron that uses RDKit’s MMFF94 force field to evaluate a given state. Consequently, this environment can only function if RDKit is installed.

Note

Because this environment uses an actual energy function to evaluate states, this environment performs very poorly with ResidueGraph inputs! ResidueGraphs are abstractions without a valid chemical structure. Consequently, even though this environment can be used with ResidueGraphs, it is not recommended.

class buildamol.optimizers.forcefield_rotatron.ForceFieldRotatron(graph: BaseGraph, rotatable_edges: list = None, clash_distance: float = 0.9, crop_nodes_further_than: float = -1, mmff_variant: str = 'mmff94', n_processes: int = 1, bounds: tuple = (-3.141592653589793, 3.141592653589793), **kwargs)[source]#

Bases: Rotatron

A force field based rotatron. This rotatron uses RDKit’s MMFF94 force field to evaluate the energy of a given state.

Parameters:
  • graph (AtomGraph) – The graph to optimize

  • rotatable_edges (list) – A list of edges that can be rotated during optimization. If None, all non-locked edges are used.

  • clash_distance (float) – The distance at which two atoms are considered to be clashing.

  • crop_nodes_further_than (float) – If greater than 0, crop nodes that are further than this distance from the rotatable edges so that they are not considered in the evaluation.

  • mmff_variant (str) – The MMFF variant to use. Can be one of “mmff94”, “mmff94s”, “uff”, “mmff94splus”

  • n_processes (int) – The number of processes to use for parallelization when computing edge masks

  • bounds (tuple) – The bounds for the minimal and maximal rotation angles.

  • kwargs – Additional keyword arguments to pass to the Rotatron

energy()[source]#

Calculate the energy of a given state

Parameters:

state (np.ndarray) – The state of the environment

Returns:

The energy for the state

Return type:

float

eval(state)[source]#

Calculate the evaluation score for a given state

Parameters:

state (np.ndarray) – The state of the environment

Returns:

The evaluation for the state

Return type:

float

The Circulatron class to circularize a molecule

class buildamol.optimizers.circulatron.Circulatron(graph, target_nodes: tuple, base_rotatron: Rotatron = DistanceRotatron, hinge_node=None, rotatable_edges: list = None, **kwargs)[source]#

Bases: Rotatron

A special rotatron to circularize a molecule. This rotatron works by minimizing the distance between two target nodes in the graph in order to superimpose them. This rotatron does not itself optimize the conformation of the resulting structure but instead uses one of the other rotatrons to do so.

Note

In order for this environment to work, it is important that the graph is NOT already circularized!

Parameters:
  • graph (AtomGraph or ResidueGraph) – The graph to circularize

  • target_nodes (tuple) – Two nodes in the graph that should be superimposed.

  • base_rotatron (Rotatron) – The rotatron class to use for basic conformer evaluation. By default, this is the DistanceRotatron.

  • rotatable_edges (list) – A list of edges that are rotatable in the graph. If not given, all rotatable edges will be considered.

  • **kwargs – Additional keyword arguments to pass to the base rotatron

done(state)[source]#

Check if the state is done

Parameters:

state (dict) – The state to check

Returns:

Whether the state is done

Return type:

bool

eval(state)[source]#

Evaluate the state using the base rotatron

Parameters:

state (dict) – The state to evaluate

Returns:

The energy of the state

Return type:

float

The ConstraintRotatron allows for the optimization of a molecule’s conformation while also accepting an additional constraint function that will also contribute to the evaluation.

class buildamol.optimizers.constraint_rotatron.ConstraintRotatron(rotatron: Rotatron, constraint: callable, finisher: callable = None, **kwargs)[source]#

Bases: Rotatron

The ConstraintRotatron is a Meta-Rotatron environment that uses one of the other Rotatron environments to optimize the conformation of a molecule while also accepting an additional freely definable constraint function that will also contribute to the evaluation.

Parameters:
  • rotatron (Rotatron) – The rotatron object to use for basic conformer evaluation. This needs to be already set up and ready to use.

  • constraint (callable) – A function that evaluates additional constraints and returns a scalar value that will be added to the evaluation. This function will receive both the base-rotatron object and the current state to evaluate as arguments, and can receive any additional arguments that are passed as keyword arguments during initialization of the ConstraintRotatron.

  • finisher (callable) – The function that evaluates if the constraints are met and returns a boolean. This function will receive the base-rotatron object as first argument and the current state as second argument. Also, all kwargs that are passed during initialization will be passed to this function (just like with the constraint function).

  • **kwargs – Additional keyword arguments to pass to the constraint function.
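The calling convention described above can be sketched without buildamol installed. In this minimal example, ToyRotatron is a hypothetical stand-in for an already set-up Rotatron, and constrained_eval is an assumed helper that mirrors "base evaluation plus constraint value". Neither is part of the library. Both callables accept the same keyword arguments, because the same kwargs are forwarded to each of them.

```python
import math


class ToyRotatron:
    """Hypothetical stand-in for a set-up Rotatron: lowest evaluation at angle 0."""

    def eval(self, state):
        return sum(1.0 - math.cos(angle) for angle in state)


def constraint(rotatron, state, target=0.0, weight=1.0):
    """Penalty that grows as the first angle deviates from `target`."""
    return weight * (state[0] - target) ** 2


def finisher(rotatron, state, target=0.0, weight=1.0, tolerance=0.05):
    """True once the first angle is within `tolerance` of `target`."""
    return abs(state[0] - target) < tolerance


def constrained_eval(rotatron, state, constraint, **kwargs):
    # assumed combination rule: the constraint value is added to the base evaluation
    return rotatron.eval(state) + constraint(rotatron, state, **kwargs)


base = ToyRotatron()
score = constrained_eval(base, [0.5, 0.0], constraint, target=1.0, weight=2.0)
print(score)
print(finisher(base, [0.98, 0.0], target=1.0, weight=2.0))
```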

done(state)[source]#

Check if the state is done.

Parameters:

state (np.ndarray) – The state to check.

Returns:

Whether the state is done.

Return type:

bool

eval(state)[source]#

Evaluate the state of the rotatron.

Parameters:

state (np.ndarray) – The state of the rotatron.

Returns:

The evaluation of the state.

Return type:

float

reset()[source]#

Reset the rotatron to its initial state.

step(action)[source]#

Perform a step in the rotatron.

Parameters:

action (np.ndarray) – The action to perform.

Returns:

  • np.ndarray – The new state of the rotatron.

  • float – The evaluation of the new state.

  • bool – Whether the rotatron is done.

  • dict – Additional information.

This is the basic Rotatron environment. It provides the basic functionality for preprocessing a graph into numpy arrays, masking rotatable edges, and evaluating a possible solution. All other Rotatron environments inherit from this class.
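The idea of masking a rotatable edge and rotating only the nodes behind it can be sketched as follows. This is not buildamol's code: the mask layout, the function names, and the use of Rodrigues' rotation formula are assumptions made for illustration, and plain lists stand in for the numpy arrays the environment uses.

```python
import math


def rotate_about_axis(point, origin, axis, angle):
    """Rodrigues rotation of `point` about the line through `origin` along `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]
    v = [p - o for p, o in zip(point, origin)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cross = [k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0]]
    dot = sum(kc * vc for kc, vc in zip(k, v))
    rotated = [v[i] * cos_a + cross[i] * sin_a + k[i] * dot * (1.0 - cos_a)
               for i in range(3)]
    return [r + o for r, o in zip(rotated, origin)]


def rotate_edge(coords, edge, mask, angle):
    """Rotate every masked node around the bond edge = (a, b)."""
    a, b = edge
    axis = [cb - ca for ca, cb in zip(coords[a], coords[b])]
    return [rotate_about_axis(p, coords[b], axis, angle) if moved else list(p)
            for p, moved in zip(coords, mask)]


# a four-node chain 0-1-2-3; rotating around edge (1, 2) only moves node 3
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 1.0, 0.0]]
mask = [False, False, False, True]
new = rotate_edge(coords, (1, 2), mask, math.pi / 2)
print(new[3])
```

Because only a rotation around the bond axis is applied, bond lengths stay unchanged, which is the property that distinguishes torsional optimization from moving atoms freely.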

class buildamol.optimizers.base_rotatron.Rotatron(graph: BaseGraph, rotatable_edges: list = None, n_processes: int = 1, setup: bool = True, numba: bool = False, **kwargs)[source]#

Bases: Env

The base class for rotational optimization environments.

Parameters:
  • graph (AtomGraph or ResidueGraph) – The graph to optimize

  • rotatable_edges (list) – A list of edges that can be rotated during optimization. If None, all non-locked edges are used.

  • n_processes (int) – The number of processes to use to speed up the computation of edge masks and lengths

  • setup (bool) – Whether to set up the edge masks and lengths during initialization

  • numba (bool) – Whether to use numba to speed up the rotation function.

blank()[source]#

A blank action

copy()[source]#

Make a deep copy of the environment

eval(state)[source]#

Calculate the evaluation score for a given state

Parameters:

state (np.ndarray) – The state of the environment

Returns:

The evaluation for the state

Return type:

float

is_done(state)[source]#

Check whether the environment is done

Parameters:

state (np.ndarray) – The state of the environment

Returns:

Whether the environment is done

Return type:

bool

reset(*args, **kwargs)[source]#

Reset the environment

step(action)[source]#

Take a step in the environment

Parameters:

action (np.ndarray) – The action to take

Returns:

  • np.ndarray – The new state of the environment

  • float – The evaluation for the new state

  • bool – Whether the environment is done

  • dict – Additional information
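The interface listed above can be exercised with a small stand-in class. ToyEnv below is hypothetical and only mimics the method names and the four-value return of step(); it assumes that the state is a vector of torsion angles, that actions are added to it, and that lower evaluations are better. None of these details are confirmed for the real Rotatron.

```python
import math
import random


class ToyEnv:
    """Hypothetical stand-in with the method names of the base environment."""

    def __init__(self, n_edges):
        self.n_edges = n_edges
        self.state = [1.0] * n_edges  # current torsion angles in radians

    def blank(self):
        return [0.0] * self.n_edges  # an action that rotates nothing

    def eval(self, state):
        return sum(1.0 - math.cos(a) for a in state)

    def is_done(self, state):
        return self.eval(state) < 1e-3

    def reset(self):
        self.state = [1.0] * self.n_edges
        return self.state

    def step(self, action):
        self.state = [s + a for s, a in zip(self.state, action)]
        return self.state, self.eval(self.state), self.is_done(self.state), {}


rng = random.Random(0)
env = ToyEnv(n_edges=3)
state = env.reset()
best = env.eval(state)
for _ in range(200):
    action = [rng.uniform(-0.2, 0.2) for _ in range(env.n_edges)]
    new_state, evaluation, done, info = env.step(action)
    if evaluation < best:
        best = evaluation
    else:
        env.step([-a for a in action])  # undo a step that did not help
    if done:
        break
print(best)
```

The optimization algorithms documented further down replace this naive random search with more effective strategies, but they interact with an environment through the same kind of calls.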

The Translatron#

The Translatron is a special environment that is used to optimize the spatial position of a molecule by optimizing its global translation along and rotation around the x, y and z axes. This is useful when you want to optimize the position of a molecule in 3D space.

The Translatron

This is the Translatron environment that can be used to place a molecule according to some constraints.

class buildamol.optimizers.translatron.Translatron(graph, constraint_func: callable, finish_func: callable = None, bounds: tuple = (-10, 10))[source]#

Bases: Env

The Translatron environment can be used to place a molecule in space according to some constraints. It will produce a vector of 6 values, the first 3 are the translation in x, y, and z, and the last 3 are the rotations around the x, y, and z axes.
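How such a vector of six values could act on a set of coordinates is sketched below. The order of operations (rotate around x, then y, then z, then translate) and the rotation center (the coordinate origin) are assumptions for illustration, since they are not stated here, and apply_placement is a hypothetical name.

```python
import math


def apply_placement(coords, vector):
    """Apply (tx, ty, tz, rx, ry, rz): rotate around x, y, z, then translate."""
    tx, ty, tz, rx, ry, rz = vector
    placed = []
    for x, y, z in coords:
        # rotation around the x axis
        y, z = y * math.cos(rx) - z * math.sin(rx), y * math.sin(rx) + z * math.cos(rx)
        # rotation around the y axis
        x, z = x * math.cos(ry) + z * math.sin(ry), -x * math.sin(ry) + z * math.cos(ry)
        # rotation around the z axis
        x, y = x * math.cos(rz) - y * math.sin(rz), x * math.sin(rz) + y * math.cos(rz)
        placed.append((x + tx, y + ty, z + tz))
    return placed


moved = apply_placement([(1.0, 0.0, 0.0)], (1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2))
print(moved[0])
```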

Parameters:
  • graph (AtomGraph or ResidueGraph) – The graph of the molecule to be placed.

  • constraint_func (callable) – A function that takes the environment and the new coordinates and returns a reward value.

  • finish_func (callable) – A function that takes the environment and the new coordinates and returns a boolean value indicating whether the optimization is finished.

  • bounds (tuple) – The bounds for the minimal and maximal translation and rotation values. If a tuple of length two, this is interpreted as the low and high bounds for translation only. Otherwise, provide a tuple of length 6 for the bounds of translation and rotation. In this case values can be either singular (int/float), in which case they are interpreted as symmetric extrema (min=-value, max=+value), or tuples with (min=value[0], max=value[1]). Mixed inputs are allowed.
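The accepted bounds formats can be illustrated with a small normalization helper. This is a sketch of the rules described above, not the library's parser: normalize_bounds is a hypothetical name, and the rotation range used when only translation bounds are given (-pi to pi) is an assumption.

```python
import math


def normalize_bounds(bounds, default_rotation=(-math.pi, math.pi)):
    """Return six (low, high) pairs: x, y, z translation, then x, y, z rotation."""
    if len(bounds) == 2:
        low, high = bounds
        # length two: translation bounds only, rotation range is an assumed default
        return [(low, high)] * 3 + [default_rotation] * 3
    if len(bounds) != 6:
        raise ValueError("bounds must have length 2 or 6")
    pairs = []
    for value in bounds:
        if isinstance(value, (int, float)):
            pairs.append((-abs(value), abs(value)))  # symmetric extrema
        else:
            low, high = value
            pairs.append((low, high))
    return pairs


print(normalize_bounds((-10, 10)))
print(normalize_bounds((5, (-1, 2), 3, 1.0, (0, 0.5), 2)))
```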

blank()[source]#

eval(state)[source]#

reset(*args, **kwargs)[source]#

Resets the environment to an initial state and returns the initial observation.

This method can reset the environment’s random number generator(s) if seed is an integer or if the environment has not yet initialized a random number generator. If the environment already has a random number generator and reset() is called with seed=None, the RNG should not be reset. Moreover, reset() should (in the typical use case) be called with an integer seed right after initialization and then never again.

Parameters:
  • seed (optional int) – The seed that is used to initialize the environment’s PRNG. If the environment does not already have a PRNG and seed=None (the default option) is passed, a seed will be chosen from some source of entropy (e.g. timestamp or /dev/urandom). However, if the environment already has a PRNG and seed=None is passed, the PRNG will not be reset. If you pass an integer, the PRNG will be reset even if it already exists. Usually, you want to pass an integer right after the environment has been initialized and then never again. Please refer to the minimal example above to see this paradigm in action.

  • options (optional dict) – Additional information to specify how the environment is reset (optional, depending on the specific environment)

Returns:

  • observation (object) – Observation of the initial state. This will be an element of observation_space (typically a numpy array) and is analogous to the observation returned by step().

  • info (dictionary) – This dictionary contains auxiliary information complementing observation. It should be analogous to the info returned by step().

step(action)[source]#

Run one timestep of the environment’s dynamics.

When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, terminated, truncated, info).

Parameters:

action (ActType) – an action provided by the agent

Returns:

  • observation (object) – This will be an element of the environment’s observation_space. This may, for instance, be a numpy array containing the positions and velocities of certain objects.

  • reward (float) – The amount of reward returned as a result of taking the action.

  • terminated (bool) – Whether a terminal state (as defined under the MDP of the task) is reached. In this case further step() calls could return undefined results.

  • truncated (bool) – Whether a truncation condition outside the scope of the MDP is satisfied. Typically a timelimit, but could also be used to indicate agent physically going out of bounds. Can be used to end the episode prematurely before a terminal state is reached.

  • info (dictionary) – info contains auxiliary diagnostic information (helpful for debugging, learning, and logging). This might, for instance, contain: metrics that describe the agent’s performance state, variables that are hidden from observations, or individual reward terms that are combined to produce the total reward. It also can contain information that distinguishes truncation and termination, however this is deprecated in favour of returning two booleans, and will be removed in a future version.

  • done (bool, deprecated) – A boolean value for if the episode has ended, in which case further step() calls will return undefined results. A done signal may be emitted for different reasons: Maybe the task underlying the environment was solved successfully, a certain timelimit was exceeded, or the physics simulation has entered an invalid state.

Optimization algorithms#

The Rotatron environments are used to specify the problems to solve. The optimization algorithms are used to solve them. Buildamol implements a number of classical optimization algorithms that are tailored to work with the Rotatron environments. These are:

Particle Swarm Optimization

The Particle Swarm Optimization algorithm is a classical optimization algorithm that is based on the behavior of a swarm of particles. Each particle has a position and a velocity. The position is the current solution to the problem, and the velocity is the direction in which the particle is moving. The particles are attracted to the best solution found so far, and repelled by the worst solution found so far. This way, the particles move towards the best solution found so far and are less likely to get stuck in local minima.

The algorithm performs well with both small and large inputs, both with AtomGraphs and ResidueGraphs. It is also often the fastest to compute, so it is the default algorithm.
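As a rough illustration of how the inertia (w), cognitive (c1), social (c2) and cooldown parameters interact, here is a textbook particle swarm sketch. It is not buildamol's swarm_optimize: it omits the repulsion from the worst solution mentioned above, it minimizes a toy function of torsion angles instead of a Rotatron evaluation, and all names are hypothetical.

```python
import math
import random


def swarm_minimize(evaluate, n_dims, n_particles=20, max_steps=50,
                   w=0.9, c1=0.5, c2=0.3, cooldown_rate=0.99,
                   bounds=(-math.pi, math.pi), seed=0):
    rng = random.Random(seed)
    low, high = bounds
    positions = [[rng.uniform(low, high) for _ in range(n_dims)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_dims for _ in range(n_particles)]
    personal_best = [list(p) for p in positions]
    personal_eval = [evaluate(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_eval.__getitem__)
    global_best, global_eval = list(personal_best[best_idx]), personal_eval[best_idx]
    for _ in range(max_steps):
        for i in range(n_particles):
            for d in range(n_dims):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    w * velocities[i][d]                                    # inertia
                    + c1 * r1 * (personal_best[i][d] - positions[i][d])     # cognitive pull
                    + c2 * r2 * (global_best[d] - positions[i][d])          # social pull
                )
                positions[i][d] = min(high, max(low, positions[i][d] + velocities[i][d]))
            score = evaluate(positions[i])
            if score < personal_eval[i]:
                personal_best[i], personal_eval[i] = list(positions[i]), score
                if score < global_eval:
                    global_best, global_eval = list(positions[i]), score
        w *= cooldown_rate  # reduce the inertia every generation
    return global_best, global_eval


def toy_eval(angles):
    """Toy landscape with its minimum at 0.5 rad for every angle."""
    return sum(1.0 - math.cos(a - 0.5) for a in angles)


solution, evaluation = swarm_minimize(toy_eval, n_dims=3)
print(solution, evaluation)
```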

buildamol.optimizers.algorithms.swarm_optimize(env, n_particles: int = None, max_steps: int = 30, stop_if_done: bool = True, threshold: float = 1e-06, w: float = 0.9, c1: float = 0.5, c2: float = 0.3, cooldown_rate: float = 0.99, n_best: int = 1, numba: bool = False)[source]#

Optimize a rotatron environment through a simple particle swarm optimization.

Parameters:
  • env (buildamol.optimizers.environments.Rotatron) – The environment to optimize

  • n_particles (int, optional) – The number of particles to use. Set this to None in order to compute the number of particles based on the number of rotatable edges in the environment.

  • max_steps (int, optional) – The maximum number of steps to take.

  • stop_if_done (bool, optional) – Stop the optimization if the environment signals it is done or the solutions have converged.

  • threshold (float, optional) – A threshold to use for convergence of the best solution found. The algorithm will stop if the variation of the best solution evaluation history is less than this threshold.

  • w (float, optional) – The inertia parameter for the particle swarm optimization.

  • c1 (float, optional) – The cognitive parameter for the particle swarm optimization.

  • c2 (float, optional) – The social parameter for the particle swarm optimization.

  • cooldown_rate (float, optional) – The rate at which the inertia parameter is reduced. The inertia parameter is reduced by this factor every generation. E.g. 0.95 will reduce the inertia parameter by 5% every generation.

  • n_best (int, optional) – The number of best solutions to return at the end of the optimization.

  • numba (bool, optional) – Use numba for the optimization. This may speed up the optimization if you are going to optimize many molecules.

Returns:

The solution and evaluation for the solution

Return type:

solution, evaluation

Genetic Algorithm

The Genetic Algorithm is one of the most iconic optimization algorithms. It is based on the behavior of a population of individuals. Each individual has a “genome”, which is the current solution to the problem. Each generation (optimization round) individuals are mutated (randomly change their solution), and the best individuals reproduce and make it to the next round. This way, the population will move gradually towards good solutions.

The algorithm performs well on any scale but gets exceedingly slower the larger the molecules become. Also, it works slightly better with AtomGraphs than with ResidueGraphs.

buildamol.optimizers.algorithms.genetic_optimize(env, max_generations: int = 500.0, stop_if_done: bool = True, threshold: float = 1e-06, variation: float = 0.2, population_size: int = 50, parents: int | float = 0.25, children: int | float = 0.3, mutants: int | float = 0.3, newcomers: int | float = 0.15, variation_cooldown: float = 1, n_best: int = 1, numba: bool = False)[source]#

A simple genetic algorithm for optimizing a Rotatron environment.

Parameters:
  • env (buildamol.optimizers.environments.Rotatron) – The environment to optimize

  • max_generations (int, optional) – The maximum number of steps to take.

  • stop_if_done (bool, optional) – Stop the optimization if the environment signals it is done or the solutions have converged.

  • threshold (float, optional) – A threshold to use for convergence of the best solution found. The algorithm will stop if the variation of the best solution evaluation history is less than this threshold.

  • variation (float, optional) – The variation to use for the initial action.

  • population_size (int, optional) – The size of the population.

  • parents (int or float, optional) – The number or fraction of parents (elites) to select. The parents are selected from the best solutions. Parents produce offspring and pass to the next generation.

  • children (int or float, optional) – The number or fraction of children to generate from the parents. Children are generated by averaging the parents and adding some noise.

  • mutants (int or float, optional) – The number or fraction of mutants to generate. Mutants are generated by adding noise to parents, thereby generating aberrant clones.

  • newcomers (int or float, optional) – Newcomers are entirely new solution candidates.

  • variation_cooldown (float, optional) – The rate at which the variation is reduced. The variation is reduced by this factor every generation. E.g. 0.95 will reduce the variation by 5% every generation.

  • n_best (int, optional) – The number of best solutions to return at the end of the optimization.

  • numba (bool, optional) – Use numba for the optimization. This may speed up the optimization if you are going to optimize many molecules.

Returns:

The angle(s) and evaluation(s) of the best solution(s) found.

Return type:

solution, evaluation
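The roles of parents, children, mutants and newcomers described in the parameters above can be sketched as a compact generation loop. This is an assumed reading of those descriptions, not buildamol's genetic_optimize: the fractions are interpreted relative to the population size, newcomers fill whatever remains, and all names and the toy landscape are hypothetical.

```python
import math
import random


def genetic_minimize(evaluate, n_dims, population_size=20, max_generations=40,
                     parents=0.25, children=0.3, mutants=0.3, variation=0.2,
                     variation_cooldown=1.0, bounds=(-math.pi, math.pi), seed=0):
    rng = random.Random(seed)
    low, high = bounds
    n_parents = max(2, int(parents * population_size))
    n_children = int(children * population_size)
    n_mutants = int(mutants * population_size)
    n_newcomers = max(0, population_size - n_parents - n_children - n_mutants)

    def newcomer():
        return [rng.uniform(low, high) for _ in range(n_dims)]

    def clip(genome):
        return [min(high, max(low, g)) for g in genome]

    population = [newcomer() for _ in range(population_size)]
    for _ in range(max_generations):
        elite = sorted(population, key=evaluate)[:n_parents]
        next_gen = [list(p) for p in elite]  # parents pass on unchanged
        for _ in range(n_children):  # children: average of two parents plus noise
            a, b = rng.sample(elite, 2)
            next_gen.append(clip([(x + y) / 2 + rng.gauss(0.0, variation)
                                  for x, y in zip(a, b)]))
        for _ in range(n_mutants):  # mutants: a noisy copy of one parent
            parent = rng.choice(elite)
            next_gen.append(clip([x + rng.gauss(0.0, variation) for x in parent]))
        next_gen.extend(newcomer() for _ in range(n_newcomers))
        population = next_gen
        variation *= variation_cooldown
    best = min(population, key=evaluate)
    return best, evaluate(best)


def toy_eval(angles):
    """Toy landscape with its minimum at 0.5 rad for every angle."""
    return sum(1.0 - math.cos(a - 0.5) for a in angles)


solution, evaluation = genetic_minimize(toy_eval, n_dims=3)
print(solution, evaluation)
```

Because the parents are carried over unchanged, the best evaluation in the population can never get worse from one generation to the next in this sketch.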

Simulated Annealing

Simulated Annealing is another optimization algorithm that has similarities to both genetic and particle swarm optimization. It explores solutions by randomly changing the current one, and accepts or rejects the new solution based on the change in energy. The algorithm is based on the annealing process in metallurgy, where a metal is heated and then slowly cooled down. This way, the metal will settle in a more stable state.

The algorithm performs better with smaller inputs but is also suitable for larger ones, using both AtomGraphs and ResidueGraphs.
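The accept-or-reject rule at the heart of simulated annealing can be sketched with a single walker and the standard Metropolis criterion. This is a generic textbook version under stated assumptions, not buildamol's anneal_optimize, which works with several particles. The temperature parameter, the function names and the toy landscape are hypothetical.

```python
import math
import random


def anneal_minimize(evaluate, start, max_steps=500, variance=0.3,
                    temperature=1.0, cooldown_rate=0.98, seed=0):
    rng = random.Random(seed)
    current, current_eval = list(start), evaluate(start)
    best, best_eval = list(current), current_eval
    for _ in range(max_steps):
        candidate = [x + rng.gauss(0.0, variance) for x in current]
        candidate_eval = evaluate(candidate)
        delta = candidate_eval - current_eval
        # always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_eval = candidate, candidate_eval
            if current_eval < best_eval:
                best, best_eval = list(current), current_eval
        temperature = max(temperature * cooldown_rate, 1e-12)
    return best, best_eval


def toy_eval(angles):
    """Toy landscape with its minimum at 0.5 rad for every angle."""
    return sum(1.0 - math.cos(a - 0.5) for a in angles)


start = [2.0, -2.0, 1.0]
solution, evaluation = anneal_minimize(toy_eval, start)
print(solution, evaluation)
```

Early on, the high temperature lets the walker escape poor regions by accepting worse moves; as the temperature cools, the search becomes increasingly greedy.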

buildamol.optimizers.algorithms.anneal_optimize(env, n_particles: int = None, max_steps: int = 100, stop_if_done: bool = True, threshold: float = 1e-06, variance: float = 0.3, cooldown_rate: float = 0.98, n_best: int = 1, numba: bool = False)[source]#

Optimize a rotatron environment through a simple simulated annealing.

Parameters:
  • env (buildamol.optimizers.environments.Rotatron) – The environment to optimize

  • n_particles (int, optional) – The number of particles to use. Set to None in order to compute the number of particles based on the number of rotatable edges in the environment, where one particle is used per two rotatable edges.

  • max_steps (int, optional) – The maximum number of steps to take.

  • stop_if_done (bool, optional) – Stop the optimization if the environment signals it is done or the solutions have converged.

  • threshold (float, optional) – A threshold to use for convergence of the best solution found. The algorithm will stop if the variation of the best solution evaluation history is less than this threshold.

  • variance (float, optional) – The variation to use for updating particle positions.

  • n_best (int, optional) – The number of best solutions to return at the end of the optimization.

  • numba (bool, optional) – Use numba for the optimization. This may speed up the optimization if you are going to optimize many molecules.

Returns:

The solution and evaluation for the solution

Return type:

solution, evaluation

Gradient-based algorithms

We implement a direct link to scipy.optimize.minimize, which provides a number of gradient-based optimization algorithms. These algorithms are usually very fast and perform well on small inputs. However, as the evaluation landscapes of larger molecules tend to get “rugged”, gradient-based methods tend to struggle with larger inputs.

Any algorithm implemented by scipy.optimize.minimize can be used. The default is the L-BFGS-B algorithm. For a complete list of available algorithms, check out the scipy documentation.
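Why a rugged landscape is a problem for gradient-based methods can be seen in a one-dimensional toy case. The sketch below uses plain gradient descent with a finite-difference gradient rather than L-BFGS-B or scipy, purely to stay dependency-free, and the double-well function is an invented example, not a molecular evaluation.

```python
def f(x):
    """Double well: a deeper minimum near x = -1.04, a shallower one near x = 0.96."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient_descent(x, learning_rate=0.01, steps=2000, h=1e-6):
    for _ in range(steps):
        gradient = (f(x + h) - f(x - h)) / (2.0 * h)  # central finite difference
        x -= learning_rate * gradient
    return x


from_right = gradient_descent(1.5)
from_left = gradient_descent(-1.5)
print(from_right, f(from_right))
print(from_left, f(from_left))
```

Started on the right, the descent settles in the shallower well and never reaches the better solution on the left. With many rotatable bonds the number of such wells grows quickly, which is where the population-based algorithms above tend to be more robust.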

buildamol.optimizers.algorithms.scipy_optimize(env, steps: int = 100000.0, method: str = 'L-BFGS-B', **kws)[source]#

Optimize a Rotatron environment through a simple scipy optimization

Parameters:
  • env (buildamol.optimizers.environments.Rotatron) – The environment to optimize

  • steps (int, optional) – The number of steps to take.

  • method (str, optional) – The optimizer to use, by default “L-BFGS-B”. This can be any optimizer from scipy.optimize.minimize

  • kws (dict, optional) – Keyword arguments to pass as options to the optimizer

Returns:

The angle(s) and evaluation(s) of the best solution(s) found.

Return type:

solution, evaluation

Optimization utilities#

Buildamol also implements a number of utilities that can be used to make the optimization a little easier for the user by automating certain steps.

This module contains utility functions for the optimizers.

buildamol.optimizers.utils.apply_rotatron_solution(sol: ndarray, env: Rotatron.Rotatron, mol: Molecule.Molecule) Molecule.Molecule[source]#

Apply the solution of a Rotatron environment to a Molecule object.

Parameters:
  • sol (np.ndarray) – The solution of rotational angles in radians to apply

  • env (Rotatron) – The environment used to find the solution

  • mol (Molecule) – The molecule to apply the solution to

Returns:

The object with the solution applied

Return type:

obj

buildamol.optimizers.utils.apply_translatron_solution(sol: ndarray, env: Translatron.Translatron, mol: Molecule.Molecule)[source]#

Apply the solution of a Translatron environment to a Molecule object.

Parameters:
  • sol (np.ndarray) – The solution of translational vectors to apply

  • env (Translatron) – The environment used to find the solution

  • mol (Molecule) – The molecule to apply the solution to

buildamol.optimizers.utils.auto_algorithm(mol, env=None)[source]#

Decide which algorithm to use for a quick-optimize based on the molecule size.

buildamol.optimizers.utils.optimize(mol: Molecule.Molecule, env: Rotatron.Rotatron | Translatron.Translatron = None, algorithm: str | callable = None, **kwargs) Molecule.Molecule[source]#

Quickly optimize a molecule using a specific algorithm.

Note

This is a convenience function that will automatically create an environment and determine edges. However, that means that the environment will be created from scratch every time this function is called. Also, the environment will likely not be tailored to any specific requirements of the situation. For better performance and control, it is recommended to create an environment manually and supply it to the function using the env argument.

Parameters:
  • mol (Molecule) – The molecule to optimize. This molecule will be modified in-place.

  • env (Rotatron or Translatron, optional) – The environment to use. This needs to be a Rotatron instance or Translatron instance that is fully set up and ready to use.

  • algorithm (str or callable, optional) – The algorithm to use. If not provided, an algorithm is automatically determined, depending on the molecule size. If provided, this can be:

      - “genetic”: A genetic algorithm
      - “swarm”: A particle swarm optimization algorithm
      - “anneal”: A simulated annealing algorithm
      - “scipy”: A gradient descent algorithm (default scipy implementation, can be changed using a ‘method’ keyword argument)
      - “rdkit”: A force field based optimization using RDKit (if installed)
      - or some other callable that takes an environment as its first argument

  • **kwargs – Additional keyword arguments to pass to the algorithm

Returns:

The optimized molecule

Return type:

Molecule

buildamol.optimizers.utils.parallel_optimize(mol: Molecule, envs: List[Rotatron], algorithm: str | callable = None, n_processes: int = None, unify_final: bool = True, **kwargs) Molecule.Molecule[source]#

Optimize a molecule using multiple sub-environments in parallel.

Parameters:
  • mol (Molecule) – The molecule to optimize. This molecule will be modified in-place.

  • envs (list) – The sub-environments to optimize

  • algorithm (str or callable, optional) – The algorithm to use. If not provided, an algorithm is automatically determined, depending on the molecule size. If provided, this can be:

      - “genetic”: A genetic algorithm
      - “swarm”: A particle swarm optimization algorithm
      - “anneal”: A simulated annealing algorithm
      - “scipy”: A gradient descent algorithm (default scipy implementation, can be changed using a ‘method’ keyword argument)
      - or some other callable that takes an environment as its first argument

  • n_processes (int, optional) – The number of processes to use. If not provided, the number of processes is automatically determined.

  • unify_final (bool, optional) – If True, the solutions to all sub-environments are applied onto the same final molecule. If False, a list of molecules is returned each with the solution of one sub-environment applied.

  • **kwargs – Additional keyword arguments to pass to the algorithm

Returns:

The optimized molecule

Return type:

Molecule

buildamol.optimizers.utils.split_environment(env: Rotatron, n: int = None) List[Rotatron][source]#

Split an environment into n sub-environments which are smaller and thus easier to optimize.

Parameters:
  • env (Rotatron) – The environment to split

  • n (int) – The number of sub-environments to create

Returns:

A list of sub-environments

Return type:

list