The graphs package#

This module defines classes that represent PDB structures as graph objects, which store the connectivity information of a molecule. Two graphs are available: the AtomGraph, an atomic-scale representation of a molecule, and the ResidueGraph, which represents each residue as a single node and thereby abstracts away the atomic details.

The graphs contain many of the analytical methods used for analyzing and manipulating molecular structures, as long as these do not involve adding or removing atoms.

The graphs also serve an important function in biobuild’s conformational optimization process as they are the main data structure to which the optimization algorithms are applied.

The BaseGraph module#

The BaseGraph is the base class of both the AtomGraph and the ResidueGraph.

The BaseGraph class
class biobuild.graphs.BaseGraph.BaseGraph(id, bonds: list)[source]#

Bases: Graph

The basic class for molecular graphs

property atoms#

Returns the atoms in the molecule

property bonds#

Returns the bonds in the molecule

property central_node#

Returns the most central node of the graph. This is computed based on the mean of all node coordinates.
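The description above only states that the central node is derived from the mean of all node coordinates. One plausible reading, sketched here in plain Python, is to pick the node that lies closest to that mean. The function name and the `coords` mapping are hypothetical and not part of biobuild.

```python
import math


def central_node(coords):
    """Return the node whose coordinates lie closest to the mean position.

    `coords` maps node -> (x, y, z). This is an assumed reading of
    "computed based on the mean of all node coordinates".
    """
    n = len(coords)
    mean = tuple(sum(xyz[i] for xyz in coords.values()) / n for i in range(3))
    # Pick the node with the smallest Euclidean distance to the mean position.
    return min(coords, key=lambda node: math.dist(coords[node], mean))


coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (1.0, 0.0, 0.0),
    "C": (2.0, 0.0, 0.0),
    "D": (3.0, 0.0, 0.0),
    "E": (4.0, 0.0, 0.0),
}
print(central_node(coords))  # C
```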

property chains#

Returns the chains in the molecule

direct_edges(root_node=None, edges: list = None) → list[source]#

Sort the edges such that the first node in each edge is the one closer to the root node. If no root node is provided, the central node is used.

Parameters:
  • root_node – The root node to use for sorting the edges. If not provided, the central node is used.

  • edges (list, optional) – The edges to sort, by default None, in which case all edges are sorted.

Returns:

The sorted edges

Return type:

list
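As an illustration of what "directing" edges away from a root means, the following sketch orients each edge so that the node nearer to the root comes first, using breadth-first distances. It is a stand-alone approximation under the assumption of a connected graph, not the biobuild implementation, and it takes the root explicitly instead of falling back to the central node.

```python
from collections import deque


def direct_edges(edges, root):
    """Orient each edge so its first node is the one closer to `root`.

    Assumes a connected, undirected graph given as a list of node pairs.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search yields the number of edges between root and each node.
    distance = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    return [(a, b) if distance[a] <= distance[b] else (b, a) for a, b in edges]


edges = [("B", "A"), ("C", "B"), ("C", "D"), ("E", "D"),
         ("F", "B"), ("H", "F"), ("F", "G")]
directed = direct_edges(edges, root="A")
print(directed)
```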

draw()[source]#

Prepare a 3D view of the graph but do not show it yet

Returns:

A 3D viewer

Return type:

PlotlyViewer3D

find_rotatable_edges(root_node=None, min_descendants: int = 1, min_ancestors: int = 1, max_descendants: int = None, max_ancestors: int = None)[source]#

Find all edges in the graph that are rotatable (i.e. not locked and not in a circular constellation). You can also filter and direct the edges.

Parameters:
  • root_node – A root node by which to direct the edges (closer to further).

  • min_descendants (int, optional) – The minimum number of descendants that an edge must have to be considered rotatable.

  • min_ancestors (int, optional) – The minimum number of ancestors that an edge must have to be considered rotatable.

  • max_descendants (int, optional) – The maximum number of descendants that an edge may have to be considered rotatable.

  • max_ancestors (int, optional) – The maximum number of ancestors that an edge may have to be considered rotatable.

Returns:

A list of rotatable edges

Return type:

list
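The filtering rules described above can be sketched in plain Python: an edge qualifies if it is not locked, is not part of a cycle, and has descendant and ancestor counts within the given bounds. This is an illustrative approximation, not the library code. It assumes that each edge is already directed as listed (no `root_node` handling) and that the counts refer to the nodes on either side of the edge.

```python
def _side(adjacency, start, other):
    """Nodes reachable from `start` without crossing the edge start-other directly."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if node == start and neighbor == other:
                continue  # never step across the edge under inspection
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def find_rotatable_edges(edges, locked=(), min_descendants=1, min_ancestors=1,
                         max_descendants=None, max_ancestors=None):
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    locked = {frozenset(edge) for edge in locked}

    rotatable = []
    for a, b in edges:
        if frozenset((a, b)) in locked:
            continue
        after = _side(adjacency, b, a)
        if a in after:
            continue  # a is reachable by another path, so the edge lies in a cycle
        before = _side(adjacency, a, b)
        n_desc, n_anc = len(after) - 1, len(before) - 1
        if n_desc < min_descendants or n_anc < min_ancestors:
            continue
        if max_descendants is not None and n_desc > max_descendants:
            continue
        if max_ancestors is not None and n_anc > max_ancestors:
            continue
        rotatable.append((a, b))
    return rotatable


# A three-membered ring (C1, C2, C3) with a chain X-Y-Z attached to C3.
ring_edges = [("C1", "C2"), ("C2", "C3"), ("C3", "C1"),
              ("C3", "X"), ("X", "Y"), ("Y", "Z")]
print(find_rotatable_edges(ring_edges))  # [('C3', 'X'), ('X', 'Y')]
```

The ring edges are rejected because they lie in a cycle, and the terminal edge Y-Z is rejected because it has no descendants.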

get_ancestors(node_1, node_2)[source]#

Get all ancestor nodes that come before a specific edge defined in the direction from node_1 to node_2 (i.e. get all nodes that come before node_1). This method is directed, in contrast to the get_neighbors() method, which will get all neighboring nodes of an anchor node irrespective of direction.

Parameters:
  • node_1 – The nodes that define the edge

  • node_2 – The nodes that define the edge

Returns:

The ancestor nodes

Return type:

set

Examples

In case of this graph:

A---B---C---D---E
    \
    F---H
    |
    G


>>> graph.get_ancestors("B", "C")
{"A", "F", "G", "H"}
>>> graph.get_ancestors("F", "B")
{"H", "G"}
>>> graph.get_ancestors("A", "B")
set() # because in this direction there are no other nodes
get_descendants(node_1, node_2)[source]#

Get all descendant nodes that come after a specific edge defined in the direction from node_1 to node_2 (i.e. get all nodes that come after node_2). This method is directed, in contrast to the get_neighbors() method, which will get all neighboring nodes of an anchor node irrespective of direction.

Parameters:
  • node_1 – The nodes that define the edge

  • node_2 – The nodes that define the edge

Returns:

The descendant nodes

Return type:

set

Examples

In case of this graph:

A---B---C---D---E
    \
    F---H
    |
    G


>>> graph.get_descendants("B", "C")
{"D", "E"}
>>> graph.get_descendants("B", "F")
{"H", "G"}
>>> graph.get_descendants("B", "A")
set() # because in this direction there are no other nodes
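
The directed behavior of get_descendants() and get_ancestors() can be reproduced on the example graph with a small depth-first search. The sketch below works on a plain adjacency dictionary rather than on a biobuild graph and is meant only to illustrate the semantics. It treats the ancestors of an edge as the descendants of the reversed edge, which matches the documented examples.

```python
def get_descendants(adjacency, node_1, node_2):
    """All nodes that come after node_2 when walking away from node_1."""
    seen = {node_1, node_2}
    stack = [node_2]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen - {node_1, node_2}


def get_ancestors(adjacency, node_1, node_2):
    """All nodes that come before node_1: the descendants of the reversed edge."""
    return get_descendants(adjacency, node_2, node_1)


# The example graph: A-B-C-D-E with the branch B-F, F-H, F-G.
adjacency = {
    "A": {"B"},
    "B": {"A", "C", "F"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
    "F": {"B", "H", "G"},
    "G": {"F"},
    "H": {"F"},
}
print(get_descendants(adjacency, "B", "C"))
print(get_ancestors(adjacency, "B", "C"))
```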
get_locked_edges()[source]#

Get all locked edges

Returns:

The locked edges

Return type:

set

abstract get_neighbors(node, n: int = 1, mode='upto')[source]#

Get the neighbors of a node

Parameters:
  • node – The target node

  • n (int, optional) – The number of edges to separate the node from its neighbors.

  • mode (str, optional) – The mode to use for getting the neighbors, by default “upto”. With “upto”, all neighbors up to a distance of n edges are returned; with “exact”, only the neighbors exactly n edges away are returned.

Returns:

The neighbors of the node

Return type:

set
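The difference between the two modes can be illustrated with a level-by-level breadth-first expansion. The function below is a stand-alone sketch on an adjacency dictionary and assumes n is at least 1. It is not the method of any biobuild class.

```python
def get_neighbors(adjacency, node, n=1, mode="upto"):
    """Neighbors of `node` within n edges ("upto") or exactly n edges away ("exact")."""
    if mode not in ("upto", "exact"):
        raise ValueError(f"unknown mode: {mode!r}")
    seen = {node}
    frontier = {node}
    collected = set()
    for _ in range(n):
        # Expand one edge further and drop everything already visited.
        frontier = {nb for current in frontier for nb in adjacency[current]} - seen
        seen |= frontier
        collected |= frontier
    return collected if mode == "upto" else frontier


# Same topology as the get_ancestors() / get_descendants() example graph.
graph = {
    "A": {"B"},
    "B": {"A", "C", "F"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
    "F": {"B", "H", "G"},
    "G": {"F"},
    "H": {"F"},
}
print(get_neighbors(graph, "B", n=2, mode="upto"))
print(get_neighbors(graph, "B", n=2, mode="exact"))
```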

get_unlocked_edges()[source]#

Get all unlocked edges

Returns:

The unlocked edges

Return type:

set

in_same_cycle(node_1, node_2, cycles=None) → bool[source]#

Check if two nodes are in the same cycle

Parameters:
  • node_1 – The nodes to check

  • node_2 – The nodes to check

is_locked(node_1, node_2)[source]#

Check if an edge is locked

Parameters:
  • node_1 – The nodes that define the edge

  • node_2 – The nodes that define the edge

Returns:

Whether the edge is locked

Return type:

bool

lock_all()[source]#

Lock all edges

lock_edge(node_1, node_2)[source]#

Lock an edge, preventing it from being rotated.

Parameters:
  • node_1 – The nodes that define the edge

  • node_2 – The nodes that define the edge

property nodes_in_cycles: set#

Returns the nodes in cycles

property residues#

Returns the residues in the molecule

rotate_around_edge(node_1, node_2, angle: float, descendants_only: bool = False, update_coords: bool = True)[source]#

Rotate descending nodes around a specific edge by a given angle.

Parameters:
  • node_1 – The nodes that define the edge around which to rotate.

  • node_2 – The nodes that define the edge around which to rotate.

  • angle (float) – The angle to rotate by, in radians.

  • descendants_only (bool, optional) – Whether to only rotate the descending nodes, by default False, in which case the entire graph will be rotated.

  • update_coords (bool, optional) – Whether to update the coordinates of the nodes after rotation, by default True.

Returns:

new_coords – The new coordinates of the nodes after rotation.

Return type:

dict
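Geometrically, rotating nodes around an edge means rotating their coordinates about the axis through node_1 and node_2. A standard way to do this is Rodrigues' rotation formula, sketched below in plain Python. Whether biobuild uses this exact formula is not stated in this documentation. The helper names are hypothetical, and the set of nodes to rotate is passed in explicitly.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle):
    """Rotate `point` by `angle` (radians) about the axis axis_start -> axis_end."""
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]                      # unit vector along the edge
    v = [p - s for p, s in zip(point, axis_start)]    # point relative to the axis origin
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cross = [k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0]]
    dot = sum(ki * vi for ki, vi in zip(k, v))
    # Rodrigues: v*cos + (k x v)*sin + k*(k.v)*(1 - cos)
    rotated = [v[i] * cos_a + cross[i] * sin_a + k[i] * dot * (1 - cos_a)
               for i in range(3)]
    return tuple(s + r for s, r in zip(axis_start, rotated))


def rotate_nodes(coords, nodes, node_1, node_2, angle):
    """Return a dict of new coordinates for `nodes`, leaving `coords` untouched."""
    return {node: rotate_about_axis(coords[node], coords[node_1], coords[node_2], angle)
            for node in nodes}


coords = {"B": (0.0, 0.0, 0.0), "C": (0.0, 0.0, 1.0), "D": (1.0, 0.0, 2.0)}
new_coords = rotate_nodes(coords, {"D"}, "B", "C", math.pi / 2)
# D moves from (1, 0, 2) to approximately (0, 1, 2): a quarter turn about the z-axis.
print(new_coords)
```

Returning a new dictionary instead of mutating the input mirrors the documented update_coords switch: the caller decides whether the rotated coordinates are written back.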

show()[source]#

Show the graph

property structure#

Returns the underlying Bio.PDB.Structure object

unlock_all()[source]#

Unlock all edges

unlock_edge(node_1, node_2)[source]#

Unlock an edge, allowing it to be rotated.

Parameters:
  • node_1 – The nodes that define the edge

  • node_2 – The nodes that define the edge
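
The locking methods of the BaseGraph (lock_edge, unlock_edge, lock_all, unlock_all, is_locked, get_locked_edges, get_unlocked_edges) amount to bookkeeping over a set of edges. The following minimal sketch shows one way such bookkeeping could look. The class `EdgeLocks` is hypothetical, and treating an edge as locked regardless of node order is an assumption made for this illustration.

```python
class EdgeLocks:
    """Hypothetical stand-in that tracks which edges of a graph are locked."""

    def __init__(self, edges):
        # Store edges as frozensets so that (a, b) and (b, a) are the same edge.
        self._edges = {frozenset(edge) for edge in edges}
        self._locked = set()

    def lock_edge(self, node_1, node_2):
        self._locked.add(frozenset((node_1, node_2)))

    def unlock_edge(self, node_1, node_2):
        self._locked.discard(frozenset((node_1, node_2)))

    def lock_all(self):
        self._locked = set(self._edges)

    def unlock_all(self):
        self._locked.clear()

    def is_locked(self, node_1, node_2):
        return frozenset((node_1, node_2)) in self._locked

    def get_locked_edges(self):
        return set(self._locked)

    def get_unlocked_edges(self):
        return self._edges - self._locked


locks = EdgeLocks([("A", "B"), ("B", "C")])
locks.lock_edge("B", "A")
print(locks.is_locked("A", "B"))  # True
```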

The AtomGraph module#

The AtomGraph handles the atom connectivity within Molecule objects. It provides the bulk of the connectivity-related methods, such as get_neighbors.

The AtomGraph class
class biobuild.graphs.AtomGraph(id, bonds: list)[source]#

Bases: BaseGraph

A graph representation of atoms and bonds in a contiguous molecule.

draw()[source]#

Prepare a 3D view of the graph but do not show it yet

Returns:

A 3D viewer

Return type:

PlotlyViewer3D

classmethod from_biopython(structure, apply_standard_bonds: bool = True, infer_residue_connections: bool = True, infer_bonds: bool = False, max_bond_length: float = None, restrict_residues: bool = True, _topology=None)[source]#

Create an AtomGraph from a Biopython structure

Parameters:
  • structure – The Biopython structure. This can be any Biopython object that houses atoms.

  • infer_residue_connections (bool) – Whether to infer residue connecting bonds based on atom distances.

  • infer_bonds (bool) – Whether to infer bonds from the distance between atoms. If this is set to True, standard bonds cannot also be applied!

  • max_bond_length (float) – The maximum distance between atoms to infer a bond. If none is given, a default bond length is assumed.

  • restrict_residues (bool) – Whether to restrict to atoms of the same residue when inferring bonds. If set to False, this will also infer residue connecting bonds.

  • _topology – A specific reference topology to use when re-constructing any missing parts. By default the default CHARMM topology is used.

Returns:

The AtomGraph representation of the molecule

Return type:

AtomGraph

classmethod from_molecule(mol, locked: bool = False)[source]#

Create an AtomGraph from a molecule

Parameters:
  • mol (biobuild.molecule.Molecule) – The molecule to convert

  • locked (bool, optional) – If True, any information about locked bonds will also be transferred to the AtomGraph, by default False.

Returns:

The AtomGraph representation of the molecule

Return type:

AtomGraph

get_neighbors(atom: Atom, n: int = 1, mode='upto')[source]#

Get the neighbors of a node

Parameters:
  • atom (Atom) – The atom

  • n (int, optional) – The number of bonds to separate the atom from its neighbors.

  • mode (str, optional) – The mode to use for getting the neighbors, by default “upto”. With “upto”, all neighbors up to a distance of n bonds are returned; with “exact”, only the neighbors exactly n bonds away are returned.

Returns:

The neighbors of the atom

Return type:

set

migrate_bonds(other)[source]#

Migrate bonds from another graph

Parameters:

other (AtomGraph) – The other graph to migrate bonds from

The ResidueGraph module#

The ResidueGraph handles the residue connectivity within Molecule objects. It is an abstraction of the AtomGraph and provides many of the same methods. ResidueGraph objects serve as the primary input for the structural optimization algorithms in the optimizers package of biobuild.

The ResidueGraph class
class biobuild.graphs.ResidueGraph(id, bonds: list)[source]#

Bases: BaseGraph

A graph representation of residues bonded together as an abstraction of a large contiguous molecule.

add_atomic_bonds(*edges)[source]#

Add atom-level bonds to the graph.

Parameters:

*edges – The edges to add

property atomic_bonds#

Get the atomic-level bonds in the molecule.

Returns:

The atomic-level bonds in the molecule

Return type:

dict

centers_of_mass()[source]#

Get the centers of mass of the residues in the molecule.

Returns:

The centers of mass of the residues in the molecule

Return type:

dict
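For reference, a center of mass is the mass-weighted mean of the atom positions. The sketch below computes one per residue and returns a dictionary, matching the documented return type. The input layout (residue id mapped to a list of mass and coordinate pairs) and the dictionary keys are assumptions, and the library may compute or key its result differently.

```python
def center_of_mass(atoms):
    """Mass-weighted mean position of a list of (mass, (x, y, z)) entries."""
    total_mass = sum(mass for mass, _ in atoms)
    return tuple(sum(mass * xyz[i] for mass, xyz in atoms) / total_mass
                 for i in range(3))


def centers_of_mass(residues):
    """Map each residue id to the center of mass of its atoms."""
    return {residue_id: center_of_mass(atoms) for residue_id, atoms in residues.items()}


# Hypothetical two-residue input; masses are in atomic mass units.
residues = {
    "RES1": [(12.0, (0.0, 0.0, 0.0)), (12.0, (2.0, 0.0, 0.0))],
    "RES2": [(12.0, (0.0, 0.0, 0.0)), (36.0, (0.0, 4.0, 0.0))],
}
print(centers_of_mass(residues))
```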

draw()[source]#

Prepare a 3D view of the graph but do not show it yet

Returns:

A 3D viewer

Return type:

PlotlyViewer3D

find_rotatable_edges(root_node=None, min_descendants: int = 1, min_ancestors: int = 1, max_descendants: int = None, max_ancestors: int = None)[source]#

Find all edges in the graph that are rotatable (i.e. not locked and not in a circular constellation). You can also filter and direct the edges.

Parameters:
  • root_node – A root node by which to direct the edges (closer to further).

  • min_descendants (int, optional) – The minimum number of descendants that an edge must have to be considered rotatable.

  • min_ancestors (int, optional) – The minimum number of ancestors that an edge must have to be considered rotatable.

  • max_descendants (int, optional) – The maximum number of descendants that an edge may have to be considered rotatable.

  • max_ancestors (int, optional) – The maximum number of ancestors that an edge may have to be considered rotatable.

Returns:

A list of rotatable edges

Return type:

list

classmethod from_AtomGraph(atom_graph, infer_connections: bool = None)[source]#

Create a ResidueGraph from an AtomGraph.

Parameters:
  • atom_graph (AtomGraph) – The AtomGraph representation of the molecule

  • infer_connections (bool) – Whether to infer the bonds between residues from the atom-level bonds. If the AtomGraph already contains atom-level bonds that connect different residues, this is not necessary. If this is set to None, connections will be inferred automatically if no atom-level bonds are present in the AtomGraph.

Returns:

The ResidueGraph representation of the molecule

Return type:

ResidueGraph
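Conceptually, moving from the atom level to the residue level means keeping only those atom-level bonds whose two atoms belong to different residues and recording which residues they connect. The following sketch shows this collapse on plain data. The function name, the atom labels, and the `residue_of` lookup are hypothetical and do not reflect how biobuild stores this information.

```python
def residue_connections(atom_bonds, residue_of):
    """Collapse atom-level bonds into residue-level connections.

    Returns a dict mapping each connected residue pair to the atom-level
    bonds that link the two residues.
    """
    connections = {}
    for atom_1, atom_2 in atom_bonds:
        res_1, res_2 = residue_of[atom_1], residue_of[atom_2]
        if res_1 != res_2:  # bonds within one residue vanish at residue level
            connections.setdefault(frozenset((res_1, res_2)), []).append((atom_1, atom_2))
    return connections


atom_bonds = [
    ("C1@GLC1", "O5@GLC1"),   # bond within residue GLC1
    ("C1@GLC1", "O4@GLC2"),   # bond linking GLC1 and GLC2
    ("C4@GLC2", "O4@GLC2"),   # bond within residue GLC2
]
residue_of = {
    "C1@GLC1": "GLC1", "O5@GLC1": "GLC1",
    "O4@GLC2": "GLC2", "C4@GLC2": "GLC2",
}
connections = residue_connections(atom_bonds, residue_of)
print(connections)
```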

classmethod from_molecule(mol, detailed: bool = False, locked: bool = True)[source]#

Create a ResidueGraph from a molecule object.

Parameters:
  • mol (Molecule) – The molecule object

  • detailed (bool) – Whether to make a “detailed” residue graph representation including the atomic-scale bonds between residues. If True, locked bonds can be directly migrated from the molecule.

  • locked (bool) – Whether to migrate locked bonds from the molecule. This is only possible if detailed is True.

Returns:

The ResidueGraph representation of the molecule

Return type:

ResidueGraph

get_atomic_bond(residue1, residue2) → tuple[source]#

Get the atomic-level bond between two residues.

Parameters:
  • residue1 (Residue or str) – The first residue or its id

  • residue2 (Residue or str) – The second residue or its id

Returns:

The atomic bond between the two residues

Return type:

tuple

get_neighbors(residue: Residue, n: int = 1, mode='upto')[source]#

Get the neighbors of a residue

Parameters:
  • residue (Bio.PDB.Residue.Residue) – The target residue

  • n (int, optional) – The number of connections to separate the residue from its neighbors.

  • mode (str, optional) – The mode to use for getting the neighbors, by default “upto”. With “upto”, all neighbors up to a distance of n bonds are returned; with “exact”, only the neighbors exactly n bonds away are returned.

Returns:

The neighbors of the residue

Return type:

set

get_residue(r)[source]#

Get a residue in the molecule.

Parameters:

r (str or Residue) – The residue or its id

Returns:

The residue

Return type:

Residue

lock_centers()[source]#

Lock any edges that connect residue centers of mass to their constituent atoms. This only applies to detailed graphs.

make_detailed(include_samples: bool = True, include_far_away: bool = False, include_heteroatoms: bool = False, include_clashes: bool = True, n_samples: int | float = 0.5, f: float = 1.0, no_hydrogens: bool = False) → ResidueGraph[source]#

Use a detailed representation of the residues in the molecule by adding the specific atoms that connect the residues together. This is useful for visualization and analysis.

Note

This function is not reversible. It is applied in-place.

Parameters:
  • include_samples (bool) – If True, a number of atoms are sampled from each residue and included in the detailed representation.

  • include_far_away (bool) – If True, atoms that are not involved in residue connections are also included if their distance to the residue’s center of mass is greater than f * the 75th percentile of atom distances to the residue’s center of mass.

  • include_heteroatoms (bool) – If True, all hetero-atoms are included in the detailed representation, regardless of their distance to the residue center of mass.

  • include_clashes (bool) – If True, all atoms that are involved in a clash are included in the detailed representation.

  • n_samples (int or float) – The number or fraction of atoms to sample from each residue if include_samples is True. If a fraction in range (0,1) is given instead of an integer, the number of atoms to sample is adjusted according to the residue size.

  • f (float) – The factor by which the 75th percentile of atom distances to the residue’s center of mass is multiplied to determine the cutoff distance for far-away atoms. This is only used if include_far_away is True.

  • no_hydrogens (bool) – If True, hydrogens are not included in the detailed representation.
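
Two of the parameters above involve a small calculation: the distance cutoff f * (75th percentile) used by include_far_away, and the resolution of n_samples from either a count or a fraction. The sketch below illustrates both with the standard library. The helper names, the percentile interpolation method, and the rounding and clamping rules are assumptions made for this example and may differ from what biobuild does.

```python
import statistics


def far_away_atoms(distances, f=1.0):
    """Atoms farther from the center of mass than f * the 75th percentile distance.

    `distances` maps atom -> distance to the residue's center of mass.
    """
    # quantiles(..., n=4) returns the three quartiles; index 2 is the 75th percentile.
    q3 = statistics.quantiles(list(distances.values()), n=4, method="inclusive")[2]
    cutoff = f * q3
    return {atom for atom, distance in distances.items() if distance > cutoff}


def resolve_n_samples(n_samples, residue_size):
    """Turn an absolute count or a fraction in (0, 1) into a number of atoms."""
    if isinstance(n_samples, float) and 0 < n_samples < 1:
        # Assumed rule: scale by residue size, but sample at least one atom.
        return max(1, round(n_samples * residue_size))
    # Assumed rule: never ask for more atoms than the residue contains.
    return min(int(n_samples), residue_size)


distances = {"C1": 1.0, "C2": 2.0, "C3": 3.0, "O4": 4.0, "O6": 5.0}
print(far_away_atoms(distances))          # {'O6'}
print(resolve_n_samples(0.5, 10))         # 5
```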

prune_triplets()[source]#

Prune bond triangles where two nodes from the same residue are connected to each other and the residue…

property residues#

Get the residues in the molecule.

Returns:

The residues in the molecule

Return type:

list

to_AtomGraph()[source]#

Convert the ResidueGraph to an AtomGraph.

Returns:

The AtomGraph representation of the molecule

Return type:

AtomGraph