The graphs package
This module defines classes to represent PDB structures as graph objects that store the molecule connectivity information. Two graphs are available: the AtomGraph, an atomic-scale representation of a molecule, and the ResidueGraph, which represents each residue as a single node in the graph, thereby abstracting away the atomic details.
The graphs contain many of the analytical methods used for molecular structure analysis and manipulation that do not involve adding or removing atoms.
The graphs also serve an important function in biobuild’s conformational optimization process as they are the main data structure to which the optimization algorithms are applied.
The BaseGraph module
The BaseGraph is the basis of both the AtomGraph and the ResidueGraph.
The BaseGraph class
- class biobuild.graphs.BaseGraph.BaseGraph(id, bonds: list)
Bases: Graph
The basic class for molecular graphs
- property atoms
Returns the atoms in the molecule
- property bonds
Returns the bonds in the molecule
- property central_node
Returns the most central node of the graph. This is computed based on the mean of all node coordinates.
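As a rough illustration of the idea only (not biobuild's actual implementation), the sketch below picks the node that lies closest to the mean of all node coordinates. The `coords` dictionary and the `central_node` helper are hypothetical names, and using the Euclidean distance to the mean is an assumption.

```python
import math

def central_node(coords):
    """Return the node whose coordinate is closest to the mean coordinate.

    coords: dict mapping a node label to an (x, y, z) tuple (hypothetical layout).
    """
    n = len(coords)
    # mean of all node coordinates, computed per axis
    center = tuple(sum(c[i] for c in coords.values()) / n for i in range(3))
    # node with the smallest Euclidean distance to that mean
    return min(coords, key=lambda node: math.dist(coords[node], center))

coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (1.0, 0.0, 0.0),
    "C": (2.0, 0.0, 0.0),
    "D": (3.0, 0.0, 0.0),
    "E": (4.0, 0.0, 0.0),
}
center = central_node(coords)
```

In this linear toy chain the mean coordinate coincides with node C, so C is selected.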
- property chains
Returns the chains in the molecule
- direct_edges(root_node=None, edges: list = None) -> list
Sort the edges such that the first node in each edge is the one closer to the root node. If no root node is provided, the central node is used.
- Parameters:
root_node – The root node to use for sorting the edges. If not provided, the central node is used.
edges (list, optional) – The edges to sort, by default None, in which case all edges are sorted.
- Returns:
The sorted edges
- Return type:
list
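The edge-directing step can be pictured with a small standalone sketch. It assumes that "closer to the root" is measured as the number of edges from the root (biobuild may use a different measure), and the `adjacency` dictionary and helper names are hypothetical.

```python
from collections import deque

def bfs_distances(adjacency, root):
    """Number of edges from root to every reachable node."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def direct_edges(adjacency, edges, root):
    """Orient each edge so that its first node is the one closer to root."""
    dist = bfs_distances(adjacency, root)
    return [(a, b) if dist[a] <= dist[b] else (b, a) for a, b in edges]

# hypothetical toy graph as an adjacency dictionary
adjacency = {
    "A": ["B"], "B": ["A", "C", "F"], "C": ["B", "D"], "D": ["C", "E"],
    "E": ["D"], "F": ["B", "H", "G"], "G": ["F"], "H": ["F"],
}
edges = [("B", "A"), ("C", "B"), ("D", "C"), ("F", "B"), ("H", "F")]
directed = direct_edges(adjacency, edges, root="A")
```

Every edge in `directed` now points away from the root node A.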
- draw()
Prepare a 3D view of the graph but do not show it yet
- Returns:
A 3D viewer
- Return type:
- find_rotatable_edges(root_node=None, min_descendants: int = 1, min_ancestors: int = 1, max_descendants: int = None, max_ancestors: int = None)
Find all edges in the graph that are rotatable (i.e. not locked and not part of a cycle). The edges can also be filtered and directed.
- Parameters:
root_node – A root node by which to direct the edges (closer to further).
min_descendants (int, optional) – The minimum number of descendants that an edge must have to be considered rotatable.
min_ancestors (int, optional) – The minimum number of ancestors that an edge must have to be considered rotatable.
max_descendants (int, optional) – The maximum number of descendants that an edge may have to be considered rotatable.
max_ancestors (int, optional) – The maximum number of ancestors that an edge may have to be considered rotatable.
- Returns:
A list of rotatable edges
- Return type:
list
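To make the criteria concrete, here is a minimal standalone sketch of the selection logic under stated assumptions: an edge counts as "in a cycle" if its two nodes remain connected after the edge is removed, and only the descendant filters are shown for brevity. The `adjacency` dictionary, the `locked` argument and the helper names are hypothetical and do not mirror biobuild's internals.

```python
def reachable(adjacency, start, blocked_edge):
    """All nodes reachable from start without crossing blocked_edge."""
    blocked = frozenset(blocked_edge)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nb in adjacency[node]:
            if frozenset((node, nb)) == blocked or nb in seen:
                continue
            seen.add(nb)
            stack.append(nb)
    return seen

def find_rotatable_edges(adjacency, edges, locked=(),
                         min_descendants=1, max_descendants=None):
    locked = {frozenset(e) for e in locked}
    rotatable = []
    for a, b in edges:
        if frozenset((a, b)) in locked:
            continue  # locked edges are never rotatable
        side_b = reachable(adjacency, b, (a, b))
        if a in side_b:
            continue  # a is still reachable without the edge: part of a cycle
        n_descendants = len(side_b) - 1  # nodes that come after b
        if n_descendants < min_descendants:
            continue
        if max_descendants is not None and n_descendants > max_descendants:
            continue
        rotatable.append((a, b))
    return rotatable

# chain A-B-C, ring C-D-E, tail E-F-G
adjacency = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D", "E"], "D": ["C", "E"],
    "E": ["D", "C", "F"], "F": ["E", "G"], "G": ["F"],
}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
         ("C", "E"), ("E", "F"), ("F", "G")]
rotatable = find_rotatable_edges(adjacency, edges, locked=[("E", "F")])
```

The ring edges are skipped because they lie in a cycle, E-F is skipped because it is locked, and F-G is skipped because it has no descendants.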
- get_ancestors(node_1, node_2)
Get all ancestor nodes that come before a specific edge defined in the direction from node_1 to node_2 (i.e. get all nodes that come before node_1). This method is directed, in contrast to the get_neighbors() method, which returns all neighboring nodes of an anchor node irrespective of direction.
- Parameters:
node_1 – The nodes that define the edge
node_2 – The nodes that define the edge
- Returns:
The ancestor nodes
- Return type:
set
Examples
In case of this graph:

    A---B---C---D---E
         \
          F---H
          |
          G

    >>> graph.get_ancestors("B", "C")
    {"A", "F", "G", "H"}
    >>> graph.get_ancestors("F", "B")
    {"H", "G"}
    >>> graph.get_ancestors("A", "B")
    set() # because in this direction there are no other nodes
- get_descendants(node_1, node_2)
Get all descendant nodes that come after a specific edge defined in the direction from node_1 to node_2 (i.e. get all nodes that come after node_2). This method is directed, in contrast to the get_neighbors() method, which returns all neighboring nodes of an anchor node irrespective of direction.
- Parameters:
node_1 – The nodes that define the edge
node_2 – The nodes that define the edge
- Returns:
The descendant nodes
- Return type:
set
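The directed traversal behind get_ancestors and get_descendants can be sketched in plain Python. This is an illustrative reimplementation on the example graph of this entry, not biobuild's code; the `adjacency` dictionary and the free functions are hypothetical, and the sketch assumes the edge is not part of a cycle.

```python
def get_descendants(adjacency, node_1, node_2):
    """Nodes that come after node_2 when walking the edge node_1 -> node_2."""
    seen = {node_1, node_2}  # never walk back across the edge itself
    stack = [node_2]
    while stack:
        node = stack.pop()
        for nb in adjacency[node]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen - {node_1, node_2}

def get_ancestors(adjacency, node_1, node_2):
    """Nodes that come before node_1: the descendants of the reversed edge."""
    return get_descendants(adjacency, node_2, node_1)

# A-B-C-D-E backbone, with F branching off B and carrying H and G
adjacency = {
    "A": ["B"], "B": ["A", "C", "F"], "C": ["B", "D"], "D": ["C", "E"],
    "E": ["D"], "F": ["B", "H", "G"], "G": ["F"], "H": ["F"],
}
after_bc = get_descendants(adjacency, "B", "C")
before_bc = get_ancestors(adjacency, "B", "C")
```

Treating ancestors as the descendants of the reversed edge keeps both operations consistent with each other.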
Examples
In case of this graph:

    A---B---C---D---E
         \
          F---H
          |
          G

    >>> graph.get_descendants("B", "C")
    {"D", "E"}
    >>> graph.get_descendants("B", "F")
    {"H", "G"}
    >>> graph.get_descendants("B", "A")
    set() # because in this direction there are no other nodes
- abstract get_neighbors(node, n: int = 1, mode='upto')
Get the neighbors of a node
- Parameters:
node – The target node
n (int, optional) – The number of edges to separate the node from its neighbors.
mode (str, optional) – The mode to use for getting the neighbors, by default “upto”:
  - “upto”: get all neighbors up to a distance of n edges
  - “exact”: get all neighbors exactly n edges away
- Returns:
The neighbors of the node
- Return type:
set
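The difference between the two modes can be shown with a small breadth-first sketch. It is a standalone illustration under the assumption that distance is counted in edges along the shortest path; the `adjacency` dictionary and the free function are hypothetical stand-ins for the method.

```python
def get_neighbors(adjacency, node, n=1, mode="upto"):
    """Collect neighbors of node within n edges ("upto") or at exactly n edges ("exact")."""
    dist = {node: 0}
    frontier = [node]
    for depth in range(1, n + 1):
        next_frontier = []
        for current in frontier:
            for nb in adjacency[current]:
                if nb not in dist:
                    dist[nb] = depth  # first visit gives the shortest distance
                    next_frontier.append(nb)
        frontier = next_frontier
    if mode == "upto":
        return {k for k, d in dist.items() if 1 <= d <= n}
    if mode == "exact":
        return {k for k, d in dist.items() if d == n}
    raise ValueError(f"unknown mode: {mode!r}")

# linear chain A-B-C-D-E
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
upto_2 = get_neighbors(adjacency, "C", n=2, mode="upto")
exact_2 = get_neighbors(adjacency, "C", n=2, mode="exact")
```

For the middle node C, "upto" with n=2 returns all four other nodes, whereas "exact" returns only the two chain ends.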
- in_same_cycle(node_1, node_2, cycles=None) -> bool
Check if two nodes are in the same cycle
- Parameters:
node_1 – The nodes to check
node_2 – The nodes to check
- is_locked(node_1, node_2)
Check if an edge is locked
- Parameters:
node_1 – The nodes that define the edge
node_2 – The nodes that define the edge
- Returns:
Whether the edge is locked
- Return type:
bool
- lock_edge(node_1, node_2)
Lock an edge, preventing it from being rotated.
- Parameters:
node_1 – The nodes that define the edge
node_2 – The nodes that define the edge
- property nodes_in_cycles: set
Returns the nodes in cycles
- property residues
Returns the residues in the molecule
- rotate_around_edge(node_1, node_2, angle: float, descendants_only: bool = False, update_coords: bool = True)
Rotate descending nodes around a specific edge by a given angle.
- Parameters:
node_1 – The nodes that define the edge around which to rotate.
node_2 – The nodes that define the edge around which to rotate.
angle (float) – The angle to rotate by, in radians.
descendants_only (bool, optional) – Whether to only rotate the descending nodes, by default False, in which case the entire graph will be rotated.
update_coords (bool, optional) – Whether to update the coordinates of the nodes after rotation, by default True.
- Returns:
new_coords – The new coordinates of the nodes after rotation.
- Return type:
dict
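Geometrically, rotating nodes around an edge is a rotation about the axis that runs through the two edge nodes. The sketch below applies Rodrigues' rotation formula to the descendant nodes only and returns a dictionary of new coordinates. It is a standalone illustration; the `coords` layout and helper names are hypothetical, and whether biobuild uses this exact formulation is not stated in the documentation.

```python
import math

def rotate_about_axis(point, origin, end, angle):
    """Rotate point by angle (radians) about the axis running from origin to end."""
    axis = [e - o for e, o in zip(end, origin)]
    length = math.sqrt(sum(a * a for a in axis))
    k = [a / length for a in axis]              # unit vector along the edge
    v = [p - o for p, o in zip(point, origin)]  # point relative to the axis
    cross = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    dot = sum(ki * vi for ki, vi in zip(k, v))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Rodrigues' formula: v*cos + (k x v)*sin + k*(k.v)*(1 - cos)
    rotated = [
        v[i] * cos_a + cross[i] * sin_a + k[i] * dot * (1.0 - cos_a)
        for i in range(3)
    ]
    return tuple(r + o for r, o in zip(rotated, origin))

def rotate_descendants(coords, node_1, node_2, descendants, angle):
    """Return new coordinates for the descendant nodes only."""
    return {
        node: rotate_about_axis(coords[node], coords[node_1], coords[node_2], angle)
        for node in descendants
    }

coords = {
    "B": (0.0, 0.0, 0.0),
    "C": (0.0, 0.0, 1.0),
    "D": (1.0, 0.0, 1.0),
    "E": (2.0, 0.0, 1.0),
}
# rotate D and E by 90 degrees around the B-C edge (the z axis here)
new_coords = rotate_descendants(coords, "B", "C", ["D", "E"], math.pi / 2)
```

The nodes on the axis stay in place, while D and E swing from the x direction into the y direction.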
- property structure
Returns the underlying bio.PDB.Structure object
The AtomGraph module
The AtomGraph handles the atom connectivity within Molecule objects. It provides the bulk of connectivity related methods such as get_neighbors.
The AtomGraph class
- class biobuild.graphs.AtomGraph(id, bonds: list)
Bases: BaseGraph
A graph representation of atoms and bonds in a contiguous molecule.
- draw()
Prepare a 3D view of the graph but do not show it yet
- Returns:
A 3D viewer
- Return type:
- classmethod from_biopython(structure, apply_standard_bonds: bool = True, infer_residue_connections: bool = True, infer_bonds: bool = False, max_bond_length: float = None, restrict_residues: bool = True, _topology=None)
Create an AtomGraph from a biopython structure
- Parameters:
structure – The biopython structure. This can be any biopython object that houses atoms.
infer_residue_connections (bool) – Whether to infer residue connecting bonds based on atom distances.
infer_bonds (bool) – Whether to infer bonds from the distance between atoms. If this is set to True, standard bonds cannot also be applied.
max_bond_length (float) – The maximum distance between atoms to infer a bond. If none is given, a default bond length is assumed.
restrict_residues (bool) – Whether to restrict to atoms of the same residue when inferring bonds. If set to False, this will also infer residue connecting bonds.
_topology – A specific reference topology to use when re-constructing any missing parts. By default the default CHARMM topology is used.
- Returns:
The AtomGraph representation of the molecule
- Return type:
- classmethod from_molecule(mol, locked: bool = False)
Create an AtomGraph from a molecule
- Parameters:
mol (biobuild.molecule.Molecule) – The molecule to convert
locked (bool, optional) – If True, any information about locked bonds will also be transferred to the AtomGraph, by default False.
- Returns:
The AtomGraph representation of the molecule
- Return type:
- get_neighbors(atom: Atom, n: int = 1, mode='upto')
Get the neighbors of a node
- Parameters:
atom (Atom) – The atom
n (int, optional) – The number of bonds to separate the atom from its neighbors.
mode (str, optional) – The mode to use for getting the neighbors, by default “upto”:
  - “upto”: get all neighbors up to a distance of n bonds
  - “exact”: get all neighbors exactly n bonds away
- Returns:
The neighbors of the atom
- Return type:
set
The ResidueGraph module
The ResidueGraph handles the residue connectivity within Molecule objects. It is an abstraction of the AtomGraph and provides many of the same methods. ResidueGraph objects serve as the primary input for the structural optimization algorithms in the optimizers package of biobuild.
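The abstraction from atoms to residues can be pictured as collapsing every atom-level bond onto the residues that own its two atoms and keeping only the bonds that cross a residue boundary. The sketch below is a standalone illustration of that idea; the data layout and the `residue_edges` helper are hypothetical and do not mirror how biobuild stores atoms or residues.

```python
def residue_edges(atom_bonds, atom_to_residue):
    """Collapse atom-level bonds into undirected residue-level edges."""
    edges = set()
    for atom_a, atom_b in atom_bonds:
        res_a = atom_to_residue[atom_a]
        res_b = atom_to_residue[atom_b]
        if res_a != res_b:
            # only bonds that connect two different residues survive
            edges.add(frozenset((res_a, res_b)))
    return edges

# hypothetical atoms a1..a5 spread over three residues
atom_to_residue = {"a1": "R1", "a2": "R1", "a3": "R2", "a4": "R2", "a5": "R3"}
atom_bonds = [("a1", "a2"), ("a2", "a3"), ("a3", "a4"), ("a4", "a5")]
edges = residue_edges(atom_bonds, atom_to_residue)
```

Four atom-level bonds reduce to two residue-level edges here, R1-R2 and R2-R3.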
The ResidueGraph class
- class biobuild.graphs.ResidueGraph(id, bonds: list)
Bases: BaseGraph
A graph representation of residues bonded together as an abstraction of a large contiguous molecule.
- add_atomic_bonds(*edges)
Add atom-level bonds to the graph.
- Parameters:
*edges – The edges to add
- property atomic_bonds
Get the atomic-level bonds in the molecule.
- Returns:
The atomic-level bonds in the molecule
- Return type:
dict
- centers_of_mass()
Get the centers of mass of the residues in the molecule.
- Returns:
The centers of mass of the residues in the molecule
- Return type:
dict
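For reference, a center of mass is the mass-weighted mean of the atom coordinates. The following standalone sketch computes one per residue and returns them in a dictionary. The input layout, the keys of the returned dictionary and the mass values are assumptions made for illustration; the documentation does not specify how biobuild keys its result.

```python
def centers_of_mass(residues):
    """Mass-weighted mean coordinate per residue.

    residues: dict mapping a residue id to a list of (mass, (x, y, z)) tuples
    (hypothetical layout).
    """
    centers = {}
    for res_id, atoms in residues.items():
        total_mass = sum(mass for mass, _ in atoms)
        centers[res_id] = tuple(
            sum(mass * xyz[i] for mass, xyz in atoms) / total_mass
            for i in range(3)
        )
    return centers

residues = {
    "R1": [(12.0, (0.0, 0.0, 0.0)), (12.0, (2.0, 0.0, 0.0))],
    "R2": [(16.0, (0.0, 0.0, 0.0)), (1.0, (0.0, 0.0, 17.0))],
}
centers = centers_of_mass(residues)
```

Two equal masses give the midpoint, while the heavy atom in R2 pulls the center close to its own position.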
- draw()
Prepare a 3D view of the graph but do not show it yet
- Returns:
A 3D viewer
- Return type:
- find_rotatable_edges(root_node=None, min_descendants: int = 1, min_ancestors: int = 1, max_descendants: int = None, max_ancestors: int = None)
Find all edges in the graph that are rotatable (i.e. not locked and not part of a cycle). The edges can also be filtered and directed.
- Parameters:
root_node – A root node by which to direct the edges (closer to further).
min_descendants (int, optional) – The minimum number of descendants that an edge must have to be considered rotatable.
min_ancestors (int, optional) – The minimum number of ancestors that an edge must have to be considered rotatable.
max_descendants (int, optional) – The maximum number of descendants that an edge may have to be considered rotatable.
max_ancestors (int, optional) – The maximum number of ancestors that an edge may have to be considered rotatable.
- Returns:
A list of rotatable edges
- Return type:
list
- classmethod from_AtomGraph(atom_graph, infer_connections: bool = None)
Create a ResidueGraph from an AtomGraph.
- Parameters:
atom_graph (AtomGraph) – The AtomGraph representation of the molecule
infer_connections (bool) – Whether to infer the bonds between residues from the atom-level bonds. If the AtomGraph already contains atom-level bonds that connect different residues, this is not necessary. If this is set to None, connections will be inferred automatically if no atom-level bonds are present in the AtomGraph.
- Returns:
The ResidueGraph representation of the molecule
- Return type:
- classmethod from_molecule(mol, detailed: bool = False, locked: bool = True)
Create a ResidueGraph from a molecule object.
- Parameters:
mol (Molecule) – The molecule object
detailed (bool) – Whether to make a “detailed” residue graph representation including the atomic-scale bonds between residues. If True, locked bonds can be directly migrated from the molecule.
locked (bool) – Whether to migrate locked bonds from the molecule. This is only possible if detailed is True.
- Returns:
The ResidueGraph representation of the molecule
- Return type:
- get_neighbors(residue: Residue, n: int = 1, mode='upto')
Get the neighbors of a residue
- Parameters:
residue (bio.Residue.Residue) – The target residue
n (int, optional) – The number of connections to separate the residue from its neighbors.
mode (str, optional) – The mode to use for getting the neighbors, by default “upto”:
  - “upto”: get all neighbors up to a distance of n connections
  - “exact”: get all neighbors exactly n connections away
- Returns:
The neighbors of the residue
- Return type:
set
- lock_centers()
Lock any edges that connect residue centers of mass to their constituent atoms. This only applies to detailed graphs.
- make_detailed(include_samples: bool = True, include_far_away: bool = False, include_heteroatoms: bool = False, include_clashes: bool = True, n_samples: int | float = 0.5, f: float = 1.0, no_hydrogens: bool = False) -> ResidueGraph
Use a detailed representation of the residues in the molecule by adding the specific atoms that connect the residues together. This is useful for visualization and analysis.
Note
This function is not reversible. It is applied in-place.
- Parameters:
include_samples (bool) – If True, a number of atoms are sampled from each residue and included in the detailed representation.
include_far_away (bool) – If True, atoms that are not involved in residue connections are also included if their distance to the residue’s center of mass is greater than f * the 75th percentile of atom distances to the residue’s center of mass.
include_heteroatoms (bool) – If True, all hetero-atoms are included in the detailed representation, regardless of their distance to the residue center of mass.
include_clashes (bool) – If True, all atoms that are involved in a clash are included in the detailed representation.
n_samples (int or float) – The number or fraction of atoms to sample from each residue if include_samples is True. If a fraction in range (0,1) is given instead of an integer, the number of atoms to sample is adjusted according to the residue size.
f (float) – The factor by which the 75th percentile of atom distances to the residue’s center of mass is multiplied to determine the cutoff distance for far-away atoms. This is only used if include_far_away is True.
no_hydrogens (bool) – If True, hydrogens are not included in the detailed representation.
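The cutoff described for include_far_away and f can be illustrated with a short standalone sketch: compute each atom's distance to the residue center, take the 75th percentile of those distances, scale it by f, and keep the atoms beyond that cutoff. The helper name, the input layout and the choice of percentile method (`statistics.quantiles` with the "inclusive" method) are assumptions; biobuild may compute the percentile differently.

```python
import math
import statistics

def far_away_atoms(atom_coords, center, f=1.0):
    """Names of atoms farther from center than f * (75th percentile of distances)."""
    distances = {name: math.dist(xyz, center) for name, xyz in atom_coords.items()}
    # third quartile of all atom-to-center distances
    q75 = statistics.quantiles(distances.values(), n=4, method="inclusive")[2]
    cutoff = f * q75
    return [name for name, d in distances.items() if d > cutoff]

# hypothetical residue: four atoms near the center and one clear outlier
atom_coords = {
    "C1": (1.0, 0.0, 0.0),
    "C2": (0.0, 1.0, 0.0),
    "C3": (0.0, 0.0, 1.0),
    "C4": (-1.0, 0.0, 0.0),
    "O6": (10.0, 0.0, 0.0),
}
selected = far_away_atoms(atom_coords, center=(0.0, 0.0, 0.0), f=1.0)
```

Raising f widens the cutoff, so fewer atoms qualify as far away.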
- prune_triplets()
Prune bond triangles where two nodes from the same residue are connected to each other and the residue…
- property residues
Get the residues in the molecule.
- Returns:
The residues in the molecule
- Return type:
list