Add missing heavy atoms
Adding missing non-hydrogen atoms to a molecular system
Some molecular systems, especially those obtained from experimental techniques like X-ray crystallography, may lack certain heavy atoms in their structure due to resolution limits or disorder.
MolSysMT provides a function, molsysmt.build.add_missing_heavy_atoms(), to reconstruct these atoms whenever sufficient topological and geometrical information is available.
Added in version 1.0.0.
How this function works
API documentation
Follow this link for a detailed description of the input arguments, raised errors, and returned objects: molsysmt.build.add_missing_heavy_atoms().
This function checks for missing non-hydrogen atoms in standard residues based on the known topology of the system. Using internal templates and geometric constraints, it reconstructs the 3D coordinates of atoms that are missing but can be inferred unambiguously.
It requires that the system already includes atoms with known names and residue types, such as those typically present in PDB files.
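The detection step described above can be pictured as a set difference between a residue template and the atom names actually present. The following is a minimal sketch under stated assumptions: `HEAVY_ATOM_TEMPLATES` and `find_missing_heavy_atoms` are hypothetical names invented for this illustration, and MolSysMT's internal templates and logic may well be organized differently.

```python
# Hypothetical, heavily simplified heavy-atom templates (illustration only;
# these are NOT MolSysMT's internal data structures).
HEAVY_ATOM_TEMPLATES = {
    "HIS": {"N", "CA", "C", "O", "CB", "CG", "ND1", "CD2", "CE1", "NE2"},
    "THR": {"N", "CA", "C", "O", "CB", "OG1", "CG2"},
}


def find_missing_heavy_atoms(groups):
    """Return {group_index: sorted missing atom names} for templated residues.

    `groups` is a list of (residue_name, atom_names) pairs, in group order.
    """
    missing = {}
    for index, (residue_name, atom_names) in enumerate(groups):
        template = HEAVY_ATOM_TEMPLATES.get(residue_name)
        if template is None:
            continue  # no template available: nothing can be inferred
        absent = template - set(atom_names)
        if absent:
            missing[index] = sorted(absent)
    return missing


# A capped His-Thr dipeptide with three heavy atoms deliberately left out.
groups = [
    ("ACE", ["CH3", "C", "O"]),
    ("HIS", ["N", "CA", "CB", "CG", "ND1", "CE1", "C", "O"]),
    ("THR", ["N", "CA", "CB", "CG2", "C", "O"]),
    ("NME", ["N", "C"]),
]
missing = find_missing_heavy_atoms(groups)
print(missing)  # {1: ['CD2', 'NE2'], 2: ['OG1']}
```

The sketch also shows why atom names and residue types must already be known: without them there is no template to compare against.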
Let’s illustrate how this function works with a controlled example.
Instead of downloading a real structure from the PDB, we’ll build a small molecular system and intentionally remove some heavy atoms to simulate a defect.
import molsysmt as msm
molecular_system = msm.build.build_peptide('AceHisThrNme')
msm.info(molecular_system)
form | n_atoms | n_groups | n_components | n_chains | n_molecules | n_entities | n_peptides | n_structures |
---|---|---|---|---|---|---|---|---|
molsysmt.MolSys | 43 | 4 | 1 | 1 | 1 | 1 | 1 | 1 |
Most X-ray structures lack hydrogen atoms, so let’s remove them from our system to mimic that situation:
molecular_system = msm.remove(molecular_system, selection='atom_type=="H"')
msm.build.has_hydrogens(molecular_system)
False
Let’s now remove a few heavy atoms to simulate a common kind of experimental defect:
molecular_system = msm.remove(molecular_system, selection='atom_name in ["NE2", "CD2", "OG1"]')
Our system is now ready to demonstrate molsysmt.build.add_missing_heavy_atoms().
In a real case, we would first check whether any heavy atoms are missing from our system:
msm.build.get_missing_heavy_atoms(molecular_system)
{1: ['NE2', 'CD2'], 2: ['OG1']}
Great: the missing atoms detected are exactly the ones we removed earlier.
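The keys of the returned dictionary, 1 and 2, match the group indices of the HIS and THR residues, and the values list the missing atom names. A dictionary of this shape is easy to summarize with plain Python. The sketch below works on a literal copy of the output shown above rather than calling MolSysMT, and the variable names are chosen here for illustration.

```python
# Literal copy of the dictionary shown above: group index -> missing atom names.
missing_heavy_atoms = {1: ['NE2', 'CD2'], 2: ['OG1']}

# Total number of missing heavy atoms across all affected groups.
n_missing = sum(len(names) for names in missing_heavy_atoms.values())

for group_index, atom_names in sorted(missing_heavy_atoms.items()):
    print(f"group {group_index}: missing {', '.join(atom_names)}")
print(f"{n_missing} heavy atoms missing in {len(missing_heavy_atoms)} groups")
```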
It’s time to add them back to the system.
molecular_system = msm.build.add_missing_heavy_atoms(molecular_system)
We can now confirm that the missing atoms have been successfully restored:
msm.info(molecular_system, element='atom')
index | id | name | type | group index | group id | group name | group type | component index | chain index | molecule index | molecule type | entity index | entity name |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | CH3 | C | 0 | 1 | ACE | terminal capping | 0 | 0 | 0 | peptide | 0 | peptide 0 |
1 | 2 | C | C | 0 | 1 | ACE | terminal capping | 0 | 0 | 0 | peptide | 0 | peptide 0 |
2 | 3 | O | O | 0 | 1 | ACE | terminal capping | 0 | 0 | 0 | peptide | 0 | peptide 0 |
3 | 4 | N | N | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
4 | 5 | CA | C | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
5 | 6 | CB | C | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
6 | 7 | CG | C | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
7 | 8 | ND1 | N | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
8 | 9 | CE1 | C | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
9 | 10 | C | C | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
10 | 11 | O | O | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
11 | 12 | NE2 | N | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
12 | 13 | CD2 | C | 1 | 2 | HIS | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
13 | 14 | N | N | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
14 | 15 | CA | C | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
15 | 16 | CB | C | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
16 | 17 | CG2 | C | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
17 | 18 | C | C | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
18 | 19 | O | O | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
19 | 20 | OG1 | O | 2 | 3 | THR | amino acid | 0 | 0 | 0 | peptide | 0 | peptide 0 |
20 | 21 | N | N | 3 | 4 | NME | terminal capping | 0 | 0 | 0 | peptide | 0 | peptide 0 |
21 | 22 | C | C | 3 | 4 | NME | terminal capping | 0 | 0 | 0 | peptide | 0 | peptide 0 |
But we have no hydrogens, right? Find out how to add them in the documentation section Add missing hydrogens.
See also
User guide > Tools > Build > Get missing heavy atoms:
Identify heavy atoms that are missing based on residue templates.
User guide > Tools > Basic > Remove:
Remove atoms from a molecular system.
User guide > Tools > Basic > Contains:
Check if specific elements are present in the system.
User guide > Tools > Basic > Info:
Display a summary of elements and their properties.
User guide > Tools > Build > Build peptide:
Generate capped peptide structures from sequence.