Add missing heavy atoms

Adding missing non-hydrogen atoms to a molecular system

Some molecular systems, especially those obtained from experimental techniques like X-ray crystallography, may lack certain heavy atoms because of limited resolution or local disorder.

MolSysMT provides a function, molsysmt.build.add_missing_heavy_atoms(), to reconstruct these atoms whenever sufficient topological and geometrical information is available.

Added in version 1.0.0.

How this function works

API documentation

Follow this link for a detailed description of the input arguments, raised errors, and returned objects: molsysmt.build.add_missing_heavy_atoms().

This function checks for missing non-hydrogen atoms in standard residues based on the known topology of the system. Using internal templates and geometric constraints, it reconstructs the 3D coordinates of atoms that are missing but can be inferred unambiguously.
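The detection step described above can be pictured as a comparison between each residue's atom names and a reference template. The following is a minimal sketch of that idea only: the `HEAVY_ATOM_TEMPLATES` dictionary and the `find_missing_heavy_atoms` helper are hypothetical names introduced for illustration, and MolSysMT's internal templates and logic may differ.

```python
# Hypothetical heavy-atom templates for two standard residues (PDB atom names).
# This is an illustration, not MolSysMT's internal data.
HEAVY_ATOM_TEMPLATES = {
    "HIS": ["N", "CA", "C", "O", "CB", "CG", "ND1", "CD2", "CE1", "NE2"],
    "THR": ["N", "CA", "C", "O", "CB", "OG1", "CG2"],
}


def find_missing_heavy_atoms(residues):
    """Return {residue_index: [missing atom names]}.

    `residues` is a list of (residue_name, atom_names) tuples.
    Residues without a template (for example capping groups) are skipped.
    """
    missing = {}
    for index, (name, atom_names) in enumerate(residues):
        template = HEAVY_ATOM_TEMPLATES.get(name)
        if template is None:
            continue
        present = set(atom_names)
        absent = [atom for atom in template if atom not in present]
        if absent:
            missing[index] = absent
    return missing


# A capped His-Thr dipeptide with three heavy atoms deliberately left out.
residues = [
    ("ACE", ["CH3", "C", "O"]),
    ("HIS", ["N", "CA", "CB", "CG", "ND1", "CE1", "C", "O"]),
    ("THR", ["N", "CA", "CB", "CG2", "C", "O"]),
    ("NME", ["N", "C"]),
]
missing = find_missing_heavy_atoms(residues)
print(missing)
```

In this sketch the missing names come out in template order, so their order within a residue need not match what the library reports.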

It requires that the system already includes atoms with known names and residue types, such as those typically present in PDB files.

Let’s illustrate how this function works with a controlled example.
Instead of downloading a real structure from the PDB, we’ll build a small molecular system and intentionally remove some heavy atoms to simulate a defect.

import molsysmt as msm
molecular_system = msm.build.build_peptide('AceHisThrNme')
msm.info(molecular_system)
form             n_atoms  n_groups  n_components  n_chains  n_molecules  n_entities  n_peptides  n_structures
molsysmt.MolSys  43       4         1             1         1            1           1           1

Most X-ray structures lack hydrogen atoms, so let’s remove them from our system to mimic that situation:

molecular_system = msm.remove(molecular_system, selection='atom_type=="H"')
msm.build.has_hydrogens(molecular_system)
False

Let’s now remove a few heavy atoms to simulate a common kind of experimental defect:

molecular_system = msm.remove(molecular_system, selection='atom_name in ["NE2", "CD2", "OG1"]')

Our system is now ready to demonstrate molsysmt.build.add_missing_heavy_atoms().

In a real case, we would first check whether any heavy atoms are missing from our system:

msm.build.get_missing_heavy_atoms(molecular_system)
{1: ['NE2', 'CD2'], 2: ['OG1']}
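The returned dictionary is easy to post-process with plain Python. Judging from this example, the keys appear to be 0-based group indices: in this peptide, group 1 is the histidine and group 2 is the threonine. As a small sketch, the hypothetical helper `summarize_missing` below turns such a dictionary into readable lines. The `group_names` mapping is written by hand here rather than queried from the system.

```python
# Dictionary as printed above, copied by hand for this sketch.
missing = {1: ["NE2", "CD2"], 2: ["OG1"]}

# Group index -> residue name for the AceHisThrNme peptide (hand-written).
group_names = {0: "ACE", 1: "HIS", 2: "THR", 3: "NME"}


def summarize_missing(missing, group_names):
    """Return (report_lines, total_missing) for a missing-atoms dictionary."""
    lines = []
    for group_index in sorted(missing):
        atoms = ", ".join(missing[group_index])
        lines.append(f"{group_names[group_index]} (group index {group_index}): {atoms}")
    total = sum(len(atoms) for atoms in missing.values())
    return lines, total


lines, total = summarize_missing(missing, group_names)
for line in lines:
    print(line)
print(f"{total} heavy atoms missing")
```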

Great: the missing atoms detected are exactly the ones we removed earlier.
It’s time to add them back to the system.

molecular_system = msm.build.add_missing_heavy_atoms(molecular_system)

We can now confirm that the missing atoms have been successfully restored:

msm.info(molecular_system, element='atom')
index  id  name  type  group index  group id  group name  group type        component index  chain index  molecule index  molecule type  entity index  entity name
0      1   CH3   C     0            1         ACE         terminal capping  0                0            0               peptide        0             peptide 0
1      2   C     C     0            1         ACE         terminal capping  0                0            0               peptide        0             peptide 0
2      3   O     O     0            1         ACE         terminal capping  0                0            0               peptide        0             peptide 0
3      4   N     N     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
4      5   CA    C     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
5      6   CB    C     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
6      7   CG    C     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
7      8   ND1   N     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
8      9   CE1   C     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
9      10  C     C     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
10     11  O     O     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
11     12  NE2   N     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
12     13  CD2   C     1            2         HIS         amino acid        0                0            0               peptide        0             peptide 0
13     14  N     N     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
14     15  CA    C     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
15     16  CB    C     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
16     17  CG2   C     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
17     18  C     C     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
18     19  O     O     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
19     20  OG1   O     2            3         THR         amino acid        0                0            0               peptide        0             peptide 0
20     21  N     N     3            4         NME         terminal capping  0                0            0               peptide        0             peptide 0
21     22  C     C     3            4         NME         terminal capping  0                0            0               peptide        0             peptide 0
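In the atom table above, the restored atoms (NE2, CD2, OG1) are listed after the atoms that were already present in each residue, so the atom order inside a residue is not the usual PDB order. When checking the result programmatically, an order-insensitive comparison is therefore the safer choice. The sketch below is plain Python over (group index, atom name) pairs transcribed by hand from that table. It does not call MolSysMT, and the variable names are illustrative.

```python
from collections import defaultdict

# (group index, atom name) pairs transcribed from the atom table above.
rows = [
    (0, "CH3"), (0, "C"), (0, "O"),
    (1, "N"), (1, "CA"), (1, "CB"), (1, "CG"), (1, "ND1"), (1, "CE1"),
    (1, "C"), (1, "O"), (1, "NE2"), (1, "CD2"),
    (2, "N"), (2, "CA"), (2, "CB"), (2, "CG2"), (2, "C"), (2, "O"), (2, "OG1"),
    (3, "N"), (3, "C"),
]

# Atoms removed earlier in this example, keyed by group index.
removed = {1: ["NE2", "CD2"], 2: ["OG1"]}

# Collect atom names per group, preserving table order.
atoms_by_group = defaultdict(list)
for group_index, atom_name in rows:
    atoms_by_group[group_index].append(atom_name)

# Order-insensitive check: every removed atom is present again.
all_restored = all(
    set(names) <= set(atoms_by_group[group_index])
    for group_index, names in removed.items()
)

# Order-sensitive observation: restored atoms sit at the end of their residue.
appended_last = all(
    atoms_by_group[group_index][-len(names):] == names
    for group_index, names in removed.items()
)

print(all_restored, appended_last)
```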

But we have no hydrogens, right? Find out how to add them in the documentation section Add missing hydrogens.

See also

User guide > Tools > Build > Get missing heavy atoms:
Identify heavy atoms that are missing based on residue templates.

User guide > Tools > Basic > Remove:
Remove atoms from a molecular system.

User guide > Tools > Basic > Contains:
Check if specific elements are present in the system.

User guide > Tools > Basic > Info:
Display a summary of elements and their properties.

User guide > Tools > Build > Build peptide:
Generate capped peptide structures from sequence.