Solve atoms with alternate locations
Choose coordinates for atoms with alternate locations.
Experimentally solved structures deposited in the Protein Data Bank can contain atoms with multiple alternate locations. MolSysMT includes a function in the ‘build’ module to resolve this ambiguity: molsysmt.build.solve_atoms_with_alternate_locations()
To illustrate how this function works, let’s check whether the molecular system with PDB id 1BRS has atoms with alternate locations:
import molsysmt as msm
msm.get('1BRS', alternate_location=True)
[{2686: {'location_id': array(['A', 'B'], dtype=object),
'occupancy': array([0.5, 0.5]),
'b_factor': <Quantity([0.2466 0.2467], 'nanometer ** 2')>,
'atom_id': [2687, 2688],
'coordinates': <Quantity([[3.2742 2.2579 0.1536]
[3.2757 2.2571 0.1533]], 'nanometer')>},
2687: {'location_id': array(['A', 'B'], dtype=object),
'occupancy': array([0.5, 0.5]),
'b_factor': <Quantity([0.2594 0.2596], 'nanometer ** 2')>,
'atom_id': [2689, 2690],
'coordinates': <Quantity([[3.1412 2.241 0.1076]
[3.3396 2.192 0.2619]], 'nanometer')>}}]
MolSysMT returns the information about alternate locations as a list of dictionaries. Each key is the index of an atom with more than one location, and the corresponding value is a dictionary holding the location_id, occupancy, b_factor, atom_id and coordinates of every location of that atom.
We now know that 1BRS has two atoms with alternate locations. Let’s load the system as a ‘molsysmt.MolSys’ object to see how to deal with this situation:
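The structure described above can be walked with ordinary list and dictionary access. The following is a minimal sketch that runs without MolSysMT: it uses a hand-written stand-in for the returned value, with plain lists in place of the NumPy arrays and unit-aware quantities, the values copied from the output above and the b_factor entry left out for brevity. The helper name `summarize_alternate_locations` is hypothetical and not part of MolSysMT.

```python
# Stand-in for the value returned by msm.get(..., alternate_location=True).
# Plain lists replace the NumPy arrays and quantities (coordinates in nanometers).
alternate_locations = [
    {
        2686: {
            "location_id": ["A", "B"],
            "occupancy": [0.5, 0.5],
            "atom_id": [2687, 2688],
            "coordinates": [[3.2742, 2.2579, 0.1536], [3.2757, 2.2571, 0.1533]],
        },
        2687: {
            "location_id": ["A", "B"],
            "occupancy": [0.5, 0.5],
            "atom_id": [2689, 2690],
            "coordinates": [[3.1412, 2.2410, 0.1076], [3.3396, 2.1920, 0.2619]],
        },
    }
]


def summarize_alternate_locations(entries):
    """Return (list position, atom index, location ids) for every listed atom."""
    summary = []
    for position, atoms in enumerate(entries):
        for atom_index, data in sorted(atoms.items()):
            summary.append((position, atom_index, list(data["location_id"])))
    return summary


for row in summarize_alternate_locations(alternate_locations):
    print(row)
```

Each printed row names one atom index together with the location ids available for it, which is usually the first thing to inspect before deciding how to solve the ambiguity.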
molecular_system = msm.convert('pdb_id:1BRS', to_form='molsysmt.MolSys')
The resulting object still keeps the information about the alternate locations:
msm.get(molecular_system, element='atom', selection='atom_index==2686', alternate_location=True)
[{2686: {'location_id': array(['A', 'B'], dtype=object),
'occupancy': array([0.5, 0.5]),
'b_factor': <Quantity([0.2466 0.2467], 'nanometer ** 2')>,
'atom_id': [2687, 2688],
'coordinates': <Quantity([[3.2742 2.2579 0.1536]
[3.2757 2.2571 0.1533]], 'nanometer')>}}]
When a molecular system has atoms with alternate locations, MolSysMT by default takes the location with the highest occupancy. If no single location has the highest occupancy, the location_id “A” is chosen.
msm.get(molecular_system, element='atom', selection=2686, coordinates=True)
<Quantity([[[3.2742 2.2579 0.1536]]], 'nanometer')>
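The default rule stated above (highest occupancy, falling back to location “A” on a tie) can be written in a few lines of plain Python. This is only an illustrative sketch, not MolSysMT's implementation: `pick_default_location` is a hypothetical helper, and returning the first tied location when “A” is not among the tied ones is an assumption the text does not cover.

```python
def pick_default_location(location_ids, occupancies):
    """Return the location id with the highest occupancy.

    If several locations share the highest occupancy, prefer "A" when it is
    among them; otherwise (an assumption of this sketch) return the first
    tied location.
    """
    highest = max(occupancies)
    tied = [loc for loc, occ in zip(location_ids, occupancies) if occ == highest]
    if len(tied) > 1 and "A" in tied:
        return "A"
    return tied[0]


# Atom 2686 of 1BRS: both locations have occupancy 0.5, so "A" is chosen.
print(pick_default_location(["A", "B"], [0.5, 0.5]))  # A

# Hypothetical occupancies where "B" clearly dominates.
print(pick_default_location(["A", "B"], [0.3, 0.7]))  # B
```

The tie is detected with exact float comparison, which is enough for occupancies read verbatim from a file; a tolerance could be added if the values came out of a computation.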
How can we choose a different location for a specific atom? The function molsysmt.build.solve_atoms_with_alternate_locations() can help. For instance, let’s switch all atoms to the alternate location “B” to show how it works:
msm.build.solve_atoms_with_alternate_location(molecular_system, location_id='B')
msm.get(molecular_system, element='atom', selection=2686, atom_id=True, coordinates=True)
[[2688], <Quantity([[[3.2757 2.2571 0.1533]]], 'nanometer')>]
The function molsysmt.build.solve_atoms_with_alternate_locations() accepts the input argument ‘selection’ in case different location ids need to be provided for different atoms:
msm.build.solve_atoms_with_alternate_location(molecular_system, selection=[2686,2687], location_id=['A','B'])
msm.get(molecular_system, element='atom', selection=[2686,2687], atom_id=True, coordinates=True)
[[2687, 2690],
<Quantity([[[3.2742 2.2579 0.1536]
[3.3396 2.192 0.2619]]], 'nanometer')>]
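The pairing between ‘selection’ and ‘location_id’ shown above can be pictured as a one-to-one mapping from atom indices to location ids. The sketch below reproduces that pairing on a small stand-in dictionary (atom ids copied from the earlier output). It is not MolSysMT code: `resolve_atom_ids` is a hypothetical helper, and broadcasting a single string to every selected atom is an assumption made for this sketch.

```python
# Stand-in alternate-location data for the two atoms of 1BRS discussed above.
alternate_locations = {
    2686: {"location_id": ["A", "B"], "atom_id": [2687, 2688]},
    2687: {"location_id": ["A", "B"], "atom_id": [2689, 2690]},
}


def resolve_atom_ids(data, selection, location_id):
    """Return the atom_id chosen for each selected atom index.

    `location_id` is either one string applied to every selected atom or a
    list with one location id per selected atom.
    """
    if isinstance(location_id, str):
        location_id = [location_id] * len(selection)
    if len(location_id) != len(selection):
        raise ValueError("selection and location_id must have the same length")

    chosen = []
    for atom_index, wanted in zip(selection, location_id):
        entry = data[atom_index]
        position = entry["location_id"].index(wanted)  # ValueError if absent
        chosen.append(entry["atom_id"][position])
    return chosen


print(resolve_atom_ids(alternate_locations, [2686, 2687], ["A", "B"]))  # [2687, 2690]
print(resolve_atom_ids(alternate_locations, [2686, 2687], "B"))         # [2688, 2690]
```

The first call matches the atom ids reported in the output above: location “A” for atom 2686 and location “B” for atom 2687.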
The input argument ‘location_id’ accepts an extra value: ‘occupancy’. With ‘occupancy’, each atom takes the location with the highest occupancy, or the location with id ‘A’ if all occupancy values are equal.
msm.build.solve_atoms_with_alternate_location(molecular_system, selection=[2686,2687], location_id='occupancy')
msm.get(molecular_system, element='atom', selection=[2686,2687], atom_id=True, coordinates=True)
[[2687, 2689],
<Quantity([[[3.2742 2.2579 0.1536]
[3.1412 2.241 0.1076]]], 'nanometer')>]
Warning
Some molecular system forms cannot keep the alternate location data once these atoms are solved. With these forms, the function molsysmt.build.solve_atoms_with_alternate_locations() can only be applied once. This is the case for forms such as ‘string:pdb_text’, ‘string:pdb_id’ or ‘file:pdb’.
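One way to picture why text-based forms allow only a single pass: once the records of the discarded locations are dropped from the PDB text, nothing is left to choose from. The sketch below illustrates this with plain string handling and is not how MolSysMT works internally. `keep_location` and `ATOM_TEMPLATE` are hypothetical names. The serial numbers, occupancies and values converted to angstroms follow the output shown earlier, while the atom name, residue name, chain and residue number are placeholders.

```python
# Template for a fixed-width PDB ATOM record; the alternate-location
# indicator sits in column 17 (index 16 in Python).
ATOM_TEMPLATE = (
    "ATOM  {serial:5d} {name:<4s}{altloc:1s}{resname:>3s} {chain:1s}{resseq:4d}    "
    "{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}"
)

# Two records for the same atom at locations A and B.
# Atom name, residue name, chain and residue number are placeholders.
records = [
    ATOM_TEMPLATE.format(serial=2687, name="X", altloc="A", resname="UNK", chain="A",
                         resseq=1, x=32.742, y=22.579, z=1.536, occ=0.5, b=24.66),
    ATOM_TEMPLATE.format(serial=2688, name="X", altloc="B", resname="UNK", chain="A",
                         resseq=1, x=32.757, y=22.571, z=1.533, occ=0.5, b=24.67),
]
pdb_text = "\n".join(records)


def keep_location(text, location_id):
    """Keep atom records that have no altLoc flag or the requested one."""
    kept = []
    for line in text.splitlines():
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and line[16] not in (" ", location_id):
            continue  # record belongs to a discarded alternate location
        kept.append(line)
    return "\n".join(kept)


solved = keep_location(pdb_text, "A")
print(len(solved.splitlines()))                      # 1

# Asking for location "B" afterwards finds nothing: that record is gone.
print(len(keep_location(solved, "B").splitlines()))  # 0
```

With forms like these, a practical habit is to keep the original text or file around and solve the alternate locations on a copy.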