Topology
A molecular system in MolSysMT has two parts:
Topology: what the system is — atoms, groups, chains, molecules, entities…
Structures: where those atoms are in space — coordinates, optional box, optional time…
This page focuses on topology. We will keep examples simple and descriptive, so you can look at your own system and immediately recognize these ideas.
import molsysviewer as viewer
view = viewer.demo["pentalanine"]
view.show()
As explained in Loading and inspection, the molecular system is stored in view as view.molsys. Inside it, you will find the topology object at view.molsys.topology:
type(view.molsys.topology)
molsysmt.native.topology.Topology
Topology levels
MolSysMT (and therefore MolSysViewer) can query the system at different hierarchical levels, called elements in MolSysMT.
Here is a quick reference table:
| Element | Meaning |
|---|---|
| atom | Atoms of the molecular system. |
| group | First supra-atomic chemical level: amino acids, waters, ions, lipids, etc. |
| component | A covalently connected set of atoms (a covalent component). |
| molecule | A single molecular instance (one water molecule, one benzene molecule, etc.). |
| chain | An author-defined grouping that can vary across sources; it may represent anything from a polymer chain to a higher-level supra-molecular partition. |
| entity | Molecular nature/type. For example, you can have two entities (water and benzene) and six molecules (four waters and two benzenes). |
| system | The whole molecular system. |
| bond | Covalent connectivity between atoms (may be absent or inferred depending on the input). |
You do not need to memorize these definitions. The key idea is simple: selections and queries can be applied at any of these levels, depending on what you are trying to do.
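The difference between the molecule and entity levels can be made concrete with a small, library-independent sketch. The list below is a hypothetical stand-in for the per-molecule names of the water/benzene example in the table; it does not call MolSysMT.

```python
from collections import Counter

# Hypothetical per-molecule names for the example in the table:
# four water molecules and two benzene molecules.
molecule_names = ["water", "water", "benzene", "water", "benzene", "water"]

# Each list item is one molecule (a single molecular instance).
n_molecules = len(molecule_names)

# Each distinct name is one entity (a molecular nature/type).
molecules_per_entity = Counter(molecule_names)
n_entities = len(molecules_per_entity)

print(n_molecules, n_entities)
print(dict(molecules_per_entity))
```

Counting list items gives the number of molecules, while counting distinct names gives the number of entities.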
In MolSysMT’s topology object, each element level has its own pandas DataFrame. For example, inspect the groups DataFrame in view.molsys.topology:
view.molsys.topology.groups
| | group_id | group_name | group_type | molecule_index |
|---|---|---|---|---|
| 0 | 1 | ACE | terminal capping | 0 |
| 1 | 2 | ALA | amino acid | 0 |
| 2 | 3 | ALA | amino acid | 0 |
| 3 | 4 | ALA | amino acid | 0 |
| 4 | 5 | ALA | amino acid | 0 |
| 5 | 6 | ALA | amino acid | 0 |
| 6 | 7 | NME | terminal capping | 0 |
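Because each level is an ordinary pandas DataFrame, standard pandas indexing should apply to it. The sketch below rebuilds the table above by hand as a stand-in, so it runs without MolSysMT. The variable `groups` is an assumption for illustration, and the column dtypes of the real object may differ.

```python
import pandas as pd

# Hand-built stand-in for view.molsys.topology.groups, copied from the table above.
groups = pd.DataFrame(
    {
        "group_id": [1, 2, 3, 4, 5, 6, 7],
        "group_name": ["ACE", "ALA", "ALA", "ALA", "ALA", "ALA", "NME"],
        "group_type": ["terminal capping"] + ["amino acid"] * 5 + ["terminal capping"],
        "molecule_index": [0] * 7,
    }
)

# Boolean mask: keep only the rows whose group_type is "amino acid".
amino_acids = groups[groups["group_type"] == "amino acid"]

# Count how many groups of each type the peptide contains.
type_counts = groups["group_type"].value_counts()

print(amino_acids["group_name"].tolist())
print(type_counts.to_dict())
```

With a loaded system, the same two expressions would presumably work on view.molsys.topology.groups directly.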
A quick look with info
view.info(...) is a small convenience wrapper around MolSysMT’s info. It shows a summary table in Jupyter.
Try changing the element to see which levels your system can describe.
view.info(element="system")
| form | n_atoms | n_groups | n_components | n_chains | n_molecules | n_entities | n_peptides | n_structures |
|---|---|---|---|---|---|---|---|---|
| molsysmt.MolSys | 62 | 7 | 1 | 1 | 1 | 1 | 1 | 200 |
view.info(element="molecule")
| index | name | type | n atoms | n groups | n components | chain index | entity index | entity name |
|---|---|---|---|---|---|---|---|---|
| 0 | peptide 0 | peptide | 62 | 7 | 1 | 0 | 0 | peptide 0 |
Extracting a few attributes with get
view.get(...) is the companion of info(...): it returns attribute values you can use in your Python code.
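For example, you can retrieve atom- and group-level attributes such as atom_name, atom_id, or group_type. With element="atom", each returned list has one entry per selected atom, so group_type reports the type of the group each atom belongs to.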
atom_name, atom_id, group_type = view.get(
    element="atom",
    selection=[0, 1, 2, 3, 4],
    atom_name=True,
    atom_id=True,
    group_type=True,
)
atom_name, atom_id, group_type
(['H1', 'CH3', 'H2', 'H3', 'C'],
['0', '1', '2', '3', '4'],
['terminal capping',
'terminal capping',
'terminal capping',
'terminal capping',
'terminal capping'])
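The returned lists can then be combined with plain Python. This is a minimal sketch using the output above, copied in by hand so it runs without the viewer. The name-based filter at the end is only a naming heuristic for illustration, not an element lookup.

```python
# Values copied from the output above (stand-ins for the result of view.get).
atom_name = ["H1", "CH3", "H2", "H3", "C"]
atom_id = ["0", "1", "2", "3", "4"]
group_type = ["terminal capping"] * 5

# The lists are aligned by position: entry i of each list describes the same atom.
atoms = [
    {"atom_name": name, "atom_id": id_, "group_type": gtype}
    for name, id_, gtype in zip(atom_name, atom_id, group_type)
]

# Example query on the combined records: atoms whose name starts with "H".
h_named = [atom["atom_name"] for atom in atoms if atom["atom_name"].startswith("H")]
print(h_named)
```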
Discovering what topological attributes are available
Sometimes you do not know which attribute names exist for your current system.
MolSysMT provides a discovery function for that: molsysmt.basic.get_attributes().
We do not wrap it in MolSysViewer because it is mostly a power-user helper, but it is a great tool to keep in mind when you are exploring a new dataset or a new form.
import molsysmt as msm
attrs = msm.get_attributes(view.molsys, attribute_type='topological', output_type='list')
attrs
['atom_index',
'atom_name',
'atom_id',
'atom_type',
'group_index',
'group_name',
'group_id',
'group_type',
'component_index',
'component_name',
'component_id',
'component_type',
'chain_index',
'chain_name',
'chain_id',
'chain_type',
'molecule_index',
'molecule_name',
'molecule_id',
'molecule_type',
'entity_index',
'entity_name',
'entity_id',
'entity_type',
'bond_index',
'bond_type',
'bond_order',
'bonded_atoms',
'bonded_atom_pairs',
'inner_bonded_atoms',
'inner_bonded_atom_pairs',
'inner_bond_index',
'n_atoms',
'n_groups',
'n_components',
'n_chains',
'n_molecules',
'n_entities',
'n_bonds',
'n_inner_bonds',
'n_amino_acids',
'n_nucleotides',
'n_ions',
'n_waters',
'n_small_molecules',
'n_peptides',
'n_proteins',
'n_dnas',
'n_rnas',
'n_lipids',
'n_polysaccharides',
'n_saccharides']
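Most names in this list follow a regular pattern: an element prefix such as atom_ or group_, or an n_ prefix for counters. That makes the list easy to slice with plain Python. The sketch below works on a short excerpt copied by hand; with a real result you would use attrs directly.

```python
# Short excerpt of the list above, copied by hand so the snippet is self-contained.
attrs = [
    "atom_index", "atom_name", "atom_id", "atom_type",
    "group_index", "group_name", "group_id", "group_type",
    "bond_index", "bond_type", "bond_order",
    "n_atoms", "n_groups", "n_bonds", "n_waters",
]

# Attributes that describe a given element level share its name as a prefix.
group_attrs = [name for name in attrs if name.startswith("group_")]

# Counters all start with "n_".
counters = [name for name in attrs if name.startswith("n_")]

print(group_attrs)
print(counters)
```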
A note about editing topology
In most workflows, you do not change topology from inside the viewer: you load a system, inspect it, select parts of it, and visualize it.
If you ever need to modify the underlying system (rename something, remove atoms…), do it through the explicit MolSysViewer methods that are designed to keep the visualization consistent. Editing view.molsys directly is possible, but it is an advanced operation, and it can easily leave the rendered view out of sync with the underlying data.