Topology

A molecular system in MolSysMT has two parts:

Topology: what the system is — atoms, groups, chains, molecules, entities…
Structures: where those atoms are in space — coordinates, optional box, optional time…

This page focuses on topology. We will keep examples simple and descriptive, so you can look at your own system and immediately recognize these ideas.

import molsysviewer as viewer

view = viewer.demo["pentalanine"]
view.show()

As explained in Loading and inspection, the molecular system is stored in view as view.molsys. Inside it, you will find the topology object at view.molsys.topology:

type(view.molsys.topology)

molsysmt.native.topology.Topology

Topology levels

MolSysMT (and therefore MolSysViewer) can query the system at different hierarchical levels, called elements in MolSysMT.

Here is a quick reference table:

Element	Meaning
`atom`	Atoms of the molecular system.
`group`	First supra-atomic chemical level: amino acids, waters, ions, lipids, etc.
`component`	A covalently connected set of atoms (a covalent component).
`molecule`	A single molecular instance (one water molecule, one benzene molecule, etc.).
`chain`	An author-defined grouping that can vary across sources; it may represent anything from a polymer chain to a higher-level supra-molecular partition.
`entity`	Molecular nature/type. For example, you can have two entities (water and benzene) and six molecules (four waters and two benzenes).
`system`	The whole molecular system.
`bond`	Covalent connectivity between atoms (may be absent or inferred depending on the input).

You do not need to memorize these definitions. The key idea is simple: selections and queries can be applied at any of these levels, depending on what you are trying to do.
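The difference between `molecule` and `entity` in the table can be made concrete with a small sketch that does not depend on any library. The list below is hypothetical plain-Python data (not a MolSysMT object) for the water and benzene example: each item stands for one molecule and is labeled with the name of its entity.

```python
from collections import Counter

# Hypothetical system: one label per molecule, naming its entity.
# This is plain Python data, not a MolSysMT topology.
molecule_entities = ["water", "water", "benzene", "water", "benzene", "water"]

# Every item is an individual molecular instance.
n_molecules = len(molecule_entities)

# Distinct labels correspond to distinct molecular types (entities).
molecules_per_entity = Counter(molecule_entities)
n_entities = len(molecules_per_entity)

print(n_molecules, n_entities, dict(molecules_per_entity))
```

Counting items gives the number of molecules, while counting distinct labels gives the number of entities, which is the relationship the table describes.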

In MolSysMT’s topology object, each element level has its own pandas DataFrame. For example, inspect the groups DataFrame in view.molsys.topology:

view.molsys.topology.groups

	group_id	group_name	group_type
0	1	ACE	terminal capping
1	2	ALA	amino acid
2	3	ALA	amino acid
3	4	ALA	amino acid
4	5	ALA	amino acid
5	6	ALA	amino acid
6	7	NME	terminal capping
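
Because each level is a regular pandas DataFrame, ordinary pandas filtering applies. The sketch below rebuilds the table above by hand so that it runs without MolSysMT, then keeps only the amino-acid groups. With a real system, the same boolean mask should work on view.molsys.topology.groups, assuming the column names match those shown.

```python
import pandas as pd

# Hand-built copy of the groups table shown above (a stand-in for
# view.molsys.topology.groups, so the example runs without MolSysMT).
groups = pd.DataFrame(
    {
        "group_id": [1, 2, 3, 4, 5, 6, 7],
        "group_name": ["ACE", "ALA", "ALA", "ALA", "ALA", "ALA", "NME"],
        "group_type": ["terminal capping"] + ["amino acid"] * 5 + ["terminal capping"],
    }
)

# Boolean mask: keep only rows whose group_type is "amino acid".
amino_acids = groups[groups["group_type"] == "amino acid"]

# Count how many groups of each type the system contains.
type_counts = groups["group_type"].value_counts().to_dict()

print(amino_acids["group_id"].tolist())
print(type_counts)
```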

A practical way to navigate the hierarchy

A very common pattern is:

start with an atom-level selection (atoms are concrete and easy to pick),
then retrieve a few attributes that tell you where those atoms belong (group name, chain index, entity name…).

This is how you connect “I see these atoms” to “what biological unit is this?”

ca_atoms = view.select('atom_name=="CA"')
ca_atoms[:10]

[8, 18, 28, 38, 48]

view.get(
    element="atom",
    selection=ca_atoms[:5],
    atom_name=True,
    group_id=True,
    group_name=True,
    chain_index=True
)

[['CA', 'CA', 'CA', 'CA', 'CA'],
 ['2', '3', '4', '5', '6'],
 ['ALA', 'ALA', 'ALA', 'ALA', 'ALA'],
 [0, 0, 0, 0, 0]]
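
As the output above shows, get returns one list per requested attribute, apparently in the order the attributes were requested. To read the result atom by atom, the lists can be transposed with zip. The sketch below copies the printed values into plain Python lists so that it runs on its own; the variable names are illustrative only.

```python
# Values copied from the output above: one list per attribute.
atom_names = ["CA", "CA", "CA", "CA", "CA"]
group_ids = ["2", "3", "4", "5", "6"]
group_names = ["ALA", "ALA", "ALA", "ALA", "ALA"]
chain_indices = [0, 0, 0, 0, 0]

# Transpose the per-attribute lists into one record per atom.
# group_id is printed as a string above, so it is converted to int here.
records = [
    {"atom_name": a, "group_id": int(g), "group_name": n, "chain_index": c}
    for a, g, n, c in zip(atom_names, group_ids, group_names, chain_indices)
]

for record in records:
    print(record)
```

Each record now answers the question "where does this atom belong?" in one place, which is often easier to scan than four parallel lists.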

A quick look with `info`

view.info(...) is a small convenience wrapper around MolSysMT’s info. It shows a summary table in Jupyter.

Try changing the element argument to see the summary at other levels.

view.info(element="system")

form	n_atoms	n_groups	n_components	n_chains	n_molecules	n_entities	n_peptides	n_structures
molsysmt.MolSys	62	7	1	1	1	1	1	200

view.info(element="molecule")

index	name	type	n atoms	n groups	n components	chain index	entity index	entity name
0	peptide 0	peptide	62	7	1	0	0	peptide 0

Extracting a few attributes with `get`

view.get(...) is the companion of info(...): it returns attribute values you can use in your Python code.

For example, you can retrieve atom- and group-level attributes such as atom_name, atom_id, or group_type.

atom_name, atom_id, group_type = view.get(
    element="atom",
    selection=[0, 1, 2, 3, 4],
    atom_name=True,
    atom_id=True,
    group_type=True,
)

atom_name, atom_id, group_type

(['H1', 'CH3', 'H2', 'H3', 'C'],
 ['0', '1', '2', '3', '4'],
 ['terminal capping',
  'terminal capping',
  'terminal capping',
  'terminal capping',
  'terminal capping'])

Discovering what topological attributes are available

Sometimes you do not know which attribute names exist for your current system.

MolSysMT provides a discovery function for that: molsysmt.basic.get_attributes(), called below as msm.get_attributes().

We do not wrap it in MolSysViewer because it is mostly a power-user helper, but it is a great tool to keep in mind when you are exploring a new dataset or a new form.

import molsysmt as msm

attrs = msm.get_attributes(view.molsys, attribute_type='topological', output_type='list')
attrs

['atom_index',
 'atom_name',
 'atom_id',
 'atom_type',
 'group_index',
 'group_name',
 'group_id',
 'group_type',
 'component_index',
 'component_name',
 'component_id',
 'component_type',
 'chain_index',
 'chain_name',
 'chain_id',
 'chain_type',
 'molecule_index',
 'molecule_name',
 'molecule_id',
 'molecule_type',
 'entity_index',
 'entity_name',
 'entity_id',
 'entity_type',
 'bond_index',
 'bond_type',
 'bond_order',
 'bonded_atoms',
 'bonded_atom_pairs',
 'inner_bonded_atoms',
 'inner_bonded_atom_pairs',
 'inner_bond_index',
 'n_atoms',
 'n_groups',
 'n_components',
 'n_chains',
 'n_molecules',
 'n_entities',
 'n_bonds',
 'n_inner_bonds',
 'n_amino_acids',
 'n_nucleotides',
 'n_ions',
 'n_waters',
 'n_small_molecules',
 'n_peptides',
 'n_proteins',
 'n_dnas',
 'n_rnas',
 'n_lipids',
 'n_polysaccharides',
 'n_saccharides']
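
The list is long, but most names follow a visible pattern: either element_property (such as atom_name) or an n_ counter (such as n_atoms). A short sketch, using an abbreviated copy of the list above rather than a live call, groups the names by prefix to show what each level offers. Names that do not fit the pattern, such as bonded_atoms, are left out of this copy.

```python
from collections import defaultdict

# Abbreviated copy of the attribute list printed above.
attrs = [
    "atom_index", "atom_name", "atom_id", "atom_type",
    "group_index", "group_name", "group_id", "group_type",
    "chain_index", "chain_name", "chain_id", "chain_type",
    "bond_index", "bond_type", "bond_order",
    "n_atoms", "n_groups", "n_chains", "n_bonds",
]

# Split each name at the first underscore: prefix -> list of suffixes.
by_prefix = defaultdict(list)
for name in attrs:
    prefix, _, rest = name.partition("_")
    by_prefix[prefix].append(rest)

# The "n" prefix holds the counters; separate it from the element levels.
counters = by_prefix.pop("n")

for element, properties in by_prefix.items():
    print(element, properties)
print("counters:", counters)
```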

A note about editing topology

In most workflows, you do not change topology from inside the viewer: you load a system, inspect it, select parts of it, and visualize it.

If you ever need to modify the underlying system (rename something, remove atoms…), use the explicit MolSysViewer methods designed to keep the visualization consistent. Directly editing view.molsys is possible, but it is an advanced move that can easily desynchronize what you see from the underlying data.