Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to map from WBOFragmenter input molecule to parent molecule #129

Open
j-wags opened this issue Nov 6, 2021 · 1 comment
Open
Labels
enhancement New feature or request

Comments

@j-wags
Copy link
Member

j-wags commented Nov 6, 2021

Summary

Many users probably want to follow specific atoms/bonds through the fragmentation process. However, as a result of performing canonicalization of input molecules, there's no way to map from an atom in the input molecule of a Fragmenter job to an atom in the result's parent_molecule or any of the output fragments.

Reproducing example

from openff.toolkit.topology import Molecule
from openff.fragmenter.fragment import WBOFragmenter
mol1 = Molecule.from_smiles('ClCCCF')
frag_engine = WBOFragmenter()
result = frag_engine.fragment(mol1)

def draw_mol_and_label_atom_index(mol):
    rdmol = mol.to_rdkit()
    for atom in rdmol.GetAtoms():
        atom.SetAtomMapNum(atom.GetIdx())
    return rdmol

mol1 has the following atom map, and the drawing below shows the atom indices (NOT map indices)

print(mol1.properties)
draw_mol_and_label_atom_index(mol1)

{'atom_map': {0: 1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 9,
9: 10,
10: 11}}

image

The result's parent_molecule has the following atom map, and the drawing below shows the atom indices (NOT map indices)

result.parent_molecule.properties
draw_mol_and_label_atom_index(result.parent_molecule)

{'atom_map': {0: 1,
1: 3,
2: 5,
3: 4,
4: 2,
5: 8,
6: 9,
7: 10,
8: 11,
9: 6,
10: 7}}

image

We see that the Cl switched from being atom index 0 in the input, to atom index 4 in the parent. There's no direct way to map from 0 to 4 given the information in the result object.

Solutions

I think the root cause of this issue is that the OpenFF toolkit performs canonicalization, but it doesn't have a way to return the atom mapping that got applied. I'll open an issue on the Toolkit repo to have that mapping get returned, and once that's available we can provide it to the user here.

@j-wags j-wags added the enhancement New feature or request label Nov 6, 2021
@j-wags j-wags changed the title No way to map from WBOFragmenter input molecule to parent molecule Provide a way to map from WBOFragmenter input molecule to parent molecule Nov 6, 2021
@SimonBoothroyd
Copy link
Collaborator

SimonBoothroyd commented Nov 6, 2021

I think the root cause of this issue is that the OpenFF toolkit performs canonicalization, but it doesn't have a way to return the atom mapping that got applied. I'll open an issue on the Toolkit repo to have that mapping get returned, and once that's available we can provide it to the user here.

This was the main reason. And as a slight tangent, there is additional code scattered throughout the codebase that basically reapplies the atom map after things like conformer generation or WBO generation because the TK will discard this info.

Currently, as @jthorton mentioned #126, we recommend retrieving the parent molecule from the results object.

There's no direct way to map from 0 to 4 given the information in the result object.

That's not strictly true. @ldamore , If you really do want to re-map the old map indices to the new without waiting for a new TK release, just do an isomorphism check of the old with the new parent and set return atom maps = True and use that to port the indices over. However note that the indices in the fragment correspond to the map indices in the parent of the results object

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants