Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autogenerate modification auto-mappings #604

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

Conversation

pckroon
Copy link
Member

@pckroon pckroon commented Jun 24, 2024

No description provided.

@pckroon pckroon requested a review from fgrunewald June 24, 2024 14:00
@fgrunewald
Copy link
Member

@pckroon this seems to cause a flood of warnings:

 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['N-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['N-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['C-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['C-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['C-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - These atoms are not covered by a mapping. Either your mappings don't describe all atoms (bad idea), or, there's no mapping available for all residues. ['171A-ASN21:OXT', '412B-ALA30:OXT'

@pckroon
Copy link
Member Author

pckroon commented Jun 26, 2024

Alright, that is a problem, since the modification mappings are known. I'll dig a bit

@pckroon
Copy link
Member Author

pckroon commented Jun 26, 2024

It currently produces the following errors:

   ERROR - general - The following atoms do not have a atype: [
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN2'}, 'order': 0, 'atomname': 'HN2', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91890>, 'mapping_weights': {607: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN3'}, 'order': 0, 'atomname': 'HN3', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91b10>, 'mapping_weights': {610: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'O', 'order': 0, 'atomname': 'OXT', 'modifications': [<Modification "C-ter" at 0x7fc99e898850>, <Modification "('C-ter',)" at 0x7fc99880e710>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91650>, 'mapping_weights': {601: 1}, 'resname': 'GLY', 'chain': 'A', 'resid': 76, '_old_resid': 76, 'position': array([4.0862, 3.9575, 3.6251])}]

   ERROR - general - The following atoms do not have a charge_group: [
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN2'}, 'order': 0, 'atomname': 'HN2', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91890>, 'mapping_weights': {607: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN3'}, 'order': 0, 'atomname': 'HN3', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91b10>, 'mapping_weights': {610: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'O', 'order': 0, 'atomname': 'OXT', 'modifications': [<Modification "C-ter" at 0x7fc99e898850>, <Modification "('C-ter',)" at 0x7fc99880e710>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91650>, 'mapping_weights': {601: 1}, 'resname': 'GLY', 'chain': 'A', 'resid': 76, '_old_resid': 76, 'position': array([4.0862, 3.9575, 3.6251])}]

The missing atype is a data issue on the modifications; and the missing charge_group is probably an issue with the mapping processor (which makes me very sad).
Also note that some of these atoms have nan-positions

@pckroon
Copy link
Member Author

pckroon commented Jun 28, 2024

Alright, this should fix most of the mess, at least regarding charge groups. The AA modifications are still missing some critical data, such as charge and atom types.
I don't have the time to fix the patch test coverage

mod_to_out[mod_idx] = out_idx # pylint: disable=undefined-loop-variable
graph_out.nodes[out_idx].update(modification.nodes[mod_idx].get('replace', {})) # pylint: disable=undefined-loop-variable
graph_out.nodes[out_idx]['modifications'] = graph_out.nodes[out_idx].get('modifications', [])
if modification not in graph_out.nodes[out_idx]['modifications']:
graph_out.nodes[out_idx]['modifications'].append(modification)

# FIXME Jank here to ensure the charge_group attributes look reasonable
charge_group_start = max(graph_out.nodes[mod_to_out[idx]].get('charge_group', 1) for idx in anchors) if anchors else 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is 1 a good default value? Shouldn't all residues at this stage have a charge group?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should? Yes. But I've been bitten by that word too often. Note that the number picked here does get incremented by the number of nodes before this in the modification. Admittedly it's not great, but it's ok.

Copy link
Member

@fgrunewald fgrunewald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't spot anything obvious but had some questions. However, I must admit I hardly comprehend the mapping module so no guarantees. But I assume you tested this? Perhaps worth to add an integration test instead of patching the code coverage with unit tests


# Time to collect the attributes to set. Start by grabbing all the
# attributes we already have.
node_attrs = deepcopy(graph_out.nodes[out_idx])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deepcopy because of the graph attribute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. And paranoia.

Comment on lines +774 to +790
for node in graph_out:
for attr in attribute_must:
if attr not in graph_out.nodes[node]:
# We can skip `node`, since if it had the attribute required we
# wouldn't be in this mess to begin with.
# Set a depth_limit to keep the runtime within bounds.
close_nodes = (v for (u, v) in nx.bfs_edges(graph_out, source=node, depth_limit=3))
for template_node in close_nodes:
if attr in graph_out.nodes[template_node]:
graph_out.nodes[node][attr] = deepcopy(graph_out.nodes[template_node][attr])
LOGGER.warning("Atom {} did not have a {}, so we took "
"it from atom {} instead.",
format_atom_string(graph_out.nodes[node]),
attr,
format_atom_string(graph_out.nodes[template_node]),
type="inconsistent-data")
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get how this can be the case. Is that for hydrogen modifications?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens whenever a output particle is made up of only nodes that were reconstructed by repair graph. So unless there's entire sidechains missing you generally only see this with atomistic models, which is why this remained hidden for so long.

@pckroon
Copy link
Member Author

pckroon commented Jul 5, 2024

But I assume you tested this? Perhaps worth to add an integration test instead of patching the code coverage with unit tests

I tested this roughly (make an atomistic 1ubq, see if the resulting itp looks mostly reasonable by staring at it).
If/when you have modifications with appropriate parameters I'd be happy to run it again, check it more thoroughly, and add that as integration test.

@fgrunewald
Copy link
Member

fgrunewald commented Jul 5, 2024

@pckroon we're almost there just one ingredient missing: we need to patch the rtp paser to generate all missing dihedrals and pairs

do you want to take a stab at it? I'm not saying a complete rewrite just patch

@pckroon
Copy link
Member Author

pckroon commented Jul 5, 2024

Doesn't https://github.com/marrink-lab/vermouth-martinize/blob/master/vermouth/gmx/rtp.py#L257 already generate all the dihedrals? (Should, anyway).
About the pairs, are those the ones meant by "TODO: generate 1-4 interactions between pairs of hydrogen atoms"?

@fgrunewald
Copy link
Member

for lysozyme I'm missing like 4000 dihedrals, all angles, and a bunch of bonds. I think Jon just makes the dihedrals for links? Also for charmm cmaps are missing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants