New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Fix/links dict #443

Open

fgrunewald wants to merge 15 commits into master from fix/links_dict

Member

fgrunewald commented Jul 6, 2022

This PR implements a temporary interactions dict instead of using the replace_interaction function in DoLinks. The benefit is that do-links does not iterate over the complete interactions list every time an interaction is added. In addition it now automatically takes care of symmetric interactions. I also refactored the code a little bit to make it a little more readable and added doc-strings.

The cysteine problem is fixed in this PR as well. In the end the most simple fix is to allow the oder match to be true whenever a '*' wildcard is involved irrespective of the reisds. That might seem counterintuitive, however, the graph matching basically makes sure that we never find a node connected to itself, which is the only thing this check checks. I've tested in on the relent PDB as well and it produces the correct number of disulphide bridges. I see how this could be considered dangerous. In that case I think it would be best to use the mol_idx for making sure the nodes are different rather than the chain ID, which is really only used for proteins.

fgrunewald added 7 commits

July 6, 2022 11:27


          add draft for ff-output

04c3beb


          refactor do-links; fist pass all tests

afc6c9f


          reset class variables after running processor

5ba32cf


          remove doubles

df91e95


          implement cysteine fix

364a264


          also delete symmetric interactions

ff21bdc


          remove ffoutput

a502ffa

fgrunewald requested a review from pckroon

July 6, 2022 13:49

Member Author

fgrunewald commented Jul 6, 2022

Based on SARS-COV2 Spike protein it gives about 40% speedup

pckroon requested changes

View reviewed changes

Member

pckroon left a comment

That's an impressive speed increase!
I do have few questions/comments. How do you feel about moving this to Molecule instead? And I think there's a few (small) issues with orders of operations. Finally, I don't see why the "scheduled changes" need to be instance variables, rather than local variables.

vermouth/processors/do_links.py Outdated Show resolved Hide resolved

vermouth/processors/do_links.py Show resolved Hide resolved

vermouth/processors/do_links.py Outdated Show resolved Hide resolved

vermouth/processors/do_links.py Outdated Show resolved Hide resolved

vermouth/processors/do_links.py Outdated Show resolved Hide resolved

vermouth/processors/do_links.py

Comment on lines +373 to +374

		for edge in self.current_link.edges:
		molecule.add_edge(match[edge[0]], match[edge[1]])

Member

pckroon Jul 6, 2022

This does nothing, since links must be isomorphic to molecule anyway?

Member Author

fgrunewald Jul 6, 2022

I guess so but wasn't 100% sure.

vermouth/processors/do_links.py

		for link in links:
		self.current_link = link

Member

pckroon Jul 6, 2022

Why not pass link as argument to the functions/methods?

Member Author

fgrunewald Jul 6, 2022

mehh I thought it was a clever choice today but I see the point

vermouth/processors/do_links.py

+                              self.add_edges(molecule, match)
+                          # we remove the nodes scheduled for removal
+                          molecule.remove_nodes_from(self.nodes_to_remove)

Member

pckroon Jul 6, 2022

Add interactions first, then remove nodes. That will be slower, but it could be link 1 adds an interaction on a node that link 2 removes.

vermouth/processors/do_links.py Outdated Show resolved Hide resolved

vermouth/processors/do_links.py

Comment on lines +404 to +407

+                      self.applied_links = defaultdict(dict)
+                      self.current_match = None
+                      self.current_link = None
+                      self.nodes_to_remove = []

Member

pckroon Jul 6, 2022

Not a fan. Why not create local dicts, and update those with the results of the methods?

Member Author

fgrunewald Jul 6, 2022

because you've to hand down 4 extra variables each time ... There must be a more pythonic way to clear them though

Member

pckroon Jul 7, 2022

removed_nodes = self.replace_attributes(molecule, link, match)
nodes_to_remove.extend(removed_nodes)

And update for dicts, etc.

fgrunewald and others added 8 commits

July 11, 2022 12:15


          Update vermouth/processors/do_links.py

823a491

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

6f41fd6

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

d8e4bf7

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

6a70b87

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

b6b242a

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

c25a8de

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

145b26e

Co-authored-by: Peter C Kroon <[email protected]>


          Update vermouth/processors/do_links.py

411102c

Co-authored-by: Peter C Kroon <[email protected]>

Member Author

fgrunewald commented Jul 11, 2022

@pckroon How about making interactions a custom dict that essentially emulates defaultdict(list) in interface except when __getitem__ etc. methods get an additional keyword? In general making interactions dict custom would also be a way to implement the custom type based atom-key selection. For example, for angles (A, B, C) would return the same value as (C, B, A) and something like (C, B, A) in interactions_dict would evaluate to true as well as using (A, B, C). However, for other types like VS or exclusions a different evaluator would be used. It's kinda complicated but could make the API quite simple to use and robust. Of course it kinda limits interactions again to GROMACS style, but we could have a class creator method that potentially sets types with special key considerations??

Member

pckroon commented Jul 13, 2022

would you be open to doing this in this PR?

Sure, but don't underestimate how much work it's going to be.

I think the cleanest way is not to mess with dict methods, but rather to make an actual Interaction(tuple) (and SymmetricInteraction(Interaction)) class, to be subclassed as needed. Then give those classes the appropriate __eq__ and __hash__ methods. Different interactions can have more intelligent methods, think of dihedrals with versions, or vsites. For the hash, I think the best way would be e.g.

def __hash__(self):
    return hash((self.atoms, reversed(self.atoms))

This means that in that dict I can ask for Bond(1, 2) or Bond(2, 1) and get back the same item.
Example in code:

current_bond = mol.interactions['bonds'][Bond(1, 2)]

Thinking about this I don't really like this yet. Currently you do mol.interactions['bonds'][(1, 2)], and then get back an Interaction object. As I propose you'd use an Interaction object as key, which begs the question what the value should be. (InteractionParameters?)
Implementation-wise I do think this is cleaner and easier to maintain then subclassing (default)dict with an intelligent __getitem__. Subclassing dict results in a nicer API though. How would this dict decide which comparison to use?

What's your take?

As for maintaining compatibility with the currently existing libraries I suggest turning Molecule.interactions into a property, which simply yields from the appropriate things.
For future compat with interaction types we don't know about (yet), you could either make thin subclasses of Interaction, or just use Interaction itself.

Member Author

fgrunewald commented Jul 14, 2022 •

edited

Loading

@pckroon I think the problem with the organisation of interactions is the following: We don't actually know, how they will be most useful outside of the martinize2 context, which is in fact very limited. We only add or remove interactions, but don't actually do anything else with them. In lieu of having to finish the paper, I propose we stick with the local dict and implement a special comparison function that handles the edge cases for now. Not so pretty but pretty safe ;)

We have some other project related to polyply development and other vermouth based applications that actually do operations on the interactions, select and filter them. In #329 I've raised this issue before to some extend even though I wasn't sure what would be the best thing to do. I think we can move the discussion in this PR to a issue and take the time to thoroughly think this through.

Good ideas in my opinion are:

making interactions a property of molecule
having them collected in a way that is hashable by the atom-keys/indices
making the InteractionsTuple an actual class with potential subclasses

Member

pckroon commented Jul 14, 2022

I propose we stick with the local dict and implement a special comparison function that handles the edge cases for now.

Agreed.

If you could collect required/desired features from a polyply point of view that would be great. In the meantime I'll reopen #329 with some more ideas I had on the bike

pckroon mentioned this pull request

iterator method for interactions dict #329

Open

fgrunewald force-pushed the master branch from 99a936f to 1fba29e Compare

April 25, 2023 13:56

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet