
Considerations and features, code quality and performance

Mateusz Bieniek edited this page Nov 30, 2020 · 3 revisions

Superimposition performance:

  • PRIORITY: Rarity rank - instead of trying every node against every node, consider using only rare atoms to seed the superimposition. For example, if both molecules contain the same small number of N and O atoms, use only these as starting points.
  • PRIORITY: Rarity rank 2 - also use atoms that are linear (bottlenecks/linkers) as starting points. This way we avoid the issue with PI1.
  • Problematic idea PI1: if a component started from an initial Ly-Ry match was extended to cover an Lx-Rx match, then Lx-Rx can in theory be removed. However, if the initial Ly-Ry match happens to be the wrong symmetric match, we will not be able to detect that.
  • Coarse-grained superimposition - use network features such as components, bottlenecks, and others to build a "CG network" on which the superimposition can be applied first. This avoids the high complexity of the all-atom search.
  • Code the superimposition functions in C.

Code quality:

  • Switch to MDAnalysis for creating universes and bonds, and for all kinds of atom traversal, i.e. use its atoms, bonds, etc. instead of handling all these details ourselves. We should first mirror the MDAnalysis way and use that as the core of our code. This way we'll avoid a lot of issues and simplify the code drastically.
  • Use MDAnalysis to generate the hybrid Universe and write it to a .mol2 file.