Replace PyRosetta with OpenMM in Fragmenstein #1078

phraenquex · 2023-07-04T11:13:58Z

No description provided.

matteoferla · 2023-07-04T11:16:10Z

phraenquex · 2023-10-10T11:24:26Z

Get Steph's use-case and see if there's a work-around.

Bug OpenMM community about the (1) funny numbers and (2) very slow running.

To finalise the ticket:
A. RMSD needs to be sensible. (Energy not important.)
B. Slow running is (probably) okay. @alanbchristie, check whether Knitwork is well implemented to exploit Nextflow (#944).

matteoferla · 2023-10-24T10:51:57Z

Get Steph's use-case and see if there's a work-around.

She said there was no issue in the Fragmenstein side

OpenMM

I could not get the calculated values to behave, nor cut down the run time. It all runs without errors, but the values on testing are terrible.
However, I have "simply" fixed the RDKit minimization to be aware of the protein, but as an immobile entity.
(Technical detail: By including a frozen extracted protein neighbourhood, which resolves the clash problem in all cases bar those where the template ought to move (i.e. this is not flexible).)
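As a rough illustration of the "frozen extracted protein neighbourhood" idea, the sketch below selects the protein atoms that lie within a cutoff of any ligand atom. The function name `extract_neighbourhood`, the 8 Å cutoff and the toy coordinates are assumptions made for this example, not Fragmenstein's actual code. In RDKit, the selected atoms could then be held in place with the force field's `AddFixedPoint` method before minimising, though whether Fragmenstein does it that way is not stated here.

```python
from math import dist


def extract_neighbourhood(protein_xyz, ligand_xyz, cutoff=8.0):
    """Return indices of protein atoms within `cutoff` angstroms of any ligand atom."""
    return [
        i
        for i, p in enumerate(protein_xyz)
        if any(dist(p, l) <= cutoff for l in ligand_xyz)
    ]


# Toy coordinates in angstroms; real ones would come from the template
# structure and the placed ligand.
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
ligand = [(1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]

# Indices of the atoms to keep and hold fixed while only the ligand moves.
frozen = extract_neighbourhood(protein, ligand, cutoff=8.0)
print(frozen)
```

Because the neighbourhood never moves, a poorly chosen template cannot relax around the ligand, which is the "not flexible" caveat above.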

The values are a bit hairy, but the sign is somewhat consistent.
(Technical detail: this is because it is the internal energy U, not a hybrid (= fudged) Gibbs potential. Plus, being in vacuum is not great, but that could be argued to be a pedantic issue.)

Even with PyRosetta, the interactions were the main driver: the results get filtered by bad energy and then sorted primarily by interactions, at least in my hands.
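The filter-then-sort logic described above could look roughly like this. It is a minimal sketch: the dictionary keys (`energy`, `n_interactions`), the threshold of zero and the example values are made up for illustration and are not Fragmenstein's output format.

```python
def rank_placements(placements, max_energy=0.0):
    """Drop placements with a bad (too high) energy, then sort by interactions."""
    kept = [p for p in placements if p["energy"] < max_energy]
    # Most interactions first; energy only breaks ties.
    return sorted(kept, key=lambda p: (-p["n_interactions"], p["energy"]))


# Made-up example values.
placements = [
    {"name": "merger-A", "energy": -3.2, "n_interactions": 4},
    {"name": "merger-B", "energy": 12.5, "n_interactions": 7},
    {"name": "merger-C", "energy": -1.1, "n_interactions": 6},
]

ranked = rank_placements(placements)
print([p["name"] for p in ranked])
```

With this ordering only the sign of the energy matters, so a noisy energy whose sign is roughly right is still usable.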

Bad:

fixed template
in vacuo

Good:

takes a fraction of a second
it works

phraenquex · 2023-11-14T13:01:23Z

@matteoferla when is your RDKit fix going into production?

(@Waztom add to agenda for Thursday.)

matteoferla · 2023-11-14T13:26:22Z

Unless the Squonk job has a frozen version of fragmenstein (doubtful), it should have been in production since last month (i.e. version 0.13.20)

@tdudgeon: Is this true? If not, could the version of Fragmenstein be changed to the latest, 0.13.34? I only have access to https://gitlab.com/informaticsmatters/squonk-fragmenstein

phraenquex · 2023-11-15T12:13:56Z

@tdudgeon there's a question for you from @matteoferla in the comment above.

If this is indeed deployed, then the ticket can move to "In production."

tdudgeon · 2023-11-16T10:37:32Z

Currently Fragmenstein is used for 2 sets of jobs:

original fragmenstein jobs (https://github.com/InformaticsMatters/squonk2-fragmenstein)
Steph/Rubens fragment merges (https://github.com/stephwills/fragment_network_merges)

Both use the same base container that currently uses Fragmenstein 0.12.0 (https://github.com/InformaticsMatters/squonk2-fragmenstein/blob/main/requirements.txt#L2)

Updating this is simple and will just need the container images to be rebuilt.
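A small sketch of the version comparison at stake: earlier in the thread the RDKit fix is said to have landed in 0.13.20, while the base container pins 0.12.0. The helper names below are hypothetical, and the parser assumes plain dotted numeric versions.

```python
def parse_version(text):
    """Turn a dotted version such as '0.13.20' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def has_rdkit_fix(version, first_fixed="0.13.20"):
    """True if `version` is at least the release said to contain the fix."""
    return parse_version(version) >= parse_version(first_fixed)


print(has_rdkit_fix("0.12.0"))   # version currently pinned in the base container
print(has_rdkit_fix("0.13.34"))  # version proposed in this thread
```

Comparing integer tuples rather than raw strings avoids ordering "0.13.4" after "0.13.20".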

But are there other impacts? If this means that the updated jobs would now use OpenMM for minimisation (rather than PyRosetta (originally) or nothing (more recently)), do we need some extra options to allow this minimisation to be enabled and/or configured?

matteoferla · 2023-11-16T10:50:36Z

Thank you for the link: that is super helpful!

No, they will not use OpenMM for minimisation: rigid backbone only.

Custom options: For PyRosetta, the academic username and password are universal, so we could conceivably have an input box wherein the user provides the password, which, if its hash is correct, runs the job. For some demo I did it that way.
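For illustration only, a hash check of that kind might be sketched as below. The placeholder password, the choice of SHA-256 and the function name are assumptions for this example, not what the demo used. A real deployment would store only the hash, and a salted key-derivation function would be the safer choice.

```python
import hashlib
import hmac

# Placeholder secret for illustration only; not a real credential.
# In practice only this hash would be stored, never the password itself.
EXPECTED_HASH = hashlib.sha256(b"example-password").hexdigest()


def password_matches(candidate, expected_hash=EXPECTED_HASH):
    """Hash the user's input and compare it with the stored hash."""
    candidate_hash = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(candidate_hash, expected_hash)


print(password_matches("example-password"))
print(password_matches("a wrong guess"))
```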

Future: I should say that there is another ticket, wherein I need to Squonkificate the normal Fragmenstein pipeline (bulk double or triple mergers, analog-by-catalog search, placement, clustering and ranking), but I need to change the analog-by-catalog search by copying Steph's FragmentKnitting via FragmentNetwork, as opposed to using sw.docking.org. Steph is busy, so I have been trying to reverse engineer it. I will talk to Rubén about the Squonkification of Steph's code.

tdudgeon · 2023-11-16T10:55:29Z

So Fragmenstein version should be increased to 0.13.34, without any other changes being needed?

phraenquex · 2023-11-16T12:07:01Z

@matteoferla your answers don't look relevant to this ticket; if you need them actioned, please create a new ticket. I suspect they pertain to the Brownish release.

As for "custom options": no need for a ticket, the answer is categorical: there will be no PyRosetta in Fragalysis, or any support for custom licenses, certainly not in the foreseeable future - we don't have the legal backup to make this happen.

matteoferla · 2023-11-16T12:38:59Z

@phraenquex Apologies. I will try to be more clear in future
@tdudgeon Yes, that is correct and sorry for simply pressing an emoji response in lieu of a written reply

phraenquex · 2023-11-23T11:43:42Z

@tdudgeon @matteoferla can I move this into Production swimlane now?

If not, what are the remaining actions?

tdudgeon · 2023-11-23T18:17:07Z

I still need to update the fragmenstein version and deploy.
This should not affect any functionality.

phraenquex · 2023-11-24T08:40:21Z

I should have asked: when is that happening? Is it a lot of work - or can it happen now, without slowing down green release?

@matteoferla you're on standby to test it, presumably?

matteoferla · 2023-11-24T12:25:43Z

@phraenquex my help stops at saying to change this line to Fragmenstein == 0.13.36, or Fragmenstein >= 0.13.36: the creation of an image for Squonk is a sys admin task beyond my skillset or permissions

tdudgeon · 2023-11-24T12:26:49Z

Yes, I just need to make the change and rebuild the images. Just not had time to do so yet.

phraenquex · 2023-11-24T13:11:02Z

@tdudgeon that hasn't answered my question.

@matteoferla no ticket is complete until someone knowledgeable has confirmed that it "works". You're the only one that could credibly do that, at the moment. Unless you show @mwinokan how to do it.

matteoferla · 2023-11-24T13:25:37Z

@phraenquex: yes, as soon as @tdudgeon signals that the image rebuild is complete I will test it.
From the image rebuild side, I am guessing this depends entirely on whether Tim's infrastructure at Diamond is working following the shutdown earlier this month.

tdudgeon · 2023-11-24T16:41:53Z

The fragmenstein images have now been rebuilt, and should be testable in Squonk.

Steph's fragment knitting images need to be rebuilt by @rsanchezgarc as I can't push to the XChem DockerHub repo.
Ruben, update the base container for yours to informaticsmatters/squonk2-fragmenstein-base:1.0.4 (see https://hub.docker.com/layers/informaticsmatters/squonk2-fragmenstein-base/1.0.4/images/sha256-01f71f20308a4fa2f3c3a3562557368f7ddc70b2e177334456bb0173410cc0ab?context=repo), or you can use the stable tag if you don't mind the base image potentially shifting over time. Once that is done, that job can be tested too.

@phraenquex No, it shouldn't be moved to production until the above has been done. But really this isn't much to do with Fragalysis staging/production/etc as it's purely Squonk related, and Squonk will use the new images as soon as they become available.

phraenquex · 2023-11-27T08:35:37Z

Thanks @tdudgeon. @rsanchezgarc could you give an indication of when this can happen? (If you can't do it, we'll ask @mwinokan or @Waztom to take it on.)

Correction: They are absolutely staging/production related: the categories refer to what's available to users, not to some arcane technical details about repos and stuff. They always have.

rsanchezgarc · 2023-11-27T10:43:47Z

@tdudgeon @matteoferla Shouldn't we also change the line of code in which we set up Victor or Wictor or whatever the new OpenMM class is called?

matteoferla · 2023-11-27T10:56:50Z

@rsanchezgarc: OpenVictor was a no-go. OpenMM energy minimisation is very slow: it happens fully in Cartesian space without internal-coordinate tricks, and is akin to an MD run at zero kelvin with a shaking sampler to add motion, as far as I can tell. Template choice becomes critical in Wictor (without PyRosetta) as the neighbourhood is frozen. So RDKit is the only choice here.

rsanchezgarc · 2023-11-27T11:04:32Z

@matteoferla Then how do you use it? Just as our old v = Victor()?

rsanchezgarc · 2023-11-29T10:59:33Z

@tdudgeon. I already had FROM informaticsmatters/squonk2-fragmenstein-base:stable in the Dockerfile
@matteoferla said that we need to use Wictor()
So I just built the new image and pushed it. Is that all @tdudgeon?

tdudgeon · 2023-11-29T11:05:40Z

If the job hasn't changed (only the fragmenstein implementation) then all that's needed is a new container image.

rsanchezgarc · 2023-11-29T11:13:46Z

@tdudgeon so it should be ready now

phraenquex · 2024-05-23T10:22:04Z

@tdudgeon thinks it was already done, needs to check - please do for green release.

mwinokan · 2024-06-12T07:55:44Z

@tdudgeon can you confirm these changes were made? We will need to revisit the deployed algorithms in general in the orange release

tdudgeon · 2024-06-12T08:31:41Z

The jobs are currently using Fragmenstein version 0.13.36 which I believe is one that no longer uses PyRosetta, but @matteoferla should confirm this.
Also, the code seems to be exclusively using Wictor not Victor.

mwinokan · 2024-06-12T08:34:31Z

Thank you for confirming @tdudgeon

matteoferla · 2024-06-12T09:54:40Z

@tdudgeon — I can confirm that is correct.
The current version is 1.0.6, but there were only minor bug fixes, so 0.13 should be okay, functionally.

As you said, Fragmenstein can be run in three ways:

default (PyRosetta): Victor class
RDKit only, without PyRosetta: Wictor class. The one to use here.
OpenMM: OpenVictor class. Requires GPU and is way too slow for 1k+ mergers
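The three options above can be summarised as a small decision helper. This is a hypothetical sketch: the function, its parameters and the hard 1000-merger limit are illustrative encodings of the comments above and are not part of Fragmenstein. It returns the class name as a string so that it runs without Fragmenstein installed.

```python
def choose_fragmenstein_class(pyrosetta_available, gpu_available=False,
                              n_mergers=1, prefer_openmm=False):
    """Pick the name of the Fragmenstein class matching the constraints."""
    if pyrosetta_available:
        return "Victor"  # default, needs a PyRosetta licence
    if prefer_openmm and gpu_available and n_mergers < 1000:
        return "OpenVictor"  # OpenMM: needs a GPU, too slow for 1k+ mergers
    return "Wictor"  # RDKit only, no PyRosetta


# No PyRosetta, no GPU: the case discussed in this ticket.
print(choose_fragmenstein_class(pyrosetta_available=False))
```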

The code implemented in Fragalysis only does a merger, which is likely not in catalogue space.
The pipeline normally consists of Fragmenstein mergers, an analogue search in catalogues via NextMove Software's SmallWorld server hosted by John Irwin, placement of the analogues by Fragmenstein, and analysis of the results. The middle bit needed changing to be open/independent. The other method I have used was OpenEye GraphSim, which is not open. I never did test an open analogue search; I had my eyes on the faiss library, as it can run on CPU or GPU.

mwinokan · 2024-06-12T09:59:39Z

Thanks @matteoferla. We will revisit the deployed algorithms in a later release (see #1454), and I will get in touch regarding the analogue search (although we quite like the 'pure' merges for CAR anyway)

matteoferla · 2024-06-12T10:10:08Z

@mwinokan, great, using the unaltered mergers for the in-house synthesis pipeline makes total sense: jumping into catalogue space is infuriating (e.g. heteroarenes to benzenes and loss of substituents), and that is even though SW runs not on FPs but on edit distance. The Astex Fragment Network is an open copycat, but with fewer of Roger and John's mathemagical tricks. 🤷

phraenquex · 2024-06-12T13:16:38Z

@mwinokan I've updated #1454 to include that headline from the Joint Meeting, to filter RDKit poses with PoseBusters.

That may be necessary for this ticket too - especially if it's not hard to do.

mwinokan · 2024-06-12T13:19:28Z

@phraenquex I think it's best to address it in #1454

phraenquex · 2024-06-12T13:28:59Z

Sure - in which case, we're probably better off moving this ticket to #won'tfix - life being too short, and there being loads of loose ends with the overall problem anyway.

phraenquex assigned matteoferla Jul 4, 2023

phraenquex added squonk ALC3 labels Jul 4, 2023

phraenquex mentioned this issue Jul 4, 2023

Deploy Steph's Fragment Knitting tactically (bypass #944) #1057

Open

phraenquex added the MS 2023-06-15 label Jul 21, 2023

phraenquex assigned tdudgeon and matteoferla and unassigned matteoferla Nov 24, 2023

phraenquex assigned mwinokan and unassigned mwinokan May 23, 2024

mwinokan mentioned this issue May 23, 2024

Verify status of deployed algorithms and include (posebusters) filtering by default #1444

Open

phraenquex added the 2024-03-13 green Data dissemination label May 23, 2024

mwinokan added this to Fragalysis May 29, 2024

mwinokan moved this to Dev Done - Do review (DEV) in Fragalysis May 29, 2024

mwinokan moved this from Dev Done - Do review (DEV) to In production (Done) in Fragalysis Jun 12, 2024
