Script for recreating evaluation scores on Guacamol benchmark #43

HiokHian · 2022-12-31T07:21:40Z

Hi,

Could I check if the script for recreating the evaluation scores as reported in the paper on the Guacamol benchmark is available in this code base? Thanks.

kmaziarz · 2023-01-03T14:48:30Z

Hi @HiokHian, we didn't release the scripts for running the Guacamol benchmarks, these stayed in our internal copy of the repo. However, I'm happy to provide any details needed to reproduce the results. Are you more interested in the distribution matching benchmarks (FCD, etc) or the optimization ones?

HiokHian · 2023-01-15T07:27:08Z

I see. Would it be possible to provide details on reproducing the results for both the distribution matching and optimization ones? Thank you for your help on this!

kmaziarz · 2023-01-20T20:39:08Z

Reproducing the distribution matching results should be fairly easy: we load the model as normal through load_model_from_directory and call assess_distribution_learning from guacamol.assess_distribution_learning. The only caveat here is that the model passed into assess_distribution_learning needs to implement their DistributionMatchingGenerator interface; for that we make a very simple wrapper around.

For the optimization results we call assess_specific_goal_directed_generation, and similarly wrap things into interfaces expected by Guacamol. However, we need to pair MoLeR with an optimization method; as described in the paper, we use MSO. We didn't release our own fork of MSO, but it's nearly identical to the open-source one, we mostly just modified the latent space clipping which helped a bit (see Appendix C.3 in the paper). To initialize MSO swarms, we use the top scoring molecules (under the particular scoring function we are optimizing for) from the Guacamol dataset.

HiokHian · 2023-01-21T00:04:21Z

I see. Thanks again for the detailed pointers, they are very helpful. I'll close this thread for now and post new clarifications should they arise.

MorganCThomas · 2024-01-16T15:23:02Z

Hi, is it possible to share the objective for the goal directed generation. For example, how exactly the score was calculated for the different objectives.

Specifically the last one (factor xa) and what "maintaining the properties" refers to?

shriyamr · 2024-01-22T20:51:20Z

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

kmaziarz · 2024-01-23T13:22:15Z

Hi, is it possible to share the objective for the goal directed generation. For example, how exactly the score was calculated for the different objectives. Specifically the last one (factor xa) and what "maintaining the properties" refers to?

I assume you're mostly asking about our scaffold-based tasks, as the other ones are defined in Guacamol. Here are the building blocks for defining our tasks:

from guacamol.common_scoring_functions import (
    TanimotoScoringFunction,
    RdkitScoringFunction,
    SMARTSScoringFunction,
)
from guacamol.goal_directed_benchmark import GoalDirectedBenchmark
from guacamol.goal_directed_score_contributions import uniform_specification
from guacamol.score_modifier import ClippedScoreModifier
from guacamol.scoring_function import GeometricMeanScoringFunction


def hard_scaffold_similarity(
    smiles: str, name: str, scaffold: str, fp_type: str = "ECFP4", threshold: float = 0.7
) -> GoalDirectedBenchmark:
    """Build a benchmark that asks to maximize similarity under a hard scaffold constraint.

    Args:
        smiles: Target to maximize similarity to.
        name: Name of the benchmark.
        scaffold: Scaffold that a sample must contain to get a non-zero score.
        fp_type: Which fingerprint similarity metric to use.
        threshold: Similarity threshold above which the sample is given a perfect score.
    """

    modifier = ClippedScoreModifier(upper_x=threshold)
    tanimoto_fn = TanimotoScoringFunction(target=smiles, fp_type=fp_type, score_modifier=modifier)

    smarts_fn = SMARTSScoringFunction(scaffold, inverse=False, is_smiles=True)

    scoring_function = GeometricMeanScoringFunction([tanimoto_fn, smarts_fn])
    specification = uniform_specification(1, 10, 100)

    return GoalDirectedBenchmark(
        name=name,
        objective=scoring_function,
        contribution_specification=specification,
        scaffold_smiles=scaffold,
        drop_molecules_without_scaffold=True,
    )


def scaffold_similarity_properties(
    scaffold: str,
    first_molecule: str,
    other_molecule: str,
    fp_type: str = "ECFP4",
    threshold: float = 0.7,
) -> ScoringFunction:
    """Build a scorer asking to stay close to two molecules under a hard scaffold constraint.

    Args:
        scaffold: Scaffold that a sample must contain to get a non-zero score.
        first_molecule: Target to maximize fingerprint similarity to.
        other_molecule: Target to maximize property similarity to.
        fp_type: Which fingerprint similarity metric to use.
        threshold: Similarity threshold above which the similarity part of the scoring function will
            return 1.0 (perfect score).
    """
    smarts_scoring_function = SMARTSScoringFunction(target=scaffold, is_smiles=True)
    other_mol = Chem.MolFromSmiles(other_molecule)
    target_logp = logP(other_mol)
    target_tpsa = tpsa(other_mol)
    target_bertz = bertz(other_mol)

    modifier = ClippedScoreModifier(upper_x=threshold)
    tanimoto_fn = TanimotoScoringFunction(
        target=first_molecule, fp_type=fp_type, score_modifier=modifier
    )

    lp = RdkitScoringFunction(
        descriptor=logP, score_modifier=MinGaussianModifier(mu=target_logp, sigma=0.3)
    )
    tp = RdkitScoringFunction(
        descriptor=tpsa, score_modifier=MaxGaussianModifier(mu=target_tpsa, sigma=5)
    )
    bz = RdkitScoringFunction(
        descriptor=bertz, score_modifier=MinGaussianModifier(mu=target_bertz, sigma=30)
    )

    return GeometricMeanScoringFunction([smarts_scoring_function, tanimoto_fn, lp, tp, bz])

In the above code we reuse various utilities from Guacamol. However, note that we modified the GoalDirectedBenchmark class by adding scaffold_smiles and drop_molecules_without_scaffold arguments, so that (1) the scaffold can be passed through to the underlying model and (2) we can potentially filter out molecules that don't contain the scaffold as invalid for this task.

Here's how the tasks are defined building on top of the above:

def factor_xa() -> GoalDirectedBenchmark:
    # xarelto with apixaban properties
    scaffold = "CCN(C=O)C1=CC=C(C=C1)N1CCCCC1=O"
    apixaban = "O=C5N(c4ccc(N3C(=O)c1c(c(nn1c2ccc(OC)cc2)C(=O)N)CC3)cc4)CCCC5"
    xarelto = "O=C1COCCN1c2ccc(cc2)N3CC(OC3=O)CNC(=O)c4ccc(s4)Cl"
    specification = uniform_specification(1, 10, 100)
    return GoalDirectedBenchmark(
        name="factor_Xa_like_scaffold",
        objective=scaffold_similarity_properties(
            scaffold=scaffold,
            first_molecule=xarelto,
            other_molecule=apixaban,
            fp_type="AP",
            threshold=0.7,
        ),
        contribution_specification=specification,
        scaffold_smiles=scaffold,
        drop_molecules_without_scaffold=True,
    )


def goal_directed_scaffold_suite_v1() -> List[GoalDirectedBenchmark]:
    return [
        hard_scaffold_similarity(
            smiles="CCCC1=NN(C2=C1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C",
            name="pde5_scaffold",
            scaffold="Cc1ncn2[nH]c(-c3ccc(F)c(S(=O)(=O)N4CCNCC4)c3)nc(=O)c12",
            fp_type="AP",
            threshold=0.75,
        ),
        hard_scaffold_similarity(
            smiles="CC1Oc2c(C)nccc2-c2cc(NC(=O)Cc3nn(C)cc13)ccc2C#N",
            name="lorlati_like_scaffold",
            scaffold="COc1cnccc1-c1cccc(NC(=O)Cc2ccn(C)n2)c1",
            fp_type="PHCO",
            threshold=0.70,
        ),
        hard_scaffold_similarity(
            smiles="CC(C)C#Cc1ccc(-c2ccc(Cl)c3c(NS(C)(=O)=O)nn(CC(F)(F)F)c23)c(C(Cc2cc(F)cc(F)c2)NC(=O)Cn2nc(C(F)(F)F)c3c2C(F)(F)C2CC32)n1",
            name="GCCA1_like_scaffold",
            scaffold="CCc1cnc(-c2ccccc2)c(C(Cc2ccccc2)NC(=O)Cn2nc(CF)c3c2C(C)(F)C2CC32)n1",
            fp_type="PHCO",
            threshold=0.70,
        ),
        factor_xa(),
    ]

kmaziarz · 2024-01-23T13:39:02Z

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

Roughly speaking, we adapted parts of the BasePSOptimizer class from MSO, replacing their model with MoLeR, and changing the model calls where necessary (seq_to_emb -> encode, emb_to_seq -> decode). We also pass in extra arguments such as scaffold SMILES, so that for scaffold-based tasks this can be passed into decode. Sorry this is a big vague; as I mentioned the optimization code never got released and so it remained entangled with internal/production things; it is thus difficult to e.g. summarize the "diff" between that code and the open-source MSO. Do you have more specific questions I could help with?

MorganCThomas · 2024-01-23T18:00:21Z

I assume you're mostly asking about our scaffold-based tasks, as the other ones are defined in Guacamol. Here are the building blocks for defining our tasks:

In the above code we reuse various utilities from Guacamol. However, note that we modified the GoalDirectedBenchmark class by adding scaffold_smiles and drop_molecules_without_scaffold arguments, so that (1) the scaffold can be passed through to the underlying model and (2) we can potentially filter out molecules that don't contain the scaffold as invalid for this task.

Here's how the tasks are defined building on top of the above:

Thanks @kmaziarz this is exactly what I was looking for!!!

shriyamr · 2024-02-01T04:13:32Z

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

Roughly speaking, we adapted parts of the BasePSOptimizer class from MSO, replacing their model with MoLeR, and changing the model calls where necessary (seq_to_emb -> encode, emb_to_seq -> decode). We also pass in extra arguments such as scaffold SMILES, so that for scaffold-based tasks this can be passed into decode. Sorry this is a big vague; as I mentioned the optimization code never got released and so it remained entangled with internal/production things; it is thus difficult to e.g. summarize the "diff" between that code and the open-source MSO. Do you have more specific questions I could help with?

Thank you so much, this helped a lot! Another quick question: Can you elaborate on how you wrapped the scoring functions from the Guacamol Goal directed benchmark suites into the Scoring Function interface from MSO?

shriyamr · 2024-02-06T02:37:39Z

Also, I have another follow up question about using MSO: when loading the VAE model through the context manager inside some of the methods of the class inheriting BasePSOptimizer, I am unable to use the wrapper class functions encode and decode as written. The jobs in the job queue seem to be timing out, and the wrapper class functions aren't returning the expected results. Any reason as why this may be happening? Thanks

kmaziarz · 2024-02-15T13:44:14Z

When loading the VAE model through the context manager inside some of the methods of the class inheriting BasePSOptimizer, I am unable to use the wrapper class functions encode and decode as written.

Are you sure the context isn't being exited? For example, if you place the context inside of __init__, Python will exit the context once it's done running __init__. This is just a guess though; alternatively, you could paste a code snippet to reproduce your issue and I could take a look.

shriyamr · 2024-02-17T17:19:58Z

def _next_step_and_evaluate(self, swarm, model_dir):
        """
        Method that wraps the update of the particles position (next step) and the evaluation of
        the fitness at these new positions.
        :param swarm: The swarm that is updated.
        :return: The swarm that is updated.
        """
        swarm.next_step()
        with load_model_from_directory(model_dir) as inference_model:
            new = tf.cast(swarm.x, tf.float32)
            # import pdb; pdb.set_trace()
            smiles = inference_model.decode(new)
            swarm.smiles = smiles
            # import pdb; pdb.set_trace()
            emb = inference_model.encode(swarm.smiles)
            swarm.x = emb[0]
            swarm = self.update_fitness(swarm)
            return swarm

This is how I rewrote the _next_step_and_evaluate function of the class that inherits from BasePSOptimizer. The problem that I am running into is that the decode request doesn't get processed by the inference server (the request type shows up as None instead of MoLeRRequestType.DECODE. However, when I use the model.encode inside of the from_query method, the smiles string gets encoded and the swarms get initialized. Am I loading the context manager incorrectly?

kmaziarz · 2024-02-19T12:57:25Z

Am I loading the context manager incorrectly?

I think what you're doing is in principle correct, although it will reload the model every time _next_step_and_evaluate is called (which I assume will happen on every optimization step). Instead, it might be better to just load the model once outside of the optimizer class and pass it as argument, doing something like

with load_model_from_directory(model_dir) as inference_model:
    optimizer = Optimizer(model=inference_model)
    ...

The problem that I am running into is that the decode request doesn't get processed by the inference server (the request type shows up as None instead of MoLeRRequestType.DECODE. However, when I use the model.encode inside of the from_query method, the smiles string gets encoded and the swarms get initialized.

Are the issues then happening in the method you posted or in some other one? Maybe you could try the approach I suggested above that loads the model once and paste in more of your code if the problems persist.

matija-marijan · 2024-05-21T09:26:53Z

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

Roughly speaking, we adapted parts of the BasePSOptimizer class from MSO, replacing their model with MoLeR, and changing the model calls where necessary (seq_to_emb -> encode, emb_to_seq -> decode). We also pass in extra arguments such as scaffold SMILES, so that for scaffold-based tasks this can be passed into decode. Sorry this is a big vague; as I mentioned the optimization code never got released and so it remained entangled with internal/production things; it is thus difficult to e.g. summarize the "diff" between that code and the open-source MSO. Do you have more specific questions I could help with?

Hello @kmarziarz,
When trying to implement MoLeR into MSO, it seems as though MoLeR's decode is prone to generating a fair amount of invalid SMILES which in turn crashes the MoLeRInferenceServer when trying to encode the swarm.smiles. Do you have any idea how this issue should be addressed? Thanks.

kmaziarz · 2024-05-21T23:26:24Z

MoLeR's decode is prone to generating a fair amount of invalid SMILES which in turn crashes the MoLeRInferenceServer when trying to encode the swarm.smiles. Do you have any idea how this issue should be addressed? Thanks.

In what sense are they invalid? The molecules that come out of MoLeR should always be valid in the can-be-parsed-by-rdkit sense by design.

But generally speaking, if MSO optimization starts encountering weird molecules, then that could be a sign that the latent codes are out-of-distribution, for example due to the lack of clipping.

matija-marijan · 2024-05-22T13:07:20Z

In what sense are they invalid? The molecules that come out of MoLeR should always be valid in the can-be-parsed-by-rdkit sense by design.

I have encountered SMILES that cannot be converted to MOL (they are returned as None).

But generally speaking, if MSO optimization starts encountering weird molecules, then that could be a sign that the latent codes are out-of-distribution, for example due to the lack of clipping.

That makes more sense than MoLeR just generating invalid SMILES out of nowhere... I will look into the code for hyperball clipping, and find out if it is working properly. Thanks!

kmaziarz · 2024-05-24T13:37:50Z

I have encountered SMILES that cannot be converted to MOL (they are returned as None).

Without clipping I would expect the results may start getting progressively crazy; that said, I'm not sure why invalid molecules would be ever produced. If you were able to get me a latent code that produces the None I could take a look.

matija-marijan · 2024-05-29T11:58:19Z

It seems as though clipping is not the issue when dealing with those molecules, it is just a consequence of another problem - the problem seems to be the chirality of the molecules. Either the molecules are inherently problematic, or MoLeR doesn't seem to be able to deal with chiral molecules as scaffolds.

Also, MoLeR doesn't seem to preserve the chirality in the scaffolds, regardless of whether the molecule is problematic or not. Do you have any insight regarding chiral molecules with MoLeR?

Thanks!

kmaziarz · 2024-05-29T12:05:51Z

Do you have any insight regarding chiral molecules with MoLeR?

Not really, although I imagine this may not be very well supported in the codebase. Perhaps it would be possible to ignore chirality when inputting into MoLeR and then fix it up post-hoc in the model outputs?

matija-marijan · 2024-05-29T12:18:17Z

Not really, although I imagine this may not be very well supported in the codebase. Perhaps it would be possible to ignore chirality when inputting into MoLeR and then fix it up post-hoc in the model outputs?

Yes of course it should be possible to just ignore the chirality, but I was just wondering if there was any info about it, and if it could be a problem, but it seems as though it's best to avoid it. Thank you anyway!

kmaziarz self-assigned this Jan 3, 2023

kmaziarz added the question Request for help or information label Jan 3, 2023

HiokHian closed this as completed Jan 21, 2023

kmaziarz reopened this Jan 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Script for recreating evaluation scores on Guacamol benchmark #43

Script for recreating evaluation scores on Guacamol benchmark #43

HiokHian commented Dec 31, 2022 •

edited

Loading

kmaziarz commented Jan 3, 2023

HiokHian commented Jan 15, 2023

kmaziarz commented Jan 20, 2023

HiokHian commented Jan 21, 2023

MorganCThomas commented Jan 16, 2024

shriyamr commented Jan 22, 2024

kmaziarz commented Jan 23, 2024

kmaziarz commented Jan 23, 2024

MorganCThomas commented Jan 23, 2024

shriyamr commented Feb 1, 2024

shriyamr commented Feb 6, 2024

kmaziarz commented Feb 15, 2024

shriyamr commented Feb 17, 2024

kmaziarz commented Feb 19, 2024

matija-marijan commented May 21, 2024

kmaziarz commented May 21, 2024

matija-marijan commented May 22, 2024 •

edited

Loading

kmaziarz commented May 24, 2024

matija-marijan commented May 29, 2024 •

edited

Loading

kmaziarz commented May 29, 2024

matija-marijan commented May 29, 2024

Script for recreating evaluation scores on Guacamol benchmark #43

Script for recreating evaluation scores on Guacamol benchmark #43

Comments

HiokHian commented Dec 31, 2022 • edited Loading

kmaziarz commented Jan 3, 2023

HiokHian commented Jan 15, 2023

kmaziarz commented Jan 20, 2023

HiokHian commented Jan 21, 2023

MorganCThomas commented Jan 16, 2024

shriyamr commented Jan 22, 2024

kmaziarz commented Jan 23, 2024

kmaziarz commented Jan 23, 2024

MorganCThomas commented Jan 23, 2024

shriyamr commented Feb 1, 2024

shriyamr commented Feb 6, 2024

kmaziarz commented Feb 15, 2024

shriyamr commented Feb 17, 2024

kmaziarz commented Feb 19, 2024

matija-marijan commented May 21, 2024

kmaziarz commented May 21, 2024

matija-marijan commented May 22, 2024 • edited Loading

kmaziarz commented May 24, 2024

matija-marijan commented May 29, 2024 • edited Loading

kmaziarz commented May 29, 2024

matija-marijan commented May 29, 2024

HiokHian commented Dec 31, 2022 •

edited

Loading

matija-marijan commented May 22, 2024 •

edited

Loading

matija-marijan commented May 29, 2024 •

edited

Loading