
Script for recreating evaluation scores on Guacamol benchmark #43

Open · HiokHian opened this issue Dec 31, 2022 · 21 comments
Labels: question (Request for help or information)

@HiokHian commented Dec 31, 2022

Hi,

Could I check whether the script for recreating the Guacamol benchmark evaluation scores reported in the paper is available in this codebase? Thanks.

@kmaziarz kmaziarz self-assigned this Jan 3, 2023
@kmaziarz (Collaborator) commented Jan 3, 2023

Hi @HiokHian, we didn't release the scripts for running the Guacamol benchmarks; these stayed in our internal copy of the repo. However, I'm happy to provide any details needed to reproduce the results. Are you more interested in the distribution matching benchmarks (FCD, etc.) or the optimization ones?

@kmaziarz kmaziarz added the question Request for help or information label Jan 3, 2023
@HiokHian (Author)

I see. Would it be possible to provide details on reproducing the results for both the distribution matching and optimization ones? Thank you for your help on this!

@kmaziarz (Collaborator)

Reproducing the distribution matching results should be fairly easy: we load the model as normal through load_model_from_directory and call assess_distribution_learning from guacamol.assess_distribution_learning. The only caveat here is that the model passed into assess_distribution_learning needs to implement their DistributionMatchingGenerator interface; for that we make a very simple wrapper around it.
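
A minimal sketch of such a wrapper (with placeholder paths and filenames; the actual internal script may differ slightly):

from typing import List

from guacamol.assess_distribution_learning import assess_distribution_learning
from guacamol.distribution_matching_generator import DistributionMatchingGenerator

from molecule_generation import load_model_from_directory


class MoLeRDistributionGenerator(DistributionMatchingGenerator):
    """Thin adapter exposing a loaded MoLeR model through Guacamol's interface."""

    def __init__(self, model):
        self._model = model

    def generate(self, number_samples: int) -> List[str]:
        # MoLeR's wrapper can sample molecules directly as SMILES strings.
        return self._model.sample(number_samples)


with load_model_from_directory("./path_to_model_dir") as model:
    assess_distribution_learning(
        MoLeRDistributionGenerator(model),
        chembl_training_file="./guacamol_v1_train.smiles",
        json_output_file="./distribution_learning_results.json",
    )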

For the optimization results we call assess_specific_goal_directed_generation, and similarly wrap things into the interfaces expected by Guacamol. However, we need to pair MoLeR with an optimization method; as described in the paper, we use MSO. We didn't release our own fork of MSO, but it's nearly identical to the open-source one; we mostly just modified the latent space clipping, which helped a bit (see Appendix C.3 in the paper). To initialize the MSO swarms, we use the top-scoring molecules (under the particular scoring function we are optimizing for) from the Guacamol dataset.
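
For the swarm initialization, a hypothetical helper like the following could pick the seeds, with all_smiles holding the Guacamol training set (a sketch, not the internal code):

from typing import List

from guacamol.scoring_function import ScoringFunction


def top_k_seed_smiles(all_smiles: List[str], scoring_function: ScoringFunction, k: int) -> List[str]:
    # Score every dataset molecule under the task's objective and keep the best k.
    return sorted(all_smiles, key=scoring_function.score, reverse=True)[:k]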

@HiokHian (Author)

I see. Thanks again for the detailed pointers, they are very helpful. I'll close this thread for now and ask for further clarification should the need arise.

@MorganCThomas

Hi, is it possible to share the objectives for the goal-directed generation? For example, how exactly was the score calculated for the different objectives?

Specifically the last one (factor Xa): what does "maintaining the properties" refer to?

@shriyamr

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

@kmaziarz (Collaborator)

Hi, is it possible to share the objectives for the goal-directed generation? For example, how exactly was the score calculated for the different objectives? Specifically the last one (factor Xa): what does "maintaining the properties" refer to?

I assume you're mostly asking about our scaffold-based tasks, as the other ones are defined in Guacamol. Here are the building blocks for defining our tasks:

from typing import List

from rdkit import Chem

from guacamol.common_scoring_functions import (
    TanimotoScoringFunction,
    RdkitScoringFunction,
    SMARTSScoringFunction,
)
from guacamol.goal_directed_benchmark import GoalDirectedBenchmark
from guacamol.goal_directed_score_contributions import uniform_specification
from guacamol.score_modifier import (
    ClippedScoreModifier,
    MaxGaussianModifier,
    MinGaussianModifier,
)
from guacamol.scoring_function import GeometricMeanScoringFunction, ScoringFunction
from guacamol.utils.descriptors import bertz, logP, tpsa


def hard_scaffold_similarity(
    smiles: str, name: str, scaffold: str, fp_type: str = "ECFP4", threshold: float = 0.7
) -> GoalDirectedBenchmark:
    """Build a benchmark that asks to maximize similarity under a hard scaffold constraint.

    Args:
        smiles: Target to maximize similarity to.
        name: Name of the benchmark.
        scaffold: Scaffold that a sample must contain to get a non-zero score.
        fp_type: Which fingerprint similarity metric to use.
        threshold: Similarity threshold above which the sample is given a perfect score.
    """

    modifier = ClippedScoreModifier(upper_x=threshold)
    tanimoto_fn = TanimotoScoringFunction(target=smiles, fp_type=fp_type, score_modifier=modifier)

    smarts_fn = SMARTSScoringFunction(scaffold, inverse=False, is_smiles=True)

    scoring_function = GeometricMeanScoringFunction([tanimoto_fn, smarts_fn])
    specification = uniform_specification(1, 10, 100)

    return GoalDirectedBenchmark(
        name=name,
        objective=scoring_function,
        contribution_specification=specification,
        scaffold_smiles=scaffold,
        drop_molecules_without_scaffold=True,
    )


def scaffold_similarity_properties(
    scaffold: str,
    first_molecule: str,
    other_molecule: str,
    fp_type: str = "ECFP4",
    threshold: float = 0.7,
) -> ScoringFunction:
    """Build a scorer asking to stay close to two molecules under a hard scaffold constraint.

    Args:
        scaffold: Scaffold that a sample must contain to get a non-zero score.
        first_molecule: Target to maximize fingerprint similarity to.
        other_molecule: Target to maximize property similarity to.
        fp_type: Which fingerprint similarity metric to use.
        threshold: Similarity threshold above which the similarity part of the scoring function will
            return 1.0 (perfect score).
    """
    smarts_scoring_function = SMARTSScoringFunction(target=scaffold, is_smiles=True)
    other_mol = Chem.MolFromSmiles(other_molecule)
    target_logp = logP(other_mol)
    target_tpsa = tpsa(other_mol)
    target_bertz = bertz(other_mol)

    modifier = ClippedScoreModifier(upper_x=threshold)
    tanimoto_fn = TanimotoScoringFunction(
        target=first_molecule, fp_type=fp_type, score_modifier=modifier
    )

    lp = RdkitScoringFunction(
        descriptor=logP, score_modifier=MinGaussianModifier(mu=target_logp, sigma=0.3)
    )
    tp = RdkitScoringFunction(
        descriptor=tpsa, score_modifier=MaxGaussianModifier(mu=target_tpsa, sigma=5)
    )
    bz = RdkitScoringFunction(
        descriptor=bertz, score_modifier=MinGaussianModifier(mu=target_bertz, sigma=30)
    )

    return GeometricMeanScoringFunction([smarts_scoring_function, tanimoto_fn, lp, tp, bz])

In the above code we reuse various utilities from Guacamol. However, note that we modified the GoalDirectedBenchmark class by adding scaffold_smiles and drop_molecules_without_scaffold arguments, so that (1) the scaffold can be passed through to the underlying model and (2) we can potentially filter out molecules that don't contain the scaffold as invalid for this task.
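
Since the modified class never got released, a thin subclass of the open-source GoalDirectedBenchmark along these lines would be a reasonable stand-in (a sketch, not the exact internal version), used in place of GoalDirectedBenchmark in the snippets here:

from typing import Optional

from rdkit import Chem

from guacamol.goal_directed_benchmark import GoalDirectedBenchmark
from guacamol.goal_directed_score_contributions import ScoreContributionSpecification
from guacamol.scoring_function import ScoringFunction


class ScaffoldGoalDirectedBenchmark(GoalDirectedBenchmark):
    def __init__(
        self,
        name: str,
        objective: ScoringFunction,
        contribution_specification: ScoreContributionSpecification,
        scaffold_smiles: Optional[str] = None,
        drop_molecules_without_scaffold: bool = False,
    ) -> None:
        super().__init__(name, objective, contribution_specification)
        # The scaffold can be forwarded to the underlying model, and samples
        # missing it can be treated as invalid for this task.
        self.scaffold_smiles = scaffold_smiles
        self.drop_molecules_without_scaffold = drop_molecules_without_scaffold

    def contains_scaffold(self, smiles: str) -> bool:
        mol = Chem.MolFromSmiles(smiles)
        scaffold = Chem.MolFromSmiles(self.scaffold_smiles)
        return mol is not None and mol.HasSubstructMatch(scaffold)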

Here's how the tasks are defined building on top of the above:

def factor_xa() -> GoalDirectedBenchmark:
    # xarelto with apixaban properties
    scaffold = "CCN(C=O)C1=CC=C(C=C1)N1CCCCC1=O"
    apixaban = "O=C5N(c4ccc(N3C(=O)c1c(c(nn1c2ccc(OC)cc2)C(=O)N)CC3)cc4)CCCC5"
    xarelto = "O=C1COCCN1c2ccc(cc2)N3CC(OC3=O)CNC(=O)c4ccc(s4)Cl"
    specification = uniform_specification(1, 10, 100)
    return GoalDirectedBenchmark(
        name="factor_Xa_like_scaffold",
        objective=scaffold_similarity_properties(
            scaffold=scaffold,
            first_molecule=xarelto,
            other_molecule=apixaban,
            fp_type="AP",
            threshold=0.7,
        ),
        contribution_specification=specification,
        scaffold_smiles=scaffold,
        drop_molecules_without_scaffold=True,
    )


def goal_directed_scaffold_suite_v1() -> List[GoalDirectedBenchmark]:
    return [
        hard_scaffold_similarity(
            smiles="CCCC1=NN(C2=C1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C",
            name="pde5_scaffold",
            scaffold="Cc1ncn2[nH]c(-c3ccc(F)c(S(=O)(=O)N4CCNCC4)c3)nc(=O)c12",
            fp_type="AP",
            threshold=0.75,
        ),
        hard_scaffold_similarity(
            smiles="CC1Oc2c(C)nccc2-c2cc(NC(=O)Cc3nn(C)cc13)ccc2C#N",
            name="lorlati_like_scaffold",
            scaffold="COc1cnccc1-c1cccc(NC(=O)Cc2ccn(C)n2)c1",
            fp_type="PHCO",
            threshold=0.70,
        ),
        hard_scaffold_similarity(
            smiles="CC(C)C#Cc1ccc(-c2ccc(Cl)c3c(NS(C)(=O)=O)nn(CC(F)(F)F)c23)c(C(Cc2cc(F)cc(F)c2)NC(=O)Cn2nc(C(F)(F)F)c3c2C(F)(F)C2CC32)n1",
            name="GCCA1_like_scaffold",
            scaffold="CCc1cnc(-c2ccccc2)c(C(Cc2ccccc2)NC(=O)Cn2nc(CF)c3c2C(C)(F)C2CC32)n1",
            fp_type="PHCO",
            threshold=0.70,
        ),
        factor_xa(),
    ]

@kmaziarz kmaziarz reopened this Jan 23, 2024
@kmaziarz (Collaborator)

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

Roughly speaking, we adapted parts of the BasePSOptimizer class from MSO, replacing their model with MoLeR and changing the model calls where necessary (seq_to_emb -> encode, emb_to_seq -> decode). We also pass in extra arguments such as scaffold SMILES, so that for scaffold-based tasks this can be passed into decode. Sorry this is a bit vague; as I mentioned, the optimization code never got released and so it remained entangled with internal/production things; it is thus difficult to e.g. summarize the "diff" between that code and the open-source MSO. Do you have more specific questions I could help with?
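
As a rough sketch, the substitution could look like the following; this is a hypothetical reconstruction (method and attribute names follow the open-source mso package and the public molecule_generation API), not the exact internal code:

import numpy as np

from mso.optimizer import BasePSOptimizer


class MoLeRPSOptimizer(BasePSOptimizer):
    def __init__(self, swarms, inference_model, scoring_functions=None, scaffold=None):
        super().__init__(swarms, inference_model, scoring_functions)
        self.scaffold = scaffold  # Optional scaffold SMILES for scaffold-based tasks.

    def _next_step_and_evaluate(self, swarm):
        swarm.next_step()
        # emb_to_seq -> decode, optionally constrained to the scaffold.
        scaffolds = [self.scaffold] * len(swarm.x) if self.scaffold else None
        swarm.smiles = self.infer_model.decode(list(swarm.x), scaffolds=scaffolds)
        # seq_to_emb -> encode.
        swarm.x = np.stack(self.infer_model.encode(swarm.smiles))
        return self.update_fitness(swarm)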

@MorganCThomas

I assume you're mostly asking about our scaffold-based tasks, as the other ones are defined in Guacamol. Here are the building blocks for defining our tasks:

In the above code we reuse various utilities from Guacamol. However, note that we modified the GoalDirectedBenchmark class by adding scaffold_smiles and drop_molecules_without_scaffold arguments, so that (1) the scaffold can be passed through to the underlying model and (2) we can potentially filter out molecules that don't contain the scaffold as invalid for this task.

Here's how the tasks are defined building on top of the above:

Thanks @kmaziarz this is exactly what I was looking for!!!

@shriyamr commented Feb 1, 2024

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

Roughly speaking, we adapted parts of the BasePSOptimizer class from MSO, replacing their model with MoLeR and changing the model calls where necessary (seq_to_emb -> encode, emb_to_seq -> decode). We also pass in extra arguments such as scaffold SMILES, so that for scaffold-based tasks this can be passed into decode. Sorry this is a bit vague; as I mentioned, the optimization code never got released and so it remained entangled with internal/production things; it is thus difficult to e.g. summarize the "diff" between that code and the open-source MSO. Do you have more specific questions I could help with?

Thank you so much, this helped a lot! Another quick question: can you elaborate on how you wrapped the scoring functions from the Guacamol goal-directed benchmark suites into the ScoringFunction interface from MSO?

@shriyamr commented Feb 6, 2024

Also, I have another follow-up question about using MSO: when loading the VAE model through the context manager inside some of the methods of the class inheriting BasePSOptimizer, I am unable to use the wrapper class functions encode and decode as written. The jobs in the job queue seem to be timing out, and the wrapper class functions aren't returning the expected results. Any reason why this may be happening? Thanks

@kmaziarz (Collaborator)

When loading the VAE model through the context manager inside some of the methods of the class inheriting BasePSOptimizer, I am unable to use the wrapper class functions encode and decode as written.

Are you sure the context isn't being exited? For example, if you place the context inside of __init__, Python will exit the context once it's done running __init__. This is just a guess though; alternatively, you could paste a code snippet to reproduce your issue and I could take a look.
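
To illustrate the pitfall (hypothetical code):

class BrokenOptimizer:
    def __init__(self, model_dir):
        with load_model_from_directory(model_dir) as model:
            self.model = model
        # By this point the context has exited and the inference server behind
        # `model` is already shut down, so later self.model calls will misbehave.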

@shriyamr

def _next_step_and_evaluate(self, swarm, model_dir):
    """
    Method that wraps the update of the particles' positions (next step) and the evaluation of
    the fitness at these new positions.
    :param swarm: The swarm that is updated.
    :return: The swarm that is updated.
    """
    swarm.next_step()
    with load_model_from_directory(model_dir) as inference_model:
        new = tf.cast(swarm.x, tf.float32)
        smiles = inference_model.decode(new)
        swarm.smiles = smiles
        emb = inference_model.encode(swarm.smiles)
        swarm.x = emb[0]
        swarm = self.update_fitness(swarm)
        return swarm

This is how I rewrote the _next_step_and_evaluate function of the class that inherits from BasePSOptimizer. The problem I am running into is that the decode request doesn't get processed by the inference server (the request type shows up as None instead of MoLeRRequestType.DECODE). However, when I use model.encode inside of the from_query method, the SMILES strings get encoded and the swarms get initialized. Am I loading the context manager incorrectly?

@kmaziarz (Collaborator)

Am I loading the context manager incorrectly?

I think what you're doing is in principle correct, although it will reload the model every time _next_step_and_evaluate is called (which I assume happens on every optimization step). Instead, it might be better to load the model once outside of the optimizer class and pass it in as an argument, doing something like

with load_model_from_directory(model_dir) as inference_model:
    optimizer = Optimizer(model=inference_model)
    ...

The problem I am running into is that the decode request doesn't get processed by the inference server (the request type shows up as None instead of MoLeRRequestType.DECODE). However, when I use model.encode inside of the from_query method, the SMILES strings get encoded and the swarms get initialized.

Are the issues then happening in the method you posted or in some other one? Maybe you could try the approach I suggested above that loads the model once and paste in more of your code if the problems persist.

@matija-marijan

Can you provide the details of how you coupled MSO to MoLeR when specializing the GoalDirectedGenerator class from GuacaMol?

Roughly speaking, we adapted parts of the BasePSOptimizer class from MSO, replacing their model with MoLeR and changing the model calls where necessary (seq_to_emb -> encode, emb_to_seq -> decode). We also pass in extra arguments such as scaffold SMILES, so that for scaffold-based tasks this can be passed into decode. Sorry this is a bit vague; as I mentioned, the optimization code never got released and so it remained entangled with internal/production things; it is thus difficult to e.g. summarize the "diff" between that code and the open-source MSO. Do you have more specific questions I could help with?

Hello @kmaziarz,
When trying to integrate MoLeR into MSO, it seems as though MoLeR's decode is prone to generating a fair amount of invalid SMILES, which in turn crashes the MoLeRInferenceServer when trying to encode swarm.smiles. Do you have any idea how this issue should be addressed? Thanks.

@kmaziarz (Collaborator)

MoLeR's decode is prone to generating a fair amount of invalid SMILES, which in turn crashes the MoLeRInferenceServer when trying to encode swarm.smiles. Do you have any idea how this issue should be addressed? Thanks.

In what sense are they invalid? The molecules that come out of MoLeR should always be valid in the can-be-parsed-by-rdkit sense by design.

But generally speaking, if MSO optimization starts encountering weird molecules, then that could be a sign that the latent codes are out-of-distribution, for example due to the lack of clipping.
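
For reference, a minimal sketch of hyperball clipping (a simple rescale-to-radius scheme; the exact variant from Appendix C.3 differs in details):

import numpy as np


def clip_to_ball(x: np.ndarray, radius: float) -> np.ndarray:
    # Rescale latent vectors lying outside the ball of the given radius;
    # vectors already inside are left untouched.
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x * (radius / np.maximum(norms, radius))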

@matija-marijan commented May 22, 2024

In what sense are they invalid? The molecules that come out of MoLeR should always be valid in the can-be-parsed-by-rdkit sense by design.

I have encountered SMILES that cannot be converted to an RDKit Mol (the conversion returns None).

But generally speaking, if MSO optimization starts encountering weird molecules, then that could be a sign that the latent codes are out-of-distribution, for example due to the lack of clipping.

That makes more sense than MoLeR just generating invalid SMILES out of nowhere... I will look into the code for hyperball clipping and find out whether it is working properly. Thanks!

@kmaziarz (Collaborator)

I have encountered SMILES that cannot be converted to an RDKit Mol (the conversion returns None).

Without clipping I would expect the results may start getting progressively crazier; that said, I'm not sure why invalid molecules would ever be produced. If you were able to get me a latent code that produces the None, I could take a look.

@matija-marijan commented May 29, 2024

It seems as though clipping is not the issue when dealing with those molecules; it is just a consequence of another problem. The real problem seems to be the chirality of the molecules: either the molecules are inherently problematic, or MoLeR cannot deal with chiral molecules as scaffolds.

Also, MoLeR doesn't seem to preserve the chirality of the scaffolds, regardless of whether the molecule is problematic or not. Do you have any insight regarding chiral molecules with MoLeR?

Thanks!

@kmaziarz (Collaborator)

Do you have any insight regarding chiral molecules with MoLeR?

Not really, although I imagine this may not be very well supported in the codebase. Perhaps it would be possible to ignore chirality when inputting into MoLeR and then fix it up post-hoc in the model outputs?
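
For example, something like the following (illustrative only) could strip stereochemistry before encoding:

from rdkit import Chem


def remove_stereo(smiles: str) -> str:
    # Drop stereochemistry annotations so the model sees an achiral input.
    mol = Chem.MolFromSmiles(smiles)
    Chem.RemoveStereochemistry(mol)
    return Chem.MolToSmiles(mol)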

@matija-marijan

Not really, although I imagine this may not be very well supported in the codebase. Perhaps it would be possible to ignore chirality when inputting into MoLeR and then fix it up post-hoc in the model outputs?

Yes, of course it should be possible to just ignore the chirality; I was mainly wondering if there was any info about it and whether it could be a problem, but it seems as though it's best to avoid it. Thank you anyway!
