Analysis of Zeroshot Prediction Results #10

Open
BKRBH opened this issue May 13, 2024 · 1 comment

BKRBH commented May 13, 2024

Why do the wild-type scores differ depending on which mutation sites are specified, even though the same zero-shot model is used for prediction? When the "positions" parameter is set to "V39", the wild type in the result table ("V") has a value of -2.5876. When "positions" is set to "V39 D40", the wild type in the result table ("VD") has a value of -5.0146. Why do the two settings give different wild-type outputs, and which value should be used as the wild-type prediction for comparability?

@brucejwittmann
Collaborator

@BKRBH sorry for the late reply. I don't check this repo often anymore. Please tag me and I'll respond more quickly.

It's because the scores are not normalized for the number of positions. When you set "V39", the model returns the log probability of "V" at position 39; it does not consider any other positions. When you set "V39 D40", the model returns the sum of the log probabilities of "V" at position 39 and "D" at position 40, so the wild-type score contains an extra term.
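To make the summation concrete, here is a minimal sketch. It is not MLDE's actual code: the `score` function and the lookup table of per-position log probabilities are hypothetical, the numbers are made up for illustration, and the sketch assumes each position's log probability is a fixed value that does not depend on which other positions are requested.

```python
# Hypothetical per-position log probabilities, keyed by (position, residue).
# These values are illustrative only; they are not output from any model.
LOG_PROBS = {
    (39, "V"): -2.5,
    (39, "A"): -3.0,
    (40, "D"): -2.0,
    (40, "E"): -2.75,
}


def score(combo, positions, log_probs=LOG_PROBS):
    """Sum the log probabilities of the residues in `combo`,
    looking only at the requested positions (hypothetical helper)."""
    if len(combo) != len(positions):
        raise ValueError("combo and positions must have the same length")
    return sum(log_probs[(pos, aa)] for pos, aa in zip(positions, combo))


# Wild type scored over one position versus two positions.
wt_single = score("V", [39])        # one term
wt_double = score("VD", [39, 40])   # two terms, so a more negative total

print(wt_single, wt_double)
```

The two wild-type scores differ only because the second one adds a term for position 40; neither is wrong, they just cover different sets of positions.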

The other positions are ignored because MLDE was made for evaluating combinatorial landscapes/libraries. Getting the internal consistency that you are looking for would mean summing the log probabilities over all positions, which is a waste of compute when you know from the beginning that the positions outside the library are all constants. The tradeoff is that you can only compare scores relative to one another within a single combinatorial space. So as the wild-type reference, use the value from the same run (the same "positions" setting) as the variants you are comparing it against.
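A small sketch of why dropping the constant positions is harmless within one combinatorial space: adding the same constant to every score leaves each variant's difference from the wild type unchanged. The function name, the scores, and the constant below are all hypothetical and chosen only for illustration.

```python
def relative_to_wild_type(scores, wild_type):
    """Return each combo's score minus the wild-type score from the same run."""
    baseline = scores[wild_type]
    return {combo: s - baseline for combo, s in scores.items()}


# Hypothetical scores from a single run over positions 39 and 40.
run_scores = {"VD": -4.5, "AD": -5.0, "VE": -5.25}

# Hypothetical summed log probability of every position outside the library.
# It is the same for all combos, so it shifts every score equally.
constant = -100.0
full_scores = {combo: s + constant for combo, s in run_scores.items()}

rel = relative_to_wild_type(run_scores, "VD")
rel_full = relative_to_wild_type(full_scores, "VD")

print(rel)
print(rel_full)
```

Both dictionaries hold the same differences, which is why scores are comparable inside one run but not between runs that cover different sets of positions.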
