Analysis of Zeroshot Prediction Results #10

BKRBH · 2024-05-13T07:02:18Z

Why do the wild-type predictions differ depending on which mutation sites are set, even though the same zero-shot model is used for prediction? When the "positions" parameter is set to "V39", the wild-type entry in the result table ("V") has a value of -2.5876. When the "positions" parameter is set to "V39 D40", the wild-type entry ("VD") has a value of -5.0146. Why do the two settings give different wild-type outputs? If this is expected, which one should be used as the wild-type prediction value for comparability?

brucejwittmann · 2024-10-04T22:10:12Z

@BKRBH sorry this is late. I don't check this repo often anymore. Please tag me and I'll be quicker.

It's because the probabilities are not normalized. When you set "V39", the model returns the log probability for "V" at position 39; it does not consider any other positions. When you set "V39 D40", the model sums the log probabilities for "V" at position 39 and "D" at position 40.
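To make the arithmetic concrete, here is a minimal sketch of that summation. It is not MLDE's actual code: the table and the function name `zero_shot_score` are hypothetical, the V39 value is taken from the question, and the D40 value is inferred as -5.0146 - (-2.5876) = -2.4270 on the assumption that the two-site score is a plain sum.

```python
# Hypothetical per-position wild-type log probabilities (not MLDE output).
# V39 comes from the question; D40 is inferred as -5.0146 - (-2.5876),
# assuming the two-site score is a plain sum of per-position terms.
WILD_TYPE_LOG_PROBS = {"V39": -2.5876, "D40": -2.4270}


def zero_shot_score(positions: str) -> float:
    """Sum the log probabilities of the listed positions only."""
    return sum(WILD_TYPE_LOG_PROBS[p] for p in positions.split())


print(round(zero_shot_score("V39"), 4))      # -2.5876
print(round(zero_shot_score("V39 D40"), 4))  # -5.0146
```

Under that assumption, the two wild-type values differ only because the second one includes an extra term for position 40.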

The other positions are ignored because MLDE was made for evaluating combinatorial landscapes/libraries. To get the internal consistency that you are looking for, the model would have to sum the log probabilities over all positions, which is a waste of compute if you know from the beginning that the positions outside the library are all constants. The tradeoff is that scores can only be compared relative to one another within the same combinatorial space, so use the wild-type value that was computed with the same "positions" setting as the variants you are comparing it to.
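A small sketch of what "relative within a combinatorial space" means in practice: if each variant is scored against the wild type from the same space, the constant contribution of an unmutated position cancels. Everything here is illustrative. The function names are hypothetical, the log probability for "A" at position 39 is made up, and the D40 value is inferred from the numbers in the question under the plain-sum assumption.

```python
import math

# Hypothetical per-position log probabilities (not MLDE output).
# V at 39 is from the question; D at 40 is inferred from -5.0146 - (-2.5876);
# A at 39 is an invented value used only for illustration.
LOG_PROBS = {
    39: {"V": -2.5876, "A": -3.1000},
    40: {"D": -2.4270},
}


def combo_score(positions, combo):
    """Sum per-position log probabilities for one combination."""
    return sum(LOG_PROBS[pos][aa] for pos, aa in zip(positions, combo))


def relative_to_wild_type(positions, combo, wild_type):
    """Score a combination relative to the wild type of the same space."""
    return combo_score(positions, combo) - combo_score(positions, wild_type)


# V39A scored in the one-site space and in the two-site space.
single = relative_to_wild_type([39], "A", "V")
double = relative_to_wild_type([39, 40], "AD", "VD")

# The constant D40 term cancels, so the relative scores agree.
print(math.isclose(single, double, abs_tol=1e-9))
```

The raw scores from the two spaces still differ by the D40 term, which is why they should not be compared directly.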
