Clean up metrics #59

Merged
merged 12 commits into from
Jun 2, 2024

Conversation

rcannood
Member

@rcannood rcannood commented Jun 1, 2024

Describe your changes

  • Remove clipped metrics
  • Remove Python implementations of metrics (they resulted in unwanted NAs / NaNs)
  • Add clipped_sign_log10_pval layer to the Kaggle dataset
  • Set default layer to clipped_sign_log10_pval

Checklist before requesting a review

  • I have performed a self-review of my code

  • Check the correct box. Does this PR contain:

    • Breaking changes
    • New functionality (new method, new metric, ...)
    • Major changes
    • Minor changes
    • Bug fixes
  • Proposed changes are described in the CHANGELOG.md

  • CI Tests succeed and look good!

@rcannood rcannood requested a review from szalata June 1, 2024 08:24
@szalata
Collaborator

szalata commented Jun 1, 2024

I would rather have all of the metrics clip their input (or have a separate component that clips it), to avoid penalizing methods that still predict values beyond the threshold.
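A minimal sketch of the concern, assuming the ground truth is clipped but the prediction is not. The bounds `LOWER` and `UPPER` and the sample values are hypothetical; the thread does not state the actual threshold, and `clip` and `rmse` are illustrative helpers, not the repository's components.

```python
import math

# Hypothetical clipping bounds; the actual threshold is not stated in this thread.
LOWER, UPPER = -4.0, 4.0

def clip(values, lower=LOWER, upper=UPPER):
    """Limit every value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

def rmse(truth, pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))

# Made-up unclipped values and a method that predicts them exactly.
raw_truth = [6.0, 0.5, -7.0]
pred = [6.0, 0.5, -7.0]

# The metric is computed against the clipped layer.
truth_clipped = clip(raw_truth)

rmse_raw = rmse(truth_clipped, pred)            # prediction left unclipped
rmse_clipped = rmse(truth_clipped, clip(pred))  # prediction clipped to the same range
```

In this toy case the unclipped prediction is penalized for the two values outside the bounds, while clipping the prediction to the same range brings the error to zero.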

@rcannood
Member Author

rcannood commented Jun 2, 2024

Just to make a note of this:

We decided to clip the values of the sign_log10_pval layer by creating a clipped_sign_log10_pval layer. Clipping the predictions to the same range as that layer would improve some methods' RMSE and MAE scores, but in practice the effect turns out to be negligible:

   dataset_id        method_id                        rmse   mae pearson spearman cosine   diff_rmse   diff_mae diff_pearson diff_spearman diff_cosine
   <chr>             <chr>                           <dbl> <dbl>   <dbl>    <dbl>  <dbl>       <dbl>      <dbl>        <dbl>         <dbl>       <dbl>
 1 neurips-2023-data ground_truth                    0     0      1        1      1       0           0            0               0         0        
 2 neurips-2023-data jn_ap_op2                       0.894 0.649  0.327    0.306  0.329  -0.00105    -0.00111     -0.000228       -4.08e-7  -0.000225 
 3 neurips-2023-data lgc_ensemble                    0.794 0.577  0.450    0.423  0.454  -0.00690    -0.00699     -0.000845       -3.87e-5  -0.000850 
 4 neurips-2023-data mean_across_celltypes           0.892 0.644  0.297    0.281  0.302   0           0            0               0         0        
 5 neurips-2023-data mean_across_compounds           0.943 0.698  0.259    0.243  0.263   0           0            0               0         0        
 6 neurips-2023-data mean_outcome                    0.899 0.636  0.220    0.212  0.226   0           0            0               0         0        
 7 neurips-2023-data nn_retraining_with_pseudolabels 0.756 0.547  0.490    0.462  0.493   0           0            0               0         0        
 8 neurips-2023-data pyboost                         0.795 0.560  0.462    0.440  0.465  -0.00000491 -0.0000147    0.0000132      -1.18e-9   0.0000133
 9 neurips-2023-data sample                          1.37  0.965  0.0492   0.0517 0.0515  0           0            0               0         0        
10 neurips-2023-data scape                           0.774 0.571  0.473    0.442  0.476  -0.000392   -0.000637    -0.0000266      -3.78e-7  -0.0000270
11 neurips-2023-data transformer_ensemble            0.897 0.628  0.220    0.216  0.226   0           0            0               0         0        
12 neurips-2023-data zeros                           0.918 0.635  0        0      0       0           0            0               0         0        

If there were less of an imbalance between values with abs(sign_log10_pval) < 1 and those with abs(sign_log10_pval) >= 1, the effects would be more pronounced.
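To illustrate how the share of large values drives the size of the diff columns, here is a rough sketch on made-up data. It assumes diff = metric(clipped prediction) - metric(raw prediction), which the thread does not state explicitly. The bound `BOUND`, the helper `metric_diffs`, and the sample values are all hypothetical.

```python
import math

# Hypothetical clipping bound; the actual threshold is not given in this thread.
BOUND = 4.0

def clip(values, bound=BOUND):
    """Limit every value to [-bound, bound]."""
    return [min(max(v, -bound), bound) for v in values]

def rmse(truth, pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))

def mae(truth, pred):
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)

def metric_diffs(n_small, n_large):
    """Return metric(clipped prediction) - metric(raw prediction) for toy data."""
    raw_truth = [0.5] * n_small + [6.0] * n_large  # made-up values
    truth = clip(raw_truth)                        # stands in for the clipped layer
    pred = raw_truth                               # a method predicting unclipped values
    return {
        "diff_rmse": rmse(truth, clip(pred)) - rmse(truth, pred),
        "diff_mae": mae(truth, clip(pred)) - mae(truth, pred),
    }

# 1% of values beyond the bound versus 50% beyond the bound.
imbalanced = metric_diffs(n_small=990, n_large=10)
balanced = metric_diffs(n_small=500, n_large=500)
```

With only a small fraction of values beyond the bound, both diffs stay close to zero; as that fraction grows, clipping the predictions changes the scores much more.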

@rcannood rcannood merged commit 753f37c into main Jun 2, 2024
19 checks passed
@rcannood rcannood deleted the remove_clipped branch June 2, 2024 18:44