Currently, ensemble_ignorance_score does not check how many samples it receives. However, the evaluation is only comparable across submissions with the same number of samples, so the function should resample (up or down) whenever it receives more or fewer samples than expected.
In the current implementation of evaluate_submissions.py, we resample using scipy.signal.resample before calculating the Ignorance Score (but not the other scores). The resample function uses a Fourier method, which can yield values below zero; currently we simply truncate such (re-)samples at 0. This means we are not reproducing the exact distribution we would get from simply drawing more samples from the model.
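A minimal sketch of what this step might look like, assuming a helper named resample_truncated and a target of 1000 samples (the function name and default are illustrative, not the actual code in evaluate_submissions.py):

```python
import numpy as np
from scipy.signal import resample

def resample_truncated(samples, n_expected=1000):
    """Resample a 1-D array of predictive samples to n_expected values
    using scipy's Fourier-based resample, then truncate negatives at 0.

    Note: because the Fourier method can produce values below zero,
    clipping means the result is not the exact distribution we would
    get from drawing more samples from the model itself.
    """
    samples = np.asarray(samples, dtype=float)
    if samples.size == n_expected:
        return samples  # already the expected count; nothing to do
    resampled = resample(samples, n_expected)  # Fourier-method resampling
    return np.clip(resampled, 0.0, None)      # truncate negatives at zero
```

The upside of this approach is that it is deterministic and fast; the downside, as noted above, is the distortion introduced by clipping.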
The current implementation uses the same method for both up- and downsampling (upsampling being the most common case). For downsampling, a better approach is probably to draw a smaller random subset of the provided samples.
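The random-subset idea for downsampling could be sketched as follows (function name, default, and seed parameter are assumptions for illustration):

```python
import numpy as np

def downsample_random(samples, n_expected=1000, seed=None):
    """Downsample by drawing a random subset (without replacement)
    of the provided samples, preserving the empirical distribution.

    If there are already n_expected or fewer samples, they are
    returned unchanged; upsampling is out of scope here.
    """
    samples = np.asarray(samples)
    if samples.size <= n_expected:
        return samples
    rng = np.random.default_rng(seed)
    idx = rng.choice(samples.size, size=n_expected, replace=False)
    return samples[idx]
```

Unlike the Fourier method, this cannot produce values outside the original sample range, so no truncation is needed, at the cost of introducing randomness into the evaluation (which a fixed seed could mitigate).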
I am a bit worried that this approach would be time-consuming and error-prone. Ideally, we should get 1000 samples from the prediction model itself. The scipy.signal.resample-with-zero-truncation approach might be suboptimal, but it is fast and consistent.
scipy.signal.resample (https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample.html) might be a reasonable tool for this resampling.