Added brier score for p-metrics in forecasting #209
Conversation
Thanks for the PR, @jagjeet-singh. Would you mind adding a few sentences here explaining the motivation for the change, and links to any relevant literature? Maybe those could go into the docstrings as well? Which scenarios have we tested so far?
}
assert_metrics_almost_equal(expected_metrics, metrics)

# Case 5: Top-2, 2 forecast, uniform probabilities
probably would be cleaner to make this 5 separate tests, with a pytest fixture?
I was on the fence about it but ended up refactoring the tests anyway. It's more organized now.
Thanks for the push.
"minFDE": 1.0,
"DAC": 1.0,
"MR": 0.0,
"p-minADE": 1.16,
These numbers seem a bit hard to work out by hand -- can we come up with a simple example, e.g. straight lines or something, where we can work it out by hand?
I guess the final metric values need not be easy to work out by hand as long as the test example is self-explanatory. The test example covers a few cases that straight-line examples would miss.
Actually, this particular case can even be worked out by hand: the forecast and ground truth are 1.0 m apart in each frame, so it is no more complicated than a straight line.
The other case, where the forecast is a turning trajectory, might not be that straightforward.
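To illustrate why this case is hand-workable, here is a minimal sketch of a constant 1.0 m offset between forecast and ground truth. Note the probability value 0.6 is an assumption chosen here for illustration (it reproduces the 1.16 in the test above but is not stated in the PR):

```python
import numpy as np

# Ground truth: a straight line along x; forecast: the same line offset by 1.0 m in y.
gt = np.stack([np.arange(10, dtype=float), np.zeros(10)], axis=1)
forecast = gt + np.array([0.0, 1.0])

# Every frame is exactly 1.0 m apart, so both displacement errors are 1.0.
min_ade = np.linalg.norm(forecast - gt, axis=1).mean()  # average displacement error
min_fde = np.linalg.norm(forecast[-1] - gt[-1])         # final displacement error

p = 0.6  # assumed probability assigned to the best forecast
p_min_ade = (1 - p) ** 2 + min_ade  # (0.4)^2 + 1.0 = 1.16
print(min_ade, min_fde, p_min_ade)
```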
)
+ curr_min_fde
)
prob_min_ade.append((1 - pruned_probabilities[min_idx]) ** 2 + curr_min_ade)
What was the rationale for using -log before?
Maybe a reference to papers that use the Brier score would be useful here.
Added a reference to the original Brier score in the PR description. To my knowledge, no one has used the Brier score with displacement error, so there's no reference for that.
The Brier score is a commonly used metric for probability calibration.
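For context, the original Brier score is the mean squared difference between predicted probabilities and binary outcomes; a minimal sketch (function name is illustrative):

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A perfectly confident, correct forecaster scores 0.0; chance-level 0.5 scores 0.25.
print(brier_score([1.0, 0.0, 1.0], [1, 0, 1]))  # 0.0
print(brier_score([0.5, 0.5], [1, 0]))          # 0.25
```

Lower is better, and the score directly rewards well-calibrated probabilities, which is the property the new metrics borrow.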
This PR adds Brier-score-based metrics to forecasting. The two new metrics are:
brier-minFDE = (1.0 - p)^2 + minFDE
and brier-minADE = (1.0 - p)^2 + minADE
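The two formulas above can be sketched directly; the function and variable names here are illustrative, not the PR's actual API:

```python
def brier_min_ade(min_ade: float, p: float) -> float:
    """Average displacement error plus the squared probability shortfall."""
    return (1.0 - p) ** 2 + min_ade

def brier_min_fde(min_fde: float, p: float) -> float:
    """Final displacement error plus the squared probability shortfall."""
    return (1.0 - p) ** 2 + min_fde

# e.g. a forecast with minFDE = 2.0 m and probability 0.8 scores 2.0 + 0.04 = 2.04
print(brier_min_fde(2.0, 0.8))
```

Because the penalty term is bounded in [0, 1], it nudges methods toward calibrated probabilities without letting the probability term dominate the displacement error.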
Motivation for using the Brier score for assessing uncertainty:
Top methods from the EvalAI leaderboard: