
Added brier score for p-metrics in forecasting #209

Merged: 1 commit merged into master from forecasting_prob_metric on Apr 23, 2021

Conversation

@jagjeet-singh (Contributor) commented Apr 19, 2021

This PR adds Brier-score-based metrics to forecasting. The two new metrics are:
brier-minFDE = (1.0 - p)^2 + minFDE and brier-minADE = (1.0 - p)^2 + minADE

Motivation for using the Brier score to assess uncertainty (a minimal sketch of the computation follows this list):

  • The Brier score is a commonly used metric in probability calibration (Brier, G. W. "Verification of forecasts expressed in terms of probability." Monthly Weather Review, 1950).
  • The probability contribution is in [0.0, 1.0], ensuring probabilities are never penalized too heavily.
  • It has no tunable parameters.
  • Since minFDE and the probability penalty (1.0 - p)^2 are in a similar range, methods are incentivized to do well on both.
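For concreteness, a minimal sketch of how the two metrics compose, assuming min_ade/min_fde have already been computed for the best-matching forecast and p is its predicted probability (the function names here are illustrative, not the PR's actual API):

def brier_penalty(p: float) -> float:
    """Brier-style probability penalty; always lies in [0.0, 1.0]."""
    return (1.0 - p) ** 2

def brier_min_fde(min_fde: float, p: float) -> float:
    """brier-minFDE = (1.0 - p)^2 + minFDE."""
    return brier_penalty(p) + min_fde

def brier_min_ade(min_ade: float, p: float) -> float:
    """brier-minADE = (1.0 - p)^2 + minADE."""
    return brier_penalty(p) + min_ade

# Example: a forecast 1.0 m off with probability 0.5 scores 0.25 + 1.0 = 1.25.
print(brier_min_fde(1.0, 0.5))  # 1.25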


Top methods from the EvalAI leaderboard: [image]

@johnwlambert (Contributor)

Thanks for the PR, @jagjeet-singh. Would you mind adding a few sentences here explaining the motivation for the change, and links to any relevant literature? Maybe those could go into the docstrings as well?

Which scenarios have we tested so far?

}
assert_metrics_almost_equal(expected_metrics, metrics)

# Case 5: Top-2, 2 forecasts, uniform probabilities
Contributor

probably would be cleaner to make this 5 separate tests, with a pytest fixture?
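A rough sketch of what that refactor could look like, with hypothetical fixture and test names (not code from this PR):

import pytest

@pytest.fixture
def sample_forecasts():
    """Shared forecast/ground-truth setup reused across the metric test cases."""
    forecasts = {"seq_1": [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]}
    ground_truth = {"seq_1": [(0.0, 0.0), (1.0, 0.0)]}
    return forecasts, ground_truth

def test_top_2_uniform_probabilities(sample_forecasts):
    forecasts, ground_truth = sample_forecasts
    # compute metrics for this case and assert on the expected values
    ...

def test_top_1_single_forecast(sample_forecasts):
    forecasts, ground_truth = sample_forecasts
    ...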

Contributor Author

I was on the fence about it but ended up refactoring the tests anyway. It's more organized now.

Contributor Author

Thanks for the push.

"minFDE": 1.0,
"DAC": 1.0,
"MR": 0.0,
"p-minADE": 1.16,
Contributor

these numbers seem a bit hard to work out by hand -- can we come up with a simple example, e.g. straight lines or something, where we can work it out by hand?

Contributor Author

I guess the final metric values need not be easy to work out by hand as long as the test example is self-explanatory. The test example covers a few cases that straight-line examples would miss.

Contributor Author

Actually, this particular case can even be worked out by hand: the forecast and ground truth are 1.0 m apart in each frame, so it is no more complicated than straight lines (worked through just below).
The other case, where the forecast is a turning trajectory, might not be as straightforward.
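Concretely, the arithmetic for that case (the probability here is an assumed value for illustration, not necessarily the one used in the test):

# The forecast is 1.0 m from the ground truth in every frame, so both
# minADE (mean displacement) and minFDE (final displacement) equal 1.0.
min_ade = min_fde = 1.0

# If the matching forecast were assigned probability p = 0.6, the Brier
# penalty would be (1 - 0.6)^2 = 0.16, yielding 1.0 + 0.16 = 1.16.
p = 0.6
print((1.0 - p) ** 2 + min_ade)  # 1.16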

prob_min_fde.append(
    (1 - pruned_probabilities[min_idx]) ** 2
    + curr_min_fde
)
prob_min_ade.append((1 - pruned_probabilities[min_idx]) ** 2 + curr_min_ade)
Contributor

what was the rationale for using -log before?

maybe a reference to papers that use Brier would be useful here

Contributor Author

Added a reference to the original Brier score paper in the PR description. To my knowledge, no one has used the Brier score with displacement error, so there's no reference for that combination.
The Brier score is a commonly used metric for probability calibration (quick comparison with -log below).
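To illustrate why the bounded penalty is attractive, a small comparison of the two penalty shapes, assuming the earlier p-metrics added a -log(p) term as the question above suggests:

import math

for p in (1.0, 0.5, 0.1, 0.01):
    neg_log = -math.log(p)   # unbounded: blows up as p -> 0
    brier = (1.0 - p) ** 2   # bounded: always in [0.0, 1.0]
    print(f"p={p:<5} -log(p)={neg_log:6.2f}  (1-p)^2={brier:5.2f}")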

@jagjeet-singh merged commit 98393cb into master on Apr 23, 2021
@benjaminrwilson deleted the forecasting_prob_metric branch on July 23, 2021