Releases: epiforecasts/scoringutils
scoringutils 2.0.0
This update represents a major rewrite of the package and introduces breaking changes. If you want to keep using the older version, you can download it using remotes::install_github("epiforecasts/scoringutils@v1.2").
The update aims to make the package more modular and customisable and overall cleaner and easier to work with. In particular, we aimed to make the suggested workflows for evaluating forecasts more explicit and easier to follow (see visualisation below). To do that, we clarified input formats and made them consistent across all functions. We refactored many functions to S3-methods and introduced forecast objects with separate classes for different types of forecasts. A new set of as_forecast_<type>() functions was introduced to validate the data and convert inputs into a forecast object (a data.table with a forecast class and an additional class corresponding to the forecast type (see below)). Another major update is the possibility for users to pass in their own scoring functions into score(). We updated and improved all function documentation and added new vignettes to guide users through the package. Internally, we refactored the code, improved input checks, updated notifications (which now use the cli package) and increased test coverage.
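The design described above (a validating constructor, a general forecast class plus a type-specific class, S3 dispatch, and user-supplied scoring functions) can be illustrated with a small base R sketch. This is not the package's implementation: every name ending in _sketch is hypothetical, a plain data.frame stands in for the data.table used by the package, and the column names observed, predicted and model are assumptions made for the example.

```r
# Hypothetical sketch of the design, not the scoringutils implementation.

# Constructor: validate the input, then attach a general class and a
# type-specific class in front of the existing data.frame class.
as_forecast_point_sketch <- function(data) {
  required <- c("observed", "predicted", "model")
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  class(data) <- c("forecast_point_sketch", "forecast_sketch", class(data))
  data
}

# Generic; S3 dispatch picks the method that matches the forecast type.
score_sketch <- function(forecast, metrics, ...) {
  UseMethod("score_sketch")
}

# Method for point forecasts: apply every user-supplied scoring function
# to the observed and predicted values and store the result as a column.
score_sketch.forecast_point_sketch <- function(forecast, metrics, ...) {
  out <- as.data.frame(forecast) # drops the sketch classes again
  for (name in names(metrics)) {
    out[[name]] <- metrics[[name]](forecast$observed, forecast$predicted)
  }
  out
}

forecasts <- data.frame(
  model = c("a", "a", "b", "b"),
  observed = c(10, 20, 10, 20),
  predicted = c(12, 17, 10, 25)
)

# Users pass in their own scoring functions as a named list.
my_metrics <- list(
  abs_error = function(observed, predicted) abs(observed - predicted),
  sq_error = function(observed, predicted) (observed - predicted)^2
)

scores <- score_sketch(as_forecast_point_sketch(forecasts), metrics = my_metrics)
print(scores)
```

Keeping validation in the constructor means the scoring method can rely on the required columns being present, which is one plausible reason for separating the two steps.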
The most comprehensive documentation for the new package after the rewrite is the revised version of our original scoringutils paper.
See the NEWS file for a detailed overview of the changes.
What's Changed
- Replace use of ".Rda" by use of ".rds" where appropriate by @nikosbosse in #318
- Replace ..density.. by after_stat(density) by @nikosbosse in #319
- Create an S3 method for plot_avail_forecasts by @nikosbosse in #322
- Improvements to avail_forecasts() --> available_forecasts() by @nikosbosse in #321
- Add deprecation notes by @nikosbosse in #323
- Add documentation and make rounding for correlation() explicit by @nikosbosse in #320
- Rework scoring functions and their interface by @nikosbosse in #341
- Fix linting by @nikosbosse in #384
- Rework score() by @nikosbosse in #344
- Rework summarise_scores() and pairwise_comparison() by @nikosbosse in #368
- Intermediate clean up by @nikosbosse in #375
- Update branch to include coverage functions by @nikosbosse in #394
- Update branch with older changes to manuscript by @nikosbosse in #440
- Rework quantile to interval format by @nikosbosse in #377
- First step towards reworking score.scoringutils_quantile() - adding coverage as a metric by @nikosbosse in #395
- Update bias_quantile() to work with vectors / matrices instead of data.table by @nikosbosse in #396
- Update functions to compute the absolute median by @nikosbosse in #419
- Rework score.scoringutils quantile()2 by @nikosbosse in #421
- Move tests around by @nikosbosse in #422
- Expand tests2 by @nikosbosse in #423
- Rework add coverage to work with raw forecasts by @nikosbosse in #426
- Simplify score() by @nikosbosse in #430
- Add separate functions for wis components by @nikosbosse in #397
- Fix set forecast unit by @nikosbosse in #437
- Add coverage deviation as a metric by @nikosbosse in #417
- Rework add coverage() by @nikosbosse in #390
- Rework quantile scores by @nikosbosse in #388
- Fix failing CI issues by deleting code remnants that shouldn't be there anymore by @nikosbosse in #478
- Fix failing CI issues by deleting code remnants that shouldn't be there anymore by @nikosbosse in #479
- Issue #480: Fix gh action by @nikosbosse in #483
- Fix small issues resulting from merging several updates into dev by @nikosbosse in #468
- Issue #405: expose get_forecast_type() to users by @nikosbosse in #466
- Issues #402: Expose get_forecast_unit() to users by @nikosbosse in #464
- Update input formats for binary and point forecasts by @nikosbosse in #460
- Fix pkgdown by @nikosbosse in #482
- Issue #500: Reduce messages in bias_quantile() by @nikosbosse in #501
- Issue #485: Fix linting by @nikosbosse in #509
- Issue #443 Drop interval functions by @nikosbosse in #525
- Issue #519: Fix rendering the Readme by @nikosbosse in #526
- Issue #275: Fix handling of scalar inputs in logs_binary() by @nikosbosse in #524
- Issue #446: Remove function delete_columns() by @nikosbosse in #529
- Issue #403: Rename available_forecasts() to get_forecast_counts() by @nikosbosse in #511
- Issue #494: New workflow for creating and validating forecast objects by @nikosbosse in #531
- Issue #520: Rename forecast classes from scoringutils_* to forecast_* by @nikosbosse in #533
- Issue #452: Add documentation for apply metrics by @nikosbosse in #470
- Issue 519: Fix failing render Readme action by @nikosbosse in #534
- Issue #404: Use na.omit() to remove NA values before scoring by @nikosbosse in #465
- Issue #474: Make default scoring rules functions rather than stored data sets by @nikosbosse in #536
- Issue #448: Add back in examples for a few plots by @nikosbosse in #463
- Reduce the number of attributes used by validate_forecast() by @nikosbosse in #541
- Issue #436 Rename instances of "coverage" to either "interval coverage" or "quantile coverage" by @nikosbosse in #540
- Fix error message for interval_coverage_quantile() by @nikosbosse in #549
- Issue #552 Merge main into develop by @nikosbosse in #556
- Issue 553: interval_coverage_ testing improvements by @seabbs in #554
- Issue 551: Remove scoringutils metrics as using Metrics versions by @seabbs in #548
- Issue #535: Create pkgdown docs for stable and dev versions by @sbfnk in #550
- Issue 555: Rename interval_coverage_ family and remove sample version by @seabbs in #558
- Issue 560: Fix render_readme by @seabbs in #561
- Issue #564: add dependabot to keep GitHub Actions up to date by @sbfnk in #565
- Issue #566: don't clean upon pkgdown deployment. by @sbfnk in #567
- Issue #547: update package description in README by @sbfnk in #563
- Update render readme action in develop by @sbfnk in #576
- Issue #566: don't clean upon pkgdown deployment (this time on develop) by @sbfnk in #569
- Issue 557: Fix small numerical issue in sample_to_quantile() by @jhellewell14 in #570
- Develop into main by @sbfnk in #579
- Add current docs link to README by @seabbs in #582
- Issue #559: Ensure output of add_coverage() is an object of class forecast_quantile by @nikosbosse in #586
- Issues 520 and 484: expose function to get names of scores used in score() by @nikosbosse in #588
*...
scoringutils 1.2.2
This minor release includes two bug fixes, as well as some small changes to the package infrastructure. It reflects the current version on CRAN, 1.2.2.
Package updates
- Added a startup message to inform users of an upcoming major update and to ask for feedback (see #333)
- Added some fixes and updates to the package infrastructure on GitHub
- Updated the minimum required R version to 3.6
Bug fixes
- Fixed a bug in set_forecast_unit() that made the function fail if a data.frame instead of a data.table was provided (thank you, @elray1)
- Fixed a bug in the metrics table in the "Scoring forecasts directly" vignette where some rows were duplicated (thank you, @elray1)
What's Changed
- render README action by @nikosbosse in #314
- Website fixes by @nikosbosse in #315
- Improve gh pages and rendering of Vignettes by @nikosbosse in #316
- Issue #480: Update to R version 3.6 to fix failing gh action by @nikosbosse in #488
- Issue #486: Add startup message for a new CRAN release by @nikosbosse in #490
- Issue 428 fix metrics table by @nikosbosse in #489
- Issue #427: Fix error in set_forecast_unit() that occurs when input is not a data.table by @nikosbosse in #487
- Issue #486 Fix cran checks by @nikosbosse in #505
- Create a PR template by @nikosbosse in #491
- Issue 486: Fix NOTEs for CRAN submission by @nikosbosse in #514
Full Changelog: v1.2.0...v1.2.2
scoringutils 1.2.0
This major release contains a range of new features and bug fixes that have been introduced in minor releases since 1.1.0. The most important changes are:
- Documentation updated to reflect changes since version 1.1.0, including new transform and workflow functions.
- New set_forecast_unit() function allows manual setting of forecast unit.
- summarise_scores() gains new across argument for summarizing across variables.
- New transform_forecasts() and log_shift() functions allow forecast transformations. See the documentation for transform_forecasts() for more details and an example use case.
- Input checks and test coverage improved for bias functions.
- Bug fix in get_prediction_type() for integer matrix input.
- Links to scoringutils paper and citation updates.
- Warning added in interval_score() for small interval ranges.
- Linting updates and improvements.
Thanks to @nikosbosse, @seabbs, and @sbfnk for code and review contributions. Thanks to @Bisaloo for the suggestion to use a linting GitHub Action that only triggers on changes, and @adrian-lison for the suggestion to add a warning to interval_score() if the interval range is between 0 and 1.
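To illustrate why an interval range between 0 and 1 deserves a warning, the following base R sketch computes an unweighted interval score and warns on such ranges. It is only a sketch under stated assumptions: the function name interval_score_sketch() and the warning text are hypothetical, the range is assumed to be given in percent (e.g. 90 for a 90% interval), and the package's own interval_score() may differ in its arguments and weighting.

```r
# Hypothetical sketch, not the scoringutils implementation.
# Unweighted interval score for a central prediction interval [lower, upper]
# with nominal coverage `interval_range` given in percent.
interval_score_sketch <- function(observed, lower, upper, interval_range) {
  # A range such as 0.9 most likely means 90 was intended (assumption).
  if (any(interval_range > 0 & interval_range < 1)) {
    warning("Interval ranges between 0 and 1 found; expected percent, e.g. 90.")
  }
  alpha <- (100 - interval_range) / 100
  dispersion <- upper - lower
  # as.numeric() makes the logical indicators explicitly numeric.
  overprediction <- 2 / alpha * (lower - observed) * as.numeric(observed < lower)
  underprediction <- 2 / alpha * (observed - upper) * as.numeric(observed > upper)
  dispersion + underprediction + overprediction
}

# Observation above a 90% interval: width plus a penalty scaled by 2 / alpha.
score_90 <- interval_score_sketch(observed = 12, lower = 5, upper = 10, interval_range = 90)
print(score_90)
```

With a range of 0.9 instead of 90, alpha would be close to 1 and the penalty for missing the interval would shrink to almost nothing, which silently produces misleading scores.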
Package updates
- The documentation was updated to reflect the recent changes since scoringutils 1.1.0. In particular, usage of the functions set_forecast_unit(), check_forecasts() and transform_forecasts() is now documented in the vignettes. The introduction of these functions enhances the overall workflow and helps to make the code more readable. All functions are designed to be used together with the pipe operator. For example, one can now use something like the following:
example_quantile |>
set_forecast_unit(c("model", "location", "forecast_date", "horizon", "target_type")) |>
check_forecasts() |>
score()
Documentation for transform_forecasts() has also been extended. This function allows the user to easily add transformations of forecasts, as suggested in the paper "Scoring epidemiological forecasts on transformed scales". In an epidemiological context, for example, it may make sense to apply the natural logarithm first before scoring forecasts, in order to obtain scores that reflect how well models are able to predict exponential growth rates, rather than absolute values. Users can now do something like the following to score a transformed version of the data in addition to the original one:
data <- example_quantile[true_value > 0, ]
data |>
transform_forecasts(fun = log_shift, offset = 1) |>
score() |>
summarise_scores(by = c("model", "scale"))
Here we use the log_shift() function to apply a logarithmic transformation to the forecasts. This function was introduced in scoringutils 1.1.2 as a helper function that acts just like log(), but has an additional argument offset that can add a number to every prediction and observed value before applying the log transformation.
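The behaviour described for log_shift() can be sketched in a few lines of base R. The function below is a hypothetical stand-in named log_shift_sketch(), written only from the description above; the package function may take further arguments and perform additional checks.

```r
# Hypothetical stand-in for the behaviour described above,
# not the package's log_shift() itself.
log_shift_sketch <- function(x, offset = 0) {
  # Add the offset to every value, then take the natural logarithm.
  log(x + offset)
}

observed <- c(0, 3, 20)

# A plain log() maps zero counts to -Inf, which cannot be scored sensibly.
plain <- log(observed)

# With offset = 1, zeros are mapped to log(1) = 0 instead.
shifted <- log_shift_sketch(observed, offset = 1)

print(plain)
print(shifted)
```

This is why the example above filters with true_value > 0 or uses offset = 1: an offset keeps zero counts in the data while still scoring on a logarithmic scale.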
Feature updates
- Made check_forecasts() and score() pipeable (see issue #290). This means that users can now directly use the output of check_forecasts() as input for score(). As score() otherwise runs check_forecasts() internally anyway, this simply makes the step explicit and helps writing clearer code.
What's Changed
- tweak code samples to avoid issues with CRAN by @nikosbosse in #266
- Make boolean vector explicitly numeric when computing interval scores by @nikosbosse in #274
- Add a transform_forecasts() function by @nikosbosse in #278
- create proposal for a add_transformation function by @nikosbosse in #271
- update contents of metrics table by @nikosbosse in #280
- Add warning interval score by @nikosbosse in #281
- Edit pass on transform_forecasts and NEWS by @seabbs in #283
- Linting overhaul by @seabbs in #284
- Update protected cols by @nikosbosse in #292
- update branch from master by @nikosbosse in #291
- Add set forecast unit by @nikosbosse in #293
- fix badges by @sbfnk in #298
- Issue 272: Add an across argument to summarise_scores() by @seabbs in #302
- Corrections to metrics vignette by @sbfnk in #307
- fix quantile/sample mixup in vignette by @sbfnk in #308
- Issue 301: Address coverage gaps in bias.R by @seabbs in #305
- scoringutils 1.2.0 release by @nikosbosse in #299
Full Changelog: v1.1.0...v1.2.0
scoringutils version 1.1.0
A minor update to the package with some bug fixes and minor changes.
Feature updates
Package updates
- Removed the on attach message which warned of breaking changes in 1.0.0.
- Renamed the metric argument of summarise_scores() to relative_skill_metric. The old metric argument is now deprecated and will be removed in a future version of the package. Please use the new argument instead.
- Updated the documentation for score() and related functions to make the soft requirement for a model column in the input data more explicit.
- Updated the documentation for score(), pairwise_comparison() and summarise_scores() to make it clearer what the unit of a single forecast is that is required for computations.
- Simplified the function plot_pairwise_comparison(), which now only supports plotting mean score ratios or p-values, and removed the hybrid option to print both at the same time.
Bug fixes
- Missing baseline forecasts in pairwise_comparison() now trigger an explicit and informative error message.
- The requirements table in the getting started vignette is now correct.
- Added support for an optional sample column when using a quantile forecast format. Previously this resulted in an error.
List of Pull Requests
- Fix 223 issue with add_coverage by @seabbs in #234
- Test explicitly on minimum supported R version by @Bisaloo in #236
- use deparse instead of deparse1 by @nikosbosse in #237
- Specify which column name clashes with metrics by @Bisaloo in #239
- Fix separate results by @nikosbosse in #243
- fix pit histogram plot in #240 by @nikosbosse in #247
- Fix requirements table in getting-started vignette by @damonbayer in #249
- Reformatting to comply with JSS requirements by @nikosbosse in #251
- Fix ggplot2 update by @nikosbosse in #252
- update test by @nikosbosse in #254
- don't error if baseline model is not present by @sbfnk in #250
- 1.1.0 by @seabbs in #258
- add infra for develop branch by @seabbs in #259
- Issue 248 by @seabbs in #256
- Documentation: Update README by @seabbs in #262
- Docmentation: update score and related documentation to mention the model column by @seabbs in #260
- Feature: add enhanced treatment of sample as a protected column by @seabbs in #261
- simplify plot_pairwise_comparison() by @nikosbosse in #263
- Make forecast unit clearer by @nikosbosse in #264
New Contributors
- @damonbayer made their first contribution in #249
Full Changelog: v1.0.0...v1.1.0
1.0.0
Major update to the package and most package functions with lots of breaking changes.
Feature updates
- new and updated Readme and vignette
- the proposed scoring workflow was reworked. Functions were changed so they
can easily be piped and have simplified arguments and outputs.
new functions and function changes
- the function eval_forecasts() was replaced by a function score() with a much reduced set of function arguments.
- Functionality to summarise scores and to add relative skill scores was moved to a function summarise_scores()
- new function check_forecasts() to analyse input data before scoring
- new function correlation() to compute correlations between different metrics
- new function add_coverage() to add coverage for specific central prediction intervals
- new function avail_forecasts() to visualise the number of available forecasts
- new function find_duplicates() to find duplicate forecasts which cause an error
- all plotting functions were renamed to begin with plot_. Arguments were simplified
- the function pit() now works based on data.frames. The old pit function was renamed to pit_sample(). PIT p-values were removed entirely.
- the function plot_pit() now works directly with input as produced by pit()
- many data-handling functions were removed and input types for score() were restricted to sample-based, quantile-based or binary forecasts.
- the function brier_score() now returns all brier scores, rather than taking the mean before returning an output.
- crps, dss and logs were renamed to crps_sample(), dss_sample(), and logs_sample()
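The change to brier_score() in the list above (one score per forecast instead of a single mean) can be illustrated with a base R sketch. The function brier_score_sketch() and its argument names are hypothetical and only follow the standard definition of the Brier score for binary outcomes; they are not copied from the package.

```r
# Hypothetical sketch of a per-forecast Brier score,
# not the package's brier_score() itself.
brier_score_sketch <- function(observed, predicted) {
  # Outcomes must be 0 or 1, predictions must be probabilities.
  stopifnot(
    all(observed %in% c(0, 1)),
    all(predicted >= 0 & predicted <= 1)
  )
  # One squared error per forecast; no averaging inside the function.
  (predicted - observed)^2
}

observed <- c(1, 0, 1, 0)
predicted <- c(0.9, 0.2, 0.4, 0.5)

brier_scores <- brier_score_sketch(observed, predicted)
print(brier_scores)

# Averaging is left to the caller, e.g. overall or per model.
mean_brier <- mean(brier_scores)
print(mean_brier)
```

Returning individual scores lets a separate summary step decide how to aggregate, which fits the split between score() and summarise_scores() described above.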
Bug fixes
- Testing was expanded
- minor bugs were fixed, for example a bug in the sample_to_quantile function (#223)
package data updated
- package data is now based on forecasts submitted to the European Forecast Hub (https://covid19forecasthub.eu/).
- all example data files were renamed to begin with example_
- a new data set, summary_metrics, was included that contains a summary of the metrics implemented in scoringutils
Other breaking changes
- The 'sharpness' component of the weighted interval score was renamed to
dispersion. This was done to make it more clear what the component represents
and to maintain consistency with what is used in other places.
0.1.8
- Typos by @Bisaloo in #109
- fix plot predictions when no truth is available by @nikosbosse in #108
- Fix bug in metrics selection by @nikosbosse in #112
- Create pkgdown reference index by @Bisaloo in #113
- Add GitHub action to automatically rebase PR when a comment with '/rebase' is posted by @Bisaloo in #114
- Add pkgdown website to DESCRIPTION by @Bisaloo in #118
- Fix heading levels in NEWS.md by @Bisaloo in #116
- add pkgdown.yaml to build ignore by @seabbs in #123
- Use fct() to enable auto-linking on pkgdown by @Bisaloo in #119
- Remove NEWS.md and Readme.md from .Rbuildignore by @Bisaloo in #117
- Remove unnecessary vapply() by @Bisaloo in #120
- fix typo by @sbfnk in #131
- fix plot_predictions if no median forecast available by @sbfnk in #137
- avoid mixture of NA and NaN by @sbfnk in #139
- visible data.table return in a few functions by @sbfnk in #130
- Update check-full.yaml by @nikosbosse in #143
- update news file format by @nikosbosse in #144
- update_list -> internal by @seabbs in #145
- Change data.table return statements by @nikosbosse in #142
- update branch from master by @nikosbosse in #146
- update branch from master by @nikosbosse in #147
- Convert roxygen2 comments to markdown by @Bisaloo in #115
- Add Hugo as an author by @nikosbosse in #135
- Bug fixes to plot_predictions() by @Bisaloo in #153
- Address some of the points in Hugo's review by @nikosbosse in #149
- Add check function by @nikosbosse in #154
- Add available_metrics to pkgdown index by @Bisaloo in #158
- add check_forecasts to pkgdown yaml by @nikosbosse in #161
- add print method to pkgdown yaml by @nikosbosse in #162
- Add docker + vscode support by @seabbs in #177
- Remove rebase-comment action by @Bisaloo in #199
Full Changelog: v0.1.7...v0.1.8
0.1.7
v0.1.7 make criteria to classify a forecast as binary stricter