Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics_ -> metrics.scoringutils.quantile etc #832

Open
seabbs opened this issue May 24, 2024 · 28 comments
Open

metrics_ -> metrics.scoringutils.quantile etc #832

seabbs opened this issue May 24, 2024 · 28 comments
Labels
Breaking change This issue discusses or suggests a potentially breaking change enhancement New feature or request

Comments

@seabbs
Copy link
Contributor

seabbs commented May 24, 2024

As the title it seems like metrics_ functions should be exported to S3 methods? They can dispatch on the input to score when used in the default mode. This would streamline adding a custom thing to score etc.

@seabbs seabbs added the enhancement New feature or request label May 24, 2024
@nikosbosse
Copy link
Contributor

But it would also make it a bit harder to customise your own lists of functions, wouldn't it? Because you'd always have to pass in a forecast object.

This would streamline adding a custom thing to score etc.
Could you elaborate more please?

@seabbs
Copy link
Contributor Author

seabbs commented May 28, 2024

why? You can just call the underlying list (i.e metrics.class()) directly?

@nikosbosse
Copy link
Contributor

So this would boil down to people calling something like metrics(example_quantile) (or metrics(their_data)) instead of metrics_quantile(), right?
And would make the metrics() function more naturally customisable.

@Bisaloo what are your thoughts on this?

@nikosbosse nikosbosse added the Breaking change This issue discusses or suggests a potentially breaking change label Jul 14, 2024
@seabbs
Copy link
Contributor Author

seabbs commented Jul 18, 2024

yees exactly + better match the overall package design. The push back to doing this is that currently all thee metrics themselves have _sample etc names so the current approach is consistent with that

@Bisaloo
Copy link
Member

Bisaloo commented Jul 24, 2024

Yes, it sounds like a good idea to me. It would reduce the (visible) NAMESPACE because all methods can be seen through their shared generic.

As previously mentioned, any reasonable solution that reduces scoringutils' NAMESPACE is probably a win in terms of usability in my opinion.

@nikosbosse
Copy link
Contributor

Oh.. this is annoying.

the problem with a setup like this:

score <- function(forecast, metrics = metrics(forecast), ...) {
  UseMethod("score")
}

is Error: promise already under evaluation: recursive default argument reference or earlier problems? - we can't have the variable be called metrics and also have the default argument be metrics(forecast).

Also the select_metrics() function becomes a bit unwieldy if you don't have a forecast object at hand:

#' select_metrics(
#'   metrics = metrics(as_forecast_binary(example_binary)),
#'   exclude = "log_score"
#' )

I don't think this is a hard blocker (we can replace metrics = metrics(forecast) with something like metrics = NULL, but I'm not sure it makes things actually nicer/easier for the user.

@nikosbosse
Copy link
Contributor

Options:

A) staying with metrics_point(), metrics_binary() etc.
B) Creating an S3 method, but renaming metrics() to e.g. metrics_default() such that we can use score(forecast, metrics = metrics_default(forecast)) as default
C) Using score(forecast, metrics = NULL) as the default and substituting NULL with metrics(forecast) inside the function code.

I'm voting for either A or B. We moved away from NULL as an argument default wherever possible in the code.
(maybe slight preference for A in light of how we might make it easier in the future to compose lists of functions and it's a bit unwieldy to call the function on an object all the time.)

What do you think @seabbs @sbfnk @Bisaloo @jamesmbaazam ?

@seabbs
Copy link
Contributor Author

seabbs commented Jul 30, 2024

I have a slight preference for B and regardless of it being s3 slightly prefer the name as it is more accurate.

unwieldy to call the function on an object all the time

I agree this is a bit unwieldy but I think the upside of having fewer function names for people to find + automation is worth it.

I also don't think accessing via default_metrics.forecast_point() is terrible either but that might just be me.

@nikosbosse
Copy link
Contributor

Other opinions on this? :)

@Bisaloo
Copy link
Member

Bisaloo commented Jul 31, 2024

I cannot answer anything useful because I'm still struggling to understand #832 (comment).

My gut feeling (perhaps completely wrong since I don't fully understand what's going on) is that we can do something about this error without changing anything else in the proposed arguments or structure.

Is it possible to push the broken code to a branch so we can have a look and experiment?

@nikosbosse
Copy link
Contributor

The general issue is that R doesn't allow you to pass something to an argument that has the same name of that argument. e.g. the following leads to an error:

f <- function(x = 1:10, mean = mean(x)) {
  print(mean)
}
f()

So we can't have a function

score <- function(forecast, metrics = metrics(forecast), ...) {
}

Previously,

score <- function(forecast, metrics = metrics_binary(), ...) {
}

worked because the names were different.

@nikosbosse
Copy link
Contributor

(not sure this is what actually what confused you - happy to create another branch if you like)

@Bisaloo
Copy link
Member

Bisaloo commented Jul 31, 2024

Oooh, that makes sense, thanks for explaining!

I'd go with B with a small change: convert the function name to a verb, such as get_metrics().

@nikosbosse
Copy link
Contributor

We currently already have a function get_metrics() (which I don't like very much). The function reads the metrics attribute from a scores object, i.e. it checks the actual metrics that a scores object has based on the attribute.

I don't like the function because I don't like having to rely on the attribute of the scores object (but that's probably a discussion for a different issue).
ok it seems like B is emerging as the preferred option.

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

is our current get_metrics internal? If so we could just change the name. I agree I like using a verb here and it fits with the rest of the get_ names we have fairly well I think?

@nikosbosse
Copy link
Contributor

It's exported and it's used in functions like this:

get_correlations <- function(scores,
                             metrics = get_metrics(scores),
                             ...) {
}

This is the function definition of get_metrics():

get_metrics <- function(scores, error = FALSE) {

We could rename get_metrics() to something else. get_used_metrics()?
Renaming default_metrics() to get_default_metrics() seems a bit too long for my taste. I don't hate default_metrics() though.

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

get_scored_metrics maybe is a better fit anyway?

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

Just noting that another benefit of this change is we can reduce the number of custom methods that there are for score as for a few of them the metrics list is the only current difference (this would require relying on a common forecast class).

see here: https://github.com/epiforecasts/scoringutils/blob/main/R/score.R

This could help with #852 as it could reduce the amount of boiler plate users have to right

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

Perhaps this wouldn't really be likely to happen though. In the nominal case this wouldn't have saved any code for example (https://github.com/epiforecasts/scoringutils/pull/837/files)

@nikosbosse
Copy link
Contributor

nikosbosse commented Jul 31, 2024

Current contenders are:

B1:

  • get_metrics() to replace current functions metrics_binary(), metrics_quantile() etc.
  • get_scored_metrics() to replace current function get_metric() (which returns the metrics attribute of a scores object

B2:

  • default_metrics() to replace current functions metrics_binary(), metrics_quantile() etc.
  • get_metrics() stays the same

B3:

  • default_metrics() to replace current functions metrics_binary(), metrics_quantile() etc.
  • get_scored_metrics() to replace current function get_metric() (which returns the metrics attribute of a scores object

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

my preference is B1 but I recognise it is more change. I like it as a proposal as it makes it clear where get_scored_metrics is getting its metrics from which I am not sure was true in the past?

@nikosbosse
Copy link
Contributor

nikosbosse commented Jul 31, 2024

Luckily, we now also have B3 :D
I'm slightly leaning B2/B3. Do we have other opinions? One has to love these fast-paced high-stakes scoringutils dev decisions...

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

the future of the world hangs on a knife edge..

I still prefer B1.... get_metrics seems consistent with everything else. What is the push back in favour of default_metrics?

@nikosbosse
Copy link
Contributor

I think my argument is the same that you made about the previous get_metrics: default_metrics tells you what to expect, whereas with get_metrics() it's less clear. But 🤷
@sbfnk I think we need professor energy here :)

@seabbs
Copy link
Contributor Author

seabbs commented Jul 31, 2024

so what about get_default_metrics? Its still in line with the naming scheme and also captures the idea of a default?

The bit I don't like about that is that if you select as an arg then it is no longer return the defaults so the name is not quite right (so I would still fall back on preferring get_metrics).

@nikosbosse
Copy link
Contributor

Having done some waiting coupled with no new revelations, I propose we go with get_metrics() to replace metrics_quantile() etc.

For getting the metrics that were actually used for scoring we could either use get_scored_metrics() or we could simply go with get_metrics() as well and just implement a slightly different behaviour for scored objects: instead of all possible metrics, you just get the ones that were actually used.

What do you think?

Noting that this PR https://github.com/hubverse-org/hubEvals/pull/46/files will break with the changes here (cc @elray1), so we need to coordinate this a bit.

@nikosbosse
Copy link
Contributor

nikosbosse commented Sep 9, 2024

There is one additional change I propose. we previously discussed for our example data to be already validated objects rather than raw data.tables. My previous argument was "it's nice to demonstrate the as_forecast_[] workflow to users and we therefore should not have pre-validated example data.

With the change we are discussing here I think the benefits of being able to look at default metrics by calling get_metrics(example_quantile) over get_metrics(as_forecast_quantile(example_quantile)) are significant and we should likely switch to pre-validated example data.

@seabbs
Copy link
Contributor Author

seabbs commented Sep 9, 2024

I think I agree on all points. I was a little worried by the auto-magical idea of having a score method for get_metrics but I think its the right call and keeps the user exposed interface smaller

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Breaking change This issue discusses or suggests a potentially breaking change enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants