Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make boolean vector explicitly numeric when computing interval scores #274

Merged
merged 2 commits into from
Mar 2, 2023

Conversation

nikosbosse
Copy link
Contributor

Addresses #273
Ultimately the issue seems to be outside scoringutils, however adding as.numeric to the code doesn't add much complexity and just makes explicit what is happening implicitly anyway.
Discussion here: epinowcast/epinowcast#198

@codecov
Copy link

codecov bot commented Mar 2, 2023

Codecov Report

Merging #274 (17f23a7) into master (ae82ca2) will increase coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head 17f23a7 differs from pull request most recent head 59cb370. Consider uploading reports for the commit 59cb370 to get more accurate results

@@            Coverage Diff             @@
##           master     #274      +/-   ##
==========================================
+ Coverage   90.80%   90.82%   +0.01%     
==========================================
  Files          21       21              
  Lines        1284     1286       +2     
==========================================
+ Hits         1166     1168       +2     
  Misses        118      118              
Impacted Files Coverage Δ
R/interval_score.R 94.73% <100.00%> (+0.29%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@nikosbosse nikosbosse requested a review from seabbs March 2, 2023 11:40
@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

Will circle back but just to reiterate my point from the original issue - I think having some input checking is very in scope for here.

Also did you have any luck determining how local this was?

Copy link
Contributor

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code wise this looks fine.

Do we want any code changes for inputting checking (i.e higher up in the code somewhere) or changes to documentation?

This change also needs to be flagged in the news and the dev version needs to be incrmented.

@nikosbosse
Copy link
Contributor Author

nikosbosse commented Mar 2, 2023

The original "issue" is with something in epinowcast. Haven't looked at what's up there.

image

I'm also not sure how to do more input testing for the specific case: I could check for whether the input is numeric, but the epinowcast example would pass that test, yet the code as is would still fail.
(the error contains about the second vector not being numeric, hence the proposed change).

image

Agree that more input testing is always good and I can add more tests, but that wouldn't have prevented this specific error.

@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

The original "issue" is with something in epinowcast. Haven't looked at what's up there.

That is only the case if it doesn't replicate to other use cases right? We aren't doing anything particularly strange here so I see no reason why it wouldn't.

As I said there is no input checking here so at the moment if this happens again it either 1. just works. or 2. fails in a strangely cryptic way. Ideally, it would either 1. just work or 2. fail meaningfully hence the need for input checking?

Is this in fact a bug with pillar?

I could check for whether the input is numeric,

Even if this passes in this case it still seems like something we should have for such a key input.

Wait in the code are you doing number * logical? Is that a good pattern?

@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

ah so you code is

as..numeric(summarised_nowcast$q95 > value)

so not quite written as in your example and not a bad pattern I would say

@nikosbosse
Copy link
Contributor Author

The original code was numeric * logical and the conversion from logical to (0, 1) happened implicitly. This usually always works, but for the pillar vectors (wherever these are coming from...) didn't work.
The fixed code now does numeric * as.numeric(logical)

@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

Ah okay. Yes then I think numeric * logical isn't a great pattern so not the worst thing to update here.

Can you replicate this with a simplepillar reprex? If yes perhaps it could/should be an issue for them?

@nikosbosse
Copy link
Contributor Author

Just looked into this - simply checking for is.numeric doesn't even work, because sometimes inputs can be NA.
options:

  • I could update every NA to NA_real_ before
  • check whether it is either numeric or NA.
  • use assert_numeric() from the checkmate package (https://mllg.github.io/checkmate/). Introducing another dependency, but also looks really nice.

Maybe checkmate is the way to go

@nikosbosse
Copy link
Contributor Author

nikosbosse commented Mar 2, 2023

Can you replicate this with a simplepillar reprex? If yes perhaps it could/should be an issue for them?

reprex would be

pillar::num(c(1:3)) * c(TRUE, TRUE, FALSE)

Do you think this is a bug or desired behaviour on their end?

@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

Have no idea. I would have thought they want to it behave in the same way as numeric?

@nikosbosse
Copy link
Contributor Author

created an issue. r-lib/pillar#630. We are good bings.

@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

In terms of input checking checkmate might be the way to go though I know @Bisaloo had some reservations?

@nikosbosse
Copy link
Contributor Author

I also thought about writing a custom check function, but this seems like reinventing the wheel a bit. Also there are all kinds of complicated things that happen when you test for NA values as a vector of NA values is deemed to be a logical in R.

Here is what I came up with as a first quick draft (could be improved a lot), but again not sure how deep we want to go

check_type <- function(input, allowed_types) {
  check <- list()
  
  if ("numeric" %in% allowed_types) {
    check[["numeric"]] <- sapply(input, is.numeric)
  }
  
  if ("logical" %in% allowed_types) {
    # note that NA is logical
    check[["logical"]] <- sapply(input, is.logical)
  }
  
  check <- (check$numeric | check$logical)
  
  if (all(check)) {
    return(TRUE)
  } else {
    stop(
      paste0(
        "input is not of expected type"
      )
    )
  }
}

@nikosbosse
Copy link
Contributor Author

Maybe we should merge this for now and create another issue for improving input checks

@seabbs
Copy link
Contributor

seabbs commented Mar 2, 2023

Maybe we should merge this for now and create another issue for improving input checks

Agree.

I often write my own checks as all the tools aren't ideal but as you say its a lot of work.

@nikosbosse nikosbosse merged commit 66487aa into master Mar 2, 2023
@seabbs seabbs deleted the adjust-interval-score branch March 2, 2023 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants