recode_shadow() special missings are not accounted for by summary functions #340

Open
mkcaulfield opened this issue Feb 2, 2024 · 1 comment

@mkcaulfield

Hi! Thanks so much for the package; it's such an important tool that the R ecosystem really needed!

I've been using recode_shadow() to handle some special missings. That works to change the shadow columns and update the factor levels, but functions that summarize missingness in the data frame, e.g., add_any_miss() or miss_var_table(), don't recognize these recoded special missings AS missing. In the reprex below, add_any_miss() marks the first row of the data frame as complete despite the -99 value for wind being a special missing. It might be nice to have an option to choose whether NA aggregations distinguish between "true" / plain NA and special NAs, but if not, I think this omission could easily mislead someone about the completeness of their data.


```r
library(naniar)
df <- tibble::tribble(
  ~wind, ~temp,
  -99,    45,
  68,    NA,
  72,    25
)

df
#> # A tibble: 3 × 2
#>    wind  temp
#>   <dbl> <dbl>
#> 1   -99    45
#> 2    68    NA
#> 3    72    25

df_recode <- df |>
  bind_shadow() |>
  recode_shadow(wind = .where(wind == -99 ~ "broken_machine"))

df_recode |> add_any_miss()
#> # A tibble: 3 × 5
#>    wind  temp wind_NA           temp_NA any_miss_all
#>   <dbl> <dbl> <fct>             <fct>   <chr>
#> 1   -99    45 NA_broken_machine !NA     complete
#> 2    68    NA !NA               NA      missing
#> 3    72    25 !NA               !NA     complete

df_recode |> miss_var_table()
#> # A tibble: 2 × 3
#>   n_miss_in_var n_vars pct_vars
#>           <int>  <int>    <dbl>
#> 1             0      3       75
#> 2             1      1       25
```

Created on 2024-02-02 with reprex v2.1.0
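
In the meantime, one possible stopgap is to read missingness off the shadow columns rather than the data columns. The following is only a rough sketch in base R, not anything naniar provides: `any_miss_with_special()` and `n_miss_with_special()` are hypothetical helper names, and the sketch assumes shadow columns can be identified by the `_NA` suffix and that every level other than `"!NA"` should count as missing. The example data frame is built by hand to mirror the `add_any_miss()` output printed above, so it runs without naniar.

```r
# Hypothetical helpers (not part of naniar). Assumption: shadow columns
# end in "_NA" and any value other than "!NA" means "missing", whether
# it is a plain NA or a recoded special missing.
shadow_flags <- function(data, shadow_suffix = "_NA") {
  shadow_cols <- names(data)[endsWith(names(data), shadow_suffix)]
  stopifnot(length(shadow_cols) > 0)
  # One logical vector per shadow column: TRUE where the value is not "!NA"
  lapply(data[shadow_cols], function(x) as.character(x) != "!NA")
}

# Row-level label, similar in spirit to add_any_miss()
any_miss_with_special <- function(data, shadow_suffix = "_NA") {
  # A row is missing if any of its shadow columns is flagged
  row_has_miss <- Reduce(`|`, shadow_flags(data, shadow_suffix))
  ifelse(row_has_miss, "missing", "complete")
}

# Per-variable count of missings, including special missings
n_miss_with_special <- function(data, shadow_suffix = "_NA") {
  vapply(shadow_flags(data, shadow_suffix), sum, integer(1))
}

# Hand-built stand-in for df_recode, copied from the printed output above
shadow_df <- data.frame(
  wind = c(-99, 68, 72),
  temp = c(45, NA, 25),
  wind_NA = factor(c("NA_broken_machine", "!NA", "!NA")),
  temp_NA = factor(c("!NA", "NA", "!NA"))
)

any_miss_with_special(shadow_df)
n_miss_with_special(shadow_df)
```

With this data, the first row comes back as missing because `wind_NA` is `NA_broken_machine`, which is the behavior I was expecting from the summary functions. Telling plain NA apart from special NAs would need one more step, e.g. comparing the shadow value against `"NA"` specifically.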

@njtierney
Owner

Hello!

Thank you very much for the kind words :)

I'm glad to hear that you are using the special missings feature, and this is a great point that there should be some way to support/account for them in the missingness summaries.

When I'm next able to get some time to do a sprint on naniar and visdat, I will revisit this and touch base. Hopefully that will be sooner (0-3 months) rather than later!

@njtierney njtierney added this to the V1.1.0 milestone Feb 5, 2024