Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix na_level bug in summarize_vars #923

Merged
merged 4 commits into from
May 17, 2023
Merged

Conversation

edelarua
Copy link
Contributor

@edelarua edelarua commented May 16, 2023

Closes #911

I believe we should discuss refactoring this function entirely to process NA values in a more sensical way.

@edelarua edelarua changed the title Fix na_level bug Fix na_level bug in summarize_vars May 16, 2023
@edelarua edelarua marked this pull request as ready for review May 16, 2023 21:23
@edelarua edelarua added the sme label May 16, 2023
@github-actions
Copy link
Contributor

github-actions bot commented May 16, 2023

badge

Code Coverage Summary

Filename                                   Stmts    Miss  Cover    Missing
---------------------------------------  -------  ------  -------  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
R/abnormal_by_baseline.R                      63       0  100.00%
R/abnormal_by_marked.R                        52       5  90.38%   124-128
R/abnormal_by_worst_grade_worsen.R           113       3  97.35%   233-235
R/abnormal_by_worst_grade.R                   37       0  100.00%
R/abnormal.R                                  40       0  100.00%
R/analyze_vars_in_cols.R                      37       1  97.30%   113
R/combination_function.R                       9       0  100.00%
R/compare_variables.R                        138       3  97.83%   134, 240, 258
R/control_incidence_rate.R                    10       0  100.00%
R/control_logistic.R                           7       0  100.00%
R/control_step.R                              23       1  95.65%   58
R/control_survival.R                          15       0  100.00%
R/count_cumulative.R                          47       1  97.87%   63
R/count_missed_doses.R                        31       0  100.00%
R/count_occurrences_by_grade.R                84       6  92.86%   156-158, 161, 176-177
R/count_occurrences.R                         61       1  98.36%   92
R/count_patients_events_in_cols.R             67       1  98.51%   74
R/count_patients_with_event.R                 33       0  100.00%
R/count_patients_with_flags.R                 39       0  100.00%
R/count_values.R                              24       0  100.00%
R/cox_regression_inter.R                     142       0  100.00%
R/cox_regression.R                           161       0  100.00%
R/coxph.R                                    169       9  94.67%   19-20, 227-231, 275, 290, 298, 304-305
R/d_pkparam.R                                406       0  100.00%
R/decorate_grob.R                            167      38  77.25%   232-263, 274, 371, 393-430
R/desctools_binom_diff.R                     663      66  90.05%   68, 103-104, 144-145, 148, 227, 253-262, 301, 303, 323, 327, 331, 335, 391, 394, 397, 400, 461, 469, 481-482, 488-491, 499, 502, 511, 514, 562-563, 565-566, 568-569, 571-572, 642, 654-667, 672, 719, 732, 736
R/df_explicit_na.R                            30       0  100.00%
R/estimate_multinomial_rsp.R                  47       1  97.87%   60
R/estimate_proportion.R                      198      11  94.44%   75-82, 86, 91, 460, 565
R/fit_rsp_step.R                              36       0  100.00%
R/fit_survival_step.R                         36       0  100.00%
R/formatting_functions.R                     115       3  97.39%   107, 145, 155
R/g_forest.R                                 437      23  94.74%   197, 248-249, 316, 333-334, 339-340, 353, 369, 416, 447, 523, 532, 613-617, 627, 697, 700, 824
R/g_lineplot.R                               199      29  85.43%   160, 173, 201, 227-230, 307-314, 332-333, 339-349, 441, 449
R/g_step.R                                    68       1  98.53%   109
R/g_waterfall.R                               47       0  100.00%
R/h_adsl_adlb_merge_using_worst_flag.R        74       0  100.00%
R/h_biomarkers_subgroups.R                    38       0  100.00%
R/h_cox_regression.R                         110       0  100.00%
R/h_logistic_regression.R                    468       3  99.36%   206-207, 276
R/h_map_for_count_abnormal.R                  54       0  100.00%
R/h_pkparam_sort.R                            15       0  100.00%
R/h_response_biomarkers_subgroups.R           74       0  100.00%
R/h_response_subgroups.R                     171      12  92.98%   257-270
R/h_stack_by_baskets.R                        65       1  98.46%   91
R/h_step.R                                   180       0  100.00%
R/h_survival_biomarkers_subgroups.R           78       0  100.00%
R/h_survival_duration_subgroups.R            200      12  94.00%   259-271
R/incidence_rate.R                            93       7  92.47%   68-75
R/individual_patient_plot.R                  133       0  100.00%
R/kaplan_meier_plot.R                        569      61  89.28%   248-283, 292-296, 494, 666-668, 676-678, 703, 710-711, 881, 1065, 1312-1323
R/logistic_regression.R                      101       0  100.00%
R/missing_data.R                              21       3  85.71%   32, 66, 76
R/odds_ratio.R                               106       0  100.00%
R/prop_diff_test.R                            88       0  100.00%
R/prop_diff.R                                260      16  93.85%   72-75, 107, 267-274, 413, 473, 578
R/prune_occurrences.R                         57      10  82.46%   140-144, 190-194
R/response_biomarkers_subgroups.R             59       0  100.00%
R/response_subgroups.R                       165       4  97.58%   279, 321-323
R/rtables_access.R                            38       4  89.47%   161-164
R/score_occurrences.R                         20       1  95.00%   124
R/split_cols_by_groups.R                      49       0  100.00%
R/stat.R                                      47       3  93.62%   73-74, 129
R/summarize_ancova.R                          95       1  98.95%   189
R/summarize_change.R                          27       0  100.00%
R/summarize_colvars.R                          6       0  100.00%
R/summarize_coxreg.R                         140       0  100.00%
R/summarize_glm_count.R                      164       4  97.56%   188, 193, 254, 322
R/summarize_num_patients.R                    68       5  92.65%   103-105, 209-210
R/summarize_patients_exposure_in_cols.R       79       0  100.00%
R/summarize_variables.R                      216       2  99.07%   268, 486
R/survival_biomarkers_subgroups.R             59       0  100.00%
R/survival_coxph_pairwise.R                   73       9  87.67%   64-72
R/survival_duration_subgroups.R              172       0  100.00%
R/survival_time.R                             47       0  100.00%
R/survival_timepoint.R                       114       7  93.86%   150-156
R/utils_checkmate.R                           68       0  100.00%
R/utils_factor.R                              87       1  98.85%   91
R/utils_grid.R                               111       5  95.50%   152, 262-268
R/utils_rtables.R                             86       7  91.86%   24, 31-35, 345-346
R/utils.R                                    137      10  92.70%   105, 107, 111, 131, 134, 137, 141, 150-151, 334
TOTAL                                       8933     391  95.62%

Diff against main

Filename                   Stmts    Miss  Cover
-----------------------  -------  ------  -------
R/summarize_variables.R       -1       0  -0.00%
TOTAL                         -1       0  -0.00%

Results for commit: 1525cf0

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

@github-actions
Copy link
Contributor

github-actions bot commented May 16, 2023

Unit Tests Summary

       1 files    78 suites   1m 22s ⏱️
   730 tests 730 ✔️     0 💤 0
1 548 runs  974 ✔️ 574 💤 0

Results for commit 64e3089.

♻️ This comment has been updated with latest results.

@Melkiades
Copy link
Contributor

Closes #911

I believe we should discuss refactoring this function entirely to process NA values in a more sensical way.

I think it is possible I added this. How do you think we should proceed?

@edelarua
Copy link
Contributor Author

Closes #911
I believe we should discuss refactoring this function entirely to process NA values in a more sensical way.

I think it is possible I added this. How do you think we should proceed?

I think we should remove the check for NA values - there's no reason why we shouldn't be able to process these in this/similar functions. Also, it wouldn't make a lot of sense for na_level to exist if we allowed NAs, since they would just show up in the table as NA and na.rm would still work - if the user wants to explicitly define/remove the NA level in pre-processing, they are still free to do so.

@Melkiades
Copy link
Contributor

Closes #911
I believe we should discuss refactoring this function entirely to process NA values in a more sensical way.

I think it is possible I added this. How do you think we should proceed?

I think we should remove the check for NA values - there's no reason why we shouldn't be able to process these in this/similar functions. Also, it wouldn't make a lot of sense for na_level to exist if we allowed NAs, since they would just show up in the table as NA and na.rm would still work - if the user wants to explicitly define/remove the NA level in pre-processing, they are still free to do so.

It makes sense. But what if we want just to replace output NAs with custom strings? I think that should remain, i.e. when we need to have "NE" instead. If there is another way than using na_str, I apologize for the silly question!

@edelarua
Copy link
Contributor Author

Closes #911
I believe we should discuss refactoring this function entirely to process NA values in a more sensical way.

I think it is possible I added this. How do you think we should proceed?

I think we should remove the check for NA values - there's no reason why we shouldn't be able to process these in this/similar functions. Also, it wouldn't make a lot of sense for na_level to exist if we allowed NAs, since they would just show up in the table as NA and na.rm would still work - if the user wants to explicitly define/remove the NA level in pre-processing, they are still free to do so.

It makes sense. But what if we want just to replace output NAs with custom strings? I think that should remain, i.e. when we need to have "NE" instead. If there is another way than using na_str, I apologize for the silly question!

That does make sense and I think we should add that as well, but that's not the current function of the na_level argument in this function - currently, since NA is not allowed in the table, users can specify na_level, for example na_level = "<Missing>", and this value will be treated as NA (i.e. removed by na.rm).

Copy link
Contributor

@Melkiades Melkiades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm! We can discuss later for the na_str propagation ;)

@edelarua edelarua merged commit 13c2445 into main May 17, 2023
@edelarua edelarua deleted the 911_summarize_vars_bug@main branch May 17, 2023 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: summarize_vars do not use na_level
2 participants