Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v0.4.0 #203

Merged
merged 103 commits into from
Nov 16, 2018
Merged

v0.4.0 #203

merged 103 commits into from
Nov 16, 2018

Conversation

ismayc
Copy link
Collaborator

@ismayc ismayc commented Sep 24, 2018

A refactoring of visualize() and lots of other "Don't Repeat Yourself" cleaning up here. It would be great to get this sent to master by October 1 so we can send an update out to CRAN at the beginning of Q4.

Let's get list-columns implemented in v0.5.0 by the end of October.

ismayc and others added 30 commits August 2, 2018 17:18
Used {styler} (1.0.2):
`styler::style_pkg(scope = "none", strict = FALSE)`.
Details:
- Remove trailing spaces in comments (both raw and roxygen).
- Update spacing in function calls and definitions.
- Remove spaces after `!!` and `!!!` (also a tidyverse style guide).

Used {styler} (1.0.2.9000 from commit 1dd6b04, as it has some bugs fixed):
`styler::style_pkg(scope = "spaces", strict = FALSE, include_roxygen_examples = FALSE)`.
After that manually remove spaces around `/` in simple fractions in test files.
Fix `calculate()` to not depend on order of `p` (fixes #122).
This also implements fixed shade transparency for all types of plots. Earlier it was fixed for "theoretical" and varied for "simulation" and "both".
Also this renames "Chi-square" into "Chi-Square" in warnings about right-tailed tests.
Merge branch 'develop' into dry-code

# Conflicts:
#	R/visualize.R
@ismayc
Copy link
Collaborator Author

ismayc commented Oct 18, 2018

@echasnovski Do you want to look into #203 (comment)? This might also be helpful.

@echasnovski
Copy link
Collaborator

Do you suggest to change logic of computing two-sided p-value in get_p_value() (more precisely in two_sided_p_value())? To something like this:

two_sided_p_value <- function(x, obs_stat){
  right_pval <- mean(x[["stat"]] >= obs_stat)
  
  # No need to round down to 1 due to nature of computation
  tibble::tibble(p_value = 2 * min(right_pval, 1 - right_pval))
}

Or to this:

two_sided_p_value <- function(x, obs_stat) {
  left_pval <- mean(x[["stat"]] <= obs_stat)
  right_pval <- mean(x[["stat"]] >= obs_stat)
  raw_res <- 2 * min(left_pval, right_pval)

  tibble::tibble(p_value = min(raw_res, 1))
}

@echasnovski
Copy link
Collaborator

This might introduce inconsistency in treating two-sided p-value in get_p_value() and shade_p_value(). During plotting the current functionality is to draw both tales based on these computations. I believe this might confuse users.

@ismayc
Copy link
Collaborator Author

ismayc commented Oct 18, 2018

Ultimately, we just want to have get_p_value() and shade_p_value() use the same methodology. And the method used in the Chihara and Hesterberg book seems like the best route.

@ismayc
Copy link
Collaborator Author

ismayc commented Oct 18, 2018

I think I like the first implementation better, but it's worth checking some examples to see which works better in practice.

@echasnovski
Copy link
Collaborator

I wonder, if the following reasoning makes sense:

  • As get_p_value() is based on generated sample from sampling distribution, we only can use the same methodology with "simulation" or "both" visualization methods in shade_p_value(). The "theoretical" case of mirroring observed statistics can't be in sync with get_p_value().
  • The core idea of shade_p_value() is to visually represent regions which together represent p-value. With "Chihara and Hesterberg" approach this wouldn't make any sense because all x axis will be shaded.

So there are two options of implementing two-sided p-value in get_p_value():

  • With "Chihara and Hesterberg" approach (which, as algorithm, I personally like better). And deal with inconsistency with shade_p_value().
  • Directly with shade_p_value() approach. Meaning, something like this:
two_sided_p_value <- function(x, obs_stat) {
  stat <- x[["stat"]]
  
  second_border <- mirror_obs_stat(stat, obs_stat)
  left_border <- min(obs_stat, second_border)
  right_border <- max(obs_stat, second_border)
  
  is_in_tail <- (stat <= left_border) | (stat >= right_border)
  
  tibble::tibble(p_value = mean(is_in_tail))
}

@ismayc
Copy link
Collaborator Author

ismayc commented Oct 18, 2018

This seems tricky enough without any theoretical underpinning as to which method would be preferred that an extra logical argument might be helpful if a user would prefer one of the two options over the other. The default could be the "Chihara and Hesterberg" approach. What do you think?

@andrewpbray
Copy link
Collaborator

This looks ready to go. It probably makes sense to first submit to CRAN, which I'm happy to do, then pull into master only after it has been accepted.

One thing: @ismayc which version is this? NEWS says 0.3.1.9000 but this PR is called 0.4.0.

@ismayc
Copy link
Collaborator Author

ismayc commented Nov 8, 2018

I usually use *.9000 for the developmental version. Feel free to change to 0.4.0 when we have settled on what goes in the release, which appears to be this.

@andrewpbray
Copy link
Collaborator

I plan to submit develop to CRAN tonight at 10 pm pacific. @ismayc @echasnovski If you want to change anything before the update, please chime in.

@andrewpbray andrewpbray merged commit db509aa into master Nov 16, 2018
@github-actions
Copy link

github-actions bot commented Mar 9, 2021

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 9, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants