Change presentation order of the grouping variable #792

martynagalazka · 2022-09-26T13:17:15Z

I am plotting a grouped ggwithinstats plot, and the subplots are displayed in alphabetical order (or ascending order, if the group labels are numbers).

Sometimes the levels of the grouping variable cannot be renamed, but for the sake of clarity it would be good to present them in a particular order. Is there a way to control which subgroup of the grouping variable is shown on the left and which on the right?

Thank you.

IndrajeetPatil · 2022-09-26T13:42:09Z

Hmm, seems like none of the functions respect the original order of the grouping.var column. This definitely shouldn't happen.

library(ggstatsplot)

(df <- dplyr::tibble(
  grp = c(rep("c", 5), rep("a", 5), rep("b", 5)),
  val1 = runif(15),
  val2 = runif(15)
))
#> # A tibble: 15 × 3
#>    grp     val1   val2
#>    <chr>  <dbl>  <dbl>
#>  1 c     0.732  0.754 
#>  2 c     0.825  0.0488
#>  3 c     0.214  0.113 
#>  4 c     0.288  0.609 
#>  5 c     0.574  0.498 
#>  6 a     0.209  0.276 
#>  7 a     0.971  0.0530
#>  8 a     0.227  0.328 
#>  9 a     0.748  0.550 
#> 10 a     0.866  0.734 
#> 11 b     0.472  0.171 
#> 12 b     0.330  0.881 
#> 13 b     0.262  0.604 
#> 14 b     0.956  0.734 
#> 15 b     0.0407 0.0470
  
grouped_ggscatterstats(df, val1, val2, grouping.var = grp)

Created on 2022-09-26 with reprex v2.0.2

etiennebacher · 2022-11-21T17:31:30Z

The problem probably comes from .grouped_list(), and more specifically from split(), which automatically reorders the groups. One solution (taken from Stack Overflow) is to specify the factor levels manually.

test <- data.frame(
  id = c("b", "c", "a"),
  val = 1:3
)

# reordered
split(test, ~ id)
#> $a
#>   id val
#> 3  a   3
#> 
#> $b
#>   id val
#> 1  b   1
#> 
#> $c
#>   id val
#> 2  c   2

# not reordered
test$id <- factor(test$id, levels=unique(test$id))
split(test, ~ id)
#> $b
#>   id val
#> 1  b   1
#> 
#> $c
#>   id val
#> 2  c   2
#> 
#> $a
#>   id val
#> 3  a   3
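
To connect this back to the original question: the same mechanism should also allow an arbitrary, user-chosen order rather than only the order of first appearance. The sketch below uses base R only; the data frame and the chosen level order are made up for illustration. Whether the grouped_ functions honour factor levels in the same way depends on them relying on split() as assumed above, which I have not verified.

```r
# Toy data; the values are invented for illustration only
test <- data.frame(
  id = c("b", "c", "a", "b"),
  val = 1:4,
  stringsAsFactors = FALSE
)

# Declare the desired presentation order explicitly via factor levels
test$id <- factor(test$id, levels = c("c", "a", "b"))

# split() returns one element per level, in level order
group_order <- names(split(test, test$id))
print(group_order)
#> [1] "c" "a" "b"
```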

Therefore, in .grouped_list(), you can apply this to all grouping variables:

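.grouped_list <- function(data, grouping.var = NULL) {
  data <- as_tibble(data)

  # no grouping variable supplied: return the data unchanged
  if (quo_is_null(enquo(grouping.var))) {
    return(data)
  }

  # convert the grouping column to a factor whose levels follow the
  # order of first appearance, so that split() keeps that order
  data %<>%
    mutate(
      across(
        {{ grouping.var }},
        ~ factor(.x, levels = unique(.x))
      )
    )

  # one data frame per level, in level order
  data %>% split(f = new_formula(NULL, enquo(grouping.var)), drop = FALSE)
}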
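
One caveat worth checking before applying this: as far as I can tell, factor(.x, levels = unique(.x)) also resets the levels of a column that is already a factor to the order of first appearance, which would override an order the user set deliberately. Below is a minimal base R sketch of a variant that only converts non-factor columns. The helper name split_keep_order() and its string-based col argument are hypothetical and are not part of ggstatsplot.

```r
# Hypothetical helper, not part of ggstatsplot: split `data` by the
# column named `col` without reordering the groups alphabetically.
split_keep_order <- function(data, col) {
  grp <- data[[col]]
  # Existing factors keep their declared level order; anything else
  # gets levels in order of first appearance.
  if (!is.factor(grp)) {
    grp <- factor(grp, levels = unique(grp))
  }
  split(data, grp, drop = FALSE)
}

df <- data.frame(
  grp = c(rep("c", 2), rep("a", 2), rep("b", 2)),
  val = 1:6,
  stringsAsFactors = FALSE
)

# Character column: order of first appearance is preserved
order_chr <- names(split_keep_order(df, "grp"))
print(order_chr)
#> [1] "c" "a" "b"

# Factor column: the user-specified level order is respected
df$grp <- factor(df$grp, levels = c("b", "c", "a"))
order_fct <- names(split_keep_order(df, "grp"))
print(order_fct)
#> [1] "b" "c" "a"
```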

And now the order is correct:

library(ggstatsplot)
#> You can cite this package as:
#>      Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
#>      Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167

(df <- dplyr::tibble(
  grp = c(rep("c", 5), rep("a", 5), rep("b", 5)),
  val1 = runif(15),
  val2 = runif(15)
))
#> # A tibble: 15 × 3
#>    grp     val1   val2
#>    <chr>  <dbl>  <dbl>
#>  1 c     0.144  0.700 
#>  2 c     0.210  0.461 
#>  3 c     0.0319 0.875 
#>  4 c     0.609  0.978 
#>  5 c     0.213  0.317 
#>  6 a     0.0133 0.502 
#>  7 a     0.778  0.0390
#>  8 a     0.553  0.979 
#>  9 a     0.0742 0.0475
#> 10 a     0.616  0.339 
#> 11 b     0.869  0.322 
#> 12 b     0.501  0.0567
#> 13 b     0.632  0.347 
#> 14 b     0.758  0.953 
#> 15 b     0.391  0.687

grouped_ggscatterstats(df, val1, val2, grouping.var = grp)
#> Registered S3 method overwritten by 'ggside':
#>   method from   
#>   +.gg   ggplot2
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

This should probably work with other grouped plots but I didn't check. @IndrajeetPatil I'm not familiar with this package so I'll let you apply the solution if it is correct.

IndrajeetPatil added the bug 🐜 Something isn't working label Sep 26, 2022
