Update stretch() default behaviour to convert x and y to factors rather than character vectors #98

Closed
mattwarkentin opened this issue Apr 11, 2020 · 3 comments
Labels
feature a feature request or enhancement

Comments

@mattwarkentin
Contributor

Hi,

When you shave() and then stretch() a cor_df object, the result is a three-column tibble in which x and y are character vectors containing the row and column names. As a consequence, when you plot with rplot(), or manually with ggplot(), the resulting plot does not reflect the correlation matrix as shown by print.cor_df(). This is especially noticeable after you shave() a correlation matrix, because the resulting plot is not an upper/lower triangular plot.

This is because ggplot()'s default behaviour is to order character vectors along the axes alphabetically. My suggestion is to have stretch() convert x and y to factor() variables by default, with the levels following the column order of the data.frame passed to correlate().
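
The ordering difference can be seen in base R alone. The following is a minimal sketch, not taken from corrr; the vector `vars` simply holds the first four mtcars column names for illustration.

```r
# First four mtcars column names, in their original order (illustrative only)
vars <- c("mpg", "cyl", "disp", "hp")

# A character vector carries no ordering of its own, so sorting its unique
# values gives alphabetical order
alphabetical <- sort(unique(vars))
alphabetical
#> [1] "cyl"  "disp" "hp"   "mpg"

# factor() without explicit levels sorts the same way
levels(factor(vars))
#> [1] "cyl"  "disp" "hp"   "mpg"

# Supplying levels explicitly preserves the original column order
as_factor <- factor(vars, levels = vars)
levels(as_factor)
#> [1] "mpg"  "cyl"  "disp" "hp"
```

A discrete axis built from `as_factor` would follow the level order rather than the alphabet.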

Here is a demonstration of the issue:

library(tidyverse)
library(corrr)

x <- 
  mtcars %>% 
  correlate()
#> 
#> Correlation method: 'pearson'
#> Missing treated using: 'pairwise.complete.obs'

#  Note the column and row order, same as mtcars
print(x)
#> # A tibble: 11 x 12
#>    rowname    mpg    cyl   disp     hp    drat     wt    qsec     vs      am
#>    <chr>    <dbl>  <dbl>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
#>  1 mpg     NA     -0.852 -0.848 -0.776  0.681  -0.868  0.419   0.664  0.600 
#>  2 cyl     -0.852 NA      0.902  0.832 -0.700   0.782 -0.591  -0.811 -0.523 
#>  3 disp    -0.848  0.902 NA      0.791 -0.710   0.888 -0.434  -0.710 -0.591 
#>  4 hp      -0.776  0.832  0.791 NA     -0.449   0.659 -0.708  -0.723 -0.243 
#>  5 drat     0.681 -0.700 -0.710 -0.449 NA      -0.712  0.0912  0.440  0.713 
#>  6 wt      -0.868  0.782  0.888  0.659 -0.712  NA     -0.175  -0.555 -0.692 
#>  7 qsec     0.419 -0.591 -0.434 -0.708  0.0912 -0.175 NA       0.745 -0.230 
#>  8 vs       0.664 -0.811 -0.710 -0.723  0.440  -0.555  0.745  NA      0.168 
#>  9 am       0.600 -0.523 -0.591 -0.243  0.713  -0.692 -0.230   0.168 NA     
#> 10 gear     0.480 -0.493 -0.556 -0.126  0.700  -0.583 -0.213   0.206  0.794 
#> 11 carb    -0.551  0.527  0.395  0.750 -0.0908  0.428 -0.656  -0.570  0.0575
#> # … with 2 more variables: gear <dbl>, carb <dbl>

# Note the axis order: alphabetical, not the mtcars column order
rplot(x)
#> Don't know how to automatically pick scale for object of type noquote. Defaulting to continuous.

# Again, note the axis order: alphabetical, not the mtcars column order
x %>%
  stretch() %>% 
  ggplot(aes(x, y, fill = r)) +
  geom_tile()

# Unexpected behaviour when shaving
x %>%
  shave() %>% 
  stretch() %>% 
  ggplot(aes(x, y, fill = r)) +
  geom_tile()

# We can recover our triangular matrix
x %>%
  shave() %>% 
  stretch() %>% 
  # replace with across() with dplyr 1.0.0
  mutate_at(vars(x, y), factor, levels = x$rowname) %>% 
  ggplot(aes(x, y, fill = r)) +
  geom_tile()

x %>%
  shave(upper = FALSE) %>% 
  stretch() %>% 
  # replace with across() with dplyr 1.0.0
  mutate_at(vars(x, y), factor, levels = x$rowname) %>% 
  ggplot(aes(x, y, fill = r)) +
  geom_tile()

Created on 2020-04-11 by the reprex package (v0.3.0)

My suggestion would be to change the default behaviour of stretch() so that it converts x and y to factor()s, and perhaps to offer an argument in stretch() that reverts to the old behaviour, where x and y are character vectors. The argument could be keep_order = TRUE, or something along those lines.
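
To make the proposal concrete, here is a rough base-R sketch of the idea. `stretch_sketch()` is a hypothetical helper written for illustration only; it is not corrr's stretch() and does not reflect how the change was eventually implemented. It works on a plain matrix from cor() rather than on a cor_df.

```r
# Hypothetical helper (not part of corrr): reshape a square correlation matrix
# into long format, optionally keeping the variable order as factor levels.
stretch_sketch <- function(m, keep_order = TRUE) {
  vars <- colnames(m)
  n <- length(vars)
  long <- data.frame(
    x = rep(vars, times = n),  # row names vary fastest (column-major storage)
    y = rep(vars, each = n),   # column names change once per block of n rows
    r = as.vector(m),
    stringsAsFactors = FALSE
  )
  if (keep_order) {
    # Levels follow the column order of the input, not the alphabet
    long$x <- factor(long$x, levels = vars)
    long$y <- factor(long$y, levels = vars)
  }
  long
}

m <- cor(mtcars)
ordered_long <- stretch_sketch(m)                    # proposed default
plain_long <- stretch_sketch(m, keep_order = FALSE)  # old behaviour

levels(ordered_long$x)  # same order as colnames(mtcars)
class(plain_long$x)     # character vector, as before
```

With keep_order = TRUE, a plot of the long data would lay out its axes in the same order as the printed correlation matrix; with keep_order = FALSE, x and y stay character vectors and are ordered alphabetically when plotted.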

I look forward to hearing your thoughts and would be happy to submit a PR if you agree that this change should be implemented.

@mattwarkentin
Contributor Author

See #99

@juliasilge juliasilge added the feature a feature request or enhancement label Jun 5, 2020
@juliasilge
Member

Closed in #99

@github-actions

github-actions bot commented Mar 6, 2021

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 6, 2021