Get correlations for a combination of variables

get_all_corr makes it easy to calculate correlations between every pair of variables in a data frame, or within a selected set of variables. It also works with grouped data frames, so you can compare correlations within each level of one or more grouping variables.

Usage

get_all_corr(data, cols, wt = NULL, remove_redundant = TRUE)

Arguments

data: A data frame or tibble object
cols: <tidy-select> The variables to calculate correlations for. If omitted, every variable in data is used.
wt: A variable to use as the weights for weighted correlations. If NULL, the default, the correlations are unweighted.
remove_redundant: Should rows where x and y are the same variable be removed? If TRUE, the default, they are removed; if FALSE, they are kept.
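
To make the remove_redundant description concrete, here is a minimal base R sketch of a pairwise correlation table. It is an illustration only and not the adlgraphs implementation: it uses the built-in mtcars data, and the names vars, pairs, redundant and pairs_trimmed are hypothetical.

```r
# Illustration only: base R, not how get_all_corr() is implemented.
vars <- c("mpg", "hp", "wt")  # three numeric columns from the built-in mtcars data

# Every ordered (x, y) pair, including pairs where x and y are the same variable
pairs <- expand.grid(x = vars, y = vars, stringsAsFactors = FALSE)

# Pearson correlation for each pair
pairs$correlation <- mapply(
  function(x, y) cor(mtcars[[x]], mtcars[[y]]),
  pairs$x, pairs$y,
  USE.NAMES = FALSE
)

# A variable paired with itself always correlates at 1, so those rows
# carry no information; dropping them mirrors remove_redundant = TRUE
redundant <- pairs$x == pairs$y
pairs_trimmed <- pairs[!redundant, ]

nrow(pairs)          # 9 rows: all ordered pairs
nrow(pairs_trimmed)  # 6 rows: self-pairs dropped
```

Three variables give nine ordered pairs, and dropping the three self-pairs leaves six rows, which is the same row count as the 6 × 8 tibbles in the examples.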

Examples

# load dplyr and adlgraphs
library(dplyr)
library(adlgraphs)

# There are three ways to get the correlations among three variables
# 1. Create a new data frame with only the columns you want
new_data <- test_data %>% dplyr::select(top:dominate)
get_all_corr(new_data)
#> # A tibble: 6 × 8
#>   x               y          correlation     n conf.low conf.high  p.value stars
#>   <chr+lbl>       <chr+lbl>        <dbl> <dbl>    <dbl>     <dbl>    <dbl> <chr>
#> 1 inferior [Some… top [An i…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 2 dominate [No o… top [An i…      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 3 top [An ideal … inf… [Som…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 4 dominate [No o… inf… [Som…      -0.138   250   -0.262   -0.0146 2.86e- 2 *    
#> 5 top [An ideal … dom… [No …      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 6 inferior [Some… dom… [No …      -0.138   250   -0.262   -0.0146 2.86e- 2 *    

# 2. Using dplyr::select() and pipes
test_data %>% 
  dplyr::select(c(top:dominate)) %>% 
  get_all_corr()
#> # A tibble: 6 × 8
#>   x               y          correlation     n conf.low conf.high  p.value stars
#>   <chr+lbl>       <chr+lbl>        <dbl> <dbl>    <dbl>     <dbl>    <dbl> <chr>
#> 1 inferior [Some… top [An i…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 2 dominate [No o… top [An i…      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 3 top [An ideal … inf… [Som…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 4 dominate [No o… inf… [Som…      -0.138   250   -0.262   -0.0146 2.86e- 2 *    
#> 5 top [An ideal … dom… [No …      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 6 inferior [Some… dom… [No …      -0.138   250   -0.262   -0.0146 2.86e- 2 *    

# 3. Use the `cols` argument
get_all_corr(test_data, cols = c(top:dominate))
#> # A tibble: 6 × 8
#>   x               y          correlation     n conf.low conf.high  p.value stars
#>   <chr+lbl>       <chr+lbl>        <dbl> <dbl>    <dbl>     <dbl>    <dbl> <chr>
#> 1 inferior [Some… top [An i…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 2 dominate [No o… top [An i…      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 3 top [An ideal … inf… [Som…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 4 dominate [No o… inf… [Som…      -0.138   250   -0.262   -0.0146 2.86e- 2 *    
#> 5 top [An ideal … dom… [No …      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 6 inferior [Some… dom… [No …      -0.138   250   -0.262   -0.0146 2.86e- 2 *    
# or, equivalently, with a pipe
test_data %>% get_all_corr(c(top:dominate))
#> # A tibble: 6 × 8
#>   x               y          correlation     n conf.low conf.high  p.value stars
#>   <chr+lbl>       <chr+lbl>        <dbl> <dbl>    <dbl>     <dbl>    <dbl> <chr>
#> 1 inferior [Some… top [An i…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 2 dominate [No o… top [An i…      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 3 top [An ideal … inf… [Som…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 4 dominate [No o… inf… [Som…      -0.138   250   -0.262   -0.0146 2.86e- 2 *    
#> 5 top [An ideal … dom… [No …      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 6 inferior [Some… dom… [No …      -0.138   250   -0.262   -0.0146 2.86e- 2 *    

# To get weighted correlations, specify the `wt` argument
test_data %>% get_all_corr(c(top:dominate), wt = wts)
#> # A tibble: 6 × 8
#>   x               y          correlation     n conf.low conf.high  p.value stars
#>   <chr+lbl>       <chr+lbl>        <dbl> <dbl>    <dbl>     <dbl>    <dbl> <chr>
#> 1 inferior [Some… top [An i…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 2 dominate [No o… top [An i…      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 3 top [An ideal … inf… [Som…       0.498   250    0.389    0.606  4.82e-17 ***  
#> 4 dominate [No o… inf… [Som…      -0.138   250   -0.262   -0.0146 2.86e- 2 *    
#> 5 top [An ideal … dom… [No …      -0.147   250   -0.271   -0.0234 2.00e- 2 *    
#> 6 inferior [Some… dom… [No …      -0.138   250   -0.262   -0.0146 2.86e- 2 *    

# You can also calculate grouped correlations. For example, to
# compare the weighted correlations among people with a college
# degree and those without one, group the data first:
test_data %>% 
  dplyr::group_by(edu_f2) %>% 
  get_all_corr(c(top:dominate), wt = wts)
#> # A tibble: 12 × 9
#>    edu_f2    x          y          correlation     n conf.low conf.high  p.value
#>    <chr>     <chr+lbl>  <chr+lbl>        <dbl> <dbl>    <dbl>     <dbl>    <dbl>
#>  1 No Colle… inf… [Som… top [An i…       0.509   142    0.366    0.653  9.61e-11
#>  2 No Colle… dom… [No … top [An i…      -0.104   142   -0.270    0.0621 2.18e- 1
#>  3 No Colle… top [An i… inf… [Som…       0.509   142    0.366    0.653  9.61e-11
#>  4 No Colle… dom… [No … inf… [Som…      -0.120   142   -0.286    0.0456 1.54e- 1
#>  5 No Colle… top [An i… dom… [No …      -0.104   142   -0.270    0.0621 2.18e- 1
#>  6 No Colle… inf… [Som… dom… [No …      -0.120   142   -0.286    0.0456 1.54e- 1
#>  7 At Least… inf… [Som… top [An i…       0.483   108    0.315    0.652  1.19e- 7
#>  8 At Least… dom… [No … top [An i…      -0.218   108   -0.406   -0.0297 2.37e- 2
#>  9 At Least… top [An i… inf… [Som…       0.483   108    0.315    0.652  1.19e- 7
#> 10 At Least… dom… [No … inf… [Som…      -0.164   108   -0.354    0.0259 8.99e- 2
#> 11 At Least… top [An i… dom… [No …      -0.218   108   -0.406   -0.0297 2.37e- 2
#> 12 At Least… inf… [Som… dom… [No …      -0.164   108   -0.354    0.0259 8.99e- 2
#> # ℹ 1 more variable: stars <chr>
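
For readers who want to see what a weighted correlation is, the following is a minimal base R sketch using stats::cov.wt(). It is an assumption-laden illustration: get_all_corr() may compute weighted correlations differently, and the weights w below are random numbers made up for the sketch, not the wts column in test_data.

```r
# Illustration only: base R's cov.wt(), which may differ from what
# get_all_corr() does internally.
set.seed(1)
x <- mtcars[, c("mpg", "hp", "wt")]  # three numeric columns from built-in data
w <- runif(nrow(x))                  # hypothetical weights, invented for this sketch

# Ordinary (unweighted) Pearson correlation matrix
unweighted <- cor(x)

# Weighted correlation matrix: observations with larger weights count more
weighted <- cov.wt(x, wt = w, cor = TRUE)$cor

# Sanity check: with equal weights the weighted correlation reduces
# to the ordinary Pearson correlation
equal_w <- cov.wt(x, wt = rep(1, nrow(x)), cor = TRUE)$cor
max(abs(equal_w - unweighted))
```

The equal-weights check is a quick way to confirm that a weighting variable is actually changing the estimates: if the weighted and unweighted results agree exactly, the weights are likely constant.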