get_all_corr calculates correlations across every variable in a data frame or across a selected set of variables. It also works with grouped data frames, so you can compare correlations within each level of one or more grouping variables.

Usage

get_all_corr(data, cols, wt = NULL, remove_redundant = TRUE)

Arguments

data

A data frame or tibble object

cols

<tidy-select> The variables you want to get the correlations for.

wt

A variable to use as the weights for weighted correlations. Defaults to NULL.

remove_redundant

Should rows where the two variables are the same be kept or removed? If TRUE, the default, they are removed.

Value

A data.frame with the correlations between every combination of columns in data.
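
The base-R sketch below illustrates, under stated assumptions, what the wt and remove_redundant arguments describe. It is not the adlgraphs implementation. It uses the built-in mtcars data, a placeholder vector of equal weights, and stats::cov.wt() for the weighted Pearson correlation, and it assumes that remove_redundant only drops self-pairs such as mpg with mpg.

```r
# Hypothetical illustration only; not how get_all_corr() is implemented.
vars <- c("mpg", "hp", "disp")     # columns from the built-in mtcars data
w <- rep(1, nrow(mtcars))          # placeholder weights (equal weighting)

# Every ordered pair of variables, including self-pairs such as mpg/mpg
pairs <- expand.grid(x = vars, y = vars, stringsAsFactors = FALSE)

# Mimic remove_redundant = TRUE: drop rows where both variables are the same
pairs <- pairs[pairs$x != pairs$y, ]

# Weighted Pearson correlation for each remaining pair
pairs$correlation <- mapply(
  function(x, y) {
    stats::cov.wt(mtcars[, c(x, y)], wt = w, cor = TRUE)$cor[1, 2]
  },
  pairs$x, pairs$y,
  USE.NAMES = FALSE
)
pairs
```

Because the placeholder weights are all equal, each value here matches the ordinary cor() result for that pair. Supplying real survey weights in w would change the estimates.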

Examples

# load dplyr and adlgraphs
library(dplyr)
library(adlgraphs)

# You can get the correlations among three variables in three ways:
# 1. Create a new data frame with only the columns you want
new_data <- test_data %>% dplyr::select(top:dominate)
get_all_corr(new_data)
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr):  In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#>  Existing data has 250 rows.
#>  Assigned data has 0 rows.
#>  Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.

# 2. Using dplyr::select() and pipes
test_data %>% 
  dplyr::select(c(top:dominate)) %>% 
  get_all_corr()
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr):  In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#>  Existing data has 250 rows.
#>  Assigned data has 0 rows.
#>  Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.

# 3. Use the `cols` argument
get_all_corr(test_data, cols = c(top:dominate))
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr):  In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#>  Existing data has 250 rows.
#>  Assigned data has 0 rows.
#>  Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.
# or 
test_data %>% get_all_corr(c(top:dominate))
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr):  In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#>  Existing data has 250 rows.
#>  Assigned data has 0 rows.
#>  Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.

# To get weighted correlations, specify the `wt` argument
test_data %>% get_all_corr(c(top:dominate), wt = wts)
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr):  In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#>  Existing data has 250 rows.
#>  Assigned data has 0 rows.
#>  Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.

# You can also calculate grouped correlations. For example, if
# you were interested in comparing the weighted correlations 
# among people with a college degree vs those without one, you 
# would do it like this:
test_data %>% 
  dplyr::group_by(edu_f2) %>% 
  get_all_corr(c(top:dominate), wt = wts)
#> Error in purrr::map(group_helpers$nest_data$data, ~get_all_corr.default(data = .x,     wt = wt, remove_redundant = remove_redundant)):  In index: 1.
#> Caused by error in `purrr::map()`:
#>  In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#>  Existing data has 142 rows.
#>  Assigned data has 0 rows.
#>  Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 142.
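
# For intuition, a rough base-R analogue of the grouped call above.
# This is a sketch, not how adlgraphs computes it. Assumptions: the
# built-in mtcars data, `cyl` as a stand-in grouping variable, and
# equal placeholder weights in place of a real weight variable.
by_group <- split(mtcars, mtcars$cyl)
grouped_corr <- sapply(by_group, function(d) {
  w <- rep(1, nrow(d))  # placeholder weights, one per row in the group
  stats::cov.wt(d[, c("mpg", "hp")], wt = w, cor = TRUE)$cor[1, 2]
})
# One mpg/hp correlation per level of cyl
grouped_corr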