get_all_corr
makes it easy to calculate correlations across
every variable in a data frame or a selected set of variables. It also
works with grouped data frames, so you can compare correlations within
each level of one or more grouping variables.
Arguments
- data
A data frame or tibble object
- cols
<tidy-select> The variables you want to get the correlations for.
- wt
A variable to use as the weights for weighted correlations.
- remove_redundant
Should rows where the two variables are the same be kept or removed? If TRUE, the default, they are removed.
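To make the `wt` argument more concrete, here is a minimal base R sketch of what a weighted Pearson correlation is. It does not call get_all_corr and is not necessarily how the function computes its result internally; it relies on stats::cov.wt(), and the data frame `toy` and its columns `x`, `y`, and `w` are hypothetical names made up for illustration.

```r
# Hypothetical toy data: two numeric variables and a weight column.
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 1, 4, 3, 5),
  w = c(1, 1, 2, 2, 4)
)

# stats::cov.wt() with cor = TRUE returns a weighted correlation matrix
# in its `cor` element. The weights are rescaled to sum to 1 here,
# although cov.wt() also normalizes them itself.
weighted <- stats::cov.wt(
  toy[, c("x", "y")],
  wt = toy$w / sum(toy$w),
  cor = TRUE
)
weighted_r <- weighted$cor["x", "y"]

# Unweighted Pearson correlation for comparison.
unweighted_r <- stats::cor(toy$x, toy$y)

print(c(weighted = weighted_r, unweighted = unweighted_r))
```

Rows with larger weights pull the weighted estimate toward the pattern they follow, so the two values generally differ unless all weights are equal.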
Examples
# load dplyr and adlgraphs
library(dplyr)
library(adlgraphs)
# To get correlations with three variables you can do it three ways
# 1. Create a new data frame with only the columns you want
new_data <- test_data %>% dplyr::select(top:dominate)
get_all_corr(new_data)
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr): ℹ In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#> ✖ Existing data has 250 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.
# 2. Using dplyr::select() and pipes
test_data %>%
  dplyr::select(c(top:dominate)) %>%
  get_all_corr()
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr): ℹ In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#> ✖ Existing data has 250 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.
# 3. Use the `cols` argument
get_all_corr(test_data, cols = c(top:dominate))
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr): ℹ In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#> ✖ Existing data has 250 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.
# or
test_data %>% get_all_corr(c(top:dominate))
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr): ℹ In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#> ✖ Existing data has 250 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.
# To get weighted correlations just specify the `wt` argument
test_data %>% get_all_corr(c(top:dominate), wt = wts)
#> Error in purrr::map(seq_len(nrow(combinations)), internal_corr): ℹ In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#> ✖ Existing data has 250 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 250.
# You can also calculate grouped correlations. For example, if
# you were interested in comparing the weighted correlations
# among people with a college degree vs those without one, you
# would do it like this:
test_data %>%
  dplyr::group_by(edu_f2) %>%
  get_all_corr(c(top:dominate), wt = wts)
#> Error in purrr::map(group_helpers$nest_data$data, ~get_all_corr.default(data = .x, wt = wt, remove_redundant = remove_redundant)): ℹ In index: 1.
#> Caused by error in `purrr::map()`:
#> ℹ In index: 1.
#> Caused by error in `[[<-`:
#> ! Assigned data `*vtmp*` must be compatible with existing data.
#> ✖ Existing data has 142 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 0 to size 142.