This function calculates weighted Pearson correlations between two variables.
It also allows you to group the data and calculate correlations along each
level of the grouping variable. If the data is not grouped and no group is
specified, it returns the same output as wtd_corr().
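As background, a weighted Pearson correlation can be sketched in base R as shown here. This is a minimal illustration and not the package's implementation: the helper `weighted_pearson()` and the toy vectors are hypothetical, and `stats::cov.wt()` is used only as a cross-check.

```r
# Hypothetical helper: weighted Pearson correlation of x and y with weights w.
weighted_pearson <- function(x, y, w) {
  w <- w / sum(w)                         # normalise weights to sum to 1
  mx <- sum(w * x)                        # weighted mean of x
  my <- sum(w * y)                        # weighted mean of y
  cov_xy <- sum(w * (x - mx) * (y - my))  # weighted covariance
  var_x <- sum(w * (x - mx)^2)            # weighted variances
  var_y <- sum(w * (y - my)^2)
  cov_xy / sqrt(var_x * var_y)
}

# Toy data, made up for illustration
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 1.9, 3.5, 3.0, 5.2, 4.8)
w <- c(1, 2, 1, 0.5, 1.5, 1)

r_manual <- weighted_pearson(x, y, w)

# Cross-check against base R's weighted covariance/correlation
r_check <- stats::cov.wt(cbind(x, y), wt = w, cor = TRUE)$cor[1, 2]
```

With equal weights the helper reduces to the ordinary `cor(x, y)`.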
Arguments
- data
An object of type data.frame or tibble. If piping the data into the function, this is not required.
- x, y
Can be either character strings or symbols. The names of the two variables in the data that you want to calculate the correlation between.
- group
Can be either a character string or a symbol. The grouping variable.
- wt
Can be either a character string or a symbol. The weighting variable. Add it if you have weights and want weighted correlations.
Value
A tibble showing the correlation (correlation), the number of observations
(n), the lower and upper bounds of the confidence interval (conf.low,
conf.high), the p-value (p.value), and stars indicating its statistical
significance. If the data is grouped, the tibble also includes one column
for each grouping variable and one row for each unique combination of the
grouping variables.
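One common way to derive interval and p-value columns like these from a correlation `r` and a sample size `n` is a t-based approach with standard error sqrt((1 - r^2) / (n - 2)). The sketch below assumes that approach; the source does not state which method `get_corr()` uses, and `corr_inference()` is a hypothetical helper. For r = -0.736 and n = 250 it gives bounds close to those printed in the first example, and note that a symmetric interval of this kind is not constrained to lie within [-1, 1].

```r
# Hypothetical helper: t-based inference for a Pearson correlation.
# Assumption: SE = sqrt((1 - r^2) / (n - 2)) with n - 2 degrees of freedom.
corr_inference <- function(r, n, conf_level = 0.95) {
  df <- n - 2
  se <- sqrt((1 - r^2) / df)
  t_crit <- stats::qt(1 - (1 - conf_level) / 2, df)  # two-sided critical value
  t_stat <- r / se
  data.frame(
    correlation = r,
    n = n,
    conf.low = r - t_crit * se,
    conf.high = r + t_crit * se,
    p.value = 2 * stats::pt(-abs(t_stat), df)        # two-sided p-value
  )
}

res <- corr_inference(r = -0.736, n = 250)
```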
Examples
# Load dplyr for piping and grouping
library(dplyr)
# Let's first do a simple correlation where we pipe in the data
test_data %>% get_corr(x = top, y = sdo_sum)
#> # A tibble: 1 × 8
#> x y correlation n conf.low conf.high p.value stars
#> <chr+lbl> <chr+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 top [An ideal … sdo… [Soc… -0.736 250 -0.821 -0.651 6.41e-44 ***
# Repeat but with weights
test_data %>% get_corr(x = top, y = sdo_sum, wt = wts)
#> # A tibble: 1 × 8
#> x y correlation n conf.low conf.high p.value stars
#> <chr+lbl> <chr+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 top [An ideal … sdo… [Soc… -0.721 250 -0.808 -0.634 2.25e-41 ***
# Now let's get the correlation among only people with a bachelor's degree
test_data %>%
  filter(edu_f2 == "At Least a Bachelor's Degree") %>%
  get_corr(x = top, y = sdo_sum, wt = wts)
#> # A tibble: 1 × 8
#> x y correlation n conf.low conf.high p.value stars
#> <chr+lbl> <chr+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 top [An ideal … sdo… [Soc… -0.712 108 -0.847 -0.577 5.41e-18 ***
# Now let's get it for each education level. Two ways of doing this:
# The first is to group the data ahead of time
test_data %>%
  group_by(edu_f) %>%
  get_corr(x = top, y = sdo_sum, wt = wts)
#> # A tibble: 4 × 9
#> # Groups: edu_f [4]
#> edu_f x y correlation n conf.low conf.high p.value stars
#> <fct> <chr+lbl> <chr+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 High… top [An … sdo… [Soc… -0.728 64 -0.902 -0.555 8.94e-12 ***
#> 2 Some… top [An … sdo… [Soc… -0.729 78 -0.885 -0.572 3.93e-14 ***
#> 3 Bach… top [An … sdo… [Soc… -0.603 68 -0.799 -0.407 5.24e- 8 ***
#> 4 Grad… top [An … sdo… [Soc… -0.814 40 -1.00 -0.623 1.68e-10 ***
# The second is to use the group argument
test_data %>% get_corr(x = top, y = sdo_sum, group = edu_f, wt = wts)
#> # A tibble: 4 × 9
#> # Groups: edu_f [4]
#> edu_f x y correlation n conf.low conf.high p.value stars
#> <fct> <chr+lbl> <chr+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 High… top [An … sdo… [Soc… -0.728 64 -0.902 -0.555 8.94e-12 ***
#> 2 Some… top [An … sdo… [Soc… -0.729 78 -0.885 -0.572 3.93e-14 ***
#> 3 Bach… top [An … sdo… [Soc… -0.603 68 -0.799 -0.407 5.24e- 8 ***
#> 4 Grad… top [An … sdo… [Soc… -0.814 40 -1.00 -0.623 1.68e-10 ***
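To make the per-group computation concrete, the same idea can be sketched without the package: split the data by the grouping variable and compute a weighted correlation within each piece. The data frame `toy`, its columns, and the helper `cor_by_group()` are hypothetical, and this is not necessarily how `get_corr()` works internally.

```r
# Toy data, made up for illustration: the x-y relationship is positive in
# group "a" and negative in group "b".
set.seed(1)
toy <- data.frame(
  g = rep(c("a", "b"), each = 20),
  x = rnorm(40),
  w = runif(40, 0.5, 2)
)
toy$y <- ifelse(toy$g == "a", 1, -1) * toy$x + rnorm(40, sd = 0.5)

# Hypothetical helper: weighted Pearson correlation within each group level.
# Column names are passed as character strings.
cor_by_group <- function(df, x, y, group, wt) {
  pieces <- split(df, df[[group]])        # one data frame per group level
  vapply(pieces, function(d) {
    stats::cov.wt(cbind(d[[x]], d[[y]]), wt = d[[wt]], cor = TRUE)$cor[1, 2]
  }, numeric(1))
}

by_group <- cor_by_group(toy, x = "x", y = "y", group = "g", wt = "w")
```

The result is a named numeric vector with one correlation per level of `g`, which corresponds to the one-row-per-group layout of the tibbles above.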