This function calculates the difference in means using a
bivariate regression, as well as the p-value indicating how
significant each difference is. The main function doing the
calculations is lm().
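As a rough illustration of the regression approach described above, the sketch below uses only base R and the built-in PlantGrowth data. It is not the package's actual implementation; it only shows that, in a regression of the outcome on a single factor, each non-intercept coefficient equals a difference in means from the reference level.

```r
# Sketch only: base R with the built-in PlantGrowth data, not the package's code.
fit <- lm(weight ~ group, data = PlantGrowth)  # "ctrl" is the reference level

# Each non-intercept coefficient is a difference in means from the reference group
diffs <- coef(fit)[-1]
p_values <- summary(fit)$coefficients[-1, "Pr(>|t|)"]
conf_int <- confint(fit, level = 0.95)[-1, , drop = FALSE]

# Check the coefficients against the raw group means
means <- sapply(split(PlantGrowth$weight, PlantGrowth$group), mean)
raw_diffs <- means[-1] - means[["ctrl"]]
all.equal(unname(diffs), unname(raw_diffs))
```

The p-values and confidence intervals here come straight from the regression, so they carry no adjustment for multiple comparisons.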
NOTE: This function does not perform an actual Dunnett test, as it
does not calculate the quantile of the multivariate t-distribution
when determining the confidence intervals and p-values. If you need
to perform an actual Dunnett test, use the dunnett() function
instead. Please be aware that dunnett() is far slower when
there are many comparison groups, due to the nature of
mvtnorm::qmvt() and high-dimensional data.
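To make the distinction in the note concrete, the sketch below compares the critical value from the univariate t-distribution with one taken from the multivariate t-distribution via mvtnorm::qmvt(). It is an illustration under stated assumptions, not how get_diffs() or dunnett() are implemented: it uses the built-in PlantGrowth data, assumes equal group sizes (which gives a pairwise correlation of 0.5 between comparisons), and runs the mvtnorm step only if that package is installed.

```r
# Sketch only: base R plus mvtnorm (if installed); not the package's code.
fit <- lm(weight ~ group, data = PlantGrowth)
df_resid <- df.residual(fit)          # residual degrees of freedom
k <- nlevels(PlantGrowth$group) - 1   # number of comparisons with the reference

# Unadjusted two-sided critical value, as used by a plain regression
crit_unadjusted <- qt(1 - (1 - 0.95) / 2, df = df_resid)

# Dunnett-style critical value from the multivariate t-distribution.
# With equal group sizes the comparisons have pairwise correlation 0.5.
if (requireNamespace("mvtnorm", quietly = TRUE)) {
  corr <- matrix(0.5, nrow = k, ncol = k)
  diag(corr) <- 1
  crit_dunnett <- mvtnorm::qmvt(0.95, tail = "both.tails",
                                df = df_resid, corr = corr)$quantile
  print(c(unadjusted = crit_unadjusted, dunnett = crit_dunnett))
}
```

The multivariate quantile has to be found numerically over a k-dimensional distribution, which is why the cost grows as the number of comparison groups increases.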
Usage
get_diffs(
  data,
  x,
  treats,
  group = NULL,
  wt = NULL,
  ref_level,
  show_means = FALSE,
  show_pct_change = FALSE,
  decimals = 3,
  conf.level = 0.95,
  na.rm = TRUE
)
Arguments
- data
A data frame or tibble.
- x
A numeric vector that will be used to calculate the means. This can be a string or symbol.
- treats
A variable whose values are used to determine if the means are statistically significantly different from each other. Should be a factor or character vector. This can be a string or symbol.
- group
<tidy-select> A selection of columns to group the data by in addition to treats. This operates very similarly to .by from dplyr (for more info on that see ?dplyr_by). See examples to see how it operates.
- wt
Weights. Add if you have a weighting variable and want to perform Dunnett's test with weighted means.
- ref_level
A string that specifies the level of the reference group through which the others will be tested.
- show_means
Logical. Default is FALSE, which does not show the mean values for the levels. If TRUE, will add a column called mean that contains the means.
- show_pct_change
Logical. Default is FALSE, which does not show the percent change from the reference category to the other categories. If TRUE, will show the percent change.
- decimals
Number of decimals each number should be rounded to. Default is 3.
- conf.level
A number between 0 and 1 that signifies the width of the desired confidence interval. Default is 0.95, which corresponds to a 95% confidence interval.
- na.rm
Logical. Default is TRUE, which removes NAs prior to calculation.
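The wt argument refers to weighted means. As a hedged illustration of what that means in a regression setting, the sketch below fits a weighted regression in base R and checks it against weighted.mean(). The weights are made up for the example, and whether get_diffs() passes wt to lm() in exactly this way is an assumption.

```r
# Sketch only: hypothetical weights on the built-in PlantGrowth data.
set.seed(1)
pg <- PlantGrowth
pg$w <- runif(nrow(pg), min = 0.5, max = 2)  # made-up survey weights

# Weighted least squares on the grouping factor
fit_w <- lm(weight ~ group, data = pg, weights = w)
w_diffs <- coef(fit_w)[-1]

# Weighted mean of the outcome within each group
w_means <- sapply(split(pg, pg$group), function(d) weighted.mean(d$weight, d$w))

# The coefficients match the differences in weighted means from "ctrl"
all.equal(unname(w_diffs), unname(w_means[-1] - w_means[["ctrl"]]))
```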
Value
A tibble with one row for each non-reference level of treats if no group
is provided and data is not of class "grouped_df". If data is of class
"grouped_df" or group is provided, it will return those rows for each unique
value of the grouping variable if one group is provided, and for each unique
combination of values if multiple groups are used.
Examples
# load dplyr for the pipe: %>%
library(dplyr)
library(adlgraphs)
# Check to see if any of the partisan groups are significantly different
# from the control group (in this case "Democrat") for conspiracy
# theory belief
get_diffs(test_data, "acts_avg", "pid_f3")
#> # A tibble: 2 × 7
#> pid_f3 diffs n conf.low conf.high p_value stars
#> * <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 Independent 0.262 57 0.015 0.509 0.037 "*"
#> 2 Republican 0.175 95 -0.038 0.388 0.107 ""
# now do the same as above but make "Independent" the control group
get_diffs(test_data, "acts_avg", "pid_f3", ref_level = "Independent")
#> # A tibble: 2 × 7
#> pid_f3 diffs n conf.low conf.high p_value stars
#> * <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 Democrat -0.262 98 -0.509 -0.015 0.037 "*"
#> 2 Republican -0.087 95 -0.335 0.161 0.491 ""
# now let's add in education (`edu_f2`) as the `group` variable. This lets us
# compare partisan groups within each level of `edu_f2`. Note how the arguments
# don't have to be strings
get_diffs(test_data, acts_avg, pid_f3, edu_f2)
#> # A tibble: 4 × 8
#> edu_f2 pid_f3 diffs n conf.low conf.high p_value stars
#> * <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 No College Degree Indep… 0.168 40 -0.134 0.471 0.273 ""
#> 2 No College Degree Repub… 0.156 50 -0.129 0.441 0.281 ""
#> 3 At Least a Bachelor's Deg… Indep… 0.316 17 -0.118 0.749 0.152 ""
#> 4 At Least a Bachelor's Deg… Repub… 0.2 45 -0.121 0.52 0.22 ""
# we can also group by multiple variables. Due to a small n, I'm going to use
# `edu_f2` instead of `edu_f`.
test_data %>%
  dplyr::mutate(values_f2 = make_dicho(values)) %>%
  get_diffs(acts_avg, treats = pid_f3, group = c(edu_f2, values_f2))
#> # A tibble: 8 × 9
#> edu_f2 values_f2 pid_f3 diffs n conf.low conf.high p_value stars
#> * <fct> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 No College Deg… True Indep… 0.163 18 -0.249 0.575 0.432 ""
#> 2 No College Deg… True Repub… 0.08 35 -0.26 0.421 0.639 ""
#> 3 No College Deg… False Indep… 0.081 22 -0.379 0.541 0.726 ""
#> 4 No College Deg… False Repub… 0.369 15 -0.137 0.876 0.149 ""
#> 5 At Least a Bac… True Indep… 0.235 11 -0.359 0.829 0.433 ""
#> 6 At Least a Bac… True Repub… -0.015 28 -0.454 0.423 0.945 ""
#> 7 At Least a Bac… False Indep… 0.49 6 -0.042 1.02 0.07 "."
#> 8 At Least a Bac… False Repub… 0.586 17 0.188 0.983 0.005 "**"
# now let's do those previous two calculations but using `dplyr::group_by()`
test_data %>%
  dplyr::group_by(pid_f3) %>%
  get_diffs(acts_avg, edu_f)
#> # A tibble: 9 × 8
#> pid_f3 edu_f diffs n conf.low conf.high p_value stars
#> * <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 Democrat Some College -0.111 32 -0.533 0.311 0.603 ""
#> 2 Democrat Bachelor's Degree -0.346 28 -0.779 0.087 0.116 ""
#> 3 Democrat Graduate Degree -0.379 18 -0.86 0.101 0.121 ""
#> 4 Independent Some College -0.096 14 -0.645 0.453 0.727 ""
#> 5 Independent Bachelor's Degree -0.054 12 -0.633 0.524 0.851 ""
#> 6 Independent Graduate Degree -0.471 5 -1.28 0.338 0.248 ""
#> 7 Republican Some College -0.074 32 -0.49 0.342 0.725 ""
#> 8 Republican Bachelor's Degree -0.204 28 -0.631 0.222 0.344 ""
#> 9 Republican Graduate Degree -0.443 17 -0.92 0.035 0.069 "."
# we can also group by multiple variables
test_data %>%
  dplyr::mutate(values_f2 = make_dicho(values)) %>%
  dplyr::group_by(pid_f3, values_f2) %>%
  get_diffs(acts_avg, edu_f2)
#> # A tibble: 6 × 9
#> pid_f3 values_f2 edu_f2 diffs n conf.low conf.high p_value stars
#> * <fct> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 Democrat True At Least … -0.169 33 -0.546 0.209 0.376 ""
#> 2 Democrat False At Least … -0.546 13 -1.04 -0.056 0.03 "*"
#> 3 Independent True At Least … -0.097 11 -0.828 0.634 0.787 ""
#> 4 Independent False At Least … -0.136 6 -0.81 0.537 0.681 ""
#> 5 Republican True At Least … -0.264 28 -0.623 0.095 0.146 ""
#> 6 Republican False At Least … -0.329 17 -0.76 0.101 0.129 ""