Computes the effective sample size of a survey design using either the
Kish (1965) weight-only approximation (method = "kish") or the full
design-effect-based formula for a specified variable (method = "deff").
Usage
get_effective_n(
  design,
  x = NULL,
  group = NULL,
  method = c("kish", "deff"),
  na.rm = TRUE,
  decimals = NULL,
  min_cell_n = 30L,
  ...,
  .id = NULL,
  .if_missing_var = NULL
)
Arguments
- design
  A survey design object: survey_taylor, survey_replicate, survey_twophase, survey_nonprob, or a survey_collection.
- x
  <tidy-select> A single unquoted numeric variable name. Required when method = "deff"; ignored (with a message) when method = "kish". Default NULL.
- group
  <tidy-select> Optional grouping variable(s). Combined with any grouping set by group_by(). Default NULL.
- method
  Character(1). "kish" (default) or "deff". Controls the effective-N formula. Matched via match.arg().
- na.rm
  Logical. If TRUE (default), exclude observations with NA weights or group variables from the Kish computation; passed to get_means() for the DEFF computation.
- decimals
  Integer or NULL. Rounds n_eff and deff columns to this many decimal places. n is always integer and is never rounded. Default NULL.
- min_cell_n
  Integer. Minimum unweighted cell count before surveycore_warning_small_cell fires (Kish method only). Default 30L.
- ...
  Unused. Reserved so that .id and .if_missing_var remain named-only when a survey_collection is passed.
- .id
  Character(1) or NULL. Column name identifying each survey in a survey_collection. Default NULL (uses the collection's stored @id).
- .if_missing_var
  "error", "skip", or NULL. Handling for surveys in a collection that lack x. Default NULL.
Value
A survey_effective_n tibble (also inheriting survey_result).
Columns, in order:
- [.id]: survey identifier column (when design is a collection).
- [group_cols...]: group variable columns (when grouping is active).
- n: integer. Unweighted count of observations.
- n_eff: numeric. Effective sample size.
- deff_kish: numeric. Weight-based design effect (n / n_eff). Present when method = "kish" only.
- deff: numeric. Full design effect (Var_design / Var_SRS). Present when method = "deff" only.
Use meta(result)$method to retrieve the formula used. For DEFF,
meta(result)$x is a named list with variable metadata.
Details
The Kish method (method = "kish") computes effective N from survey
weights alone: n_eff = sum(w)^2 / sum(w^2). It captures only weight variation.
For clustered designs with equal weights, deff_kish = 1.0 even when the
true design effect is substantially greater due to clustering. Use
method = "deff" to capture the full design effect for a specific
analysis variable.
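The Kish formula can be checked by hand in base R. The sketch below uses made-up weight vectors (w_equal and w_unequal are hypothetical, not from any packaged dataset) and a small helper, kish_n_eff(), written only for this illustration; it is not part of the package API.

```r
# Hypothetical weights for 100 respondents; not taken from any real survey.
w_equal   <- rep(2, 100)               # every respondent has the same weight
w_unequal <- c(rep(1, 50), rep(3, 50)) # half the sample weighted 3x the other half

# Illustrative helper implementing n_eff = sum(w)^2 / sum(w^2).
kish_n_eff <- function(w) sum(w)^2 / sum(w^2)

n_eff_equal   <- kish_n_eff(w_equal)    # 100: equal weights lose nothing
n_eff_unequal <- kish_n_eff(w_unequal)  # 80: weight variation shrinks effective N

# Weight-based design effect, n / n_eff.
deff_kish_equal   <- length(w_equal) / n_eff_equal      # 1
deff_kish_unequal <- length(w_unequal) / n_eff_unequal  # 1.25
```

With equal weights the result is n_eff = n and deff_kish = 1 regardless of how the sample was clustered, which is why the Kish method cannot detect clustering effects.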
The DEFF method (method = "deff") computes effective N as
n_eff = n / DEFF, where DEFF = Var_design / Var_SRS for variable x.
It captures clustering, stratification, and weight variation jointly.
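As a rough illustration of the arithmetic only, the relationship n_eff = n / DEFF can be written out in base R. The variance values below are hypothetical placeholders; in practice Var_design and Var_SRS come from the design-based estimation for x, which this sketch does not attempt to reproduce.

```r
# Hypothetical inputs for the mean of some variable x; not real estimates.
n          <- 900L  # unweighted number of observations
var_design <- 4.5   # assumed variance of the estimate under the actual design
var_srs    <- 1.5   # assumed variance under simple random sampling of size n

# Full design effect and the effective sample size it implies.
deff  <- var_design / var_srs  # 3
n_eff <- n / deff              # 300
```

A design effect of 3 here means the 900 observations carry about as much information about x as a simple random sample of 300.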
See also
Other analysis:
clean(),
get_anova(),
get_corr(),
get_covariance(),
get_diffs(),
get_freqs(),
get_means(),
get_pairwise(),
get_quantiles(),
get_ratios(),
get_t_test(),
get_totals(),
get_variance(),
meta()
Examples
d <- as_survey(nhanes_2017, ids = sdmvpsu, weights = wtint2yr,
               strata = sdmvstra, nest = TRUE)
# Kish effective N (weight-only approximation)
get_effective_n(d)
#> # A tibble: 1 × 3
#> n n_eff deff_kish
#> <int> <dbl> <dbl>
#> 1 9254 3820. 2.42
# Full DEFF effective N for a specific variable
get_effective_n(d, ridageyr, method = "deff")
#> # A tibble: 1 × 3
#> n n_eff deff
#> <int> <dbl> <dbl>
#> 1 9254 2425. 3.82
