Creates a survey design object using Taylor series (linearization) for variance estimation. Supports simple random samples, stratified designs, single- and multi-stage cluster designs, and designs with finite population correction. Uses a tidy-select interface for all design variable arguments.
Usage
as_survey(
  data,
  ids = NULL,
  probs = NULL,
  weights = NULL,
  strata = NULL,
  fpc = NULL,
  nest = FALSE
)
Arguments
- data
A data.frame containing the survey responses. Must have at least one row and unique column names.
- ids
<tidy-select> Cluster (PSU) ID column(s). For single-stage: ids = psu. For multi-stage: ids = c(psu, ssu). Omit entirely for simple random sampling.
- probs
<tidy-select> Sampling probability column (a single column, values in (0, 1]). Converted to weights = 1/probs and stored internally. Cannot be used together with weights unless the values are consistent (weights == 1/probs).
- weights
<tidy-select> Sampling weight column (a single column, values strictly > 0).
- strata
<tidy-select> Stratification variable column (a single column).
- fpc
<tidy-select> Finite population correction column (a single column). Accepts either total population size (integer) or sampling fraction (numeric, 0–1). Cannot contain NA.
- nest
Logical. If TRUE, PSU IDs are treated as nested within strata, i.e., the same ID value in two different strata refers to two distinct PSUs. Set nest = TRUE when PSU IDs are not globally unique (e.g., NHANES, where PSU IDs restart from 1 in each stratum). Requires strata to be specified. Default FALSE.
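Two of the argument rules above (probs converted to weights, and PSU IDs nested within strata) can be illustrated with a small base R sketch. It does not call as_survey() and is not the package's implementation; the toy data frame and every name in it are hypothetical, and the tolerance-based consistency check is an assumption.

```r
# Hypothetical toy data: PSU IDs restart at 1 in each stratum
toy <- data.frame(
  stratum = c("A", "A", "A", "B", "B", "B"),
  psu     = c(1, 1, 2, 1, 2, 2),
  prob    = c(0.5, 0.5, 0.25, 0.1, 0.2, 0.2)
)

# probs -> weights: each weight is the inverse of the selection probability
toy$wt <- 1 / toy$prob

# Consistency between a weights column and a probs column.
# Assumption: a tolerance-based comparison; the exact rule applied by
# as_survey() is not shown here.
consistent <- isTRUE(all.equal(toy$wt, 1 / toy$prob))

# nest = FALSE view: PSU IDs are compared globally
n_psu_global <- length(unique(toy$psu))

# nest = TRUE view: a PSU is identified by the (stratum, psu) pair
nested_id <- interaction(toy$stratum, toy$psu, drop = TRUE)
n_psu_nested <- nlevels(nested_id)
```

In this toy data the global view finds 2 distinct PSU IDs while the nested view finds 4 clusters, which is the kind of mismatch nest = TRUE is meant to prevent.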
Tidy-select
All design variable arguments (ids, probs, weights, strata, fpc)
support tidy-select syntax, so columns are given as bare names (ids = psu)
or combined with c() (ids = c(psu, ssu)).
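As a rough illustration of what tidy-select resolution means, the sketch below uses the tidyselect package directly to turn selection expressions into column positions. It assumes tidyselect is installed; the toy data frame is hypothetical, and whether as_survey() accepts every selection helper (such as starts_with()) is an assumption rather than something stated on this page.

```r
library(tidyselect)

# Hypothetical design columns
toy <- data.frame(
  psu     = 1:4,
  ssu     = 1:4,
  wt_base = rep(2, 4),
  y       = c(1, 0, 1, 1)
)

# Bare names combined with c(), as in ids = c(psu, ssu)
ids_sel <- eval_select(quote(c(psu, ssu)), toy)

# A selection helper (assumption: not confirmed for as_survey())
wt_sel <- eval_select(quote(starts_with("wt")), toy)

names(ids_sel)  # "psu" "ssu"
names(wt_sel)   # "wt_base"
```

Each result is a named integer vector of column positions, which is how a function can check that a single-column argument really resolved to exactly one column.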
Simple random sample
If ids, weights, and probs are all omitted, an equal-probability SRS
is assumed. A warning is issued because population totals cannot be
estimated without weights or population size.
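The reason for that warning can be seen in a short base R sketch using textbook SRS formulas: a mean needs only the sample, while a total needs the population size (or, equivalently, a weight). The values of y and N below are hypothetical, and the code is not the package's estimator.

```r
# Hypothetical SRS of n = 5 responses
y <- c(2, 4, 4, 6, 9)
n <- length(y)

# Assumption: population size, which is unknown when only the sample is given
N <- 100

# The mean is estimable from the sample alone
y_bar <- mean(y)

# A total needs a weight; under SRS every unit has weight N / n
w <- rep(N / n, n)
total_hat <- sum(w * y)

# Standard error of the mean with the finite population correction (1 - n/N)
se_mean <- sqrt((1 - n / N) * var(y) / n)
```

Without N (or a weights, probs, or fpc column carrying the same information), only y_bar above could be computed.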
See also
as_survey_rep() for replicate-weight designs,
as_survey_twophase() for two-phase designs,
set_var_label(), set_variable_labels() to add variable metadata
Other constructors:
as_survey_calibrated(),
as_survey_rep(),
as_survey_srs(),
as_survey_twophase(),
survey_calibrated(),
survey_data(),
survey_replicate(),
survey_srs(),
survey_taylor(),
survey_twophase()
Examples
# Full NHANES design: stratified cluster with PSU IDs nested within strata
d <- as_survey(
  nhanes_2017,
  ids = sdmvpsu,
  weights = wtint2yr,
  strata = sdmvstra,
  nest = TRUE
)
# Stratified design without PSU cluster IDs
d_strat <- as_survey(nhanes_2017, weights = wtint2yr, strata = sdmvstra)
# Blood pressure analysis: filter to exam participants, use MEC weight
exam <- nhanes_2017[nhanes_2017$ridstatr == 2, ]
d_bp <- as_survey(
  exam,
  ids = sdmvpsu,
  weights = wtmec2yr,
  strata = sdmvstra,
  nest = TRUE
)