Creates a survey design object using replicate weights for variance estimation. Supports all common replicate methods: jackknife (JK1, JK2, JKn), balanced repeated replication (BRR, Fay), bootstrap, ACS, successive-difference, and user-defined types. Uses a tidy-select interface for weight and replicate-weight columns.
Arguments
- data
A data.frame containing the survey responses. Must have at least one row and unique column names.
- weights
<tidy-select> Sampling weight column (a single column, values strictly > 0). Required.
- repweights
<tidy-select> Replicate weight columns. Must select at least one column. Supports tidy-select helpers (e.g., starts_with("repwt")). Required.
- type
Character. Replicate weight method. One of "JK1" (delete-1 jackknife), "JK2" (delete-1 jackknife, stratified), "JKn" (delete-1 jackknife with varying replication counts), "BRR" (balanced repeated replication), "Fay" (Fay's method, a modified BRR), "bootstrap", "ACS" (used in American Community Survey), "successive-difference", or "other" (user-specified scale).
- scale
Numeric. Scaling factor applied to the replicate variance formula. If NULL (default), computed automatically from type and the number of replicates: (R-1)/R for jackknife methods, 1/4 for BRR/Fay, 1/R for bootstrap/ACS, 2/R for successive-difference, 1 for other.
- rscales
Numeric vector of replicate-specific scaling factors, or NULL. If provided, must have the same length as the number of replicate weight columns selected by repweights.
- fpc
<tidy-select> Finite population correction column (a single column). Used by some replicate methods to adjust the variance estimator. NULL means no FPC correction.
- fpctype
Character. How fpc is interpreted: "fraction" (sampling fraction, 0–1) or "correction" (multiplier for the replicate variance). Default "fraction".
- mse
Logical. If TRUE (default), use mean-squared-error estimates (subtract the full-sample estimate rather than the mean replicate estimate when computing variance). Recommended for most designs.
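The default scale values and the effect of mse described above can be sketched in base R. This is an illustration only, not the package's internal code: default_rep_scale() and rep_variance() are hypothetical helper names, and the variance form scale * sum(rscales * (theta_r - centre)^2) is assumed to be the usual replicate-variance formula rather than something this page confirms.

```r
# Hypothetical helper: default `scale` as listed under the `scale` argument.
default_rep_scale <- function(type, R) {
  switch(type,
    "JK1" = , "JK2" = , "JKn" = (R - 1) / R,
    "BRR" = , "Fay" = 1 / 4,
    "bootstrap" = , "ACS" = 1 / R,
    "successive-difference" = 2 / R,
    "other" = 1,
    stop("unknown type: ", type)
  )
}

# Hypothetical helper: replicate variance of a scalar estimate.
# Assumes the form scale * sum(rscales * (theta_r - centre)^2).
rep_variance <- function(theta_full, theta_reps, scale,
                         rscales = NULL, mse = TRUE) {
  R <- length(theta_reps)
  if (is.null(rscales)) rscales <- rep(1, R)
  stopifnot(length(rscales) == R)
  # mse = TRUE centres on the full-sample estimate,
  # mse = FALSE centres on the mean of the replicate estimates.
  centre <- if (mse) theta_full else mean(theta_reps)
  scale * sum(rscales * (theta_reps - centre)^2)
}

theta_full <- 10
theta_reps <- c(9, 11, 10, 12)
sc <- default_rep_scale("JK1", R = length(theta_reps))  # (4 - 1) / 4

v_mse <- rep_variance(theta_full, theta_reps, sc, mse = TRUE)   # 4.5
v_avg <- rep_variance(theta_full, theta_reps, sc, mse = FALSE)  # 3.75
```

In this toy case the mse = TRUE estimate is the larger of the two, because deviations are taken from the full-sample estimate (10) rather than from the replicate mean (10.5).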
Tidy-select
Both weights and repweights support tidy-select syntax:
# Bare name for weights
as_survey_rep(df, weights = wt, repweights = starts_with("repwt"), type = "BRR")
# c() for explicit replicate columns
as_survey_rep(df, weights = wt, repweights = c(rep1, rep2, rep3), type = "JK1")
Replicate weight matrix
The replicate weight matrix is not stored in the object. Only the
column names are stored in @variables$repweights. Variance estimation
computes the matrix on demand:
as.matrix(design@data[, design@variables$repweights]).
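The store-names-only pattern can be illustrated with a toy stand-in. The sketch below uses a plain list with $ access instead of the package's real design object and its @ slots, and materialise_repweights() is a hypothetical helper name.

```r
# Toy stand-in for a design object: a plain list, not the package's class.
toy_design <- list(
  data = data.frame(
    wt     = c(10, 20, 30),
    repwt1 = c(0, 25, 35),
    repwt2 = c(15, 0, 35),
    repwt3 = c(15, 25, 0)
  ),
  # Only the column names are kept, not the matrix itself.
  variables = list(
    weights    = "wt",
    repweights = c("repwt1", "repwt2", "repwt3")
  )
)

# Hypothetical helper: build the replicate weight matrix only when needed.
materialise_repweights <- function(design) {
  as.matrix(design$data[, design$variables$repweights, drop = FALSE])
}

rw <- materialise_repweights(toy_design)
dim(rw)  # rows x replicate columns
```

Keeping only the names avoids holding a second copy of the replicate weights alongside the data frame; the cost is that the matrix is rebuilt each time it is needed.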
Memory usage
Each call to an estimation function (e.g., get_means(), get_totals())
materialises the full replicate weight matrix from the data frame. For large
designs (e.g., ACS PUMS with 500k+ rows × 80 replicates), this allocates roughly
nrow * n_replicates * 8 bytes per call (~363 MB for ACS Wyoming with 80 replicates).
The allocation is repeated on every call, so estimating many variables in
separate calls incurs this cost each time.
This behaviour matches the survey package reference implementation.
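The size estimate above is simple arithmetic on double-precision values (8 bytes each). A rough sketch follows, with repweight_bytes() as a hypothetical helper name and 500,000 rows as an illustrative round number rather than the size of any particular dataset.

```r
# Hypothetical helper: approximate size of a dense numeric (double)
# replicate weight matrix, ignoring the small fixed object header.
repweight_bytes <- function(n_rows, n_replicates) {
  n_rows * n_replicates * 8
}

bytes <- repweight_bytes(500000, 80)
bytes / 1e6  # 320 (decimal MB) per materialisation

# Sanity check on a small matrix: object.size() reports the same figure
# plus a small header.
m <- matrix(0, nrow = 1000, ncol = 80)
small_bytes <- as.numeric(object.size(m))
```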
See also
as_survey() for Taylor series designs,
as_survey_twophase() for two-phase designs,
set_var_label(), set_variable_labels() to add variable metadata
Other constructors:
as_survey(),
as_survey_calibrated(),
as_survey_srs(),
as_survey_twophase(),
survey_calibrated(),
survey_data(),
survey_replicate(),
survey_srs(),
survey_taylor(),
survey_twophase()
Examples
# ACS PUMS Wyoming: 80 successive-difference replicate weights
d_acs <- as_survey_rep(
  acs_pums_wy,
  weights = pwgtp,
  repweights = pwgtp1:pwgtp80,
  type = "successive-difference"
)
# Explicit replicate columns using c()
d_sub <- as_survey_rep(
  acs_pums_wy,
  weights = pwgtp,
  repweights = c(pwgtp1, pwgtp2, pwgtp3, pwgtp4),
  type = "JK1"
)