Creates a survey design object using replicate weights for variance estimation. Supports all common replicate methods: jackknife (JK1, JK2, JKn), balanced repeated replication (BRR, Fay), bootstrap, ACS, successive-difference, and user-defined types. Uses a tidy-select interface for weight and replicate-weight columns.

Usage

as_survey_replicate(
  data,
  weights,
  repweights,
  type = c("JK1", "JK2", "JKn", "BRR", "Fay", "bootstrap", "ACS",
    "successive-difference", "other"),
  scale = NULL,
  rscales = NULL,
  fpc = NULL,
  fpctype = c("fraction", "correction"),
  mse = TRUE
)

Arguments

data

A data.frame containing the survey responses. Must have at least one row and unique column names.

weights

<tidy-select> Sampling weight column. Must select a single column whose values are all strictly greater than 0. Required.

repweights

<tidy-select> Replicate weight columns. Must select at least one column. Supports tidy-select helpers (e.g., starts_with("repwt")). Required.

type

Character. Replicate weight method. One of "JK1" (delete-one jackknife for unstratified designs), "JK2" (paired jackknife for designs with two PSUs per stratum), "JKn" (delete-one jackknife for stratified designs), "BRR" (balanced repeated replication), "Fay" (Fay's method, a modified BRR), "bootstrap", "ACS" (used in the American Community Survey), "successive-difference", or "other" (user-specified scale). Case-sensitive.

scale

Numeric. Scaling factor applied to the replicate variance formula. If NULL (default), it is computed automatically from type and the number of replicates R (the number of columns selected by repweights): (R-1)/R for jackknife methods, 1/4 for BRR/Fay, 1/R for bootstrap/ACS, 2/R for successive-difference, 1 for other.

rscales

Numeric vector of replicate-specific scaling factors, or NULL. If provided, must have the same length as the number of replicate weight columns selected by repweights.

fpc

<tidy-select> Finite population correction column (a single column). Used by some replicate methods to adjust the variance estimator. NULL (the default) means no finite population correction is applied.

fpctype

Character. How fpc is interpreted: "fraction" (sampling fraction, 0–1) or "correction" (multiplier for the replicate variance). Default "fraction". Case-sensitive.

mse

Logical. If TRUE (default), compute mean-squared-error variance estimates: replicate estimates are centred on the full-sample estimate rather than on the mean of the replicate estimates. Recommended for most designs.
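
As a rough illustration of how scale, rscales, and mse enter the calculation, the sketch below writes out the standard replicate variance formula, scale * sum(rscales * (theta_r - centre)^2), in base R. The helper rep_variance() and the toy numbers are hypothetical and not part of this package; the package's internal implementation may differ.

```r
# Hypothetical helper, not part of the package: the standard replicate
# variance formula written out in base R.
rep_variance <- function(theta_full, theta_reps, scale,
                         rscales = rep(1, length(theta_reps)),
                         mse = TRUE) {
  # rscales must supply one factor per replicate estimate
  stopifnot(length(rscales) == length(theta_reps))
  # mse = TRUE centres on the full-sample estimate;
  # mse = FALSE centres on the mean of the replicate estimates.
  centre <- if (mse) theta_full else mean(theta_reps)
  scale * sum(rscales * (theta_reps - centre)^2)
}

# Toy values: one full-sample estimate and four replicate estimates
theta_full <- 10
theta_reps <- c(9, 11, 12, 10)
R <- length(theta_reps)

# Jackknife-style scale (R - 1) / R, with and without MSE centring
v_mse  <- rep_variance(theta_full, theta_reps, scale = (R - 1) / R)
v_mean <- rep_variance(theta_full, theta_reps, scale = (R - 1) / R,
                       mse = FALSE)
```

Because a sum of squared deviations is smallest around the mean, the mse = TRUE variance is never smaller than the mse = FALSE variance for the same replicate estimates.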

Value

A survey_replicate object.

Tidy-select

Both weights and repweights support tidy-select syntax:

# Bare name for weights
as_survey_replicate(
  df, weights = wt, repweights = starts_with("repwt"), type = "BRR"
)
# c() for explicit replicate columns
as_survey_replicate(
  df, weights = wt, repweights = c(rep1, rep2, rep3), type = "JK1"
)

Replicate weight matrix

The replicate weight matrix is not stored in the object; only the replicate weight column names are stored, in @variables$repweights. Variance estimation builds the matrix on demand with as.matrix(design@data[, design@variables$repweights]).
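
The on-demand step can be mimicked with a plain data frame. In this minimal sketch, dat stands in for design@data and rep_cols for design@variables$repweights; both names and all values are made up for illustration.

```r
# Toy data frame standing in for design@data (an assumption).
dat <- data.frame(
  y      = c(1, 2, 3),
  wt     = c(10, 20, 30),
  repwt1 = c(0, 30, 30),
  repwt2 = c(15, 0, 45),
  repwt3 = c(15, 30, 0)
)

# Stand-in for design@variables$repweights: only the names are kept.
rep_cols <- c("repwt1", "repwt2", "repwt3")

# Build the replicate weight matrix when it is needed.
# drop = FALSE keeps a data frame even if a single column is selected.
rep_mat <- as.matrix(dat[, rep_cols, drop = FALSE])

# One weighted total of y per replicate (y is recycled down each column).
rep_totals <- colSums(rep_mat * dat$y)
```

Here rep_mat has one row per respondent and one column per replicate, so each column yields one replicate estimate.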

Memory usage

Each call to an estimation function (e.g., get_means(), get_totals()) materialises the full replicate weight matrix from the data frame. For large designs (e.g., ACS PUMS with 500k+ rows × 80 replicates), this allocates roughly nrow * n_replicates * 8 bytes per call (~363 MB for ACS Wyoming × 80). When you estimate many variables, the allocation is repeated for each call. This behaviour matches the survey package reference implementation.
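
The size estimate above is simple arithmetic on a dense matrix of doubles (8 bytes per cell). The sketch below uses a hypothetical helper, repweight_bytes(), which is not part of this package, and checks the rule of thumb against a small matrix with object.size().

```r
# Hypothetical helper, not part of the package: approximate size in
# bytes of a dense double-precision replicate weight matrix.
repweight_bytes <- function(n_rows, n_replicates) {
  n_rows * n_replicates * 8
}

# 500,000 rows x 80 replicates, reported in decimal megabytes
bytes <- repweight_bytes(500000, 80)
mb    <- bytes / 1e6

# Sanity check on a small matrix: the actual object is the cell data
# plus a small fixed header.
small_actual   <- as.numeric(object.size(matrix(0, nrow = 1000, ncol = 80)))
small_expected <- repweight_bytes(1000, 80)
```

Running the same calculation with nrow(data) and the number of selected replicate columns gives a quick estimate of the per-call allocation for a given design.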

References

Judkins, D.R. (1990) Fay's method for variance estimation. Journal of Official Statistics 6(3), 223–239.

Canty, A.J. and Davison, A.C. (1999) Resampling-based variance estimation for labour force surveys. The Statistician 48(3), 379–391.

Shao, J. and Tu, D. (1995) The Jackknife and Bootstrap. Springer.

Examples

# ACS PUMS Wyoming: 80 successive-difference replicate weights
d_acs <- as_survey_replicate(
  acs_pums_wy,
  weights    = pwgtp,
  repweights = pwgtp1:pwgtp80,
  type       = "successive-difference"
)

# Explicit replicate columns using c()
d_sub <- as_survey_replicate(
  acs_pums_wy,
  weights    = pwgtp,
  repweights = c(pwgtp1, pwgtp2, pwgtp3, pwgtp4),
  type       = "JK1"
)