Data frame verbs

Rows

filter(), filter_out(), and drop_na() use domain estimation — rows are marked in or out of the analysis domain without being removed, so variance estimates stay correct. Physical row removal (subset(), slice_*()) is also available but issues a warning because removing rows can bias variance estimates.
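To illustrate why marking rows differs from removing them, here is a minimal base-R sketch. It does not use this package's API. The data, the `in_domain` column name, and the `lin_var()` helper are hypothetical, and the variance formula assumes a single-stratum, unclustered, with-replacement design with Taylor linearization of a ratio.

```r
# Toy sample: six respondents with sampling weights (all values are made up).
dat <- data.frame(
  y      = c(10, 12, 9, 20, 22, 18),
  weight = c(1.5, 2.0, 1.0, 2.5, 1.0, 2.0),
  region = c("north", "north", "north", "south", "south", "south")
)

# Domain approach: keep every row and flag membership instead of dropping rows.
dat$in_domain <- dat$region == "north"

# Weighted domain mean, written as a ratio of two full-sample totals.
domain_total <- sum(dat$weight * dat$in_domain)
domain_mean  <- sum(dat$weight * dat$y * dat$in_domain) / domain_total

# Physical removal: same point estimate, but the other rows are gone.
removed      <- dat[dat$region == "north", ]
removed_mean <- sum(removed$weight * removed$y) / sum(removed$weight)

# Hypothetical helper: with-replacement variance of linearized values z.
lin_var <- function(z) {
  n <- length(z)
  n / (n - 1) * sum((z - mean(z))^2)
}

# Linearized values for the ratio; out-of-domain rows contribute zeros.
z_domain  <- dat$weight * dat$in_domain * (dat$y - domain_mean) / domain_total
z_removed <- removed$weight * (removed$y - removed_mean) / sum(removed$weight)

var_domain  <- lin_var(z_domain)   # uses all 6 sampled rows
var_removed <- lin_var(z_removed)  # uses only the 3 retained rows
```

Under these assumptions the two point estimates agree but the two variance estimates do not, because the subsetted version treats the domain sample size as fixed rather than random.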

Columns

Select, reorder, rename, create, extract, and inspect columns. Design variables (weights, strata, PSU, FPC) are always retained even when not explicitly selected. rename() automatically updates the survey design specification and variable metadata to match the new name.

select()
Keep or drop columns using their names and types
relocate()
Change column order in a survey design object
rename() rename_with(<survey_base>) rename_with(<survey_result>) rename_with(<survey_collection>)
Rename columns of a survey design object
mutate()
Create, modify, and delete columns of a survey design object
pull()
Extract a column from a survey design object
glimpse()
Get a glimpse of a survey design object

Groups

group_by() stores grouping columns on the survey object for use by grouped operations like mutate(). rowwise() enables row-by-row computation. Unlike dplyr, grouping does not modify the underlying data: the groups are recorded on the survey object and applied when needed.

ungroup(<survey_base>) ungroup(<survey_collection>) group_by()
Group and ungroup a survey design object
rowwise()
Compute row-wise on a survey design object

Joins

Join a survey design object with a plain data frame. left_join() adds lookup columns without changing the row count, and bind_cols() appends columns by position. semi_join() and anti_join() are domain-aware: unmatched rows are marked out-of-domain rather than removed, which keeps variance estimation valid. inner_join() is domain-aware by default and accepts .domain_aware = FALSE to remove unmatched rows physically. right_join(), full_join(), and bind_rows() always error, because they would add rows with missing design variables.

left_join()
Add columns from a data frame to a survey design
semi_join() anti_join()
Domain-aware semi- and anti-join for survey designs
inner_join()
Domain-aware inner join for survey designs
bind_cols()
Append columns to a survey design by position
right_join() full_join()
Unsupported joins for survey designs
bind_rows()
Stack surveys with bind_rows (errors unconditionally)
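
The domain-aware join behaviour described in this section can be sketched in base R. This is a conceptual illustration only, not the package's implementation: the data frames and the `in_domain` and `in_domain_anti` column names are hypothetical, and how the package actually stores domain membership is not shown here.

```r
# Hypothetical survey rows and lookup table (all values are made up).
survey_rows <- data.frame(
  id     = 1:5,
  weight = c(1.2, 0.8, 1.5, 1.0, 2.0),
  state  = c("CA", "NY", "TX", "CA", "WA")
)
lookup <- data.frame(state = c("CA", "TX"))

# Row-level match against the lookup table.
matched <- survey_rows$state %in% lookup$state

# Physical semi-join: unmatched rows are dropped from the data.
physical <- survey_rows[matched, ]

# Domain-aware semi-join: every row stays, matched rows are flagged in-domain.
survey_rows$in_domain <- matched

# Domain-aware anti-join: the complement of the semi-join flag.
survey_rows$in_domain_anti <- !matched
```

The flagged version keeps all five rows, so a variance calculation can still see the full sample, while the physical version keeps only the three matched rows.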

Predicates

Test the current grouping and rowwise state of a survey design object. These predicates are designed for use by estimation functions in Phase 1.

is_rowwise()
Test whether a survey design is in rowwise mode
is_grouped()
Test whether a survey design has active grouping

Recoding

Survey-aware versions of dplyr’s recoding and conditional functions. When called with .label, .value_labels, .factor, or .description, these functions automatically propagate label metadata into @metadata via mutate(). When called without these arguments, the output is identical to the corresponding dplyr function.

case_when()
A generalised vectorised if-else
replace_when()
Partially update a vector using conditional formulas
if_else()
Vectorised if-else
na_if()
Convert values to NA
recode_values()
Recode values using an explicit mapping
replace_values()
Partially update values using an explicit mapping

Transformation

Vector-level transformation functions for common survey variable operations. These functions operate on plain R vectors and integrate with mutate() via the surveytidy_recode attribute protocol, automatically recording transformation metadata in @metadata@transformations.

make_factor()
Convert a vector to a factor using value labels
make_dicho()
Collapse a multi-level factor to two levels
make_binary()
Convert a dichotomous variable to a numeric 0/1 indicator
make_rev()
Reverse the numeric values of a scale variable
make_flip()
Flip the semantic valence of a variable
row_means()
Compute row-wise means across selected columns
row_sums()
Compute row-wise sums across selected columns