na_if() is a survey-aware version of dplyr::na_if() that converts
values equal to y to NA. It is useful for replacing sentinel values
(e.g., 999 for "don't know") with proper missing values.
Unlike dplyr::na_if(), which compares x against a single value (or
element-wise against a y of the same size as x), this version accepts a
vector y of candidate values and replaces every match in a single call.
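The vector-y behavior can be sketched in base R without loading surveytidy. The helper name na_if_any() is hypothetical and exists only for illustration; this is a minimal sketch of the semantics described above, not the package's implementation.

```r
# Hypothetical helper illustrating vector-`y` semantics (not surveytidy's code).
na_if_any <- function(x, y) {
  # Every element of x that matches any element of y becomes NA.
  x[x %in% y] <- NA
  x
}

# 998 and 999 stand in for two sentinel codes in a made-up response vector.
responses <- c(1, 2, 999, 3, 998, 2)
cleaned <- na_if_any(responses, c(998, 999))
print(cleaned)
```

Both sentinel codes are cleared in one call, where a single-value comparison would need one call per code.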
When x carries value labels, the result of na_if() automatically inherits
them. By default (.update_labels = TRUE), the label entries for the
values set to NA are removed from the output; set .update_labels = FALSE to
retain them (useful when you want to document which values were set to missing).
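The effect of the label option can be sketched in base R. This example assumes value labels are stored as a named vector in a "labels" attribute, the convention used by haven-style labelled vectors. The helper drop_na_labels() and its update_labels argument are hypothetical stand-ins, not surveytidy functions.

```r
# Illustrative only: value labels are assumed to live in a "labels" attribute
# as a named vector. drop_na_labels() is a hypothetical helper.
drop_na_labels <- function(x, y, update_labels = TRUE) {
  labels <- attr(x, "labels")
  # Set matching values to NA.
  x[x %in% y] <- NA
  # Optionally drop the label entries whose values were just set to NA.
  if (update_labels && !is.null(labels)) {
    labels <- labels[!labels %in% y]
  }
  attr(x, "labels") <- labels
  x
}

pid3 <- c(1, 3, 2, 4)
attr(pid3, "labels") <- c(
  Democrat = 1, Republican = 2, Independent = 3, "Something else" = 4
)

updated  <- drop_na_labels(pid3, 4)
retained <- drop_na_labels(pid3, 4, update_labels = FALSE)

names(attr(updated, "labels"))   # label for 4 dropped
names(attr(retained, "labels"))  # label for 4 kept
```

Retaining the entry keeps a record of what the missing code originally meant, at the cost of a label that no longer matches any value in the data.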
Arguments
- x
Vector to modify.
- y
Value or vector of values to replace with NA. y is cast to the type of x before comparison. When y has more than one element, each value is replaced sequentially.
- .update_labels
logical(1). If TRUE (the default) and x carries value labels, label entries for values in y are removed from the output's value labels. Set to FALSE to retain all inherited labels even for values that were set to NA.
- .description
character(1) or NULL. Plain-language description of how the variable was created. Stored in @metadata@transformations[[col]]$description after mutate().
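The cast-then-replace behavior described for y can be sketched as follows. This is a rough base-R approximation under stated assumptions: storage.mode<- stands in for the cast step, whose actual rules in the package may be stricter, and na_if_sequential() is a hypothetical name.

```r
# Hypothetical sketch: cast y to x's type, then replace one value at a time.
na_if_sequential <- function(x, y) {
  # Stand-in for the cast step: coerce y to x's storage mode.
  storage.mode(y) <- storage.mode(x)
  for (value in y) {
    # Skip elements that are already NA so the comparison stays well defined.
    x[!is.na(x) & x == value] <- NA
  }
  x
}

x <- c(1L, 2L, 999L, 998L)               # integer input
result <- na_if_sequential(x, c(998, 999))  # y supplied as double
print(result)
typeof(result)
```

The output keeps the type of x (integer here) even though y was supplied as a double.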
Value
A modified version of x where values equal to y are replaced
with NA. If x carries value labels, returns a haven_labelled vector
with updated (or retained) labels; otherwise returns the same type as x.
See also
dplyr::na_if() for the base implementation.
dplyr::coalesce() to replace NAs with the first non-missing value.
replace_values() for replacing specific values with a new value rather than NA.
replace_when() for condition-based in-place replacement.
Other recoding:
case_when(),
if_else(),
recode_values(),
replace_values(),
replace_when()
Examples
library(surveycore)
library(surveytidy)
# create the survey design
ns_wave1_svy <- as_survey_nonprob(ns_wave1, weights = weight)
# basic na_if — replace "Something else" (pid3 == 4) with NA
new <- ns_wave1_svy |>
  mutate(pid3_clean = na_if(pid3, 4)) |>
  select(pid3, pid3_clean)
new
#>
#> ── Survey Design ───────────────────────────────────────────────────────────────
#> <survey_nonprob> (non-probability) [experimental]
#> Sample size: 6422
#>
#> # A tibble: 6,422 × 2
#>     pid3 pid3_clean
#>    <dbl>      <dbl>
#>  1     1          1
#>  2     1          1
#>  3     1          1
#>  4     3          3
#>  5     2          2
#>  6     1          1
#>  7     4         NA
#>  8     2          2
#>  9     2          2
#> 10     1          1
#> # ℹ 6,412 more rows
#>
#> ℹ Design variables preserved but hidden: weight.
#> ℹ Use `print(x, full = TRUE)` to show all variables.
# replace multiple values at once — Independent (3) and "Something else" (4)
new <- ns_wave1_svy |>
  mutate(pid3_2party = na_if(pid3, c(3, 4))) |>
  select(pid3, pid3_2party)
new
#>
#> ── Survey Design ───────────────────────────────────────────────────────────────
#> <survey_nonprob> (non-probability) [experimental]
#> Sample size: 6422
#>
#> # A tibble: 6,422 × 2
#>     pid3 pid3_2party
#>    <dbl>       <dbl>
#>  1     1           1
#>  2     1           1
#>  3     1           1
#>  4     3          NA
#>  5     2           2
#>  6     1           1
#>  7     4          NA
#>  8     2           2
#>  9     2           2
#> 10     1           1
#> # ℹ 6,412 more rows
#>
#> ℹ Design variables preserved but hidden: weight.
#> ℹ Use `print(x, full = TRUE)` to show all variables.
# .update_labels = TRUE (default) drops label entries for NA'd values
new <- ns_wave1_svy |>
  mutate(pid3_clean = na_if(pid3, 4, .update_labels = TRUE)) |>
  select(pid3, pid3_clean)
# "Something else" (4) is removed from pid3_clean's value labels
new@metadata@value_labels$pid3_clean
#>    Democrat  Republican Independent
#>           1           2           3
# .update_labels = FALSE retains label entries even for NA'd values
new <- ns_wave1_svy |>
  mutate(pid3_clean = na_if(pid3, 4, .update_labels = FALSE)) |>
  select(pid3, pid3_clean)
# "Something else" (4) is still in pid3_clean's value labels
new@metadata@value_labels$pid3_clean
#>       Democrat     Republican    Independent Something else
#>              1              2              3              4
# attach a plain-language description of the transformation
new <- ns_wave1_svy |>
  mutate(
    pid3_clean = na_if(
      pid3,
      4,
      .description = "Set 'Something else' (pid3 == 4) to NA."
    )
  ) |>
  select(pid3, pid3_clean)
new@metadata@transformations
#> $pid3_clean
#> $pid3_clean$fn
#> [1] "na_if"
#>
#> $pid3_clean$source_cols
#> [1] "pid3"
#>
#> $pid3_clean$expr
#> [1] "na_if(pid3, 4, .description = \"Set 'Something else' (pid3 == 4) to NA.\")"
#>
#> $pid3_clean$output_type
#> [1] "vector"
#>
#> $pid3_clean$description
#> [1] "Set 'Something else' (pid3 == 4) to NA."
#>
#>
