
na_if() is a survey-aware version of dplyr::na_if() that converts values equal to y to NA. It is useful for replacing sentinel values (e.g., 999 for "don't know") with proper missing values.

Unlike dplyr::na_if(), which compares x against y element-wise (so y is normally a single value, or else a vector the same size as x), this version treats y as a set of values and replaces every match in a single call.
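The set-style matching described above can be sketched in base R. This is only an illustration of the semantics, not the package's implementation; the helper name na_if_any() and the sample data are hypothetical.

```r
# Hypothetical helper for illustration only; not the surveytidy implementation.
# Replaces every element of x that matches ANY value in y with NA.
na_if_any <- function(x, y) {
  x[x %in% y] <- NA
  x
}

# 998 and 999 stand in for sentinel codes such as "refused" and "don't know"
responses <- c(1, 2, 998, 3, 999, 2)
na_if_any(responses, c(998, 999))
#> [1]  1  2 NA  3 NA  2
```

Matching with %in% means the length of y is independent of the length of x, which is the behavior a list of sentinel codes needs.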

When x carries value labels, the result inherits them. By default (.update_labels = TRUE), the label entries for values that were set to NA are removed from the output; set .update_labels = FALSE to retain them, which is useful when you want to document which values were set to missing.
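As a rough sketch of what the two .update_labels settings mean, value labels can be modeled as a named vector (label name mapped to value), which is how they print in the examples on this page. The object names below are hypothetical, and the subsetting is an assumption about the effect, not the package's internal code.

```r
# Value labels modeled as a named vector: label -> value (illustrative data)
labels <- c(Democrat = 1, Republican = 2, Independent = 3, `Something else` = 4)
y <- 4  # the value being set to NA

# Effect of .update_labels = TRUE: entries for values in y are dropped
labels_updated <- labels[!labels %in% y]

# Effect of .update_labels = FALSE: every inherited entry is kept
labels_retained <- labels

names(labels_updated)
#> [1] "Democrat"    "Republican"  "Independent"
```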

Usage

na_if(x, y, .update_labels = TRUE, .description = NULL)

Arguments

x

Vector to modify.

y

Value or vector of values to replace with NA. y is cast to the type of x before comparison. When y has more than one element, each of its values is replaced in turn, so every element of x that matches any value in y becomes NA.

.update_labels

logical(1). If TRUE (the default) and x carries value labels, label entries for values in y are removed from the output's value labels. Set to FALSE to retain all inherited labels even for values that were set to NA.

.description

character(1) or NULL. Plain-language description of how the variable was created. Stored in @metadata@transformations[[col]]$description after mutate().

Value

A modified version of x in which values equal to any element of y are replaced with NA. If x carries value labels, returns a haven_labelled vector with updated (or retained) labels; otherwise returns the same type as x.

See also

Other recoding: case_when(), if_else(), recode_values(), replace_values(), replace_when()

Examples

library(surveycore)
library(surveytidy)

# create the survey design
ns_wave1_svy <- as_survey_nonprob(ns_wave1, weights = weight)

# basic na_if — replace "Something else" (pid3 == 4) with NA
new <- ns_wave1_svy |>
  mutate(pid3_clean = na_if(pid3, 4)) |>
  select(pid3, pid3_clean)

new
#> 
#> ── Survey Design ───────────────────────────────────────────────────────────────
#> <survey_nonprob> (non-probability) [experimental]
#> Sample size: 6422
#> 
#> # A tibble: 6,422 × 2
#>     pid3 pid3_clean
#>    <dbl>      <dbl>
#>  1     1          1
#>  2     1          1
#>  3     1          1
#>  4     3          3
#>  5     2          2
#>  6     1          1
#>  7     4         NA
#>  8     2          2
#>  9     2          2
#> 10     1          1
#> # ℹ 6,412 more rows
#> 
#>  Design variables preserved but hidden: weight.
#>  Use `print(x, full = TRUE)` to show all variables.

# replace multiple values at once — Independent (3) and "Something else" (4)
new <- ns_wave1_svy |>
  mutate(pid3_2party = na_if(pid3, c(3, 4))) |>
  select(pid3, pid3_2party)

new
#> 
#> ── Survey Design ───────────────────────────────────────────────────────────────
#> <survey_nonprob> (non-probability) [experimental]
#> Sample size: 6422
#> 
#> # A tibble: 6,422 × 2
#>     pid3 pid3_2party
#>    <dbl>       <dbl>
#>  1     1           1
#>  2     1           1
#>  3     1           1
#>  4     3          NA
#>  5     2           2
#>  6     1           1
#>  7     4          NA
#>  8     2           2
#>  9     2           2
#> 10     1           1
#> # ℹ 6,412 more rows
#> 
#>  Design variables preserved but hidden: weight.
#>  Use `print(x, full = TRUE)` to show all variables.

# .update_labels = TRUE (default) drops label entries for NA'd values
new <- ns_wave1_svy |>
  mutate(pid3_clean = na_if(pid3, 4, .update_labels = TRUE)) |>
  select(pid3, pid3_clean)

# "Something else" (4) is removed from pid3_clean's value labels
new@metadata@value_labels$pid3_clean
#>    Democrat  Republican Independent 
#>           1           2           3 

# .update_labels = FALSE retains label entries even for NA'd values
new <- ns_wave1_svy |>
  mutate(pid3_clean = na_if(pid3, 4, .update_labels = FALSE)) |>
  select(pid3, pid3_clean)

# "Something else" (4) is still in pid3_clean's value labels
new@metadata@value_labels$pid3_clean
#>       Democrat     Republican    Independent Something else 
#>              1              2              3              4 

# attach a plain-language description of the transformation
new <- ns_wave1_svy |>
  mutate(
    pid3_clean = na_if(
      pid3,
      4,
      .description = "Set 'Something else' (pid3 == 4) to NA."
    )
  ) |>
  select(pid3, pid3_clean)

new@metadata@transformations
#> $pid3_clean
#> $pid3_clean$fn
#> [1] "na_if"
#> 
#> $pid3_clean$source_cols
#> [1] "pid3"
#> 
#> $pid3_clean$expr
#> [1] "na_if(pid3, 4, .description = \"Set 'Something else' (pid3 == 4) to NA.\")"
#> 
#> $pid3_clean$output_type
#> [1] "vector"
#> 
#> $pid3_clean$description
#> [1] "Set 'Something else' (pid3 == 4) to NA."
#> 
#>