A colleague came to ask for help today with some code that worked perfectly in base R, but failed with an impenetrable error when run through the tidyverse.
Here is a minimal example of the setup: a tibble with a column called .data.
# A tibble: 5 × 2
a .data
<chr> <dbl>
1 A -0.700
2 B 0.892
3 C 1.50
4 D 0.731
5 E -0.440
.data is a legal name for a column according to R’s rules for naming objects, and everything appears to be fine.
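The code that built this tibble is not reproduced in the post as it stands. A minimal sketch that gives the same shape, assuming five labelled rows and random doubles (so the values will not match the printed ones), might be:

```r
library(tibble)

# Assumed reconstruction: letters A to E plus five random draws in `.data`.
# Built via data.frame() and converted, which is one way to get the column in.
df <- as_tibble(data.frame(a = LETTERS[1:5], .data = rnorm(5)))

# Base R agrees that `.data` is a syntactically valid name:
# make.names() returns valid names unchanged
make.names(".data") == ".data"
```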
What could possibly go wrong? Let’s try filtering the column .data to keep just the positive values, first with base R.
df[df$.data > 0, ]
# A tibble: 3 × 2
a .data
<chr> <dbl>
1 B 0.892
2 C 1.50
3 D 0.731
Perfect. And now with dplyr::filter():
df |> filter(.data > 0)
Error in `filter()`:
ℹ In argument: `.data > 0`.
Caused by error:
! 'list' object cannot be coerced to type 'double'
List object cannot be coerced to type double? But the column .data is already a double.
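It took me a while to work out what was going on, but eventually I remembered that .data is a pronoun in the tidyverse (see ?rlang::.data), used mainly when writing functions with tidyverse tools. Inside filter(), the name .data picks up that pronoun object rather than the column, hence the complaint about a list. Changing the column name to data fixed the problem.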
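As a sketch of that fix, the column can be renamed with base R before any tidyverse verb touches it. The values here are illustrative stand-ins chosen by me, not the ones from my colleague's data:

```r
library(dplyr)

# Illustrative data with the troublesome column name
df <- tibble::as_tibble(
  data.frame(a = LETTERS[1:5], .data = c(-0.7, 0.9, 1.5, 0.7, -0.4))
)

# Rename with base R, so no tidyverse code has to refer to `.data`
names(df)[names(df) == ".data"] <- "data"

# `data` now resolves to the column inside filter()
positive <- df |> filter(data > 0)
positive
```

With these values, the rows labelled B, C and D are kept.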
So what does .data do? Consider this code:
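The snippet itself is missing from the post as it stands. A sketch consistent with the printed output and the examples that follow, assuming a three-row tibble and a variable b holding the word "fish", might be:

```r
library(dplyr)

df <- tibble(a = letters[1:3], b = 1:3)  # a column called b
b <- "fish"                              # and a variable called b in the environment

# Which b does mutate() use?
df <- df |> mutate(c = b)
```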
What will column c contain? The values 1 to 3 from the column b, or the word “fish” from the variable b in the environment? Let’s have a peek.
df
# A tibble: 3 × 3
a b c
<chr> <int> <int>
1 a 1 1
2 b 2 2
3 c 3 3
It took the values from the column. If we wanted to be explicit, we could write:
# take b from the column
df |> mutate(c = .data$b)
# A tibble: 3 × 3
a b c
<chr> <int> <int>
1 a 1 1
2 b 2 2
3 c 3 3
# take b from the environment with the brace-brace operator
df |> mutate(c = {{b}})
# A tibble: 3 × 3
a b c
<chr> <int> <chr>
1 a 1 fish
2 b 2 fish
3 c 3 fish
# take b from the environment with the .env pronoun
df |> mutate(c = .env$b)
# A tibble: 3 × 3
a b c
<chr> <int> <chr>
1 a 1 fish
2 b 2 fish
3 c 3 fish
It is also useful to use the .data pronoun when writing packages; otherwise R CMD check issues notes about bare column names, which it takes for undefined global variables.
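A minimal sketch of how this looks inside a function. The function names positive_b and keep_above are hypothetical, and the roxygen tag assumes the package imports .data from rlang:

```r
library(dplyr)

#' Keep rows with a positive value in column b (hypothetical helper)
#' @importFrom rlang .data
positive_b <- function(df) {
  # `.data$b` rather than bare `b`, so the code checker sees an imported
  # object instead of an undefined global variable
  filter(df, .data$b > 0)
}

# Hypothetical helper: the column is chosen by name, the threshold is
# taken explicitly from the function's environment
keep_above <- function(df, col, threshold) {
  filter(df, .data[[col]] > .env$threshold)
}

df <- tibble(a = letters[1:3], b = c(-1L, 2L, 3L))
positive_b(df)
keep_above(df, "b", 2)
```

In a script the pronouns need no import; the tag only matters once the function lives in a package.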
In short, while .data and .env are legal names, they break tidyverse code, so don’t call data frame columns .data or .env if you ever want to use tidyverse functions.
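If a data set arrives from elsewhere, a quick check for these names before reaching for dplyr can save some head-scratching. A small sketch, with reserved_cols being a hypothetical helper of my own rather than anything from the tidyverse:

```r
# Hypothetical helper: report column names that collide with the pronouns
reserved_cols <- function(df) {
  intersect(names(df), c(".data", ".env"))
}

df <- data.frame(a = 1:2, .data = c(0.5, -0.5))
reserved_cols(df)
```

An empty result means none of the columns share a name with the two pronouns.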