Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal: slightly inaccurate as_join_by() implementation for character #7001

Open
krlmlr opened this issue Mar 8, 2024 · 8 comments
Open

Comments

@krlmlr
Copy link
Member

krlmlr commented Mar 8, 2024

waldo::compare(dplyr:::as_join_by("a"), dplyr::join_by(a == a))
#> `old$exprs[[1]]`: `"a" == "a"`
#> `new$exprs[[1]]`: `a == a`

Created on 2024-03-08 with reprex v2.1.0

I can work around, but I wonder if this may cause some downstream gotchas.

@DavisVaughan
Copy link
Member

We do allow this though:

unclass(dplyr::join_by("a" == "a"))
#> $exprs
#> $exprs[[1]]
#> "a" == "a"
#> 
#> 
#> $condition
#> [1] "=="
#> 
#> $filter
#> [1] "none"
#> 
#> $x
#> [1] "a"
#> 
#> $y
#> [1] "a"

But maybe that should be the "special" case, and coming from character should try to generate the "typical" case of a == a

@krlmlr
Copy link
Member Author

krlmlr commented Mar 8, 2024

What are the semantics of dplyr::join_by("a" == "a") ? Shouldn't the result be identical to dplyr::join_by(a == a) ?

@psychelzh
Copy link

psychelzh commented Mar 20, 2024

Out of curiosity, will as_join_by() be exported? Currently, I found join_by() is kind of hard to program with, for example, if I have stored the column names in a vector. For example:

suppressPackageStartupMessages(library(dplyr))
x <- data.frame(a = 1:5, b = 2:6)
y <- data.frame(a = 1:3, b = c(2, 3, 5), x = 2:4)
cols_by <- c("a", "b")
inner_join(x, y, by = cols_by)
#>   a b x
#> 1 1 2 2
#> 2 2 3 3
# oops
inner_join(x, y, by = join_by(!!cols_by))
#> Error in `join_by()`:
#> ! Each element of `...` must be a single column name or a join by
#>   expression.
#> ✖ Element 1 is not a name and not an expression.
inner_join(x, y, by = dplyr:::as_join_by(cols_by))
#>   a b x
#> 1 1 2 2
#> 2 2 3 3

Created on 2024-03-20 with reprex v2.1.0

@krlmlr
Copy link
Member Author

krlmlr commented Mar 20, 2024

Do you need join_by(!!!syms(cols_by)) ?

@psychelzh
Copy link

psychelzh commented Mar 20, 2024

Ah, thanks! !!! works as expected. And without syms works too (the docs of join_by() tell us these can be quoted or not quoted).

suppressPackageStartupMessages(library(dplyr))
x <- data.frame(a = 1:5, b = 2:6)
y <- data.frame(a = 1:3, b = c(2, 3, 5), x = 2:4)
cols_by <- c("a", "b")
inner_join(x, y, by = join_by(!!!cols_by))
#>   a b x
#> 1 1 2 2
#> 2 2 3 3

Created on 2024-03-20 with reprex v2.1.0

But I do think we do not recommend using !! or !!! a lot, do we? So, I am wondering if there will be some helpers without knowing quasiquotation or unqutation.

@DavisVaughan
Copy link
Member

DavisVaughan commented Mar 21, 2024

if I have stored the column names in a vector

The old method is not going away, so if you have this case then I'd recommend just supplying the character vector as you did before join_by()

But using !!! is also totally fine here, the !!enquo() pattern is replaced by {{ }} but using !!! is still somewhat common https://rlang.r-lib.org/reference/topic-metaprogramming.html


as_join_by() is only for creating equi joins, and is basically a nicely named internal function. I don't intend for us to export it since it has such a limited use case

@DavisVaughan
Copy link
Member

What are the semantics of dplyr::join_by("a" == "a") ? Shouldn't the result be identical to dplyr::join_by(a == a) ?

They are identical yes

@krlmlr
Copy link
Member Author

krlmlr commented Mar 22, 2024

Should we support the invariant that two joins are identical iff their join_by() arguments are identical? I'm still advocating for

identical(dplyr::join_by("a" == "a"), dplyr::join_by(a == a))

because it will simplify code that analyzes join_by objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants