`cross_join()` in place of `by = character()` #6604

DavisVaughan · 2022-12-13T21:55:30Z

Rationale for adding an explicit cross_join() function in favor of the existing by = character() method:

I think by = character() is fairly unintuitive, and I’ve seen 2 cases recently where people didn’t even know it existed
I think by = character() should really be an error. I think its current behavior of doing a cross join is actually a little scary. Like, supplying c("x", "y") matches on two columns, supplying "x" matches on one column, but character() all of the sudden matches all rows? Seems very odd to me.
Importantly, currently only inner_join(by = character()) is a true cross join. SQL cross joins must result in nrow(x) * nrow(y) number of rows, and only the inner cross join does this in the edge case where one of the inputs has 0 rows (i.e. the left join would retain all x rows). I know this is edge case behavior, but the weirdness here feels worth resolving by just adding cross_join()
I think it would be more straightforward for dbplyr and friends to convert cross_join() to SQL. Converting left_join(by = character()) to SQL is probably quite odd.
It would be nice that cross_join() wouldn’t have a by argument at all, making it very clear it doesn’t have anything to do with columns
We could document cross_join() on its own help page, raising awareness of it
join_by() wouldn’t have to support cross joins, which it currently does by allowing empty join_by() calls to be treated like cross joins. I could turn that into an error for 1.1.0 if we also added cross_join() at the same time, which would be much nicer than having to deprecate it slowly in the future.

The text was updated successfully, but these errors were encountered:

DavisVaughan added feature a feature request or enhancement tables 🧮 joins and set operations labels Dec 13, 2022

DavisVaughan added this to the 1.1.0 milestone Dec 13, 2022

DavisVaughan self-assigned this Dec 16, 2022

DavisVaughan mentioned this issue Dec 16, 2022

Implement cross_join() #6612

Merged

DavisVaughan closed this as completed in #6612 Dec 17, 2022

steven-buckley mentioned this issue Apr 3, 2023

37 fix union all nhsbsa-data-analytics/personMatchR#38

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`cross_join()` in place of `by = character()` #6604

`cross_join()` in place of `by = character()` #6604

DavisVaughan commented Dec 13, 2022

cross_join() in place of by = character() #6604

cross_join() in place of by = character() #6604

Comments

DavisVaughan commented Dec 13, 2022

`cross_join()` in place of `by = character()` #6604

`cross_join()` in place of `by = character()` #6604