Avoid as_tibble() in group_data.tbl_df() #6736

Merged

@DavisVaughan DavisVaughan commented Feb 17, 2023

This PR also breaks `group_data.data.frame()` into multiple lines.

`group_data()` is called by `compute_by()` in the common case of an ungrouped data frame with no `.by` argument, so it is worth making it faster here:

library(dplyr)

df <- tibble(x = 1:5)

bench::mark(group_data(df))

# Main
#> # A tibble: 1 × 6
#>   expression          min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 group_data(df)    155µs    173µs     5235.     179KB     8.19

# This PR
#> # A tibble: 1 × 6
#>   expression          min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 group_data(df)   22.5µs   24.8µs    37684.     105KB     11.3

Created on 2023-02-17 with reprex v2.0.2.9000
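
The diff itself is not shown here, so the following is only a rough sketch of the general idea the title points at: generic coercion with `as_tibble()` validates its input, while a low-level constructor mostly just sets the class and attributes. It assumes `tibble::new_tibble()` as the cheaper replacement. The helper `make_rows()` and the one-row `.rows` layout are illustrative assumptions, not code taken from dplyr.

```r
library(tibble)
library(vctrs)

# Hypothetical helper: a single group that contains every row index.
make_rows <- function(n) {
  new_list_of(list(seq_len(n)), ptype = integer())
}

cols <- list(.rows = make_rows(5L))

# Generic coercion: checks and repairs names, sizes, and column types.
slow <- as_tibble(cols)

# Low-level constructor: trusts the caller to supply valid columns
# and the correct number of rows.
fast <- new_tibble(cols, nrow = 1L)
```

Both objects are one-row tibbles with a `.rows` list-column. The constructor skips the checks that `as_tibble()` performs, which is safe only when the input is already known to be well formed.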

@DavisVaughan DavisVaughan merged commit 597390a into tidyverse:main Feb 17, 2023
@DavisVaughan DavisVaughan deleted the feature/faster-group-data-tbl branch February 17, 2023 15:35