-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace tibbles with data frames to improve performance #1007
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1007 +/- ##
=======================================
Coverage 91.14% 91.14%
=======================================
Files 46 46
Lines 2664 2665 +1
=======================================
+ Hits 2428 2429 +1
Misses 236 236
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 7c691c9 is merged into main:
Further explanation regarding interpretation and methodology can be found in the documentation. |
@lorenzwalthert, @krlmlr That's quite the bump in performance when switching to data frames instead of tibbles as our data structure of choice! 😮 |
Wow yes @IndrajeetPatil and without much code change even! Well done. Now only hurdle is to make it pass on old releases... |
wow, quite impressive speed improvement per LoC change!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice speedup! Can we encapsulate the choice of data structure in helper functions? We could add e.g. new_styler_df()
and styler_df()
that use vctrs::new_data_frame()
and vctrs::data_frame()
under the hood. This means that we could later change the underlying data structure with lesser effort.
Do we still need to import tibble?
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 63a27d0 is merged into main:
Further explanation regarding interpretation and methodology can be found in the documentation. |
Good idea. Done!
Only for |
Let's keep the |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if a0d6c20 is merged into main:
Further explanation regarding interpretation and methodology can be found in the documentation. |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if fa98f9c is merged into main:
Further explanation regarding interpretation and methodology can be found in the documentation. |
Also, it seems removing {tibble} does not reduce recursive dependencies: library(magrittr)
deps <- desc::desc_get_deps() %>%
dplyr::filter(type == 'Imports') %>%
dplyr::pull(package)
recursive_deps_before <- purrr::map(deps, ~names(renv:::renv_package_dependencies(.x))) %>%
unlist() %>%
unique()
deps_without_tibble <- setdiff(deps, 'tibble')
recursive_deps_after <- purrr::map(deps_without_tibble, ~names(renv:::renv_package_dependencies(.x))) %>%
unlist() %>%
unique()
waldo::compare(recursive_deps_before, recursive_deps_after)
#> ✔ No differences Created on 2022-09-27 by the reprex package (v2.0.1) This is because we use {rematch2} (in one place only, can be worked around probably some how), which in turn depends on {tibble}. That {tibble} dependency was suggested to be removed in r-lib/rematch2#14, where @krlmlr was not all in for the suggested implementation. With the additional development that happened over the last 2 years and more recursive dependencies added to tibble, I think it would be even more beneficial to remove that dependency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job.
I think this is a big enough improvement to consider creating a new CRAN release? |
* Get rid of unnecessary `.name_repair` arg This is generating warnings. Follow-up on #1007 * make the wrapper even thinner
Any thoughts, @lorenzwalthert and @krlmlr? We also need to get rid of |
Yes, I agree. Do you want to m make a PR to |
I already bumped the version recently and tried to organise the news items a bit. |
@lorenzwalthert I can wait! :) |
#759 (comment)
Need to use continuous benchmarking, so not converting this to a draft.