Understand where styler spends most of the time #759
The recursive calls are time consuming, yes. In particular, there are two important recursions:
I can't remember exactly, but I think they take up a similar amount of time. Recently, two important speed improvements were made (speed change on unseen example code, benchmarked with the {here} package). #739 removed unnecessary sorting in

That said, I think your proposed approach sounds interesting. I can confirm that it's hard for me to understand where the bottlenecks are. I have already spent quite some time optimising R code, though maybe not very systematically. And I think the cost/benefit ratio is too high for me now; I think time is better spent on parallelisation at the moment. The latest release (1.4.0) introduced speed improvements of about 25%+ with the few changes described in the

I am not familiar enough with {later} to comment on whether or not we should use it. I just think moving large parts of the code base into C++ would force me to learn C++, which I want to avoid xD.
@lorenzwalthert If this is still up for grabs, I would be happy to work on trying to improve performance using C++ (via
LOL, I have the opposite problem: I have learned a fair bit of C++ but haven't found a "real" project to contribute to. I should mention, though, that I'd be slow to implement this because I can work on FOSS projects only during free time, and not during office hours. Lemme know if that works for you, and I can get started by creating a fork. :)
Thanks @IndrajeetPatil, I really appreciate your offer. Using C++ code in {styler} is quite a big decision that comes with a number of implications, so I think this needs to be well thought out. As @krlmlr said, we first need to identify the bottlenecks before starting to implement C++ functionality. Also, I would not be able to review your pull requests, so we'd have to ask @krlmlr to do that. Would you have time to review these PRs @krlmlr? That being said, would you want to start with the profiling as Kirill suggested?
On it!
Yes, I would really appreciate it if he could do that! :)
Thanks. I agree profiling should be the first step. There are four steps:
Optimizing the third part will be hardest. Do we have a good understanding of which parts take how long for a typical file? Of course, caching should be off in all experiments.
I agree @krlmlr, and I added the names of the relevant functions to your post. In addition, I want to highlight that styling rules that we know do not affect the code are removed before all transformers are applied. Which tokens affect which rules is specified in
Thank you both for the pointers on what might be the best places to start digging into this. Will keep you posted as I go through them.
I took another look at styling performance. With the two new PRs combined, and when replacing

For a small-ish file from this repo,

For easier profiling, it seems useful to specifically isolate the time spent styling from the overhead, by temporarily injecting calls to

By and large, a whole lot of time seems to be spent in
Current state of affairs: https://rpubs.com/krlmlr/styler-prof-1. This requires a cleanup of

Surprisingly, I didn't have to convert pipes (magrittr -> native) to get this.
With #980 and #982 we require only a cleanup of

```r
text <- readLines("~/git/R/styler/R/nest.R")
styler::cache_deactivate()
#> Deactivated cache.
Rprof(filter.callframes = TRUE, line.profiling = FALSE)
for (i in 1:10) styler::style_text(text)
#> Warning: Invalid stylerignore sequences found, potentially ignoring some of the markers set.
#> See `help("stylerignore", "styler")`.
Rprof(NULL)
prof <- profile::read_rprof("Rprof.out")
prof$samples$locations <- lapply(prof$samples$locations, vctrs::vec_unique)
profile::write_rprof(prof, "Rprof-unique.out")
profvis::profvis(prof_input = "Rprof-unique.out")
```

Created on 2022-08-21 by the reprex package (v2.0.1)
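The `vec_unique()` call in the snippet above collapses repeated frames within a sample. The related idea from this issue, erasing specific functions from each sample's call stack, could be sketched along the same lines. This is only a sketch under assumptions: the exact column layout of the object returned by `profile::read_rprof()` should be checked with `str(prof)`, and the function names in `noise` are purely illustrative, not styler internals.

```r
prof <- profile::read_rprof("Rprof.out")

# Look up ids for the frames we want to erase (names are illustrative).
# Assumes a `functions` table with `function_id` and `name` columns.
noise <- c("lapply", "Reduce")
noise_ids <- prof$functions$function_id[prof$functions$name %in% noise]

# Drop those frames from every sample, flattening the stacks.
prof$samples$locations <- lapply(
  prof$samples$locations,
  function(loc) loc[!loc$function_id %in% noise_ids, , drop = FALSE]
)

profile::write_rprof(prof, "Rprof-flat.out")
```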
The profiling with

https://rpubs.com/krlmlr/styler-prof-3

After merging all relevant PRs, I believe a speed improvement by a factor of 2 is achievable. Perhaps more, perhaps a bit less.
I think this approach is definitely worth a try:

```r
library(tibble)

df1 <- iris
df2 <- df3 <- as_tibble(iris)

bench::mark(
  "data.frame" = df1$x <- "x",
  "tibble" = df2$x <- "x",
  "unclass tibble" = {
    df3 <- unclass(df3)
    df3$x <- "x"
    new_tibble(df3)
  },
  check = FALSE
)[1:5]
#> # A tibble: 3 × 5
#>   expression          min   median `itr/sec` mem_alloc
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>
#> 1 data.frame       2.79µs   3.28µs   284932.    1.22KB
#> 2 tibble          19.35µs  20.54µs    47633.  165.32KB
#> 3 unclass tibble   4.55µs   5.04µs   193397.        0B
```

Created on 2022-09-19 with reprex v2.0.2
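The unclass-then-restore pattern from the benchmark could be wrapped in a small helper so hot loops avoid the tibble `$<-` dispatch. A hypothetical sketch (`set_col_fast` is not part of styler or tibble):

```r
# Hypothetical helper: drop the tibble class before a column write,
# then restore it, skipping the S3 dispatch cost of `$<-.tbl_df`.
set_col_fast <- function(tbl, name, value) {
  lst <- unclass(tbl)
  lst[[name]] <- value
  tibble::new_tibble(lst, nrow = nrow(tbl))
}

df <- tibble::as_tibble(iris)
df <- set_col_fast(df, "x", "x")
```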
Ok, that is indeed a big difference. Can't we fix that at the root and make tibble assignment faster?
@krlmlr?
Another change I am going to make is using

```r
mark <- 25

bench::mark(
  "if + else" = if (mark >= 40) {
    "pass"
  } else {
    "fail"
  },
  "ifelse" = ifelse(mark >= 40, "pass", "fail"),
  "dplyr-ifelse" = dplyr::if_else(mark >= 40, "pass", "fail"),
  time_unit = "ns"
)
#> # A tibble: 3 × 6
#>   expression     min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>   <dbl>  <dbl>     <dbl> <bch:byt>    <dbl>
#> 1 if + else       0    41.0 19230132.        0B      0
#> 2 ifelse        533.  615.   1501476.        0B      0
#> 3 dplyr-ifelse 8405. 9102.    107454.     8.7MB     86.0
```

Created on 2022-09-20 with reprex v2.0.2
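One caveat worth keeping in mind when making this swap: `if`/`else` only handles a length-one condition, while `ifelse()` is vectorized, so the replacement is only safe where the condition is known to be scalar, as in the benchmark above.

```r
# Scalar condition: if/else works and is the fast path.
grade_one <- function(mark) if (mark >= 40) "pass" else "fail"
grade_one(25)                        # "fail"

# Vector condition: ifelse() is still needed here.
marks <- c(25, 55)
ifelse(marks >= 40, "pass", "fail")  # "fail" "pass"
```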
I'd only go for

We want tibble subsetting to be fast, but it's a larger project. We can:
A quick heads-up: I tested with tidyverse/tibble#1469, which reverts to regular data.frame subsetting.
Also, there are major obstacles to making tibble subsetting as fast as data frame subsetting for the

Suggested course of action:

🐌 -> 🏎️
If you're doing a lot of in-place overwrites and it's a bottleneck, it may be worth considering
A move to data.table seems like a paradigm shift -- from immutable data to one overarching mutable "parse data" structure. To me, this looks like a major refactoring, touching all styling functions, with uncertain benefits. What's the estimated effort for moving to data.table, compared to using data.frames and doing more efficient nesting + unnesting?
That's likely true :) And ah, I didn't read the above closely enough. I thought tidyverse/tibble#1469 was a working branch already shifting styler to data.frames. I think it'll be easier to go from data.frames-as-backend to data.tables-as-backend.
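The paradigm shift discussed here comes down to copy-on-modify versus modification by reference. A minimal sketch of the difference; the `token`/`spaces` columns only loosely imitate a parse-data shape and are illustrative:

```r
library(data.table)

pd_df <- data.frame(token = c("IF", "'('"), spaces = c(1L, 0L))
pd_dt <- as.data.table(pd_df)

pd_df$spaces <- pd_df$spaces + 1L  # copy-on-modify: allocates a modified copy
pd_dt[, spaces := spaces + 1L]     # `:=` modifies the table in place, no copy
```

The second form avoids allocations in hot loops, but it is exactly the mutable-state model the comment above flags as a major refactoring.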
FWIW: I am working a lot with R6 classes, and it seems these are especially problematic for styler's speed, I assume because there are so many nested functions / objects.
This is crucial for #558.
Because we use mostly recursive calls, and effectively each function can call every other function, I suspect it is difficult to see where exactly styler spends most of its time, and which parts should perhaps be optimized. Can you confirm that?
To work around this, I propose processing the `.Rprof` files created from a profiler run. If we erase the right places from the call stack for each sample, we might just end up with a clean flat structure that immediately shows where the time is spent. What do you think?

Alternatively, we could (temporarily) replace the recursive calls with `later::later()` or replace recursion with a queue and a `while {}` loop. Editing the call stack sounds like less work.
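The queue alternative mentioned above can be sketched generically. This is only an illustration of the technique: `visit` and the `children` field stand in for styler's real transformer calls and nested parse data.

```r
# Sketch: replace a recursive walk over a nested structure with an
# explicit queue and a while loop, so no call stack builds up.
walk_queue <- function(root, visit) {
  queue <- list(root)
  while (length(queue) > 0) {
    node <- queue[[1]]
    queue <- queue[-1]
    visit(node)
    # c() with NULL children adds nothing, so leaves need no special case
    queue <- c(queue, node$children)
  }
}
```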