Add workflow for longitudinal benchmarking #5205

Merged 6 commits into master on Mar 20, 2023
Conversation

@zeme-iohk commented on Mar 10, 2023

This PR introduces a new file, .github/workflows/new-benchmark.yml, which describes a new workflow for longitudinal benchmarking. It uses github-action-benchmark.

In short, whenever the benchmarks are run, the results are stored and added to the history. The charts will be available at the plutus GitHub Pages URL: https://input-output-hk.github.io/plutus/dev/bench/ (the /dev/bench suffix is configurable). When a regression is detected (the new value is larger than a given percentage of the old value, configurable via the action's alert-threshold field), the workflow job fails and a comment is posted.

I have tested this on my plutus fork.
This is what it looks like with two data points: https://zeme-iohk.github.io/plutus/dev/bench/
This is what it looks like when a regression is detected: zeme-iohk@83e504b#commitcomment-103842408

This is a more complete example: https://benchmark-action.github.io/github-action-benchmark/dev/bench/

What's left to do:

  • Decide when to run the benchmarks (on push to master? on a specific PR label?)
  • Decide which benchmarks to run
  • Format the benchmark output according to what github-action-benchmark expects

One limitation is that the alert-threshold value is shared across all benchmarks.
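For reference, here is a minimal sketch of the relevant step configuration (the step name, chart title, and output file path are illustrative, not the exact contents of new-benchmark.yml); with tool: customSmallerIsBetter the action consumes a JSON array of {name, unit, value} entries:

```yaml
- name: Store benchmark result
  uses: benchmark-action/github-action-benchmark@v1
  with:
    name: Plutus Benchmarks             # chart title (illustrative)
    tool: customSmallerIsBetter         # our own JSON format; lower is better
    output-file-path: output.json       # produced by the benchmark step
    github-token: ${{ secrets.GITHUB_TOKEN }}
    auto-push: true                     # push the updated history to gh-pages
    benchmark-data-dir-path: dev/bench  # the configurable /dev/bench suffix
    alert-threshold: '150%'             # regression = new value > 150% of old
    comment-on-alert: true              # post a commit comment on regression
    fail-on-alert: true                 # and fail the workflow job
```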

@zeme-iohk added the "No Changelog Required" label on Mar 10, 2023
@zeme-iohk requested review from @michaelpj and @zliu41 on March 10, 2023 at 07:38
@zeme-iohk marked this pull request as ready for review on March 10, 2023 at 07:38
@michaelpj (Contributor)

We need to document this as well. I'd like to see some notes on your investigation and what you decided. In particular, where is the data being stored? It's not obvious - can we get it out later? Is it safe?

How does this interact with the stuff that's already getting pushed to GitHub Pages?

@zeme-iohk (Author)

The choice of benchmarking tool was very straightforward, as I could find no valid alternatives to github-action-benchmark.

github-action-benchmark does exactly what we need and more:

  • It is language-agnostic; we just have to format the benchmark data accordingly
  • It is a GitHub action that requires minimal setup
  • It automatically pushes to GitHub Pages, so there is no need for a third-party hosting solution
  • It automatically charts the benchmarking data
  • It can detect performance regressions and send notifications
  • It provides a feature to download the entire benchmarking data history

I also looked at these, among many others:
https://www.tweag.io/blog/2018-12-12-benchgraph/
https://github.com/piotrmurach/rspec-benchmark
https://github.com/wg/wrk
But they are not what we are looking for: those are tools for benchmarking code, not for visualising benchmark data.

Again, the only limitation I could find with github-action-benchmark is the alert-threshold, which is shared across all benchmarks. This can be worked around either by:

  • Creating a separate job for each benchmark, which lets us specify a separate alert-threshold per benchmark (sketched below)
  • Writing a custom script that detects regressions according to our own logic and sends notifications manually
    • This does not interfere with the data visualisation features, which we still get "for free"
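A sketch of the first workaround, where the benchmark names and thresholds are illustrative: a matrix in which each benchmark carries its own alert-threshold:

```yaml
jobs:
  benchmark:
    strategy:
      max-parallel: 1          # serialise jobs so pushes to gh-pages don't race
      matrix:
        include:               # illustrative benchmark names and thresholds
          - bench: validation
            threshold: '105%'
          - bench: nofib
            threshold: '110%'
    runs-on: ubuntu-latest
    steps:
      # ... checkout, run ${{ matrix.bench }}, write output.json ...
      - uses: benchmark-action/github-action-benchmark@v1
        with:
          name: ${{ matrix.bench }}
          tool: customSmallerIsBetter
          output-file-path: output.json
          alert-threshold: ${{ matrix.threshold }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
```

Note that max-parallel: 1 is there because each matrix job pushes to the same gh-pages branch, and concurrent pushes could conflict.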

> In particular, where's the data being stored? It's not obvious - can we get it out later? Is it safe?
> How does this interact with the stuff that's already getting pushed to GitHub Pages?

The data is stored at <github-pages-url>/dev/bench/data.js. The entire history can be downloaded as JSON by clicking the blue "download data as json" button at the bottom left of the page, e.g. https://benchmark-action.github.io/github-action-benchmark/dev/bench/

The frontend code is stored in the /dev/bench directory on the gh-pages branch. That path is configurable, and it's isolated from the stuff that's already getting pushed to GitHub Pages.

@michaelpj (Contributor)

Maybe worth writing up your research somewhere? It might be useful for other people or teams in the future. Could be an ADR?

.github/workflows/new-benchmark.yml (outdated review thread, resolved)
steps:
- uses: actions/checkout@v3

- name: Run benchmark
Contributor:

I think it's worthwhile doing this in its own job so it alone can run on the benchmarking machine, and it's isolated.

Contributor:

Also then we can run this as a matrix job if we want to do multiple benchmarks in the future.
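Something like this sketch, where the self-hosted runner label and the single-entry matrix are assumptions:

```yaml
jobs:
  benchmark:
    runs-on: [self-hosted, benchmark]  # hypothetical dedicated-runner label
    strategy:
      matrix:
        bench: [validation]            # room to add more benchmarks later
    steps:
      - uses: actions/checkout@v3
      - name: Run benchmark
        run: 2>&1 cabal run ${{ matrix.bench }} | tee output.txt
```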

.github/workflows/new-benchmark.yml (four more outdated review threads, resolved)
on:
  push:
    branches:
      - master
Contributor:

I think we might want to set up some ignore paths or something. This is going to be somewhat expensive (30 minutes for the validation benchmark, per commit), so maybe we should try to skip commits that don't change anything?
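For instance, a sketch using paths-ignore, where the ignored paths are guesses that would need discussion:

```yaml
on:
  push:
    branches:
      - master
    paths-ignore:        # skip commits that can't affect benchmark results
      - '**/*.md'
      - 'doc/**'
```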

.github/workflows/new-benchmark.yml (outdated review thread, resolved)
@zeme-iohk force-pushed the zeme-iohk/new-benchmark branch from 26e93d0 to 0bd9101 on March 13, 2023 at 13:32
@@ -1,4 +1,4 @@
-name: New Benchmark
+name: Longitudinal Benchmarking
Contributor:

change the file name too?

@zeme-iohk force-pushed the zeme-iohk/new-benchmark branch from 0bd9101 to ab91a7d on March 18, 2023 at 11:58
@zeme-iohk requested a review from @michaelpj on March 18, 2023 at 12:08
import json
result = []
# $BENCHMARKS is interpolated by the shell before Python runs (heredoc)
for benchmark in "$BENCHMARKS".split():
    with open(f"{benchmark}-output.txt", "r") as file:
Contributor:

You decided not to use criterion's JSON output? Seems like it might have been a bit easier, but not a big deal I guess.

Author:

I tried that, but it turned out to be more complicated to use: the JSON output is somewhat "lower level" (it has no units), and it doesn't actually include the same figures that you get in the standard output. 🤷

# merge stderr into stdout so the full criterion report is captured
2>&1 cabal run "$bench" | tee "$bench-output.txt"
done

read -r -d '' PYTHON_SCRIPT <<- END_SCRIPT
Contributor:

Can we put this in its own file? Makes it a lot easier to test...

Author:

Sure

.github/workflows/longitudinal-benchmark.yml (outdated review thread, resolved)
@zeme-iohk force-pushed the zeme-iohk/new-benchmark branch from 91f4f5b to 47006c6 on March 20, 2023 at 10:23
@michaelpj merged commit 4235904 into master on Mar 20, 2023
@michaelpj deleted the zeme-iohk/new-benchmark branch on March 20, 2023 at 15:57
v0d1ch pushed a commit to v0d1ch/plutus that referenced this pull request on Dec 6, 2024
* New benchmark workflow

* alert threshold

* upd

* Addressed some comments

* WIP

* Applied requested changes