Add workflow for longitudinal benchmarking #5205
Conversation
We need to document this also. I'd like to see some notes on your investigation and what you decided. In particular: where's the data being stored? It's not obvious. Can we get it out later? Is it safe? How does this interact with the stuff that's already getting pushed to GitHub Pages?
The choice of benchmarking tool was very straightforward, as I could find no valid alternatives to github-action-benchmark. I've also looked at a few other options; again, the only limitation I could find with github-action-benchmark is the shared alert-threshold mentioned in the description. The data is stored in the same GitHub Pages deployment that serves the charts linked in the description, and the frontend code is stored alongside it.
Maybe worth writing up your research somewhere? It might be useful for other people or teams in the future. Could be an ADR?
.github/workflows/new-benchmark.yml
Outdated
steps:
  - uses: actions/[email protected]

  - name: Run benchmark
I think it's worthwhile doing this in its own job so it alone can run on the benchmarking machine, and it's isolated.
Also then we can run this as a matrix job if we want to do multiple benchmarks in the future.
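To illustrate the suggestion, here is a minimal sketch of a separate matrix job, assuming a dedicated self-hosted runner; the runner labels, benchmark names, and run command below are placeholders rather than the actual workflow contents:

jobs:
  benchmark:
    # Hypothetical label for the dedicated benchmarking machine.
    runs-on: [self-hosted, benchmark]
    strategy:
      matrix:
        # Placeholder list; more benchmarks could be added later.
        benchmark: [validation]
    steps:
      - uses: actions/checkout@v3
      - name: Run benchmark
        # Illustrative command; the workflow runs the benchmarks via cabal.
        run: cabal run ${{ matrix.benchmark }}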
.github/workflows/new-benchmark.yml
Outdated
on:
  push:
    branches:
      - master
I think we might want to set up some ignore paths or something. This is going to be somewhat expensive (30 minutes for the validation benchmark, per commit), so maybe we should try to skip commits that don't change anything?
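One way to do that, sketched under the assumption that documentation-only changes never affect the benchmarks (the exact paths are guesses, not part of this PR), is a paths-ignore filter on the push trigger:

on:
  push:
    branches:
      - master
    # Hypothetical ignore list; adjust to directories that can't affect benchmark results.
    paths-ignore:
      - 'doc/**'
      - '**/*.md'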
Force-pushed from 26e93d0 to 0bd9101
.github/workflows/new-benchmark.yml
Outdated
@@ -1,4 +1,4 @@
-name: New Benchmark
+name: Longitudinal Benchmarking
change the file name too?
Force-pushed from 0bd9101 to ab91a7d
import json
result = []
for benchmark in "$BENCHMARKS".split():
    with open(f"{benchmark}-output.txt", "r") as file:
You decided not to use criterion's JSON output? Seems like it might have been a bit easier, but not a big deal I guess.
I tried that, but it turned out to be more complicated to use: the JSON output is somewhat "lower level" (no unit), and it doesn't actually include the same figures that you get in the standard output. 🤷
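For context, criterion's default textual report includes a "mean" line with a value and a unit; below is a rough sketch of pulling those out. The sample report, regular expression, and JSON shape are illustrative assumptions, not the script committed in this PR.

import re

# Hypothetical excerpt of a criterion report, as captured in "$bench-output.txt" by tee.
report = """
benchmarking validation/example-case
time                 1.201 ms   (1.190 ms .. 1.215 ms)
mean                 1.234 ms   (1.220 ms .. 1.250 ms)
"""

# Extract the value and unit from the "mean" line.
match = re.search(r"^mean\s+([\d.]+)\s+(\S+)", report, re.MULTILINE)
if match:
    value, unit = float(match.group(1)), match.group(2)
    # github-action-benchmark's "customSmallerIsBetter" tool accepts entries of
    # this shape (an assumption about which tool the workflow is configured with).
    print({"name": "validation/example-case", "unit": unit, "value": value})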
  2>&1 cabal run "$bench" | tee "$bench-output.txt"
done

read -r -d '' PYTHON_SCRIPT <<- END_SCRIPT
Can we put this in its own file? Makes it a lot easier to test...
Sure
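A sketch of what the workflow step could look like once the heredoc moves into its own file; the script path and output file name are hypothetical:

      - name: Format benchmark output
        # Hypothetical file name and path; the Python heredoc body would move here.
        run: python ./scripts/format-benchmark-output.py > output.json

Moving it out of the heredoc would also mean reading BENCHMARKS via os.environ in the script instead of relying on shell expansion inside the heredoc.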
Force-pushed from 91f4f5b to 47006c6
* New benchmark workflow
* alert threshold
* upd
* Addressed some comments
* WIP
* Applied requested changes
This PR introduces a new file, .github/workflows/new-benchmark.yml, which describes a new workflow for longitudinal benchmarking. It uses github-action-benchmark.
In short, whenever the benchmarks are run, the results are stored and added to the history.
The charts will be available at the plutus GitHub Pages URL:
https://input-output-hk.github.io/plutus/dev/bench/
The /dev/bench suffix is configurable.
When a regression is detected (the new value is larger than a given percentage of the old value, configurable via the action's alert-threshold field), the workflow job fails and a comment is posted.
I have tested this on my plutus fork.
This is what it looks like with two data points: https://zeme-iohk.github.io/plutus/dev/bench/
This is what it looks like when a regression is detected: zeme-iohk@83e504b#commitcomment-103842408
This is a more complete example: https://benchmark-action.github.io/github-action-benchmark/dev/bench/
What's left to do:
github-action-benchmark
One limitation is that the alert-threshold value is shared across all benchmarks.
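For reference, here is a hedged sketch of how the github-action-benchmark step might be configured; the tool choice, file names, and threshold value below are assumptions for illustration, not a copy of the committed workflow:

      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'customSmallerIsBetter'          # assumed tool for the converted criterion output
          output-file-path: output.json          # hypothetical file produced by the formatting step
          github-token: ${{ secrets.GITHUB_TOKEN }}
          # Charts are published to GitHub Pages under the configurable /dev/bench suffix.
          benchmark-data-dir-path: dev/bench
          auto-push: true
          # Single threshold shared by all benchmarks; a regression fails the job and posts a comment.
          alert-threshold: '150%'                # placeholder value
          comment-on-alert: true
          fail-on-alert: true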