Add workflow for longitudinal benchmarking #5205
Conversation
We need to document this also. I'd like to see some notes on your investigation and what you decided. In particular: where's the data being stored? It's not obvious. Can we get it out later? Is it safe? How does this interact with the stuff that's already getting pushed to GitHub Pages?
The choice of benchmarking tool was very straightforward, as I could find no valid alternatives to github-action-benchmark. I've also looked at a few other options; again, the only limitation I could find with github-action-benchmark is the shared alert-threshold mentioned in the description. The data is stored in the same GitHub Pages deployment that serves the charts linked in the description, and the frontend code is stored alongside it.
Maybe worth writing up your research somewhere? It might be useful for other people or teams in the future. Could be an ADR?
.github/workflows/new-benchmark.yml
Outdated
steps:
  - uses: actions/[email protected]

  - name: Run benchmark
I think it's worthwhile doing this in its own job so it alone can run on the benchmarking machine, and it's isolated.
Also then we can run this as a matrix job if we want to do multiple benchmarks in the future.
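To illustrate the suggestion, here is a minimal sketch of a separate matrix job, assuming a dedicated self-hosted runner; the runner labels, benchmark names, and run command below are placeholders rather than the actual workflow contents:

jobs:
  benchmark:
    # Hypothetical label for the dedicated benchmarking machine.
    runs-on: [self-hosted, benchmark]
    strategy:
      matrix:
        # Placeholder list; more benchmarks could be added later.
        benchmark: [validation]
    steps:
      - uses: actions/checkout@v3
      - name: Run benchmark
        # Illustrative command; the workflow runs the benchmarks via cabal.
        run: cabal run ${{ matrix.benchmark }}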
.github/workflows/new-benchmark.yml
Outdated
on:
  push:
    branches:
      - master
I think we might want to set up some ignore paths or something. This is going to be somewhat expensive (30 minutes for the validation benchmark, per commit), so maybe we should try to skip commits that don't change anything?
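One way to do that, sketched under the assumption that documentation-only changes never affect the benchmarks (the exact paths are guesses, not part of this PR), is a paths-ignore filter on the push trigger:

on:
  push:
    branches:
      - master
    # Hypothetical ignore list; adjust to directories that can't affect benchmark results.
    paths-ignore:
      - 'doc/**'
      - '**/*.md'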
Force-pushed from 26e93d0 to 0bd9101
.github/workflows/new-benchmark.yml
Outdated
@@ -1,4 +1,4 @@
-name: New Benchmark
+name: Longitudinal Benchmarking
change the file name too?
Force-pushed from 0bd9101 to ab91a7d
import json
result = []
for benchmark in "$BENCHMARKS".split():
    with open(f"{benchmark}-output.txt", "r") as file:
You decided not to use criterion's JSON output? Seems like it might have been a bit easier, but not a big deal I guess.
I tried that, but it turned out to be more complicated to use: the JSON output is somewhat "lower level" (no unit), and it doesn't actually include the same figures that you get in the standard output. 🤷
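For context, criterion's default textual report includes a "mean" line with a value and a unit; below is a rough sketch of pulling those out. The sample report, regular expression, and JSON shape are illustrative assumptions, not the script committed in this PR.

import re

# Hypothetical excerpt of a criterion report, as captured in "$bench-output.txt" by tee.
report = """
benchmarking validation/example-case
time                 1.201 ms   (1.190 ms .. 1.215 ms)
mean                 1.234 ms   (1.220 ms .. 1.250 ms)
"""

# Extract the value and unit from the "mean" line.
match = re.search(r"^mean\s+([\d.]+)\s+(\S+)", report, re.MULTILINE)
if match:
    value, unit = float(match.group(1)), match.group(2)
    # github-action-benchmark's "customSmallerIsBetter" tool accepts entries of
    # this shape (an assumption about which tool the workflow is configured with).
    print({"name": "validation/example-case", "unit": unit, "value": value})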
  2>&1 cabal run "$bench" | tee "$bench-output.txt"
done

read -r -d '' PYTHON_SCRIPT <<- END_SCRIPT
Can we put this in its own file? Makes it a lot easier to test...
Sure
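A sketch of what the workflow step could look like once the heredoc moves into its own file; the script path and output file name are hypothetical:

      - name: Format benchmark output
        # Hypothetical file name and path; the Python heredoc body would move here.
        run: python ./scripts/format-benchmark-output.py > output.json

Moving it out of the heredoc would also mean reading BENCHMARKS via os.environ in the script instead of relying on shell expansion inside the heredoc.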
Force-pushed from 91f4f5b to 47006c6
* New benchmark workflow
* alert threshold
* upd
* Addressed some comments
* WIP
* Applied requested changes
This PR introduces a new file, .github/workflows/new-benchmark.yml, which describes a new workflow for longitudinal benchmarking. It uses github-action-benchmark.
In short, whenever the benchmarks are run, the results are stored and added to the history.
The charts will be available at the plutus GitHub Pages URL:
https://input-output-hk.github.io/plutus/dev/bench/
The /dev/bench suffix is configurable.
When a regression is detected (the new value is larger than a given percentage of the old value, configurable via the action's alert-threshold field), the workflow job fails and a comment is posted.
I have tested this on my plutus fork.
This is what it looks like with two data points: https://zeme-iohk.github.io/plutus/dev/bench/
This is what it looks like when a regression is detected: zeme-iohk@83e504b#commitcomment-103842408
This is a more complete example: https://benchmark-action.github.io/github-action-benchmark/dev/bench/
What's left to do:
github-action-benchmark
One limitation is that the alert-threshold value is shared across all benchmarks.
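For reference, here is a hedged sketch of how the github-action-benchmark step might be configured; the tool choice, file names, and threshold value below are assumptions for illustration, not a copy of the committed workflow:

      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'customSmallerIsBetter'          # assumed tool for the converted criterion output
          output-file-path: output.json          # hypothetical file produced by the formatting step
          github-token: ${{ secrets.GITHUB_TOKEN }}
          # Charts are published to GitHub Pages under the configurable /dev/bench suffix.
          benchmark-data-dir-path: dev/bench
          auto-push: true
          # Single threshold shared by all benchmarks; a regression fails the job and posts a comment.
          alert-threshold: '150%'                # placeholder value
          comment-on-alert: true
          fail-on-alert: true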