Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Integration Tests #34

Closed
wants to merge 14 commits into from
Closed

Update Integration Tests #34

wants to merge 14 commits into from

Conversation

cassidycodes
Copy link
Collaborator

The integration tests were returning false positives when running locally. This PR updates the integration tests to fail when thresholds are breached. Please see my self-review for an explanation.

working-directory: ./k6/graphql-api
run: |
bundle
bundle exec puma -t 0:1 -p 9291 &
./boot-servers.sh &
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use boot-servers.sh to clean this up and to make local testing simpler.

HIVE_ENABLED=$hive_enabled \
PORT=$port \
LOG_LEVEL=$LOG_LEVEL \
bundle exec puma -C puma.rb | log_with_prefix "$prefix" &
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start a puma server using puma.rb as the config. See notes below about how this is different from before.

Comment on lines +1 to +2
workers 2
threads 2, 2
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this was booted with one worker and one thread. Two workers and two threads lets us run with little more throughput. We are not load testing puma, here. So we may as well boot up a few workers.

Comment on lines +13 to +14
vus: 5,
iterations: 1000,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dialling down the number of VUs (virtual users) so that we don't slam Puma with throughput. But amping up the iterations so we get better data on p95 response times.

Comment on lines 28 to 29
"http_req_duration{hive:enabled}": ["p(95)<15"],
"http_req_duration{hive:disabled}": ["p(95)<15"],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fail if either of these has a p95 > 15 ms.

Comment on lines +45 to +47
check(count, {
"usage-api starts with 0 operations": (count) => count === 0,
});
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New setup with a check to make sure we reset the usage api counter before testing.

Comment on lines +68 to +74
check(res, {
"response body is GraphQL": (res) => res.body.includes("data"),
});
check(res, {
"response body is not a GraphQL error": (res) =>
!res.body.includes("errors"),
});
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two new checks to ensure we get good data back.

Comment on lines 81 to 110
check(count, {
"usage-api received 1000 operations": (count) => count === REQUEST_COUNT,
});
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New check to ensure usage API received 1000 operations.

k6/integration-test.js Outdated Show resolved Hide resolved
@cassidycodes cassidycodes force-pushed the fix-k6-tests branch 2 times, most recently from bdfc9c6 to 6893d38 Compare October 17, 2024 13:18
Repository owner deleted a comment from charlypoly Oct 17, 2024
@@ -33,7 +33,8 @@ class Schema < GraphQL::Schema
GraphQL::Hive,
enabled: ENV["HIVE_ENABLED"] === "true",
endpoint: "localhost",
debug: true,
debug: false,
buffer_size: 1,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a buffer_size of 1 because we have two puma processes running and we want every operation to get sent for the integration test.

Comment on lines 27 to 45
thresholds: {
"http_req_duration{hive:enabled}": [
{
threshold: "p(95)<25",
abortOnFail: true,
},
],
"http_req_duration{hive:disabled}": [
{
threshold: "p(95)<25",
abortOnFail: true,
},
],
checks: [
{
threshold: "rate===1",
abortOnFail: true,
},
],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do a non-zero exit code if any of these fail.

@cassidycodes cassidycodes marked this pull request as ready for review October 17, 2024 15:46
Comment on lines -61 to -70
if (
data.metrics["http_req_duration{hive:enabled}"] &&
data.metrics["http_req_duration{hive:disabled}"]
) {
const withHive =
data.metrics["http_req_duration{hive:enabled}"].values["avg"];
const withoutHive =
data.metrics["http_req_duration{hive:disabled}"].values["avg"];
overheadPercentage = 100 - (withHive * 100.0) / withoutHive;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not accurate. On repeated tests I saw hive:enabled 1-15% faster than hive:disabled. Instead of this check, I added a p95 threshold for both tests of 25ms, which seems to work well on github actions runners.

Comment on lines 28 to 39
"http_req_duration{hive:enabled}": [
{
threshold: "p(95)<25",
abortOnFail: true,
},
],
"http_req_duration{hive:disabled}": [
{
threshold: "p(95)<25",
abortOnFail: true,
},
],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are in lieu of the performance regression check that compared avg response time. 25 ms is set based on a few github actions runs to be sure these thresholds work on the runners provided.

@cassidycodes cassidycodes self-assigned this Oct 17, 2024
@cassidycodes cassidycodes requested a review from rperryng October 17, 2024 17:57
Comment on lines +97 to +106
let count = 0;
for (let i = 0; i < 10; i++) {
const res = http.get("http://localhost:8888/count");
count = JSON.parse(res.body).count;
console.log(`📊 Total operations: ${count}`);
if (count === REQUEST_COUNT) {
break;
}
await sleep(1);
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps overkill, but this is incase one of the threads hasn't finished reporting by the time teardown is run.

Repository owner deleted a comment from charlypoly Oct 18, 2024
@cassidycodes cassidycodes force-pushed the fix-k6-tests branch 2 times, most recently from e837e3d to 4bc9d6b Compare October 18, 2024 13:21
Repository owner deleted a comment from charlypoly Oct 18, 2024
Repository owner deleted a comment from charlypoly Oct 18, 2024
Repository owner deleted a comment from charlypoly Oct 18, 2024
Comment on lines +28 to +40
"http_req_duration{hive:enabled}": [
{
// Set a threshold that works consistently on GitHub Actions runners
threshold: "p(95)<50",
abortOnFail: true,
},
],
"http_req_duration{hive:disabled}": [
{
threshold: "p(95)<50",
abortOnFail: false,
},
],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a few tests I found that a P95 of 50 ms reliably passed on GitHub actions. The test will fail if hive:enabled crosses the threshold.

uses: ruby/setup-ruby@2a9a743e19810b9f3c38060637daf594dbd7b37f
with:
working-directory: ./k6/graphql-api
ruby-version: "3.3"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you mind adding a .ruby-version file instead? so that all our CI / localdev tools willl just rely on a single source of truth for this?

It looks like setting working-directory will dictate where setup-ruby checks for the .ruby-version file, but I believe we should be able to just add a symlink at ./k6/graphql-api/.ruby-version to point to the root .ruby-version file

- name: Use Node
uses: actions/setup-node@v4
with:
node-version: 22
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe same here? just with node-version-file instead?

README.md Outdated Show resolved Hide resolved
| reporting.author | Author of the latest change | Author of the latest change | |
| reporting.commit | Git SHA or any identifier | git sha or any identifier | |
| reporting.service_name | (Optional) Name of the service | | |
| reporting.service_url | (Optional) URL of the service | | |
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using something to generate this table? seems like it would be a bit of a chore to update in the future but nbd

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I converted the former Ruby Hash to this table manually. I thought this was easier to read in the rendered markdown. I can revert this, though, if you prefer the Ruby Hash version.

README.md Outdated
```sh
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
```
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

README.md Show resolved Hide resolved
@@ -2,6 +2,6 @@

module Graphql
module Hive
VERSION = "0.4.3"
VERSION = "0.4.4"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mentioned we didn't want to create a new release for this, so should we maintain the current version number?

cassidycodes and others added 3 commits October 21, 2024 09:41
Co-authored-by: Ryan Perry-Nguyen <[email protected]>
Co-authored-by: Ryan Perry-Nguyen <[email protected]>
This reverts commit cb40842.
@cassidycodes
Copy link
Collaborator Author

Closing this PR as the k6 tests tend to be flaky and aren't the best way of testing either benchmarks in CI or integration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants