Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: make endpoint report the upstream status code #5510

Merged
merged 19 commits into from
Nov 27, 2024

Conversation

btasker
Copy link
Contributor

@btasker btasker commented Nov 21, 2024

endpoint() populates the column _sent based on whether upstream returned a 200, however it doesn't report the status code to the caller.

endpoint is used (amongst other things) in 2.x's Notification Rules - not having the status code available can make it quite difficult to troubleshoot why a notification wasn't sent.

With this change, endpoint will include an additional column: status_code

In the context of a notification rule, this will result in entries in _monitoring/notifications carrying the tag status_code, making it possible to graph out notifications by the upstream status code:

from(bucket: "_monitoring")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "notifications")
  |> filter(fn: (r) => exists r["status_code"])
  |> group(columns: ["check_id", "status_code"])
  |> aggregateWindow(every: v.windowPeriod, fn: count)

Note: the addition of this column could cause issues where the result is written into a bucket with an explicit schema.

However, explicit schemas are only available on Cloud 2 - the query log has been checked for any queries which could result in issues, with none identified.

Checklist

Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.

  • ✏️ Write a PR description, regardless of triviality, to include the value of this PR
  • 🔗 Reference related issues
  • 🏃 Test cases are included to exercise the new code
  • 🧪 If new packages are being introduced to stdlib, link to Working Group discussion notes and ensure it lands under experimental/
  • 📖 If language features are changing, ensure docs/Spec.md has been updated

Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.

@btasker btasker requested review from a team as code owners November 21, 2024 16:25
@btasker btasker requested review from sanderson and removed request for a team November 21, 2024 16:25
@btasker btasker self-assigned this Nov 21, 2024
Copy link
Contributor

@mhilton mhilton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me, but can we just change the standard lib like this?

@sanderson
Copy link
Contributor

To @mhilton's point, we might want to add this to all of the *.endpoint() functions in the stdlib so that they're consistent.

@btasker
Copy link
Contributor Author

btasker commented Nov 21, 2024

Yeah that's a fair point - I did think it'd be worth adding to at least the Slack and PD ones, but hadn't done the same legwork on looking to see whether there are any unusual uses of them.

I don't love changing the return of something quite so core, but this is causing some pain.

The alternatives seem to be

  • Change the notification rule task generator to override endpoint in the tasks it generates - as well as making them all slightly less efficient it'd also be a bit of a pain to do
  • Add a new endpointWithStatus function which duplicates endpoint but includes status (and then update the notification rule generator etc etc).

The latter's doable, but feels a little OTT. On the other hand, it would also be safer.

I'll try have a play around tomorrow

@btasker
Copy link
Contributor Author

btasker commented Nov 26, 2024

we might want to add this to all of the *.endpoint() functions in the stdlib so that they're consistent.

Starting with this - my first run at adding a different function ran into some headaches, which'll only be bigger when moving onto some of the other functions.

Pre-scan Notes

  • pagerduty doesn't need updating, it already includes the status code as _status
  • discord , opsgenie, telegram, teams, webexteams, bigpanda, zenoss, servicenow, victorops, alerta, slack, pushbullet don't have it (the calculation of _sent there is a little different, need to understand why - I think it's just so that they catch any 2xx though)

As pagerduty already has _status, rather than changing that I'll update this PR to also use that and then add it to the others.

@btasker
Copy link
Contributor Author

btasker commented Nov 26, 2024

Looks like _status was added to PD for much the same reasons - #4796 - so, it makes sense to keep things consistent

) / 100,
),
return {r with _status: string(v: response),
_sent: string(v: 2 == response / 100),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious that make fmt didn't collapse this into a single line like it did with the others - that trailing comma maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no, it's line length - I've used a different variable name here. Correcting to make that consistent

@btasker
Copy link
Contributor Author

btasker commented Nov 26, 2024

Note: If the windows build commit above works I'll rebase it out and raise a fresh PR for that, that way we don't lose it if this change ends up having to be reverted for some reason

@btasker
Copy link
Contributor Author

btasker commented Nov 26, 2024

Alright, rebasing that back out and I'll try find time to prod at it separately.

@btasker btasker force-pushed the feat/http_endpoint_status_code branch from f6a5d74 to 1689526 Compare November 26, 2024 13:25
@mhilton
Copy link
Contributor

mhilton commented Nov 26, 2024

This can be merged even if the windows build isn't working. It is unrelated to this PR.

@btasker btasker changed the title feat(http): make endpoint report the upstream status code feat: make endpoint report the upstream status code Nov 26, 2024
stdlib/http/http.flux Outdated Show resolved Hide resolved
@btasker btasker merged commit 041192a into master Nov 27, 2024
8 of 9 checks passed
@btasker btasker deleted the feat/http_endpoint_status_code branch November 27, 2024 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants