Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pyroscope/scrape): add support for configuring CPU profile's duration scraped by pyroscope.scrape #591

Conversation

hainenber
Copy link
Contributor

PR Description

Which issue(s) this PR fixes

Fixes #195

Notes to the Reviewer

PR Checklist

  • CHANGELOG.md updated
  • Documentation added
  • Tests updated

…ation scraped by `pyroscope.scrape`

Signed-off-by: hainenber <[email protected]>
@clayton-cornell clayton-cornell requested a review from a team April 16, 2024 17:27
@clayton-cornell
Copy link
Contributor

Docs here include suggestions from the Agent PR, so all good from a doc standpoint. Over to the @grafana/grafana-alloy-maintainers team for code review

@clayton-cornell clayton-cornell added the type/docs Docs Squad label across all Grafana Labs repos label Apr 16, 2024
Co-authored-by: Paschalis Tsilias <[email protected]>
Comment on lines 399 to 402
* The default value for the `seconds` query parameter is `scrape_interval - 1`.
If you set `profiling_duration`, then `seconds` is assigned the same value as `profiling_duration`.
For example, if you set `scrape_interval` to `"15s"`, then `seconds` defaults to `14s`.
If you set `profiling_duration` to `16s`, then `seconds` is set to `16s` regardless of the `scrape_interval` value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @simonswine @korniltsev does it make sense for the profiling_duration to ever be larger than the scrape_interval here? Or similar to the Prometheus scrape_timeout it should be always less than the scrape interval?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always be less. If you try to profile a service which is already running a profile, you'll get an error

server returned HTTP status (500) Could not enable CPU profiling: cpu profiling already in use

I think we should either add some validation, or default back to min(scrape_interval - 1, profiling_duration)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense as well, yeah cc @hainenber

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the validation in the HEAD. Looks A-OK to you guys?

Copy link
Member

@tpaschalis tpaschalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is on a good path. I'd like some opinion from the Pyroscope folks on whether we can leave the profiling_value set to any as this would mean that we could effectively be always profiling multiple times in parallel which doesn't make much sense to me.

@rfratto rfratto added the backport-to-agent:no PR should NOT be backported to the agent repo. label Apr 23, 2024
@@ -65,6 +65,7 @@ Name | Type | Description
`params` | `map(list(string))` | A set of query parameters with which the target is scraped. | | no
`scrape_interval` | `duration` | How frequently to scrape the targets of this scrape configuration. | `"15s"` | no
`scrape_timeout` | `duration` | The timeout for scraping targets of this configuration. Must be larger than `scrape_interval`. | `"18s"` | no
`profiling_duration`| `duration` | The duration for a delta profiling to be scraped. Must be larger than 1 second. | `"14s"` | no
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, one last thing. Could we name this with a little more descriptive name?

Suggested change
`profiling_duration`| `duration` | The duration for a delta profiling to be scraped. Must be larger than 1 second. | `"14s"` | no
`delta_profiling_duration`| `duration` | The duration for a delta profiling to be scraped. Must be larger than 1 second. | `"14s"` | no

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this in HEAD. PTAL

Comment on lines 230 to 232
if arg.ProfilingDuration.Seconds() >= arg.ScrapeInterval.Seconds() {
return fmt.Errorf("profiling_duration must be smaller than scrape_interval when using delta profiling")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this condition make sure that DeltaProfilingDuration is not only smaller, but also at least a second smaller?

@simonswine
Copy link
Contributor

Thanks for the contribution, this is a very useful knob to be able to turn 🙂

I am happy with this to go forward, once all the points @tpaschalis has raised are addressed.

@hainenber hainenber requested a review from jdbaldry as a code owner May 1, 2024 10:53
@hainenber
Copy link
Contributor Author

Oops, git rebase sheenanigan 🤦

hainenber added 2 commits May 1, 2024 17:57
…ation scraped by `pyroscope.scrape`

Signed-off-by: hainenber <[email protected]>
…sible `delta_profiling_duration` + stricter validation

Signed-off-by: hainenber <[email protected]>
@hainenber hainenber force-pushed the support-for-user-configuration-of-pyroscope-scrape-seconds-query-params branch from 5be0053 to f3d89f6 Compare May 1, 2024 10:57
@julienvey
Copy link

@tpaschalis @simonswine Anything we can do to help move this forward ? I think @hainenber fixed all the previous comment

Copy link
Member

@tpaschalis tpaschalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for this and apologies for the belated review!

@tpaschalis tpaschalis enabled auto-merge (squash) May 22, 2024 08:28
@tpaschalis tpaschalis merged commit 583ad76 into grafana:main May 22, 2024
15 checks passed
@hainenber hainenber deleted the support-for-user-configuration-of-pyroscope-scrape-seconds-query-params branch May 22, 2024 13:11
@salarali
Copy link

Has this change been released? I don't see it in the docs or the changelogs.

@clayton-cornell
Copy link
Contributor

Has this change been released? I don't see it in the docs or the changelogs.

It is in the next release, so for docs, you find the new content in the "next" version of the pyroscope.scrape component reference documentation: https://grafana.com/docs/alloy/next/reference/components/pyroscope.scrape/#common-configuration When Alloy v1.2 is release, this doc change will show up in "latest".

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
backport-to-agent:no PR should NOT be backported to the agent repo. frozen-due-to-age type/docs Docs Squad label across all Grafana Labs repos
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a configuration option to pyroscope.scrape to define for how long a CPU profile should run
7 participants