Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pyroscope/scrape): add support for configuring CPU profile's duration scraped by pyroscope.scrape #591

Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@ Main (unreleased)
- In `prometheus.exporter.kafka`, the interpolation table used to compute estimated lag metrics is now pruned
on `metadata_refresh_interval` instead of `prune_interval_seconds`. (@wildum)

- Add support for configuring CPU profile's duration scraped by `pyroscope.scrape`. (@hainenber)

### Bugfixes

- Fixed issue with defaults for Beyla component not being applied correctly. (marctc)
Expand Down
8 changes: 6 additions & 2 deletions docs/sources/reference/components/pyroscope.scrape.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ Name | Type | Description
`params` | `map(list(string))` | A set of query parameters with which the target is scraped. | | no
`scrape_interval` | `duration` | How frequently to scrape the targets of this scrape configuration. | `"15s"` | no
`scrape_timeout` | `duration` | The timeout for scraping targets of this configuration. Must be larger than `scrape_interval`. | `"18s"` | no
`profiling_duration`| `duration` | The duration for a delta profiling to be scraped. Must be larger than 1 second. | `"14s"` | no
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, one last thing. Could we name this with a little more descriptive name?

Suggested change
`profiling_duration`| `duration` | The duration for a delta profiling to be scraped. Must be larger than 1 second. | `"14s"` | no
`delta_profiling_duration`| `duration` | The duration for a delta profiling to be scraped. Must be larger than 1 second. | `"14s"` | no

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this in HEAD. PTAL

`scheme` | `string` | The URL scheme with which to fetch metrics from targets. | `"http"` | no
`bearer_token_file` | `string` | File containing a bearer token to authenticate with. | | no
`bearer_token` | `secret` | Bearer token to authenticate with. | | no
Expand Down Expand Up @@ -395,8 +396,11 @@ When the `delta` argument is `false`, the [pprof][] HTTP query will be instantan
When the `delta` argument is `true`:
* The [pprof][] HTTP query will run for a certain amount of time.
* A `seconds` parameter is automatically added to the HTTP request.
* The `seconds` used will be equal to `scrape_interval - 1`.
For example, if `scrape_interval` is `"15s"`, `seconds` will be 14 seconds.
* The default value for the `seconds` query parameter is `scrape_interval - 1`.
If you set `profiling_duration`, then `seconds` is assigned the same value as `profiling_duration`.
However, the `profiling_duration` cannot be larger than `scrape_interval`.
For example, if you set `scrape_interval` to `"15s"`, then `seconds` defaults to `14s`
If you set `profiling_duration` to `16s`, then `scrape_interval` must be set to at least `17s`.
If the HTTP endpoint is `/debug/pprof/profile`, then the HTTP query will become `/debug/pprof/profile?seconds=14`

## Exported fields
Expand Down
41 changes: 27 additions & 14 deletions internal/component/pyroscope/scrape/scrape.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,15 +21,17 @@ import (
)

const (
pprofMemory string = "memory"
pprofBlock string = "block"
pprofGoroutine string = "goroutine"
pprofMutex string = "mutex"
pprofProcessCPU string = "process_cpu"
pprofFgprof string = "fgprof"
pprofGoDeltaProfMemory string = "godeltaprof_memory"
pprofGoDeltaProfBlock string = "godeltaprof_block"
pprofGoDeltaProfMutex string = "godeltaprof_mutex"
pprofMemory string = "memory"
pprofBlock string = "block"
pprofGoroutine string = "goroutine"
pprofMutex string = "mutex"
pprofProcessCPU string = "process_cpu"
pprofFgprof string = "fgprof"
pprofGoDeltaProfMemory string = "godeltaprof_memory"
pprofGoDeltaProfBlock string = "godeltaprof_block"
pprofGoDeltaProfMutex string = "godeltaprof_mutex"
defaultScrapeInterval time.Duration = 15 * time.Second
defaultProfilingDuration time.Duration = defaultScrapeInterval - 1*time.Second
)

func init() {
Expand Down Expand Up @@ -60,6 +62,8 @@ type Arguments struct {
ScrapeTimeout time.Duration `alloy:"scrape_timeout,attr,optional"`
// The URL scheme with which to fetch metrics from targets.
Scheme string `alloy:"scheme,attr,optional"`
// The duration for a profile to be scrapped.
ProfilingDuration time.Duration `alloy:"profiling_duration,attr,optional"`

// todo(ctovena): add support for limits.
// // An uncompressed response body larger than this many bytes will cause the
Expand Down Expand Up @@ -192,11 +196,12 @@ var DefaultArguments = NewDefaultArguments()
// NewDefaultArguments create the default settings for a scrape job.
func NewDefaultArguments() Arguments {
return Arguments{
Scheme: "http",
HTTPClientConfig: component_config.DefaultHTTPClientConfig,
ScrapeInterval: 15 * time.Second,
ScrapeTimeout: 10 * time.Second,
ProfilingConfig: DefaultProfilingConfig,
Scheme: "http",
HTTPClientConfig: component_config.DefaultHTTPClientConfig,
ScrapeInterval: 15 * time.Second,
ScrapeTimeout: 10 * time.Second,
ProfilingConfig: DefaultProfilingConfig,
ProfilingDuration: defaultProfilingDuration,
}
}

Expand All @@ -218,6 +223,14 @@ func (arg *Arguments) Validate() error {
if target.Enabled && target.Delta && arg.ScrapeInterval.Seconds() < 2 {
return fmt.Errorf("scrape_interval must be at least 2 seconds when using delta profiling")
}
if target.Enabled && target.Delta {
if arg.ProfilingDuration.Seconds() <= 1 {
return fmt.Errorf("profiling_duration must be larger than 1 second when using delta profiling")
}
if arg.ProfilingDuration.Seconds() >= arg.ScrapeInterval.Seconds() {
return fmt.Errorf("profiling_duration must be smaller than scrape_interval when using delta profiling")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this condition make sure that DeltaProfilingDuration is not only smaller, but also at least a second smaller?

}
}

// We must explicitly Validate because HTTPClientConfig is squashed and it won't run otherwise
Expand Down
23 changes: 22 additions & 1 deletion internal/component/pyroscope/scrape/scrape_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,26 @@ func TestUnmarshalConfig(t *testing.T) {
`,
expectedErr: "scrape_interval must be at least 2 seconds when using delta profiling",
},
"invalid cpu profiling_duration": {
in: `
targets = []
forward_to = null
scrape_timeout = "1s"
scrape_interval = "10s"
profiling_duration = "1s"
`,
expectedErr: "profiling_duration must be larger than 1 second when using delta profiling",
},
"erroneous cpu profiling_duration": {
in: `
targets = []
forward_to = null
scrape_timeout = "1s"
scrape_interval = "10s"
profiling_duration = "12s"
`,
expectedErr: "profiling_duration must be smaller than scrape_interval when using delta profiling",
},
"allow short scrape_intervals without delta": {
in: `
targets = []
Expand All @@ -175,7 +195,8 @@ func TestUnmarshalConfig(t *testing.T) {
targets = []
forward_to = null
scrape_timeout = "5s"
scrape_interval = "2s"
scrape_interval = "3s"
profiling_duration = "2s"
bearer_token = "token"
bearer_token_file = "/path/to/file.token"
`,
Expand Down
6 changes: 5 additions & 1 deletion internal/component/pyroscope/scrape/target.go
Original file line number Diff line number Diff line change
Expand Up @@ -406,7 +406,11 @@ func targetsFromGroup(group *targetgroup.Group, cfg Arguments, targetTypes map[s
}

if pcfg, found := targetTypes[profType]; found && pcfg.Delta {
params.Add("seconds", strconv.Itoa(int((cfg.ScrapeInterval)/time.Second)-1))
seconds := (cfg.ScrapeInterval)/time.Second - 1
if cfg.ProfilingDuration != defaultProfilingDuration {
seconds = (cfg.ProfilingDuration) / time.Second
}
params.Add("seconds", strconv.Itoa(int(seconds)))
}
targets = append(targets, NewTarget(lbls, origLabels, params))
}
Expand Down
63 changes: 63 additions & 0 deletions internal/component/pyroscope/scrape/target_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import (
"net/url"
"sort"
"testing"
"time"

"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/discovery/targetgroup"
Expand Down Expand Up @@ -236,3 +237,65 @@ func Test_targetsFromGroup(t *testing.T) {
require.Equal(t, expected, active)
require.Empty(t, dropped)
}

func Test_targetsFromGroup_withSpecifiedProfilingDuration(t *testing.T) {
args := NewDefaultArguments()
args.ProfilingConfig.Block.Enabled = false
args.ProfilingConfig.Goroutine.Enabled = false
args.ProfilingConfig.Mutex.Enabled = false
args.ProfilingDuration = 20 * time.Second

active, dropped, err := targetsFromGroup(&targetgroup.Group{
Targets: []model.LabelSet{
{model.AddressLabel: "localhost:9090"},
},
Labels: model.LabelSet{
"foo": "bar",
},
}, args, args.ProfilingConfig.AllTargets())
expected := []*Target{
// unspecified
NewTarget(
labels.FromMap(map[string]string{
model.AddressLabel: "localhost:9090",
serviceNameLabel: "unspecified",
model.MetricNameLabel: pprofMemory,
ProfilePath: "/debug/pprof/allocs",
model.SchemeLabel: "http",
"foo": "bar",
"instance": "localhost:9090",
}),
labels.FromMap(map[string]string{
model.AddressLabel: "localhost:9090",
model.MetricNameLabel: pprofMemory,
ProfilePath: "/debug/pprof/allocs",
model.SchemeLabel: "http",
"foo": "bar",
}),
url.Values{}),
NewTarget(
labels.FromMap(map[string]string{
model.AddressLabel: "localhost:9090",
serviceNameLabel: "unspecified",
model.MetricNameLabel: pprofProcessCPU,
ProfilePath: "/debug/pprof/profile",
model.SchemeLabel: "http",
"foo": "bar",
"instance": "localhost:9090",
}),
labels.FromMap(map[string]string{
model.AddressLabel: "localhost:9090",
model.MetricNameLabel: pprofProcessCPU,
ProfilePath: "/debug/pprof/profile",
model.SchemeLabel: "http",
"foo": "bar",
}),
url.Values{"seconds": []string{"20"}}),
}

require.NoError(t, err)
sort.Sort(Targets(active))
sort.Sort(Targets(expected))
require.Equal(t, expected, active)
require.Empty(t, dropped)
}
Loading