Remove in memory sink from fanout and add latency sampling methods #28

Merged
merged 2 commits into from
Oct 11, 2023
2 changes: 1 addition & 1 deletion store/iavl/store.go
@@ -202,7 +202,7 @@ func (st *Store) Set(key, value []byte) {

 // Implements types.KVStore.
 func (st *Store) Get(key []byte) []byte {
-	defer telemetry.MeasureSince(time.Now(), "store", "iavl", "get")
+	defer telemetry.MeasureSinceWithSampling(time.Now(), 0.01, "store", "iavl", "get")
 	value, err := st.tree.Get(key)
 	if err != nil {
 		panic(err)
4 changes: 3 additions & 1 deletion telemetry/metrics.go
@@ -98,7 +98,7 @@ func New(cfg Config) (_ *Metrics, rerr error) {
 	}()

 	m := &Metrics{memSink: memSink}

Don't create the fanout sink; you should be able to use a single MetricSink variable and assign either promSink or memSink to it.
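
A minimal sketch of what this suggestion could look like, assuming exactly one sink is ever active. To stay self-contained it uses a stand-in `Sink` interface and stand-in sink types rather than the real go-metrics `MetricSink`; `chooseSink` and its parameter are hypothetical names, not code from this PR.

```go
package main

import "fmt"

// Sink is a stand-in for the go-metrics MetricSink interface; only the
// part needed for this illustration is modelled.
type Sink interface {
	Name() string
}

// inmemSink and promSink are stand-ins for the in-memory and Prometheus sinks.
type inmemSink struct{}

func (inmemSink) Name() string { return "inmem" }

type promSink struct{}

func (promSink) Name() string { return "prometheus" }

// chooseSink (hypothetical) returns exactly one sink: the Prometheus sink
// when a retention time is configured, otherwise the in-memory sink.
func chooseSink(prometheusRetentionTime int64) Sink {
	if prometheusRetentionTime > 0 {
		return promSink{}
	}
	return inmemSink{}
}

func main() {
	fmt.Println(chooseSink(60).Name())
	fmt.Println(chooseSink(0).Name())
}
```

With this shape the chosen sink would be passed directly to the global metrics constructor, and no fanout slice is needed.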

-	fanout := metrics.FanoutSink{memSink}
+	fanout := metrics.FanoutSink{}

 	if cfg.PrometheusRetentionTime > 0 {
 		m.prometheusEnabled = true
@@ -112,6 +112,8 @@ func New(cfg Config) (_ *Metrics, rerr error) {
 		}

 		fanout = append(fanout, promSink)
+	} else {
+		fanout = append(fanout, memSink)
 	}

 	if _, err := metrics.NewGlobal(metricsConf, fanout); err != nil {
21 changes: 21 additions & 0 deletions telemetry/wrapper.go
@@ -1,6 +1,7 @@
 package telemetry

 import (
+	"math/rand"
 	"time"

 	"github.com/armon/go-metrics"
@@ -19,6 +20,18 @@ func NewLabel(name, value string) metrics.Label {
 	return metrics.Label{Name: name, Value: value}
 }

+// ModuleMeasureSinceWithSampling samples latency metrics given the sample rate.

Update the comment to state that sampleRate should be in the range [0, 1.0).
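
To illustrate why the bounds matter: Go's `math/rand` `Float64` returns a value in the half-open interval [0.0, 1.0), so with the `rand.Float64() < sampleRate` check a rate of 0 or below never samples and a rate of 1 or above always samples. The sketch below demonstrates that behaviour; `shouldSample` and `countSampled` are hypothetical helpers, not part of this PR.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shouldSample (hypothetical) applies the same check as the PR.
// sampleRate is the probability that an event is recorded:
// values <= 0 never sample, values >= 1 always sample.
func shouldSample(sampleRate float64) bool {
	return rand.Float64() < sampleRate
}

// countSampled calls shouldSample n times and returns how many calls
// reported true.
func countSampled(n int, sampleRate float64) int {
	sampled := 0
	for i := 0; i < n; i++ {
		if shouldSample(sampleRate) {
			sampled++
		}
	}
	return sampled
}

func main() {
	fmt.Println(countSampled(1000, 0))    // never samples
	fmt.Println(countSampled(1000, 1))    // always samples
	fmt.Println(countSampled(1000, 0.01)) // varies from run to run
}
```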

+// This is intended to be used in hot code paths.
+func ModuleMeasureSinceWithSampling(module string, start time.Time, sampleRate float64, keys ...string) {
+	if rand.Float64() < sampleRate {

Sampling based on a probabilistic rate has a number of issues:

  • It usually requires a lot of tuning over time.
  • You can't tell whether the telemetry is skewed by a low frequency of events or by a bad random number sequence.

Sampling per N units of time makes a lot more sense; for example, only allow this metric to be sampled once every second. It would also be good to scale the sample correctly based on how many times the event has occurred, but that seems like it would require changing the go-metrics package to accept bulk updates for samples.

If you still want to go down this path, it would make sense to use a faster random number generator such as https://github.com/flyingmutant/rand, and to perform the sampling check before computing the time, which would require the sampling to happen in the caller's method (which is annoying without macros).
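
The time-based alternative described above could be sketched roughly as follows. `intervalSampler` and its methods are hypothetical and not part of this PR or of go-metrics; the sketch only shows the "at most one sample per interval" idea and does not attempt the bulk-update scaling mentioned above.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// intervalSampler is a hypothetical helper that admits at most one
// sample per interval.
type intervalSampler struct {
	next     int64 // earliest Unix time (ns) of the next allowed sample; accessed atomically
	interval int64 // minimum spacing between samples, in nanoseconds
}

func newIntervalSampler(interval time.Duration) *intervalSampler {
	return &intervalSampler{interval: int64(interval)}
}

// allowAtNanos reports whether an event observed at the given Unix time
// (in nanoseconds) should be sampled.
func (s *intervalSampler) allowAtNanos(now int64) bool {
	next := atomic.LoadInt64(&s.next)
	if now < next {
		return false
	}
	// When several goroutines race past the check, only one wins the swap.
	return atomic.CompareAndSwapInt64(&s.next, next, now+s.interval)
}

// allow is the wall-clock convenience form of allowAtNanos.
func (s *intervalSampler) allow() bool {
	return s.allowAtNanos(time.Now().UnixNano())
}

func main() {
	s := newIntervalSampler(time.Second)
	sampled := 0
	for i := 0; i < 1000; i++ {
		if s.allow() {
			sampled++
		}
	}
	// The loop normally finishes well within a second, so only the
	// first call is expected to be sampled.
	fmt.Println("sampled:", sampled)
}
```

Note that this check reads the clock on every call, so it bounds the number of recorded samples but does not by itself remove the cost of `time.Now()` on the hot path.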

As discussed on Slack, we will go as is.

We should switch to float32, as we don't need the extra precision that float64 provides and generating 32 random bits takes less effort.

Author

it actually takes more effort because Float32 internally calls Float64 😅

That is terrible. Likely why other rand libraries are so common in golang.

It would be nice to move the sampling check outside of the method so that we don't have to pay the cost of defer or time.Now(). That would also make this method pointless.
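
A rough sketch of caller-side sampling, assuming the caller owns the sampling decision. `get`, `measureSince`, and `recorded` are hypothetical stand-ins (for a hot-path method such as the store's `Get`, and for the telemetry call) so that the example runs on its own.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// recorded collects measured durations; it stands in for the metrics
// sink and is not safe for concurrent use.
var recorded []time.Duration

// measureSince is a stand-in for telemetry.MeasureSince.
func measureSince(start time.Time) {
	recorded = append(recorded, time.Since(start))
}

// get mimics a hot-path method. The sampling decision is made first, so
// calls that are not sampled skip both time.Now() and the defer.
func get(sampleRate float64) {
	if rand.Float64() < sampleRate {
		defer measureSince(time.Now())
	}
	// ... the actual work of the method would go here ...
}

func main() {
	for i := 0; i < 1000; i++ {
		get(0.01)
	}
	fmt.Println("recorded samples:", len(recorded)) // varies from run to run
}
```

The deferred call still runs when `get` returns, because `defer` is scoped to the enclosing function rather than to the `if` block.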

Author

yeah maybe we should just do it that way. let me revert

+		metrics.MeasureSinceWithLabels(
+			keys,
+			start.UTC(),
+			append([]metrics.Label{NewLabel(MetricLabelNameModule, module)}, globalLabels...),

Should we add a label that states the sampling rate?
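
One possible shape for such a label, as a sketch only. The label name `sample_rate` and the helper `sampleRateLabel` are assumptions, and `Label` is redeclared locally with the same `Name`/`Value` fields as `metrics.Label` to keep the example self-contained.

```go
package main

import (
	"fmt"
	"strconv"
)

// Label mirrors the Name/Value shape of metrics.Label in go-metrics; it
// is redeclared here only so the sketch compiles on its own.
type Label struct {
	Name  string
	Value string
}

// sampleRateLabel is a hypothetical helper; the label name "sample_rate"
// is an assumption, not something this PR defines.
func sampleRateLabel(sampleRate float64) Label {
	return Label{
		Name:  "sample_rate",
		Value: strconv.FormatFloat(sampleRate, 'f', -1, 64),
	}
}

func main() {
	l := sampleRateLabel(0.01)
	fmt.Println(l.Name + "=" + l.Value) // prints "sample_rate=0.01"
}
```

Recording the rate alongside the measurement would let a consumer of the metric tell sampled series apart from unsampled ones.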

+		)
+	}
+}

// ModuleMeasureSince provides a short hand method for emitting a time measure
// metric for a module with a given set of keys. If any global labels are defined,
// they will be added to the module label.
@@ -70,3 +83,11 @@ func SetGaugeWithLabels(keys []string, val float32, labels []metrics.Label) {
 func MeasureSince(start time.Time, keys ...string) {
 	metrics.MeasureSinceWithLabels(keys, start.UTC(), globalLabels)
 }

+// MeasureSinceWithSampling provides wrapper functionality for emitting a time measure
+// metric with sampling.

Ditto on comment.

+func MeasureSinceWithSampling(start time.Time, sampleRate float64, keys ...string) {
+	if rand.Float64() < sampleRate {
+		metrics.MeasureSinceWithLabels(keys, start.UTC(), globalLabels)
+	}
+}