Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaxDataPoints consolidation improvements: support nudging for consistent bucketing #842

Merged
merged 2 commits into from
Sep 12, 2024

Conversation

npazosmendez
Copy link
Collaborator

@npazosmendez npazosmendez commented Sep 6, 2024

Brings in these features: grafana#146


NudgeStartTimeOnAggregation

I've described the purpose of this in the code. Here's a copy of the relevant parts:

// NudgeStartTimeOnAggregation enables nudging the start time of metrics
// when aggregated. The start time is nudged in such way that timestamps
// always fall in the same bucket. This is done by GraphiteWeb, and is
// useful to avoid jitter in graphs when refreshing the page.

[ ... ]

// nudgePointsCount returns the number of points to discard at the beginning of
// the series when aggregating. This is done if NudgeStartTimeOnAggregation is
// enabled, and has the purpose of assigning timestamps of a series to buckets
// consistently across different time ranges. To simplifiy the aggregation
// logic, we discard points at the beginning of the series so that a bucket
// starts right at the beginning. This function calculates how many points to
// discard.

TLDR: if the consolidation is done naively (as it is being done now) , the same data point can be assigned to different buckets depending on the start time of the series. This can cause jitter in graphs when refreshing constantly.

This is done by graphite web here: https://github.com/graphite-project/graphite-web/blob/dca59dc72ae28ffdae659232c10c60aa598536eb/webapp/graphite/render/views.py#L225

and by Metrictank here: https://github.com/grafana/metrictank/blob/309ba74749c4fd8c60950929ff0719af8abc0799/consolidation/consolidate.go#L26

My implementation and tests are mainly inspired on Metrictank.

UseBucketsHighestTimestampOnAggregation

Additionally, UseBucketsHighestTimestampOnAggregation let's the user configure the bucket's timestimes in such way that the upper bound is used instead of the lower bound. This was also done by Metrictank, and its purpose is to avoid results from appearing to predict the future (i.e. a spike in "real time" ts=6 could be moved to a ts=4).

Both of these features are disabled by default for now to avoid unexpected changes on upgrades.


Some videos showing this:

CarbonAPI with nudging disabled

carbonapi_no_nudge_.mp4

Notice the sudden change when the values change buckets

CarbonAPI with nudging enabled

carbonapi_nudge.mp4

Metrictank's version (which was my ground truth here)

mt_nudge.mp4

@npazosmendez npazosmendez marked this pull request as ready for review September 6, 2024 17:29
@Civil Civil added this pull request to the merge queue Sep 12, 2024
Merged via the queue into main with commit bec3bf8 Sep 12, 2024
6 checks passed
@Civil
Copy link
Member

Civil commented Oct 23, 2024

I'll flip those two flags to "true" by default in master, as overall idea is to be as compatible with graphite-web as feasible out of the box.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants