Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC: Cost Attribution Proposal 1.2 #9733

Draft
wants to merge 34 commits into
base: main
Choose a base branch
from
Draft

Conversation

ying-jeanne
Copy link
Contributor

@ying-jeanne ying-jeanne commented Oct 24, 2024

What this PR does

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.


// we need the time stamp, since active series could have entered active stripe long time ago, and already evicted
// from the observed map but still in the active Stripe
func (t *Tracker) DecrementActiveSeries(lbs labels.Labels, value int64, ts time.Time) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we might still want ts, when we decrement a active series counter that are "active" we may want to keep the timestamp updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we want to update the timestamp of something that didn't have activity?

cmd/mimir/help-all.txt.tmpl Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Show resolved Hide resolved
pkg/ingester/ingester.go Outdated Show resolved Hide resolved
pkg/ingester/user_tsdb.go Outdated Show resolved Hide resolved
pkg/mimir/mimir.go Outdated Show resolved Hide resolved
pkg/mimir/mimir.go Outdated Show resolved Hide resolved
pkg/util/validation/limits.go Outdated Show resolved Hide resolved
@colega
Copy link
Contributor

colega commented Oct 25, 2024

I reviewed mostly the existing code changes, skipping the costattribution package implementation for now.

Please ping me back when this is addressed.

Meanwhile, while I didn't look into Tracker implementation, since it's a hot path, I wonder if we could replace the prometheus.Gauge implementation by a custom map. Of course we'll need to benchmark this.

"go.uber.org/atomic"
)

type Tracker interface {
Copy link
Contributor

@colega colega Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need an interface here, it just makes things more complex and slow. Now we can't compare two trackers by just comparing the pointers, and we expose too many internal details on exported method just to make that comparison.

There are only two implementations of Tracker: the one that does things and the one that is disabled, we can make the disabled one be the nil pointer of TrackerImpl (please rename that back to Tracker).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the interface, and it checks for nil pointer in the receiver instead. commit 7f0b372

Comment on lines 129 to 134
func (c *ActiveSeries) ConfigDiffers(ctCfg asmodel.CustomTrackersConfig, caCfg costattribution.Tracker) bool {
if ctCfg.String() != c.CurrentConfig().String() {
return true
}
return !areTrackersEqual(caCfg, c.CurrentCostAttributionTracker())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have ConfigDiffers now, we don't need CurrentConfig or CurrentCostAttributionTracker methods, and we don't need to take mutex twice for them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ying-jeanne
Copy link
Contributor Author

ying-jeanne commented Nov 19, 2024

Based on the benchmark results, the successfully pushed case shows a significant increase in allocations, which is expected due to tracking counters per cost label value, previously there is only one counter for received sample tracking. In discarded cases, differences are minimal since counters per reason were already tracked, with a notable percentage of memory consumption. The additional allocations primarily stem from tracking cost attribution-specific data.

BenchmarkDistributor_Push/all_samples_successfully_pushed-8      10           2432680 ns/op          212628 B/op         98 allocs/op
BenchmarkDistributor_Push/all_samples_successfully_pushed_with_cost_attribution_enabled-8    10           4021089 ns/op          287027 B/op        3110 allocs/op
BenchmarkDistributor_Push/too_many_labels_limit_reached-8          10          12637491 ns/op         1218316 B/op        5069 allocs/op
BenchmarkDistributor_Push/too_many_labels_limit_reached_with_cost_attribution_enabled-8         10           8159270 ns/op         1351149 B/op        9080 allocs/op
BenchmarkDistributor_Push/all_samples_go_to_metric_relabel_configs-8         10           4029890 ns/op         1414630 B/op       7099 allocs/op
BenchmarkDistributor_Push/max_label_value_length_limit_reached-8           10           7136839 ns/op         1234532 B/op        6069 allocs/op
BenchmarkDistributor_Push/timestamp_too_new-8          10           2085542 ns/op          345832 B/op        4069 allocs/op
BenchmarkDistributor_Push/ingestion_rate_limit_reached-8           10           1061883 ns/op           21572 B/op          59 allocs/op
BenchmarkDistributor_Push/max_label_name_length_limit_reached-8        10           4094097 ns/op         1150353 B/op        5067 allocs/op

@ying-jeanne
Copy link
Contributor Author

The result of ingester push benchmark, the impact is less compare to distributor.

BenchmarkIngesterPush/ingester_push_succeeded-8                       10             49926 ns/op            9944 B/op         37 allocs/op
BenchmarkIngesterPush/ingester_push_succeeded_with_cost_attribution_enabled-8                 10             52132 ns/op           10514 B/op         51 allocs/op

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants