
[weekly-r317] PostingsForLabelMatching: check for waiting writers #760

Conversation

@colega (Contributor) commented Nov 20, 2024

Cherry-pick of #757

There's still some contention on MemPostings.mtx. I checked the goroutines
from a profile taken at a moment of spiking mutex wait time, filtered by
`MemPostings`, and all I see is:

- 4 goroutines waiting on `p.mtx.RLock()` in `LabelValues`
- 203 goroutines waiting on `p.mtx.RLock()` in `Get`
- 6 goroutines waiting on `p.mtx.Lock()` in `Add`
- 4 goroutines waiting on first `p.mtx.RLock` in `PostingsForLabelMatching`
- 2 goroutines in `for _, v := range vals` loop in `PostingsForLabelMatching`

So, what I understand is going on here:
- The last two goroutines were able to enter the `PostingsForLabelMatching`
  second mutex, so they're creating ListPostings as if there were no
  tomorrow (I'll send a separate PR to Prometheus to avoid creating one
  pointer for each one there)
- The six goroutines waiting on `Add` are blocking the rest of the 200+
  read goroutines.

So, here we have 2 read goroutines indirectly blocking 200+ read
goroutines (plus the six writers, if anyone is counting).

What this change proposes is to check every few iterations (1024,
because I expect this loop to run quite fast) whether there's a Lock()
waiting. If there is one, we stop, unlock, and let it proceed: since a
writer is already queued, holding on is pure selfishness, as no other
reader can get in here anyway.

Signed-off-by: Oleg Zaytsev <[email protected]>
@colega colega marked this pull request as draft November 20, 2024 16:39
@colega colega closed this Nov 26, 2024