Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-52526: rate limit event updates #6126

Merged
merged 7 commits into from
Dec 12, 2024

Conversation

edwintorok
Copy link
Contributor

We generate O(N^2) events when we update O(N) fields: each field update generates an event including the entire object, even if later we are going to change other fields of the same object.

Instead of returning the individual field update events immediately (and generating a storm of events whenever an API client watcher for VM power events), we batch these event updates by introducing a minimum amount of time that successive Event.from need to have between them.
(The client is working as expected here: when it gets an event and processes it, it immediately calls Event.from to get more events)

Although this doesn't guarantee to eliminate the O(N^2) problem, in practice it reduces the overhead significantly.

There is one case where we do want almost immediately notification of updates: task completions (because then the client likely wants to send us more tasks).

This PR makes the already existing rate limiting in Xapi_event consistent and configurable, but doesn't yet introduce a batching delay for Event.from (it does for Event.next, which is deprecated). A separate PR (or config change) can then enable this for testing purposes, but also allows us to roll the change back by changing the tunable in the config file.

There is also a new microbenchmark introduced here, I'll need to update that with the latest results.

@edwintorok edwintorok force-pushed the pr/CP-52526 branch 2 times, most recently from fc7f473 to 4e9d46c Compare November 21, 2024 18:07
ocaml/xapi/xapi_event.ml Outdated Show resolved Hide resolved
@edwintorok edwintorok marked this pull request as draft November 25, 2024 17:32
@edwintorok edwintorok force-pushed the pr/CP-52526 branch 3 times, most recently from 3bf2c7b to 4c2710a Compare December 10, 2024 20:39
ocaml/xapi-aux/throttle.ml Outdated Show resolved Hide resolved
@edwintorok edwintorok force-pushed the pr/CP-52526 branch 5 times, most recently from 5565ca5 to c2bcad3 Compare December 12, 2024 10:40
edwintorok added a commit that referenced this pull request Dec 12, 2024
Target: feature/perf

Note that this conflicts with
7e9310f?diff=unified&w=1,
but is a pre-requisite of
#6126
@edwintorok edwintorok force-pushed the pr/CP-52526 branch 2 times, most recently from 8824391 to 8bf9a0d Compare December 12, 2024 13:49
…like Event.{from,next}

No functional change

Signed-off-by: Edwin Török <[email protected]>
…ursive

No functional change.

Signed-off-by: Edwin Török <[email protected]>
Event.next is deprecated, but was allowed to use a lot more CPU in a tight loop
than Event.from.

No feature flag for this one, because Event.next is deprecated.

Signed-off-by: Edwin Török <[email protected]>
…are for task specific delays

Tasks are very small, and we need to react to them more quickly,
there is usually nothing to batch.

Refactor code and prepare for using different delays for tasks.
The delays are now configurable, but their default values are exactly the same as before.

Also in the future we should probably use monotonic clocks here, but I've kep t that code unchanged for now.

No functional change (except for configurability).

Signed-off-by: Edwin Török <[email protected]>
No feature flag, because this is a deprecated API.
Clients that wants the best performance should've used Event.from.

Signed-off-by: Edwin Török <[email protected]>
This delay was right after we waited for a new event, delaying all event
responses by 50ms (including task completions).
Eliminate the first delay, so that if we find the event we're looking after the
DB update, then we can return immediately.

On spurious wakeups (e.g. not the event we subscribed for) the delay is still
useful, so keep it for recursive calls after the first one, and exponentially
increase it up to the configured maximum.

No feature flag, this is a relatively small change, and we use exponential backoffs
elsewhere in XAPI already.

Signed-off-by: Edwin Török <[email protected]>
@edwintorok edwintorok marked this pull request as ready for review December 12, 2024 13:53
Give an opportunity for more fields to be filled, e.g. when waiting for a task
to complete, give a chance for the task to actually run.

No feature flag, it only changes timing.

Signed-off-by: Edwin Török <[email protected]>
@edwintorok edwintorok merged commit 8a427b9 into xapi-project:feature/perf Dec 12, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants