-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize TransportNodesAction to selectively avoid sending list of Di… #14531
Optimize TransportNodesAction to selectively avoid sending list of Di… #14531
Conversation
❌ Gradle check result for 9e66ed2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
} | ||
|
||
void start() { | ||
final DiscoveryNode[] nodes = request.concreteNodes(); | ||
logger.info(request.getClass().getSimpleName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would lead to heavy logging. Please remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure Vikas, I intended to work on the logs for testing. Will remove them prior to raising the PR.
@@ -234,14 +266,28 @@ class AsyncAction { | |||
this.request = request; | |||
this.listener = listener; | |||
if (request.concreteNodes() == null) { | |||
resolveRequest(request, clusterService.state()); | |||
assert request.concreteNodes() != null; | |||
if (optimizedTransportActionNames.contains(request.getClass().getSimpleName())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some tests for this? Also, have you executed all existing tests and made sure that they are passing after your change. If yes, then please confirm this in the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure that node left and node added scenarios are covered properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have modified the draft PR approach. Will make sure I add all the necessary tests covering different scenarios.
9e66ed2
to
4f6b6ed
Compare
❌ Gradle check result for 4f6b6ed: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
…s, NodesInfo and ClusterStats call Signed-off-by: Pranshu Shukla <[email protected]>
4f6b6ed
to
d2ef37d
Compare
❌ Gradle check result for d2ef37d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <[email protected]>
❌ Gradle check result for 469cd6a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <[email protected]>
❌ Gradle check result for 03c35df: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <[email protected]>
❌ Gradle check result for b7b8754: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <[email protected]>
❌ Gradle check result for 31f030c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Closing this PR in virtue of clone PR - #14749 |
Description
In the current implementation, every transport action extending TransportNodesAction includes all discovery nodes in the transport request sent to each node in the cluster. This approach leads to performance bottlenecks in large clusters due to redundant data transmission. Specifically:
This PR proposes a solution to address these bottlenecks by selectively skipping the resolution of concrete nodes at the transport layer. This optimization will significantly reduce redundant data transmission and improve overall performance in large clusters.
To do this, flag was introduced on the BaseNodeRequest Class
populateDiscoveryNodesInTransportRequest
. Requests in this flag overriden toFalse
will skip adding the concrete nodes data into the request. Currently it is only implemented in NodeStats, ClusterStats and NodeInfo Call from the rest layer.The perf gains are significant (tested in a large 1k node domain)
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.