Aggressive retry on meta election failure #12228

Open
arkbriar opened this issue Sep 12, 2023 · 4 comments

@arkbriar
Contributor

arkbriar commented Sep 12, 2023

Describe the bug

The meta node starts another election attempt immediately after the previous one fails, with no backoff. The retry interval is only a few milliseconds (about 5 ms between attempts in the log below).

Error message/log

2023-09-12T07:28:47.536850637Z  INFO risingwave_meta::rpc::election_client: client risingwave-meta-0.risingwave-meta-headless.default.svc:5690 start election
2023-09-12T07:28:47.541331846Z ERROR risingwave_meta::rpc::server: election error happened, Election failed: grpc request error: status: Unavailable, message: "error trying to connect: dns error: failed to lookup address information: Name or service not known", details: [], metadata: MetadataMap { headers: {} }
2023-09-12T07:28:47.541353137Z  INFO risingwave_meta::rpc::election_client: client risingwave-meta-0.risingwave-meta-headless.default.svc:5690 start election
2023-09-12T07:28:47.547203804Z ERROR risingwave_meta::rpc::server: election error happened, Election failed: grpc request error: status: Unavailable, message: "error trying to connect: dns error: failed to lookup address information: Name or service not known", details: [], metadata: MetadataMap { headers: {} }
2023-09-12T07:28:47.547214846Z  INFO risingwave_meta::rpc::election_client: client risingwave-meta-0.risingwave-meta-headless.default.svc:5690 start election
2023-09-12T07:28:47.552142221Z ERROR risingwave_meta::rpc::server: election error happened, Election failed: grpc request error: status: Unavailable, message: "error trying to connect: dns error: failed to lookup address information: Name or service not known", details: [], metadata: MetadataMap { headers: {} }

To Reproduce

Install RisingWave with Helm:

helm repo add risingwavelabs https://risingwavelabs.github.io/helm-charts/
helm repo update
helm install --set wait=true risingwave risingwavelabs/risingwave

You can then follow the meta node's logs with:

kubectl logs risingwave-meta-0 -f

Expected behavior

Failed election attempts should be retried with backoff so that the retry loop does not exhaust the CPU.

How did you deploy RisingWave?

Helm + Kubernetes

The version of RisingWave

dev=> select version();
                                  version
----------------------------------------------------------------------------
 PostgreSQL 9.5-RisingWave-1.2.0 (f27f085e4433eb4b3b9e8df61b3f94406f654d89)
(1 row)

The image is v1.2.0.

Additional context

No response

@arkbriar arkbriar added the type/bug Something isn't working label Sep 12, 2023
@github-actions github-actions bot added this to the release-1.3 milestone Sep 12, 2023
@fuyufjh fuyufjh modified the milestones: release-1.3, release-1.4 Oct 10, 2023
@yezizp2012 yezizp2012 assigned shanicky and unassigned yezizp2012 Oct 17, 2023
@fuyufjh fuyufjh modified the milestones: release-1.4, release-1.5 Nov 8, 2023
@fuyufjh
Member

fuyufjh commented Nov 8, 2023

Ping, any updates?

@shanicky shanicky modified the milestones: release-1.5, release-1.6 Dec 4, 2023
@shanicky shanicky modified the milestones: release-1.6, release-1.7 Jan 10, 2024
@neverchanje
Contributor

I also encountered this problem. Any solutions?

@fuyufjh
Member

fuyufjh commented May 21, 2024

Does this problem recur now?

@arkbriar
Contributor Author

I think it still exists. The election loop retries as soon as `run_once` returns an error, with no delay between attempts:

let handle = tokio::spawn(async move {
    while let Err(e) = election_client
        .run_once(lease_interval_secs as i64, stop_rx.clone())
        .await
    {
        tracing::error!(error = %e.as_report(), "election error happened");
    }
});
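
One possible shape for a fix is to sleep between failed attempts, with a delay that doubles up to a cap. The sketch below is not RisingWave code: `backoff_delay` and `retry_with_backoff` are hypothetical helpers, the loop is synchronous and uses only the standard library, and the sleep function is passed in so the schedule can be checked without waiting.

```rust
use std::time::Duration;

/// Hypothetical helper: capped exponential backoff.
/// Returns `base * 2^attempt`, never exceeding `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // `checked_shl` returns None once the shift would overflow a u32,
    // in which case we fall back to the largest factor.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // `checked_mul` returns None on Duration overflow; treat that as "use the cap".
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Hypothetical helper: call `attempt_once` until it succeeds,
/// sleeping for a growing delay after every failure.
fn retry_with_backoff<T, E: std::fmt::Display>(
    mut attempt_once: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
    base: Duration,
    max: Duration,
) -> T {
    let mut attempt = 0u32;
    loop {
        match attempt_once() {
            Ok(value) => return value,
            Err(e) => {
                let delay = backoff_delay(attempt, base, max);
                eprintln!("election error happened: {e}; retrying in {delay:?}");
                sleep(delay);
                attempt = attempt.saturating_add(1);
            }
        }
    }
}
```

In the async loop above, the same schedule could be applied by awaiting `tokio::time::sleep(delay)` inside the `Err` branch instead of calling a blocking sleep. A base of 100 ms and a cap of a few seconds would be illustrative starting values, not ones taken from RisingWave, and adding random jitter may be worthwhile when several meta nodes retry at the same time.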

@shanicky shanicky modified the milestones: release-1.10, release-1.11 Jul 10, 2024
@shanicky shanicky removed this from the release-2.2 milestone Dec 27, 2024
@shanicky shanicky added this to the release-2.3 milestone Dec 27, 2024