talos_cluster_health takes 5s to pass #145

Closed
zargony opened this issue Dec 29, 2023 · 4 comments

@zargony

zargony commented Dec 29, 2023

I noticed (while experimenting with #143) that even if the cluster is healthy, the talos_cluster_health check takes about 5s to pass. That seems like an unusually long delay during terraform apply. Is this intended behavior?

data.talos_cluster_health.cluster: Reading...
data.talos_cluster_health.cluster: Read complete after 5s [id=cluster_health]
@frezbo
Member

frezbo commented Dec 29, 2023

Yes, it is. It runs a multitude of checks, so this looks like the expected check time.

@smira
Member

smira commented Dec 29, 2023

There's one check we could probably improve on the Talos side: specifically, the check that waits for the node to finish its boot sequence. It's certainly not optimal.

@zargony
Author

zargony commented Jan 22, 2024

Thanks for clarifying. I get that there's a multitude of checks needed to ensure that the cluster is up and running. It makes a lot of sense to wait for everything to settle when the cluster is created. On the other hand, with a running cluster, this causes a 5s delay every time Terraform refreshes its state (i.e. every time you run plan or apply), which is a little tedious since basically all k8s resources depend on this health check. For me, the time to run terraform refresh went up from 1.3 seconds to 6.2 seconds. It's probably no big deal for CD workflows, but it's kind of annoying for local development (it felt a lot quicker before talos_cluster_health was introduced). Maybe it would be worth introducing some kind of quick check option? For example, talosctl dashboard comes up very quickly and shows whether the system is healthy or not.
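
For anyone who wants to put a number on the overhead described above, here is a rough timing sketch in Python. The helper names `time_command` and `added_delay` are hypothetical and not part of Terraform or the Talos provider. The only figures used are the 1.3 s and 6.2 s refresh times reported in this comment.

```python
import subprocess
import time


def time_command(argv):
    """Run a command and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    completed = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # report the exit code instead of raising
    )
    return time.perf_counter() - start, completed.returncode


def added_delay(with_check_s, without_check_s):
    """Extra seconds attributable to the health check (hypothetical helper)."""
    return round(with_check_s - without_check_s, 1)


# Refresh times reported above: 1.3 s before, 6.2 s after.
print(added_delay(6.2, 1.3))  # prints 4.9
```

The 4.9 s difference lines up with the "Read complete after 5s" line in the original report. In a real workspace, something like `time_command(["terraform", "refresh"])` should give the measured figure, assuming `terraform` is on the PATH and the working directory is already initialized.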

@smira
Member

smira commented Mar 6, 2024

Fixed in the latest Talos 1.6.x and in Talos 1.7+.

smira closed this as completed on Mar 6, 2024