[Update] Documentation on high availability for distributed hypertables #513

Open
7 tasks
phemmer opened this issue Oct 3, 2020 · 0 comments

phemmer commented Oct 3, 2020

Label: 2.0

Describe the update
Need documentation explaining how to properly achieve high availability when using distributed hypertables.
Questions/items to be addressed:

  • What is the impact of using async replication for the access node? Specifically, what happens when a dirty failover is performed and the new access node is running with out-of-date information about the state of the data nodes?
  • What is the overhead of using sync replication on the access node? Since all data is sent to the data nodes and only metadata is stored on the access node, does sync replication have a near-negligible performance impact, incurred essentially only when new chunks are created?
  • Can the access node replica be used for read queries?
  • Since the documentation for create_distributed_hypertable() mentions that replication_factor should not be used, do we have to use full replicas of the data nodes?
  • Document that there is currently no way to rebalance data when a new data node is added. A workaround is to cordon the old nodes until the new node catches up in terms of data volume, but this makes the new node a hot spot.
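
The hot-spot effect of the cordon workaround can be illustrated with a toy model. This is only a sketch and not TimescaleDB code: it assumes new chunks are assigned round-robin to the data nodes that are not cordoned and that every chunk holds the same amount of data. The function `place_chunks`, the node names, and the chunk counts are all invented for illustration.

```python
from itertools import cycle


def place_chunks(existing, cordoned, new_chunks):
    """Assign new_chunks chunks round-robin to the non-cordoned nodes.

    existing: dict of node name -> current chunk count.
    cordoned: set of node names that must not receive new chunks.
    Returns a new dict with the updated chunk counts.
    NOTE: round-robin placement is an assumption of this toy model.
    """
    counts = dict(existing)
    writable = [name for name in counts if name not in cordoned]
    if not writable:
        raise ValueError("every data node is cordoned")
    targets = cycle(writable)
    for _ in range(new_chunks):
        counts[next(targets)] += 1
    return counts


# Hypothetical cluster: two old nodes with 100 chunks each, one empty new node.
cluster = {"dn1": 100, "dn2": 100, "dn3": 0}

# Without cordoning, new chunks spread evenly, so the gap to dn3 never closes.
balanced = place_chunks(cluster, cordoned=set(), new_chunks=90)

# With the old nodes cordoned, dn3 receives every new chunk (the hot spot).
hot = place_chunks(cluster, cordoned={"dn1", "dn2"}, new_chunks=90)

print(balanced)
print(hot)
```

In this model the new node only catches up while it absorbs the entire write load, which is the trade-off the documentation should call out.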

Items not currently supported, but to be documented when they are:

  • How is high availability achieved with create_distributed_hypertable(replication_factor => N)?
  • How do we rebalance data after adding a new data node? (Once this is documented, remove the note above about the lack of rebalancing.)

Location(s)

https://docs.timescale.com/beta-v2.0.0/getting-started/setup-multi-node
and/or
https://docs.timescale.com/beta-v2.0.0/tutorials/clustering

How soon is this needed?
