Support for switchover (controlled failover) #235
Agreed, this would be a great feature for upgrading hardware and software by some means other than 'kubectl delete pod', or however you'd force it now.
@kagkarlsson the main question is how to do this. Right now it works with synchronous replication because that's one of its guarantees. With asynchronous replication there's no guarantee that data won't be lost. To achieve this you have to stop every write transaction on the master and be sure that the standby you are switching to is synced. To do this, assuming that all the transactions flow through the stolon proxy, you should shut down the proxies (or tell them to stop accepting new connections and close the current ones). This could be implemented (and we'll happily accept and review a PR). But instead of doing all of this, which would also cause more downtime, why not just temporarily enable synchronous replication and stop the current master? This should already be possible right now: https://github.com/sorintlab/stolon/blob/master/doc/syncrepl.md. Thoughts?
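To make the suggested procedure concrete, here is a minimal sketch of an external script that automates it. This is not an official stolon tool: the exact stolonctl flags, store backend, cluster name, keeper pod name, and the fixed sleeps (instead of polling cluster status) are assumptions for illustration, loosely based on the synchronous replication doc linked above.

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// run executes a command and prints its combined output, so every step of the
// switchover is visible to the operator.
func run(name string, args ...string) error {
	out, err := exec.Command(name, args...).CombinedOutput()
	fmt.Printf("$ %s %v\n%s\n", name, args, out)
	return err
}

func main() {
	// 1. Temporarily enable synchronous replication so no committed write can
	//    be lost when the current master goes away.
	if err := run("stolonctl", "--cluster-name", "mycluster", "--store-backend", "etcd",
		"update", "--patch", `{ "synchronousReplication" : true }`); err != nil {
		panic(err)
	}

	// 2. Give the master time to reconfigure and the standby time to sync.
	//    A real tool would poll the cluster status instead of sleeping.
	time.Sleep(30 * time.Second)

	// 3. Stop the current master; the sentinel will fail over to a synchronous
	//    standby. The pod name is a hypothetical Kubernetes example.
	if err := run("kubectl", "delete", "pod", "stolon-keeper-0"); err != nil {
		panic(err)
	}

	// 4. Once the new master is promoted, go back to asynchronous replication.
	time.Sleep(30 * time.Second)
	if err := run("stolonctl", "--cluster-name", "mycluster", "--store-backend", "etcd",
		"update", "--patch", `{ "synchronousReplication" : false }`); err != nil {
		panic(err)
	}
}
```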
@sgotti That's a nice solution! Maybe it should be documented somewhere as the recommended procedure until it can be done automatically with a command.
@albertvaka Would you like to create a PR to document this?
This sounds like an acceptable solution, @sgotti.
Yeah, the synchronous replication guarantee takes effect when issuing a commit. For example, if you don't have synchronous replication enabled, BEGIN a new transaction, do some work, then enable synchronous replication (wait for the master to have configured and applied its new setting); the COMMIT will then wait for the synchronous standby. (BTW, while connections are usually pooled and so long living, transactions should be short lived.) If you are interested in how postgres sync repl works, take a look at its docs and be aware of the explained limitations, since sync repl is obtained by first committing the transaction to the master WAL and then waiting for the WAL to be replicated and applied by the standby. For example:
When they say
I suppose they are talking about an abrupt restart of the primary, not a graceful shutdown, which is what would be done in a switchover scenario. I assume the primary waits for acknowledgements from standbys before shutting down completely.
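To make the commit-time behaviour @sgotti describes concrete, here is a minimal sketch using Go's database/sql. The driver, connection string, and table are hypothetical; the only point it illustrates is that the wait on the synchronous standby happens at COMMIT, so a transaction begun while replication was still asynchronous gets the guarantee if it commits after synchronous replication was enabled.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	// Hypothetical connection through the stolon proxy.
	db, err := sql.Open("postgres", "host=stolon-proxy dbname=mydb sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// The transaction may be started while replication is still asynchronous.
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	if _, err := tx.Exec("INSERT INTO events(payload) VALUES ($1)", "switchover-test"); err != nil {
		log.Fatal(err)
	}

	// ... synchronous replication gets enabled on the cluster here ...

	// COMMIT is where the master waits for this transaction's WAL to be
	// replicated (and, depending on synchronous_commit, applied) by the
	// synchronous standby.
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}
```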
I need this ASAP, so I'm working on implementing it. It's going to be roughly an equivalent of PgBouncer session pool mode:
As long as you keep the connection close timeout longer than the max connection lifetime on the client, it should be completely seamless. Stretch goal: an option to wait for replication lag to drop to 0 before resuming.
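The "wait for replication lag to drop to 0" stretch goal could be checked from the master via pg_stat_replication. Here is a rough sketch, not part of stolon, assuming PostgreSQL 10 or later (pg_current_wal_lsn / replay_lsn naming), the lib/pq driver, and a hypothetical direct connection to the current master.

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq"
)

// waitForZeroLag polls pg_stat_replication on the master until every connected
// standby has replayed all of the master's WAL.
func waitForZeroLag(db *sql.DB) error {
	for {
		var maxLag sql.NullFloat64
		err := db.QueryRow(
			`SELECT max(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) FROM pg_stat_replication`,
		).Scan(&maxLag)
		if err != nil {
			return err
		}
		// maxLag is NULL when no standby is connected; keep waiting in that case too.
		if maxLag.Valid && maxLag.Float64 == 0 {
			return nil
		}
		time.Sleep(500 * time.Millisecond)
	}
}

func main() {
	db, err := sql.Open("postgres", "host=current-master dbname=postgres sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if err := waitForZeroLag(db); err != nil {
		log.Fatal(err)
	}
	log.Println("standbys caught up; safe to resume")
}
```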
@c2h5oh glad you want to work on this. IMHO I still think that the solution proposed here (temporarily enable sync repl) is one of the safest and fastest. Is there something in that solution that won't work for you? About your idea:
Switching master will always forcibly close connections to the old master.
Not sure why you want to pause and resume. Why can't you just stop accepting connections, as is done now when a new master is elected? The client will just receive a connection refused until the proxy is instructed with the new master.
I'm essentially replicating PgBouncer functionality.
The changes I've made to pollon (backwards compatible - I only added functions, see the pull request) give you the option to let those connections finish, with an optional timeout to put a cap on how long this is going to take and to handle clients that are lazy about disconnecting pooled connections (I'm looking at you, golang's database/sql).
My goal is to provide a switchover that, when it works right, is completely transparent to clients and externally manifests as nothing more than a slower connect time - with a sufficiently low master election time it allows clients to continue as if nothing happened, even dumb clients lacking proper connection error recovery. One could also argue that in many situations this might be the preferred option vs hammering the proxy with reconnect attempts - the client is not cluster-status aware, so it will just blindly retry; the stolon proxy, on the other hand, can let the connections through as soon as the new master is elected.
Like with my pollon pull request, I have no intention of changing the default behaviour - I'll be making it optional.
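For readers unfamiliar with the idea, here is a hypothetical illustration of the drain-with-timeout option described above (this is not the actual pollon API from the pull request): stop handing new connections to the old master, then give existing clients a bounded amount of time to disconnect on their own before closing whatever is left.

```go
package proxydrain

import (
	"net"
	"sync"
	"time"
)

// drainer tracks the proxied client connections that were opened against the
// old master and lets them finish on their own, up to a timeout.
type drainer struct {
	mu    sync.Mutex
	conns map[net.Conn]struct{}
}

func newDrainer() *drainer {
	return &drainer{conns: map[net.Conn]struct{}{}}
}

// track registers a proxied connection; untrack removes it when the client
// disconnects by itself.
func (d *drainer) track(c net.Conn)   { d.mu.Lock(); d.conns[c] = struct{}{}; d.mu.Unlock() }
func (d *drainer) untrack(c net.Conn) { d.mu.Lock(); delete(d.conns, c); d.mu.Unlock() }

// drain waits up to timeout for all tracked connections to be closed by their
// clients; anything still open after that is closed forcibly.
func (d *drainer) drain(timeout time.Duration) {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		d.mu.Lock()
		remaining := len(d.conns)
		d.mu.Unlock()
		if remaining == 0 {
			return // every client disconnected on its own
		}
		time.Sleep(100 * time.Millisecond)
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	for c := range d.conns {
		c.Close() // cap reached: close lazy clients (e.g. idle pooled connections)
	}
}
```

How long the timeout should be depends on the client's maximum connection lifetime, as noted above.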
I'm not sure it makes sense to replicate that functionality. PgBouncer has a very different purpose than the stolon proxy: it acts as a layer 7 proxy that talks the postgresql protocol and is used as an intermediate connection pool for clients that don't do connection pooling, while the stolon proxy is just a layer 4 proxy born with the intent of making clients go to the right master (avoiding clients connecting or keeping connections to an old partitioned master). When a new master is elected the sentinel will tell the proxies to close all the connections and stop accepting new ones, and when the new master reports as promoted the sentinel will tell the proxies to accept connections and proxy them to the new master. Keeping a connection to an old master open for some time while we already have a new master promoted will cause very bad behaviors, for example:
When a new master is elected there's the need to immediately close connections to the old one, so there's no way to make it transparent; the connections to the old master MUST be closed. Keeping them open, as explained above, will defeat one of the main purposes of stolon: data consistency. So it cannot be transparent: the client needs to reconnect, and the proxy will proxy those new connections to the new master. But also making the stolon proxy a layer 7 proxy that talks the postgresql protocol, decoupling clients from the db (adding a lot of logic behind it, like closing prepared statements, saving and cleaning session data, etc...), will not solve the problem. You'll always have to close connections to the old master, so nothing will really change or become transparent. IMHO I think the current behavior is the only one possible, independently of how a client behaves:
For a manual switchover this is not mutually exclusive:
This is actually unrelated. Prepared statement cleanup is only performed when the client disconnects, but PgBouncer keeps the connection alive in the pool for future reuse.
Not sure if I get the complete idea before looking at the implementation. Sorry if I missed this before, but now it looks like automating this with the addition of a small pause (I can't completely understand how this could be useful with pooled clients) and the proxy accepting incoming connections but keeping them queued until the new master is ready. Given the declarative nature of the cluster specification, I'm not sure how to declare this "switchover" operation inside it. In addition, it'll become quite complicated to manage all the state changes of this workflow and recover from all the possible errors. I'd rather see this as something on top of the sentinel logic, like an external tool that orchestrates this workflow (which could become a stolonctl command).
Yes, that's exactly it.
Step 1: Declare the cluster paused - all proxies should start queuing new connections.
To answer how this is useful to clients:
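A minimal, hypothetical sketch (not stolon's actual proxy implementation) of the Step 1 behaviour described above: while the cluster is declared paused, the proxy keeps accepting TCP connections but only queues them; once resumed and told the new master's address, it proxies the queued connections as usual.

```go
package main

import (
	"io"
	"net"
	"sync"
)

// pausingProxy is a toy layer 4 proxy: while paused it accepts and queues
// client connections instead of refusing them; on resume it proxies the queue
// (and all later connections) to the new master.
type pausingProxy struct {
	mu     sync.Mutex
	paused bool
	master string // address of the current master, e.g. "10.0.0.5:5432"
	queued []net.Conn
}

func (p *pausingProxy) serve(l net.Listener) {
	for {
		c, err := l.Accept()
		if err != nil {
			return
		}
		p.mu.Lock()
		if p.paused {
			p.queued = append(p.queued, c) // hold the client until resume
			p.mu.Unlock()
			continue
		}
		master := p.master
		p.mu.Unlock()
		go proxyTo(c, master)
	}
}

// resume is called once the new master is promoted: flush the queue to it.
func (p *pausingProxy) resume(newMaster string) {
	p.mu.Lock()
	p.paused = false
	p.master = newMaster
	queued := p.queued
	p.queued = nil
	p.mu.Unlock()
	for _, c := range queued {
		go proxyTo(c, newMaster)
	}
}

// proxyTo copies bytes in both directions between the client and the master.
func proxyTo(client net.Conn, master string) {
	server, err := net.Dial("tcp", master)
	if err != nil {
		client.Close()
		return
	}
	go io.Copy(server, client)
	io.Copy(client, server)
	client.Close()
	server.Close()
}

func main() {
	l, err := net.Listen("tcp", ":25432")
	if err != nil {
		panic(err)
	}
	p := &pausingProxy{paused: true}
	go p.serve(l)
	// An orchestrator (e.g. a future stolonctl command) would later call
	// p.resume("10.0.0.5:5432") once the new master is ready.
	select {}
}
```

To a queued client this manifests as nothing more than a slower connect, which matches the behaviour described earlier in the thread.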
As far as I can see, stolon does not support a switchover operation, i.e. a controlled failover where there are strong guarantees that no data will be lost, even if using asynchronous replication. It could for example be used to do database minor-version updates or OS-updates.
Are there any plans for supporting this in the future?