Graceful restart
Fortunately, you already know how to implement graceful stop with this framework.
If you don't, read the Tutorial first.
In order to implement graceful restart of network servers, a process, named the master process here, keeps listening sockets and passes them to actual network server processes. When a restart is requested, the master process shuts down the current server gracefully and starts a new server process. Because listening sockets are kept between restarts, clients will not experience server failure.
well.Graceful helps implement such servers. It creates listening sockets and passes them to a child process that executes the same binary as the master process. The child can tell that it is not the master by checking a special environment variable, CYBOZU_LISTEN_FDS. The child then uses the listening sockets passed by the master to accept connections.
If a SIGHUP signal is sent to the master, it immediately sends SIGTERM to the child process. Thanks to the framework, the child process closes its listening sockets as soon as it receives SIGTERM. The master can therefore start a new child process soon after sending SIGTERM. This way, clients will not wait long while the server is restarting.
Let's see how to implement an HTTP server that can be restarted gracefully.
A simple HTTP server can be implemented like this:
    package main

    import (
    	"flag"
    	"net"
    	"net/http"
    	"time"

    	"github.com/cybozu-go/log"
    	"github.com/cybozu-go/well"
    )

    // serve is called in child processes.
    func serve(listeners []net.Listener) {
    	// well.HTTPServer implements graceful stoppable HTTP server.
    	s := &well.HTTPServer{
    		Server: &http.Server{
    			Handler: http.FileServer(http.Dir("/path/to/files")),
    		},
    	}
    	for _, ln := range listeners {
    		s.Serve(ln)
    	}
    	err := well.Wait()
    	if err != nil && !well.IsSignaled(err) {
    		log.ErrorExit(err)
    	}
    }

    func listen() ([]net.Listener, error) {
    	ln, err := net.Listen("tcp", ":8080")
    	if err != nil {
    		return nil, err
    	}
    	return []net.Listener{ln}, nil
    }

    func main() {
    	flag.Parse()
    	well.LogConfig{}.Apply()

    	g := &well.Graceful{
    		Listen:      listen,
    		Serve:       serve,
    		ExitTimeout: 30 * time.Second,
    	}
    	g.Run()

    	// g.Run() only returns in the main process.
    	err := well.Wait()
    	if err != nil && !well.IsSignaled(err) {
    		log.ErrorExit(err)
    	}
    }
You will notice that logs from this program have a "pid" field. It is the process ID of the current child process.
2016-08-27T08:22:08.342472Z localhost prog info: "well: new child" pid=17082
(snip)
2016-08-27T08:22:08.342934Z localhost prog info: "well: waiting for all goroutines to complete"
2016-08-27T08:22:08.343044Z localhost prog warning: "well: got signal" signal="terminated" pid=17082
2016-08-27T08:22:08.343148Z localhost prog info: "well: waiting for all goroutines to complete" pid=17082
The current implementation has several limitations. An easy workaround for them is to stop and start the server.
Log files are not reopened by a graceful restart. This is because the master process works as a log server for children, and the master process is not affected by graceful restarts.
You may implement your own logic to handle SIGHUP and open other log files in the master process as follows:
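    // This fragment also needs the "os", "os/signal", and "syscall" imports.
    func main() {
    	// ...
    	g.Run()

    	go func() {
    		sighupCh := make(chan os.Signal, 2)
    		signal.Notify(sighupCh, syscall.SIGHUP)
    		// Switch the log output every time SIGHUP arrives.
    		for range sighupCh {
    			logger := log.DefaultLogger()
    			newOutput := ...
    			logger.SetOutput(newOutput)
    		}
    	}()
    }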
Listening addresses cannot be changed by a graceful restart. This is because the master process creates the listening sockets and is not affected by graceful restarts.
The current implementation does not support Windows. However, programs using well.Graceful can be compiled and run on Windows because the framework provides a dummy implementation.
well.SystemdListeners can be used as the Graceful.Listen function. If it is, the program can receive listeners from systemd.
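For background, systemd hands sockets to a service through the LISTEN_PID and LISTEN_FDS environment variables, with the first socket at file descriptor 3 (SD_LISTEN_FDS_START in the sd_listen_fds documentation). The sketch below shows only that protocol; `activatedFDs` is a hypothetical helper and is not claimed to match how well.SystemdListeners is implemented.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// firstFD is the first descriptor systemd passes (SD_LISTEN_FDS_START).
const firstFD = 3

// activatedFDs returns the descriptors passed by systemd, given the current
// process ID and the values of LISTEN_PID and LISTEN_FDS. It returns nil
// when the process was not socket-activated.
func activatedFDs(pid int, listenPID, listenFDs string) ([]int, error) {
	if listenPID == "" || listenFDs == "" {
		return nil, nil // variables not set: not socket-activated
	}
	p, err := strconv.Atoi(listenPID)
	if err != nil {
		return nil, fmt.Errorf("invalid LISTEN_PID %q", listenPID)
	}
	if p != pid {
		return nil, nil // the sockets were meant for another process
	}
	n, err := strconv.Atoi(listenFDs)
	if err != nil || n < 0 {
		return nil, fmt.Errorf("invalid LISTEN_FDS %q", listenFDs)
	}
	fds := make([]int, 0, n)
	for i := 0; i < n; i++ {
		fds = append(fds, firstFD+i)
	}
	return fds, nil
}

func main() {
	fds, err := activatedFDs(os.Getpid(), os.Getenv("LISTEN_PID"), os.Getenv("LISTEN_FDS"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("descriptors passed by systemd:", fds)
}
```

Each returned descriptor could then be wrapped with os.NewFile and net.FileListener to obtain a net.Listener.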
well.SystemdListeners can be used for a mostly graceful restart of servers. Unlike well.Graceful, systemd waits for the current process to terminate before starting the next process. This is good and bad:
Good points:
- No two processes run concurrently. This is safer.
- No limitations to logging etc.
Bad points:
- If the current process takes a long time to terminate, clients experience service failures.
- You need to kill the process in such cases.
Programs that implement graceful restart with systemd should therefore have timeouts for quick exit.
Example:
    package main

    import (
    	"flag"
    	"net/http"
    	"time"

    	"github.com/cybozu-go/log"
    	"github.com/cybozu-go/well"
    )

    func main() {
    	flag.Parse()
    	err := well.LogConfig{}.Apply()
    	if err != nil {
    		log.ErrorExit(err)
    	}

    	s := &well.HTTPServer{
    		Server: &http.Server{
    			Handler: ...
    		},
    		// For quick exit, give up waiting after 10 seconds.
    		ShutdownTimeout: 10 * time.Second,
    	}

    	listeners, err := well.SystemdListeners()
    	if err != nil {
    		log.ErrorExit(err)
    	}
    	for _, ln := range listeners {
    		s.Serve(ln)
    	}

    	// well.Wait waits no longer than 10 seconds.
    	err = well.Wait()
    	if err != nil && !well.IsSignaled(err) {
    		log.ErrorExit(err)
    	}
    }