This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

Graceful restart

Yamamoto, Hirotaka edited this page Oct 21, 2018 · 14 revisions

Graceful restart cannot be perfect without graceful stop.

Fortunately, you already know how to implement graceful stop with this framework.
If you don't, read the Tutorial first.

How it works

To restart a network server gracefully, one process, called the master process here, keeps the listening sockets open and passes them to the processes that actually serve requests. When a restart is requested, the master process shuts down the current server gracefully and starts a new server process. Because the listening sockets stay open across restarts, clients do not see the server fail.

well.Graceful helps implement such servers. It creates the listening sockets and passes them to a child process that executes the same binary as the master process. The child recognizes that it is not the master by the presence of a special environment variable, CYBOZU_LISTEN_FDS. The child then accepts connections on the listening sockets passed by the master.
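
The hand-off described above can be sketched with the standard library alone. This is not the code of well.Graceful: the helpers childEnv, isChild, and childCommand are hypothetical, and storing the socket count in CYBOZU_LISTEN_FDS is an assumption made for illustration. The part that is documented Go behavior is that entry i of exec.Cmd.ExtraFiles becomes file descriptor 3+i in the child.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"os/exec"
)

// listenEnv is the variable named in the text. Storing the socket count in
// it is an assumption of this sketch, not a documented format.
const listenEnv = "CYBOZU_LISTEN_FDS"

// childEnv returns a copy of base plus the marker that tells a process it
// is a child.
func childEnv(base []string, nfds int) []string {
	env := append([]string{}, base...)
	return append(env, fmt.Sprintf("%s=%d", listenEnv, nfds))
}

// isChild reports whether the given value of listenEnv marks a child.
func isChild(value string) bool {
	return value != ""
}

// childCommand (hypothetical helper) prepares the same binary to run as a
// child that inherits the given TCP listeners as descriptors 3, 4, ...
func childCommand(listeners []*net.TCPListener) (*exec.Cmd, error) {
	files := make([]*os.File, 0, len(listeners))
	for _, ln := range listeners {
		f, err := ln.File() // duplicates the socket; ln itself stays open
		if err != nil {
			return nil, err
		}
		files = append(files, f)
	}
	cmd := exec.Command(os.Args[0], os.Args[1:]...)
	cmd.Env = childEnv(os.Environ(), len(files))
	cmd.ExtraFiles = files
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd, nil
}

func main() {
	if isChild(os.Getenv(listenEnv)) {
		fmt.Println("child: would accept on the inherited sockets")
		return
	}
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	cmd, err := childCommand([]*net.TCPListener{ln.(*net.TCPListener)})
	if err != nil {
		fmt.Println("prepare failed:", err)
		return
	}
	// A real master would now call cmd.Start(); the sketch stops here.
	fmt.Println("master: child would inherit", len(cmd.ExtraFiles), "socket(s)")
}
```

Because the master keeps its own copy of each socket, connections that arrive while no child is running queue in the kernel's listen backlog instead of being refused.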

When SIGHUP is sent to the master, it immediately sends SIGTERM to the child process. Thanks to the framework, the child process closes its listening sockets as soon as it receives SIGTERM. The master can therefore start a new child process soon after sending SIGTERM. This way, clients do not wait long while the server is restarting.
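
The restart sequence in the paragraph above can be modeled as a small loop. The sketch below is only an illustration of that sequence, not the framework's implementation: supervise, simulate, and the start/stop hooks are hypothetical names, and the stop function stands in for sending SIGTERM to a real child process.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// supervise starts one child and replaces it each time a signal arrives on
// hup. start launches a child and returns a function that asks that child to
// stop, which stands in for sending SIGTERM. Both are hypothetical hooks.
// supervise returns how many children it started once hup is closed.
func supervise(hup <-chan os.Signal, start func() (stop func())) int {
	stop := start()
	started := 1
	for range hup {
		stop()         // the old child stops accepting right away
		stop = start() // so the replacement can be started immediately
		started++
	}
	stop()
	return started
}

// simulate feeds n restart requests to supervise using fake children and
// returns the number of children started and stopped.
func simulate(n int) (started, stopped int) {
	hup := make(chan os.Signal, n)
	for i := 0; i < n; i++ {
		hup <- syscall.SIGHUP
	}
	close(hup)
	started = supervise(hup, func() func() {
		return func() { stopped++ }
	})
	return started, stopped
}

func main() {
	started, stopped := simulate(2)
	fmt.Println("started:", started, "stopped:", stopped) // started: 3 stopped: 3
}
```

In a real master, the channel would be filled by signal.Notify(ch, syscall.SIGHUP) rather than by hand. Note that the loop does not wait for the old child to exit before starting the new one, which is why two children can briefly run side by side.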

Example HTTP server

Let's see how to implement an HTTP server that can be restarted gracefully.
A simple HTTP server can be implemented like this:

package main

import (
    "flag"
    "net"
    "net/http"
    "time"

    "github.com/cybozu-go/log"
    "github.com/cybozu-go/well"
)

// serve is called in child processes.
func serve(listeners []net.Listener) {
    // well.HTTPServer implements a gracefully stoppable HTTP server.
    s := &well.HTTPServer{
        Server: &http.Server{
            Handler: http.FileServer(http.Dir("/path/to/files")),
        },
    }
    for _, ln := range listeners {
        s.Serve(ln)
    }

    err := well.Wait()
    if err != nil && !well.IsSignaled(err) {
        log.ErrorExit(err)
    }
}

func listen() ([]net.Listener, error) {
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        return nil, err
    }
    return []net.Listener{ln}, nil
}

func main() {
    flag.Parse()
    if err := (well.LogConfig{}).Apply(); err != nil {
        log.ErrorExit(err)
    }

    g := &well.Graceful{
        Listen: listen,
        Serve: serve,
        ExitTimeout: 30 * time.Second,
    }
    g.Run()

    // g.Run() only returns in the master process.
    err := well.Wait()
    if err != nil && !well.IsSignaled(err) {
        log.ErrorExit(err)
    }
}

You will notice that logs from this program have a "pid" field. It is the process ID of the current child process.

2016-08-27T08:22:08.342472Z localhost prog info: "well: new child" pid=17082
(snip)
2016-08-27T08:22:08.342934Z localhost prog info: "well: waiting for all goroutines to complete"
2016-08-27T08:22:08.343044Z localhost prog warning: "well: got signal" signal="terminated" pid=17082
2016-08-27T08:22:08.343148Z localhost prog info: "well: waiting for all goroutines to complete" pid=17082

Limitations

The current implementation has several limitations. An easy workaround for all of them is to stop and start the server.

Log file name cannot be changed.

This is because the master process works as a log server for children, and the master process is not affected by graceful restarts.

You may implement your own logic to handle SIGHUP and open other log files in the master process as follows:

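func main() {
    // ...
    g.Run()

    // g.Run() only returns in the master process, so this handler runs there.
    // It needs "os", "os/signal", and "syscall" in addition to the imports above.
    go func() {
        sighupCh := make(chan os.Signal, 2)
        signal.Notify(sighupCh, syscall.SIGHUP)
        for range sighupCh {
            logger := log.DefaultLogger()
            newOutput := ...
            logger.SetOutput(newOutput)
        }
    }()
}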

Listening sockets cannot be changed.

This is because the master process starts listening and is not affected by graceful restarts.

Windows

The current implementation does not support Windows.

However, programs using well.Graceful can be compiled and run on Windows because the framework provides a dummy implementation.

Systemd socket activation

well.SystemdListeners can be used as the Graceful.Listen function. If it is, the program can receive listeners from systemd.
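
For readers unfamiliar with socket activation, the following sketch shows the protocol a program follows to receive listeners from systemd: LISTEN_PID must match the receiving process, LISTEN_FDS gives the number of sockets, and the sockets start at descriptor 3. It uses only the standard library and is not the implementation of well.SystemdListeners; activatedFDs and listenersFromFDs are hypothetical helper names.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// listenFDsStart is the first descriptor systemd passes (SD_LISTEN_FDS_START).
const listenFDsStart = 3

// activatedFDs applies the sd_listen_fds rules to the given values of
// LISTEN_PID and LISTEN_FDS and returns the descriptors meant for pid.
func activatedFDs(listenPID, listenFDs string, pid int) ([]uintptr, error) {
	if listenPID == "" || listenFDs == "" {
		return nil, nil // not started by socket activation
	}
	p, err := strconv.Atoi(listenPID)
	if err != nil {
		return nil, fmt.Errorf("bad LISTEN_PID: %v", err)
	}
	if p != pid {
		return nil, nil // the descriptors belong to another process
	}
	n, err := strconv.Atoi(listenFDs)
	if err != nil || n < 0 {
		return nil, fmt.Errorf("bad LISTEN_FDS: %q", listenFDs)
	}
	fds := make([]uintptr, n)
	for i := range fds {
		fds[i] = uintptr(listenFDsStart + i)
	}
	return fds, nil
}

// listenersFromFDs wraps inherited descriptors as net.Listener values.
func listenersFromFDs(fds []uintptr) ([]net.Listener, error) {
	var lns []net.Listener
	for _, fd := range fds {
		f := os.NewFile(fd, "systemd-socket")
		ln, err := net.FileListener(f)
		f.Close() // FileListener works on its own duplicate
		if err != nil {
			return nil, err
		}
		lns = append(lns, ln)
	}
	return lns, nil
}

func main() {
	fds, err := activatedFDs(os.Getenv("LISTEN_PID"), os.Getenv("LISTEN_FDS"), os.Getpid())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	lns, err := listenersFromFDs(fds)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("listeners received from systemd:", len(lns))
}
```

Run outside systemd, the program finds neither variable set and reports zero listeners.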

Mostly graceful restart using systemd

well.SystemdListeners can be used for mostly graceful restart of servers. Unlike well.Graceful, systemd waits for the current process to terminate before starting the next process. This is both good and bad:

Good points:

  • No two processes run concurrently. This is safer.
  • No limitations on logging and so on.

Bad points:

  • If the current process takes a long time to terminate, clients experience a service outage.
  • You need to kill the process in such cases.

Programs that implement graceful restart with systemd should therefore have timeouts for quick exit.

Example:

package main

import (
    "flag"
    "net/http"
    "time"

    "github.com/cybozu-go/log"
    "github.com/cybozu-go/well"
)

func main() {
    flag.Parse()
    err := well.LogConfig{}.Apply()
    if err != nil {
        log.ErrorExit(err)
    }

    s := &well.HTTPServer{
        Server: &http.Server{
            Handler: ...
        },

        // For a quick exit, stop waiting after 10 seconds.
        ShutdownTimeout: 10 * time.Second,
    }

    listeners, err := well.SystemdListeners()
    if err != nil {
        log.ErrorExit(err)
    }
    for _, ln := range listeners {
        s.Serve(ln)
    }

    // well.Wait waits no longer than 10 seconds.
    err = well.Wait()
    if err != nil && !well.IsSignaled(err) {
        log.ErrorExit(err)
    }
}
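
The idea of a bounded shutdown can also be sketched with the standard library alone. The example below is an illustration of the technique, not a description of how the framework implements ShutdownTimeout; shutdownWithin and demo are hypothetical names, and the 10-second limit simply mirrors the example above.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// shutdownWithin stops srv gracefully but gives up after limit, closing any
// connections that are still open so the process can exit quickly.
func shutdownWithin(srv *http.Server, limit time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()
	err := srv.Shutdown(ctx)
	if err != nil {
		srv.Close() // the deadline passed; drop the remaining connections
	}
	return err
}

// demo serves on a loopback port, then shuts down with a 10-second limit.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{Handler: http.NotFoundHandler()}
	done := make(chan error, 1)
	go func() { done <- srv.Serve(ln) }()

	if err := shutdownWithin(srv, 10*time.Second); err != nil {
		return err
	}
	// Serve reports http.ErrServerClosed after a normal shutdown.
	if err := <-done; err != http.ErrServerClosed {
		return err
	}
	return nil
}

func main() {
	fmt.Println("shutdown error:", demo())
}
```

With no requests in flight, Shutdown returns at once; the limit only matters when a slow client would otherwise delay the next process that systemd is waiting to start.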