- Hooks have to be put into 'hooks/before' or 'hooks/after' now. Files located directly in 'hooks' are no longer supported and won't run. Check out the hooks section below.
- The config has recently changed, as we migrated to httparty. You no longer set the API authentication within the base url. Check out the example below or config/heartbeat.yml.
Heartbeat is a rather simple daemon which pings a Hetzner Failover IP. If the Failover IP is down, Heartbeat will automatically try to set a new active server ip for the Failover IP through the API provided by Hetzner. For further details of Hetzner Failover IPs, please check http://wiki.hetzner.de/index.php/Robot_Webservice
There are plenty of HA tools out there. However, I thought (and think!) they are too heavy-weight for this rather simple task. Thus, I wrote Heartbeat to automatically switch between load balancers and MySQL servers I run behind Hetzner Failover IPs in case one goes down or becomes unavailable.
We've used heartbeat for quite some time in production now.
Heartbeat uses plain old pings. Thus, it can only detect full crashes of your servers, where the server no longer replies to a ping. However, other monitoring options will probably be added in the future.
A few words about Heartbeat's behaviour. By default, every 30 seconds, Heartbeat sends a ping to the Hetzner Failover IP. If Heartbeat does not receive an answer, it assumes that the server behind the Failover IP is down.
Since Heartbeat uses plain old pings, be sure you can ping your servers before using Heartbeat!
When the Failover IP is down, Heartbeat will ask the Hetzner API for the current active server ip and look up this ip in the list you've configured. Heartbeat then pings the next ip from the list, and continues through the list until it can reach an ip or has to give up because there are no remaining ips it could try to reach. The order of the ip addresses within the config file determines which ip is tried next. When the last ip of the list is reached, the first one is tried.
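The lookup order described above can be sketched in a few lines of Ruby. This is only an illustration, not Heartbeat's actual code: the helper names `failover_candidates` and `next_active_ip` are hypothetical, the list is simplified to plain ip strings, and the `reachable` lambda stands in for a real ping. Leaving out the currently active ip is an assumption as well.

```ruby
# Hypothetical helper: returns the configured IPs in the order they would be
# tried, starting with the entry after the currently active one and wrapping
# around to the start of the list. The active IP itself is left out.
def failover_candidates(ips, active_ip)
  index = ips.index(active_ip)
  return ips.dup if index.nil? # unknown active IP: try the list from the top

  ips.rotate(index + 1).reject { |ip| ip == active_ip }
end

# Hypothetical helper: picks the first candidate that answers, or nil when
# no configured IP can be reached. `reachable` stands in for a real ping.
def next_active_ip(ips, active_ip, reachable)
  failover_candidates(ips, active_ip).find { |ip| reachable.call(ip) }
end

ips = ["10.0.0.1", "10.0.1.1", "10.0.2.1"]
only_first_up = ->(ip) { ip == "10.0.0.1" }
puts next_active_ip(ips, "10.0.1.1", only_first_up) # prints 10.0.0.1
```

Here the active ip is the second entry, so the third entry is tried first and the first entry is tried after wrapping around.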
After Heartbeat has switched to another active server ip by using Hetzner's API, Heartbeat will sleep for 300 seconds. This delay has been chosen to avoid switching between ips too often. Heartbeat will also sleep for 300 seconds if the Hetzner API call fails, because Heartbeat then assumes that the server Heartbeat is running on is itself currently down or separated from the network in some way.
Heartbeat is written in Ruby. Thus, you first have to install Ruby, RubyGems and Bundler.
$ apt-get install ruby rubygems
$ gem install bundler
Afterwards, you need to install Heartbeat's dependencies:
$ cd /path/to/heartbeat
$ bundle
To configure the Hetzner API access, the Failover IP and your servers' ip addresses, edit config/heartbeat.yml:
base_url: https://robot-ws.your-server.de
basic_auth:
  username: username
  password: password
failover_ip: 0.0.0.0
ping_ip: 0.0.0.0
ips:
  - ping: 10.0.0.1
    target: 10.0.0.2
  - ping: 10.0.1.1
    target: 10.0.1.2
interval: 30
timeout: 10
tries: 3
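To illustrate that the credentials now live under `basic_auth` instead of inside the base url, here is a minimal Ruby sketch. It uses only the standard library (`Net::HTTP`), not httparty, so it is not how Heartbeat itself talks to the API. The helper `build_switch_request` is hypothetical, and the `/failover/<ip>` path and the `active_server_ip` parameter are assumptions about the Robot webservice that you should verify against Hetzner's documentation. The request is only built, never sent.

```ruby
require "yaml"
require "net/http"
require "uri"

# Sample config mirroring the keys shown above.
config = YAML.safe_load(<<~CONFIG)
  base_url: https://robot-ws.your-server.de
  basic_auth:
    username: username
    password: password
  failover_ip: 0.0.0.0
  ips:
    - ping: 10.0.0.1
      target: 10.0.0.2
CONFIG

# Hypothetical helper: builds (but does not send) the request that would
# point the Failover IP at a new target. The "/failover/<ip>" path and the
# "active_server_ip" parameter are assumptions about the Robot webservice.
def build_switch_request(config, target)
  uri = URI.join(config["base_url"], "/failover/#{config["failover_ip"]}")
  request = Net::HTTP::Post.new(uri)
  auth = config["basic_auth"]
  request.basic_auth(auth["username"], auth["password"])
  request.set_form_data("active_server_ip" => target)
  request
end

request = build_switch_request(config, config["ips"].first["target"])
puts request.path
puts request.body
```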
The `ping_ip` option is explained below in detail. The `interval` option specifies how long to sleep between the ping attempts. The `timeout` option specifies the timeout to use for a ping, and `tries` specifies how many pings to send to the ip which is about to be tested. Thus, by default, heartbeat sends a ping to `ping_ip` every 30 seconds. If heartbeat does not receive a response within 10 seconds, it tries again, up to 3 tries in total. If the host stays down for 30 seconds (3 `tries` * 10 seconds `timeout`), heartbeat will consider the host to be down and will begin to switch the failover ip to a new target.
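The interplay of `tries` and `timeout` can be sketched as follows. This is a simplified illustration under the assumption that a host counts as down only when every attempt fails; `host_down?` and `detection_seconds` are hypothetical names, and the block passed to `host_down?` stands in for a single ping with a timeout.

```ruby
# Hypothetical helper: a host counts as down only if every one of `tries`
# ping attempts fails. The block stands in for a single ping with a timeout
# and returns true when the host answered.
def host_down?(tries)
  tries.times.none? { yield }
end

# Worst-case time until a dead host is declared down: each failed attempt
# waits for the full timeout.
def detection_seconds(tries, timeout)
  tries * timeout
end

puts detection_seconds(3, 10) # prints 30
puts host_down?(3) { false }  # prints true
```

Lowering `tries` or `timeout` shortens the detection time, at the cost of a higher chance that a brief network hiccup triggers a switch.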
Heartbeat provides an init script for Debian you can use to start Heartbeat at boot time. However, you have to symlink to it yourself. It is important to actually symlink to it. Otherwise, the init script can't find the location of your Heartbeat installation.
$ cd /etc/init.d
$ ln -s /path/to/heartbeat/bin/debian heartbeat
$ update-rc.d heartbeat defaults
Finally, you can start the daemon:
$ /etc/init.d/heartbeat start
Unless you run heartbeat on a Hetzner machine that actually listens to your Failover IP, you can simply use your Failover IP for the `ping_ip` option.
Otherwise, assume you have, for example, two load balancers and you want heartbeat to run on each load balancer to monitor the state of the other one. In case one load balancer crashes, you want heartbeat running on the other load balancer to detect this and to switch the Failover IP to itself. However, as your load balancers both listen to the Failover IP themselves, a ping to the Failover IP would only monitor the individual server heartbeat is running on (not the other one). Thus, the `ping_ip` option enables you to specify the exact ip you want to monitor on a specific host, i.e. the individual IP of the other load balancer.
Example:
You have two load balancers `1.1.1.1` and `2.2.2.2` and a Failover IP `0.0.0.0` that both load balancers additionally listen to.
On `1.1.1.1`, the respective parts of your heartbeat config would look like:
failover_ip: 0.0.0.0
ping_ip: 2.2.2.2
ips:
  - ping: 1.1.1.1
    target: 1.1.1.1
  - ping: 2.2.2.2
    target: 2.2.2.2
And on `2.2.2.2`, your heartbeat config would partially look like:
failover_ip: 0.0.0.0
ping_ip: 1.1.1.1
ips:
  - ping: 1.1.1.1
    target: 1.1.1.1
  - ping: 2.2.2.2
    target: 2.2.2.2
But what about `ping` and `target` within the `ips` block?
Assume you run virtual machines on your server, where each virtual machine listens to an individual IP address. Your Failover IP, however, can only be bound to your server's main IP address. Thus, the `ping` option tells heartbeat about the virtual machines' IP addresses, and heartbeat will use these addresses to check the availability of your virtual machines. The `target` option, in contrast, tells heartbeat which IP address to use in case heartbeat switches the Failover IP to the associated server. If you don't use virtual machines or multiple IP addresses on your servers, you can simply use your server's main IP address for both the `ping` and the `target` option.
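The split between the two options can be shown with a small Ruby sketch. It is an illustration only: the `Entry` struct and the helper `target_for_first_reachable` are hypothetical, and the `reachable` lambda stands in for a real ping.

```ruby
Entry = Struct.new(:ping, :target)

# Hypothetical helper: the `ping` address decides whether an entry is
# healthy, while the `target` address is what the Failover IP is switched to.
def target_for_first_reachable(entries, reachable)
  entry = entries.find { |e| reachable.call(e.ping) }
  entry && entry.target
end

entries = [
  Entry.new("10.0.0.1", "10.0.0.2"), # ping: VM address, target: main address
  Entry.new("10.0.1.1", "10.0.1.2"),
]
vm_up = ->(ip) { ip == "10.0.1.1" }
puts target_for_first_reachable(entries, vm_up) # prints 10.0.1.2
```

Only the second virtual machine answers, so the Failover IP would be pointed at the main address of the second server, not at the virtual machine's own address.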
If you want to force a Failover IP switch, add `force_down: true` to your config, restart heartbeat, wait for the switch, remove it from your config and restart heartbeat again.
To run only a single availability check, add `only_once: true` to your config. With this option, heartbeat runs one check instead of entering a loop that runs the checks continuously. After the check, heartbeat will terminate.
If you want heartbeat to neither trigger an IP failover nor run the hooks, but just to tell you what it would do, add `dry: true` to your config.
If you want heartbeat to log somewhere other than `$WHERE_THE_BINARY_IS/../log`, set this environment variable:
HEARTBEAT_LOG=/var/log/heartbeat/log
or
HEARTBEAT_LOG=STDOUT
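How such a variable could be resolved into a logger target is sketched below. This is an assumption about the behaviour, not Heartbeat's actual code: `log_target` is a hypothetical helper and `log/heartbeat.log` is a made-up default path used only for the example.

```ruby
require "logger"

# Hypothetical helper: resolves where to log from the environment. "STDOUT"
# selects standard output, any other value is treated as a file path, and an
# unset variable falls back to the given default path.
def log_target(env, default_path)
  value = env["HEARTBEAT_LOG"]
  return $stdout if value == "STDOUT"

  value || default_path
end

# A plain hash stands in for ENV here; "log/heartbeat.log" is a made-up default.
logger = Logger.new(log_target({ "HEARTBEAT_LOG" => "STDOUT" }, "log/heartbeat.log"))
logger.info("heartbeat started")
```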
If you do planned maintenance and have to shut down or reboot your server, there are usually two ways to avoid downtime in case the failover IP currently points to the server you have to reboot. You can a) switch the failover IP manually, either via `force_down: true` or Hetzner's Robot, or you can b) wait for heartbeat to switch the failover IP for you when heartbeat detects that the server the failover IP points to is down. However, option a) requires manual intervention and option b) results in more downtime than necessary. Thus, heartbeat provides a third way via the bin/debian-shutdown init script. This init script will automatically switch the failover IP when you shut down the server.
To install it, run:
$ cd /etc/init.d
$ ln -s /path/to/heartbeat/bin/debian-shutdown heartbeat-shutdown
$ update-rc.d heartbeat-shutdown defaults
like you do for bin/debian. When you shut down your server, this init script will run heartbeat using the config/shutdown.yml config file to switch the failover IP in case it currently points to the server you are about to shut down. config/shutdown.yml provides the same options as config/heartbeat.yml, but you have to use it differently, because in this case you want heartbeat to monitor the server it runs on by using `ping_ip: [current ip]`. At the same time, you want heartbeat to assume that the server it currently runs on is down by using `force_down: true`, because the server will be down soon. Moreover, you want heartbeat to run only a single check by using the `only_once: true` option, such that heartbeat will terminate afterwards.
Please note that bin/debian-shutdown is configured to run before other daemons are about to stop and you can use hooks to stop other daemons before heartbeat switches the failover IP.
You can add your own hooks which will be run before or after the Failover IP is switched from one active server ip to another in case the first one is down. To add hooks, add your shell, ruby or other scripts to the 'hooks/before' or 'hooks/after' folder within heartbeat's root folder. Please note that your scripts must of course be executable by the heartbeat daemon. Heartbeat will execute your scripts in alphabetical order and will pass the failover ip as first argument, the old active server ip as second argument and the new active server ip as the third argument to your scripts. Please take a look at examples/hooks/email to learn more about how to write your own hooks.
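As an illustration of the argument order described above, a minimal hook written in Ruby could look like the following sketch. It is not the bundled examples/hooks/email script; the helper name `describe_switch` and the message text are made up for this example. Saved as an executable file in 'hooks/after', it would print one line per switch.

```ruby
#!/usr/bin/env ruby

# Hypothetical hook: formats a one-line summary of a switch from the three
# positional arguments heartbeat passes to every hook:
# failover ip, old active server ip, new active server ip.
def describe_switch(args)
  failover_ip, old_ip, new_ip = args
  "Failover IP #{failover_ip} switched from #{old_ip} to #{new_ip}"
end

# Only act when called with the expected three arguments.
puts describe_switch(ARGV) if ARGV.length == 3
```

Because hooks run in alphabetical order, a numeric prefix in the file name is one simple way to control the sequence.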
Heartbeat allows you to monitor multiple failover IPs independently by providing multiple config files. For each file named like `config/heartbeat*.yml`, the daemon will start a separate thread. For example, you could create `config/heartbeat0000.yml` to monitor your first failover IP `0.0.0.0` and `config/heartbeat5555.yml` to monitor your second failover IP `5.5.5.5`.
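The one-thread-per-config-file idea can be sketched like this. It is a simplified illustration, not the daemon's actual code: `config_paths` and `monitor_all` are hypothetical helpers, and the block returns immediately, whereas a real monitor would keep looping.

```ruby
# Hypothetical helper: lists the config files that match heartbeat*.yml
# inside the given config directory, in a stable order.
def config_paths(config_dir)
  Dir.glob(File.join(config_dir, "heartbeat*.yml")).sort
end

# Hypothetical helper: runs the given block once per config file, each in
# its own thread, and returns the block results in the order of `paths`.
def monitor_all(paths)
  paths.map { |path| Thread.new { yield path } }.map(&:value)
end

paths = ["config/heartbeat0000.yml", "config/heartbeat5555.yml"]
results = monitor_all(paths) { |path| "monitoring #{path}" }
puts results
```

Since each file gets its own thread, a sleep after a switch for one failover IP would not delay the checks for the other one.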
To run the tests, simply run:
$ bundle exec rake test
- Fork it
- Create your feature branch (`git checkout -b my-new-feature`)
- Commit your changes (`git commit -am 'Added some feature'`)
- Push to the branch (`git push origin my-new-feature`)
- Create new Pull Request