fix(rootfs): refactor database init process
The first time a database recovery completes, a new WAL timeline is started with an
ID of 2. On the next restore, the timeline that was active when the last backup was
taken is replayed. As a result, if you restore the database and do not take a backup
afterwards, all data written after that recovery is lost on the next restore, because
only WAL from the first timeline (the timeline on which the database was last backed
up) is replayed.

To fix this, we create a fresh backup after completing a database recovery, which
establishes a new recovery baseline. Subsequent restores can then replay from
timeline 2.
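
To check which timeline a data directory is currently on, the control file can be inspected. The following is a minimal sketch, assuming `pg_controldata` is on the `PATH` and `$PGDATA` points at the data directory. The helper names `parse_timeline` and `current_timeline` are hypothetical and not part of this commit.

```shell
#!/bin/sh
# Hypothetical helpers (not part of this commit) for inspecting the WAL timeline.

# Read pg_controldata output on stdin and print the latest checkpoint's timeline ID.
parse_timeline() {
  awk -F': *' '/^Latest checkpoint.s TimeLineID:/ { print $2; exit }'
}

# Print the timeline of the data directory in $PGDATA.
# Assumes pg_controldata is installed and $PGDATA is set.
current_timeline() {
  pg_controldata "$PGDATA" | parse_timeline
}
```

After a first successful recovery, `current_timeline` should report the new timeline described above rather than timeline 1.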

Other fixes that this refactor includes:

 - database attempting to start halfway through recovery
 - /bin/is_running identifying the database as running during recovery
 - cleaning up database boot script
Matthew Fisher committed Jul 25, 2016
1 parent b66069b commit b7814ba
Showing 6 changed files with 25 additions and 21 deletions.
2 changes: 2 additions & 0 deletions rootfs/bin/backup
@@ -3,6 +3,8 @@
export BACKUP_FREQUENCY=${BACKUP_FREQUENCY:-12h}
export BACKUPS_TO_RETAIN=${BACKUPS_TO_RETAIN:-5}

+until psql -l -t >/dev/null 2>&1; do sleep 1; done
+
while true; do
sleep "$BACKUP_FREQUENCY"
echo "Performing a base backup..."
14 changes: 14 additions & 0 deletions rootfs/bin/backup-initial
@@ -0,0 +1,14 @@
+#!/usr/bin/env bash
+
+until psql -l -t >/dev/null 2>&1; do sleep 1; done
+
+while true; do
+if [[ ! -f "$PGDATA/recovery.conf" ]] ; then
+# Push a fresh backup so we have a new recovery baseline
+echo "Performing an initial backup..."
+envdir "$WALE_ENVDIR" wal-e backup-push "$PGDATA"
+echo "Backup has been completed."
+break
+fi
+sleep 1
+done
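
The script above combines two conditions before pushing the backup: the server answers `psql -l -t`, and `$PGDATA/recovery.conf` no longer exists (PostgreSQL renames it to `recovery.done` when recovery finishes). A rough sketch of that decision as a single function follows. `db_state` and the overridable `PROBE` variable are hypothetical names introduced here for illustration, not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit): classify the database state
# using the same two signals as backup-initial.
#   $1    - data directory (e.g. "$PGDATA")
#   PROBE - command that succeeds when the server accepts connections;
#           defaults to "psql -l -t" and is overridable for testing.
db_state() {
  # Intentionally unquoted so the probe command is split into words.
  if ! ${PROBE:-psql -l -t} >/dev/null 2>&1; then
    echo "unavailable"
  elif [ -f "$1/recovery.conf" ]; then
    echo "recovering"
  else
    echo "ready"
  fi
}
```

Under these assumptions, a base backup would only be pushed once `db_state "$PGDATA"` prints `ready`.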
2 changes: 1 addition & 1 deletion rootfs/bin/is_running
@@ -4,4 +4,4 @@
set -e

# check if database is running
-gosu postgres pg_ctl status
+gosu postgres psql -l -t >/dev/null 2>&1
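
The `psql -l -t` probe succeeds only once the server accepts connections, whereas `pg_ctl status` reports whether a server is running in the data directory. The backup scripts in this commit poll the probe without a deadline. A bounded variant might look like the sketch below, where `wait_for` is a hypothetical helper and not part of this commit.

```shell
#!/bin/sh
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-run COMMAND once per second until it succeeds,
# giving up after roughly TIMEOUT_SECONDS. Returns 0 on success, 1 on timeout.
wait_for() {
  _wf_timeout="$1"
  shift
  _wf_waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$_wf_waited" -ge "$_wf_timeout" ]; then
      return 1
    fi
    sleep 1
    _wf_waited=$((_wf_waited + 1))
  done
  return 0
}
```

For example, `wait_for 300 psql -l -t || exit 1` would stop waiting after about five minutes instead of looping forever; the 300-second figure is an arbitrary illustration.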
17 changes: 3 additions & 14 deletions rootfs/docker-entrypoint-initdb.d/003_restore_from_backup.sh
@@ -11,15 +11,10 @@ EOF
chown -R postgres:postgres "$PGDATA"
chmod 0700 "$PGDATA"

-# reboot the server for wal_level to be set before backing up
-echo "Rebooting postgres to enable archive mode"
-gosu postgres pg_ctl -D "$PGDATA" -w restart
-
# check if there are any backups -- if so, let's restore
# we could probably do better than just testing number of lines -- one line is just a heading, meaning no backups
if [[ $(envdir "$WALE_ENVDIR" wal-e --terse backup-list | wc -l) -gt "1" ]]; then
echo "Found backups. Restoring from backup..."
-gosu postgres pg_ctl -D "$PGDATA" -w stop
rm -rf "$PGDATA"
envdir "$WALE_ENVDIR" wal-e backup-fetch "$PGDATA" LATEST
cat << EOF > "$PGDATA/postgresql.conf"
@@ -35,6 +30,9 @@ archive_mode = on
archive_command = 'envdir "${WALE_ENVDIR}" wal-e wal-push %p'
archive_timeout = 60
listen_addresses = '*'
+archive_mode = on
+archive_command = 'envdir "${WALE_ENVDIR}" wal-e wal-push %p'
+archive_timeout = 60
EOF
cat << EOF > "$PGDATA/pg_hba.conf"
# "local" is for Unix domain socket connections only
@@ -48,17 +46,8 @@ host all all 0.0.0.0/0 md5
EOF
touch "$PGDATA/pg_ident.conf"
echo "restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch \"%f\" \"%p\"'" >> "$PGDATA/recovery.conf"
-chown -R postgres:postgres "$PGDATA"
-chmod 0700 "$PGDATA"
-gosu postgres pg_ctl -D "$PGDATA" \
--o "-c listen_addresses=''" \
--t 1200 \
--w start
fi

-echo "Performing an initial backup..."
-gosu postgres envdir "$WALE_ENVDIR" wal-e backup-push "$PGDATA"
-
# ensure $PGDATA has the right permissions
chown -R postgres:postgres "$PGDATA"
chmod 0700 "$PGDATA"
3 changes: 3 additions & 0 deletions rootfs/docker-entrypoint-initdb.d/004_run_backups.sh
@@ -1,4 +1,7 @@
#!/usr/bin/env bash

+# Create a fresh backup as a starting point
+gosu postgres backup-initial &
+
# Run periodic backups in the background
gosu postgres backup &
8 changes: 2 additions & 6 deletions rootfs/docker-entrypoint.sh
@@ -80,21 +80,17 @@ if [ "$1" = 'postgres' ]; then
EOSQL
echo

+gosu postgres pg_ctl -D "$PGDATA" -m fast -w stop
+
echo
for f in /docker-entrypoint-initdb.d/*; do
case "$f" in
*.sh) echo "$0: running $f"; . "$f" ;;
-*.sql)
-echo "$0: running $f";
-psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" < "$f"
-echo
-;;
*) echo "$0: ignoring $f" ;;
esac
echo
done

-gosu postgres pg_ctl -D "$PGDATA" -m fast -w stop
set_listen_addresses '*'

echo