
fix(rootfs): refactor database init process #112

Closed

Conversation

@bacongobbler (Member) commented Jun 6, 2016

This refactor includes:

  • removing PGCTLTIMEOUT; now you only need to modify the readinessProbe timeout
  • cleaning up database boot script

Note that this is not required for 2.0.

Manual testing procedure:

  • boot up database
  • create a few apps
  • delete the database via kubectl delete po
  • ensure the database recovers correctly

closes #55
closes #108
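For reference, a rough sketch of the manual testing procedure above as commands (the namespace, pod name, and app name here are assumptions for illustration, not values taken from this PR):

# wait for the database pod to come up
kubectl --namespace=deis get pods
# create a few apps (assumes the Workflow CLI is configured)
deis create my-test-app
# delete the database pod, forcing a recovery
kubectl --namespace=deis delete po deis-database-<pod-id>
# watch the replacement pod and confirm it recovers with the apps intact
kubectl --namespace=deis logs -f deis-database-<new-pod-id>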

@mention-bot

By analyzing the blame information on this pull request, we identified @kmala to be a potential reviewer

@monaka (Contributor) commented Jul 17, 2016

I rebased this patch in my repository and ran the tests on Travis CI.
(Strictly speaking, my WAL-E is patched, but that should have few side effects on this PR.)

The tests failed randomly; sometimes they finish successfully. Below is an example of the log on failure.

LOG:  restored log file "00000001000000000000000D" from archive
LOG:  redo starts at 0/D000090
LOG:  consistent recovery state reached at 0/D0000B8
FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
-----> checking if postgres is running
make: *** [test-functional-swift] Error 2

@bacongobbler (Member, Author)

Yeah, this is still a WIP; it's not expected to work yet.

@@ -0,0 +1,14 @@
#!/usr/bin/env bash

until psql -l -t >/dev/null 2>&1; do sleep 1; done
@monaka (Contributor) Jul 23, 2016

Should this line be inside an "if ... then ... fi" block? (e.g. line 7)

(refs #123 (comment))

@bacongobbler (Member, Author)

How come? We want to wait here until the database is ready to accept connections. When I get back to this PR, I'd like to rewrite this as

until is_running; do sleep 1; done

so it's clearer what the intent of this line is.
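For what it's worth, a minimal sketch of what such an is_running helper could look like (my assumption, not the script from this repo): report success only once postgres accepts connections and has finished replaying WAL.

#!/usr/bin/env bash
# hypothetical is_running sketch: exit 0 only when postgres accepts
# connections and is no longer in recovery
pg_isready -q && [ "$(psql -At -c 'SELECT pg_is_in_recovery()')" = "f" ]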

@monaka (Contributor) commented Jul 25, 2016

Travis shows errors. I guess they may be caused by a timeout. Should the test driver be patched?

@bacongobbler (Member, Author)
I think it's because I'm out of date with master. Trying again.

@bacongobbler (Member, Author)
OK, got Travis to pass; now we just need e2e to pass.

@bacongobbler bacongobbler added this to the v2.3 milestone Jul 25, 2016
@@ -35,6 +30,9 @@ archive_mode = on
archive_command = 'envdir "${WALE_ENVDIR}" wal-e wal-push %p'
archive_timeout = 60
listen_addresses = '*'
archive_mode = on
archive_command = 'envdir "${WALE_ENVDIR}" wal-e wal-push %p'
archive_timeout = 60
@monaka (Contributor)

Duplicate of L31? (L33 and L34 are similar.)

@bacongobbler (Member, Author)

Good catch, must've missed it in the rebase.
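For the record, after dropping the duplicated lines the block would presumably read:

archive_mode = on
archive_command = 'envdir "${WALE_ENVDIR}" wal-e wal-push %p'
archive_timeout = 60
listen_addresses = '*'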

@bacongobbler bacongobbler force-pushed the refactor-database branch 4 times, most recently from 3396707 to 72390e8 Compare July 26, 2016 04:43
@bacongobbler bacongobbler removed this from the v2.3 milestone Jul 26, 2016
@bacongobbler (Member, Author)
Re-labeling as in progress until I identify the Travis failures.

@monaka (Contributor) commented Jul 28, 2016

I guess the failures on Travis are caused by timing issues in the tests only. I also got random failures on my own Travis builds.

@bacongobbler (Member, Author) commented Jul 28, 2016

I don't think these are random this time, but I'll eventually look into it.

@bacongobbler (Member, Author)
FYI, I managed to get #131 in, so half of the work here has already been done.

When you complete a recovery of the database for the first time, a new log timeline
is started with an ID of 2. When you restore again, the timeline in use when the last
backup occurred is replayed. Because of this, if you restored the database and did not
perform a backup, all data from that successful recovery would be lost, because only
WAL logs from the first timeline (the timeline on which the database was last backed
up) would be restored.

To fix this, after completing a database recovery we create a fresh backup to
establish a new recovery baseline. That way we can replay from timeline 2.

Other fixes that this refactor includes:

 - database attempting to start halfway through recovery
 - /bin/is_running identifying the database as running during recovery
 - cleaning up database boot script
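In shell terms, the post-recovery step described above would presumably boil down to pushing a fresh base backup once recovery finishes (a sketch only; ${PGDATA} and where this is invoked are assumptions, though backup-push is WAL-E's command for base backups):

# after recovery completes, push a fresh base backup so future restores
# replay WAL from the new timeline rather than the pre-recovery one
envdir "${WALE_ENVDIR}" wal-e backup-push "${PGDATA}"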
@bacongobbler bacongobbler force-pushed the refactor-database branch 2 times, most recently from 917e795 to 37a75b5 Compare November 2, 2016 17:03
@mboersma (Member)
@bacongobbler is this PR still worth pursuing? Should we prioritize it for the next release, or maybe mark it "help wanted"?

@bacongobbler (Member, Author)
I think it's still worth pursuing eventually to get rid of PGCTLTIMEOUT on long recovery times; otherwise there's no real end benefit. It's mostly cleanup, so it's not a high priority.

@mboersma (Member) commented Jul 5, 2017

I'm going to close this PR since it's old and rusty, but @bacongobbler please re-open if this code gets resurrected.

@mboersma mboersma closed this Jul 5, 2017