
Upgrading from image based on Debian #16

Open
xojoc opened this issue Oct 16, 2023 · 38 comments

@xojoc commented Oct 16, 2023

Hi,
I'm currently using postgres:14.1-bullseye. Do you think that switching to Alpine could cause problems? Is this use case supported?

Thanks

@xojoc changed the title from "Upgrading with previous image using debian" to "Upgrading from image based on Debian" on Oct 16, 2023
@justinclift (Member)

At the moment I wouldn't risk it.

Or, at least I'd make a backup before trying it, then check carefully if it went ok.

There are reports of collation problems after upgrading, and the common thread seems to be people switching from non-Alpine-based builds to our (current) Alpine-only one.

It shouldn't technically be all that hard for us to make non-Alpine based images available too, but it's just not something we've looked at yet.

@rick-giorgi

In my test lab I faced a similar thing with an Ubuntu-based source PG (the base postgres:15 image)

Interestingly enough, I ran the Alpine-based upgrade image and then reverted back to the stock-Ubuntu one and the collation warnings went away. All the DB-based apps I had (5 or 6 of them) are working normally, no data loss (haven't seen any deviation so far).

This is a lab setup, so I'm not saying it is prod-ready, but it may be worth testing if you're just playing with it like me.
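In docker run terms, that "upgrade, then revert" flow would look roughly like this (the volume name, password, and image tags are illustrative, adjust them to your setup):

    # Run the Alpine-based pgautoupgrade image once against the existing data volume.
    # It upgrades the cluster on startup and then runs PostgreSQL as normal;
    # stop it once the upgrade has completed.
    docker run --rm \
        -v pgdata:/var/lib/postgresql/data \
        -e POSTGRES_PASSWORD=changeme \
        pgautoupgrade/pgautoupgrade:15-alpine

    # Then switch back to the stock Debian-based image on the same volume.
    docker run -d --name postgres \
        -v pgdata:/var/lib/postgresql/data \
        -e POSTGRES_PASSWORD=changeme \
        postgres:15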

@justinclift (Member)

Cool! That might be the right approach for people to use then. 😄

@justinclift (Member) commented Nov 16, 2023

@wuast94 @fOmey That might be worth your trying out too. 😄

@Davo1624 commented Dec 6, 2023

I am just a lowly user who has no rights to be messing with postgres but here we are...

Trying to upgrade from pg14 (Debian 14.10-1 docker image) results in the following:


Checking cluster versions ok
Checking database user is the install user ok
Checking database connection settings ok
Checking for prepared transactions ok
Checking for system-defined composite types in user tables ok
Checking for reg* data types in user tables ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for incompatible "aclitem" data type in user tables ok
Creating dump of global objects ok
Creating dump of database schemas
failure

Consult the last few lines of "/var/lib/postgresql/data/new/pg_upgrade_output.d/20231206T014727.753/log/pg_upgrade_dump_339654.log" for
the probable cause of the failure.
Failure, exiting
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The relevant log reports:
command: "/usr/local/bin/pg_dump" --host /var/lib/postgresql/data --port 50432 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom --file="/var/lib/postgresql/data/new/pg_upgrade_output.d/20231206T013912.915/dump/pg_upgrade_dump_339654.custom" 'dbname=immich' >> "/var/lib/postgresql/data/new/pg_upgrade_output.d/20231206T013912.915/log/pg_upgrade_dump_339654.log" 2>&1
pg_dump: error: query failed: ERROR: could not access file "$libdir/cube": No such file or directory
pg_dump: detail: Query was: EXECUTE getDomainConstraints('1572422')

I understand this container is meant for Alpine-based upgrades, but I figured I would chime in.

@justinclift (Member)

@Davo1624 Oops, I didn't see this comment until just now. Did you get this figured out?

@Davo1624

> @Davo1624 Oops, I didn't see this comment until just now. Did you get this figured out?

Unfortunately no, I did not. I wasn't sure if accommodating Alpine-based images was within the scope of your project, but I will take all the help I can get!

@justinclift (Member)

No worries. Looking at the error log there, it's complaining about a missing cube file.

That's probably the cube extension for PostgreSQL:

https://www.postgresql.org/docs/current/cube.html

At present the pg_autoupgrade image doesn't include any of the optional PostgreSQL extensions (I didn't think of it until prompted recently).

So we'll probably need to add support for them to get it working.

Alternatively, is there some way to temporarily remove the extension from your PostgreSQL install while you do the upgrade?

That might fix the problem too, but only if it's the only extension you're using.
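If it is, a minimal sketch of that approach (assuming the immich database from the log above, and that nothing in the schema still depends on the cube type; if something does, DROP EXTENSION will refuse to run):

    # Drop the extension before the upgrade...
    psql --username postgres --dbname immich -c 'DROP EXTENSION cube;'
    # ...and recreate it afterwards on the new version:
    psql --username postgres --dbname immich -c 'CREATE EXTENSION cube;'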

@Davo1624 commented Jan 13, 2024

Thank you for the very detailed explanation!!

I am running the default postgres:14 image with no extensions so I guess the cube extension is baked in. I will do some internet sleuthing to see if there is a way to get around that and I really really appreciate the info!

Edit: running \dx in psql results in the following:

                 List of installed extensions
  Name   | Version |   Schema   |         Description
---------+---------+------------+------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(1 row)

So I don't think the cube extension exists in the image.

@justinclift (Member)

Interesting. I wonder why the pg_dump command is trying to use that cube file then?

Hmmm, is it feasible for you to do a pg_dump of your database (from the working database), then grep the resulting file for cube to see if anything interesting shows up?
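Something like this should do it, assuming the default postgres superuser (the database name here is just from the log above):

    # Dump the database as plain SQL, then search the dump for "cube"
    pg_dump --username postgres immich > immich_dump.sql
    grep -i cube immich_dump.sql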

@Davo1624 commented Jan 13, 2024

The database that was erroring during the backup process was immich. Using pg_dump and then grepping that specific DB's dump for cube results in the following:

-- Name: cube; Type: EXTENSION; Schema: -; Owner: -
CREATE EXTENSION IF NOT EXISTS cube WITH SCHEMA public;
-- Name: EXTENSION cube; Type: COMMENT; Schema: -; Owner:
COMMENT ON EXTENSION cube IS 'data type for multidimensional cubes';
MZ.09.10701343  Distrito de Mocubela
679932  Curcubeu        26.14405        44.76261        RO      30      132226  2013-04-21
305145  Kuzucubelen     34.43765        36.83996        TR      32              2013-09-08

So that one specific database is using the cube extension.

I am a bit out of my depth here so any suggestions are welcome and thanks again for bearing with me.

@justinclift (Member)

No worries, that just means we're on the right track. I'll try and look at the extensions side of this in the next few days, as adding them to the Docker container here is probably all that's needed. Hopefully. 😅

@Davo1624

Sounds great!

@justinclift (Member)

So much for "next few days". 😦 I've only just started looking at this particular issue, and it looks like it'll be a simple fix.

It turns out that where the build script is running make install (example here), it should instead be running make install-world.

With that change made, the common PostgreSQL extensions (including cube) are included in the image. So you'd likely be able to do the upgrade with this Docker container without hitting issues.
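For reference, a plain PostgreSQL source build with the contrib extensions included looks roughly like this (illustrative only, not the project's exact build script):

    ./configure --prefix=/usr/local
    make world            # builds the server plus the contrib extensions such as cube
    make install-world    # installs the server, client programs, docs, and contrib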

I'm testing a few other minor adjustments to the build process too today. Assuming nothing goes badly wrong, then new Docker images that include the extensions should be available later on today. Hopefully. 😄

@justinclift (Member)

@Davo1624 The pgautoupgrade docker images now include the common PostgreSQL extensions (as above), so if you're up for testing things out again then it should all work ok. In theory. 😄

@Davo1624

Many thanks, will give it a whirl and report back!

@justinclift (Member)

@Davo1624 How'd you go with this, did it work? 😄

@andyundso (Member)

I also recently upgraded from a Debian-based Postgres v15 to pgautoupgrade v16, which is Alpine-based. As @rick-giorgi mentioned, I got some collation warnings (which did not go away), but I was also able to just use the "regular" Debian-based Postgres v16 image again.

Like @justinclift, I don't think it should be too much effort to offer Debian-based images; it just requires some tweaks to the build pipeline and automated testing.

@justinclift (Member)

@andyundso Yeah, similar thought.

Any ideas for a suitable naming scheme for the Debian based ones? Maybe just appending -debian to the end or something?

@andyundso (Member)

I would suggest -bookworm to align with Postgres (they also offer a bullseye-based image, which I think we do not need).

@justinclift (Member)

Cool, that makes good sense. In theory it shouldn't be too hard to get this working.

Without actually checking yet (!), we can probably do the same thing for the Debian one that we do for the Alpine one: i.e. grab the official Docker PostgreSQL (for bookworm) entrypoint script, then insert the pgautoupgrade scripting into it.

We might even be able to use the same pgautoupgrade script piece in both (via include file maybe?) which would be pretty optimal. 😄

@jjb (Member) commented May 14, 2024

source '../foo.sh'

works well, but the path is resolved relative to where the script is invoked from, not relative to the file itself. Maybe not a problem here; it's reasonable to assume the point of invocation is always the same.

There's a common hack to calculate the full path; it is not good looking.
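For reference, that hack usually looks like this in bash (BASH_SOURCE is bash-specific):

    # Absolute directory of the current file, regardless of where it was invoked from
    SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
    source "${SCRIPT_DIR}/../foo.sh"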

@jjb (Member) commented May 14, 2024

We could convert the whole project to ruby and ERB 🧌

@justinclift (Member) commented May 14, 2024

Um... no. 😄

Ruby is firmly on my list of languages to not use unless there's literally no other choice (or close to), after having been exposed to it briefly about a decade ago.

Nothing so much wrong with the language syntax or anything (back then), but when my employer at the time (Red Hat) discovered long running Ruby processes randomly segfaulted, they had one of the senior guys (Jim Meyering I think) investigate. Turned out to be due to umm... (this is from long ago memory) Ruby leaving dangling pointers or something in functions when they terminated, rather than zero-ing out those pointers. So the Ruby garbage collection process would hit those dangling pointers and segfault.

The senior guy wrote a patch to fix it by correctly clearing the pointers, and proposed it upstream. But the Ruby Community decided they'd rather keep the random segfaults than have the ~2% speed impact from the patch (!!!!!).

That was when Ruby went onto my permanent "not worth bothering with" list, and I've not seen any reason to change my opinion. Meanwhile, I tend to prefer Go for most things these days. 😁

@justinclift (Member)

Yeah, the "source" approach for including common code might be the go.

We'll probably find out if it's that easy once someone spends some implementation time experimenting with the pieces. 😄

@jjb (Member) commented May 14, 2024

Very interesting anecdote! I would be very interested in what that patch was, but it's probably almost impossible to find, since Ruby wasn't on GitHub at the time, I think.

@justinclift (Member)

Hmmm, if the Ruby developers have a mailing list + matching archive, that might do it?

@andyundso (Member)

> We might even be able to use the same pgautoupgrade script piece in both (via include file maybe?) which would be pretty optimal. 😄

Maybe I do not understand the issue, but I would assume we can just copy the same docker-entrypoint.sh into the potential Debian container? Or what would you need to source?

I could see ERB helping with templating a Dockerfile for both Debian and Alpine, but for now I would suggest maintaining two different Dockerfiles, one for Alpine and one for Debian. They will be quite similar in structure, but we still have to install different tools on Debian than on Alpine.

@justinclift (Member) commented May 14, 2024

> I would assume we can just copy the same docker-entrypoint.sh into the potential Debian container

The docker entrypoint script we're using at the moment was based upon the official Docker PG Alpine one (copied from their GitHub repo).

I haven't checked if they're using the same entrypoint script in both their Debian and Alpine containers. If they are, then you're right it should be the same.

If they're not though, then we probably can't use our entrypoint one as is.

@andyundso (Member)

So I took the most recent version of the entrypoint script from here for Alpine and here for bookworm and ran them through a text compare. There is one change (left is Alpine, right is bookworm):

[screenshot: side-by-side diff of the two entrypoint scripts; the only difference is Alpine using su-exec where bookworm uses gosu]

Maybe we could use this gosu tool for Alpine as well. The gosu README mentions that su-exec is a rewrite of gosu in C that is available on Alpine, but I think a couple of megabytes more does not hurt our Alpine container, for the sake of consistency across the Debian and Alpine images.

@andyundso (Member)

So gosu is available in Alpine Edge, but not in 3.19. We could, however, use the Docker build from the maintainer to copy out the binary.

I'm not sure that's my favourite solution. Potentially we could also implement a switch in our Docker entrypoint script to determine whether we are running under Alpine or Debian and invoke the correct tool, as sketched below.
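Something along these lines (the detection method is one common approach, not tested here):

    # Pick the privilege-dropping helper based on the distro we are running on
    if [ -f /etc/alpine-release ]; then
        su_helper="su-exec"
    else
        su_helper="gosu"    # the Debian/bookworm image
    fi
    exec "$su_helper" postgres "$@"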

@jjb (Member) commented May 15, 2024

> implement a switch in our Docker entryscript to determine if we are running under Alpine or Debian to invoke the correct tool

That sounds like the best option to me.

@justinclift (Member)

@andyundso Sounds like a reasonable idea. Maybe try it out, and we can put some kind of "EXPERIMENTAL Debian build!" notice on the front page README with info about it?

And probably also ask for people to try it out (etc). There are a few open Issues in the repo with people who would probably give it a go.

We could also ask on that super-long thread about upgrading, on the official Docker PG image too. 😄

@justinclift (Member)

@xojoc @rick-giorgi @Davo1624 If anyone's up for testing a new Debian based pgautoupgrade image, then @andyundso has officially added ones for PostgreSQL 15 and 16:

  • 15-bookworm
  • 16-bookworm

Anyone up for giving it a go? 😄
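Pulling one of the new tags should presumably look like this (assuming the project's usual pgautoupgrade/pgautoupgrade namespace on Docker Hub):

    docker pull pgautoupgrade/pgautoupgrade:16-bookworm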

@josecsotomorales

Hi @justinclift! I tested upgrading from 14 -> 16 using 16-bookworm. I tested extensively in our dev environments and then deployed the image to over 20 managed PostgreSQL databases running on K8s. It worked perfectly; post-upgrade I had to execute some manual cleanup to get the engine executing queries faster than before:

REINDEX DATABASE database;
VACUUM FULL VERBOSE;
ANALYSE VERBOSE;

Thanks all for this fantastic project!! I am happy to contribute to future development 🚀

@justinclift (Member)

Awesome, thanks heaps @josecsotomorales! 😄

Yeah, I'd really like for us to figure out solid post-upgrade steps to be automatically run after each upgrade. Re-indexing is definitely on that list. I'm just concerned about the amount of time it can take with larger data sets.

Though I suppose we should start simple (ie just do it) and then figure out more intelligent options over time if they do turn out to be needed.

@wuast94 (Member) commented Jul 26, 2024

Maybe something we could turn on via an env variable, but defaulting to false? Something like the sketch below, perhaps.
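A minimal illustration of that toggle (the PGAUTO_REINDEX variable name is hypothetical, not an existing option):

    # In the entrypoint, after a successful pg_upgrade:
    if [ "${PGAUTO_REINDEX:-false}" = "true" ]; then
        # reindexdb ships with PostgreSQL and reindexes every database in the cluster
        reindexdb --username postgres --all
    fi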

@justinclift (Member) commented Jul 27, 2024

Yeah, I think we'll ultimately need to have some kind of yes/no/(etc) variables to control the (automatic) post-upgrade tasks like re-indexing.

Creation of new indexes isn't needed on every upgrade. However, it is needed for some specific upgrades, including (sometimes) for upgrades to a new point release when there was a bug in the prior one's indexing code.

To get things exactly right (rather than just do "always reindex" or similar), we'd need to have a good idea of the exact PostgreSQL version (including point release version) being upgraded from.

Which is a problem as PostgreSQL only records the major version in the PG_VERSION file.


We could (for pgautoupgrade images) store the exact PostgreSQL version (including point release) in an extra data file in the database directory. Then we could use that extra information when upgrading from one pgautoupgrade image to another for deciding whether to reindex.

If the file isn't present when upgrading (eg coming from a non-pgautoupgrade image), then we could just blindly always do the reindexing.

Something along these lines would probably be the most accurate approach, at least to only reindex when it's truly needed.
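A rough sketch of that record-and-compare idea (the PG_VERSION_FULL file name and the surrounding logic are hypothetical):

    # After a successful upgrade, record the full version next to PG_VERSION
    postgres --version | awk '{print $NF}' > "$PGDATA/PG_VERSION_FULL"

    # On the next upgrade, decide whether a reindex is warranted
    if [ -f "$OLD_PGDATA/PG_VERSION_FULL" ]; then
        old_full_version=$(cat "$OLD_PGDATA/PG_VERSION_FULL")
        # ...compare old_full_version against releases with known index bugs...
        :
    else
        # No record (e.g. coming from a non-pgautoupgrade image): reindex to be safe
        reindexdb --username postgres --all
    fi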
