
Deploy to AWS #34 (Closed)

amitaibu wants to merge 24 commits

Conversation

@amitaibu (Owner) commented Jan 6, 2024

No description provided.

@@ -12,11 +12,12 @@ JS_FILES += ${IHP}/static/vendor/turbolinksMorphdom.js
 
 include ${IHP}/Makefile.dist
 
-tailwind-dev:
+tailwind-dev: node_modules
@amitaibu (Owner, author):

@mpscholten what is node_modules doing here?

@amitaibu (Owner, author) commented Jan 9, 2024:

Oh, I guess it should be npm, so that it calls make npm, and I should have:

npm:
	npm install

Collaborator:

It triggers a make node_modules when the node_modules directory doesn't exist.

@amitaibu (Owner, author):

So we should also define:

node_modules:
	npm install

?

Collaborator:

It's already defined like that (npm ci is similar to npm install).

@amitaibu (Owner, author):

Thanks, moving discussion to digitallyinduced/ihp#1890

@AronNovak (Collaborator):

#!/bin/bash

# Define port
PORT=80

# Define the response
read -r -d '' RESPONSE <<EOF
HTTP/1.1 200 OK\r
Content-Length: 11\r
Content-Type: text/plain\r
\r
Hello World
EOF

# Create a TCP server on port 80
while : ; do
  # Listen for a connection and respond
  echo -e "$RESPONSE" | nc -l $PORT
done

Using a very simple Bash-based webserver, I tested the connectivity. Port 80 is available, but certificate validation does not succeed.

[47 of 47] Linking build/bin/RunUnoptimizedProdServer
chmod +x Main.hs
rm -f build/bin/RunProdServer
ln -s `basename build/bin/RunUnoptimizedProdServer` build/bin/RunProdServer
installing
building '/nix/store/x55ydprmzj4hnqqw16ss2q9jwdaqfwqv-unit-app.service.drv'...
building '/nix/store/giv3x69p48z7100zz4vavrlas8qz75n0-unit-worker.service.drv'...
building '/nix/store/3ijzzvzyc4vkdz9s4cwjxcmj2g1d39yh-system-units.drv'...
building '/nix/store/wb4l8nxcyx4akhcy2rhcabd0an1p18mr-etc.drv'...
building '/nix/store/hpid7r5aafbicvs091rncbp0lcmvlpv9-nixos-system-unnamed-23.11.20231003.ea0284a.drv'...
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
updating GRUB 2 menu...
activating the configuration...
setting up /etc...
reloading user units for root...
setting up tmpfiles
warning: the following units failed: acme-tpp-qa.gizra.site.service, app.service, nginx.service, postgresql.service, worker.service

× acme-tpp-qa.gizra.site.service - Renew ACME certificate for tpp-qa.gizra.site
     Loaded: loaded (/etc/systemd/system/acme-tpp-qa.gizra.site.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Fri 2024-02-02 10:56:59 UTC; 60ms ago
TriggeredBy: ● acme-tpp-qa.gizra.site.timer
    Process: 54266 ExecStart=/nix/store/22l06paby9095dqr8wrscbxqm16xxrff-unit-script-acme-tpp-qa.gizra.site-start/bin/acme-tpp-qa.gizra.site-start (code=exited, status=10)
   Main PID: 54266 (code=exited, status=10)
         IP: 14.8K in, 6.8K out
        CPU: 114ms

Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:59 Could not obtain certificates:
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]:         error: one or more domains had a problem:
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: [tpp-qa.gizra.site] acme: error: 400 :: urn:ietf:params:acme:error:connection :: 3.249.27.81: Fetching http://tpp-qa.gizra.site/.well-known/acme-challenge/CHoIM6nWqiJafCsW_G8eJ5VdOKqCQo1ytff1S3IOVHg: Connection refused

@AronNovak (Collaborator):

Even via the hostname, the dummy webserver is available.

@mpscholten (Collaborator):

Can you run journalctl -u acme-tpp-qa.gizra.site.service to retrieve the logs of the failing Let's Encrypt service?

@AronNovak (Collaborator):

Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Starting Renew ACME certificate for tpp-qa.gizra.site...
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: Waiting to acquire lock /run/acme/1.lock
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: Acquired lock /run/acme/1.lock
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: + set -euo pipefail
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54268]: + mkdir -p /var/lib/acme/acme-challenge/.well-known/acme-challenge
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54268]: + chgrp nginx /var/lib/acme/acme-challenge/.well-known/acme-challenge
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: + echo 872737f092ffb012ff2b
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: + cmp -s domainhash.txt certificates/domainhash.txt
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: + lego --accept-tos --path . -d tpp-qa.gizra.site --email [email protected] --key-type ec256 --http --h>
Feb 02 10:56:54 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:54 [INFO] [tpp-qa.gizra.site] acme: Obtaining bundled SAN certificate
Feb 02 10:56:54 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:54 [INFO] [tpp-qa.gizra.site] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/31075>
Feb 02 10:56:55 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:55 [INFO] [tpp-qa.gizra.site] acme: Could not find solver for: tls-alpn-01
Feb 02 10:56:55 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:55 [INFO] [tpp-qa.gizra.site] acme: use http-01 solver
Feb 02 10:56:55 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:55 [INFO] [tpp-qa.gizra.site] acme: Trying to solve HTTP-01
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:59 [INFO] Deactivating auth: https://acme-v02.api.letsencrypt.org/acme/authz-v3/310754516227
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: 2024/02/02 10:56:59 Could not obtain certificates:
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]:         error: one or more domains had a problem:
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54271]: [tpp-qa.gizra.site] acme: error: 400 :: urn:ietf:params:acme:error:connection :: 3.249.27.81: Fetching http://tp>
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: + echo Failed to fetch certificates. This may mean your DNS records are set up incorrectly. Selfsigned certs are>
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: Failed to fetch certificates. This may mean your DNS records are set up incorrectly. Selfsigned certs are in pla>
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[54266]: + exit 10
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: acme-tpp-qa.gizra.site.service: Main process exited, code=exited, status=10/n/a
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: acme-tpp-qa.gizra.site.service: Failed with result 'exit-code'.
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Failed to start Renew ACME certificate for tpp-qa.gizra.site.
Feb 02 10:56:59 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: acme-tpp-qa.gizra.site.service: Consumed 114ms CPU time, received 14.8K IP traffic, sent 6.8K IP traffic.

I'd assume for now that the Nginx server is not started by the time it tries to validate the certificate.

@AronNovak (Collaborator):

Feb 02 10:56:33 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Starting Nginx Web Server...
Feb 02 10:56:33 ip-172-31-23-205.eu-west-1.compute.internal nginx-pre-start[54194]: nginx: [emerg] cannot load certificate "/var/lib/acme/tpp-qa.gizra.site/fullchain.pem": BIO_new_file() failed (SSL: error:800>
Feb 02 10:56:33 ip-172-31-23-205.eu-west-1.compute.internal nginx-pre-start[54194]: nginx: configuration file /etc/nginx/nginx.conf test failed
Feb 02 10:56:33 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Feb 02 10:56:33 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Failed with result 'exit-code'.
Feb 02 10:56:33 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Failed to start Nginx Web Server.
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Scheduled restart job, restart counter is at 1.
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Starting Nginx Web Server...
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal nginx-pre-start[54257]: nginx: [emerg] cannot load certificate "/var/lib/acme/tpp-qa.gizra.site/fullchain.pem": BIO_new_file() failed (SSL: error:800>
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal nginx-pre-start[54257]: nginx: configuration file /etc/nginx/nginx.conf test failed
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Failed with result 'exit-code'.
Feb 02 10:56:43 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Failed to start Nginx Web Server.
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Scheduled restart job, restart counter is at 2.
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Starting Nginx Web Server...
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal nginx-pre-start[54262]: nginx: [emerg] cannot load certificate "/var/lib/acme/tpp-qa.gizra.site/fullchain.pem": BIO_new_file() failed (SSL: error:800>
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal nginx-pre-start[54262]: nginx: configuration file /etc/nginx/nginx.conf test failed
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Failed with result 'exit-code'.
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Failed to start Nginx Web Server.
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Start request repeated too quickly.
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: nginx.service: Failed with result 'exit-code'.
Feb 02 10:56:53 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Failed to start Nginx Web Server.

Perhaps that's the issue: the Nginx config refers to a certificate that does not exist yet.

@mpscholten (Collaborator):

Hm, can you run systemctl restart acme-tpp-qa.gizra.site.service?

@AronNovak (Collaborator):

I just restarted the service; it fails with the same error:

Feb 02 13:02:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[63557]: 2024/02/02 13:02:59 Could not obtain certificates:
Feb 02 13:02:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[63557]:         error: one or more domains had a problem:
Feb 02 13:02:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[63557]: [tpp-qa.gizra.site] acme: error: 400 :: urn:ietf:params:acme:error:connection :: 3.249.27.81: Fetching http://tp>
Feb 02 13:02:59 ip-172-31-23-205.eu-west-1.compute.internal acme-tpp-qa.gizra.site-start[63551]: + echo Failed to fetch certificates. This may mean your DNS records are set up incorrectly. Selfsigned certs are>

Almost certainly it's the failing Nginx service that's blocking the ACME challenge.

@mpscholten (Collaborator):

OK, let's try to disable SSL for the moment:

services.nginx.virtualHosts."tpp-qa.gizra.site".enableACME = false;
services.nginx.virtualHosts."tpp-qa.gizra.site".forceSSL = false;

After that, Nginx should be running.

@AronNovak
Copy link
Collaborator

       error: The option `services.nginx.virtualHosts."tpp-qa.gizra.site".enableACME' has conflicting definition values:
       - In `/nix/store/3qq9i5znbx951wqpn7rs0jjw5zq3mxlj-source/flake.nix': false
       - In `/nix/store/zvmll5hprfkd73j8lhkqc1xm1j9gr5k9-source/NixSupport/nixosModules/appWithPostgres.nix': true
       Use `lib.mkForce value` or `lib.mkDefault value` to change the priority on any of these definitions.
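
As the error message suggests, the flake-level override needs lib.mkForce to take priority over the definition coming from appWithPostgres.nix. A minimal sketch, assuming lib is in scope of the NixOS module defined in flake.nix:

# Sketch only: lib.mkForce raises the priority of these definitions so they
# override the defaults set by IHP's appWithPostgres.nix module.
services.nginx.virtualHosts."tpp-qa.gizra.site" = {
  enableACME = lib.mkForce false;
  forceSSL = lib.mkForce false;
};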


@AronNovak (Collaborator):

Success.
One step further:

× postgresql.service - PostgreSQL Server
     Loaded: loaded (/etc/systemd/system/postgresql.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Fri 2024-02-02 13:15:07 UTC; 133ms ago
    Process: 65642 ExecStartPre=/nix/store/hc4rx3gzd0rg1a6rd78ymc9jhk2xax5g-unit-script-postgresql-pre-start/bin/postgresql-pre-start (code=exited, status=0/SUCCESS)
    Process: 65656 ExecStart=/nix/store/ki3srrjjzqalvh0hd9lmqavp5v9wr9jp-postgresql-14.9/bin/postgres (code=exited, status=0/SUCCESS)
    Process: 65674 ExecStartPost=/nix/store/xb8274v02ahxsr8w44z0f1rx0g7a998g-unit-script-postgresql-post-start/bin/postgresql-post-start (code=exited, status=2)
   Main PID: 65656 (code=exited, status=0/SUCCESS)
         IP: 5.2K in, 5.2K out
        CPU: 67ms

Feb 02 13:15:07 ip-172-31-23-205.eu-west-1.compute.internal postgresql-post-start[65690]: psql:/nix/store/ybqflcpnr1l4j2qq8z3slhbfbzhc3iwj-ihp-initScript:1: error: \connect: connection to server on socket "/run/postgresql/.s.PGSQL.5432" failed: FATAL:  database "app" does not exist

I need to create the SQL database at this point.

@mpscholten (Collaborator):

Hm, this should happen automatically. Are you on the latest IHP master? I fixed something related to the database with digitallyinduced/ihp@ec29222 a month ago.

@AronNovak (Collaborator):

@mpscholten I have the same error after the update:

 aaron   deploy  ~  gizra  ihp-landing-page  nix flake update
 aaron   deploy  ~  gizra  ihp-landing-page  git status
On branch deploy
nothing to commit, working tree clean


@AronNovak (Collaborator):

@mpscholten Had the same error afterwards.
I had the idea to edit the init script:

[root@ip-172-31-23-205:~]# vi /nix/store/vr1p8sw5z1765c99djjlyd2za7qkw746-ihp-initScript 

[root@ip-172-31-23-205:~]# ls -l -h /nix/store/vr1p8sw5z1765c99djjlyd2za7qkw746-ihp-initScript 
-r--r--r-- 2 root root 287 Jan  1  1970 /nix/store/vr1p8sw5z1765c99djjlyd2za7qkw746-ihp-initScript

[root@ip-172-31-23-205:~]# chmod +w /nix/store/vr1p8sw5z1765c99djjlyd2za7qkw746-ihp-initScript
chmod: changing permissions of '/nix/store/vr1p8sw5z1765c99djjlyd2za7qkw746-ihp-initScript': Read-only file system

But no luck; my idea was to create the missing database on the first line of this file.

@AronNovak (Collaborator):

diff --git a/flake.nix b/flake.nix
index 31e40e9..01413df 100644
--- a/flake.nix
+++ b/flake.nix
@@ -67,6 +67,10 @@
                                 JWT_PUBLIC_KEY_PATH = "/root/jwtRS256.key.pub";
                             };
                         };
+                        services.postgresql = {
+                          enable = true;
+                          ensureDatabases = [ "app" ];
+                        };

This does not help either.
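
For reference, NixOS can also create the role declaratively via services.postgresql.ensureUsers; a minimal sketch, assuming the application connects as a user named app (whether this helps depends on the ordering discussed below):

services.postgresql = {
  enable = true;
  ensureDatabases = [ "app" ];
  # Assumption: the application role is called "app". ensureUsers only creates
  # the role; it does not by itself make it the owner of the database.
  ensureUsers = [ { name = "app"; } ];
};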

@AronNovak (Collaborator):

NixOS/nixpkgs#109273 (comment) - it seems to me that ensureDatabases is processed after the init script, but since this mechanism works elsewhere, the cause must be something else.

@AronNovak (Collaborator):

@mpscholten How can I get rid of the in-server PostgreSQL and use a managed database on AWS?
I guess there is a way in flake.nix to specify the database connection details and disable the local PostgreSQL service.
It would unblock this, and in a production environment I would not mix the web and database roles anyway.
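
A hedged sketch of what that could look like in flake.nix, assuming the IHP app and worker read their connection string from a DATABASE_URL environment variable; the endpoint and credentials below are placeholders, and the local-service override would likely need lib.mkForce as above:

# Sketch only: turn off the PostgreSQL instance enabled by appWithPostgres.nix
# and point both services at a managed database (e.g. RDS).
services.postgresql.enable = lib.mkForce false;
systemd.services.app.environment.DATABASE_URL =
  "postgres://app:CHANGEME@example.eu-west-1.rds.amazonaws.com:5432/app";
systemd.services.worker.environment.DATABASE_URL =
  "postgres://app:CHANGEME@example.eu-west-1.rds.amazonaws.com:5432/app";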

@mpscholten (Collaborator):

I think we just need to extend the init script to also create the user + db. I just pushed a change for this.

Can you switch to the new branch (set ihp.url = "github:digitallyinduced/ihp/deploy-to-nixos-fixes"; in flake.nix, then run nix flake update), then delete the postgres service again and do another deploy?

@AronNovak (Collaborator):

building the system configuration...
error:
       … while calling the 'head' builtin

         at /nix/store/3qq9i5znbx951wqpn7rs0jjw5zq3mxlj-source/lib/attrsets.nix:820:11:

          819|         || pred here (elemAt values 1) (head values) then
          820|           head values
             |           ^
          821|         else

       … while evaluating the attribute 'value'

         at /nix/store/3qq9i5znbx951wqpn7rs0jjw5zq3mxlj-source/lib/modules.nix:807:9:

          806|     in warnDeprecation opt //
          807|       { value = builtins.addErrorContext "while evaluating the option `${showOption loc}':" value;
             |         ^
          808|         inherit (res.defsFinal') highestPrio;

       (stack trace truncated; use '--show-trace' to show the full trace)

       error: attribute 'databaseUser' missing

       at /nix/store/42xmfa93nc7dq0qphaxlbzhnwkhvy41x-source/NixSupport/nixosModules/appWithPostgres.nix:69:72:

           68|             CREATE USER ${cfg.databaseUser};
           69|             GRANT ALL PRIVILEGES ON DATABASE ${cfg.databaseName} TO "${pkgs.databaseUser}";
             |                                                                        ^
           70|             CREATE DATABASE ${cfg.databaseName} OWNER ${cfg.databaseUser};
Job for migrate.service failed because the control process exited with error code.
See "systemctl status migrate.service" and "journalctl -xeu migrate.service" for details.

@AronNovak (Collaborator):

^^ I am going to address this in my fork.


@AronNovak (Collaborator):

@mpscholten

Feb 08 08:28:01 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Starting migrate.service...
░░ Subject: A start job for unit migrate.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit migrate.service has begun execution.
░░ 
░░ The job identifier is 63717.
Feb 08 08:28:01 ip-172-31-23-205.eu-west-1.compute.internal migrate-start[85021]: migrate: EnhancedSqlError {sqlErrorQuery = "SELECT revision FROM schema_migrations ORDER BY revision", sqlErrorQueryParams = []>
Feb 08 08:28:01 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: migrate.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ An ExecStart= process belonging to unit migrate.service has exited.
░░ 
░░ The process' exit code is 'exited' and its exit status is 1.
Feb 08 08:28:01 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: migrate.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ The unit migrate.service has entered the 'failed' state with result 'exit-code'.
Feb 08 08:28:01 ip-172-31-23-205.eu-west-1.compute.internal systemd[1]: Failed to start migrate.service.
░░ Subject: A start job for unit migrate.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit migrate.service has finished with a failure.
░░ 
░░ The job identifier is 63717 and the job result is failed.

[root@ip-172-31-23-205:~]# ps aux | grep sql
postgres   84941  0.0  0.2  82260 20764 ?        Ss   08:27   0:00 /nix/store/ki3srrjjzqalvh0hd9lmqavp5v9wr9jp-postgresql-14.9/bin/postgres
root       85073  0.0  0.0   6616  2684 pts/0    S+   08:31   0:00 grep sql

[root@ip-172-31-23-205:~]# 

This is definitely much better now: Nginx is running, http://tpp-qa.gizra.site/ returns an HTTP 502, and PostgreSQL is running.

[root@ip-172-31-23-205:~]# psql
psql (14.9)
Type "help" for help.

app=> \lk
invalid command \lk
Try \? for help.
app=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 app       | root     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

app=> \c app
You are now connected to database "app" as user "root".
app=> \dt
               List of relations
 Schema |       Name        | Type  |  Owner   
--------+-------------------+-------+----------
 public | landing_pages     | table | postgres
 public | paragraph_ctas    | table | postgres
 public | paragraph_quotes  | table | postgres
 public | schema_migrations | table | postgres
 public | users             | table | postgres
(5 rows)

app=> 

@amitaibu (Owner, author) commented Feb 8, 2024:

@mpscholten you should have your key in this AWS env - are you able to log in to it? (So we're not blocking you 😄)

@mpscholten (Collaborator):

Thanks, just logged into the EC2 instance. journalctl -u app shows that the app fails to start because of missing RSA keys. Did you set the JWT_PRIVATE_KEY_PATH and JWT_PUBLIC_KEY_PATH variables?

@mpscholten (Collaborator):

Just saw that these env vars are set and the keys exist in the /root directory.

@mpscholten (Collaborator):

The public key is wrongly encoded. Got it working by adjusting the preStart script to this:


systemd.services.app.preStart = ''
    # Generate the RSA key pair used for JWT signing if it is not there yet.
    if [ ! -f /root/jwtRS256.key ]; then
        ${pkgs.openssl}/bin/openssl genpkey -algorithm RSA -out /root/jwtRS256.key -pkeyopt rsa_keygen_bits:4096;
    fi
    if [ ! -f /root/jwtRS256.key.pub ]; then
        ${pkgs.openssl}/bin/openssl rsa -pubout -in /root/jwtRS256.key -out /root/jwtRS256.key.pub;
    fi
'';

amitaibu closed this Jul 15, 2024
amitaibu deleted the deploy branch July 15, 2024 14:38