OCSP refresh - seems to break connections afterwards #388

Open
ehdis opened this issue Jul 11, 2024 · 1 comment

ehdis commented Jul 11, 2024

OCSP refresh - seems to break connections afterwards

TLDR

  1. After a restart of the hitch service via systemctl or a reboot, the cert and the OCSP info are loaded
  2. client connections work
  3. hitch refreshes the OCSP staple
  4. client connections DO NOT work anymore

Restart/Start hitch service

After a restart of the hitch service via systemctl or a reboot, the cert and the OCSP info are loaded:

Jun 30 18:53:45 my.host.name systemd[1]: Starting Network proxy that terminates TLS/SSL connections...
Jun 30 18:53:48 my.host.name hitch[25373]: {core} hitch 1.8.0 starting
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.315572 [25373] {core} hitch 1.8.0 starting
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.315606 [25373] {core} Using OpenSSL version 101010bf.
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.318844 [25373] {core} Loading certificate pem files (2)
Jun 30 18:53:48 my.host.name hitch[25373]: {core} Using OpenSSL version 101010bf.
Jun 30 18:53:48 my.host.name hitch[25373]: {core} Loading certificate pem files (2)
Jun 30 18:53:48 my.host.name hitch[25373]: {core} Note: no DH parameters found in /etc/pki/tls/private/localhost.pem
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.322064 [25373] {core} Note: no DH parameters found in /etc/pki/tls/private/localhost.pem
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.322123 [25373] {core} ECDH Initialized
Jun 30 18:53:48 my.host.name hitch[25373]: {core} ECDH Initialized
Jun 30 18:53:48 my.host.name hitch[25373]: {core} Using DH parameters from /etc/pki/tls/private/vhosts-app.pem
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.326334 [25373] {core} Using DH parameters from /etc/pki/tls/private/vhosts-app.pem
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.326363 [25373] {core} DH initialized with 4096 bit key
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.326373 [25373] {core} ECDH Initialized
Jun 30 18:53:48 my.host.name hitch[25373]: {core} DH initialized with 4096 bit key
Jun 30 18:53:48 my.host.name hitch[25373]: {core} ECDH Initialized
Jun 30 18:53:48 my.host.name hitch[25373]: {core} Loaded cached OCSP staple for cert '/etc/pki/tls/private/vhosts-app.pem'
Jun 30 18:53:48 my.host.name hitch[25373]: 20240630T185348.327642 [25373] {core} Loaded cached OCSP staple for cert '/etc/pki/tls/private/vhosts-app.pem'
Jun 30 18:53:48 my.host.name systemd[1]: hitch.service: Can't open PID file /run/hitch/hitch.pid (yet?) after start: No such file or directory
Jun 30 18:53:48 my.host.name hitch[25374]: {core} hitch 1.8.0 initialization complete
Jun 30 18:53:48 my.host.name hitch[25375]: {core} Listening on 0.0.0.0:443
Jun 30 18:53:48 my.host.name hitch[25375]: {core} Process 0 online
Jun 30 18:53:48 my.host.name hitch[25375]: {core} Successfully attached to CPU #0
Jun 30 18:53:48 my.host.name hitch[25376]: {ocsp} Refresh of OCSP staple for /etc/pki/tls/private/vhosts-app.pem scheduled in 310633 seconds
Jun 30 18:53:48 my.host.name hitch[25376]: {ocsp} Note: No OCSP responder URI found for cert /etc/pki/tls/private/localhost.pem
Jun 30 18:53:48 my.host.name systemd[1]: Started Network proxy that terminates TLS/SSL connections.

(Ignore the localhost.pem lines; it's just a dummy cert.)

The daemon then drops privileges to the hitch user (mentioned because the cert file is readable by root only: -rw------- root root vhosts-app.pem):

# ss -nlpt |grep 443
LISTEN 0      400          0.0.0.0:443        0.0.0.0:*    users:(("hitch",pid=25375,fd=6))             

Now, after the service restart, the "application" is reachable and the clients do not complain about missing OCSP info.
As the log above shows, the OCSP info is refreshed after a random time window (scheduled in 310633 seconds).

Jul 10 09:11:01 my.host.name  hitch[1401]: {ocsp} Retrieved new staple for cert /etc/pki/tls/private/vhosts-app.pem
Jul 10 09:11:01 my.host.name  hitch[1401]: {ocsp} Refresh of OCSP staple for /etc/pki/tls/private/vhosts-app.pem scheduled in 518401 seconds
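The "scheduled in N seconds" values can be turned into wall-clock times, which makes it easier to compare the refresh with the moment clients start failing. A minimal sketch, assuming GNU date (as shipped on RHEL 8) and treating the log timestamps as UTC purely for the arithmetic; refresh_due is a hypothetical helper name, not part of hitch:

```shell
# refresh_due START SECONDS
# Prints the wall-clock time at which a refresh scheduled SECONDS after START is due.
# Assumes GNU date; START is interpreted as UTC so that no DST shift enters the sum.
refresh_due() {
    start_epoch=$(date -u -d "$1" +%s)
    date -u -d "@$((start_epoch + $2))" '+%F %T'
}

# Values taken from the startup log above:
refresh_due '2024-06-30 18:53:48' 310633
# prints 2024-07-04 09:11:01
```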

I have noticed that the client connection problems correlate with the time of the OCSP refresh mentioned above.

Errors during downloading metadata for repository:
  - Curl error (91): SSL server certificate status verification FAILED for {{URI}}.xml [No OCSP response received]
Error: Failed to download metadata for repo 'app': 
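Whether hitch actually serves a staple can be cross-checked independently of the failing client by asking for the certificate status with openssl s_client -status, once right after a restart and once after the refresh. The following is only a sketch: staple_state is a hypothetical helper name, and the host name is the placeholder used in the logs above.

```shell
# staple_state
# Reads `openssl s_client -status` output on stdin and prints
# "stapled" if it contains a successful OCSP response, "no staple" otherwise.
staple_state() {
    if grep -q 'OCSP Response Status: successful'; then
        echo "stapled"
    else
        echo "no staple"
    fi
}

# Usage against the live service (placeholder host name):
#   openssl s_client -connect my.host.name:443 -servername my.host.name \
#       -status </dev/null 2>/dev/null | staple_state
```

If this reports "stapled" after a restart but "no staple" after the refresh or a SIGHUP, that would match the "No OCSP response received" error seen by the client.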

Is the refresh process reliable? Any clue?


  • Version used: hitch-1.8.0
  • Operating System and version: RHEL8
  • Source of binary packages used (if any): EL8 build

ehdis commented Aug 15, 2024

It seems that a SIGHUP does not result in the same state as stopping and starting the service,
although the log says that a reload was done (but this also happens while running, on a refresh).

/usr/bin/kill -HUP $(cat /run/hitch/hitch.pid)

 hitch[1434]: Received SIGHUP: Initiating configuration reload.
 hitch[1434]: {core} Config reloaded in 0.00 seconds. Starting new child processes.
 hitch[16887]: Worker 0 (gen: 2) in state EXITING is now exiting.
 hitch[1434]: {core} Child 16887 exited with status 0.
 hitch[1434]: {core} Child 16888 exited with status 0.
 hitch[16968]: {core} Listening on 0.0.0.0:443
 hitch[16968]: {core} Process 0 online
 hitch[16968]: {core} Successfully attached to CPU #0
 hitch[16969]: {ocsp} Retrieved new staple for cert /etc/pki/tls/private/vhosts-app.pem
 hitch[16969]: {ocsp} Refresh of OCSP staple for /etc/pki/tls/private/vhosts-app.pem scheduled in 584638 seconds

So, how different is the path 'start->retrieveOCSP' compared to 'refresh->retrieveOCSP' or 'HUP->retrieveOCSP'?

PS: I wonder if I'm the only one hit by this issue ...
