Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: /ready pending state as 503 and 500 as fail #3490

Merged
merged 1 commit into from
May 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,8 @@ This project adheres to [Semantic Versioning](http://semver.org/).
- #3404, Clarify the `PGRST121` (could not parse RAISE 'PGRST') error message - @laurenceisla
- #3267, Fix wrong `503 Service Unavailable` on pg error `53400` - @taimoorzaeem
- #2985, Fix not adding `application_name` on all connection strings - @steve-chavez
- #3424, Admin `/live` and `/ready` now differentiates a failure as 500 status - @steve-chavez
+ 503 status is still given when postgREST is in a recovering state

### Deprecated

Expand Down
8 changes: 5 additions & 3 deletions docs/references/admin_server.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
Admin Server
############

PostgREST provides an admin server that can be enabled by setting :ref:`admin-server-port` to the port number of your preference.
PostgREST provides an admin server that can be enabled by setting :ref:`admin-server-port`.

.. _health_check:

Expand All @@ -23,7 +23,7 @@ Two endpoints ``live`` and ``ready`` will then be available.
Live
----

The ``live`` endpoint verifies if PostgREST is running on its configured port. A request will return ``200 OK`` if PostgREST is alive or ``503`` otherwise.
The ``live`` endpoint verifies if PostgREST is running on its configured port. A request will return ``200 OK`` if PostgREST is alive or ``500`` otherwise.

For instance, to verify if PostgREST is running while the ``admin-server-port`` is set to ``3001``:

Expand All @@ -38,7 +38,7 @@ For instance, to verify if PostgREST is running while the ``admin-server-port``
Ready
-----

In addition, the ``ready`` endpoint checks the state of the :ref:`connection_pool` and the :ref:`schema_cache`. A request will return ``200 OK`` if both are good or ``503`` if not.
Additionally to the ``live`` check, the ``ready`` endpoint checks the state of the :ref:`connection_pool` and the :ref:`schema_cache`. A request will return ``200 OK`` if both are good or ``503`` if not.

.. code-block:: bash

Expand All @@ -48,6 +48,8 @@ In addition, the ``ready`` endpoint checks the state of the :ref:`connection_poo

HTTP/1.1 200 OK

PostgREST will try to recover from the ``503`` state with :ref:`automatic_recovery`.

Metrics
=======

Expand Down
2 changes: 1 addition & 1 deletion docs/references/configuration.rst
Original file line number Diff line number Diff line change
Expand Up @@ -161,7 +161,7 @@ admin-server-port
**In-Database** `n/a`
=============== =======================

Specifies the port for the :ref:`health_check` endpoints.
Specifies the port for the :ref:`admin_server`.

.. _app.settings.*:

Expand Down
13 changes: 10 additions & 3 deletions src/PostgREST/Admin.hs
Original file line number Diff line number Diff line change
Expand Up @@ -42,12 +42,19 @@
admin appState req respond = do
isMainAppReachable <- isRight <$> reachMainApp (AppState.getSocketREST appState)
isLoaded <- AppState.isLoaded appState
isPending <- AppState.isPending appState

case Wai.pathInfo req of
["ready"] ->
respond $ Wai.responseLBS (if isMainAppReachable && isLoaded then HTTP.status200 else HTTP.status503) [] mempty
["live"] ->
respond $ Wai.responseLBS (if isMainAppReachable then HTTP.status200 else HTTP.status503) [] mempty
respond $ Wai.responseLBS (if isMainAppReachable then HTTP.status200 else HTTP.status500) [] mempty
["ready"] ->
let
status | not isMainAppReachable = HTTP.status500
| isPending = HTTP.status503
| isLoaded = HTTP.status200
| otherwise = HTTP.status500

Check warning on line 55 in src/PostgREST/Admin.hs

View check run for this annotation

Codecov / codecov/patch

src/PostgREST/Admin.hs#L55

Added line #L55 was not covered by tests
in
respond $ Wai.responseLBS status [] mempty
["config"] -> do
config <- AppState.getConfig appState
respond $ Wai.responseLBS HTTP.status200 [] (LBS.fromStrict $ encodeUtf8 $ Config.toText config)
Expand Down
12 changes: 11 additions & 1 deletion src/PostgREST/AppState.hs
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ module PostgREST.AppState
, runListener
, getObserver
, isLoaded
, isPending
) where

import qualified Data.Aeson as JSON
Expand Down Expand Up @@ -315,6 +316,12 @@ isLoaded x = do
connEstablished <- isConnEstablished x
return $ scacheStatus == SCLoaded && connEstablished

isPending :: AppState -> IO Bool
isPending x = do
scacheStatus <- readIORef $ stateSCacheStatus x
connStatus <- readIORef $ stateConnStatus x
return $ scacheStatus == SCPending || connStatus == ConnPending

putSCacheStatus :: AppState -> SchemaCacheStatus -> IO ()
putSCacheStatus = atomicWriteIORef . stateSCacheStatus

Expand All @@ -338,12 +345,15 @@ loadSchemaCache appState@AppState{stateObserver=observer} = do
observer $ SchemaCacheFatalErrorObs e hint
return SCFatalFail
Nothing -> do
putSCacheStatus appState SCPending
putSchemaCache appState Nothing
observer $ SchemaCacheNormalErrorObs e
putSCacheStatus appState SCPending
return SCPending

Right sCache -> do
-- IMPORTANT: While the pending schema cache state starts from running the above querySchemaCache, only at this stage we block API requests due to the usage of an
-- IORef on putSchemaCache. This is why SCacheStatus is put at SCPending here to signal the Admin server (using isPending) that we're on a recovery state.
putSCacheStatus appState SCPending
putSchemaCache appState $ Just sCache
observer $ SchemaCacheQueriedObs resultTime
(t, _) <- timeItT $ observer $ SchemaCacheSummaryObs $ showSummary sCache
Expand Down
15 changes: 11 additions & 4 deletions test/io/postgrest.py
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,13 @@ def read_stdout(self, nlines=1):
time.sleep(0.1)
return output

def wait_until_scache_starts_loading(self, max_seconds=1):
"Wait for the admin /ready return a status of 503"

wait_until_status_code(
self.admin.baseurl + "/ready", max_seconds=max_seconds, status_code=503
)


@contextlib.contextmanager
def run(
Expand Down Expand Up @@ -120,7 +127,7 @@ def run(
process.stdin.close()

if wait_for_readiness:
wait_until_ready(adminurl + "/ready", wait_max_seconds)
wait_until_status_code(adminurl + "/ready", wait_max_seconds, 200)

if no_startup_stdout:
process.stdout.read()
Expand Down Expand Up @@ -178,14 +185,14 @@ def wait_until_exit(postgrest):
raise PostgrestTimedOut()


def wait_until_ready(url, max_seconds):
"Wait for the given HTTP endpoint to return a status of 200."
def wait_until_status_code(url, max_seconds, status_code):
"Wait for the given HTTP endpoint to return a status code"
session = requests_unixsocket.Session()

for _ in range(max_seconds * 10):
try:
response = session.get(url, timeout=1)
if response.status_code == 200:
if response.status_code == status_code:
return
except (requests.ConnectionError, requests.ReadTimeout):
pass
Expand Down
9 changes: 5 additions & 4 deletions test/io/test_big_schema.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from postgrest import *


def test_requests__wait_for_schema_cache_reload(defaultenv):
def test_requests_wait_for_schema_cache_reload(defaultenv):
"requests that use the schema cache (e.g. resource embedding) wait for the schema cache to reload"

env = {
Expand All @@ -23,13 +23,14 @@ def test_requests__wait_for_schema_cache_reload(defaultenv):
response = postgrest.session.get("/rpc/notify_pgrst")
assert response.status_code == 204

time.sleep(1.5) # wait for schema cache to start reloading
postgrest.wait_until_scache_starts_loading()

response = postgrest.session.get("/tpopmassn?select=*,tpop(*)")
assert response.status_code == 200

server_timings = parse_server_timings_header(response.headers["Server-Timing"])
plan_dur = server_timings["plan"]
plan_dur = parse_server_timings_header(response.headers["Server-Timing"])[
"plan"
]
assert plan_dur > 10000.0


Expand Down
4 changes: 2 additions & 2 deletions test/io/test_io.py
Original file line number Diff line number Diff line change
Expand Up @@ -739,7 +739,7 @@ def test_admin_ready_dependent_on_main_app(defaultenv):
# delete the unix socket to make the main app fail
os.remove(defaultenv["PGRST_SERVER_UNIX_SOCKET"])
response = postgrest.admin.get("/ready")
assert response.status_code == 503
assert response.status_code == 500


def test_admin_live_good(defaultenv):
Expand All @@ -757,7 +757,7 @@ def test_admin_live_dependent_on_main_app(defaultenv):
# delete the unix socket to make the main app fail
os.remove(defaultenv["PGRST_SERVER_UNIX_SOCKET"])
response = postgrest.admin.get("/live")
assert response.status_code == 503
assert response.status_code == 500


@pytest.mark.parametrize("specialhostvalue", FIXTURES["specialhostvalues"])
Expand Down
Loading