
Operation dispatcher err handling #125

Merged: 8 commits from operation-dispatcher-err-handling into main, Nov 6, 2024

Conversation

@didierofrivia (Member) commented Oct 29, 2024:

Fixes #123

Also fixes the following e2e tests:

This PR introduces a refactor that, besides fixing issue #123, makes the code more consistent regarding the flow of the Operation state controlled by the OperationHandler. It aims to handle all raised errors and report them back to the Filter (or any other caller).
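
For orientation, a minimal sketch of the error type the dispatcher reports back. The failure_mode and status fields appear in the diffs below; the FailureMode enum shape and derive/visibility details are assumptions:

use proxy_wasm::types::Status;

// Mirrors the `failureMode` field of the service config.
pub enum FailureMode {
    Deny,
    Allow,
}

// The error the OperationDispatcher surfaces to the Filter (or any caller);
// `failure_mode` tells the caller whether to deny (500) or let the request through.
pub struct OperationError {
    pub status: Status,            // e.g. Status::NotFound for a missing cluster
    pub failure_mode: FailureMode,
}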

@didierofrivia didierofrivia changed the base branch from main to e2e-missing-cluster October 29, 2024 11:17
@eguzki (Contributor) left a comment:

A few comments dropped below.

@@ -114,15 +127,15 @@ impl Context for Filter {

Contributor:

I think we should check status_code. When the gRPC endpoint is unreachable, this is flagged in the status_code.

Contributor:

Does it make sense to check the status_code from the on_grpc_call_response input param?

Contributor:

If not, the error will come from trying to parse the gRPC response, which may be empty because no GrpcMessageResponse is being serialized.

@didierofrivia (Member Author) commented Oct 31, 2024:

It might be useful; however, I'm a bit confused about what the valid values of status_code are. Since there are no docs and its signature is just u32, the examples are confusing: would it be a GrpcStatusCode or a Status? It's also unclear in this particular example how they treat it. What do you reckon? What meaningful check can we do to avoid parsing and return early?

Contributor:

From the tests I have run, on a successful gRPC request/response exchange, status_code is 0, whereas when the service is unreachable, status_code is 14.

That's all I know for now.

Let's merge this and you can enhance it in a follow-up PR.
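
For reference, a partial sketch of the early status_code check discussed in this thread, assuming the proxy-wasm Rust SDK's on_grpc_call_response hook on the Filter context; the logging and early return are illustrative, not what was merged:

impl Context for Filter {
    fn on_grpc_call_response(&mut self, token_id: u32, status_code: u32, resp_size: usize) {
        // status_code carries the gRPC status: 0 is OK, 14 is UNAVAILABLE
        // (service unreachable), matching the values observed above.
        if status_code != 0 {
            log::warn!("gRPC call {} failed with status {}", token_id, status_code);
            // No GrpcMessageResponse was serialized, so skip parsing the
            // (likely empty) response body and return early.
            return;
        }
        // ... parse the `resp_size` bytes of response body only on success ...
    }
}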

Resolved review threads:
src/operation_dispatcher.rs (resolved)
src/filter/http_context.rs (outdated, resolved)
src/filter/http_context.rs (outdated, resolved)
src/operation_dispatcher.rs (outdated, resolved)
@eguzki (Contributor) commented Oct 29, 2024:

This is still not working for me. I ran the following configuration:

{
  "services": {
    "limitadorA": {
      "type": "ratelimit",
      "endpoint": "limitador",
      "failureMode": "deny"
    },
    "limitadorB": {
      "type": "ratelimit",
      "endpoint": "unknown",
      "failureMode": "deny"
    }
  },
  "actionSets": [
    {
      "name": "basic",
      "routeRuleConditions": {
        "hostnames": [
          "*.example.com"
        ]
      },
      "actions": [
        {
          "service": "limitadorA",
          "scope": "basic",
          "data": [
            {
              "expression": {
                "key": "a",
                "value": "1"
              }
            }
          ]
        },
        {
          "service": "limitadorB",
          "scope": "basic",
          "data": [
            {
              "expression": {
                "key": "a",
                "value": "1"
              }
            }
          ]
        }
      ]
    }
  ]
}

There are two actions, both with failureMode set to deny, and the second one targets a missing cluster. On running a request, the request makes its way to the upstream and returns 200, regardless of the failureMode.

@didierofrivia didierofrivia force-pushed the operation-dispatcher-err-handling branch from f07915d to 541b3f3 Compare October 29, 2024 13:49
@didierofrivia didierofrivia changed the base branch from e2e-missing-cluster to main October 29, 2024 13:50
@didierofrivia (Member Author) commented Oct 29, 2024:

@eguzki the PR is still a draft, so many things won't work as you expect :) I'll definitely ping you and add you as a reviewer once I think it's done ;)

@didierofrivia didierofrivia force-pushed the operation-dispatcher-err-handling branch from e991d6a to 88fb257 Compare October 29, 2024 15:34
@didierofrivia didierofrivia marked this pull request as ready for review October 29, 2024 15:44
@didierofrivia didierofrivia requested a review from eguzki October 29, 2024 15:44
@didierofrivia (Member Author):

Now @eguzki! This is your time to shine!!! (and thanks in advance for the reviews 🙏🏼)

@didierofrivia (Member Author):

(I've included the commits from your test, just for easy testing... I could remove them from this PR if you want)

Comment on lines 70 to 81:

 },
 Ok(None) => {
     Action::Continue // No operations left to perform
 }
-} else {
-    Action::Continue
+Err(OperationError {
+    failure_mode: FailureMode::Deny,
+    ..
+}) => {
+    self.send_http_response(500, vec![], Some(b"Internal Server Error.\n"));
+    Action::Continue
+}
 _ => Action::Continue,
@didierofrivia (Member Author):

We could omit:

Ok(None) => {
    Action::Continue // No operations left to perform
}

I left it there just so we're explicit, in case we want to perform something else later.

Member:

So, for all other Err, we ... Action::Continue?

@alexsnaps (Member) commented Nov 4, 2024:

I guess I'm saying that I'd leave the Ok(None) => Action::Continue as an explicit branch, but would have three Err branches (roughly as sketched below):

  • FailureMode::Deny, as you do (log the actual error there? as this is where it all ends, right?)
  • FailureMode::Allow, explicit, with the actual error logged again
  • Err(_), which possibly panic!s or is marked unreachable!, as that is, I think, the assumption this builds upon
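
A rough sketch of those three branches; the OperationError pattern comes from the diff above, while the dispatcher call, the receiver names, and the Ok(Some(_)) arm are assumptions for illustration:

match self.operation_dispatcher.borrow_mut().next() {
    Ok(Some(_)) => Action::Pause,  // an operation was dispatched; wait for its response
    Ok(None) => Action::Continue,  // no operations left to perform
    Err(OperationError { failure_mode: FailureMode::Deny, status }) => {
        log::error!("operation failed, denying: {:?}", status);
        self.send_http_response(500, vec![], Some(b"Internal Server Error.\n"));
        Action::Continue
    }
    Err(OperationError { failure_mode: FailureMode::Allow, status }) => {
        log::error!("operation failed, allowing: {:?}", status);
        Action::Continue
    }
    // With only two FailureMode variants this arm can never match,
    // which is exactly what the compiler flags further down the thread.
    Err(_) => unreachable!(),
}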

@didierofrivia (Member Author) commented Nov 4, 2024:

I made the handling of both failure modes explicit, but the third branch makes clippy unhappy:

warning: unreachable pattern
  --> src/filter/http_context.rs:90:13
   |
90 |             Err(_) => unreachable!() 
   |             ^^^^^^
   |
   = note: `#[warn(unreachable_patterns)]` on by default

I could add an exception to our clippy default rules, though, if needed.

I'm also logging the status. Would that be enough?
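
If the explicit Err(_) arm were kept, one option would be to scope the lint exception to the single handler instead of relaxing the project-wide rules. A sketch, assuming the match lives in the Filter's request-headers hook (unreachable_patterns is a rustc lint, so a local allow silences it for clippy too):

#[allow(unreachable_patterns)]
fn on_http_request_headers(&mut self, _num_headers: usize, _end_of_stream: bool) -> Action {
    // ... the match with the explicit Err(_) => unreachable!() arm goes here ...
    Action::Continue
}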

@didierofrivia (Member Author):

Let me know and I can include it in a follow-up PR ;)


impl fmt::Display for OperationError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self.status {
@didierofrivia (Member Author):

Depending on the status, we could add specific messages.
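
For instance, a sketch of status-specific messages; Status::NotFound and Status::Empty are proxy_wasm::types::Status variants, and the wording here is illustrative:

use core::fmt;
use proxy_wasm::types::Status;

impl fmt::Display for OperationError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self.status {
            Status::NotFound => write!(f, "Operation failed: endpoint/cluster not found"),
            Status::Empty => write!(f, "Operation failed: empty gRPC response"),
            _ => write!(f, "Operation failed with status {:?}", self.status),
        }
    }
}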

@didierofrivia (Member Author):

By the way @eguzki, this PR also fixes your #122 ;)

@didierofrivia didierofrivia force-pushed the operation-dispatcher-err-handling branch 3 times, most recently from 5e13324 to b9a52d6 Compare October 30, 2024 16:37
Commit message excerpt:

* Mutating the state if found
* Returning Err with status NotFound and default failureMode

Signed-off-by: dd di cesare <[email protected]>
@didierofrivia didierofrivia force-pushed the operation-dispatcher-err-handling branch from b9a52d6 to 06f654f Compare October 31, 2024 07:42
@didierofrivia didierofrivia requested a review from eguzki October 31, 2024 07:43
@eguzki (Contributor) left a comment.

@alexsnaps (Member) left a comment:

Thinking the error handling from performing an operation could be pushed into the Operations themselves, making the orchestration of the pipeline/operation chain a little easier on the reader... not asking to change this, though.

@didierofrivia didierofrivia merged commit 983261d into main Nov 6, 2024
9 checks passed
@didierofrivia didierofrivia deleted the operation-dispatcher-err-handling branch November 6, 2024 08:49
Merging this pull request closed the following issue: missing cluster error handling.