Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adapter: add guarantees for AdapterClient::end_transaction #28549

Merged
merged 1 commit into from
Jul 26, 2024

Conversation

maddyblue
Copy link
Contributor

Both the HTTP and pgwire protocols automatically commit implicit transactions by calling AdapterClient::end_transaction which sends a Command::Commit to the Coordinator. That command produced a Plan::CommitTransaction (equivalent to the user typing COMMIT). Since this was a normal plan, the same as if it came from a SQL statement, the RBAC checks run. One of the checks verifies that the user's role still exists (regardless of the statement being executed). It is thus possible for the Plan::CommitTransaction to fail and not sequence its plan (in this case sequence_end_transaction). However, both the AdapterClient (and HTTP and pgwire protocols) had assumed that Command::Commit would always clean up the session's transaction state. But since the cleanup action happens in sequence_end_transaction which isn't guaranteed to execute (if, say, the user's role was dropped), that assumption is not guaranteed.

Change Command::Commit to guarantee that the Coordinator cleans up the txn state, and change AdapterClient::end_transaction to guarantee it cleans up the session's txn state. This leaves all usages of end_transaction as is by changing the implementation to do what the callers assumed it had been doing.

No explicit test was added because the parallel DDL tests have reliably been finding this bug, so adding the soft panic should produce a panic if this bug is still present when running that nightly.

Alternatives considered:

  • Add a Command::ClearTransaction command that AdapterClient::end_transaction calls which would cleanup the state. This seemed unneeded and another message sent to the Coordinator that could be avoided by adding the desired behavior to Commit and having the AdapterClient take care of the session's state.
  • Have callers of AdapterClient::end_transaction terminate the connection. This sounded nice but ended up being tricky because the higher level callers often support generic errors that get serialized and sent off to the user, leaving the connection intact. We'd have to plumb some new logic through that would indicate a termination was needed which didn't seem worth it.
  • Teach Command::Commit to produce a Plan that would skip the rbac checks. Skipping RBAC checks seemed dangerous.

Fixes MaterializeInc/database-issues#8316

Motivation

  • This PR fixes a recognized bug.

Checklist

Both the HTTP and pgwire protocols automatically commit implicit transactions by
calling AdapterClient::end_transaction which sends a Command::Commit to the
Coordinator. That command produced a Plan::CommitTransaction (equivalent to the
user typing `COMMIT`). Since this was a normal plan, the same as if it came from
a SQL statement, the RBAC checks run. One of the checks verifies that the user's
role still exists (regardless of the statement being executed). It is thus
possible for the Plan::CommitTransaction to fail and not sequence its plan (in
this case `sequence_end_transaction`). However, both the AdapterClient (and HTTP
and pgwire protocols) had assumed that Command::Commit would always clean up the
session's transaction state. But since the cleanup action happens in
`sequence_end_transaction` which isn't guaranteed to execute (if, say, the
user's role was dropped), that assumption is not guaranteed.

Change Command::Commit to guarantee that the Coordinator cleans up the txn
state, and change AdapterClient::end_transaction to guarantee it cleans up the
session's txn state. This leaves all usages of end_transaction as is by changing
the implementation to do what the callers assumed it had been doing.

No explicit test was added because the parallel DDL tests have reliably been
finding this bug, so adding the soft panic should produce a panic if this bug is
still present when running that nightly.

Alternatives considered:

- Add a Command::ClearTransaction command that AdapterClient::end_transaction
  calls which would cleanup the state. This seemed unneeded and another message
  sent to the Coordinator that could be avoided by adding the desired behavior
  to Commit and having the AdapterClient take care of the session's state.
- Have callers of AdapterClient::end_transaction terminate the connection. This
  sounded nice but ended up being tricky because the higher level callers often
  support generic errors that get serialized and sent off to the user, leaving
  the connection intact. We'd have to plumb some new logic through that would
  indicate a termination was needed which didn't seem worth it.
- Teach Command::Commit to produce a Plan that would skip the rbac checks.
  Skipping RBAC checks seemed dangerous.

Fixes #28400
@maddyblue maddyblue requested a review from a team as a code owner July 26, 2024 05:04
@maddyblue maddyblue requested a review from jkosh44 July 26, 2024 05:04
Copy link
Contributor

@jkosh44 jkosh44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maddyblue maddyblue merged commit 7ca67d6 into MaterializeInc:main Jul 26, 2024
77 checks passed
@maddyblue maddyblue deleted the end-txn-guarantees branch July 26, 2024 15:39
@benesch
Copy link
Member

benesch commented Jul 26, 2024

Nice find, @maddyblue! Thanks very much for tracking this one down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants