Allow dbt to cancel connections #718
Conversation
Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement, and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA. In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR. CLA has not been signed by users: @holly-evans
Signed the CLA just now.
I locked Output before:
Output after:
Logs before:
Logs after:
Hi, is there anything I can add to help this PR move forward? Tagging @mikealfare, since I see you on most of the PRs this year.
Thanks for your submission @holly-evans! This looks pretty good. I updated the branch and approved running integration tests. The one thing that jumps out is a missing changelog. Please refer to the steps here for adding a changelog entry. I'll review this sometime in the next week. Thanks again!
@mikealfare added changelog entry 👍
dbt/adapters/redshift/connections.py
def cancel(self, connection: Connection):
    try:
-       pid = self._get_backend_pid()
+       pid = connection.handle.backend_pid
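The suggested change above presumes the handle exposes a `backend_pid` attribute captured earlier. A minimal sketch of such a wrapper, with illustrative names only (this is not dbt-redshift's actual implementation, and a fake handle stands in for a live `redshift_connector` connection):

```python
class RedshiftConnectionWrapper:
    """Wrap a driver handle and record its backend pid at connect time,
    so cancel() can use it later without querying the busy connection."""

    def __init__(self, handle):
        self.handle = handle
        # Capture the pid while the connection is still idle; querying
        # later would block behind any in-flight model query.
        cursor = handle.cursor()
        cursor.execute("select pg_backend_pid()")
        self.backend_pid = cursor.fetchone()[0]

    def __getattr__(self, name):
        # Delegate everything else to the wrapped handle.
        return getattr(self.handle, name)


# Fake driver objects so the sketch runs without a Redshift cluster.
class FakeCursor:
    def execute(self, sql):
        self.sql = sql

    def fetchone(self):
        return [4242]


class FakeHandle:
    def cursor(self):
        return FakeCursor()

    def close(self):
        self.closed = True


wrapped = RedshiftConnectionWrapper(FakeHandle())
print(wrapped.backend_pid)  # 4242, captured once at wrap time
```

Because `__getattr__` delegates unknown attributes, callers can keep treating the wrapper like the raw handle (e.g. calling `wrapped.close()`), while `wrapped.backend_pid` stays available even mid-query.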
@holly-evans This is all looking very good, but we do have some tests failing. I was looking at this a bit, and this may be a good point: it doesn't look like we set up a handle object, so we may need to reference the new ConnectionWrapper more directly.
If I can be of any more help, please feel free to message me. I'll try to spend a little more time with this to give more advice as I can.
@McKnight-42 Thank you for that feedback! @holly-evans, we also wanted to apologize for the delay with this PR. We have pulled this into our current sprint, are watching for your engagement, and are aiming to get this reviewed and approved as soon as possible. Thank you for your patience. 🙏 @colin-rogers-dbt, if this is ready for final approval while I'm out next week, can you please do the honors of getting it across the line? Thanks!
Many of the failures are known flaky tests (already captured in another ticket). I'm rerunning CI to see if it changes what's passing. There were only maybe one or two failed tests that looked like they could be related to this.
I left several comments, mostly around trying a different approach. As I mention below, I would have rather used your wrapper approach, but I could not figure out how to get it to work with some existing scenarios. While the approach I recommend appears to keep the existing tests passing, I'm eager to find out if you see a difference in the execution, and subsequently the log outputs that you previously shared. Please let me know how this works out, and if you see any issues with these recommendations. Thanks again for your contribution!
Made the suggested changes and the unit tests are passing. Hoping the functional tests pass! It works in my project too. I locked dbt_hevans.accounting_feature_value before the dbt command to simulate my issue, then cancelled the run with CTRL + C.
Thanks for your help and effort @holly-evans!
Co-authored-by: Holly Evans <[email protected]>
resolves #705
docs dbt-labs/docs.getdbt.com/#
Problem
Interrupting dbt executions does not kill currently running queries. The dbt process does not stop until queries have completed. dbt issues a `pg_terminate_backend` command for the wrong pid.

Solution

This PR wraps `Connection.handle` to capture the backend pid upon initiation and uses that connection-specific pid to cancel connections. Storing the pid before the model query begins ensures that we can run `select pg_terminate_backend(pid)` upon cancel, even when a query is in progress.

I tried simply switching `_get_backend_pid` to use the connection passed in to `cancel`, but it could not query for the pid until the running query finished.

Context
Currently, `select pg_backend_pid()` is called at the time of cancel, on the master connection. This pid is connection-specific, so when `select pg_terminate_backend(pid)` is called with that pid, it's not canceling the query of the connection passed to `cancel`, but a non-existent query on master.
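With the pid captured at connect time, `cancel` can instead issue the terminate from the master connection using the stored, connection-specific pid. A hedged sketch of that flow with stand-in objects (the fakes below are illustrative, not dbt's actual connection-manager API):

```python
class FakeHandle:
    def __init__(self, backend_pid):
        # Pid recorded when the wrapped connection was first opened.
        self.backend_pid = backend_pid


class FakeConnection:
    def __init__(self, handle):
        self.handle = handle


class FakeMasterCursor:
    def __init__(self):
        self.executed = []

    def execute(self, sql, params=()):
        self.executed.append((sql, params))


def cancel(master_cursor, connection):
    # Use the pid captured at connect time rather than querying
    # pg_backend_pid() now, which would block behind the running query
    # (or, run on master, return the wrong pid entirely).
    pid = connection.handle.backend_pid
    # Issue the terminate from the idle master connection.
    master_cursor.execute("select pg_terminate_backend(%s)", (pid,))


master = FakeMasterCursor()
cancel(master, FakeConnection(FakeHandle(backend_pid=4242)))
print(master.executed[0])  # ('select pg_terminate_backend(%s)', (4242,))
```

The key design point is that the busy connection is never asked anything at cancel time; only its pre-recorded pid is used, from a connection that is free to execute.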
Interrupting a dbt execution worked up until dbt-redshift 1.5, because `psycopg2` stored a backend pid on the handle that was used to cancel, but `redshift_connector` does not have this pid available.

If a connection is actively querying, attempting to query `select pg_backend_pid()` on that connection is delayed until the active query finishes (which may never happen in the case of a blocking lock).

dbt-databricks uses a connection wrapper to augment the `handle`, which inspired this approach.

Checklist