Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(connector): simplify and clean-up unused variants of ConnectorError #15031

Merged
merged 5 commits into from
Feb 13, 2024

Conversation

BugenZhao
Copy link
Member

@BugenZhao BugenZhao commented Feb 6, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

As title. See the documentation for more explanations.

Note that there were 4 error types widely used in the connector crate:

  • RwError
  • anyhow::Error
  • SinkError
  • ConnectorError

In #14950 we've migrated all RwError usages into anyhow::Error. So we still get 3.

This PR only changes the implementation of ConnectorError, which is the least used one. Following PRs may migrate anyhow::Error into it. Based on the review comments, we keep the enum structure untouched.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@BugenZhao BugenZhao marked this pull request as ready for review February 6, 2024 08:06
@BugenZhao BugenZhao requested a review from a team as a code owner February 6, 2024 08:06
Comment on lines 42 to 53
#[error("MySQL error: {0}")]
MySql(#[from] mysql_async::Error),

#[error("Postgres error: {0}")]
Postgres(#[from] tokio_postgres::Error),

#[error("Pulsar error: {0}")]
Pulsar(
#[source]
#[backtrace]
anyhow::Error,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean that we lose some error context for mysql/postgres? (For Pulsar it seems ok)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate more on the "context"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Postgres error:"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, this is added at the conversion from tokio_postgres::Error to ConnectorError, but now this is not.

TBH I think it's not very useful. But it can be useful if tokio_postgres::Error is very vague. In this case, Postgres error: blabla is more helpful.

I think ideally we should add context() according to the higher level action we are doing though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ideally we should add context() according to the higher level action we are doing though.

Agree.

To encourage developers to provide the necessary context, a good approach would be to define ConnectorError as an enum and utilize thiserror_ext::ContextInto for conversion. 😄

This PR primarily prepares for the next one to wrap the bare anyhow::Error within ConnectorError. Once this is done, we can then consider whether switching it back to an enum implementation is appropriate.

Copy link
Contributor

@tabVersion tabVersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally LGTM, let's wait for @StrikeW 's review.

@tabVersion tabVersion requested a review from StrikeW February 6, 2024 08:59
Base automatically changed from bz/connector-format-error-lint to main February 7, 2024 03:18
),

#[error("MySQL error: {0}")]
MySql(#[from] mysql_async::Error),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyhow::Error is good enough if one just wants to make the error type informative but not necessarily actionable. However, given a value of type anyhow::Error, it is hard to tell which module or crate it comes from, which may blur the boundary between modules when passing it around, leading to abuse.

I think the new ConnectorError on the right side will lose the context of specific connector, e.g. MySQL error. So it is just a new type of the anyhow::Error, and doesn't embed information of specific crate/module.

Without the context information, we don't know which module throws the error and must look into the stacktrace to find out. For example, previously we can tell the IO error comes from mysql #14847.

So I -1 for current implementation, could you embed the crate/module info into the macro?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically for #14847, I believe the retrying should be done before throwing the mysql_async::Error to upper layer, which is, before turning it into a ConnectorError. This is because performing ad-hoc matching for MySQL in the general code path of handling type-erased ConnectorError appears to be an abstraction leak. In this case, the internal structure of ConnectorError is not relevant.

However, if we're intend to do that on ConnectorError, anyhow still allows developer to downcast to a concrete type, just like trait objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a principle that errors should be either informative or actionable. In contrast of implementing the retry logic (actionable), I suppose the more reasonable part is that the refactor will lose some context in the error message (informative) and make it vaguer, as discussed in https://github.com/risingwavelabs/risingwave/pull/15031/files#r1479489349.

Specifically, the message will become:

- Connector error: MySQL error: Input/output error: Input/output error: can't parse: buf doesn't have enough data
+ Connector error: Input/output error: Input/output error: can't parse: buf doesn't have enough data

I admit that this undoubtedly make it less informative and harder to identify the root cause of external services at a glance. This could be mainly because the error messages from 3rd-party crates are not managed and reviewed by us. As a result, we are less familiar with them and could confuse them with others.


I'm okay to keep the original enum implementation. As pointed out by @xxchan, a prefix of MySQL error: is not as good as a manually-specified context message like Failed to read the offset from MySQL: , but could be still better than no context. We could improve that in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am most concerning the losing context problem which would increase the burden of troubleshooting. Ideally, if we can extract 3rd party crate name into the def_anyhow_newtype macro, then we can abandon the original enum implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the discussion with @xxchan, I've come up with another style of anyhow wrapper which will force the developers to provide contexts or detailed messages when upcasting the error. Will explore it in the next PRs.

///
/// [`anyhow::Error`] is good enough if one just wants to make the error type
/// informative but not necessarily actionable. However, given a value of type
/// [`anyhow::Error`], it is hard to tell which module or crate it comes from,
Copy link
Contributor

@StrikeW StrikeW Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get how this macro solve this problem: [anyhow::Error], it is hard to tell which module or crate it comes from. It seems like defining a wrapper type of anyhow::Error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like defining a wrapper type of anyhow::Error.

Yes, it is. I'm talking about the type-erased error type, instead of the error source type (like mysql::Error).

By wrapping anyhow::Error into ConnectorError, upper crate can clearly find it come from the connector crate and not be confused with other error types. With bare anyhow::Error, the boundary will not be clear.

Copy link
Contributor

@StrikeW StrikeW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, may change the PR description corresponding to new implementation.

@BugenZhao BugenZhao changed the title refactor(connector): make ConnectorError a wrapper of anyhow::Error refactor(connector): simplify and clean-up unused variants of ConnectorError Feb 8, 2024
@BugenZhao BugenZhao enabled auto-merge February 13, 2024 06:00
@BugenZhao BugenZhao force-pushed the bz/connector-error-use-anyhow-newtype branch from 3b29e67 to 2dbb5fc Compare February 13, 2024 06:18
@BugenZhao BugenZhao added this pull request to the merge queue Feb 13, 2024
Merged via the queue into main with commit befeba3 Feb 13, 2024
27 of 28 checks passed
@BugenZhao BugenZhao deleted the bz/connector-error-use-anyhow-newtype branch February 13, 2024 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants