-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let taps continue to next partitions and/or streams upon certain errors #280
Comments
We have class Stream:
def _sync_records():
for context in partitions:
try:
# process partition
except EndOfStreamError:
continue
class Tap:
def sync():
for stream in streams:
try:
# process stream
except EndOfStreamError:
continue |
MR started but never merged: #635 |
This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the |
Still relevant |
Migrated from GitLab: https://gitlab.com/meltano/sdk/-/issues/282
Originally created by @laurentS on 2021-11-29 22:40:06
Summary
Follow up to !195 and probably related to #137 (and maybe #134)
In release 0.3.14, there is now support for
RetriableAPIError
andFatalAPIError
, which I think should be extended to allow for the tap to continue after such an error (whether that's the default or an opt-in behaviour), in particular when dealing with a partitioned stream.Proposed benefits
Example use case:
tap-github
can fetch data for a list of repos. Currently, if it fails at some point, the rest of the list is not handled at all. See this issue for more details. In that particular case, not handling the error means we're not able to retrieve data past the error which seems persistent on some repos.Given that the tap is run as a separate process, bubbling the exception up to the top level doesn't actually help in that the calling process cannot get access to the details of it. Returning state (and maybe an error exit code) seems like a cleaner way to fail.
Proposal details
In
Stream._sync_records
, the code loops through a list of contexts, but there is no error handling. The loop could be enclosed in atry...except
block which could selectively capture the above exceptions and update state (or not) accordingly, before proceeding with the next context.Best reasons not to build
I can't really think of a reason why this would cause problems, but I only have the
tap-github
use case in mind. If there are downsides, that behaviour could be opt-in via a config option, with the default behaviour remaining as is.The text was updated successfully, but these errors were encountered: