Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unified file source stage #1184
base: branch-23.11
Are you sure you want to change the base?
Unified file source stage #1184
Changes from 14 commits
acfd425
dca5e91
d2f8ff8
51cdc0c
32883c0
2dd8d89
5aa68e5
bc73877
f27ea80
ea56c5f
0474b96
7bd1584
e6093ec
0feb848
c01c248
9a6b9e6
f5d0510
54e1f5c
bf953b6
e83edee
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remark: Adding this argument makes me uneasy, since it will be difficult to deprecate in the future if necessary.
Question: Is this being added as a new feature, or is this something that existed on any of the other file source implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remark: It appears that
== 0
and< -1
are invalid values formax_files
.Important: Check that
max_files
is in a valid range (if you decide to keep-1
as the default, adjust accordingly).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raising an error if
self._files
isNone
or[]
. We will get at least one value in theself._protocols
, so i didn't put an extra check.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, but what about
max_files
? Ifmax_files == 0
ormax_files < -1
, then this stage won't produce any files. In that case we should either warn or raise an exception.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
max_files
flag takes effect only when set to a value greater than zero; otherwise, it is treated as continuous polling without any imposed limit. Default value is-1
, so I thought raising an error or warn would not needed. Let me know if you still want to add the warning message.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remark: Ifprocessed_files_count > self._max_files
we getfiltered_files[:n]
wheren < 0
, meaning we'll take the lastn
files, which doesn't sound like what we want.Important: make sure we don't accidentally read from the end of the list offiltered_files
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see we won't get a negative number because
processed_files_count
is calculated based on_max_files
. My bad. No change needed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remark: Oh, I see. This is how we are stopping the source when we reach the max file limit. This is fine, but in general cancellation tokens are reserved for flagging from outside of the function that checks them. I think we can move this logic up in to the previous
if self._max_files > 0
condition and usebreak
orreturn
there rather than flagging the cancellation token. Up to you.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated as suggested. I tried to avoid
break
andyield
(multiple times), which is the reason i choosed this approach.