-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter mask #597
Filter mask #597
Conversation
Hey @Aske-Rosted thanks for this. Could we add a check before attempting to access the filter quantities that checks it's actually there? Because if its not, the conversion will break. Here is some pseudo-code:
|
Does this look good? @RasmusOrsoe |
src/graphnet/data/dataconverter.py
Outdated
@@ -435,6 +437,8 @@ def _extract_data(self, fileset: FileSet) -> List[OrderedDict]: | |||
continue | |||
if self._skip_frame(frame): | |||
continue | |||
if self._filter_mask(frame, self._I3filters): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._filter_mask
will return False
by default if "FilterMask" not in frame
. That means that everyone will be unable to extract i3frames without this field. That's not good!
You can resolve this by either
A) Make _filter_mask
return True
by default in these cases
B) change if self._filter_mask(frame, self._I3filters)
to if self._filter_mask(frame, self._I3filters) & len(self._I3filters) > 0
. This way, the frames are only skipped if the user wants to filter out frames this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should very much work such that a True
return from the _filter_mask call will skip the frame (through a continue call) while a False
return will move on to include the frame. I agree this is a little counter intuitive and should probably be specified in the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I was fooled. You are right!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey! Almost there. Left two comments
src/graphnet/data/dataconverter.py
Outdated
@@ -566,3 +570,27 @@ def _skip_frame(self, frame: "icetray.I3Frame") -> bool: | |||
if frame["I3EventHeader"].sub_event_stream == "NullSplit": | |||
return True | |||
return False | |||
|
|||
def _filter_mask( | |||
self, frame: "icetray.I3Frame", I3filters: List[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get rid of the codeclimate error if you let _filter_mask access the i3filters inside the function, instead of passing them as argument.
We should probably consider what is the more relevant standard behavior. Passing all of the included filters is required to keep the frame. or passing 1 of the included filters means keeping the frame. in the current version it requires passing all the included filters. |
Yes! I think the most elegant solution would be to create a class like so:
Our current
and your new filter based on filtermasks could look like so:
We can then adjust the arguments to
We can then set the new variable as member of
and re-write the
This way we have a lot of freedom to define filters as our needs evolve, and we wouldn't have to think to hard about default values as you mention. What do you think @Aske-Rosted? |
The IceTray unit test is failing due to missing due to |
@Aske-Rosted I re-ran the workflow and now the disk space issue is gone. However, now a new error about missing files is there. Could you double check that you did not by accident cause this? |
Fixed the unit test issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great now. Just a comment on snake-case argument naming. Well done @Aske-Rosted !
src/graphnet/data/dataconverter.py
Outdated
@@ -107,6 +108,7 @@ def __init__( | |||
workers: int = 1, | |||
index_column: str = "event_no", | |||
icetray_verbose: int = 0, | |||
I3_Filters: List[I3Filter] = [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use lower-case "snake-case" by convention. So the argument should be i3_filters: List[I3Filter] = []
src/graphnet/data/dataconverter.py
Outdated
# I3Filters (NullSplitI3Filter is always included) | ||
self._I3Filters = [NullSplitI3Filter()] + I3_Filters | ||
|
||
for filter in self._I3Filters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._i3filters
merge with main
@Aske-Rosted The IceTray disk space error is now fixed in main. Please try to update the branch and see if this doesn't fix the error. |
merge from main
hmm new unit test errors and I do not really see how they are related to this change... any thoughts @RasmusOrsoe? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
adds the option of using I3 files FilterMask to implement a filter when converting files