BUG: Fix annotation rejection idx #12895
- Keep bad annotations that end at the start sample (annotations that are potentially one sample long): end = start
- Do not discard bad annotations that are exactly one sample long: onset = end
- Also discard last sample of bad annotation: end - start + 1
- Fix now broken tests
- Add tests for annotations that are one sample long
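The `end - start + 1` in the commit message above is the usual difference between an exclusive stop index and an inclusive end index. A minimal sketch of the two counting conventions, using hypothetical helper names that are not part of MNE:

```python
def count_exclusive(start, stop):
    """Samples in [start, stop): the stop index is NOT part of the segment."""
    return stop - start


def count_inclusive(start, end):
    """Samples in [start, end]: the end index IS part of the segment."""
    return end - start + 1


# With start == end, the exclusive convention describes an empty segment,
# while the inclusive convention describes a segment that is one sample long.
print(count_exclusive(5, 5))    # 0
print(count_inclusive(5, 5))    # 1
print(count_inclusive(0, 100))  # 101
```

Under the inclusive convention, an annotation with `end = start` is kept as a one-sample segment instead of being dropped as empty.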
for more information, see https://pre-commit.ci
Thanks for the PR!
Indeed, but that's where this private helper function comes in:
I suspect you will find |
Thank you for the helpful comment! So, I have been thinking about the function
import numpy as np
import mne
from mne import Annotations
from mne.annotations import _annotations_starts_stops

sfreq = 10
info = mne.create_info(1, sfreq, "eeg")
# 30 s of random single-channel data
raw = mne.io.RawArray(np.random.RandomState(0).randn(1, 30 * sfreq), info)
# one BAD annotation with onset 0 s and duration 10 s
annotations = Annotations([0], [10], ['BAD'], raw.info['meas_date'])
raw.set_annotations(annotations)
starts, stops = _annotations_starts_stops(raw, 'BAD')
print(starts, stops)
100 is in fact the last sample of the bad segment, so if I do
The question is if we still want to go ahead with the change in |
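To make the off-by-one in the comment above concrete, here is a pure-Python model of the same numbers (10 Hz, 30 s of data, a BAD annotation from 0 s lasting 10 s). It is a simplified sketch that assumes times are converted to indices by rounding; it does not call MNE and is not its implementation:

```python
sfreq = 10
n_samples = 30 * sfreq
onset, duration = 0.0, 10.0

# Convert annotation times to sample indices by rounding (assumed model).
start = round(onset * sfreq)
stop = round((onset + duration) * sfreq)

samples = list(range(n_samples))
# Treating stop as exclusive drops samples 0..99 and keeps sample 100.
kept_half_open = samples[:start] + samples[stop:]
# Treating stop as the last bad sample drops samples 0..100.
kept_inclusive = samples[:start] + samples[stop + 1:]

print(start, stop)                            # 0 100
print(kept_half_open[0], kept_inclusive[0])   # 100 101
```

If sample 100 is considered part of the bad segment, only the second slicing removes it.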
How about a new kwarg FWIW we have for example |
Thanks, that sounds good. I will go ahead and implement it with the new kwarg! |
- Add include_last as keyword argument to _annotations_starts_stops
- Adapt corresponding function call in `BaseRaw.get_data()`, and correspondingly remove +1 indexing of used samples
- Update documentation of _annotations_starts_stops
- Add tests for _annotations_starts_stops
- Add towncrier entry for PR
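One plausible reading of an `include_last` switch, sketched in pure Python. The function `starts_stops` below is a hypothetical stand-in written for illustration; its signature and semantics are assumptions and do not reproduce `_annotations_starts_stops`:

```python
def starts_stops(onsets, durations, sfreq, include_last=False):
    """Hypothetical model: return start indices and exclusive stop indices.

    With include_last=True, each stop is moved one sample later so that
    slicing data[start:stop] also removes the annotation's last sample.
    """
    extra = 1 if include_last else 0
    starts = [round(o * sfreq) for o in onsets]
    stops = [round((o + d) * sfreq) + extra for o, d in zip(onsets, durations)]
    return starts, stops


print(starts_stops([0.0], [10.0], sfreq=10))                     # ([0], [100])
print(starts_stops([0.0], [10.0], sfreq=10, include_last=True))  # ([0], [101])
# A zero-duration annotation becomes one sample long instead of empty.
print(starts_stops([0.5], [0.0], sfreq=10, include_last=True))   # ([5], [6])
```

A keyword like this would let the caller choose the convention explicitly, so the `+1` would not have to be repeated at each call site.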
Okay so after implementing Because now
I think the underlying problem is that 'edge' annotations are a special case of annotations. If I understand correctly, they are 1-sample long annotations that are set to mark the beginning/end of a segment when different segments have been concatenated to a single data block. So the 'edge' annotations do not mark bad data, in contrast to, for example, the annotation of bad acquisitions by 'bad_acq_skip'. But the default parameter
I guess there are different potential ways forward from here (simply listed, not ordered by preference):
I'm glad about any feedback! Thanks |
I prefer (3), IMO this problem seems to be a bug within our private code/functions; and there is no reason to complicate the public API to tackle an internal bug. |
Yeah I'm also okay with (3). Regarding:
Do we actually set |
No, it is actually set to
In the case of the 'edge' annotation it might be quite obvious that the annotation should not be discarded. But let's imagine we want to annotate a bad segment that goes from tmin=0s to tmax=1s (duration=1s). If we set this annotation, and then call
From what I have seen in the codebase, when we have an annotation starting at 0s with a duration of, for example, 1s, only 0 to 0.99s (depending on the sfreq, of course) is considered part of the annotation, and the sample at 1s is not. Maybe it is too much to modify this behavior now, since it seems to be more or less consistent within MNE.
Maybe we should take a step back and think about how to solve the original problem that sparked this PR: if we annotated a bad segment at the end of the recording within the data browser, the last sample was not discarded. I think this is because, from the visual annotation, the annotation in the raw object (let's take a recording with tmin=0 and tmax=1s) is calculated as e.g.
import numpy as np
import mne
from mne import Annotations
from mne.annotations import _annotations_starts_stops

# assumes sfreq is already defined, e.g. sfreq = 10 as in the earlier snippet
info = mne.create_info(1, sfreq, "eeg")
raw = mne.io.RawArray(
    np.random.RandomState(0).randn(1, 30 * sfreq), info
)
annotations = Annotations([0], [1.0], ['BAD'], raw.info['meas_date'])
raw.set_annotations(annotations)
onsets, ends = _annotations_starts_stops(raw, 'BAD')  # , include_last=False)
print(onsets, ends)
_, times = raw.get_data(return_times=True, reject_by_annotation='omit')
print(times[0])
raw_crop = raw.crop(tmin=1.0)
print(raw_crop.annotations[0])
|
I would say no. But also agree that converging on this view and modifying code etc. to abide by it would probably be far from trivial!
It seems like you should in principle be able to extend the annotation to the end of the time span represented by the sample. So if our model/mapping of the samples and continuous time is (and I think it is based on what we've discussed above) for example at 1000 Hz
i.e., sample 0 represents the time from 0 to 0.001, sample 1 the time from 0.001 to 0.002, etc., so sample 999 represents the time from 0.999 to 1.000, so our annotations should be allowed to go out to 1.000, not just 0.999. If this squares with our code, we should add to a comment in the code at least, and maybe also the But I digress a bit... but it seems like this problem would be solved by:
|
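The sample-to-time model described in the comment above (sample 0 covers 0 to 0.001 s at 1000 Hz, sample 999 covers 0.999 to 1.000 s) can be sketched as follows. Both helpers are hypothetical names introduced for illustration, and the rounding-based index conversion is an assumption rather than MNE's code:

```python
def sample_span(i, sfreq):
    """Time span (in seconds) that sample i represents under this model."""
    return i / sfreq, (i + 1) / sfreq


def samples_covered(onset, duration, sfreq):
    """Sample indices whose span lies inside [onset, onset + duration)."""
    start = round(onset * sfreq)
    stop = round((onset + duration) * sfreq)
    return range(start, stop)


sfreq = 1000
print(sample_span(0, sfreq))    # (0.0, 0.001)
print(sample_span(999, sfreq))  # (0.999, 1.0)

# An annotation from 0 s with a duration of 1 s covers samples 0..999,
# so it may extend out to 1.000 s without touching sample 1000.
covered = samples_covered(0.0, 1.0, sfreq)
print(covered[0], covered[-1], len(covered))  # 0 999 1000
```

Under this model, an annotation drawn up to the end of the recording covers the last sample in full, which is the case that originally motivated the PR.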
Okay, then I think this is actually the easiest way forward:
Thanks a ton for the discussion! |
Reference issue (if any)
Addresses #12893
What does this implement/fix?
As discussed in the issue above, I implemented a fix so that the last sample of "bad" annotations in `raw.get_data` is now rejected as well.
However, I noticed that some other functions, for example `filter`, behave the way `get_data` did before: the last sample of the "bad" annotation is not rejected. This means that for consistent behavior, we should also adapt these functions accordingly, right?
I'm happy about any other feedback and comments.