-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix binary metrics so they properly return NaN when appropriate #113
Conversation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some questions: just wanted to confirm a few things before merge!
non_nan = (!).(isnan.(x) .| isnan.(y)) | ||
x = x[non_nan] | ||
y = y[non_nan] | ||
isempty(x) && return NaN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to check if y
is empty as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nvm, I was confusing myself initially but I think the single check covers it! If x
is empty y
must be empty as well, so the single check is sufficient.
true_positive_rate = (true_positives == 0 && actual_positives == 0) ? | ||
(one(true_positives) / one(actual_positives)) : | ||
(true_positives / actual_positives) | ||
true_negative_rate = (true_negatives == 0 && actual_negatives == 0) ? | ||
(one(true_negatives) / one(actual_negatives)) : | ||
(true_negatives / actual_negatives) | ||
false_positive_rate = (false_positives == 0 && actual_negatives == 0) ? | ||
(zero(false_positives) / one(actual_negatives)) : | ||
(false_positives / actual_negatives) | ||
false_negative_rate = (false_negatives == 0 && actual_positives == 0) ? | ||
(zero(false_negatives) / one(actual_positives)) : | ||
(false_negatives / actual_positives) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you remember there a rationale behind these quantities being 1/0 in the edge case? I agree these should return NaN but just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I traced this code back to https://github.com/beacon-biosignals/OldLighthouse.jl/pull/36 (private repo), so I believe it was to support ROC curves better, where NaNs could contaminate the whole AUC even if it's just some 0.0 threshold or something. But IMO that is much better handled in the AUC computation itself rather than here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know the rationale behind lighthouse returning missing
instead of NaN
? I'm worried that there may be code that depends on this on this behavior rather than returning NaN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, and I agree this is a dangerous change. However, the most common usages of area_under_curve
are
Line 222 in 2a4494e
return TradeoffMetricsV1(; class_index, class_labels, roc_curve, |
Line 302 in 2a4494e
roc_auc::Float64 |
So I think there is a bug somewhere, either TradeoffMetricsV1 needs to support missing or area_under_curve needs to not return missing. I believe this change will be less disruptive. It does contravene the docstring for area_under_curve
so I believe it is technically a breaking change. We could do a breaking version bump here but that will cause a lot of compat updating work and I don't think this will break any users in practice, and it is essentially a bugfix, so I think it's OK. But it is a bit dicey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took another look at this and answered one of my own questions, so I'll approve! Would still like info on the other questions just for context if you have it, but they're less important for merge
fixes #99