feat: Add Cramer-rao uncertainties + covariance using autodiff to non-minuit fits by default #2269
Conversation
Right now pyhf.infer.mle.fit(..., return_uncertainties=True) silently ignores that kwarg if the backend isn't autodiff-enabled, as far as I can see. I think we'd need to raise at least a warning, if not error out entirely in that case. Ignoring the kwarg also means that the returned shape is not as expected.
edit: it seems that silently ignoring it was already the default behavior previously, which is not ideal in my opinion.
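Not part of the PR, just to illustrate the kind of guard being suggested here -- a minimal sketch with hypothetical function and variable names, warning when the kwarg cannot be honoured:

```python
import warnings


def check_uncertainties_supported(return_uncertainties, optimizer_name, backend_name):
    """Hypothetical guard: warn if uncertainties were requested but cannot be computed."""
    autodiff_backends = {"jax", "pytorch", "tensorflow"}  # backends with autodiff support
    if (
        return_uncertainties
        and optimizer_name != "minuit"
        and backend_name not in autodiff_backends
    ):
        warnings.warn(
            "return_uncertainties=True was requested, but neither minuit nor an "
            "autodiff-capable backend is in use; the kwarg will be ignored.",
            UserWarning,
        )
```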
Can you point to the specific line(s) here? I can't see that behaviour on first pass. (edit: as Alex said, this is just already in main somewhere, not part of this PR!)
Thanks @phinate! This would be a welcome addition I think. I probably won't have time to review this until later this week, but from a quick scan it looks like there are still some development things to do, so I've put it in draft mode. There might be enough work here to split things out across maybe 2 PRs to try to keep things more atomic, but we can see. This would close Issue #2268, correct? If so, can you please note that in the PR body?
Indeed, sorry, I should have made this a draft! I wasn't sure if the draft would trigger the CI or not (was curious if it would pass before local testing). I've updated the PR with that issue number, thanks for that! That gist was mainly for the errors on yields, so I got a bit mixed up, but the title is clear :)
Codecov Report -- Attention: Patch coverage is
Additional details and impacted files
@@           Coverage Diff           @@
## main #2269 +/- ##
==========================================
- Coverage 98.21% 97.88% -0.33%
==========================================
Files 69 69
Lines 4543 4593 +50
Branches 804 814 +10
==========================================
+ Hits 4462 4496 +34
- Misses 48 59 +11
- Partials 33 38 +5
Worked some more on this, and expanded the scope of the uncertainty and correlation tests to encompass non-minuit options. Importantly, I've addressed an oversight where the covariance matrix (hess_inv) was not treated correctly during fits with fixed parameters -- it's right to zero out the uncertainties, but then the covariance matrix also needs the relevant rows/columns zeroed out too (noticed this when calculating correlations). I don't think this was used actively downstream, but it's been fixed for minuit + scipy now.
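To make the row/column zeroing concrete, here is a small standalone sketch (not the PR's actual code) of masking an inverse-Hessian for fixed parameters:

```python
import numpy as np

# Example inverse-Hessian (covariance estimate) for three parameters.
hess_inv = np.array(
    [
        [0.04, 0.01, 0.00],
        [0.01, 0.09, 0.02],
        [0.00, 0.02, 0.16],
    ]
)
fixed = np.array([False, True, False])  # parameter 1 held fixed in the fit

free = (~fixed).astype(float)
cov = hess_inv * np.outer(free, free)  # zero out the fixed parameter's row and column
uncertainties = np.sqrt(np.diag(cov))  # the fixed parameter now has zero uncertainty
```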
I got worried that this also would probably imply a bug in

```python
import pyhf

model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)
data = [62, 63] + model.config.auxdata

pyhf.set_backend("numpy", "minuit")
fix_pars = model.config.suggested_fixed()
fix_pars[2] = True

res, corr, res_obj = pyhf.infer.mle.fit(
    data, model, fixed_params=fix_pars, return_correlations=True, return_result_obj=True
)
print(corr)
print(res_obj.minuit.covariance.correlation())
```

prints correlation matrices which correctly zero out those columns, both from the explicit return and from the minuit object.
Ah okay, that's great to know -- I've just located the reasons for this after looking through the minuit issue: scikit-hep/iminuit#762 (comment)
Sorry for flagging this with the wrong wording! I was just confused by the fact that the uncertainties need to be manually set to zero within pyhf, but the covariance wasn't edited in turn (it is needed for autodiff at least, so it has been useful to look at!)
I've marked this ready for review, because it now seems to do the correct thing in the cases I've tried, and the tests are passing. I've not added any new tests -- just updated the existing uncertainty-related tests. This is something I'd like your thoughts on as part of double-checking this: Lines 425 to 428 in 2787b92
It seems like for the model that's being tested here, there's quite a big difference in the minuit/autodiff uncertainties (0.6 compared to 0.2). The autodiff result is consistent across backends. @alexander-held made a plot for this pre-fit. Is it important that we keep this model in particular? It seems like we can tweak this a little to make the uncertainties agree more (e.g. by adding more data in the first bin as Alex suggested), but I'm not 100% sure of the scenarios in which I'd expect the minuit/autodiff uncertainties to stop agreeing. Open to your thoughts here!
What's the central value in this case? I'm guessing the issue might be that the lower default POI bound of 0 is close to the best-fit value, and therefore the uncertainty estimates don't make too much sense. The easiest way to avoid that would be relaxing the POI bound (or changing the model / data).
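For reference, relaxing the POI bound as suggested would look roughly like this, reusing model and data from the snippet above; the new bound values are only an example:

```python
# Loosen the default POI bound before fitting so the best-fit value
# is not pushed up against the boundary.
par_bounds = model.config.suggested_bounds()
par_bounds[model.config.poi_index] = (-10.0, 10.0)  # example values only

result = pyhf.infer.mle.fit(
    data, model, par_bounds=par_bounds, return_uncertainties=True
)
```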
So the test where that line is fixes the POI to 1 in the fit -- even so, it seems the bounds are also (-5, 5) there.
If it is just the histosys, I am not really sure, but given the big difference you could propagate this through to the post-fit yield uncertainties; they should be comparable to sqrt(N). Given the large difference, that should show which of the numbers is correct.
Doing the fixed POI fit again with mu=1 and propagating:
Looks like autodiff is much more like sqrt(N), with the difference being in the first bin. Still trying to work out if this kind of severe overconstraint on the NP makes sense given the tight sys band pre-fit. I think this is at least evidence that the autodiff uncertainties are not malfunctioning in some way, so good news! For the sake of scoping, should we raise this in a minuit issue to see if it's expected or not -- i.e. double-check what's an artefact of the model versus of the optimizer? Then we can patch the tests accordingly here in a later PR if needed, leaving them as-is for now. (edit: slight update in numbers, I accidentally used a slightly modified model from the original post)
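For anyone following along, the propagation check described above could be sketched roughly like this with the JAX backend; model is the two-bin model from earlier in the thread, and bestfit / cov stand in for the fitted parameters and their covariance (e.g. from the Hessian-based approach in this PR):

```python
import jax
import jax.numpy as jnp

# assumes pyhf.set_backend("jax") has been called so the model is differentiable


def expected_yields(pars):
    # Expected main-model yields (no auxdata) as a function of the parameters.
    return model.expected_actualdata(pars)


jac = jax.jacfwd(expected_yields)(jnp.asarray(bestfit))  # d(yield) / d(parameter)
yield_cov = jac @ jnp.asarray(cov) @ jac.T               # linear error propagation
yield_unc = jnp.sqrt(jnp.diag(yield_cov))                # post-fit yield uncertainties

# Compare to a sqrt(N)-like scale using the post-fit expected yields.
print(yield_unc, jnp.sqrt(expected_yields(jnp.asarray(bestfit))))
```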
How does MINOS compare for the parameter uncertainties? |
It's worth noting that this difference between the two methods disappears when doing MLE fits instead of fixing the POI in the fit. I think these tests in
Just been and wiped the dust off this PR in the Hacktoberfest spirit! Tagging @alexander-held @matthewfeickert for when you have a moment. To recap the behaviour here: result = pyhf.optimizer.minimize(..., return_uncertainties=True) will return uncertainties depending on the choice of optimizer:
I have not added a keyword argument to force autodiff uncertainties if using minuit, so if that was desired, it's missing right now. Also, the discussion above has been incorporated into the tests by just allowing the minuit and autodiff uncertainties to drastically disagree, with a link to this discussion added as a comment. I think we verified that it's a result of either the model or of doing a fixed-POI fit to evaluate the uncertainties (the differences disappear when doing a global fit). Let me know if there's more I can do with this PR before it's good to go!
Description
One of the main reasons minuit is the default optimizer is that it packages up fit results with uncertainties. We can do that for any fitting procedure using the Fisher information matrix!
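The idea in a few lines (a sketch of the approach rather than the exact implementation in this PR), using the JAX backend and the two-bin model from the discussion above:

```python
import jax
import jax.numpy as jnp
import pyhf

pyhf.set_backend("jax")  # any autodiff-capable backend works for this

model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)
data = [62.0, 63.0] + model.config.auxdata

bestfit = pyhf.infer.mle.fit(data, model)  # maximum-likelihood estimate

# Observed Fisher information = Hessian of the negative log-likelihood at the MLE.
# Its inverse is the Cramér-Rao estimate of the parameter covariance.
nll = lambda pars: -model.logpdf(pars, jnp.asarray(data))[0]
covariance = jnp.linalg.inv(jax.hessian(nll)(jnp.asarray(bestfit)))
uncertainties = jnp.sqrt(jnp.diag(covariance))
```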
This PR adds this behaviour by default for the case that the optimizer isn't minuit. The scipy.optimize fit result is packaged up with the uncertainty info as if it was calculated natively. (Still needs tests!)
Would close #2268.