xml2json extremely slow #1882
Comments
Thanks for the report @cburgard. Can you please share the files as a private GitHub Gist with the dev team? |
The files are too large for a gist, but I've shared them with you privately on CERNbox. Please distribute them to the team. |
@cburgard Please first verify that you have read through the translations docs and tell us how the workspaces were created. Then please tell us the exact commands that you are using. It is important to understand how you created them, because the files you've provided are not reproducible as-is, given that they contain hardcoded paths throughout like |
The workspaces were created with SFramework, specifically with this function: https://gitlab.cern.ch/atlas-caf/CAFCore/-/blob/b1d40c29b0b4635c28d563db1483317d4a8b25d5/SFramework/Root/TSModelFactory.cxx#L176 The exact command I was using is
|
@cburgard Okay, well as you'll note we have no translation recipe for |
SFramework doesn't really use these XMLs; they are just produced as a debugging tool. If you're aware of a simple way to instruct the C++ implementation of HistFactory to output XMLs with relative rather than absolute paths, I'd be happy to reproduce the XMLs with that setting, otherwise, |
Thanks for the info, as this helps move things forward. Manually changing paths isn't a reproducible workflow, and so can't be considered seriously. 😬 Perhaps when he's back from holiday we can ask @kratsg if he has any insight into how HistFitter is able to do things, unless @longjon929 is able to comment on that sooner. |
@cburgard great news (I hope)! @kratsg is adding a feature in PR #1909 that automatically deals with hardcoded absolute paths. When he runs the conversion on an old 2011 MacBook running macOS 10.11 (his current MBP is in for repairs 😢) he gets the following:
$ time pyhf xml2json --hide-progress -v $PWD:/home/cburgard/Physics/statistics/pyfittools/ --output-file test.json hww-xml/HWWRun2GGF.xml
real 1m7.212s
user 1m6.320s
sys 0m1.180s
so hopefully you'll get things even faster than that on your local machine. |
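For readers unfamiliar with the idea, the `-v LOCAL:ORIGINAL` argument in the command above pairs a local directory with the absolute prefix baked into the XMLs. The following is only a minimal sketch of that kind of prefix remapping, not pyhf's implementation; the helper name `remap_path` and the local directory `/tmp/work` are hypothetical.

```python
from pathlib import PurePosixPath


def remap_path(path, mounts):
    """Map a hardcoded absolute path onto a local directory.

    `mounts` is a list of (local_dir, original_dir) pairs, mirroring the
    LOCAL:ORIGINAL order of the mount argument. Hypothetical helper for
    illustration only; this is not how pyhf is implemented.
    """
    target = PurePosixPath(path)
    for local_dir, original_dir in mounts:
        try:
            relative = target.relative_to(PurePosixPath(original_dir))
        except ValueError:
            continue  # this mount does not cover the path
        return str(PurePosixPath(local_dir) / relative)
    return path  # no mount matched: leave the path untouched


# /tmp/work is an assumed stand-in for $PWD
mounts = [("/tmp/work", "/home/cburgard/Physics/statistics/pyfittools/")]
remapped = remap_path(
    "/home/cburgard/Physics/statistics/pyfittools/hww-xml/HWWRun2GGF.xml", mounts
)
print(remapped)
```

Paths outside every mounted prefix are returned unchanged, so relative paths in a well-formed workspace are unaffected.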
Hi @matthewfeickert, sorry for being slow, and I imagine this isn't very useful. It looks like HistFitter writes out the XML by hand: it makes a top-level file from the fitconfig here, then loops over each channel and dumps one file per channel here. I imagine HF was developed with this structure in mind from HistFactory, which makes it straightforward to write out the XML. |
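To make the "top-level file plus one file per channel" layout concrete, here is a small sketch that lists the channel files a top-level file points to. It assumes the usual HistFactory convention of a `Combination` root element whose `Input` children name the channel XMLs; the function name and the example file names are hypothetical, and real files may carry more structure than shown.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def list_channel_files(top_level_xml):
    """Return the channel XML paths named by <Input> elements."""
    root = ET.parse(str(top_level_xml)).getroot()
    return [node.text.strip() for node in root.iter("Input") if node.text]


# Assumed minimal top-level file; channel file names are made up.
example = """<Combination OutputFilePrefix="./results/example">
  <Input>./config/channel_a.xml</Input>
  <Input>./config/channel_b.xml</Input>
</Combination>"""

with tempfile.TemporaryDirectory() as tmp:
    top = Path(tmp) / "combination.xml"
    top.write_text(example)
    channels = list_channel_files(top)

print(channels)
```

Each listed path would then be opened in turn, which is why absolute paths inside these files matter when the XMLs are moved to another machine.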
|
This will be better with #1909 but you can already use |
A final note on this, as I write up the release notes in PR #1705:
$ ls
hww-xml
$ time pyhf xml2json \
  --hide-progress \
  --mount $PWD:/home/cburgard/Physics/statistics/pyfittools/ \
  --output-file workspace.json \
  hww-xml/HWWRun2GGF.xml
pyhf xml2json --hide-progress --mount --output-file workspace.json 14.71s user 2.33s system 119% cpu 14.269 total
👍 Great work by @kratsg! 🚀 |
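For anyone who wants to repeat this kind of wall-clock measurement from a script rather than the shell's `time`, a rough sketch follows. The helper `time_command` is hypothetical, and the command it runs here is a stand-in so the sketch works without pyhf installed; substitute the `pyhf xml2json` argument list from the comment above to time the real conversion.

```python
import subprocess
import sys
import time


def time_command(args):
    """Run a command and return (exit code, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(args, check=False)
    return completed.returncode, time.perf_counter() - start


# Stand-in command (an assumption for portability): a no-op Python process.
returncode, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit code {returncode}, {elapsed:.3f}s wall-clock")
```

This only captures wall-clock time; the user and system CPU split reported by the shell is not reproduced.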
Summary
When trying to convert some XMLs to pyhf JSON, I found that the runtime was unacceptably slow (on the order of hours).
OS / Environment
Steps to Reproduce
The XMLs are ATLAS internal, but I'm happy to share them with ATLAS members of the dev team on request.
File Upload (optional)
No response
Expected Results
I was hoping it would finish relatively fast.
Actual Results
It took hours.
pyhf Version
Code of Conduct