-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import time #245
Comments
Loading the nuclei table takes 10 seconds. @eduardo-rodrigues we need to address this before releasing. You could load the entire numpy library almost 100 times in this timeframe. Particle was designed to load a few hundred particles in a table, not 6,500. (though the speed per object is rather dismal...) |
Ran py-spy and the flame graph showed that all the time was in in the search for an existing particle. I've moved the internal |
@henryiii, these timing issues you are seeing is something I had sorted out a while back, see https://github.com/scikit-hep/particle/pull/231/commits and 55486ea specifically. I'm very surprised you are seeing the same kind of issues again. It was all working fine. I have no idea what happened in between the developments. Good anyway that you find another solution! I will look closer at the updates today ... |
Just confirming - on my Windows 3.7.7 local installation all goes fast. Hmm … |
The issue with timing was indeed something I saw when adding information on nuclei, hence the discussion at the time since trouble for information that >95% of our users will not care about was a no-go. Indeed there are many isotopes added … BTW, I'm working on a complete rewrite of the data files, which will be much easier to maintain and rather comprehensive. This relates also to #118 and will be much nicer. News as soon as I find more time to add the new data per categories - quarks, leptons, light mesons, charm baryons, etc.; you get it. |
I have not played with this so if you have some commands to teach me and give it a go, I would be very interested. Thanks a lot! |
The problem was that I had to remove the file chaining that was being done, and that removed your speed hack. I might be able to restore the speed boost there too, and get even more! (Greedy cackling). I will check. The set nicely speeds up user code too, while the old hack only helped the original file loading. |
Agreed that using Yep, we may be able to be even faster ... |
Set checking is order 1; I can even drop the initial check for append entirely. (If you have multiple particles with the same PDGID, the last one will always be present). There's no performance difference for turning it off at all. |
* Some ignores for backports for typing * Fixing a bunch of unclosed file warnings * Starting on mypy support * Adding MyPy support to PDGID * Full mypy support added to pdgid/functions As a result, anything that supports __int__ can now be used in the functions; this includes Particle directly! * Adding pre-commit changes * Basic types for Particle (with several skips * Fix for extra __main__ being dumped in the wrong place * Fix for Python 2 * Fix for loading from zipfiles * Make typing optional for the ZipApp * Prepare to push ZipApp as well * ZipApp requires Python 3.7+ * ZipApp compression and earlier Python 3 support * Minor optimizations * No need to skip a file that is not present * Change internal table to a set. 100x faster load, faster searches. Fixes #245 * Add GHA badge * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md * Remove old performance addition - not needed with sets! * Mention PDGID works on any SupportsInt Co-authored-by: Eduardo Rodrigues <eduardo.rodrigues@cern.ch>
I've noticed that something that changed recently has caused the import time to skyrocket. (This could even just be in my branch). Running this in Python 3.7 with
python -X importtime
shows a whopping 10+ seconds devoted toparticle.particle.literals
!The text was updated successfully, but these errors were encountered: