Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additions to MiniDST.xml #151

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

Conversation

ueinhaus
Copy link
Contributor

BEGINRELEASENOTES

  • added second CPID processor, running inference based on 2fZhad (in addition to singleP)
  • added WWCategorisationProcessor
  • added ParticleDFilter to remove superfluous PID entries from before dE/dx correction
    ENDRELEASENOTES

ueinhaus and others added 12 commits February 16, 2024 12:51
…ning with single particles and one based on 2-fermion-Z-hadronic events.

This includes the weight files and the reference files which point to the weight files.
In addition, the CPID steering is added to the MarlinRecoMiniDST.xml.
Paths are relative and only work if executed from the StandardConfig/production folder!
…ning with single particles and one based on 2-fermion-Z-hadronic events.

This includes the weight files and the reference files which point to the weight files.
In addition, the CPID steering is added to the MarlinRecoMiniDST.xml.
Paths are relative and only work if executed from the StandardConfig/production folder!
Comment on lines +1031 to +1036
-1.28883368e-02 2.72959919e+01 1.10560871e+01 -1.74534200e+00 -9.84887586e-07
6.49143971e-02 1.55775592e+03 9.31848047e+08 2.32201725e-01 2.50492066e-04
6.54955215e-02 8.26239081e+04 1.92933904e+07 2.52743206e-01 2.26657525e-04
7.52235689e-02 1.59710415e+04 1.79625604e+06 3.15315795e-01 2.30414997e-04
7.92251260e-02 6.38129720e+04 3.82995071e+04 2.80793601e-01 7.14371743e-04
1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a comment stating where this numbers come from?

@tmadlener
Copy link
Contributor

I am not entirely sure where the merge conflict comes from, but if you add the comment for the dE/dx numbers, I can deal with that and ping you again for a final review once it's done.

@dudarboh
Copy link
Member

The inference files increase the repository size by 770 Mb, approximately the current repository's size.
The current repository is 316 Mb, + git history is 463 Mb.
I think this may create a long-term problem if we keep this practice.

  • Re-training current inference (e.g. found bug, new PID, new physics samples, etc...) will leave the 700 Mb in git history.
  • Or if we keep the same practice for new NN/BDT solutions and each adds about 700 Mb, the repo size will grow quickly.
    While those are necessary for running miniDST, they are not used at all for running full reconstruction, for example...
    Moreover, those weight files are not intended to be human-readable, so I am curious if it makes sense to keep them in git history.

@tmadlener, @ueinhaus, what do you think?

As one of the solutions that occurs to me, would it be better if we package weight files into the release? @tmadlener, what do you think?
Then, even if we develop ten different NN/BDTs for various purposes, we can easily share all of them in the releases and keep the git history clean.
I understand that three analyses are waiting for miniDSTs, and we might need to go with the fastest solution for now.
So these are just thoughts out loud.

@tmadlener
Copy link
Contributor

It's definitely a concern and it will probably become even worse, once we have actual binary files from e.g. torch or onnx. This repository is not small to begin with, because it already has quite a few weight files etc., so we are building on existing practice, even though the scale is obviously not great.

Technically the release tarballs don't have the git history, so at least that part is not too hard, but for an initial clone and for checking out different branches git has to do a lot of heavy lifting.

Some issues with keeping the weights separate (without any particular order)

  • Chances that they diverge at some point are increased as keeping them in sync now means that we have to change two "repositories"
  • They need to be stored in a place that is publicly accessible and that allows to host potentially 10s GBs of weight files (free of charge).
  • I don't know if there is an easy way to "attach" some things to a github generated release tarball after the fact.

Some of these can be worked around, or we could use existing things (e.g. CMake ExternalData and CERN's EOS storage could be used to address the first two while keeping chances of mismatches between weights and ILDConfig versions minimal).

The "proper" solution to this would probably be to treat all of these weights as conditions data and then retrieve them from a conditions database. However, setting that up (and maintaining it longish term) is another issue.

TL;DR. I agree, but it's complicated and quite a bit of work for somewhat diminishing returns at the moment (as far as I can see)

@dudarboh
Copy link
Member

dudarboh commented Nov 19, 2024

Just FYI, I thought to do something like this: https://github.com/dudarboh/ILDConfig/releases/tag/v00-00
I just added archives with weights manually to the release. Not hacking the Github generated tarball, but as extra stuff.
It is usually done to distribute compiled binaries of the release, but can be as well used for inferences, why not.

From the user perspective it is a couple of extra steps, i.e. downloading the model separately from the code, extracting it and putting in the right folder.

EDIT: commenting on size. The maximum size we can attach to the github release is ~2 Gb. Assuming for compression to zip/tar/etc it can be up to 10 Gb.
For the larger models I think huggingface is commonly used, which, from the quick google, has 50 Gb limit.

But, I understand that the manpower is the main showstopper here.
I thought it is still worth at least to create a discussion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants