Current status of Boosted Tau #89

Draft
wants to merge 90 commits into
base: master

aloeliger

I promised a while ago to open a PR for the boosted tau changes that went into some of our initial trainings. This is the status of the boosted tau repository, merged with the master branch as of March 18, 2022. Please do not merge these changes, of course, but perhaps we can discuss how to keep boosted tau developments and HPS tau developments concurrent.

That said, training currently does not work with this state of the boosted tau master branch. The repository seems to have compilation errors. Using the current state of master with these code changes, we get:

Compiling DataLoader headers.
In file included from input_line_35:1:
/afs/hep.wisc.edu/cms/aloeliger/DeepBoostedTau/CMSSW_10_6_20/src/TauMLTools/Training/interface/DataLoader_main.h:204:13: error: use of undeclared identifier 'debug'
if (debug) hist_weights[tau_type]->SaveAs(("Temp_"+tau_name+".root").c_str()); // It's required that all bins are filled in these histograms; save them to check incase binning is too fine and some bins are empty
^
/afs/hep.wisc.edu/cms/aloeliger/DeepBoostedTau/CMSSW_10_6_20/src/TauMLTools/Training/interface/DataLoader_main.h:204:33: error: use of undeclared identifier 'tau_type'
if (debug) hist_weights[tau_type]->SaveAs(("Temp_"+tau_name+".root").c_str()); // It's required that all bins are filled in these histograms; save them to check incase binning is too fine and some bins are empty
^
/afs/hep.wisc.edu/cms/aloeliger/DeepBoostedTau/CMSSW_10_6_20/src/TauMLTools/Training/interface/DataLoader_main.h:204:60: error: use of undeclared identifier 'tau_name'
if (debug) hist_weights[tau_type]->SaveAs(("Temp_"+tau_name+".root").c_str()); // It's required that all bins are filled in these histograms; save them to check incase binning is too fine and some bins are empty
^
/afs/hep.wisc.edu/cms/aloeliger/DeepBoostedTau/CMSSW_10_6_20/src/TauMLTools/Training/interface/DataLoader_main.h:252:62: error: use of undeclared identifier 'DeepTauVSjet_cut'
if (gen_match &&tau.tau_byDeepTau2017v2p1VSjetraw >DeepTauVSjet_cut){
^
/afs/hep.wisc.edu/cms/aloeliger/DeepBoostedTau/CMSSW_10_6_20/src/TauMLTools/Training/interface/DataLoader_main.h:916:76: error: use of undeclared identifier 'rm_inner_from_outer'
const bool accept_outer = !inner && inside_iso_cone && (!rm_inner_from_outer || !inside_signal_cone);
^
*** Break *** segmentation violation

These errors point to the following lines:

if (debug) hist_weights[tau_type]->SaveAs(("Temp_"+tau_name+".root").c_str()); // It's required that all bins are filled in these histograms; save them to check incase binning is too fine and some bins are empty

(debug doesn't seem to be referenced anywhere else in DataLoader_main. Is it brought in from the Python configuration used for compilation, or is it global to one of the includes? The tau type errors are on my end, I believe.)

if (gen_match &&tau.tau_byDeepTau2017v2p1VSjetraw >DeepTauVSjet_cut){

(DeepTauVSjet_cut also does not seem to be referenced elsewhere in DataLoader_main.)

const bool accept_outer = !inner && inside_iso_cone && (!rm_inner_from_outer || !inside_signal_cone);

(rm_inner_from_outer similarly does not seem to be referenced elsewhere in DataLoader_main)
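If these names are meant to come from the Python-side configuration, the mechanism might look roughly like the sketch below: settings are rendered as C++ constant declarations that get declared before the header is compiled. This is only an illustration of the idea, not a description of how TauMLTools actually does it. The function names, the `Setup` namespace, and the values in `settings` are all hypothetical.

```python
# Hypothetical sketch: render Python-side settings as C++ constants that are
# declared before the header is compiled. All names and values are assumptions.

def cpp_literal(value):
    """Map a Python value to a (C++ type, C++ literal) pair."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool", "true" if value else "false"
    if isinstance(value, int):
        return "int", str(value)
    if isinstance(value, float):
        return "double", repr(value)
    raise TypeError(f"unsupported setting type: {type(value).__name__}")


def render_setup_namespace(settings, namespace="Setup"):
    """Build a C++ namespace that declares one constant per setting."""
    lines = [f"namespace {namespace} {{"]
    for name, value in settings.items():
        cpp_type, literal = cpp_literal(value)
        lines.append(f"    const {cpp_type} {name} = {literal};")
    lines.append("}")
    return "\n".join(lines)


# Placeholder values, not taken from any real configuration.
settings = {"debug": False, "DeepTauVSjet_cut": 0.5, "rm_inner_from_outer": True}
declarations = render_setup_namespace(settings)
print(declarations)
```

With a scheme like this, a setting that is missing from the configuration would surface exactly as the "use of undeclared identifier" errors above. The header would also need to refer to the constants through the namespace, or bring the namespace into scope, for unqualified names such as `debug` to resolve.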

I have tried rolling back through versions of the dataloader with our changes applied on top of them, but even when I reach a version where only the debug error is present (and commented out), I get runtime errors in DataLoaderBase complaining about this line:

return tuple([tuple([x.clone().numpy() for x in X[0]]),

I don't have the exact error available, but I believe it said that EagerTensor (numpy.EagerTensor perhaps?) has no attribute clone, which suggests that this function is being handed an object of a type it does not expect.
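For reference, `clone()` is a PyTorch tensor method, while a TensorFlow eager tensor offers `numpy()` but no `clone()`, so the line would fail if TensorFlow tensors reach it. A minimal sketch of a conversion that tolerates both kinds of object is shown below. `to_numpy` and `convert_group` are hypothetical helper names, and the two `Fake...` classes are stand-ins so the example runs without either library installed.

```python
def to_numpy(x):
    """Convert a tensor-like object using duck typing (hypothetical helper)."""
    if hasattr(x, "clone"):
        # PyTorch-style tensor: copy first, then convert the copy.
        return x.clone().numpy()
    if hasattr(x, "numpy"):
        # TensorFlow-style eager tensor: no clone(), but numpy() is available.
        return x.numpy()
    # Anything else is passed through unchanged.
    return x


def convert_group(group):
    """Convert every element of one input group into a tuple of arrays."""
    return tuple(to_numpy(x) for x in group)


# Stand-in classes that mimic only the methods used above.
class FakeTorchTensor:
    def __init__(self, data):
        self.data = list(data)

    def clone(self):
        return FakeTorchTensor(self.data)

    def numpy(self):
        return self.data


class FakeEagerTensor:
    def __init__(self, data):
        self.data = list(data)

    def numpy(self):
        return self.data


converted = convert_group([FakeTorchTensor([1, 2]), FakeEagerTensor([3]), [4]])
print(converted)
```

This only papers over the symptom. If the loader is expected to yield PyTorch tensors at this point, the more useful check is where the TensorFlow tensors are coming from.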

I assume these errors are unique to Boosted Tau development at the moment? Has anyone else seen them? Is there perhaps a change I have made somewhere that creates an obvious conflict and causes these issues?

Even after rolling back to still earlier Dataloader code from early February, we run into NaNs in the tensor:

Nan detected! element= (250, 11, 11, 63)
[[156 5 4 32]]
...
Nan detected! element= (250, 21, 21, 63)
[[151 7 10 32]]
...
Nan detected! element= (250, 21, 21, 63)
[[210 14 11 32]]

etc.

I am going to spend some time debugging this, as it seems specific to our input.
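The three indices shown above all have 32 in the last position, which would be consistent with a single problematic input feature if the last axis is the feature axis. One way to check this is to count NaNs per index along that axis, as in the NumPy sketch below. `report_nans` is a hypothetical helper, the toy array is made up, and the assumption that the last axis holds the features is not confirmed by the log.

```python
import numpy as np


def report_nans(name, tensor, max_rows=5):
    """Print a short NaN summary and return the indices of all NaN entries."""
    arr = np.asarray(tensor, dtype=float)
    bad = np.argwhere(np.isnan(arr))  # one row of indices per NaN element
    if bad.size:
        # Assumption: the last axis is the feature axis.
        features, counts = np.unique(bad[:, -1], return_counts=True)
        print(f"{name}: {len(bad)} NaN(s) in shape {arr.shape}")
        print("first indices:")
        print(bad[:max_rows])
        print("NaNs per last-axis index:",
              dict(zip(features.tolist(), counts.tolist())))
    return bad


# Toy input with a single NaN planted at a known position.
grid = np.zeros((4, 3, 3, 8))
grid[2, 1, 0, 5] = np.nan
bad = report_nans("toy_grid", grid)
```

If the counts concentrate on one index, the next step would be to look at how that one variable is filled or scaled for the boosted tau input.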

The boosted tau production changes are also present in this code. Those will be removed when I go through the comments on the production branch.

aloeliger and others added 30 commits May 14, 2021 05:40
@aloeliger aloeliger changed the title from "Current status of Boosted Tau (Not to be merged!)" to "Current status of Boosted Tau" Mar 19, 2022
@aloeliger aloeliger marked this pull request as draft March 19, 2022 22:30
…duction code.

Removes the redundant ED filter infrastructure, and changes the configuration to make useBoostedTauFilter an option given to TauTupleProducer. This option should now remove all non boosted tau type tau jets
Ntuples otherwise seem to get empty MVA evaluations
@kandrosov
Collaborator

To integrate boostedTaus we need to move all code to python-based compilation (as already done for DataLoader or TauTuple code generation). The .h/.cpp files would then use a meta-language: e.g. TAU, which would be replaced with tau or boostedTau depending on config parameters.
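A minimal sketch of what such a substitution step could look like on the Python side, before the source is handed to the compiler. The exact placeholder convention is an assumption here: the token `TAU` is treated as a placeholder when it stands alone or is the prefix of an identifier such as `TAU_pt`. The `specialize` function and the template line are hypothetical.

```python
import re

# Hypothetical placeholder convention: the token TAU, either standalone or as
# the prefix of an identifier such as TAU_pt. Longer words that merely start
# with TAU are left alone.
_PLACEHOLDER = re.compile(r"\bTAU(?=_|\b)")


def specialize(source, object_name):
    """Replace the TAU placeholder with a concrete object name."""
    if object_name not in ("tau", "boostedTau"):
        raise ValueError(f"unexpected object name: {object_name!r}")
    return _PLACEHOLDER.sub(object_name, source)


# Made-up template line, not taken from the repository.
template = "if (gen_match && tau.TAU_pt > min_pt) fill(tau.TAU_eta); // TAUTOLOGY stays"

hps_source = specialize(template, "tau")
boosted_source = specialize(template, "boostedTau")
print(hps_source)
print(boosted_source)
```

Keeping the substitution in one function means the choice between tau and boostedTau is made once from the config, and the same template serves both the HPS and boosted tau builds.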
