Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add various utility functions for LCIO objects #150

Merged
merged 11 commits into from
Oct 19, 2022

Conversation

dudarboh
Copy link
Member

BEGINRELEASENOTES

  • Added a utility function to calculate Track momentum based on its track parameters and magnetic field
  • Added methods to the LCRelationNavigator that extract the highest weight with an option to indicate weight encoding type ("track"/"cluster").
  • Added a utility function to get an index of a provided object inside a given LCCollection
  • Added a utility function to return a leading track from ReconstructedParticle in case it has multiple tracks attached.
    ENDRELEASENOTES

TL;DR:
These are the utility functions i often use in my code and I feel like they might be useful for others.
Here is more detailed overview on them:

a. Track momentum: This is very often implemented by hand in many places.

  1. https://github.com/iLCSoft/MarlinReco/blob/7c3b91368d28b80a6b414f6dd94e2d4166d9d1da/Analysis/TauFinder/src/PrepareRECParticles.cc#L210-L216
  2. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/Analysis/VertexChargeRecovery/src/TrackOperator.cc#L17-L28
  3. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/Analysis/TrackToRecoParticle/src/TrackToRecoParticleConverter.cc#L70-L81
  4. https://github.com/iLCSoft/MarlinReco/blob/48c3c1cc1074702ba4106911bec4756f121d2ce6/Tracking/KinkFinder/src/KinkFinder.cc#L382-L392
  5. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/PFOID/src/CreatePDFs.cc#L403-L413
  6. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/Analysis/CLICPfoSelector/src/CLICPfoSelector.cc#L223-L234
  7. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/PFOID/src/PFOID.cc#L520-L530
  8. https://github.com/iLCSoft/MarlinReco/blob/97966564af294b7a4306a4233b9161d0c3bf0dc3/TimeOfFlight/src/TOFUtils.cc#L32-L42
  9. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/Tracking/V0Finder/src/V0Finder.cc#L154-L188
  10. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/Analysis/FourMomentumCovMat/src/algebraImplementation.cc#L65-L75
  11. https://github.com/iLCSoft/MarlinReco/blob/39201c6fbfc257a30e4bb86c7e2eacca908a6729/Analysis/VertexChargeRecovery/src/TrackOperator.cc#L17-L28
  12. https://github.com/lcfiplus/LCFIPlus/blob/278d7bb54e2602c2c0cd170c00f72fe302794bcf/src/LCIOStorer.cc#L475-L487

In some cases it is done through the HelixClass::Initialize_Canonical, which, in principle has the same implementation, however it forces you to build a Helix object, even if you just need a momentum and you need to copy 5 track parameters, which most of the time makes 5 lines instead of one. Also Helix class needs a capital cleanup, due to some issues, much makes it not a good place to extract momentum in a long term run. In my opinion it fits the best together with LCIO track.

b. LCRelationNavigator getMaxWeightObject:

This potentially helpful to avoid bugs, where people assume the objects are sorted from highest to lowest weight in the first place, e.g.: lcfiplus/LCFIPlus#64.

Also, having an option to sort assuming track/cluster weight encoding (which is used in "RecoMCTruthLink" relations) might raise awareness of people, why weights are not normalized and prevent unwanted behaviour, if one wasn't aware of documentation here.

c. Getting index of an element inside a collection.

It is helpful if one needs to access an object from a similar collection which doesn't have relation links:
E.g.
MCCollection MarlinTrkTracks FixedTracksCollection OldTracksCollection
MCParticle <-> Track (0x7f0b08) # 13 <-> # 13 <-> # 13

If there is no LCRelations written for new collections, then we need a track index (13) to compare objects one-by-one

d. Getting a leading track from the PFO.
It might be a more consistent thing to do for PFOs with more than a single track, than just taking the first element in the array.

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall. I have a few comments / questions inline.

src/cpp/include/UTIL/ReconstructedParticleTools.h Outdated Show resolved Hide resolved
src/cpp/include/UTIL/ReconstructedParticleTools.h Outdated Show resolved Hide resolved
src/cpp/include/UTIL/TrackTools.h Outdated Show resolved Hide resolved
Comment on lines 76 to 81
std::function<float(float)> getWeight;
if (weightType == "track") getWeight = [](float encoding){ return (int(encoding)%10000)/1000.; };
else if (weightType == "cluster") getWeight = [](float encoding){ return (int(encoding)/10000)/1000.; };
else {
getWeight = [](float encoding){ return encoding; };
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could potentially be solved without having to use an std::function with an immediately invoked function expression (IIFE), something like

const auto getWeight = [&weightType](){
  if (weightType == "track") {
    return [](float encoding) { return (int(encoding)%10000)/1000.; };
  }
  // ...
}();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work.

I think each lambda inside if returns its unique type, so in the end it returns an error, not knowing which type to assign to the final lambda:

error: inconsistent types ‘UTIL::LCRelationNavigator::getRelatedToMaxWeightObject(EVENT::LCObject*, const string&) const::<lambda()>::<lambda(float)>’
and                       ‘UTIL::LCRelationNavigator::getRelatedToMaxWeightObject(EVENT::LCObject*, const string&) const::<lambda()>::<lambda(float)>’ deduced for lambda return type
               return [](float encoding){ return (int(encoding)/10000)/1000.; };

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, forgot about the different types of the lambdas... In that case I think we go with the std::function (and it's small overhead), but in that case could we make the default usage more prominent, i.e.:

Suggested change
std::function<float(float)> getWeight;
if (weightType == "track") getWeight = [](float encoding){ return (int(encoding)%10000)/1000.; };
else if (weightType == "cluster") getWeight = [](float encoding){ return (int(encoding)/10000)/1000.; };
else {
getWeight = [](float encoding){ return encoding; };
}
std::function<float(float)> getWeight = [](float encoding) { return encoding };
if (weightType == "track") getWeight = [](float encoding){ return (int(encoding)%10000)/1000.; };
else if (weightType == "cluster") getWeight = [](float encoding){ return (int(encoding)/10000)/1000.; };

As a (more or less) purely academic aside (and for your information), The IIFE could be made to work with the following little trick:

const auto Lambda = [](int a, int b) { return a < b; }; // Lambda will be some compiler named type
const auto FuncPtr = +[](int a, int b) { return a < b; } // The lambda will be implicitly converted to 
                                                         // a function pointer and FuncPtr will be a 
                                                         // bool (* const)(int, int)

By converting the lambda into a function pointer via the unary operator+ we would be able to use the IIFE again. However, given the context we are doing this in, I think it is better to go with the more legible std::function approach.

Copy link
Member Author

@dudarboh dudarboh Sep 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think then I leave the std::function out and simply do e.g.:

Suggested change
std::function<float(float)> getWeight;
if (weightType == "track") getWeight = [](float encoding){ return (int(encoding)%10000)/1000.; };
else if (weightType == "cluster") getWeight = [](float encoding){ return (int(encoding)/10000)/1000.; };
else {
getWeight = [](float encoding){ return encoding; };
}
float maxWeight = *std::max_element(weights.begin(), weights.end(), [](float a, float b){return a < b ;};
if (weightType == "track") maxWeight = *std::max_element(weights.begin(), weights.end(), [](float a, float b){return (int(a)%10000)/1000. < (int(b)%10000)/1000.;});
else if (weightType == "cluster") maxWeight = *std::max_element(weights.begin(), weights.end(), [](float a, float b){return (int(a)/10000)/1000. < (int(b)/10000)/1000. ;});
return maxWeight;

what do you think?

src/cpp/src/UTIL/LCRelationNavigator.cc Outdated Show resolved Hide resolved
Comment on lines 69 to 89
const EVENT::FloatVec & getRelatedFromWeights(EVENT::LCObject * to) const ;

/** Object with a highest weight that the given from-object is related to.
* LCObject is of type getToType().
*/
const EVENT::LCObject* getRelatedToMaxWeightObject(EVENT::LCObject* from, std::string weightType) const ;

/** From-object related to the given object with a highest weight (the inverse relationship).
* LCObject is of type getFromType().
*/
const EVENT::LCObject* getRelatedFromMaxWeightObject(EVENT::LCObject* to, std::string weightType) const ;

/** The highest weight of the relations returned by a call to getRelatedToObjects(from).
* @see getRelatedToObjects
*/
float getRelatedToMaxWeight(EVENT::LCObject* from, std::string weightType) const ;

/** The highest weight of the relations returned by a call to getRelatedFromObjects(to).
* @see getRelatedFromObjects
*/
float getRelatedFromMaxWeight(EVENT::LCObject* to, std::string weightType) const ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the weights and the objects commonly used together? I could imagine there is a use case for having only the object, but is there also a use case for only the weight?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say weights and object must always be used together.

You always want to extract an object with a certain weight.
You might want the weight value as well to require certain cut, e.g.: accept only MCParticles which contributed >90% of the energy to this particular PFO.

I would much prefer here the use of std::pair, to return both object and weight in a single method, but I decided to be consistent with already existing code style, which has separate function to return weights and objects.

Copy link
Contributor

@tmadlener tmadlener Sep 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I would argue that having either a std::pair return value or even a

struct RelObject {
  Event::LCObject* object;
  float weight;
};

(which should be usable like)

const auto [obj, w] = getRelatedFromMaxWeigth(to, "");

should be the main implementation and if you want to keep the compatibility with the existing code base, you add simple wrappers around this to return only part of the result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean,
maxWeight or maxWeightObject can be used separately, so I don't think it makes sense to change them to return a std::pair. If you know that this object already has a max weight (because thats what method returns), you don't need a weight value already.
In the last message, when i said object must always be used together exactly for the purpose to find the maxWeight object or leastWeight object, etc. If you take a simply object, ignoring its weight, it doesn't really make sense.
But this would require complete overhaul how do we store them. I wont go that deep here

src/cpp/include/UTIL/TrackTools.h Outdated Show resolved Hide resolved
@dudarboh
Copy link
Member Author

dudarboh commented Sep 2, 2022

@tmadlener

I have moved all implementations to the .cc files, so compiler sees them. And I have added CLHEP dependency to the LCIO.

I squashed some of your comments fixes into existing commits, so it doesn't look like a mess, please take a look.

src/cpp/include/UTIL/LCCollectionTools.h Outdated Show resolved Hide resolved
src/cpp/src/UTIL/LCRelationNavigator.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@gaede gaede left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dudarboh, we do not want to introduce a mandatory dependence on CLHEP.
As far as I have seen, you only need this for three vectors - this should be easy to solve with either a simple ThreeVector class in UTIL or using arrays...

@dudarboh
Copy link
Member Author

dudarboh commented Sep 8, 2022

Hi @gaede,

Dont we have CLHEP dependencies already here?

#include "CLHEP/Vector/LorentzVector.h"
// CLHEP 1.9 and higher introduce a namespace:
namespace CLHEP{}
using namespace CLHEP ;

not sure if it works, because currently we don't have CLHEP dependency inside CMakeLists.txt, however it compiles, because the header file isn't used anywhere.

We also have a test for this LCFourVector, and CMakeLists.txt for test has a needed CLHEP dependency:

LCIO/tests/CMakeLists.txt

Lines 296 to 302 in a7b74eb

FIND_PACKAGE( CLHEP QUIET )
IF( CLHEP_FOUND )
INCLUDE_DIRECTORIES( ${CLHEP_INCLUDE_DIRS} )
ADD_LCIO_TEST( test_fourvector)
ELSE()
MESSAGE( STATUS "cannot build test_fourvector: clhep not found" )
ENDIF( CLHEP_FOUND )

If we strongly don't want to introduce CLHEP dependency, then should we also remove LCFourVector with its test for the consistency?

EDIT: I missed the word "mandatory", was that the intended design that one could compile LCIO w/o CLHEP dependency, but then just couldn't use LCFourVector?

@gaede
Copy link
Contributor

gaede commented Sep 8, 2022

EDIT: I missed the word "mandatory", was that the intended design that one could compile LCIO w/o CLHEP dependency, but then just couldn't use LCFourVector?

Yes, exactly, we want LCIO to compile w/o CLHEP. The LCFourVector is a historical leftover and it can be used in downstream code that actually has CLHEP but does not introduce a mandatory CLHEP dependency...

@dudarboh
Copy link
Member Author

@tmadlener, how do I fix this for macOS?

Comment on lines 18 to 29
std::array<double, 3> getTrackMomentum(const EVENT::TrackState* ts, double bz){
std::array<double, 3> mom;
double omega = ts->getOmega();
double phi = ts->getPhi();
double tanL = ts->getTanLambda();
double c_light = 299.792458; // mm/ns
double pt = (1e-6 * c_light * bz) / std::abs(omega);
mom[0] = pt*std::cos(phi);
mom[1] = pt*std::sin(phi);
mom[2] = pt*tanL;
return mom;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a templated function, right? On a first glance this seems to be the identical code:

something like

template <typename TrackLikeT>
std::array<double, 3> getTrackMomentum(const TrackLikeT* track, double bz) {
  // ... implementation
}

Additionally, I think you could save creating an array explicitly, by doing

  return {pt * std::cos(phi), pt * std::sin(phi), pt * tanL};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation will then have to move to the header

Copy link
Member Author

@dudarboh dudarboh Oct 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I test then, whether it compiles works?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By adding a test ;)

https://github.com/iLCSoft/LCIO/tree/master/src/cpp/src/TESTS

has a test_tracks.cc and also a test_trackstate.cc. They seem to already create a few tracks and track states, so you should be able to check whether your new utilities actually behave as they should. Essentially the MYTEST calls in the second half are what you are looking for, they work like

MYTEST(/*actual*/, /*expected*/, /*name of thing that is compared*/);

as long as there is an implementation of operator== for the thing that you try to compare this should work. You can in principle also add the getLeadingTrack to one of the other tests, if you find a reconstructed particle with multiple tracks attached.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some troubles. I feel like it also needs << operator overload... Which std::array doesn't have.
I can compare absolute values though..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I can't used std::hypot(px, py, pz) (since C++17)?
LCIO is still not C++17?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it also needs << operator overload... Which std::array doesn't have.

Ah you are right, I missed that, sorry. In this case, that would probably need to go to the TESTS/tutils.h header (if you want std::array outputs (with a semi-fancy version below).

template<typename T, size_t N>
std::ostream& operator<<(std::ostream& os, const std::array<T, N>& arr) {
  if constexpr (N == 0) {
    return os << "[]";
  }

  os << "[" << arr[0];
  for (size_t i = 1; i < N; ++i) {
    os << ", " << arr[i];
  }
  return os << "]";
}

And I can't used std::hypot(px, py, pz) (since C++17)?

std::hypot should in principle be available, as all CI workflows actually use CMAKE_CXX_STANDARD=17. Not entirely sure why you are running into problems here tbh. I will have to check.

Additionally, it looks like the values that are filled into the tracks/track states do not really make sense from a physics point of view. They produce NaNs when trying to calculate the momenta from them:

https://github.com/iLCSoft/LCIO/actions/runs/3267033539/jobs/5371642760#step:4:17906

@tmadlener
Copy link
Contributor

@tmadlener, how do I fix this for macOS?

You don't ;) I still have to remove the macOS workflow here, since github has removed runners with an appropriate versions. See, e.g. AIDASoft/run-lcg-view#3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants