design_document
Standard, high-precision reconstruction takes O(10 s) per event to run, which restricts it to the final steps of the reconstruction chain, after simpler, less performant methods have been used to bring down the event rate. This is because standard IceCube reconstruction is optimisation-based: each event is iteratively tested against several particle hypotheses using a detailed likelihood-based approach. Alternatively, one could use machine learning, which separates optimisation (training) from inference. An ML model can then be trained in advance, possibly using GPUs to speed up the process, so that reconstructing an event requires only a single forward (inference) pass. This has the potential to bring down reconstruction time by several orders of magnitude compared to the current state of the art (RETRO, etc.).
The challenge with IceCube is its complexity, heterogeneity, and high dimensionality. This affects the choice of which ML paradigm to employ. Standard tabular methods (e.g. BDTs) require collapsing the event information into a tabular format, e.g. by using manually engineered features. However, previous work (Baldi, Sadowski, and Whiteson, 1402.4735) has shown that deep learning models in particular provide the largest marginal improvement when applied to data at the lowest level rather than to engineered features only. In IceCube, this would imply using the DOM-level data (DOM coordinates, pulse charge, timing, etc.) directly in neural networks.
One option would be to use convolutional neural networks (CNNs), but these require the input data to be Euclidean, i.e. arranged on a regular grid. Even the IceCube-86 detector does not satisfy this requirement, and with each additional detector component (DeepCore, Upgrade, etc.) it becomes less and less straightforward to format raw IceCube data in a way that is suitable for CNN processing. Another option is to use graph neural networks (GNNs). These models can accommodate any spatial geometry of the input data through the notion of adjacency. In the language of geometric deep learning (Bronstein et al., 2104.13478), CNNs can be considered special cases of GNNs. Within this paradigm, DOM data is represented as nodes on a graph corresponding to the IceCube detector (naturally also accommodating any detector extensions), and information is passed through the network along the edges connecting these nodes, which corresponds to a generalisation of the convolutions in CNNs. This makes GNNs a well-suited paradigm for naturally accommodating low-level IceCube data.
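To make the notions of nodes, adjacency, and message passing concrete, the following is a minimal sketch in plain Python. It is not the gnn-reco implementation: the function names `knn_edges` and `mean_aggregate`, the choice of a k-nearest-neighbour graph, the mean aggregation, and the toy pulse values are all assumptions made purely for illustration.

```python
import math


def knn_edges(positions, k):
    """Return directed edges (i, j) linking each node i to its k nearest neighbours j."""
    edges = []
    for i, p in enumerate(positions):
        # Euclidean distance from node i to every other node.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(positions) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def mean_aggregate(features, edges):
    """One message-passing step: each node averages its own features with its neighbours'."""
    dim = len(features[0])
    sums = [list(f) for f in features]  # start from each node's own features
    counts = [1] * len(features)
    for i, j in edges:
        for d in range(dim):
            sums[i][d] += features[j][d]
        counts[i] += 1
    return [[s / c for s in row] for row, c in zip(sums, counts)]


# Hypothetical pulses: node positions (x, y, z) in metres and
# node features (charge, time). Values are invented for illustration.
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 17.0), (0.0, 0.0, 34.0), (125.0, 0.0, 0.0)]
features = [[1.0, 10.0], [2.0, 12.0], [0.5, 15.0], [3.0, 40.0]]

edges = knn_edges(positions, k=2)
updated = mean_aggregate(features, edges)
```

Nothing in this sketch assumes a regular grid: adding nodes at arbitrary positions (as a detector extension would) only changes the edge list, which is the property that makes the graph representation attractive here. A real GNN layer would replace the plain average with learned transformations of the node features.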
To fully exploit the potential of GNNs in IceCube, leverage collaboration across institutes and analysis groups, and break down unproductive silos, it is proposed to coordinate and align all GNN development efforts in IceCube through the gnn-reco project. This project would constitute an internal "center of excellence" within this branch of machine learning, where all IceCube members can contribute new models, applications/use cases, etc. and get support for e.g. using GNNs in their analysis.
The collaboration itself will take place through weekly developer meetings, discussion in a dedicated Slack chat, etc. (TBC). The work will be coordinated and guided by the repository maintainers.
The technical solution (i.e. the project code base) will be the foundation on which GNN applications are built in IceCube, and the quality and appropriateness of the base code structure are therefore crucial for the success of the project. To this end, the project should aim to be sufficiently general and extensible, facilitate collaboration, and generally follow best practices for ML development.
Generally, the project should provide effective means for:
- Ingesting I3 data and converting them to a format that is suitable for ML.
- Building GNN models specific to each detector and application with minimal need for duplicate work.
- Training and optimising GNN models on well-defined tasks using standard datasets.
- Benchmarking GNN models against each other as well as against other reconstruction/classification/etc. methodologies to assess the added value of this approach for each potential application.
- Validating GNN models, including considerations regarding systematic uncertainties, calibration, etc.
- Deploying GNN models in a way that allows for easy use in official reconstruction chains and analyses.
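As an illustration of the benchmarking requirement in the list above, the sketch below scores several methods on the same events with the same metric. Everything in it is hypothetical: the `benchmark` helper, the median-absolute-error metric, the two toy "methods", and the toy events and target values are assumptions for illustration, not part of gnn-reco or of any IceCube reconstruction.

```python
from statistics import median


def benchmark(methods, events, truth):
    """Return the median absolute error of each method, evaluated on the same events."""
    results = {}
    for name, predict in methods.items():
        errors = [abs(predict(event) - target) for event, target in zip(events, truth)]
        results[name] = median(errors)
    return results


# Hypothetical toy data: each "event" is a list of pulse charges and the
# target is an invented per-event quantity to be estimated.
events = [[1.0, 2.0], [0.5, 0.5, 1.0], [4.0], [2.0, 2.0, 2.0]]
truth = [3.0, 2.0, 4.0, 6.0]

# Two stand-in estimators; in practice these would wrap a trained GNN
# and an existing reconstruction method behind the same call signature.
methods = {
    "total_charge": lambda event: sum(event),
    "max_charge": lambda event: max(event),
}

scores = benchmark(methods, events, truth)
```

The design point is that every method sits behind the same interface and is evaluated on a shared dataset, so that differences in the scores reflect the methods rather than the evaluation setup.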