This is a brief introduction to the purpose and usage of the different packages in iLCSoft. To analyse a physics process, a typical workflow looks like this:
- Generate Monte Carlo particles for the physics process with a generator, e.g. Whizard+Pythia. (If you generate very large stdhep files, e.g. x00 MB/file, it is a good idea to split each stdhep file into many small files (xx MB/file), because the simulation stores a large amount of information for each event and the final file may otherwise become too large. If you split the stdhep file, you also have to merge the resulting slcio files into one or several bigger files once the simulation has finished.)
- Simulate the particle samples with Mokka/DD4hep, which produces "slcio" files of GB/file size. The slcio format is the standard file format of iLCSoft.
- Reconstruct the particle information with Marlin processors, which produces two kinds of files: "DST" files and "REC" files. The "REC" file contains all information from generation, simulation and reconstruction. The "DST" file contains only a subset of it. Usually the "DST" file is sufficient for analysis, but you can check the "REC" file for more details.
- Write a steering file for the Marlin framework, using processors supplied by iLCSoft or your own processors, to carry out your analysis.
- In most cases, you need to write your own processor.
Following these steps, this introduction is organised as follows:
1. How to initialize the iLCSoft environment.
2. How to simulate events.
3. The naming rules for event samples in the ILC group.
4. Basic tools to check the events.
5. How to run Marlin for analysis.
6. Marlin processors in iLCSoft: a short introduction to some processors.
7. How to create a new Marlin processor.
8. The slcio file structure and the LCIO API: what one should know before writing a new processor.
9. How to program a new Marlin processor.
This introduction only explains the basic usage of iLCSoft; the manual of each package can be found in its own README file. iLCSoft contains many subpackages, and each package has an example folder where you can find detailed examples.
Initialize the iLCSoft environment with a command like this:
source /The path of your iLCSoft/v01-19-04/init_ilcsoft.sh
Then you can use all iLCSoft commands. For example, this command shows all packages in the current iLCSoft release:
find $ILCSOFT -maxdepth 2 -mindepth 2 -type d
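The same listing can be produced from a script, which is convenient when you want the installed versions grouped by package. The following is a minimal Python sketch, assuming the usual layout of one directory per package with one subdirectory per version below $ILCSOFT. The function name `list_packages` is hypothetical and not part of iLCSoft.

```python
from pathlib import Path


def list_packages(ilcsoft_root):
    """Map each package directory under ilcsoft_root to its version subdirectories."""
    packages = {}
    for package_dir in sorted(Path(ilcsoft_root).iterdir()):
        if not package_dir.is_dir():
            continue  # skip plain files such as init_ilcsoft.sh
        # Assumption: every subdirectory of a package directory is one installed version.
        versions = sorted(v.name for v in package_dir.iterdir() if v.is_dir())
        if versions:
            packages[package_dir.name] = versions
    return packages
```

After sourcing init_ilcsoft.sh, calling `list_packages(os.environ["ILCSOFT"])` should return the same directories as the find command above, grouped by package. Unlike `find -type d`, this sketch also follows symbolic links to directories.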
Suppose that you already know how to use Whizard and have generated a stdhep file. You can then simulate the events with "Mokka/DD4hep".
Run a simulation from a stdhep generator file:
ddsim --inputFiles ./bbudsc_3evt.stdhep --outputFile=./bbudsc_3evt.slcio \
--compactFile $lcgeo_DIR/ILD/compact/ILD_l4_v02/ILD_l4_v02.xml \
--steeringFile=./ddsim_steer.py > ddsim.out 2>&1 &
Here, we use two configuration files:
- ddsim_steer.py, which steers the simulation
- ILD_l4_v02.xml, the detector geometry model
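When many stdhep files have to be simulated, it can help to assemble the ddsim call in a script. The following is a minimal Python sketch that uses only the four options shown in the command above. The helper names `build_ddsim_command` and `run_ddsim` are hypothetical, and `run_ddsim` waits for the job to finish instead of running it in the background.

```python
import os
import subprocess


def build_ddsim_command(input_file, output_file, compact_file, steering_file):
    """Return the ddsim argument list for one simulation job."""
    return [
        "ddsim",
        "--inputFiles", input_file,
        "--outputFile", output_file,
        # Expand variables such as $lcgeo_DIR, because no shell is involved here.
        "--compactFile", os.path.expandvars(compact_file),
        "--steeringFile", steering_file,
    ]


def run_ddsim(command, log_path="ddsim.out"):
    """Run ddsim and write stdout and stderr to log_path (like '> ddsim.out 2>&1')."""
    with open(log_path, "w") as log:
        return subprocess.run(command, stdout=log, stderr=subprocess.STDOUT, check=True)


command = build_ddsim_command(
    input_file="./bbudsc_3evt.stdhep",
    output_file="./bbudsc_3evt.slcio",
    compact_file="$lcgeo_DIR/ILD/compact/ILD_l4_v02/ILD_l4_v02.xml",
    steering_file="./ddsim_steer.py",
)
print(" ".join(command))
# run_ddsim(command)  # uncomment in an initialized iLCSoft environment
```

Keeping the command as a list avoids quoting problems with file names and makes it easy to loop over several input files.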
- ILD group ---- ILD sample examples.
- xxx .
- xxx .
After generating event samples, the first thing one usually wants to do is check the events with some basic, straightforward tools. iLCSoft provides many tools for this, each serving a different purpose.
- If you only have a .stdhep file, which many generators can produce easily, you can convert it to a .slcio file with
  - stdhepjob xxx.stdhep xxx.slcio 1
- The file produced by the simulation is always a "slcio" file. You can check it with
  - anajob xxx.slcio --- for basic information (total number of events, the number and names of the collections; the meaning of the collections is explained in the section on the slcio file structure)
  - dumpevent xxx.slcio n | less --- for the details of the n-th event (the detailed content of each collection)
- To view the events in the detector, use the event display. The GearOutput.xml file, which contains the ILD detector information, can be found in the "ILDConfig" folder.
  - ced2go -d GearOutput.xml -v DSTViewer xxx.slcio --- for a DST file
  - ced2go -d GearOutput.xml xxx.slcio --- for a REC file
Once you have the events, you can analyse them in the Marlin framework. To run Marlin you first need a steering file. You can create a template steering file with the command
Marlin -x > mysteer.xml
The generated steering file "mysteer.xml" contains many processors supplied by iLCSoft. A more practical example can be found at https://github.com/iLCSoft/ILDConfig/blob/master/StandardConfig/lcgeo_current/bbudsc_3evt_stdreco_dd4hep.xml
Since these files are rather complicated, we use a simpler one for explanation. The following is a typical steering file; its file name ends with .xml. The steering file is written in XML, a markup language; an overview of what XML is used for is given at http://www.xmlfiles.com/xml/xml_usedfor.asp. This file uses only one processor, IsolatedLeptonTaggingProcessor, and some parameters of this processor keep their default values.
<marlin>
<execute>
<processor name="MyIsolatedLeptonTaggingProcessor"/>
</execute>
<global>
<parameter name="LCIOInputFiles">
<!-- put some input slcio files here-->
</parameter>
<parameter name="GearXMLFile"> GearOutput.xml </parameter>
<!-- tells Marlin which detector geometry to use; you can find this file in the ILDConfig folder and put it in the same directory as this steering file -->
<parameter name="MaxRecordNumber" value="0" />
<!-- tells Marlin to process only the first n events; if n == 0, all events are processed -->
<parameter name="SkipNEvents" value="0" />
<!-- tells Marlin to skip the first n events; if n == 0, processing starts from the first event -->
<parameter name="SupressCheck" value="false" />
<parameter name="Verbosity" options="DEBUG0-4,MESSAGE0-4,WARNING0-4,ERROR0-4,SILENT">WARNING</parameter>
<!-- sets which level of log messages is shown on the screen -->
<parameter name="AllowToModifyEvent" value="true" />
<!-- allows processors to modify the events -->
</global>
<processor name="MyIsolatedLeptonTaggingProcessor" type="IsolatedLeptonTaggingProcessor">
<!-- some parameters for IsolatedLeptonTaggingProcessor -->
<parameter name="CosConeLarge" type="float">0.95 </parameter>
<parameter name="CosConeSmall" type="float">0.98 </parameter>
<parameter name="CutOnTheISOElectronMVA" type="float"> 0.5 </parameter>
<parameter name="CutOnTheISOMuonMVA" type="float">0.7 </parameter>
<!-- the weight file for this processor-->
<parameter name="DirOfISOElectronWeights" type="string"> /afs/desy.de/project/ilcsoft/sw/x86_64_gcc48_sl6/v01-17-09/MarlinReco/v01-14/Analysis/IsolatedLeptonTagging/example/isolated_electron_weights </parameter>
<parameter name="DirOfISOMuonWeights" type="string"> /afs/desy.de/project/ilcsoft/sw/x86_64_gcc48_sl6/v01-17-09/MarlinReco/v01-14/Analysis/IsolatedLeptonTagging/example/isolated_muon_weights </parameter>
</processor>
</marlin>
The steering file begins and ends with <marlin> *** </marlin>. Inside this block there are always three sections:
<execute>
<!-- list all the processors that you want to use; they are executed one by one, in this order -->
<!-- the name is chosen by you to describe the processor, e.g. NameA -->
<processor name="NameA"/>
</execute>
<global>
<!-- some global parameters, for example the input file names -->
<parameter name="LCIOInputFiles">
<!-- input file names and directories -->
</parameter>
<!-- the detector structure file -->
<parameter name="GearXMLFile"> GearOutput.xml </parameter>
<!-- ... -->
</global>
<!-- name: must be the same as the name used in the "execute" block; type: the real processor name -->
<processor name="NameA" type="IsolatedLeptonTaggingProcessor">
<!-- this processor finds isolated leptons in the event samples -->
<!-- you need to supply the necessary parameters for this processor, e.g. -->
<parameter name="CosConeLarge" type="float">0.95 </parameter>
<!-- a processor may have many parameters; in general, parameters that you do not set take their default values -->
</processor>
This example produces two output collections, "Isoleps" and "PFOsWithoutIsoleps", which can be used for further analysis in other processors. Some other processors, e.g. LCTuple (which converts the information in a slcio file into a ROOT file), produce a ROOT file as output.
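The three-section layout described above can also be generated from a script, which makes it harder to forget the matching name between the execute block and the processor block. The following is a minimal Python sketch using only the standard library. The function `build_steering` and the input file name are hypothetical; the parameter names are taken from the example above.

```python
import xml.etree.ElementTree as ET


def build_steering(processors, input_files, gear_file="GearOutput.xml"):
    """Build a minimal Marlin steering tree.

    processors: list of (name, type, {parameter name: (type, value)}) tuples.
    """
    marlin = ET.Element("marlin")

    # <execute>: the processors in the order in which they should run
    execute = ET.SubElement(marlin, "execute")
    for name, _type, _params in processors:
        ET.SubElement(execute, "processor", name=name)

    # <global>: parameters shared by the whole job
    glob = ET.SubElement(marlin, "global")
    ET.SubElement(glob, "parameter", name="LCIOInputFiles").text = " ".join(input_files)
    ET.SubElement(glob, "parameter", name="GearXMLFile").text = gear_file

    # one <processor> block per entry, with the same name as in <execute>
    for name, proc_type, params in processors:
        proc = ET.SubElement(marlin, "processor", name=name, type=proc_type)
        for par_name, (par_type, value) in params.items():
            par = ET.SubElement(proc, "parameter", name=par_name, type=par_type)
            par.text = str(value)
    return marlin


steering = build_steering(
    processors=[("MyIsolatedLeptonTaggingProcessor", "IsolatedLeptonTaggingProcessor",
                 {"CosConeLarge": ("float", 0.95), "CosConeSmall": ("float", 0.98)})],
    input_files=["input.slcio"],  # hypothetical file name
)
if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9 on
    ET.indent(steering)
print(ET.tostring(steering, encoding="unicode"))
```

The sketch sets only two global parameters; the others shown in the full example (MaxRecordNumber, SkipNEvents, Verbosity and so on) would be added in the same way.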
You can check which Marlin processor libraries are set to be loaded with the shell command
echo $MARLIN_DLL
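The value of MARLIN_DLL is a single long line of library paths separated by colons, which is hard to read. The following is a small Python sketch that prints one library per line; the function name `marlin_libraries` is hypothetical.

```python
import os


def marlin_libraries(marlin_dll=None):
    """Split a MARLIN_DLL value into its individual library paths."""
    if marlin_dll is None:
        marlin_dll = os.environ.get("MARLIN_DLL", "")
    # Entries are separated by ':'; empty entries (e.g. from a trailing ':') are dropped.
    return [entry for entry in marlin_dll.split(":") if entry]


for library in marlin_libraries():
    print(os.path.basename(library))
```

If the library of your own processor does not appear in this list, Marlin will not find the processor type named in your steering file.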
Processors related to ILD analysis can be found at https://github.com/ILDAnaSoft/ILDDoc
The easiest way to create a new Marlin processor is to copy an example processor instead of writing everything from scratch. You can do this with
./copy_new_processor.sh new_processor_name
The script performs the following steps:
- Copy an example processor supplied by iLCSoft (its .cc and .h files) into a new folder for the new processor.
- Change the processor class name to the new name. NOTE: this is important, otherwise it may conflict with existing processors.
- Put ./action.sh into the bin folder, change the PROJECTNAME in action.sh, and run ./bin/action.sh
- Running action.sh creates six folders and puts all files into their respective folders:
| folder | meaning |
|:----------:|:-----------------------:|
| build | compilation (build) files |
| src | source files |
| include | header files |
| xml | steering files |
| lib | your processor library |
| bin | executable files |
- Go to the xml folder, where a default steering file has been created. Edit this steering file to use the processors you want, then run it with
Marlin mysteer.xml
- The next time you change something in this processor and need to recompile it, just run
./bin/action.sh
The copy_new_processor.sh script can also be used to copy any other existing processor to a new one with
./copy_new_processor.sh old_processor_directory new_processor_name
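The copy and rename steps listed above can be illustrated with a short Python sketch. This is not the content of copy_new_processor.sh, which is not shown here, but an illustration of the idea under the assumption that the processor consists of one .h and one .cc file named after the class. The function name `copy_processor` is hypothetical.

```python
from pathlib import Path


def copy_processor(example_dir, target_dir, old_name, new_name):
    """Copy old_name.h/.cc from example_dir to target_dir, renamed to new_name."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for suffix in (".h", ".cc"):
        text = (Path(example_dir) / (old_name + suffix)).read_text()
        # Plain substring replacement on purpose: besides the class name it also
        # renames the include guard (MyProcessor_h), the global instance
        # (aMyProcessor) and the #include "MyProcessor.h" line.
        new_file = target / (new_name + suffix)
        new_file.write_text(text.replace(old_name, new_name))
        written.append(new_file)
    return written
```

Substring replacement would also change unrelated identifiers that happen to contain the old name, so it is worth reviewing the copied files before compiling.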
To write a new Marlin processor, you need to know the structure of the slcio file. You can check it with anajob for general information or with dumpevent for details.
Examples of ILD slcio file contents can be found at https://github.com/ILDAnaSoft/ILDDoc/blob/master/dst/ild_dst_collections.md
How to access these collections and their values is described at http://lcio.desy.de/v02-09/doc/doxygen_api/html/namespaces.html, which documents the complete C++ API of LCIO. LCIO makes heavy use of the C++ STL. If you are not familiar with the STL, you can find an explanation at http://www.cplusplus.com/reference/stl/
We recommend that you start your first processor from the Marlin examples. If you are an experienced programmer, you can find the general iLCSoft documentation at http://ilcsoft.desy.de/portal/general_documentation/index_eng.html.
Each Marlin processor consists of at least two files: MyProcessor.h and MyProcessor.cc. The structure of MyProcessor.h is
#ifndef MyProcessor_h
#define MyProcessor_h 1
#include "marlin/Processor.h"
#include "lcio.h"
#include <string>
using namespace lcio ;
using namespace marlin ;
class MyProcessor : public Processor {
public:
virtual Processor* newProcessor() { return new MyProcessor ; }
MyProcessor() ;
virtual void init() ;
virtual void processRunHeader( LCRunHeader* run ) ;
virtual void processEvent( LCEvent * evt ) ;
virtual void check( LCEvent * evt ) ;
virtual void end() ;
// declare your new methods here
protected:
std::string _colName ;
// counters for the runs and events processed so far
int _nRun ;
int _nEvt ;
// declare your new variables here
} ;
#endif
It declares six basic methods: the constructor, init(), processRunHeader(), processEvent(), check() and end(), which are implemented in MyProcessor.cc. The basic structure of the source file is:
#include "MyProcessor.h"
#include <iostream>
#include <EVENT/LCCollection.h>
#include <EVENT/MCParticle.h>
#include "marlin/VerbosityLevels.h"
using namespace lcio ;
using namespace marlin ;
MyProcessor aMyProcessor ;
MyProcessor::MyProcessor() : Processor("MyProcessor") {
// receive input parameters with the registerInputCollection method
// register steering parameters: collection type, name, description, class variable, default value
// the values are provided by the steering file.
// the collection name must exist in the input slcio file; you can check that with anajob *.slcio.
registerInputCollection( LCIO::MCPARTICLE,
"CollectionName" ,
"Name of the MCParticle collection" ,
_colName ,
std::string("MCParticle")
);
}
void MyProcessor::init() {
// counts the run headers processed so far (the input may consist of several files or runs)
_nRun = 0 ;
// counts the events processed so far
_nEvt = 0 ;
}
void MyProcessor::processRunHeader( LCRunHeader* run ) {
_nRun++ ;
}
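void MyProcessor::processEvent( LCEvent * evt ) {
// this method runs once for every event.
// put your main analysis code here.
// read the collection whose name is set in the steering file.
// getCollection() throws DataNotAvailableException if the collection is missing in this event.
LCCollection* col = NULL ;
try{
col = evt->getCollection( _colName ) ;
}
catch( lcio::DataNotAvailableException& ){
streamlog_out(WARNING) << _colName << " collection not available" << std::endl ;
}
// this will only be entered if the collection is available
if( col != NULL ){
int nMCP = col->getNumberOfElements() ;
// loop over all particles of the collection in this event.
for(int i=0; i< nMCP ; i++){
// use this pointer to access the particle, e.g. p->getEnergy()
MCParticle* p = dynamic_cast<MCParticle*>( col->getElementAt( i ) ) ;
}
}
_nEvt ++ ;
}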
void MyProcessor::check( LCEvent * evt ) {
// could be used to fill checkplots in reconstruction processor
}
void MyProcessor::end(){
// called once at the end of the whole Marlin job; print summary information here.
}
/*
void MyProcessor::some_your_own_method(){
// write your own method.
}
*/
You can also add further functions, in the same file or in additional files.
It is better to declare them as methods of the class MyProcessor and to include "MyProcessor.h";
otherwise they may conflict with functions of other existing processors.
Note also that the processor name must differ from the names of the existing processors,
so you should replace MyProcessor with a more specific name of your own.