This document briefly describes the architecture of correctionlib. It is meant to provide a good starting point for new contributors to find their way around the codebase. It assumes some familiarity with correctionlib as a user.
- `schemav2` module: Pydantic models for correctionlib's data structures
  - `CorrectionSet` is a list of `Correction`s
  - `Correction` represents a single correction. Its `data` attribute is of `Content` type and represents the root node of the computation graph for this correction. Corrections also have a list of inputs as well as one output of type `Variable` (basically a pair of a name and a type, int/float/string)
  - `Content` is the type of a node in the computation graph of a `Correction`. It's a `Union` of the various types of corrections available: `Binning`, `MultiBinning`, `Category`, `Formula`, `FormulaRef`, `Transform`, `HashPRNG`, float
- `highlevel` module: user-facing types (`correctionlib.Correction` resolves to `correctionlib.highlevel.Correction`, etc.)
  - `CorrectionSet` is a list of `Correction`s (same as in `schemav2`, but the focus is on the user API rather than on defining the schema/structure of the corrections)
  - `Correction` and `CompoundCorrection` wrap the corresponding C++ evaluator and expose the `evaluate` method
- `_core` module: a small module that contains the Python facades for the corresponding C++ types, in `__init__.pyi`.
  - types are `CorrectionSet`, `Correction`, `CompoundCorrection` and `Variable`
  - the bindings are declared in `src/python.cc`
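
To make the shape of the `schemav2` models more concrete, here is a minimal sketch using only standard-library dataclasses. It is not the actual Pydantic code: the field names (`input`, `edges`, `content`), the reduced `Content` union (only a toy `Binning` and float), and the example values are simplifying assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Union


@dataclass
class Variable:
    """A named, typed input or output."""
    name: str
    type: str


@dataclass
class Binning:
    """Toy node: holds one child node per bin of a single input."""
    input: str                 # name of the input Variable to bin on
    edges: List[float]         # N+1 bin edges for N bins
    content: List["Content"]   # one child node per bin


# A node of the computation graph is either another node or a constant.
Content = Union[Binning, float]


@dataclass
class Correction:
    name: str
    inputs: List[Variable]
    output: Variable
    data: Content              # root node of the computation graph


@dataclass
class CorrectionSet:
    corrections: List[Correction]


corr = Correction(
    name="toy_sf",
    inputs=[Variable("pt", "real")],
    output=Variable("weight", "real"),
    data=Binning(input="pt", edges=[0.0, 50.0, 100.0], content=[1.1, 0.9]),
)
cset = CorrectionSet(corrections=[corr])

# The models are plain data, so the whole set serializes to JSON.
as_json = json.dumps(asdict(cset))
```

The point of the sketch is that a correction is pure data with a recursive `Content` tree, which is what allows it to be handed to another layer as JSON.
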
`include/correction.h` and `src/correction.cc` contain the C++ types that perform the actual computations:
- a `Variable` type with a name and a type (string, integer, real)
- a `CorrectionSet` type, which builds a list of `Correction`s
- the `Correction` type, which builds a compute graph of correction nodes
- types for the different types of nodes in a correction's compute graph, e.g. `Binning`, `Formula`, each with its `evaluate` method. They are constructed by deserializing a JSON object. `Formula::Formula`, for example, parses a `TFormula` expression in the JSON and builds the corresponding `FormulaAST`
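
The idea of a compute graph whose nodes each expose an `evaluate` method can be sketched in a few lines. The sketch below is written in Python for brevity even though the real types are C++; the class layout, the `values` dictionary, the helper `evaluate_node`, and the behaviour for out-of-range inputs are all assumptions for illustration, not a transcription of `src/correction.cc`.

```python
from bisect import bisect_right
from typing import Dict, List, Union


class Binning:
    """Toy node: delegates to the child whose bin contains the input value."""

    def __init__(self, input_name: str, edges: List[float], content: List["Node"]):
        if len(content) != len(edges) - 1:
            raise ValueError("need exactly one child per bin")
        self.input_name = input_name
        self.edges = edges
        self.content = content

    def evaluate(self, values: Dict[str, float]) -> float:
        x = values[self.input_name]
        # Bins are half-open: edges[i] <= x < edges[i + 1].
        if not (self.edges[0] <= x < self.edges[-1]):
            raise IndexError(f"{self.input_name}={x} is outside the bin range")
        child = self.content[bisect_right(self.edges, x) - 1]
        return evaluate_node(child, values)


Node = Union[Binning, float]


def evaluate_node(node: Node, values: Dict[str, float]) -> float:
    """Constants evaluate to themselves; other nodes use their own evaluate()."""
    if isinstance(node, (int, float)):
        return float(node)
    return node.evaluate(values)


# Two-level graph: bin in pt first, then (for high pt only) in eta.
graph = Binning(
    "pt",
    [0.0, 50.0, 100.0],
    [1.1, Binning("eta", [-2.5, 0.0, 2.5], [0.8, 0.9])],
)
result_low = evaluate_node(graph, {"pt": 20.0, "eta": 1.0})
result_high = evaluate_node(graph, {"pt": 70.0, "eta": 1.0})
```

Evaluation is a walk from the root node down to a constant leaf, with each node deciding which child to descend into.
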
In short, the C++ correction objects that perform the actual correction evaluations are constructed from the JSON representations of the Pydantic types defined in `schemav2`.
Let's say the user calls `schemav2.Correction.to_evaluator`. This:
- constructs a `schemav2.CorrectionSet` (the Pydantic model)
- constructs a `highlevel.CorrectionSet` from it and immediately extracts the right `highlevel.Correction` from it, returning it
The actual construction of the internal C++ correction evaluators happens in the construction of the `highlevel.CorrectionSet`, which converts the Pydantic `CorrectionSet` to JSON and uses it to construct a `_core.CorrectionSet` (using `CorrectionSet.from_string`).
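
The call chain described here can be mimicked with stand-in classes. Everything in this sketch is hypothetical: `SchemaCorrection`, `SchemaCorrectionSet`, `HighlevelCorrectionSet`, `HighlevelCorrection` and `CoreCorrectionSet` are toy stand-ins rather than the library's classes, and the correction is reduced to a single constant. It only illustrates that the evaluator is built from a JSON string produced from the schema model.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


class CoreCorrectionSet:
    """Stand-in for the C++-backed set: it only ever sees a JSON string."""

    def __init__(self, corrections: Dict[str, float]):
        self._corrections = corrections

    @classmethod
    def from_string(cls, data: str) -> "CoreCorrectionSet":
        parsed = json.loads(data)
        return cls({c["name"]: c["data"] for c in parsed["corrections"]})

    def __getitem__(self, name: str) -> float:
        return self._corrections[name]


@dataclass
class SchemaCorrection:
    """Stand-in for the schema model of one correction (constant-only here)."""
    name: str
    data: float

    def to_evaluator(self) -> "HighlevelCorrection":
        # Step 1: wrap this correction in a schema-level set.
        cset = SchemaCorrectionSet(corrections=[self])
        # Step 2: build the high-level set and pull this correction back out.
        return HighlevelCorrectionSet(cset)[self.name]


@dataclass
class SchemaCorrectionSet:
    corrections: List[SchemaCorrection]


class HighlevelCorrection:
    """User-facing wrapper that exposes evaluate()."""

    def __init__(self, name: str, core_set: CoreCorrectionSet):
        self.name = name
        self._core_set = core_set

    def evaluate(self) -> float:
        return self._core_set[self.name]


class HighlevelCorrectionSet:
    def __init__(self, model: SchemaCorrectionSet):
        # The schema model is converted to JSON; the core set is built from it.
        self._base = CoreCorrectionSet.from_string(json.dumps(asdict(model)))

    def __getitem__(self, name: str) -> HighlevelCorrection:
        return HighlevelCorrection(name, self._base)


evaluator = SchemaCorrection(name="toy_const", data=1.05).to_evaluator()
value = evaluator.evaluate()
```
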
- `_core.CorrectionSet.from_string` constructs a rapidjson JSONObject and calls `CorrectionSet(const JSONObject &)`
- then for each object in JSONObject it constructs a Correction (`Correction(const JSONObject&)`), and puts it in `CorrectionSet::corrections_`
- `Correction::Correction(const JSONObject&)` sets `data_` to the output of `resolve_content`, passing the JSON
- `resolve_content` (defined in correction.cc) constructs the appropriate type depending on the JSON input (if/else-ing over the known correction types)
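
The if/else dispatch performed by `resolve_content` could look roughly like the following Python sketch. The real function is C++ and operates on rapidjson objects; the `nodetype` key, the JSON layout, the two toy node classes, and the error handling shown here are assumptions made for illustration.

```python
import json
from typing import Any, Union


class Binning:
    """Toy node: one resolved child per bin."""

    def __init__(self, obj: dict):
        self.input = obj["input"]
        self.edges = obj["edges"]
        # Children are resolved recursively, so nested nodes are built too.
        self.content = [resolve_content(c) for c in obj["content"]]


class Category:
    """Toy node: maps a key (e.g. a string) to a resolved child."""

    def __init__(self, obj: dict):
        self.input = obj["input"]
        self.content = {
            item["key"]: resolve_content(item["value"]) for item in obj["content"]
        }


Content = Union[Binning, Category, float]


def resolve_content(obj: Any) -> Content:
    """Construct the node type matching a parsed JSON value."""
    if isinstance(obj, (int, float)) and not isinstance(obj, bool):
        return float(obj)  # a bare number is a constant leaf
    if isinstance(obj, dict):
        nodetype = obj.get("nodetype")
        if nodetype == "binning":
            return Binning(obj)
        elif nodetype == "category":
            return Category(obj)
        raise ValueError(f"unknown nodetype: {nodetype!r}")
    raise TypeError(f"cannot build a node from {type(obj).__name__}")


class Correction:
    def __init__(self, obj: dict):
        self.name = obj["name"]
        self.data = resolve_content(obj["data"])  # root of the compute graph


class CorrectionSet:
    def __init__(self, obj: dict):
        self.corrections = [Correction(c) for c in obj["corrections"]]

    @classmethod
    def from_string(cls, data: str) -> "CorrectionSet":
        return cls(json.loads(data))


text = json.dumps({"corrections": [{
    "name": "toy_sf",
    "data": {"nodetype": "category", "input": "flavor", "content": [
        {"key": "e", "value": 1.0},
        {"key": "mu", "value": {"nodetype": "binning", "input": "pt",
                                "edges": [0.0, 50.0, 100.0],
                                "content": [1.1, 0.9]}},
    ]},
}]})
cset = CorrectionSet.from_string(text)
root = cset.corrections[0].data
```

Because each node constructor calls `resolve_content` on its children, a single call on the root JSON object builds the entire graph.
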