Tagging Information in DaVinci

Option file for testing the ParticleTaggerAlg algorithm and the related ThOr functor MAP_INPUT_ARRAY.

The aim of this example is to retrieve and store in the final ntuple some variables related to the additional tracks in the event that are not included in the decay chain. This kind of task was developed for flavour tagging purposes, but it can be extended to other areas.

In particular, the job runs over a spruced sample and retrieves a set of \(B^0 \to D^-_s K^+\) candidates. For each candidate, the ParticleTaggerAlg looks at the TES location defined via the make_long_pions function and creates a ‘one-to-many’ relation map linking all the available tracks to the \(B\) candidate of the event.

The MAP_INPUT_ARRAY functor then takes this relation map as input and, for each entry, stores the output of an external functor (e.g. F.P, F.PT) in a vector.
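
The idea of a one-to-many relation map and of mapping a functor over it can be pictured with a small plain-Python sketch. It does not use the LHCb software: `Track`, `build_relations` and `map_input_array` are hypothetical stand-ins introduced only for illustration, and the sketch assumes that every candidate is related to all the tracks of the tagging container.

```python
from typing import Callable, Dict, List, NamedTuple


class Track(NamedTuple):
    """Hypothetical stand-in for a track in the tagging container."""
    p: float   # momentum magnitude (arbitrary units)
    pt: float  # transverse momentum (arbitrary units)


# Hypothetical stand-in for the relation table: candidate label -> related tracks
Relations = Dict[str, List[Track]]


def build_relations(candidates: List[str], tracks: List[Track]) -> Relations:
    """Relate every candidate to all the tracks of the tagging container."""
    return {cand: list(tracks) for cand in candidates}


def map_input_array(
    functor: Callable[[Track], float], relations: Relations
) -> Callable[[str], List[float]]:
    """Return a callable that evaluates `functor` on each track related to a candidate."""

    def evaluate(candidate: str) -> List[float]:
        # A candidate without relations yields an empty list
        return [functor(trk) for trk in relations.get(candidate, [])]

    return evaluate


tracks = [Track(p=10.0, pt=2.0), Track(p=25.0, pt=3.5)]
relations = build_relations(["B0_cand_0"], tracks)

# One output vector per candidate, one element per related track
tag_p = map_input_array(lambda trk: trk.p, relations)
tag_pt = map_input_array(lambda trk: trk.pt, relations)
print(tag_p("B0_cand_0"), tag_pt("B0_cand_0"))
```

In the option file the same roles are played by `tagAlg.OutputRelations` (the relation map) and by `F.MAP_INPUT_ARRAY(Functor=..., Relations=...)` (the mapping step).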

In addition, MC truth information for each track in the make_long_pions location is retrieved and stored in the final tuples via the MCTruthAndBkgCat function.
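
Not every track has an associated MC particle, which is why the option file wraps the MC truth functors in `F.VALUE_OR`. The following plain-Python sketch illustrates that pattern under simplifying assumptions: the `Functor` class, `value_or` and `mc_truth` are hypothetical helpers that only mimic the `outer @ inner` composition, and a missing match is represented by `None`.

```python
import math
from typing import Any, Callable, Dict


class Functor:
    """Tiny callable wrapper that supports `outer @ inner` composition."""

    def __init__(self, func: Callable[[Any], Any]):
        self._func = func

    def __call__(self, obj: Any) -> Any:
        return self._func(obj)

    def __matmul__(self, inner: "Functor") -> "Functor":
        # (outer @ inner)(x) evaluates inner first, then outer on its result
        return Functor(lambda obj: self(inner(obj)))


def value_or(default: Any) -> Functor:
    """Replace a missing result (None in this sketch) with `default`."""
    return Functor(lambda value: default if value is None else value)


def mc_truth(functor: Callable[[dict], Any], truth_map: Dict[int, dict]) -> Functor:
    """Apply `functor` to the MC particle matched to a track key, or return None."""

    def evaluate(track_key: int) -> Any:
        mc_particle = truth_map.get(track_key)
        return None if mc_particle is None else functor(mc_particle)

    return Functor(evaluate)


# Invented truth matching: track 0 is matched to a pion, track 1 has no match
truth_map = {0: {"pid": 211, "p": 12.5}}

true_id = value_or(0) @ mc_truth(lambda mc: mc["pid"], truth_map)
true_p = value_or(math.nan) @ mc_truth(lambda mc: mc["p"], truth_map)
print(true_id(0), true_id(1), true_p(0), true_p(1))
```

An integer default such as 0 or -1 is used for integer-valued quantities and NaN for floating-point ones, mirroring the choices made for `TRUEID`, `TagTr_TRUEKEY` and `TagTr_TRUEP` in the option file.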

N.B. The job runs over a spruced sample, but the same flow can also be used for turbo .dst files.


from PyConf.reading import get_particles
from PyConf.Algorithms import ParticleTaggerAlg, ParticleContainerMerger

import Functors as F
from FunTuple import FunctorCollection
from FunTuple import FunTuple_Particles as Funtuple

from DaVinci import Options, make_config
from DaVinci.algorithms import create_lines_filter
from Hlt2Conf.standard_particles import make_long_pions
from DaVinciMCTools import MCTruthAndBkgCat


def main(options: Options):
    # Define the fields of the decay chain of interest
    fields = {
        "B0": "[B0 -> D_s- K+]CC",
        "Ds": "[B0 -> ^D_s- K+]CC",
        "Kp": "[B0 -> D_s- ^K+]CC",
    }

    # Retrieve particles surviving a specific spruced line
    bd2dsk_line = "Spruce_Test_line"
    bd2dsk_data = get_particles(f"/Event/Spruce/{bd2dsk_line}/Particles")

    # Create a new pion container via the 'make_long_pions()' function
    # imported above from the Hlt2Conf.standard_particles module.
    pions = make_long_pions()

    # Since the provenance of the tracks is not important in this kind of task,
    # the 'ParticleTaggerAlg' runs over a single container produced by
    # 'ParticleContainerMerger', which merges all the track containers defined by the user.
    tagging_container = ParticleContainerMerger(InputContainers=[pions]).OutputContainer

    # Define ParticleTagger algorithm and create a relation table between the
    # decay mother particle, i.e. B0 meson and all the tracks defined in the
    # 'tagging_container'.
    tagAlg = ParticleTaggerAlg(
        Input=bd2dsk_data, TaggingContainer=tagging_container, OutputLevel=3
    )
    # Retrieve the relation map linking all the underlying tracks available in the 'tagging_container'
    # to the B meson. This map will be used in the next steps for the functor evaluation.
    tagAlg_rels = tagAlg.OutputRelations

    # If the user needs to store the MC truth information related to the tracks
    # available in the event, the same functions used for the decay chain
    # particles can be applied.
    # For comparison, define a relation map to the corresponding MC particles both
    # for the decay chain ('MCTRUTH') and for the other tracks in the event ('MCTRUTH_pions').
    MCTRUTH = MCTruthAndBkgCat(bd2dsk_data, name="MCTruthAndBkgCat")
    MCTRUTH_pions = MCTruthAndBkgCat(tagging_container, name="MCTruthAndBkgCat_pions")

    # make collection of functors
    # Define all the variables to be associated to the B field
    #
    # N.B: a default value has to be defined for functors returning an int/bool value (or an array of int/bool values)
    # in case the output is empty: e.g. "TRUEID" or "TagTr_TRUEKEY[nTags]".
    variables_B = FunctorCollection(
        {
            "THOR_MASS": F.MASS,
            "PT": F.PT,
            # Retrieve the true ID for the B meson using the 'MCTRUTH' lambda function defined above
            "TRUEID": F.VALUE_OR(0) @ MCTRUTH(F.PARTICLE_ID),
            # Define variables for the tagging particles associated to the B meson.
            # The 'MAP_INPUT_ARRAY' functor can be used to evaluate the functor of interest on all the tracks
            # linked in the 'tagAlg_rels' relation table generated via the 'ParticleTaggerAlg' algorithm.
            "TagTr_P": F.MAP_INPUT_ARRAY(Functor=F.P, Relations=tagAlg_rels),
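            # Without an explicit size name, as in the entry above, the array length is stored
            # in a branch called "indx". A custom name can be given in square brackets after the
            # branch name, as done with 'nTags' in the following entries.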
            "TagTr_PT[nTags]": F.MAP_INPUT_ARRAY(Functor=F.PT, Relations=tagAlg_rels),
            "TagTr_PHI[nTags]": F.MAP_INPUT_ARRAY(Functor=F.PHI, Relations=tagAlg_rels),
            # Define variables containing the MC truth information for the underlying tracks.
            # The 'MAP_INPUT_ARRAY' functor can be used as in the previous example, but in this case
            # the 'MCTRUTH_pions' lambda function has to be exploited for applying the functor of interest
            # to MC particles associated to the underlying tracks.
            #
            # N.B.: an additional default value has to be added in the definition of MAP_INPUT_ARRAY
            # in case of internal functors built via lambda functions in order to ensure that a valid
            # output is always defined, e.g. MCTRUTH_pions.
            "TagTr_TRUEID[nTags]": F.VALUE_OR([0])
            @ F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(0) @ MCTRUTH_pions(F.PARTICLE_ID),
                Relations=tagAlg_rels,
            ),
            "TagTr_TRUEKEY[nTags]": F.VALUE_OR([-1])
            @ F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(-1) @ MCTRUTH_pions(F.OBJECT_KEY),
                Relations=tagAlg_rels,
            ),
            "TagTr_TRUEP[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.P), Relations=tagAlg_rels
            ),
            "TagTr_TRUEPT[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.PT), Relations=tagAlg_rels
            ),
            "TagTr_TRUEPX[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.PX), Relations=tagAlg_rels
            ),
            "TagTr_TRUEPY[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.PY), Relations=tagAlg_rels
            ),
            "TagTr_TRUEPZ[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.PZ), Relations=tagAlg_rels
            ),
            "TagTr_TRUEENERGY[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.ENERGY),
                Relations=tagAlg_rels,
            ),
            "TagTr_TRUEPHI[nTags]": F.MAP_INPUT_ARRAY(
                Functor=F.VALUE_OR(F.NaN) @ MCTRUTH_pions(F.PHI), Relations=tagAlg_rels
            ),
        }
    )

    # Make collection of functors for all the signal decay chain particles
    variables_all = FunctorCollection(
        {
            "THOR_P": F.P,
            "THOR_PT": F.PT,
        }
    )

    # Define a dict with all the variables to be stored associated to the corresponding field.
    variables = {
        "ALL": variables_all,  # adds variables to all fields
        "B0": variables_B,
    }

    # Define the FunTuple object that will produce the final .root files, passing the directory and name
    # of the output tree, the list of fields and the dict of variables to be stored, and the input data
    # location containing the signal decay chain particles.
    tuple_B0DsK = Funtuple(
        name="B0DsK_Tuple",
        tuple_name="DecayTree",
        fields=fields,
        variables=variables,
        inputs=bd2dsk_data,
    )

    # Define a filter in order to process only the events with at least one candidate of interest.
    # This is a very important step that both reduces the computation time required by the job
    # and prevents failures due to empty TES locations.
    filter_B0DsK = create_lines_filter(name="HDRFilter_B0DsK", lines=[bd2dsk_line])

    # Configure DaVinci passing the options and a list of the user-defined algorithms to be run.
    return make_config(options, [filter_B0DsK, tuple_B0DsK])

To run the example:

lbexec DaVinciExamples.tupling.option_davinci_tupling_array_taggers:main $DAVINCIEXAMPLESROOT/example_data/test_spruce_MCtools.yaml

For reference, these are the options of this example:

input_files:
- root://eoslhcb.cern.ch//eos/lhcb/wg/dpa/wp3/tests/spruce_realtimereco_dstinput.dst
input_manifest_file: root://eoslhcb.cern.ch//eos/lhcb/wg/dpa/wp3/tests/spruce_example_realtime_dstinput.tck.json
input_type: ROOT
simulation: true
conddb_tag: sim-20171127-vc-md100
dddb_tag: dddb-20171126
conditions_version: master
geometry_version: run3/trunk
histo_file: sprucing_mc_histos.root
ntuple_file: sprucing_mc_tuple.root
input_raw_format: 0.5
input_process: Spruce
persistreco_version: 0.0
lumi: False
write_fsr: False
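
As a rough picture of the output, each entry of the tree is expected to correspond to one \(B^0\) candidate, with the `TagTr_*[nTags]` branches stored as variable-length arrays whose length is given by `nTags`. The sketch below mimics that layout with plain Python dictionaries. It does not read the ROOT file, all the numbers are invented, and the helper functions are hypothetical.

```python
import math
from typing import Dict, List

# Each dict mimics one tree entry (one B0 candidate); all values are invented
entries: List[Dict] = [
    {"nTags": 2, "TagTr_PT": [850.0, 1200.0], "TagTr_TRUEID": [211, 0]},
    {"nTags": 1, "TagTr_PT": [430.0], "TagTr_TRUEID": [-211]},
]


def leading_tag_pt(entry: Dict) -> float:
    """Largest transverse momentum among the tagging tracks (NaN if there are none)."""
    return max(entry["TagTr_PT"], default=math.nan)


def n_truth_matched(entry: Dict) -> int:
    """Count the tagging tracks whose true ID differs from the default value 0."""
    return sum(1 for pid in entry["TagTr_TRUEID"] if pid != 0)


for entry in entries:
    # The array branches of one entry all have length nTags
    assert len(entry["TagTr_PT"]) == entry["nTags"]
    print(entry["nTags"], leading_tag_pt(entry), n_truth_matched(entry))
```

The default value 0 used here for unmatched tracks follows the `F.VALUE_OR(0)` choice made for `TagTr_TRUEID` in the option file.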