ATLAS Offline Software
Loading...
Searching...
No Matches
CalibrationDataInterfaceROOT.cxx
Go to the documentation of this file.
1/*
2 Copyright (C) 2002-2026 CERN for the benefit of the ATLAS collaboration
3*/
4
6// //
7// CalibrationDataInterfaceROOT //
8/* Begin_Html
9<h2>Purpose</h2>
10<p>This class provides the main interface to b-tagging calibration information.
11 Each instance can be used to handle a single tagger (so if multiple taggers' information
12 needs to be accessed, multiple CalibrationDataInterfaceROOT instances need to be created).<br />
13 Its action is steered by a configuration file (which is parsed using TEnv and which is
14 specified in the CalibrationDataInterfaceROOT constructor).
15</p>
16<p>
17 Further details can be found below, arranged by topic: <a href="#basic">basic usage</a>,
18 <a href="#ev">eigenvector variation usage</a>, and <a href="#configuration">configuration file
19 specification</a>.
20</p>
21<a name="basic"><h2>Basic usage</h2></a>
22<p>
23 The main functionality is to retrieve data/MC calibration scale factors and MC efficiencies,
24 on a jet-by-jet basis. This is done using the following methods:
25 <pre>
26 getScaleFactor (variables, label, OP, unc)
27 getMCEfficiency (variables, label, OP, unc, mapIndex)
28 </pre>
29 with the following meaning for the arguments:
30 <pre>
31 variables: a CalibrationDataVariables object that should be filled by the user with all
32 the information needed to extract the calibration information
33 label: flavour label. Currently recognised labels follow the Athena TruthInfo conventions
34 (see package PhysicsAnalysis/JetTagging/JetTagInfo): "B", "C", "T", "N/A", and in
35 addition "Light" (the "N/A" is internally converted to "Light")
36 OP: tagger working point. This should correspond to a documented weight cut for the
37 tagger under consideration, but converted to a string, and with any period (".")
38 replaced with an underscore for technical reasons. Alternatively, "Continuous" may
39 be used if a "continuous tagging" calibration object exists for the tagger under
40 consideration. (Note that the use of this method for "continuous tagging" is not
41 in general to be recommended, as it does not not allow for scale factor rescaling,
42 which is ne
43 unc: enum specifying what uncertainty information is to be retrieved. Recognised values
44 for normal usage are <b>None</b> (retrieval of central values only), <b>Statistical</b>
45 (statistical uncertainty), <b>Systematic</b> (systematic uncertainty), <b>Total</b>
46 (combined statistical and systematic uncertainty). (Other choices exist for more advanced
47 usage and will be described <a href="#ev">below</a>.)
48 mapIndex: index specifying the precise MC efficiency calibration to be used. This is relevant if
49 multiple "efficiency calibration" names are specified for the given flavour label (see
50 the <a href="#configuration">configuration section</a>).
51 </pre>
52 All methods return a pair&lt;double, double&gt; (the stated <b>CalibResult</b> return type is a simple typedef),
53 the first member of which contains the central value and the second the requested uncertainty (or 0 if no
54 uncertainty was requested). Note that central values are retricted to be within the physical region (between 0
55 and 1 for the MC efficiencies, and positive for the scale factors).
56</p>
57<p>
58 A few additional methods exist, based on (and internally using) the above methods:
59 <pre>
60 getEfficiency (variables, label, OP, unc, mapIndex): data efficiency calculated as MC efficiency times SF
61 getInefficiency (variables, label, OP, unc, mapIndex): 1 - data efficiency
62 getMCInefficiency (variables, label, OP, unc, mapIndex): 1 - MC efficiency
63 getInefficiencyScaleFactor (variables, label, OP, unc, mapIndex): ratio of data to MC inefficiencies
64 </pre>
65 Especially the last of these methods is likely to be of interest when data/MC scale factors are applied;
66 <a href="http://cdsweb.cern.ch/record/1269912">this note</a> provides more detail. Restrictions to the physical
67 range apply, as is the case for the data/MC scale factor and MC efficiency methods.
68</p>
69<a name="speed-up"><h3>Code speed-up using variable caching</h3></a>
70<p>
71 Internally, pointers to the calibration objects are cached. However, the string matching needed to
72 exploit this feature is slow and leads to a non-negligible CPU overhead. Since internally all pointers
73 are stored in a vector, it is possible to request the position in this vector for a given calibration
74 object. This position can then be used in subsequent alternative calls to retrieve information:
75 <pre>
76 getScaleFactor (variables, indexSF, indexEff, unc)
77 getMCEfficiency (variables, index, unc)
78 </pre>
79 Here the "index" variables replace the specification of the flavour label and operating point. They can
80 be retrieved by calling
81 <pre>
82 retrieveCalibrationIndex(label, OP, author, isSF, index)
83 </pre>
84 where the following additional variables are to be used:
85 <pre>
86 author: jet collection name
87 isSF: set to true (false) if the object concerned is a scale factor object (a MC efficiency object)
88 index: requested information
89 </pre>
90 Note that this method has a boolean return value, indicating whether or not the retrieval succeeded.
91 This return value should be checked by the user, as no checks are carried in the alternative getScaleFactor()
92 etc. methods as to the validity of the index specified. Note also that for all methods except those using
93 only the MC information, both indices need to be provided.
94</p>
95<a name="continuous"><h3>Continuous tagging</h3></a>
96<p>
97 Rather than merely inquiring whether the tag weight discriminant resulting from a given tagger satisfies a
98 given criterion, it may be useful to use more detailed information. In particular, "continuous tagging"
99 information can be made available. This is essentially a calibrated version of the (binned) tag weight
100 discriminant distribution, or to be more precise, the corresponding data/MC ratio.
101</p>
102<p>
103 The use of this information differs somewhat from the regular calibration usage, as the "transport" from
104 the calibration sample(s) to the sample used in physics becomes a non-trivial extension of the efficiency
105 and inefficiency scale factors used for the regular calibrations. Therefore, two separate methods exist
106 which should be used for such cases:
107 <pre>
108 getWeightScaleFactor (variables, label, unc, numVariation, mapIndex)
109 getWeightScaleFactor (variables, indexSF, indexEff, unc, numVariation)
110 </pre>
111 The meaning of the variables is as above.
112</p>
113<a name="ev"><h2>Eigenvector variations</h2></a>
114<p>
115 The 'basic usage' described above does not do justice to the fact that data/MC calibration scale factors
116 are derived in kinematic bins, with uncertainties not being fully correlated from bin to bin (indeed, the
117 methods themselves do not know anything about binning to start with). It is however possible to arrive at
118 a statistically more correct approach by considering 'variations'. This exploits the fact that in the case
119 of scale factor calibrations, besides statistical and total systematic uncertainties also individual
120 systematic uncertainties are stored (along with a model of their bin-to-bin correlations). The typical use
121 of this, applicable for a systematic uncertainty that is fully correlated (or anti-correlated) between bins,
122 would be to consider the effect on all calibration scale factors from a &pm; 1 standard deviation of the
123 underlying source of systematic.
124</p>
125<p>
126 In practice, this approach is a bit cumbersome due to the sometimes large number of contributions to the
127 systematic uncertainty. In addition, no infrastructure exists at present to deal with uncertainties that are
128 not, or only partially, correlated from bin to bin. An eigenvector decomposition technique allows to address
129 both of these issues at the same time.
130</p>
131<p>
132 This method starts from the scale factor covariance matrix that can be constructed on the basis of the
133 available information (uncertainties plus a model of their bin-to-bin correlations). This covariance matrix
134 can be diagonalised and its eigenvectors and corresponding eigenvalues determined. The product of an eigenvector
135 and the square root of its corresponding eigenvalue is what is referred to as an 'eigenvector variation'.
136 The number of such (statistically independent) variations equals the number of calibration bins, and the set
137 of variations is the minimal set needed to (re-)construct the covariance matrix. After this eigenvector
138 decomposition, the eigenvector variations may be used instead of the variations corresponding to the original
139 sources of uncertainty.
140</p>
141<p>
142 One further refinement can be made, related to the fact that sometimes sources of uncertainty affecting
143 the calibration analyses also affect the physics analysis in which the calibration results are used.
144 Including these sources of uncertainty in the eigenvector decomposition would not allow to correlate the
145 effect on the calibration properly with the effect on the physics analysis; therefore it is possible to
146 exclude these sources of uncertainty from being used in the construction of the covariance matrix that is
147 diagonalised, and consider their variations separately (note, per the above, that this will be done
148 correctly only in the case of full bin-to-bin correlations).
149</p>
150<p>
151 The use of the eigenvector variation is not enabled by default and must be switched on in the configuration
152 file (see <a href="#configuration">Configuration</a>).
153</p>
154<p>
155 The above can be used to obtain jet-by-jet information by a slight modification of the arguments to the
156 normal methods for information retrieval also employed for <a href="#basic">basic usage</a>:
157 <pre>
158 getScaleFactor (variables, label, OP, unc, numVariation)
159 </pre>
160 Here, the first three variables function exactly as in the basic usage case. As for the two last arguments:
161 <pre>
162 unc: this should take the value <b>SFEigen</b> or <b>SFNamed</b> for eigenvector variations or named variations, respectively.
163 numVariation: this indicates exactly which eigenvector variation or named variation is to be considered.
164 </pre>
165 In addition, it should be noticed that the methods' return values have a different meaning in this context:
166 while in normal usage they are (value, uncertainty) pairs, here they are (up, down) pairs, with up and down
167 meaning the result of an up- or downward eigenvector variation or named variation.
168</p>
169<p>
170 The number of valid variations for each type can be retrieved using
171 <pre>
172 getNumVariations(author, label, OP, Uncertainty unc)
173 </pre>
174 with "unc" set appropriately. Finally, calling
175 <pre>
176 listScaleFactorUncertainties(author, label, OP, named)
177 </pre>
178 can be used to retrieve information about uncertainties in two ways, depending on the value of the last argument:
179 <pre>
180 named = false (default): retrieve an unsorted list of all the uncertainties associated with the relevant scale factor
181 calibration object. This can be used to identify sources of uncertainty to be excluded from
182 the eigenvector decomposition.
183 named = true: the list in this case is restricted to the named uncertainties (these should have been specified
184 in the configuration file by the user) and ordered: the position in the vector constitutes the
185 link between the name and the index to be used as the "numVariation" argument to the
186 getScaleFactor() etc. method.
187 </pre>
188 Note that as in the basic usage case, the code can be sped up by replacing the jet author/label/OP specification by the appropriate
189 integer index (see the <a href="#speed-up">corresponding section</a>).
190</p>
191<a name="configuration"><h2>Configuration</h2></a>
192<p>
193 The configuration file may specify the following settings:
194</p>
195<ul>
196 <li>calibration file specification:
197 <pre>
198
199 filename: common ROOT file for calibration scale factors and MC efficiencies
200 (default: BTaggingPerformanceCalibrations.root)
201 filenameEff: separate ROOT file for MC efficiencies
202 filenameSF: separate ROOT file for calibration scale factors
203 </pre>
204 Note that it is not necessary to specify all three files. The most common use for the specification
205 of more than one file is likely to be the specification of analysis-specific MC efficiencies (the
206 efficiencies provided in the common file are expected to be sufficient for many analysis purposes,
207 but depending on the accuracy needed it may be desirable to add custom efficiencies).
208 </li>
209 <li>choice of calibration:
210 <pre>
211
212 ScaleFactorCalibrationBName: name for scale factor calibration object to be used for b-jets (default: default)
213 EfficiencyCalibrationBName: name for MC efficiencies object(s) to be used for b-jets (default: default).
214 Note that multiple (semicolon-separated) MC efficiency objects may be specified;
215 internally they will be converted to a vector. The position in the vector can
216 then be used as the 'mapIndex' argument in the methods involving MC information
217 (see e.g. the <a href="#basic">basic usage</a> section).
218 </pre>
219 Analogous keywords (with B replaced with C, T, Light) can be used to specify the objects to be used for
220 charm, tau, and light-flavour jets, respectively. Note that the "default" object should always exist and reflect
221 the recommended choice of object unless analysis-specific needs dictate otherwise.
222 </li>
223 <li>jet aliasing:
224 <pre>
225 aliases: semicolon-separated list of jet collection aliases
226 </pre>
227 The feature of jet aliasing follows the strategy also used in the CalibrationBroker Athena class.
228 It allows for the use of a calibration for different jet collection names than those actually
229 employed in physics analysis (caveat emptor!). Each alias consists of the specification
230 <pre>
231 source-&gt;target
232 </pre>
233 where source indicates the name used in physics analysis, while target is the name to be used for calibration purposes.
234 </li>
235 <li>eigenvector decomposition:
236 <pre>
237 runEigenVectorMethod: if set to true this will carry out an eigenvector decomposition, the results
238 of which can subsequently be used as calibration scale factor "variations"
239 correctly accounting for bin-to-bin correlations. For detailed usage of
240 b-tagging results (e.g. in profile likelihood fits) this is recommended (default: false)
241 excludeFromCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
242 decomposition, for all flavours.
243 excludeFromBCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
244 decomposition for b jets
245 excludeFromCCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
246 decomposition for c jets
247 excludeFromTCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
248 decomposition for tau "jets"
249 excludeFromLightCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
250 decomposition for light-flavour jets
251 </pre>
252 </li>
253</ul>
254End_Html */
255//
256// CalibrationDataInterfaceROOT.cxx, (c) ATLAS Detector software //
258
261
264#include <boost/algorithm/string.hpp>
265
266#include "TMath.h"
267#include "TEnv.h"
268#include "TFile.h"
269#include "TObjString.h"
270#include <iostream>
271#include <iomanip>
272#include <cmath>
273#include <cassert>
274#include <cstring>
275
276using std::string;
277using std::cout;
278using std::cerr;
279using std::endl;
280using std::setw;
281
286using boost::trim;
287
288#ifndef __CINT__
290#endif
291
292//________________________________________________________________________________
293Analysis::CalibrationDataInterfaceROOT::CalibrationDataInterfaceROOT(const string& taggerName, const std::string& configname, const std::string& pathname) :
294 m_runEigenVectorMethod(false), m_EVStrategy(SFEigen), m_useRecommendedEVExclusions(false), m_verbose(true),
295 m_absEtaStrategy(GiveUp), m_otherStrategy(Flag)
296{
297 // Normal constructor.
298 //
299 // taggerName: this should correspond to the tagger name as used in the calibration ROOT file
300 // configname: full name of the configuration file
301 // pathname: directory specification for separate scale factor or efficiency ROOT file
302
303 m_taggerName = taggerName;
304
305 TEnv env;
306 env.ReadFile(configname.c_str(),kEnvGlobal);
307
308
309 // ROOT file containing the calibrations
310 TString filename = env.GetValue("File", "BTaggingPerformanceCalibrations.root");
311 m_filenameEff = string(env.GetValue("FileEff", "")); trim(m_filenameEff);
312 m_filenameSF = string(env.GetValue("FileSF", "")); trim(m_filenameSF);
313 if (m_filenameEff == "") {
314 m_filenameEff = pathname + filename.Data();
315 }
316 if (m_filenameSF == "") {
317 m_filenameSF = pathname + filename.Data();
318 }
319
320 if (m_verbose) {
321 cout << "=== CalibrationDataInterfaceROOT::CalibrationDataInterfaceROOT ===" << endl;
322 cout << " Config name : " << configname << endl;
323 cout << " taggerName : " << taggerName << endl;
324 cout << " Efficiency file name : " << m_filenameEff << endl
325 << " SF file name : " << m_filenameSF << endl;
326 }
327
328 m_fileEff = TFile::Open(m_filenameEff.c_str(), "READ");
329 if (m_filenameEff == m_filenameSF)
330 m_fileSF = m_fileEff;
331 else
332 m_fileSF = TFile::Open(m_filenameSF.c_str(), "READ");
333
334 if (m_verbose) {
335 TObjString* s;
336 m_fileSF->GetObject("VersionInfo/BuildNumber", s);
337 if (s) cout << " CDI file build number: " << s->GetName() << endl;
338 cout << endl;
339 }
340
341 m_flavours = { "B", "C", "T", "Light" };
342 string testPrefix(taggerName); testPrefix += ".";
343
344 // Since TEnv doesn't allow for straight retrieval of vectors of strings, expect
345 // semicolon-separated entries (semicolon because ROOT considers this as a "special"
346 // token anyway in object names).
347 string::size_type end;
348
349 // Calibration names for the efficiencies
350 std::map<string, std::vector<string> > effNames;
351 for (auto const& flavour : m_flavours) {
352 string test(testPrefix); test += "EfficiencyCalibration"; test += flavour; test += "Name";
353 effNames[flavour] = split(string(env.GetValue(test.c_str(), "default")));
354 }
355 setEffCalibrationNames(effNames);
356
357 // Calibration names for the efficiency scale factors
358 std::map<string, string> SFNames;
359 for (auto const& flavour : m_flavours) {
360 string test(testPrefix); test += "ScaleFactorCalibration"; test += flavour; test += "Name";
361 SFNames[flavour] = string(env.GetValue(test.c_str(), "default")); trim(SFNames[flavour]);
362 }
363 setSFCalibrationNames(SFNames);
364
365 // Since TEnv doesn't allow for straight retrieval of vectors of strings, expect
366 // semicolon-separated entries (semicolon because ROOT considers this as a "special"
367 // token anyway in object names).
368 // Don't prefix this since the aliases are common to all taggers (even if they are read again for each tagger).
369 string AL(env.GetValue("aliases", ""));
370 if (AL.size() > 0) {
371 do {
372 end = AL.find(";");
373 string alias = AL.substr(0, end);
374 // Each alias specification uses an arrow ("->"). Forget about entries
375 // not properly following this specification.
376 // NB: TEnv imposes a maximum string length of 1024 characters -- is this a problem?
377 string::size_type arrow = alias.find("->");
378 if (arrow == string::npos) continue;
379 string target = alias.substr(0,arrow); trim(target);
380 m_aliases[target] = alias.substr(arrow+2); trim(m_aliases[target]);
381 if (end != string::npos) AL = AL.substr(end+1);
382 } while (end != string::npos);
383 }
384
385 //run egenvector method or not?
386 string test="runEigenVectorMethod";
387 m_runEigenVectorMethod=(bool)env.GetValue(test.c_str(),0);
388
389 if (m_runEigenVectorMethod) {
390 // Retrieve the list of systematic uncertainties not to be considered when building up
391 // the full covariance matrix used for the eigenvector method.
392 // We do this in two steps: first, for backward compatibility reasons, a flavour-independent list is scanned.
393 // Second, flavour-specific lists are scanned.
394 test = "excludeFromCovMatrix";
395 std::vector<std::string> to_exclude = split(env.GetValue(test.c_str(), ""));
396 // Copy the resulting list to all flavours
397 for (auto const& flavour : m_flavours) {
398 m_excludeFromCovMatrix[flavour] = to_exclude;
399 }
400 for (auto const& flavour : m_flavours) {
401 test = "excludeFrom"; test += flavour; test += "CovMatrix";
402 to_exclude = split(env.GetValue(test.c_str(), ""));
403 // Append to the existing list
404 m_excludeFromCovMatrix[flavour].insert(m_excludeFromCovMatrix[flavour].end(), to_exclude.begin(), to_exclude.end());
405 }
406
407 unsigned int n_excluded = 0;
408 for (auto const& flavour : m_flavours) {
409 n_excluded += m_excludeFromCovMatrix[flavour].size();
410 }
411 if (m_verbose) {
412 cout << " List of uncertainties to exclude:";
413 if (n_excluded == 0) cout << " none";
414 for (auto const& flavour : m_flavours) {
415 if (m_excludeFromCovMatrix[flavour].size() > 0) {
416 cout << "\n\t" << flavour << ":\t";
417 for (unsigned int i = 0; i < m_excludeFromCovMatrix[flavour].size(); ++i) {
418 cout << m_excludeFromCovMatrix[flavour].at(i);
419 if (i+1 == m_excludeFromCovMatrix[flavour].size()) cout << "; ";
420 }
421 cout << endl;
422 }
423 }
424 cout << endl;
425 }
426
427 // The following determines whether also pre-determined (recommended) lists of uncertainties are to be excluded from EV decomposition.
428 // These lists are stored with the CalibrationDataContainers, which have not been instantiated yet (so we cannot show them at this point).
429 m_useRecommendedEVExclusions = (bool) env.GetValue("ExcludeRecommendedFromEigenVectorTreatment", false);
430
431 // determine also the eigenvector reduction strategies
432 std::map<string, EVReductionStrategy> mappings;
433 mappings["Loose"] = Loose;
434 mappings["Medium"] = Medium;
435 mappings["Tight"] = Tight;
436 for (auto const& flavour : m_flavours) {
437 test = testPrefix; test += "EigenvectorReduction"; test += flavour;
438 std::string reduction = string(env.GetValue(test.c_str(), "Loose")); trim(reduction);
439 m_EVReductions[flavour] = mappings.find(reduction) == mappings.end() ? mappings["Loose"] : mappings.find(reduction)->second;
440 }
441 }
442
443 // determine |eta| validity range
444 m_maxAbsEta = env.GetValue("MaxAbsEta", 2.5);
445 if (m_maxAbsEta < 0) m_maxAbsEta = 2.5;
446
447 // set validation / protection strategy in case an out-of-bounds eta value is specified
448 string strategy = string(env.GetValue("OutOfBoundsEta", "GiveUp")); trim(strategy);
449 if (strategy == "GiveUp") m_absEtaStrategy = GiveUp;
450 else if (strategy == "Flag") m_absEtaStrategy = Flag;
451 else if (strategy == "Ignore") m_absEtaStrategy = Ignore;
452 else {
453 cerr << "unknown |eta| extrapolation strategy: " << strategy << ", setting to GiveUp" << endl;
454 m_absEtaStrategy = GiveUp;
455 }
456
457 // set validation / protection strategy in case out-of-bounds variables are specified
458 strategy = string(env.GetValue("OutOfBoundsOther", "Flag")); trim(strategy);
459 if (strategy == "GiveUp") m_otherStrategy = GiveUp;
460 else if (strategy == "GiveUpExtrapolated") m_otherStrategy = GiveUpExtrapolated;
461 else if (strategy == "Flag") m_otherStrategy = Flag;
462 else if (strategy == "Ignore") m_otherStrategy = Ignore;
463 else {
464 cerr << "unknown general extrapolation strategy: " << strategy << ", setting to Flag" << endl;
465 m_otherStrategy = Flag;
466 }
467
468 // maximum tag weight to accept
469 m_maxTagWeight = env.GetValue("MaxTagWeight", 10.0);
470
471 // MC/MC (hadronisation) scale factors: making this user-steerable is intended to be *temporary* only
472 m_useMCMCSF = (bool) env.GetValue("useMCMCSF", 1);
473 // MC/MC (topology) scale factors: making this user-steerable is intended to be *temporary* only
474 m_useTopologyRescaling = (bool) env.GetValue("useTopologySF", 0);
475
476 if (m_verbose) cout << "======= end of CalibrationDataInterfaceROOT instantiation ========" << endl;
477}
478
479
480//________________________________________________________________________________
482 const char* fileSF, const char* fileEff,
483 const std::vector<std::string>& jetAliases,
484 const std::map<std::string, std::string>& SFNames,
485 const std::map<std::string, std::vector<std::string> >& EffNames,
486 const std::map<std::string, std::vector<std::string> >& excludeFromEV,
487 const std::map<std::string, EVReductionStrategy>& EVReductions,
488 bool useEV, Uncertainty strat, bool useMCMCSF, bool useTopologyRescaling,
489 bool useRecommendedEEVExclusions, bool verbose,
490 std::vector<std::string> flavours) :
491 m_filenameSF(fileSF), m_filenameEff(""), m_flavours(std::move(flavours)),
492 m_runEigenVectorMethod(useEV), m_EVStrategy(strat), m_EVReductions(EVReductions),
493 m_useRecommendedEVExclusions(useRecommendedEEVExclusions), m_verbose(verbose),
494 m_useMCMCSF(useMCMCSF), m_useTopologyRescaling(useTopologyRescaling),
497{
498 // Normal constructor avoiding the need for a .env file.
499 //
500 // taggerName: this should correspond to the tagger name as used in the calibration ROOT file
501 // fileSF: full path of the calibration ROOT file containing the calibration scale factors
502 // fileEff: optional full path name of a ROOT file containing additional MC efficiency maps
503 // (use a null pointer to disable the use of such additional file)
504 // flavours; This should correspond to the list of flavour labels that's used by a tagger, and
505 // which corresponds to the labels used in internal maps
506 // jetAliases: this can be used to convert jet collection names to the corresponding names in the
507 // calibration ROOT file (this may be useful as e.g. the collection names in the
508 // calibration ROOT file have the JVF criterion attached as a suffix).
509 // Each alias is specified as
510 // nameOrig->nameTarget,
511 // where nameOrig and nameTarget are the names of the input jet collection and the
512 // jet collection name as used in the calibration ROOT file, respectively.
513 // SFNames: map specifying for each of the calibration flavours ("B", "C", "T", "Light") the
514 // name of the scale factor calibration object
515 // EffNames: map specifying for each of the calibration flavours ("B", "C", "T", "Light") the
516 // names of the possibly relevant efficiency calibration objects
517 // excludeFromEV: map specifying for each of the calibration flavours ("B", "C", "T", "Light") the
518 // systematic uncertainties to be excluded from the Eigenvector variation treatment
519 // (this is used only if Eigenvector variations are used to begin with)
520 // EVReductions: Eigenvector variation reduction strategies for "B", "C", "Light" jets (again,
521 // this is only relevant if Eigenvector variations are used to begin with)
522 // useEV: switch specifying if Eigenvector variations will be used or not
523 // useMCMCSF: switch specifying if generator-dependent scale factors are to be applied or not
524
525 // Note: at present, the means to change the strategies and maximum values initialized above do not exist
526 // when using this constructor
527
528 if (m_verbose) {
529 cout << "=== CalibrationDataInterfaceROOT::CalibrationDataInterfaceROOT ===" << endl;
530 cout << " taggerName : " << taggerName << endl;
531 cout << " Systematic strategy : ";
532 if (m_EVStrategy == Analysis::Uncertainty::SFEigen){
533 cout << "SFEigen" << endl;
534 } else if (m_EVStrategy == Analysis::Uncertainty::SFGlobalEigen){
535 cout << "SFGlobalEigen" << endl;
536 } else {
537 cout << " Other" << endl;
538 }
539 if (fileEff) cout << " Efficiency file name : " << fileEff << endl;
540 cout << " SF file name : " << fileSF << endl
541 << endl;
542 }
543
544 m_taggerName = taggerName;
545
546 m_fileSF = TFile::Open(fileSF, "READ");
547 if (fileEff && strcmp(fileSF, fileEff) != 0) {
548 m_filenameEff = string(fileEff);
549 m_fileEff = TFile::Open(fileEff, "READ");
550 } else
552
553 if (m_verbose) {
554 TObjString* s;
555 m_fileSF->GetObject("VersionInfo/BuildNumber", s);
556 if (s) cout << " CDI file build number: " << s->GetName() << endl;
557 cout << endl;
558 }
559
560 for (unsigned int i = 0; i < jetAliases.size(); ++i) {
561 // Each alias specification uses an arrow ("->"). Forget about entries
562 // not properly following this specification.
563 string::size_type arrow = jetAliases[i].find("->");
564 if (arrow == string::npos) continue;
565 m_aliases[jetAliases[i].substr(0,arrow)] = jetAliases[i].substr(arrow+2);
566 }
567
568 setEffCalibrationNames(EffNames);
569 setSFCalibrationNames(SFNames);
570
572 // if we want to run EV, then decide which one
573 // The following should hold for both eigenvector decomposition methods (SFEigen and SFGlobalEigen)
574 // The global one simply adapts itself to using the m_excludeFromCovMatrix to perform the same task
575
576 m_excludeFromCovMatrix = excludeFromEV;
577 unsigned int n_excluded = 0;
578 for (auto const& flavour : m_flavours) {
579 n_excluded += m_excludeFromCovMatrix[flavour].size();
580 }
581 if (m_verbose) {
582 cout << " List of uncertainties to exclude:";
583 if (n_excluded == 0) cout << " none";
584 for (auto const& flavour : m_flavours) {
585 if (m_excludeFromCovMatrix[flavour].size() > 0) {
586 cout << "\n\t" << flavour << ":\t";
587 for (unsigned int i = 0; i < m_excludeFromCovMatrix[flavour].size(); ++i) {
588 cout << m_excludeFromCovMatrix[flavour].at(i);
589 if (i+1 == m_excludeFromCovMatrix[flavour].size()) cout << "; ";
590 }
591 cout << endl;
592 }
593 }
594 cout << endl;
595 }
596
597 }
598
599 if (m_verbose) cout << "======= end of CalibrationDataInterfaceROOT instantiation ========" << endl;
600}
601
602//________________________________________________________________________________
604{
605 // Default constructor for PROOF purposes
606
607 m_fileEff=0;
608 m_fileSF=0;
609}
610
611//________________________________________________________________________________
621{
622 // Copy constructor. Note that the "cacheable" items aren't copied (they will be re-created if needed)
623
624 // The TFile objects cannot be copied. Therefore, create duplicate objects starting from the filenames
625 m_fileSF = TFile::Open(m_filenameSF.c_str(), "READ");
628 else
629 m_fileEff = TFile::Open(m_filenameEff.c_str(), "READ");
630}
631
632//________________________________________________________________________________
634{
635 // Destructor
636 if ((m_fileEff!=0) && (m_fileSF!=0)) {
637 if (m_fileEff == m_fileSF) {
638 m_fileEff->Close();
639 delete m_fileEff; m_fileEff = 0;
640 } else {
641 m_fileEff->Close();
642 m_fileSF->Close();
643 delete m_fileEff; m_fileEff = 0;
644 delete m_fileSF; m_fileSF = 0;
645 }
646 }
647 // delete also the stored objects (these are owned by us)
648 for (std::vector<CalibrationDataContainer*>::iterator it = m_objects.begin(); it != m_objects.end(); ++it) {
649 if (*it) {
650 delete *it; *it = 0;
651 }
652 }
653
654 for (std::map<std::string, HadronisationReferenceHelper*>::iterator it = m_refMap.begin();
655 it != m_refMap.end(); ++it) {
656 if(it->second)
657 { delete it->second; it->second=nullptr; }
658 }
659
660 // Print summary output on out-of-bounds issues
661 if (m_absEtaStrategy == Flag && m_verbose) {
662 bool found = false;
663 cout << "\t\tCalibrationDataInterfaceROOT |eta| out-of-bounds summary:" << endl;
664 for (unsigned int index = 0; index < m_mainCounters.size(); ++index)
665 if (m_etaCounters[index] > 0) {
666 found = true;
667 cout << "\t\t\t" << nameFromIndex(index) << ": " << m_etaCounters[index] << endl;
668 }
669 if (!found) cout << "\t\t\tNo issues found" << endl;
670 }
671 if (m_otherStrategy == Flag && m_verbose) {
672 bool found = false;
673 cout << "\t\tCalibrationDataInterfaceROOT object out-of-bounds summary:" << endl;
674 for (unsigned int index = 0; index < m_mainCounters.size(); ++index)
676 found = true;
677 cout << "\t\t\t" << nameFromIndex(index)
678 << " general: " << m_mainCounters[index]
679 << ", extrapolated: " << m_extrapolatedCounters[index]
680 << endl;
681 }
682 if (!found) cout << "\t\t\tNo issues found" << endl;
683 }
684}
685
686//________________________________________________________________________________
687bool
689 const std::string& OP,
690 const std::string& author,
691 bool isSF, unsigned int& index,
692 unsigned int mapIndex)
693{
694 // Retrieve the integer index corresponding to a given combination of
695 // flavour label / tagger / working point / jet collection name, and separately
696 // for calibration scale factors and MC efficiencies (all these ingredients are needed
697 // to specify fully the calibration object).
698 // In fact this method will also trigger the retrieval of the object itself, if not already
699 // done, and will cache it internally. The absence of the requested calibration object will
700 // be flagged by a false return value.
701 // This method is used internally but should also be called by users in order to exploit the
702 // "code speed-up" features documented above.
703 //
704 // label: jet flavour label
705 // OP: tagger working point
706 // author: jet collection name
707 // isSF: set to true (false) for scale factors (MC efficiencies)
708 // index: resulting index (meaningful only for a 'true' function return value)
709 // mapIndex: index to the MC efficiency map to be used
710
711 index = 0;
712
713 // construct the full name from the label, operating point, SF/Eff choice;
714 // then look up this full name
715 string name = fullName(author, OP, label, isSF, mapIndex);
716 std::map<string, unsigned int>::const_iterator it = m_objectIndices.find(name);
717 if (it == m_objectIndices.end()) {
718 // If no container is found, attempt to retrieve it here (this is so that users won't
719 // have to call the named scale factor etc. methods once just to retrieve the container).
720 string flavour = (label == "N/A") ? "Light" : label;
721 string cntname = getContainername(flavour, isSF, mapIndex);
722 if (m_verbose) std::cout << "CalibrationDataInterfaceROOT->retrieveCalibrationIndex : container name is " << cntname << std::endl;
723 retrieveContainer(flavour, OP, author, cntname, isSF, m_verbose); // Only call this if you want to retrieve a currently not available container
724 it = m_objectIndices.find(name);
725 if (it == m_objectIndices.end()) return false;
726 } else {
727 if (m_verbose) std::cout << "CalibrationDataInterfaceROOT->retrieveCalibrationIndex : container " << name << " already cached! " << std::endl;
728 }
729
730 index = it->second;
731 return true;
732}
733
734//________________________________________________________________________________
737 const string& label, const string& OP,
738 Uncertainty unc, unsigned int numVariation,
739 unsigned int mapIndex) //const
740{
741 // Scale factor retrieval identifying the requested calibration object by name.
742 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
743 // above, and will be a dummy value in case an error occurs.
744 //
745 // variables: object holding kinematic (and other) information needed to compute the result
746 // label: jet flavour label
747 // OP: tagger operating point
748 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
749 // named variations are to be computed)
750 // numVariation: variation index (in case of eigenvector or named variations)
751 // mapIndex: index to the efficiency map to be used (this is needed for MC/MC scale factor
752 // application)
753 unsigned int indexEff, indexSF;
754 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) && retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
755 cerr << "getScaleFactor: unable to find SF calibration for object " << fullName(variables.jetAuthor, OP, label, false, mapIndex) << " or SF calibration for object " << fullName(variables.jetAuthor, OP, label, true) << endl;
756 // Return a dummy result if the object is not found
758 }
759
760 Analysis::CalibResult result; // the following is SF #3
761 return (getScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, label) == Analysis::kError) ?
762 Analysis::dummyResult : result;
763}
764
765//________________________________________________________________________________
768 unsigned int indexSF, unsigned int indexEff,
769 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
770{
771 // Scale factor retrieval identifying the requested calibration object by index.
772 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
773 // above, and will be a dummy value in case an error occurs.
774 //
775 // variables: object holding kinematic (and other) information needed to compute the result
776 // indexSF: index to scale factor calibration object
777 // indexEff: index to MC efficiency object
778 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
779 // named variations are to be computed)
780 // numVariation: variation index (in case of eigenvector or named variations)
781 Analysis::CalibResult result; // the following is SF #3
782 return (getScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
783 Analysis::dummyResult : result;
784}
785
786//________________________________________________________________________________
789 unsigned int indexSF, unsigned int indexEff,
790 Uncertainty unc, unsigned int numVariation,
791 Analysis::CalibResult& result, const string& flavour)
792{
793 // Scale factor retrieval identifying the requested calibration object by index.
794 //
795 // variables: object holding kinematic (and other) information needed to compute the result
796 // indexSF: index to scale factor calibration object
797 // indexEff: index to MC efficiency object
798 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
799 // named variations are to be computed)
800 // numVariation: variation index (in case of eigenvector or named variations)
801 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
802 // A dummy value will be returned in case of an error.
803
804 CalibrationDataContainer* container = m_objects[indexSF];
805 if (! container) {
806 cerr << "getScaleFactor: error retrieving container!" << endl;
807 return Analysis::kError;
808 }
809
810 // perform out-of-bound check of jet eta
811 if (!checkAbsEta(variables, indexSF)) {
812 if (m_verbose)
813 cerr << "Jet |eta| is outside of the boundary!" << endl;
814 return Analysis::kRange;
815 }
816
817 // retrieve the MC/MC scale factor
818 double MCMCSF = m_useMCMCSF ? getMCMCScaleFactor(variables, indexSF, indexEff) : 1; // if we don't want to switch generator, MCMCSF = 1, as it should be
819
820 if (!m_runEigenVectorMethod && (unc == SFEigen || unc == SFNamed || unc == SFGlobalEigen))
821 {
822 cerr << " ERROR. Trying to call eigenvector method but initialization not switched on in b-tagging configuration." << endl;
823 cerr << " Please correct your configuration first. Nominal uncertainties used. " << endl;
824 }
825
826 // Procede with eigenvariations methods i.e. return the SF variations
827 if (unc == SFEigen || unc == SFNamed || unc==SFGlobalEigen) {
828 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation;
829 try {
830 eigenVariation=m_eigenVariationsMap.at(container);
831 } catch (const std::out_of_range&) {
832 cerr << " Could not retrieve eigenvector variation, while it should have been there." << endl;
833 return Analysis::kError;
834 }
835 TH1* up=0;
836 TH1* down=0;
837 bool extrapolate = false; // store if the numVariation is the extrapolation named uncertainty index
838 if (unc == SFEigen || unc==SFNamed){
839 unsigned int maxVariations = (unc == SFEigen) ? eigenVariation->getNumberOfEigenVariations() : eigenVariation->getNumberOfNamedVariations();
840 if (numVariation > maxVariations-1) {
841 cerr << "Asked for " << ((unc == SFEigen) ? "eigenvariation" : "named variation") << " number: " << numVariation << " but overall number of available variations is: " << maxVariations << endl;
842 return Analysis::kError;
843 }
844 bool isOK = eigenVariation->getEigenvectorVariation(numVariation,up,down);
845 if (!isOK) {
846 cerr << "Eigenvector object is there but cannot retrieve up and down uncertainty histograms." << endl;
847 return Analysis::kError;
848 }
849 // the 'extrapolation' uncertainty (always a named one) needs a somewhat special treatment
850 extrapolate = (unc == SFNamed) ? eigenVariation->isExtrapolationVariation(numVariation) : false;
851
852 } else if (unc == SFGlobalEigen) {
853 std::shared_ptr<CalibrationDataGlobalEigenVariations> GEV = std::dynamic_pointer_cast<CalibrationDataGlobalEigenVariations>(eigenVariation); //dynamic_cast<std::shared_ptr<CalibrationDataGlobalEigenVariations> >(eigenVariation);
854 if (not GEV){
855 cerr << "Analysis::CalibrationDataInterfaceROOT::getScaleFactor: dynamic cast failed\n";
856 return Analysis::kError;
857 }
858 unsigned int maxVariations = GEV->getNumberOfEigenVariations(flavour); // <----- This gets the number of variations of the flavour
859 if (maxVariations == 0){
860 cerr << "Overall number of available variations is 0!" << endl;
861 return Analysis::kError;
862 }
863 if (numVariation > maxVariations-1) {
864 cerr << "Asked for global eigenvariation number: " << numVariation << " but overall number of available variations is: " << maxVariations << endl;
865 return Analysis::kError;
866 }
867 bool isOK = GEV->getEigenvectorVariation(flavour, numVariation,up,down);
868 if (!isOK) {
869 cerr << "Eigenvector object is there but cannot retrieve up and down uncertainty histograms." << endl;
870 return Analysis::kError;
871 }
872 // the 'extrapolation' uncertainty (always a named one) needs a somewhat special treatment
873 extrapolate = GEV->isExtrapolationVariation(numVariation, flavour);
874 } else {
875 std::cerr << "ERROR: you requested " << unc << " but that isn't in the set of (SFEigen, SFGlobalEigen, SFNamed) for eigenvariations. " << std::endl;
876 return Analysis::kError;
877 }
878
879 double valueUp;
880 double valueDown;
881 Analysis::CalibrationStatus statUp = container->getResult(variables, valueUp, up, extrapolate); // This is what actually retrieves results from the container
882 Analysis::CalibrationStatus statDown = container->getResult(variables, valueDown,down, extrapolate);
883
884 if (statUp == Analysis::kError || statDown == Analysis::kError)
885 return Analysis::kError;
886 if (m_otherStrategy == GiveUp)
887 assert (statUp != Analysis::kRange); // no need to test also statDown
889 assert (statUp != Analysis::kExtrapolatedRange); // no need to test also statDown
890 else if (m_otherStrategy == Flag) {
891 if (statUp == Analysis::kRange)
892 increaseCounter(indexSF);
893 else if (statUp == Analysis::kExtrapolatedRange)
895 }
896
897 result.first = MCMCSF*valueUp;
898 result.second = MCMCSF*valueDown;
899
900 // Prevent negative return values. Should the comparison be against a strict 0?
901 result.first = std::max(Analysis::CalibZERO, result.first);
902 result.second = std::max(Analysis::CalibZERO, result.second);
903
904 return statUp; // end of getScaleFactor if SFEigen, SFGlobalEigen, or SFNamed is set
905
906
907 } // The above returns the up/down varied scale factor
908 //Proceed with no-eigenvector result
909
910 // always retrieve the result itself
911 double value;
912 Analysis::CalibrationStatus status = container->getResult(variables, value);
913 if (status == Analysis::kError) {
914 cerr << "getScaleFactor: error retrieving result in non-EV context!" << endl;
915 return status;
916 }
917 if (m_otherStrategy == GiveUp){
918 assert (status != Analysis::kRange);
919 } else if (m_otherStrategy == GiveUpExtrapolated) {
920 assert (status != Analysis::kExtrapolatedRange);
921 } else if (m_otherStrategy == Flag) {
922 if (status == Analysis::kRange){
923 increaseCounter(indexSF);
924 } else if (status == Analysis::kExtrapolatedRange) {
926 }
927 }
928
929 // retrieve the statistical uncertainty if desired
930 double stat(0);
931 if (unc == Total || unc == Statistical) {
932 if (container->getStatUncertainty(variables, stat) == Analysis::kError) {
933 cerr << "getScaleFactor: error retrieving Scale factor parameter covariance matrix!" << endl;
934 return Analysis::kError;
935 }
936 }
937
938 Analysis::UncertaintyResult resSyst(0,0);
939 if (unc == Total || unc == Systematic) {
940 if (container->getSystUncertainty(variables, resSyst) == Analysis::kError) {
941 cerr << "getScaleFactor: error retrieving Scale factor parameter systematic uncertainty!" << endl;
942 return Analysis::kError;
943 }
944 } else if (unc == Extrapolation) {
945 // this uncertainty is special, since it is not normally to be combined into the overall systematic uncertainty
946 if (container->getUncertainty("extrapolation", variables, resSyst) == Analysis::kError)
947 cerr << "getScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
948 } else if (unc == TauExtrapolation) {
949 // also this uncertainty is special, since it it singles out an uncertainty relevant only for tau "jets",
950 // and some care has to be taken not to duplicate or omit uncertainties
951 if (container->getUncertainty("extrapolation from charm", variables, resSyst) == Analysis::kError)
952 cerr << "getScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
953 }
954
955 double uncertainty = combinedUncertainty(stat, resSyst);
956 result.first = MCMCSF*value;
957 result.second = MCMCSF*uncertainty;
958
959 // Prevent negative return values. Should the comparison be against a strict 0?
960 result.first = std::max(Analysis::CalibZERO, result.first);
961 return status;
962
963}
964
965//________________________________________________________________________________
968 const string& label, const string& OP,
969 Uncertainty unc, unsigned int mapIndex)
970{
971 // MC efficiency retrieval identifying the requested calibration object by name.
972 // The return value is a (value, uncertainty) pair, as documented above, and will
973 // be a dummy value in case an error occurs.
974 //
975 // variables: object holding kinematic (and other) information needed to compute the result
976 // label: jet flavour label
977 // OP: tagger operating point
978 // unc: keyword indicating what uncertainties to evaluate
979 // mapIndex: index to the efficiency map to be used
980
981 unsigned int index;
982 if (! retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, index, mapIndex)) {
983 cerr << "getMCEfficiency: unable to find Eff calibration for object " << fullName(variables.jetAuthor, OP, label, false, mapIndex) << endl;
984 // Return a dummy result if the object is not found
986 }
987
989 return (getMCEfficiency(variables, index, unc, result) == Analysis::kError) ?
990 Analysis::dummyResult : result;
991}
992
993//________________________________________________________________________________
996 unsigned int index, Uncertainty unc) //const
997{
998 // MC efficiency retrieval identifying the requested calibration object by index.
999 // The return value is a (value, uncertainty) pair, as documented above, and will
1000 // be a dummy value in case an error occurs.
1001 //
1002 // variables: object holding kinematic (and other) information needed to compute the result
1003 // index: index to calibration object
1004 // unc: keyword indicating what uncertainties to evaluate
1005
1006 Analysis::CalibResult result;
1007 return (getMCEfficiency(variables, index, unc, result) == Analysis::kError) ?
1008 Analysis::dummyResult : result;
1009}
1010
1011//________________________________________________________________________________
1014 unsigned int index, Uncertainty unc,
1015 Analysis::CalibResult& result)
1016{
1017 // MC efficiency retrieval identifying the requested calibration object by index.
1018 //
1019 // variables: object holding kinematic (and other) information needed to compute the result
1020 // index: index to calibration object
1021 // unc: keyword indicating what uncertainties to evaluate
1022 // result: (value, uncertainty) variation pair.
1023 // A dummy value will be returned in case of an error.
1024
1026 if (! container) return Analysis::kError;
1027
1028 // perform out-of-bound check of jet eta
1029 if (!checkAbsEta(variables, index)) {
1030 if (m_verbose)
1031 cerr << "Jet |eta| is outside of the boundary!" << endl;
1032 return Analysis::kRange;
1033 }
1034
1035
1036 // always retrieve the result itself
1037 double value;
1038 Analysis::CalibrationStatus status = container->getResult(variables, value);
1039 if (status == Analysis::kError) return status;
1040 if (m_otherStrategy == GiveUp)
1041 assert (status != Analysis::kRange); // no need to test also statDown
1042 else if (m_otherStrategy == Flag)
1043 if (status == Analysis::kRange)
1044 this->increaseCounter(index);
1045
1046 // retrieve the statistical uncertainty if desired
1047 double stat(0);
1048 if (unc == Total || unc == Statistical) {
1049 if (container->getStatUncertainty(variables, stat) == Analysis::kError) {
1050 cerr << "getMCEfficiency: error retrieving MC efficiency parameter covariance matrix!" << endl;
1051 return Analysis::kError;
1052 }
1053 }
1054
1055 // Temporary(?) hack: comment this out since the present MC results don't have "systematics" contributions
1056 // Analysis::UncertaintyResult resSyst(0,0);
1057 // if (unc == Total || unc == Systematic) {
1058 // if (container->getSystUncertainty(variables, resSyst) == Analysis::kError)
1059 // cerr << "getScaleFactor: error retrieving Scale factor parameter covariance matrix!"
1060 // << endl;
1061 // }
1062
1063 // since there is no combination of stat/syst uncertainties to be made, comment this out too
1064 double uncertainty = stat; // combinedUncertainty(stat, resSyst);
1065 result.first = std::max(0., std::min(1., value));
1066 result.second = uncertainty;
1067
1068 return status;
1069}
1070
1071//====================== efficiency retrieval ==========================================
1072
1073//________________________________________________________________________________
1076 const string& label,
1077 const string& OP, Uncertainty unc, const std::string& flavour,
1078 unsigned int numVariation, unsigned int mapIndex)
1079{
1080 // Data efficiency retrieval identifying the requested calibration objects by name.
1081 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor.
1082 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1083 // above, and will be a dummy value in case an error occurs.
1084 //
1085 // variables: object holding kinematic (and other) information needed to compute the result
1086 // label: jet flavour label
1087 // OP: tagger operating point
1088 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1089 // named variations are to be computed)
1090 // numVariation: variation index (in case of eigenvector or named variations)
1091 // mapIndex: index to the efficiency map to be used
1092
1093 unsigned int indexSF, indexEff;
1094 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) &&
1095 retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
1096 cerr << "getEfficiency: unable to find Eff calibration for object " << fullName(variables.jetAuthor, OP, label, false, mapIndex) << " or SF calibration for object " << fullName(variables.jetAuthor, OP, label, true) << endl;
1097 // Return a dummy result if the object is not found
1098 return Analysis::dummyResult;
1099 }
1100
1101 Analysis::CalibResult result;
1102 return (getEfficiency(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ? Analysis::dummyResult : result;
1103}
1104
1105//________________________________________________________________________________
1108 unsigned int indexSF, unsigned int indexEff,
1109 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
1110{
1111 // Data efficiency retrieval identifying the requested calibration objects by index.
1112 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor.
1113 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1114 // above, and will be a dummy value in case an error occurs.
1115 //
1116 // variables: object holding kinematic (and other) information needed to compute the result
1117 // indexSF: index to scale factor calibration object
1118 // indexEff: index to MC efficiency object
1119 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1120 // named variations are to be computed)
1121 // numVariation: variation index (in case of eigenvector or named variations)
1122
1123 Analysis::CalibResult result;
1124 return (getEfficiency(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
1125 Analysis::dummyResult : result;
1126}
1127
1128//________________________________________________________________________________
1131 unsigned int indexSF, unsigned int indexEff,
1132 Uncertainty unc, unsigned int numVariation,
1133 Analysis::CalibResult& result, const std::string& flavour)
1134{
1135 // Data efficiency retrieval identifying the requested calibration objects by index.
1136 //
1137 // variables: object holding kinematic (and other) information needed to compute the result
1138 // indexSF: index to scale factor calibration object
1139 // indexEff: index to MC efficiency object
1140 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1141 // named variations are to be computed)
1142 // numVariation: variation index (in case of eigenvector or named variations)
1143 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1144 // A dummy value will be returned in case of an error.
1145
1146 Analysis::CalibResult sfResult;
1147 Analysis::CalibrationStatus sfStatus = getScaleFactor(variables, indexSF, indexEff, unc, numVariation, sfResult, flavour);
1148 if (sfStatus == Analysis::kError) return sfStatus;
1149 Analysis::CalibResult effResult;
1150 Analysis::CalibrationStatus effStatus= getMCEfficiency(variables, indexEff, unc, effResult);
1151 if (effStatus == Analysis::kError) return effStatus;
1152
1153 double relative = 0;
1154 double value = effResult.first;
1155 if (TMath::Abs(sfResult.first) > Analysis::CalibZERO) {
1156 value = std::min(effResult.first*sfResult.first, 1.);
1157
1158 // Treat the scale factor variation cases separately since the contents of the CalibResult are different
1159 // (e.g. 'value' above contains the upward variation)
1160 if (unc == SFEigen || unc == SFNamed) {
1161 double valueDown = effResult.first*sfResult.second;
1162 result.first = value; // up/down variataions of data-efficiency
1163 result.second = valueDown;
1164 return sfStatus;
1165 }
1166 if (value > 0.) {
1167 relative = effResult.second/effResult.first;
1168 double sfRelative = sfResult.second/sfResult.first;
1169 /*
1170 cout << "sferr=" << sfResult.second
1171 << "btag Calib relative=" << relative << " sfRelative=" << sfRelative << endl;
1172 */
1173 relative = TMath::Sqrt(sfRelative*sfRelative + relative*relative);
1174 }
1175 } else {
1176 // now never happens due to protection of SF return value:
1177 cerr << "ERROR: CalibrationDataInterfaceROOT::getEfficiency: SF null result, SF=" << sfResult.first << " MC eff=" << effResult.first << "; setting SF=1." << endl;
1178 relative = Analysis::dummyValue;
1179 }
1180
1181 result.first = value;
1182 result.second = value*relative;
1183 // "Select" the status code for the actual calibration (it is subject to more constraints)
1184 return sfStatus;
1185}
1186
1187
1188//________________________________________________________________________________
1191 const string& label,
1192 const string& OP, Uncertainty unc,
1193 unsigned int numVariation, unsigned int mapIndex)
1194{
1195 // Inefficiency scale factor retrieval identifying the requested calibration objects by name.
1196 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1197 // the inefficiency scale factor is then computed as the ratio of data to MC inefficiencies.
1198 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1199 // above, and will be a dummy value in case an error occurs.
1200 //
1201 // variables: object holding kinematic (and other) information needed to compute the result
1202 // label: jet flavour label
1203 // OP: tagger operating point
1204 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1205 // named variations are to be computed)
1206 // numVariation: variation index (in case of eigenvector or named variations)
1207 // mapIndex: index to the efficiency map to be used
1208
1209 unsigned int indexSF, indexEff;
1210 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) &&
1211 retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
1212 cerr << "getInefficiencyScaleFactor: unable to find Eff calibration for object "
1213 << fullName(variables.jetAuthor, OP, label, false, mapIndex)
1214 << " or SF calibration for object "
1215 << fullName(variables.jetAuthor, OP, label, true) << endl;
1216 // Return a dummy result if the object is not found
1217 return Analysis::dummyResult;
1218 }
1219
1220 Analysis::CalibResult result;
1221 return (getInefficiencyScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, label) == Analysis::kError) ?
1222 Analysis::dummyResult : result;
1223}
1224
1225//________________________________________________________________________________
1228 unsigned int indexSF, unsigned int indexEff,
1229 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
1230{
1231 // Inefficiency scale factor retrieval identifying the requested calibration objects by index.
1232 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1233 // the inefficiency scale factor is then computed as the ratio of data to MC inefficiencies.
1234 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1235 // above, and will be a dummy value in case an error occurs.
1236 //
1237 // variables: object holding kinematic (and other) information needed to compute the result
1238 // indexSF: index to scale factor calibration object
1239 // indexEff: index to MC efficiency object
1240 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1241 // named variations are to be computed)
1242 // numVariation: variation index (in case of eigenvector or named variations)
1243
1244 Analysis::CalibResult result;
1245 return (getInefficiencyScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
1246 Analysis::dummyResult : result;
1247}
1248
1249//________________________________________________________________________________
1252 unsigned int indexSF, unsigned int indexEff,
1253 Uncertainty unc, unsigned int numVariation,
1254 Analysis::CalibResult& result, const std::string& flavour)
1255{
1256 // Inefficiency scale factor retrieval identifying the requested calibration objects by index.
1257 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1258 // the inefficiency scale factor is then computed as the ratio of data to MC inefficiencies.
1259 //
1260 // variables: object holding kinematic (and other) information needed to compute the result
1261 // indexSF: index to scale factor calibration object
1262 // indexEff: index to MC efficiency object
1263 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1264 // named variations are to be computed)
1265 // numVariation: variation index (in case of eigenvector or named variations)
1266 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1267 // A dummy value will be returned in case of an error.
1268
1269 Analysis::CalibResult sfResult;
1270 Analysis::CalibrationStatus sfStatus = getScaleFactor(variables, indexSF, indexEff, unc, numVariation, sfResult, flavour);
1271 if (sfStatus == Analysis::kError) return sfStatus;
1272 Analysis::CalibResult effResult;
1273 Analysis::CalibrationStatus effStatus= getMCEfficiency(variables, indexEff, unc, effResult);
1274 if (effStatus == Analysis::kError) return effStatus;
1275
1276 double eff = std::min(effResult.first, 1.);
1277 // double efferr = effResult.second; // not needed as (per the code change indicated below) we are not doing anything with MC statistical uncertainties
1278 double sf = sfResult.first;
1279 double sferr = sfResult.second;
1280
1281 double val = 0.; // Analysis::dummyValue;
1282 double err = 0.; // Analysis::dummyValue;
1283 if (1. - eff > CalibZERO) {
1284 // Protect against negative scale factors
1285 val = std::max((1. - eff*sf), CalibZERO) / (1. - eff);
1286 // Treat the scale factor variation cases separately since the contents of the CalibResult are different
1287 // ('sf' and 'sferr' above contain the upward and downward variations, respectively).
1288 if (unc == SFEigen || unc == SFNamed) {
1289 double valDown = std::max((1. - eff*sferr), CalibZERO) / (1. - eff);
1290 result.first = val;
1291 result.second = valDown;
1292 return sfStatus;
1293 }
1294 // When using eigenvector (or named) variations (as above), only scale factor variations are considered.
1295 // For the sake of consistency, it has been decided (see https://its.cern.ch/jira/browse/AFT-350) to remove them also when EV variations aren't used
1296 //err = pow((1. - sf) / (1. - eff) * efferr, 2) + pow(eff*sferr, 2);
1297 err = pow(eff*sferr, 2);
1298 if (err > 0.)
1299 err = 1./(1. - eff) * TMath::Sqrt(err);
1300 // cout << "btag Calib Ineff err=" << err << endl;
1301 }
1302
1303 result.first = std::max(CalibZERO, val);
1304 result.second = err;
1305 // "Select" the status code for the actual calibration (it is subject to more constraints)
1306 return sfStatus;
1307}
1308
1309//________________________________________________________________________________
1312 const string& label,
1313 const string& OP, Uncertainty unc,
1314 unsigned int numVariation, unsigned int mapIndex)
1315{
1316 // Data inefficiency retrieval identifying the requested calibration objects by name.
1317 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1318 // the inefficiency is then computed as the 1 minus the efficiency.
1319 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1320 // above, and will be a dummy value in case an error occurs.
1321 //
1322 // variables: object holding kinematic (and other) information needed to compute the result
1323 // label: jet flavour label
1324 // OP: tagger operating point
1325 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1326 // named variations are to be computed)
1327 // numVariation: variation index (in case of eigenvector or named variations)
1328 // mapIndex: index to the efficiency map to be used
1329
1330 unsigned int indexSF, indexEff;
1331 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) &&
1332 retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
1333 cerr << "getInefficiency: unable to find Eff calibration for object "
1334 << fullName(variables.jetAuthor, OP, label, false, mapIndex)
1335 << " or SF calibration for object "
1336 << fullName(variables.jetAuthor, OP, label, true) << endl;
1337 // Return a dummy result if the object is not found
1338 return Analysis::dummyResult;
1339 }
1340
1341 Analysis::CalibResult result;
1342 return (getInefficiency(variables, indexSF, indexEff, unc, numVariation, result, label) == Analysis::kError) ?
1343 Analysis::dummyResult : result;
1344}
1345
1346//________________________________________________________________________________
1349 unsigned int indexSF, unsigned int indexEff,
1350 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
1351{
1352 // Data inefficiency retrieval identifying the requested calibration objects by index.
1353 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1354 // the inefficiency is then computed as the 1 minus the efficiency.
1355 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1356 // above, and will be a dummy value in case an error occurs.
1357 //
1358 // variables: object holding kinematic (and other) information needed to compute the result
1359 // indexSF: index to scale factor calibration object
1360 // indexEff: index to MC efficiency object
1361 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1362 // named variations are to be computed)
1363 // numVariation: variation index (in case of eigenvector or named variations)
1364
1365 Analysis::CalibResult result;
1366 return (getInefficiency(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
1367 Analysis::dummyResult : result;
1368}
1369
1370//________________________________________________________________________________
1373 unsigned int indexSF, unsigned int indexEff,
1374 Uncertainty unc, unsigned int numVariation,
1375 Analysis::CalibResult& result, const std::string& flavour)
1376{
1377 // Data inefficiency retrieval identifying the requested calibration objects by index.
1378 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1379 // the inefficiency is then computed as the 1 minus the efficiency.
1380 //
1381 // variables: object holding kinematic (and other) information needed to compute the result
1382 // indexSF: index to scale factor calibration object
1383 // indexEff: index to MC efficiency object
1384 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1385 // named variations are to be computed)
1386 // numVariation: variation index (in case of eigenvector or named variations)
1387 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1388 // A dummy value will be returned in case of an error.
1389
1390 Analysis::CalibResult sfResult;
1391 Analysis::CalibrationStatus sfStatus = getScaleFactor(variables, indexSF, indexEff, unc, numVariation, sfResult, flavour);
1392 if (sfStatus == Analysis::kError) return sfStatus;
1393 Analysis::CalibResult effResult;
1394 Analysis::CalibrationStatus effStatus= getMCEfficiency(variables, indexEff, unc, effResult);
1395 if (effStatus == Analysis::kError) return effStatus;
1396
1397 double val = std::max(0., 1. - effResult.first * sfResult.first);
1398 double err = 0.; // Analysis::dummyValue;
1399
1400 // Bail out here if not both results are strictly positive
1401 if (effResult.first <= 0. || sfResult.first <= 0.) return Analysis::kError;
1402
1403 // Treat the scale factor variation cases separately since the contents of the CalibResult are different
1404 // (e.g. 'val' above contains the upward variation)
1405 if (unc == SFEigen || unc == SFNamed) {
1406 double valDown = std::max(0., 1. - effResult.first*sfResult.second);
1407
1408 result.first = val;
1409 result.second = valDown;
1410 } else {
1411 // safer than pow(x, 2):
1412 err = effResult.second/effResult.first*effResult.second/effResult.first
1413 + sfResult.second/sfResult.first*sfResult.second/sfResult.first;
1414 err = val*TMath::Sqrt(err);
1415
1416 result.first = std::max(0., std::min(1., val));
1417 result.second = err;
1418 }
1419
1420 // "Select" the status code for the actual calibration (it is subject to more constraints)
1421 return sfStatus;
1422}
1423
1424//________________________________________________________________________________
1427 const string& label, const string& OP,
1428 Uncertainty unc, unsigned int mapIndex)
1429{
1430 // Data inefficiency retrieval identifying the requested calibration objects by name.
1431 // The inefficiency is computed as the 1 minus the efficiency.
1432 // The return value is a (value, uncertainty), as documented above, and will be a dummy value
1433 // in case an error occurs.
1434 //
1435 // variables: object holding kinematic (and other) information needed to compute the result
1436 // label: jet flavour label
1437 // OP: tagger operating point
1438 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1439 // named variations are to be computed)
1440 // numVariation: variation index (in case of eigenvector or named variations)
1441 // mapIndex: index to the efficiency map to be used
1442
1443 Analysis::CalibResult effResult = getMCEfficiency(variables, label, OP, unc, mapIndex);
1444 return std::make_pair(std::max(0., 1. - effResult.first), effResult.second);
1445}
1446
1447//________________________________________________________________________________
1450 unsigned int index, Uncertainty unc)
1451{
1452 // MC inefficiency retrieval identifying the requested calibration object by index.
1453 // The inefficiency is computed as the 1 minus the efficiency.
1454 // The return value is a (value, uncertainty), as documented above, and will be a dummy value
1455 // in case an error occurs.
1456 //
1457 // variables: object holding kinematic (and other) information needed to compute the result
1458 // index: index to MC efficiency object
1459 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1460 // named variations are to be computed)
1461 // numVariation: variation index (in case of eigenvector or named variations)
1462
1463 Analysis::CalibResult effResult = getMCEfficiency(variables, index, unc);
1464 return std::make_pair(std::max(0., 1. - effResult.first), effResult.second);
1465}
1466
1467//________________________________________________________________________________
1468double
1470 unsigned indexSF, unsigned int indexEff) const
1471{
1472 // Retrieve the MC/MC scale factor given the set of scale factor and efficiency indices.
1473 // variables: object holding kinematic (and other) information needed to compute the result
1474 // indexSF: index to scale factor calibration object
1475 // indexEff: index to MC efficiency object
1476
1477 // If either reference doesn't exist, or if they are the same, nothing can / needs to be done.
1478 int indexSFRef = m_hadronisationReference[indexSF], indexEffRef = m_hadronisationReference[indexEff];
1479 if (indexSFRef < 0 || indexEffRef < 0 || indexSFRef == indexEffRef) return 1;
1480
1481 // Verify also that the individual efficiencies are physically meaningful.
1482 double effSFRef; m_objects[indexSFRef]->getResult(variables, effSFRef);
1483 double effEffRef; m_objects[indexEffRef]->getResult(variables, effEffRef);
1484 return (effSFRef > 0 && effEffRef > 0) ? effSFRef/effEffRef : 1;
1485}
1486
1487//________________________________________________________________________________
1490 const string& label, Uncertainty unc,
1491 unsigned int numVariation, unsigned int mapIndex)
1492{
1493 // #1
1494 // Tag weight fraction scale factor retrieval identifying the requested calibration object by name.
1495 // The return value is either a (value, uncertainty) or (if eigenvector or named variations are specified)
1496 // an (up, down) variation pair, and will be a dummy value in case an error occurs.
1497 // Note that in contrast to the "regular" (non-continuous) case, the computation of the scale factor in
1498 // general needs the (selection- or even process-specific) MC tag weight fractions, in order to rescale
1499 // scale factors. This is used to ensure that the tag weight fractions (both in data and in MC) sum up to
1500 // unity for each given kinematic bin.
1501 //
1502 // variables: object holding kinematic (and other) information needed to compute the result
1503 // label: jet flavour label
1504 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1505 // named variations are to be computed)
1506 // numVariation: variation index (in case of eigenvector or named variations)
1507 // mapIndex: index to the MC efficiency map to be used for scale factor rescaling
1508
1509 static const string cont("Continuous");
1510
1511 unsigned int indexSF, indexEff;
1512 if (! (retrieveCalibrationIndex (label, cont, variables.jetAuthor, false, indexEff, mapIndex) &&
1513 retrieveCalibrationIndex (label, cont, variables.jetAuthor, true, indexSF))) {
1514 cerr << "getWeightScaleFactor: unable to find Eff calibration for object "
1515 << fullName(variables.jetAuthor, cont, label, false, mapIndex)
1516 << " or SF calibration for object "
1517 << fullName(variables.jetAuthor, cont, label, true) << endl;
1518 return Analysis::dummyResult;
1519 }
1520
1521 Analysis::CalibResult result;
1522 return (getWeightScaleFactor(variables, indexSF, indexEff, unc, numVariation, result) == Analysis::kError) ? Analysis::dummyResult : result;
1523}
1524
1525//________________________________________________________________________________
1528 unsigned int indexSF, unsigned int indexEff,
1529 Uncertainty unc, unsigned int numVariation)
1530{
1531 // #2
1532 // Tag weight fraction scale factor retrieval identifying the requested calibration object by index.
1533 // The return value is either a (value, uncertainty) or (if eigenvector or named variations are specified)
1534 // an (up, down) variation pair, and will be a dummy value in case an error occurs.
1535 // Note that in contrast to the "regular" (non-continuous) case, the computation of the scale factor in
1536 // general needs the (selection- or even process-specific) MC tag weight fractions, in order to rescale
1537 // scale factors. This is used to ensure that the tag weight fractions (both in data and in MC) sum up to
1538 // unity for each given kinematic bin.
1539 //
1540 // variables: object holding kinematic (and other) information needed to compute the result
1541 // indexSF: index to calibration object
1542 // indexEff: index to MC tag weight
1543 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1544 // named variations are to be computed)
1545 // numVariation: variation index (in case of eigenvector or named variations)
1546
1547 Analysis::CalibResult result;
1548 return (getWeightScaleFactor(variables, indexSF, indexEff, unc, numVariation, result) == Analysis::kError) ?
1549 Analysis::dummyResult : result;
1550}
1551
1552//________________________________________________________________________________
1555 unsigned int indexSF, unsigned int indexEff,
1556 Uncertainty unc, unsigned int numVariation,
1557 Analysis::CalibResult& result)
1558{
1559 // #3
1560 // Tag weight fraction scale factor retrieval identifying the requested calibration object by index.
1561 // Note that in contrast to the "regular" (non-continuous) case, the computation of the scale factor in
1562 // general needs the (selection- or even process-specific) MC tag weight fractions, in order to rescale
1563 // scale factors. This is used to ensure that the tag weight fractions (both in data and in MC) sum up to
1564 // unity for each given kinematic bin.
1565 //
1566 // variables: object holding kinematic (and other) information needed to compute the result
1567 // indexSF: index to calibration object
1568 // indexEff: index to MC tag weight
1569 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1570 // named variations are to be computed)
1571 // numVariation: variation index (in case of eigenvector or named variations)
1572 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1573 // A dummy value will be returned in case of an error.
1574 CalibrationDataContainer* container = m_objects[indexSF];
1575 if (! container) return Analysis::kError;
1576 CalibrationDataContainer* effContainer = m_objects[indexEff];
1577 if (! effContainer) return Analysis::kError;
1578
1579 // the first time this combination of scale factor and "efficiency" objects is given, check on the
1580 // scale factors that will result from their combination (where the computations reproduce those
1581 // shown below)
1582 checkWeightScaleFactors(indexSF, indexEff);
1583
1584 // perform out-of-bound check of jet eta
1585 if (!checkAbsEta(variables, indexSF)) {
1586 if (m_verbose)
1587 cerr << "Jet |eta| is outside of the boundary!" << endl;
1588 return Analysis::kRange;
1589 }
1590
1591 // Always retrieve the result itself
1592 double value;
1593 Analysis::CalibrationStatus status = container->getResult(variables, value);
1594 if (status == Analysis::kError) return status;
1595 if (m_otherStrategy == GiveUp) assert (status != Analysis::kRange);
1596 else if (m_otherStrategy == GiveUpExtrapolated) assert (status != Analysis::kExtrapolatedRange);
1597 else if (m_otherStrategy == Flag) {
1598 if (status == Analysis::kRange)
1599 increaseCounter(indexSF);
1600 else if (status == Analysis::kExtrapolatedRange)
1601 increaseCounter(indexSF, Extrapolated);
1602 }
1603
1604 // Retrieve the reference MC tag weight fraction (corresponding to the calibration scale factors)
1605 Analysis::UncertaintyResult refMCResult(0,0);
1606 if (container->getUncertainty("MCreference", variables, refMCResult) == Analysis::kError)
1607 return Analysis::kError;
1608 double fracMCref = refMCResult.first;
1609 // Retrieve the MC reference information, if requested (the initialisation below is to make sure
1610 // that no exceptions in the code will be needed)
1611 double fracSFref = fracMCref, fracEffref = fracMCref;
1612 if (m_useMCMCSF) {
1613 int indexSFref = m_hadronisationReference[indexSF], indexEffref = m_hadronisationReference[indexEff];
1614 if (indexSFref < 0 || indexEffref < 0) {
1615 cerr << "getWeightScaleFactor: error: generator-specific corrections requested but necessary reference containers lacking " << endl;
1616 return Analysis::kError;
1617 } else {
1618 m_objects[indexSFref]->getResult(variables, fracSFref);
1619 m_objects[indexEffref]->getResult(variables, fracEffref);
1620 if (! (fracSFref > 0. && fracEffref > 0.)) {
1621 cerr << "getWeightScaleFactor: error: invalid reference tag weight fraction " <<fracSFref <<" " <<fracEffref << std::endl;
1622 return Analysis::kError;
1623 }
1624 }
1625 }
1626
1627 // Retrieve the MC tag weight fraction for the sample we need to reweight to
1628 double fracMCnew;
1629 Analysis::CalibrationStatus effStatus = effContainer->getResult(variables, fracMCnew);
1630 if (effStatus == Analysis::kError) return effStatus;
1631 if (m_otherStrategy == GiveUp) assert (effStatus != Analysis::kRange);
1632 else if (m_otherStrategy == Flag)
1633 if (effStatus == Analysis::kRange) increaseCounter(indexEff);
1634 // since we need to divide by this quantity, check that it is well-defined
1635 if (!(fracMCnew > 0.) and m_useTopologyRescaling) {// but we only care if using topology rescaling
1636 cerr << "getWeightScaleFactor: error: null fracMCnew would lead to invalid operation" << endl;
1637 return Analysis::kError;
1638 }
1639
1640 if (!m_runEigenVectorMethod && (unc == SFEigen || unc == SFNamed)) {
1641 cerr << "getWeightScaleFactor: ERROR. Trying to call eigenvector method but initialization not switched on in b-tagging .env config file." << endl;
1642 cerr << " Please correct your .env config file first. Nominal uncertainties used. " << endl;
1643 }
1644
1645 if (unc == SFEigen || unc == SFNamed) {
1646 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation;
1647 try {
1648 eigenVariation = m_eigenVariationsMap.at(container);
1649 } catch (const std::out_of_range&) {
1650 cerr << "getWeightScaleFactor: could not retrieve eigenvector variation, while it should have been there." << endl;
1651 return Analysis::kError;
1652 }
1653 unsigned int maxVariations = (unc == SFEigen) ? eigenVariation->getNumberOfEigenVariations() : eigenVariation->getNumberOfNamedVariations();
1654 if (numVariation > maxVariations-1) {
1655 cerr << "getWeightScaleFactor: asked for " << ((unc == SFEigen) ? "eigenvariation" : "named variation") << " number: " << numVariation << " but overall number of available variations is: " << maxVariations << endl;
1656 return Analysis::kError;
1657 }
1658 TH1* up=0;
1659 TH1* down=0;
1660 bool isOK = (unc == SFEigen) ? eigenVariation->getEigenvectorVariation(numVariation,up,down) : eigenVariation->getNamedVariation(numVariation,up,down);
1661 if (!isOK) {
1662 cerr << "getWeightScaleFactor: Eigenvector object is there but cannot retrieve up and down uncertainty histograms." << endl;
1663 return Analysis::kError;
1664 }
1665 // the 'extrapolation' uncertainty (always a named one) needs a somewhat special treatment
1666 bool extrapolate = ( unc == SFNamed ) ? eigenVariation->isExtrapolationVariation(numVariation) : false;
1667
1668 double valueUp;
1669 double valueDown;
1670 Analysis::CalibrationStatus statusUp = container->getResult(variables, valueUp, up, extrapolate);
1671 Analysis::CalibrationStatus statusDown = container->getResult(variables, valueDown,down, extrapolate);
1672 if (statusUp == Analysis::kError || statusDown == Analysis::kError) return Analysis::kError;
1673
1674 // now carry out the rescaling. Protect against unphysical or suspiciously large scale factors
1675 double variationUp = valueUp - value;
1676 double variationDown = valueDown - value;
1677 // First step: from the calibration sample to its reference sample
1678 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracMCref / fracSFref);
1679 // Second step: from the calibration reference sample to the MC object's reference sample
1680 if (m_useMCMCSF) value *= (fracSFref / fracEffref);
1681 // Third step: from the MC object's reference sample to the MC sample itself
1682 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracEffref / fracMCnew);
1683 // Since all transformations of the scale factor itself are linear, the transformation of the variations is simpler.
1685 double f = (fracMCref / fracMCnew);
1686 variationUp *= f;
1687 variationDown *= f;
1688 } else if (m_useMCMCSF) {
1689 double f = (fracSFref / fracEffref);
1690 variationUp *= f;
1691 variationDown *= f;
1692 }
1693 valueUp = value + variationUp;
1694 valueDown = value + variationDown;
1695 if (valueUp < 0) {
1696 valueUp = 0; increaseCounter(indexSF, TagWeight);
1697 } else if (valueUp > m_maxTagWeight) {
1698 valueUp = m_maxTagWeight; increaseCounter(indexSF, TagWeight);
1699 }
1700 if (valueDown < 0) {
1701 valueDown = 0; increaseCounter(indexSF, TagWeight);
1702 } else if (valueDown > m_maxTagWeight) {
1703 valueDown = m_maxTagWeight; increaseCounter(indexSF, TagWeight);
1704 }
1705
1706 result.first = valueUp;
1707 result.second = valueDown;
1708 return statusUp;
1709 } //end eigenvector method
1710
1711 //Proceed with no-eigenvector result
1712
1713 // retrieve the statistical uncertainty if desired
1714 double stat(0);
1715 if (unc == Total || unc == Statistical) {
1716 if (container->getStatUncertainty(variables, stat) == Analysis::kError) {
1717 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter covariance matrix!" << endl;
1718 return Analysis::kError;
1719 }
1720 }
1721 Analysis::UncertaintyResult uncertaintyResult(0,0);
1722 if (unc == Total || unc == Systematic) {
1723 if (container->getSystUncertainty(variables, uncertaintyResult) == Analysis::kError) {
1724 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter systematic uncertainty!" << endl;
1725 return Analysis::kError;
1726 }
1727 } else if (unc == Extrapolation) {
1728 // this uncertainty is special, since it is not normally to be combined into the overall systematic uncertainty
1729 if (container->getUncertainty("extrapolation", variables, uncertaintyResult) == Analysis::kError)
1730 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
1731 } else if (unc == TauExtrapolation) {
1732 // also this uncertainty is special, since it it singles out an uncertainty relevant only for tau "jets",
1733 // and some care has to be taken not to duplicate or omit uncertainties
1734 if (container->getUncertainty("extrapolation from charm", variables, uncertaintyResult) == Analysis::kError)
1735 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
1736 }
1737
1738 double uncertainty = combinedUncertainty(stat, uncertaintyResult);
1739
1740 // Now carry out the rescaling. Again protect against unphysical or suspiciously large scale factors
1741 // First step: from the calibration sample to its reference sample
1742 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracMCref / fracSFref);
1743 // Second step: from the calibration reference sample to the MC object's reference sample
1744 if (m_useMCMCSF) value *= (fracSFref / fracEffref);
1745 // Third step: from the MC object's reference sample to the MC sample itself
1746 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracEffref / fracMCnew);
1747 if (value < 0) {
1748 value = 0; increaseCounter(indexSF, TagWeight);
1749 } else if (value > m_maxTagWeight) {
1750 value = m_maxTagWeight; increaseCounter(indexSF, TagWeight);
1751 }
1752 // Since all transformations of the scale factor itself are linear, the transformation of the uncertainty is simpler.
1754 uncertainty *= (fracMCref / fracMCnew);
1755 } else if (m_useMCMCSF) {
1756 uncertainty *= (fracSFref / fracEffref);
1757 }
1758
1759 result.first = std::max(0., value);
1760 result.second = uncertainty;
1761 // "Select" the status code for the actual calibration object (it is subject to more constraints)
1762 return status;
1763}
1764
1765//________________________________________________________________________________
1766void
1768 unsigned int indexEff)
1769{
1770 // Check the tag weight scale factors that would result from the combination of
1771 // the provided scale factor and MC tag weight objects.
1772 // The way this is done is by determining the binning that would apply to the
1773 // combination of the two individual inputs, and then by explicitly computing
1774 // the scale factors in each of these resulting bins.
1775
1776 std::vector<std::pair<unsigned int, unsigned int> >::const_iterator it = std::find(m_checkedWeightScaleFactors.begin(), m_checkedWeightScaleFactors.end(), std::make_pair(indexSF, indexEff));
1777 if (it != m_checkedWeightScaleFactors.end()) return;
1778
1779
1780 // Assume that only histogram containers are involved here (this should be the case
1781 // as at least a strict tag weight binning should be applied).
1783 if (! container) {
1784 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: error: container for object " << nameFromIndex(indexSF) << " not found!" << endl;
1785 return;
1786 } else if (! container->GetValue("MCreference")) {
1787 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: error: no MCreference histogram for object " << nameFromIndex(indexSF) << "!" << endl;
1788 return;
1789 }
1790 CalibrationDataHistogramContainer* effContainer = dynamic_cast<CalibrationDataHistogramContainer*>(m_objects[indexEff]);
1791 if (! effContainer) {
1792 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: error: container for object " << nameFromIndex(indexEff) << " not found!" << endl;
1793 return;
1794 }
1795
1796 // Retrieve the variable types and corresponding bin boundaries
1797 std::vector<unsigned int> vars = container->getVariableTypes();
1798 std::vector<unsigned int> effVars = effContainer->getVariableTypes();
1799 // Retrieve the corresponding bin boundaries
1800 std::map<unsigned int, std::vector<double> > boundaries, effBoundaries, mergedBoundaries;
1801 for (unsigned int t = 0; t < vars.size(); ++t)
1802 boundaries[vars[t]] = container->getBinBoundaries(vars[t]);
1803 for (unsigned int t = 0; t < effVars.size(); ++t)
1804 effBoundaries[effVars[t]] = effContainer->getBinBoundaries(effVars[t]);
1805
1806 // Special case: handle |eta| versus eta differences, by transforming to the latter
1807 if (boundaries.find(CalibrationDataContainer::kEta) == boundaries.end() && boundaries.find(CalibrationDataContainer::kAbsEta) != boundaries.end()) {
1809 boundaries.erase(CalibrationDataContainer::kAbsEta);
1810 }
1811 if (effBoundaries.find(CalibrationDataContainer::kEta) == effBoundaries.end() && effBoundaries.find(CalibrationDataContainer::kAbsEta) != effBoundaries.end()) {
1813 effBoundaries.erase(CalibrationDataContainer::kAbsEta);
1814 }
1815 if (boundaries.find(CalibrationDataContainer::kEta) != boundaries.end() && effBoundaries.find(CalibrationDataContainer::kEta) != effBoundaries.end()) {
1816 std::vector<double>& v = boundaries[CalibrationDataContainer::kEta];
1817 std::vector<double>& vEff = effBoundaries[CalibrationDataContainer::kEta];
1818 if (v[0] < 0 && vEff[0] >= 0) {
1819 // in this case, supplement the positive entries in vEff with their negative analogues
1820 std::vector<double> vtmp(vEff);
1821 for (std::vector<double>::iterator it = vtmp.begin(); it != vtmp.end(); ++it)
1822 if (*it > 0) vEff.insert(vEff.begin(), -(*it));
1823 } else if (v[0] >= 0 && vEff[0] < 0) {
1824 // in this case, supplement the positive entries in v with their negative analogues
1825 std::vector<double> vtmp(v);
1826 for (std::vector<double>::iterator it = vtmp.begin(); it != vtmp.end(); ++it)
1827 if (*it > 0) v.insert(v.begin(), -(*it));
1828 }
1829 }
1830
1831 // Now that the individual sets of boundaries have been determined, merge these
1832 for (unsigned int t = 0; t < vars.size(); ++t) {
1833 if (effBoundaries.find(vars[t]) == effBoundaries.end())
1834 // Variables not present in the efficiency object can go in unmodified
1835 mergedBoundaries[vars[t]] = boundaries[vars[t]];
1836 else {
1837 // Merge the boundaries for variables existing in both objects.
1838 // Take the MC array as a starting point, as it's likely to be the longest.
1839 mergedBoundaries[vars[t]] = effBoundaries[vars[t]];
1840
1841 for (std::vector<double>::iterator it = boundaries[vars[t]].begin(); it != boundaries[vars[t]].end(); ++it) {
1842 std::vector<double>::iterator itcmp = mergedBoundaries[vars[t]].begin();
1843 // Iterate until we've found a value in the target array equal to
1844 // or larger than the given element
1845 while (itcmp != mergedBoundaries[vars[t]].end() &&
1846 (! CalibrationDataContainer::isNearlyEqual(*itcmp, *it)) &&
1847 *itcmp < *it) ++itcmp;
1848 // Nothing needs to be done if the values are "nearly identical"
1849 // (or if we don't find such an element).
1850 if (itcmp == mergedBoundaries[vars[t]].end() || CalibrationDataContainer::isNearlyEqual(*itcmp, *it)) continue;
1851 // Otherwise insert the given element (this can mean adding to the end)
1852 mergedBoundaries[vars[t]].insert(itcmp, *it);
1853 }
1854 }
1855 }
1856 // Variables not present in the scale factor object still need to go in
1857 for (unsigned int t = 0; t < effVars.size(); ++t)
1858 if (boundaries.find(effVars[t]) == boundaries.end())
1859 mergedBoundaries[effVars[t]] = effBoundaries[effVars[t]];
1860
1861 // Carry out a rudimentary cross-check of the tag weight bin
1862 // (the binning used for the scale factor and MC objects should be identical).
1863 if (boundaries.find(CalibrationDataContainer::kTagWeight) == boundaries.end()) {
1864 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: " << "no tag weight axis found for object " << nameFromIndex(indexSF) << endl;
1865 } else if (effBoundaries.find(CalibrationDataContainer::kTagWeight) == effBoundaries.end()) {
1866 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: " << "no tag weight axis found for object " << nameFromIndex(indexEff) << endl;
1867 } else if (boundaries[CalibrationDataContainer::kTagWeight].size() != effBoundaries[CalibrationDataContainer::kTagWeight].size()) {
1868 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: " << "different tag weight binning for objects " << nameFromIndex(indexSF) << " (";
1869 std::vector<double>& v = boundaries[CalibrationDataContainer::kTagWeight];
1870 for (unsigned int ib = 0; ib < v.size()-1; ++ib) cerr << v[ib] << ",";
1871 cerr << v[v.size()-1] << ") and " << nameFromIndex(indexEff) << " (";
1872 v = effBoundaries[CalibrationDataContainer::kTagWeight];
1873 for (unsigned int ib = 0; ib < v.size()-1; ++ib) cerr << v[ib] << ",";
1874 cerr << v[v.size()-1] << ") do not match!" << endl;
1875 } else {
1876 // Make sure that (possibly) dummy vectors exist for _all_ known variables
1877 // (this is a mere technicality allowing to loop over all variables explicitly).
1878 mergedBoundaries.try_emplace(CalibrationDataContainer::kPt, std::vector<double>{20.,300.});
1879 mergedBoundaries.try_emplace(CalibrationDataContainer::kEta, std::vector<double>{-2.5, 2.5});
1880
1881 // Finally, carry out the cross-check that all this is about: recompute the scale factor
1882 // in each pseudo-bin
1883 if (m_verbose){
1884 cout << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: cross-checking scale factors for objects " << nameFromIndex(indexSF) << " and " << nameFromIndex(indexEff) << "\n" << std::setfill('-') << std::setw(100) << "-" << endl;
1885 cout << std::setfill(' ');
1886 }
1888 std::vector<double>& vPt = mergedBoundaries[CalibrationDataContainer::kPt], vEta = mergedBoundaries[CalibrationDataContainer::kEta], vTagWeight = mergedBoundaries[CalibrationDataContainer::kTagWeight];
1889 for (unsigned int ipt = 0; ipt < vPt.size()-1; ++ipt) {
1890 x.jetPt = (vPt[ipt] + vPt[ipt+1]) * 500.; // account for MeV -> GeV conversion
1891 for (unsigned int ieta = 0; ieta < vEta.size()-1; ++ieta) {
1892 x.jetEta = (vEta[ieta] + vEta[ieta+1]) / 2.;
1893 for (unsigned int iwt = 0; iwt < vTagWeight.size()-1; ++iwt) {
1894 x.jetTagWeight = (vTagWeight[iwt] + vTagWeight[iwt+1]) / 2.;
1895 // Retrieve the central scale factor value and the old and new MC tag weight fractions
1896 double value;
1897 container->getResult(x, value);
1898 Analysis::UncertaintyResult uncertaintyResult(0,0);
1899 container->getUncertainty("MCreference", x, uncertaintyResult);
1900 double fracMCref = uncertaintyResult.first;
1901 double fracMCnew;
1902 effContainer->getResult(x, fracMCnew);
1903 // Compute the new scale factor value
1904 if (!(fracMCnew > 0.)) {
1905 cout << "\tfor (pt=" << x.jetPt << ",eta=" << x.jetEta << ",tagweight=" << x.jetTagWeight << "): invalid new MC fraction: " << fracMCnew << endl;
1906 } else {
1907 double newvalue = 1.0 + (value - 1.0) * fracMCref/fracMCnew;
1908 if (newvalue <= 0 || newvalue > m_maxTagWeight) cout << "\tfor (pt=" << x.jetPt << ",eta=" << x.jetEta << ",tagweight=" << x.jetTagWeight << "): old (value=" << value << ",MC=" << fracMCref << "), new (value=" << newvalue << ",MC=" << fracMCnew << ")" << endl;
1909 }
1910 }
1911 }
1912 }
1913 }
1914
1915 m_checkedWeightScaleFactors.push_back(std::make_pair(indexSF, indexEff));
1916}
1917
1918//________________________________________________________________________________
1919bool
1921 unsigned int index)
1922{
1923 // Check whether the jet eta value is outside the range of validity, subject to the strategy
1924 // specified in the configuration file.
1925 bool pass = true;
1926 if (m_absEtaStrategy == Ignore) return pass;
1927
1928 switch (m_absEtaStrategy) {
1929 case GiveUp:
1930 if (std::fabs(variables.jetEta) > m_maxAbsEta) {
1931 pass = false;
1932 }
1933 break;
1934 case Flag:
1935 default:
1936 if (std::fabs(variables.jetEta) > m_maxAbsEta) {
1938 }
1939
1940 }
1941 return pass;
1942}
1943
1944//________________________________________________________________________________
1945std::string
1947{
1948 // Return the object name corresponding to the given index.
1949
1950 for (std::map<std::string, unsigned int>::const_iterator it = m_objectIndices.begin();
1951 it != m_objectIndices.end(); ++it)
1952 if (it->second == index) return it->first;
1953
1954 // This should never happen..
1955 return string("");
1956}
1957
1958//________________________________________________________________________________
1959void
1961 OutOfBoundsType oob)
1962{
1963 // Internal method bumping the relevant counter out-of-bounds counter for the specified object.
1964 //
1965 // oob: further classification of out-of-bounds case
1966 // index: object index
1967
1968 // make sure the vectors are appropriately dimensioned
1969 if (index >= m_mainCounters.size()) {
1970 unsigned int minsize = (index == 0) ? 2 : 2*index;
1971 m_mainCounters.resize(minsize, 0);
1972 m_etaCounters.resize(minsize, 0);
1973 m_extrapolatedCounters.resize(minsize, 0);
1974 }
1975 switch (oob) {
1976 case Main:
1977 m_mainCounters[index]++; break;
1978 case Eta:
1979 m_etaCounters[index]++; break;
1980 case Extrapolated:
1981 default:
1983 }
1984}
1985
1986//________________________________________________________________________________
1987std::vector<string>
1989 const string& label,
1990 const string& OP,
1991 bool named)
1992{
1993 // Retrieve the sources of uncertainty relevant for the given scale factor calibration object,
1994 // identifying the object by name.
1995 //
1996 // author: jet collection name
1997 // label: jet flavour label
1998 // OP: tagger working point
1999 // named: if false, an unsorted list of sources of uncertainties will be returned.
2000 // if true, only 'named' uncertainties will be returned, and the position in
2001 // the vector that is the return value determines the 'numVariation' index
2002 // that is to be used if named variations are to be retrieved.
2003
2004 unsigned int index;
2005 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2006 // Return a dummy result if the object is not found
2007 cerr << "listScaleFactorUncertainties: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2008 std::vector<string> dummy;
2009 return dummy;
2010 }
2012}
2013
2014//________________________________________________________________________________
2015std::vector<string>
2017 const std::string& flavour, bool named)
2018{
2019 // Note: this method already works on a per-flavour basis, so passing flavour in is a simple addition
2020 // Method is called primarily from within the BTaggingEfficiencyTool - W.L.
2021 // Retrieve the sources of uncertainty relevant for the given scale factor calibration object,
2022 // identifying the object by index.
2023 //
2024 // index: index to scale factor calibration object
2025 // named: if false, an unsorted list of sources of uncertainties will be returned.
2026 // if true, only 'named' uncertainties will be returned, and the position in
2027 // the vector that is the return value determines the 'numVariation' index
2028 // that is to be used if named variations are to be retrieved.
2029
2030 std::vector<string> dummy;
2032
2033 if (container) {
2034 if (named) {
2035 // Find out which uncertainties are excluded from eigenvector construction
2036 if (! m_runEigenVectorMethod) return dummy;
2037 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation=m_eigenVariationsMap.at(container);
2038 if (m_EVStrategy == Analysis::Uncertainty::SFEigen){
2039 std::vector<string> unordered = eigenVariation->listNamedVariations(); // this is for the regular EV
2040 std::vector<string> ordered(unordered.size());
2041 for (unsigned int i = 0; i < unordered.size(); ++i) {
2042 ordered[eigenVariation->getNamedVariationIndex(unordered[i])] = unordered[i];
2043 }
2044 return ordered;
2045 } else if (m_EVStrategy == Analysis::Uncertainty::SFGlobalEigen){
2046 // here we want to get the named uncertainties from the global eigenvariations flavour container specifically...
2047 std::shared_ptr<CalibrationDataGlobalEigenVariations> GEV = std::dynamic_pointer_cast<CalibrationDataGlobalEigenVariations>(eigenVariation);
2048 std::vector<std::string> unordered = GEV->listNamedVariations(flavour);
2049 std::vector<std::string> ordered(unordered.size()); // ordered by the NAMED VARIATION (internal) ORDERING
2050 for (unsigned int i = 0; i < unordered.size(); ++i) {
2051 ordered[GEV->getNamedVariationIndex(unordered[i], flavour)] = unordered[i];
2052 }
2053 return ordered;
2054 }
2055 }
2056 return container->listUncertainties(); // return this if not named
2057 }
2058
2059 return dummy;
2060}
2061
2062//________________________________________________________________________________
2063unsigned int
2065 const std::string& label,
2066 const std::string& OP,
2067 Uncertainty unc)
2068{
2069 // Retrieve the number of eigenvector variations or named variations relevant for
2070 // the given scale factor calibration object, identifying the object by name.
2071 //
2072 // author: jet collection name
2073 // label: jet flavour label
2074 // OP: tagger working point
2075 // unc: should be set to SFEigen or SFNamed for the cases of
2076 // eigenvector variations or named variations, respectively
2077
2078 unsigned int index;
2079
2080 if (! retrieveCalibrationIndex (label, OP, author, true, index)) return 0;
2081 return getNumVariations(index, unc, label);
2082}
2083
2084//________________________________________________________________________________
2085unsigned int
2087 Uncertainty unc, const std::string& flavour)
2088{
2089 // Retrieve the number of eigenvector variations or named variations relevant for
2090 // the given scale factor calibration object, identifying the object by index.
2091 //
2092 // index: index to calibration scale factor object
2093 // unc: should be set to SFEigen or SFNamed for the cases of
2094 // eigenvector variations or named variations, respectively
2095
2096 if (! (unc == SFEigen || unc == SFNamed || unc == SFGlobalEigen)) return 0;
2098 if (! container) return 0;
2099 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation=m_eigenVariationsMap.at(container);
2100 if (unc == SFGlobalEigen){
2101 std::shared_ptr<CalibrationDataGlobalEigenVariations> GEV = std::dynamic_pointer_cast<CalibrationDataGlobalEigenVariations>(eigenVariation);
2102 return GEV->getNumberOfEigenVariations(flavour);
2103 }
2104 return (unc == SFEigen) ? eigenVariation->getNumberOfEigenVariations() : eigenVariation->getNumberOfNamedVariations();
2105}
2106
2107//________________________________________________________________________________
2108const TH1*
2110 const std::string& label,
2111 const std::string& OP)
2112{
2113 // Retrieve the actual histogrammed calibration scale factors, identifying the object by name.
2114 //
2115 // author: jet collection name
2116 // label: jet flavour label
2117 // OP: tagger working point
2118
2119 unsigned int index;
2120 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2121 // Return a dummy result if the object is not found
2122 cerr << "getBinnedScaleFactors: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2123 return 0;
2124 }
2126 return (container) ? dynamic_cast<TH1*>(container->GetValue("result")) : 0;
2127}
2128
2129//________________________________________________________________________________
2130const TObject*
2132 const std::string& label,
2133 const std::string& OP,
2134 unsigned int mapIndex)
2135{
2136 // Retrieve the actual central values object for the MC efficiences, identifying the object by name.
2137 // The object returned can be either a TH1 or a TF1; it is up to the user to determine which.
2138 //
2139 // author: jet collection name
2140 // label: jet flavour label
2141 // OP: tagger working point
2142 // mapIndex: index to the efficiency map to be used
2143
2144 unsigned int index;
2145 if (! retrieveCalibrationIndex (label, OP, author, false, index, mapIndex)) {
2146 // Return a dummy result if the object is not found
2147 cerr << "getMCEfficiencyObject: unable to find efficiency calibration for object "
2148 << fullName(author, OP, label, false, mapIndex) << endl;
2149 return 0;
2150 }
2152 return (container) ? container->GetValue("result") : 0;
2153}
2154
2155//====================== retrieval of shifted calibration object ===========================
2156
2157//________________________________________________________________________________
2158const TH1*
2160 const std::string& label,
2161 const std::string& OP,
2162 const std::string& unc,
2163 double sigmas)
2164{
2165 // Retrieve the actual histogrammed calibration scale factors, identifying the object by name
2166 // and with the scale factors shifted by the uncertainties due to the given source of uncertainty
2167 // (where bin-to-bin correlations are accounted for, i.e., shifts may be either positive or negative).
2168 //
2169 // author: jet collection name
2170 // label: jet flavour label
2171 // OP: tagger working point
2172 // unc: source of uncertainty to consider
2173 // sigmas: number of standard deviations by which to shift the scale factor central values
2174
2175 // quick sanity check
2176 if (unc == "comment" || unc == "result" || unc == "combined" || unc == "statistics") return 0;
2177
2178 unsigned int index;
2179 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2180 // Return a null result if the object is not found
2181 cerr << "getShiftedScaleFactors: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2182 return nullptr;
2183 }
2185 if (! container) return nullptr;
2186
2187 TH1* result = dynamic_cast<TH1*>(container->GetValue("result"));
2188 TH1* hunc = dynamic_cast<TH1*>(container->GetValue(unc.c_str()));
2189 // another sanity check...
2190 if ((! hunc) || (! result)) return nullptr;
2191 if (hunc->GetDimension() != result->GetDimension() || hunc->GetNbinsX() != result->GetNbinsX() ||
2192 hunc->GetNbinsX() != result->GetNbinsX() || hunc->GetNbinsX() != result->GetNbinsX())
2193 return nullptr;
2194 // also check that the uncertainty is to be treated as correlated from bin to bin
2195 // (for the variation is applied coherently, which isn't appropriate for uncertainties
2196 // that aren't correlated from bin to bin)
2197 if (! container->isBinCorrelated(unc)) return 0;
2198
2199 // if everything is consistent, the actual operation simply consists of adding histograms...
2200 std::string name(container->GetName()); name += "_"; name += unc; name += "_";
2201 TH1* shifted = dynamic_cast<TH1*>(result->Clone(name.c_str()));
2202 if (not shifted) return nullptr;
2203 shifted->Add(hunc, sigmas);
2204 return shifted;
2205}
2206//====================== run EigenVectorRecomposition method ===========================
2209 const std::string& label,
2210 const std::string& OP,
2211 unsigned int mapIndex){
2212 // run eigen vector recomposition method. If success, stored the retrieved coefficient map
2213 // in m_coefficientMap and return success. Otherwise return error and keep m_coefficientMap
2214 // untouched.
2215 // author: jet collection name
2216 // label: jet flavour label
2217 // OP: tagger working point
2218 // mapIndex: index to the MC efficiency map to be used. Should be 0?
2219 // Todo: What is mapindex?
2220 // Todo: Check the way xAODBTaggingTool initialize CDI. Check if that is the as how we are initialize CDI.
2222 cerr << "runEigenVectorRecomposition: Recomposition need to be ran with CalibrationDataInterfaceRoot initialized in eigenvector mode" << endl;
2223 return Analysis::kError;
2224 }
2225
2226 unsigned int indexSF;
2227 if (! retrieveCalibrationIndex (label, OP, author, true, indexSF, mapIndex)) {
2228 cerr << "runEigenVectorRecomposition: unable to find SF calibration for object "
2229 << fullName(author, OP, label, true) << endl;
2230 return Analysis::kError;
2231 }
2232
2233 return runEigenVectorRecomposition (label, indexSF);
2234}
2235
2238 unsigned int indexSF){
2239 // run eigen vector recomposition method. If success, stored the retrieved coefficient map
2240 // in m_coefficientMap and return success. Otherwise return error and keep m_coefficientMap
2241 // untouched.
2242 // label: jet flavour label
2243 // indexSF: index to scale factor calibration object
2244 CalibrationDataContainer* container = m_objects[indexSF];
2245 if (! container) {
2246 cerr << "runEigenVectorRecomposition: error retrieving container!" << endl;
2247 return Analysis::kError;
2248 }
2249
2250 // Retrieve eigenvariation
2251 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation;
2252 try {
2253 eigenVariation = m_eigenVariationsMap.at(container);
2254 } catch (const std::out_of_range&) {
2255 cerr << "runEigenVectorRecomposition: Could not retrieve eigenvector variation, while it should have been there." << endl;
2256 return Analysis::kError;
2257 }
2258 // Doing eigenvector recomposition
2259 std::map<std::string, std::map<std::string, float>> coefficientMap;
2260 if(!eigenVariation->EigenVectorRecomposition(label, coefficientMap))
2261 return Analysis::kError;
2262
2263 m_coefficientMap = std::move(coefficientMap);
2264 return Analysis::kSuccess;
2265}
2266
2267std::map<std::string, std::map<std::string, float>>
2269 if(m_coefficientMap.empty())
2270 cerr << "getCoefficientMap: Call runEigenVectorRecomposition() before retrieving coefficient map! " <<endl;
2271 return m_coefficientMap;
2272}
2273
2274
2275//====================== put some utility functions here ===================================
2276
2277namespace {
2278 // Construct the (diagonal) covariance matrix for the statistical uncertainties on the "ref" results
2279 TMatrixDSym getStatCovarianceMatrix(const TH1* hist) {
2280 Int_t nbinx = hist->GetNbinsX()+2, nbiny = hist->GetNbinsY()+2, nbinz = hist->GetNbinsZ()+2;
2281 Int_t rows = nbinx;
2282 if (hist->GetDimension() > 1) rows *= nbiny;
2283 if (hist->GetDimension() > 2) rows *= nbinz;
2284 TMatrixDSym stat(rows);
2285 for (Int_t binx = 1; binx < nbinx; ++binx){
2286 for (Int_t biny = 1; biny < nbiny; ++biny){
2287 for (Int_t binz = 1; binz < nbinz; ++binz) {
2288 Int_t bin = hist->GetBin(binx, biny, binz);
2289 double err = hist->GetBinError(bin);
2290 stat(bin, bin) = err*err;
2291 }
2292 }
2293 }
2294 return stat;
2295 }
2296
2297 // Construct the covariance matrix assuming that histogram "unc" contains systematic uncertainties
2298 // pertaining to the "ref" results, and that the uncertainties are fully correlated from bin to bin
2299 // (unless option "doCorrelated" is false, in which bins are assumed uncorrelated)
2300 TMatrixDSym getSystCovarianceMatrix(const TH1* ref, const TH1* unc, bool doCorrelated, const std::string& uncname, int tagWeightAxis) {
2301 Int_t nbinx = ref->GetNbinsX()+2, nbiny = ref->GetNbinsY()+2, nbinz = ref->GetNbinsZ()+2;
2302 Int_t rows = nbinx;
2303 if (ref->GetDimension() > 1) rows *= nbiny;
2304 if (ref->GetDimension() > 2) rows *= nbinz;
2305 TMatrixDSym cov(rows);
2306
2307 for(int i=0 ; i<10 ; i++){
2308 Int_t bin = unc->GetBin(1,i,1);
2309 double uncval = unc->GetBinContent(bin);
2310 cout << uncval << ", ";
2311 } cout << endl;
2312
2313 // Carry out a minimal consistency check
2314 if (unc->GetNbinsX()+2 != nbinx || unc->GetNbinsY()+2 != nbiny || unc->GetNbinsZ()+2 != nbinz || unc->GetDimension() != ref->GetDimension()) {
2315 std::cout << "getSystCovarianceMatrix: inconsistency found in histograms " << ref->GetName() << " and " << unc->GetName() << " : " << uncname << std::endl;
2316 return cov;
2317 }
2318
2319 // // the "2" below doesn't actually imply that two bins are used...
2320 // // this is just to make the loops below work
2321 // if (ref->GetDimension() <= 1) nbiny = 2;
2322 // if (ref->GetDimension() <= 2) nbinz = 2;
2323
2324 // Special case: uncertainties not correlated from bin to bin.
2325 // The exception here is for tag weight bins, which are always assumed to be fully correlated.
2326 if (! doCorrelated) {
2327 if (tagWeightAxis < 0) {
2328 // truly uncorrelated uncertainties
2329 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2330 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2331 for (Int_t binz = 1; binz < nbinz-1; ++binz) {
2332 Int_t bin = ref->GetBin(binx, biny, binz);
2333 double err = unc->GetBinContent(bin);
2334 cov(bin,bin) = err*err;
2335 }
2336 }
2337 }
2338 return cov;
2339 } else if (tagWeightAxis == 0) {
2340 // continuous histogram with tag weight X axis
2341 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2342 for (Int_t binz = 1; binz < nbinz-1; ++binz){
2343 for (Int_t binx = 1; binx < nbinx-1; ++binx) {
2344 Int_t bin = ref->GetBin(binx, biny, binz);
2345 double err = unc->GetBinContent(bin);
2346 for (Int_t binx2 = 1; binx2 < nbinx-1; ++binx2) {
2347 Int_t bin2 = ref->GetBin(binx2, biny, binz);
2348 double err2 = unc->GetBinContent(bin2);
2349 cov(bin,bin2) = err*err2;
2350 }
2351 }
2352 }
2353 }
2354 return cov;
2355 } else if (tagWeightAxis == 1) {
2356 // continuous histogram with tag weight Y axis
2357 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2358 for (Int_t binz = 1; binz < nbinz-1; ++binz){
2359 for (Int_t biny = 1; biny < nbiny-1; ++biny) {
2360 Int_t bin = ref->GetBin(binx, biny, binz);
2361 double err = unc->GetBinContent(bin);
2362 for (Int_t biny2 = 1; biny2 < nbiny-1; ++biny2) {
2363 Int_t bin2 = ref->GetBin(binx, biny2, binz);
2364 double err2 = unc->GetBinContent(bin2);
2365 cov(bin,bin2) = err*err2;
2366 }
2367 }
2368 }
2369 }
2370 return cov;
2371 } else if (tagWeightAxis == 2) {
2372 // continuous histogram with tag weight Z axis
2373 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2374 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2375 for (Int_t binz = 1; binz < nbinz-1; ++binz) {
2376 Int_t bin = ref->GetBin(binx, biny, binz);
2377 double err = unc->GetBinContent(bin);
2378 for (Int_t binz2 = 1; binz2 < nbinz-1; ++binz2) {
2379 Int_t bin2 = ref->GetBin(binx, biny, binz2);
2380 double err2 = unc->GetBinContent(bin2);
2381 cov(bin,bin2) = err*err2;
2382 }
2383 }
2384 }
2385 }
2386 return cov;
2387 }
2388 }
2389
2390 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2391 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2392 for (Int_t binz = 1; binz < nbinz-1; ++binz) {
2393 Int_t bin = ref->GetBin(binx, biny, binz);
2394 double err = unc->GetBinContent(bin); // <------------- For every bin in the "ref" ("result") TH1*, GetBinContents of the corresponding uncertainty bin
2395 for (Int_t binx2 = 1; binx2 < nbinx-1; ++binx2){
2396 for (Int_t biny2 = 1; biny2 < nbiny-1; ++biny2){
2397 for (Int_t binz2 = 1; binz2 < nbinz-1; ++binz2) {
2398 Int_t bin2 = ref->GetBin(binx2, biny2, binz2);
2399 double err2 = unc->GetBinContent(bin2); // <------- Grab the unc contents of every bin, and compute the covariance matrix element
2400 cov(bin, bin2) = err*err2; // <------- err1 and err2 are the uncertainty content of the bins, so "cov" is real, symmetric
2401 } // <------- "cov" would imply that the "hunc" histogram stores "x - E[x]" differences from the mean. So in the end, it computes the covariance (as a sum of these)
2402 }
2403 }
2404 }
2405 }
2406 }
2407 return cov;
2408 }
2409
2410}
2411
2412//====================== retrieval of calibration covariance matrix ========================
2413
2414//________________________________________________________________________________
2415TMatrixDSym
2417 const std::string& label,
2418 const std::string& OP,
2419 const std::string& unc)
2420{
2421 // Return the scale factor covariance matrix for the given calibration object.
2422 // This function is deprecated since its functionality is duplicated in the
2423 // CalibrationDataEigenVariations class.
2424 //
2425 // author: jet collection name
2426 // label: jet flavour label
2427 // OP: tagger working point
2428 // unc: source of uncertainty to consider
2429 // Catch issues with the specified input as early as possible
2430 TMatrixDSym dummy;
2431 if (unc == "comment" || unc == "result" || unc == "combined") return dummy;
2432
2433 unsigned int index;
2434 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2435 // Return a dummy result if the object is not found
2436 cerr << "getScaleFactorCovarianceMatrix: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2437 return dummy;
2438 }
2440 if (!container) return dummy;
2441
2442 // retrieve the central calibration and its axes
2443 TH1* result = dynamic_cast<TH1*>(container->GetValue("result"));
2444 if (! result) return dummy;
2445 // "normal" case: single source of uncertainty
2446 if (unc != "all") {
2447 if (unc == "statistics") {
2448 return getStatCovarianceMatrix(result);
2449 } else {
2450 TH1* hunc = dynamic_cast<TH1*>(container->GetValue(unc.c_str()));
2451 if (! hunc) {
2452 cout << "getScaleFactorCovarianceMatrix: no uncertainty object found "
2453 << "corresponding to name " << unc << endl;
2454 return dummy;
2455 }
2456 return getSystCovarianceMatrix(result, hunc, container->isBinCorrelated(unc), unc, container->getTagWeightAxis());
2457 }
2458 }
2459
2460 // special case: complete covariance matrix. This is to be constructed
2461 // as the sum over all individual contributions.
2462 // First, treat the statistics separately (as above)
2463 TMatrixDSym cov = getStatCovarianceMatrix(result);
2464
2465 // Then loop through the list of (other) uncertainties
2466 std::vector<string> uncs = container->listUncertainties();
2467 for (unsigned int t = 0; t < uncs.size(); ++t) {
2468 if (uncs[t] == "comment" || uncs[t] == "result" || uncs[t] == "combined" ||
2469 uncs[t] == "statistics" || uncs[t]=="extrapolation" || uncs[t]=="MChadronisation" ||
2470 uncs[t]=="ReducedSets" || uncs[t]=="systematics") continue;
2471 TH1* hunc = dynamic_cast<TH1*>(container->GetValue(uncs[t].c_str()));
2472 if (not hunc) {
2473 std::cerr<<"Analysis::CalibrationDataInterfaceROOT::getScaleFactorCovarianceMatrix : dynamic cast failed\n";
2474 continue;
2475 }
2476 TMatrixDSym syst_cov = getSystCovarianceMatrix(result, hunc, container->isBinCorrelated(uncs[t]), uncs[t], container->getTagWeightAxis());
2477 cov += syst_cov;
2478 }
2479
2480 return cov;
2481}
2482
2483//________________________________________________________________________________
2484void
2485Analysis::CalibrationDataInterfaceROOT::initialize(const string& jetauthor, const string& OP, Uncertainty unc)
2486{
2487 // Preload objects necessary so that the input calibration file can be closed.
2488 // This functionality is only needed when using PROOF.
2489
2490 if((!m_fileEff)||(!m_fileSF)) {
2491 cerr << "initialize can only be called once per CalibrationDataInterfaceROOT object" << endl;
2492 return;
2493 } else {
2494 cout << "initializing BTagCalibrationDataInterfaceROOT for PROOF with jetAuthor = " << jetauthor << ", tagger = " << m_taggerName << ", operating point = " << OP << ", uncertainty = " << unc << endl;
2495 }
2496
2497 CalibrationDataVariables BTagVars;
2498 BTagVars.jetAuthor = jetauthor;
2499 BTagVars.jetPt = 100000.; //Irrelevant, just has to be valid to retrieve objects
2500 BTagVars.jetEta = 1.5; //Irrelevant, just has to be valid to retrieve objects
2501
2502 for(const auto& flavour : m_flavours){
2503 std::pair<double, double> BTagCalibResult;
2504 BTagCalibResult = getScaleFactor(BTagVars, flavour, OP, unc);
2505 std::cout << "CalibrationDataInterfaceROOT->initialize : BTagCalibResult " << std::endl;
2506
2507 std::pair<double, double> BTagCalibMCEff;
2508 BTagCalibMCEff = getMCEfficiency(BTagVars, flavour, OP, unc);
2509 std::cout << "CalibrationDataInterfaceROOT->initialize : BTagCalibMCEff " << std::endl;
2510 }
2511
2512 if (m_fileEff != m_fileSF) {
2513 m_fileEff->Close();
2514 delete m_fileEff;
2515 }
2516 m_fileSF->Close();
2517 delete m_fileSF;
2518 m_fileEff = 0; //prevents repeat deletion in destructor
2519 m_fileSF = 0; //prevents repeat deletion in destructor
2520}
2521
2522//________________________________________________________________________________
2524Analysis::CalibrationDataInterfaceROOT::retrieveContainer(const string& label, const string& OP, const string& author, const string& cntname, bool isSF, bool doPrint)
2525{
2526 // Attempt to retrieve the given container from file. Note that also the corresponding
2527 // "hadronisation" reference is retrieved (if possible and not yet done).
2528 //
2529 // dir: name of the directory containing the requested container
2530 // cntname: name of the requested container itself (not including the full path)
2531 // isSF: set to false (true) if the object is to be retrieved from the MC efficiencies
2532 // file (the calibration scale factor file). Note that it is assumed that scale
2533 // factor objects will always be retrieved from the calibration scale factor file.
2534 // doPrint: if true, print out some basic information about the successfully retrieved container
2535 // (note that this is typically steered by the m_verbose setting;
2536 // only for the retrieval of the maps used for MC/MC SF calculations, this printout is always switched off)
2537
2538 string dir = m_taggerName + "/" + getAlias(author) + "/" + OP + "/" + label;
2539 // construct the full object name
2540 string name = dir + "/" + cntname;
2541
2542 // If the object cannot be found, then each call will result in a new attempt to
2543 // retrieve the object from the ROOT file. Hopefully this will not happen too often...
2544 unsigned int idx = m_objectIndices[name] = m_objects.size();
2545 // CalibrationDataContainer* cnt =
2546 // dynamic_cast<CalibrationDataContainer*>((isSF ? m_fileSF : m_fileEff) ->Get(name.c_str()));
2548 (isSF ? m_fileSF : m_fileEff)->GetObject(name.c_str(), cnt);
2549 // If the requested object is a MC efficiency container and is not found, make a second attempt
2550 // to retrieve it from the calibration scale factor file. This will avoid the need to duplicate
2551 // efficiency containers so that the MC efficiency file needs to store only those containers
2552 // not already present in the calibration scale factor file. Of course this is meaningful only
2553 // if separate files are used to begin with.
2554 if (!isSF && !cnt && m_fileSF != m_fileEff) m_fileSF->GetObject(name.c_str(), cnt);
2555 m_objects.push_back(cnt);
2556 if (!cnt) {
2557 cerr << "btag Calib: retrieveContainer: failed to retrieve container named " << name << " from file" << endl;
2558 return 0;
2559 }
2560
2561 // For successfully retrieved containers, also print some more information (implemented on user request)
2562 if (doPrint) {
2563 cout << "CalibrationDataInterface: retrieved container " << name << " (with comment: '" << cnt->getComment() << "' and hadronisation setting '" << cnt->getHadronisation() << "')" << endl;
2564 }
2565
2566
2567 // If the requested object is a MC efficiency container, make sure to retrieve the corresponding
2568 // calibration scale factor container first (a feature first thought to be necessary, erroneously,
2569 // but left in since this ordering should not hurt in any case).
2570 if (m_refMap.find(dir) == m_refMap.end()) {
2571 if (isSF) {
2572 // Retrieve the mapping objects from both files and merge their information using the 'helper' class.
2573 // The map resulting from this is used to retrieve the information required to compute MC/MC scale factors.
2574 string hadronisationRefs(dir + "/MChadronisation_ref");
2575 TMap* mapSF = 0; m_fileSF->GetObject(hadronisationRefs.c_str(), mapSF);
2576 TMap* mapEff = 0; if (m_fileEff != m_fileSF) m_fileEff->GetObject(hadronisationRefs.c_str(), mapEff);
2577 m_refMap[dir] = new HadronisationReferenceHelper(mapSF, mapEff);
2578 delete mapSF;
2579 delete mapEff;
2580 } else {
2581 string SFCalibName = getContainername(getBasename(dir), true);
2582 if (m_objectIndices.find(SFCalibName) == m_objectIndices.end()) retrieveContainer(label, OP, author, SFCalibName, true, doPrint);
2583 }
2584 }
2585
2586 // Attempt to find the corresponding hadronisation reference container needed for the application of
2587 // MC/MC scale factors.
2588 if (idx+1 > m_hadronisationReference.size()) m_hadronisationReference.resize(idx+1, -1);
2589 m_hadronisationReference[idx] = -1;
2590 string spec = cnt->getHadronisation();
2591 if (spec != "") {
2592 std::map<string, HadronisationReferenceHelper*>::const_iterator mapit = m_refMap.find(dir);
2593 if (mapit != m_refMap.end()) {
2594 string ref;
2595 if (mapit->second->getReference(spec, ref)) {
2596 // Retrieve the hadronisation reference if not already done. Note that the "isSF" is left unchanged:
2597 // this allows to retrieve the reference from the same file as the scale factor object. An exception
2598 // is the reference for the calibration scale factor object, which should always be obtained from
2599 // the scale factor file.
2600 // An efficiency container can be its own hadronisation reference (this is not "protected" against).
2601 string refname(dir + "/" + ref);
2602 std::map<string, unsigned int>::const_iterator it = m_objectIndices.find(refname);
2603 // If the reference cannot be found, assume that it hasn't yet been retrieved so attempt it now.
2604 if (it == m_objectIndices.end()) {
2605 // Omit the printout of container information here (the idea being that showing MC/MC SF information would confuse rather than help)
2606 retrieveContainer(label, OP, author, ref, isSF, false);
2607 it = m_objectIndices.find(refname);
2608 }
2609 if (it != m_objectIndices.end()) {
2610 m_hadronisationReference[idx] = it->second;
2611 }
2612 }
2613 } else if (m_useMCMCSF) {
2614 cerr << "btag Calib: retrieveContainer: MC hadronisation reference map not found -- this should not happen!" << endl;
2615 }
2616 }
2618 // Not being able to construct the MC/MC scale factors will lead to a potential bias.
2619 // However, this is not considered sufficiently severe that we will flag it as an error.
2620 if (m_useMCMCSF){
2621 cerr << "btag Calib: retrieveContainer: warning: unable to apply MC/MC scale factors for container " << name << " with hadronisation reference = '" << spec << "'" << endl;
2622 }
2623 }
2624
2625 // Initialize the Eigenvector variation object corresponding to this object, if applicable. Notes:
2626 // - the dual use of "isSF" (both referring to the file and to the object, see above) requires another protection here
2627 // - the constructor's second argument is used to determine whether to exclude a pre-determined set of uncertainties from the EV decomposition
2628 //
2629 // We also want to separate behavior between SFEigen and SFGlobalEigen systematic strategies
2630 // The former requires a CalibrationDataEigenVariations object to be made per flavour.
2631 // The latter combines all corresponding flavours, so once it's been made for a single flavour, it's cached under all the corresponding "flavour containers"
2632 // simulataneously in m_eigenVariationsMap, and is checked for on each subsequent call to this method.
2633 if (m_runEigenVectorMethod && isSF && name.find("_SF") != string::npos) {
2634 CalibrationDataHistogramContainer* histoContainer=dynamic_cast<CalibrationDataHistogramContainer*>(cnt);
2635 if (histoContainer==0) {
2636 cerr << "Could not cast Container to a HistogramContainer. " << endl;
2637 return 0;
2638 }
2639 if (m_EVStrategy == Analysis::Uncertainty::SFEigen){
2641 std::shared_ptr<CalibrationDataEigenVariations> newEigenVariation(new CalibrationDataEigenVariations(m_filenameSF, m_taggerName, OP, author, histoContainer, m_useRecommendedEVExclusions));
2642 newEigenVariation->setVerbose(m_verbose);
2643
2644 // At this point we may also want to reduce the number of eigenvector variations.
2645 // The choices are stored with the container object; but first we need to know what flavour we are dealing with.
2646 string flavour = dir.substr(dir.find_last_of("/")+1);
2647
2648 for (const auto & entry : m_excludeFromCovMatrix[flavour]) {
2649 newEigenVariation->excludeNamedUncertainty(entry, cnt);
2650 }
2651 newEigenVariation->initialize();
2652 int to_retain = histoContainer->getEigenvectorReduction(m_EVReductions[flavour]); // returns the number of eigenvariations to retain as per the EV reduction strategy
2653 if (to_retain > -1) {
2654 if (m_verbose) cout << "btag Calib: reducing number of eigenvector variations for flavour " << flavour << " to " << to_retain << endl;
2655 // The merged variations will end up as the first entry in the specified list, i.e., as the last of the variations to be "retained"
2656 newEigenVariation->mergeVariationsFrom(size_t(to_retain-1)); // All variations stored with indices larger than this are merged
2657 } else if (m_EVReductions[flavour] != Loose) {
2658 cerr << "btag Calib: unable to retrieve eigenvector reduction information for flavour " << flavour << " and scheme " << m_EVReductions[flavour] << "; not applying any reduction" << endl;
2659 }
2660 m_eigenVariationsMap[cnt]=std::move(newEigenVariation);
2661
2663 } else if (m_EVStrategy == Analysis::Uncertainty::SFGlobalEigen) {
2665 std::map<const CalibrationDataContainer*, std::shared_ptr<CalibrationDataEigenVariations> >::iterator evit = m_eigenVariationsMap.find(cnt);
2666 // The global implementation internally combines all the "flavour containers" (containers that correspond to each other, only with different flavours)
2667 // But the CalibrationDataInterfaceROOT object doesn't need to know that, so we want to get all the flavour containers in one go here
2668 // and map them (with m_eigenVariationsMap) to the same CalibrationDataGlobalEigenVariations pointer.
2669 // Then, in methods like getScaleFactor, we call the virtual methods which will give the proper result e.g. if you want the SF for a b-jet, it'll call the
2670
2671 if (evit == m_eigenVariationsMap.end()){
2672 // now to see if it's completely empty or not
2673 if (m_eigenVariationsMap.empty()){
2674 std::shared_ptr<CalibrationDataGlobalEigenVariations> newEigenVariation(new CalibrationDataGlobalEigenVariations(m_filenameSF, m_taggerName, OP, author, m_flavours, histoContainer, m_useRecommendedEVExclusions));
2675 for (const auto & entry : m_excludeFromCovMatrix[label]) {
2676 newEigenVariation->excludeNamedUncertainty(entry, label); // <---- custom exclude named uncertainties method for global variations
2677 }
2678
2679 newEigenVariation->initialize();
2680
2681 // flavour loop to get the flavour reduction schemes and apply them
2682 for (std::string& flavour : m_flavours){
2683 int to_retain = histoContainer->getEigenvectorReduction(m_EVReductions[flavour]); // returns the number of eigenvariations to retain as per the EV reduction strategy
2684 if (to_retain > -1) {
2685 if (m_verbose) cout << "btag Calib: reducing number of eigenvector variations for flavour " << flavour << " to " << to_retain << endl;
2686 // The merged variations will end up as the first entry in the specified list, i.e., as the last of the variations to be "retained"
2687 newEigenVariation->mergeVariationsFrom(size_t(to_retain-1), flavour); // All variations stored with indices larger than this are merged
2688 } else if (m_EVReductions[flavour] != Loose) {
2689 cerr << "btag Calib: unable to retrieve eigenvector reduction information for flavour " << flavour << " and scheme " << m_EVReductions[flavour] << "; not applying any reduction" << endl;
2690 }
2691 }
2692
2693 m_eigenVariationsMap.insert({cnt, newEigenVariation});
2694 } else {
2695 // Need to point to the CDGEV object four times in the m_eigenVariationsMap to appease the CDIROOT backend design...
2696 // Ok, turns out I can't retrieve the containers from CDGEV and insert them directly, because I'd have to use the containers directly instead..
2697 // So the strategy is to just take the CGEV objects that are already in the map, and mpa the present container to it
2698 std::shared_ptr<CalibrationDataEigenVariations> previous_eigenvariation = m_eigenVariationsMap.begin()->second;
2699 m_eigenVariationsMap.insert({cnt, previous_eigenvariation});
2700 }
2701
2702 } else {
2703 std::cout << "CalibrationDataInterfaceROOT->retrieveContainer : the CDGEV object for " << name << " already exists! " << std::endl;
2704 }
2706 }
2707 }
2708
2709 return cnt;
2710}
2711
2712//________________________________________________________________________________
2713string
2715{
2716 // Return the alias for the given jet collection name, if an alias exists.
2717 // If this is not the case, the return value will simply equal the input jet collection name.
2718
2719 std::map<string,string>::const_iterator it = m_aliases.find(author);
2720 return (it == m_aliases.end()) ? author : it->second;
2721}
2722
2723//________________________________________________________________________________
2724string
2725Analysis::CalibrationDataInterfaceROOT::fullName(const string& author, const string& OP,
2726 const string& label, bool isSF,
2727 unsigned mapIndex) const
2728{
2729 // Construct the full calibration object's pathname within the calibration ROOT file.
2730 //
2731 // author: jet collection name
2732 // OP: tagger working point
2733 // label: jet flavour label
2734 // isSF: set to true (false) for scale factors (MC efficiencies)
2735 // mapIndex: index in the list of MC efficiency calibration objects
2736
2737 string flavour = (label == "N/A") ? "Light" : label;
2738 string full(m_taggerName + "/" + getAlias(author) + "/" + OP + "/" + flavour + "/");
2739 full += getContainername(flavour, isSF, mapIndex);
2740 // full += getAlias(author); full += "/";
2741 // string name = (isSF) ?
2742 // getBasename(OP, label, "_SF", true) :
2743 // getBasename(OP, label, "_Eff", false, mapIndex);
2744 // full += name;
2745 return full;
2746}
2747
2748//________________________________________________________________________________
2750{
2751 // Create the map from hadronisation specifications to reference container names for
2752 // a given ROOT file directory.
2753 //
2754 // mapSF: reference specification as extracted from calibration scale factor file
2755 // mapEff: reference specification as extracted from MC efficiency file
2756 // (null if the two files are identical)
2757
2758 // First take the scale factor file's map
2759 if (mapSF) {
2760 TMapIter next(mapSF); TObjString* spec;
2761 while ((spec = (TObjString*) next())) {
2762 TObjString* ref = (TObjString*) mapSF->GetValue(spec);
2763 m_refs[string(spec->GetName())] = string(ref->GetName());
2764 }
2765 }
2766 // Then do the same with the efficiency file's map. The result will be to override any
2767 // items from the SF file's map. An exception is made for the scale factor calibration object,
2768 // for which (for the sake of consistency) the SF reference must be retained.
2769 if (mapEff) {
2770 TMapIter next(mapEff); TObjString* spec;
2771 while ((spec = (TObjString*) next())) {
2772 TObjString* ref = (TObjString*) mapEff->GetValue(spec);
2773 m_refs[string(spec->GetName())] = string(ref->GetName());
2774 }
2775 }
2776}
2777
2778//________________________________________________________________________________
2779bool
2781 string& ref) const
2782{
2783 // Extract the reference histogram name corresponding to the given hadronisation specification (if existing).
2784 // The return value is used to indicate whether the specification could be found.
2785 //
2786 // spec: hadronisation specification
2787 // ref: container name corresponding to this specification
2788
2789 std::map<string, string>::const_iterator it = m_refs.find(spec);
2790 if (it == m_refs.end()) return false;
2791
2792 ref = it->second;
2793 return true;
2794}
const boost::regex ref(r_ef)
static const std::string hadronisationRefs("MChadronisation_ref")
ClassImp(Analysis::CalibrationDataInterfaceROOT) Analysis
std::vector< std::string > split(const std::string &str, const char token=';')
local utility function: split string into a vector of substrings separated by a specified separator,...
#define GEV
string trim(string s)
#define x
This is the interface for the objects to be stored in the calibration ROOT file.
virtual CalibrationStatus getUncertainty(const std::string &unc, const CalibrationDataVariables &x, UncertaintyResult &result, TObject *obj=0)=0
retrieve the calibration uncertainty due to the given source.
virtual CalibrationStatus getStatUncertainty(const CalibrationDataVariables &x, double &result)=0
retrieve the calibration statistical uncertainty.
CalibrationStatus getSystUncertainty(const CalibrationDataVariables &x, UncertaintyResult &result, TObject *obj=0)
retrieve the calibration total systematic uncertainty
static bool isNearlyEqual(double a, double b)
utility for comparison of doubles
std::vector< unsigned int > getVariableTypes()
utility to retrieve variable types
std::vector< std::string > listUncertainties() const
retrieve the list of "uncertainties" accessible to this object.
virtual CalibrationStatus getResult(const CalibrationDataVariables &x, double &result, TObject *obj=0, bool extrapolate=false)=0
retrieve the calibration result.
This is the class holding information for histogram-based calibration results.
virtual int getEigenvectorReduction(unsigned int choice) const
Retrieve the number of eigenvectors to be retained for the purpose of eigenvector variation reduction...
virtual CalibrationStatus getUncertainty(const std::string &unc, const CalibrationDataVariables &x, UncertaintyResult &result, TObject *obj=0)
retrieve the calibration uncertainty due to the given source.
virtual std::vector< double > getBinBoundaries(unsigned int vartype)
Retrieve the bin boundaries for the specified variable type (which should be a CalibrationParametriza...
virtual int getTagWeightAxis()
Test whether this calibration object is one for "continuous" calibration (this has some subtle conseq...
virtual CalibrationStatus getResult(const CalibrationDataVariables &x, double &result, TObject *obj=0, bool extrapolate=false)
retrieve the calibration result.
bool isBinCorrelated(const std::string &unc) const
Indicate whether the given uncertainty is correlated from bin to bin or not (note that this function ...
std::string m_taggerName
tagging algorithm name
void setEffCalibrationNames(const std::map< std::string, std::vector< std::string > > &names)
std::string getContainername(const std::string &flavour, bool SF, unsigned int mapIndex=0) const
auxiliary function for string concatenation
void setSFCalibrationNames(const std::map< std::string, std::string > &names)
std::string getBasename(const std::string &name) const
auxiliary function for retrieval of name within the directory
double combinedUncertainty(double stat, const std::pair< double, double > &syst) const
utility function for combination of statistical and (a priori asymmetric) systematic uncertainty.
bool getReference(const std::string &spec, std::string &ref) const
Retrieve the (full) name of the reference histogram, given the hadronisation specification.
std::map< std::string, std::string > m_refs
map from hadronisation specification to container name
This tool provides an interface to flavour tagging performance estimates.
bool m_verbose
if true, allow also for some informational (and not only error/warning) messages
std::vector< std::pair< unsigned int, unsigned int > > m_checkedWeightScaleFactors
bool m_useTopologyRescaling
specify whether or not to use MC/MC (topology) scale factors (also this steering option may be remove...
CalibResult getWeightScaleFactor(const CalibrationDataVariables &variables, const std::string &label, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
efficiency scale factor retrieval by name
std::map< std::string, unsigned int > m_objectIndices
std::map< std::string, std::string > m_aliases
Do not attempt to persistify (PROOF)
TFile * m_fileSF
Do not attempt to persistify (PROOF)
void increaseCounter(unsigned int index, OutOfBoundsType oob=Main)
bool m_useRecommendedEVExclusions
if true, exclude pre-recommended lists of uncertainties from the covariance matrix building,...
std::map< std::string, HadronisationReferenceHelper * > m_refMap
the following maps (one for each directory) specify the name of the container serving as the 'hadroni...
TFile * m_fileEff
pointer to the TFile object providing access to the calibrations
CalibResult getInefficiencyScaleFactor(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
"MC" inefficiency scale factor retrieval by name
std::string nameFromIndex(unsigned int index) const
Retrieve the name of the calibration object (container) given its index.
bool retrieveCalibrationIndex(const std::string &label, const std::string &OP, const std::string &author, bool isSF, unsigned int &index, unsigned int mapIndex=0)
Retrieve the index of the calibration object (container) starting from the label and operating point.
std::string fullName(const std::string &author, const std::string &OP, const std::string &label, bool isSF, unsigned mapIndex=0) const
@ brief construct the full object pathname from its individual components
const TH1 * getBinnedScaleFactors(const std::string &author, const std::string &label, const std::string &OP)
retrieve the binned calibration object for the given flavour label and operating point.
unsigned int getNumVariations(const std::string &author, const std::string &label, const std::string &OP, Uncertainty unc)
retrieve the number of variations relevant to the calibration object.
CalibrationDataContainer * retrieveContainer(const std::string &label, const std::string &OP, const std::string &author, const std::string &cntname, bool isSF, bool doPrint=true)
utility function taking care of object retrieval
bool m_runEigenVectorMethod
decide whether to run the eigenvector method or not
CalibrationDataInterfaceROOT()
default constructor for PROOF object retrieval
std::map< std::string, std::map< std::string, float > > m_coefficientMap
CalibrationStatus runEigenVectorRecomposition(const std::string &author, const std::string &label, const std::string &OP, unsigned int mapindex=0)
run EigenVector Recomposition method
CalibResult getInefficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
inefficiency retrieval by name
std::map< std::string, Analysis::EVReductionStrategy > m_EVReductions
Eigenvector reduction strategy (per flavour)
const TObject * getMCEfficiencyObject(const std::string &author, const std::string &label, const std::string &OP, unsigned int mapIndex=0)
retrieve the MC efficiency (central values) object for the given flavour label and operating point.
bool m_useMCMCSF
specify whether or not to use MC/MC (hadronisation) scale factors (the fact that this is steerable is...
std::vector< std::string > listScaleFactorUncertainties(const std::string &author, const std::string &label, const std::string &OP, bool named=false)
retrieve the list of "uncertainties" relevant to the calibration object.
CalibResult getScaleFactor(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
efficiency scale factor retrieval by name.
std::map< std::string, std::map< std::string, float > > getEigenVectorRecompositionCoefficientMap()
Get Eigenvector recomposition map after running runEigenVectorRecomposition()
std::vector< int > m_hadronisationReference
store the 'hadronisation' reference for each object (-1 means no reference found)
TMatrixDSym getScaleFactorCovarianceMatrix(const std::string &author, const std::string &label, const std::string &OP, const std::string &unc="all")
retrieve the named covariance matrix element corresponding to the binned calibration object.
CalibResult getMCInefficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc=None, unsigned int mapIndex=0)
"MC" inefficiency retrieval by name
const TH1 * getShiftedScaleFactors(const std::string &author, const std::string &label, const std::string &OP, const std::string &unc, double sigmas)
retrieve the binned calibration object for the given flavour label and operating point,...
CalibResult getMCEfficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc=None, unsigned int mapIndex=0)
"MC" efficiency retrieval by name
std::string getAlias(const std::string &author) const
associated alias retrieval method
std::map< std::string, std::vector< std::string > > m_excludeFromCovMatrix
store the uncertainties which should be excluded from building the full covariance matrix
double getMCMCScaleFactor(const CalibrationDataVariables &variables, unsigned indexSF, unsigned int indexEff) const
MC/MC scale factor retrieval.
std::vector< CalibrationDataContainer * > m_objects
cache the objects themselves (so that the user will not have to delete them after each call etc....
double m_maxAbsEta
|eta| bounds and strategy for dealing with out-of-bounds conditions
CalibResult getEfficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, const std::string &flavour, unsigned int numVariation=0, unsigned int mapIndex=0)
efficiency retrieval by name
void initialize(const std::string &jetauthor, const std::string &OP, Uncertainty unc)
initialization for PROOF usage
std::vector< unsigned int > m_etaCounters
counters for flagging out-of-bound cases
std::map< const CalibrationDataContainer *, std::shared_ptr< CalibrationDataEigenVariations > > m_eigenVariationsMap
store the eigenvector class and associate to its CalibrationDataContainer
bool checkAbsEta(const CalibrationDataVariables &variables, unsigned int index)
void checkWeightScaleFactors(unsigned int indexSF, unsigned int indexEff)
std::string m_filenameSF
in addition, store also the filenames themselves (needed for the copy constructor)
This class (struct, actually) is nothing but a light-weight container of (kinematic or other) variabl...
bool verbose
Definition hcg.cxx:73
std::string label(const std::string &format, int i)
Definition label.h:19
std::vector< std::string > split(const std::string &str, const char token=';')
local utility function: split string into a vector of substrings separated by a specified separator,...
The namespace of all packages in PhysicsAnalysis/JetTagging.
OutOfBoundsType
counter types (to be used when flagging out-of-bounds cases)
const CalibResult dummyResult(dummyValue, dummyValue)
std::pair< double, double > CalibResult
std::pair< double, double > UncertaintyResult
The following typedef is for convenience: most uncertainties can be asymmetric.
Uncertainty
specification of type information requested by the user
Definition index.py:1
STL namespace.