ATLAS Offline Software
Loading...
Searching...
No Matches
CalibrationDataInterfaceROOT.cxx
Go to the documentation of this file.
1/*
2 Copyright (C) 2002-2025 CERN for the benefit of the ATLAS collaboration
3*/
4
6// //
7// CalibrationDataInterfaceROOT //
8/* Begin_Html
9<h2>Purpose</h2>
10<p>This class provides the main interface to b-tagging calibration information.
11 Each instance can be used to handle a single tagger (so if multiple taggers' information
12 needs to be accessed, multiple CalibrationDataInterfaceROOT instances need to be created).<br />
13 Its action is steered by a configuration file (which is parsed using TEnv and which is
14 specified in the CalibrationDataInterfaceROOT constructor).
15</p>
16<p>
17 Further details can be found below, arranged by topic: <a href="#basic">basic usage</a>,
18 <a href="#ev">eigenvector variation usage</a>, and <a href="#configuration">configuration file
19 specification</a>.
20</p>
21<a name="basic"><h2>Basic usage</h2></a>
22<p>
23 The main functionality is to retrieve data/MC calibration scale factors and MC efficiencies,
24 on a jet-by-jet basis. This is done using the following methods:
25 <pre>
26 getScaleFactor (variables, label, OP, unc)
27 getMCEfficiency (variables, label, OP, unc, mapIndex)
28 </pre>
29 with the following meaning for the arguments:
30 <pre>
31 variables: a CalibrationDataVariables object that should be filled by the user with all
32 the information needed to extract the calibration information
33 label: flavour label. Currently recognised labels follow the Athena TruthInfo conventions
34 (see package PhysicsAnalysis/JetTagging/JetTagInfo): "B", "C", "T", "N/A", and in
35 addition "Light" (the "N/A" is internally converted to "Light")
36 OP: tagger working point. This should correspond to a documented weight cut for the
37 tagger under consideration, but converted to a string, and with any period (".")
38 replaced with an underscore for technical reasons. Alternatively, "Continuous" may
39 be used if a "continuous tagging" calibration object exists for the tagger under
40 consideration. (Note that the use of this method for "continuous tagging" is not
41 in general to be recommended, as it does not not allow for scale factor rescaling,
42 which is ne
43 unc: enum specifying what uncertainty information is to be retrieved. Recognised values
44 for normal usage are <b>None</b> (retrieval of central values only), <b>Statistical</b>
45 (statistical uncertainty), <b>Systematic</b> (systematic uncertainty), <b>Total</b>
46 (combined statistical and systematic uncertainty). (Other choices exist for more advanced
47 usage and will be described <a href="#ev">below</a>.)
48 mapIndex: index specifying the precise MC efficiency calibration to be used. This is relevant if
49 multiple "efficiency calibration" names are specified for the given flavour label (see
50 the <a href="#configuration">configuration section</a>).
51 </pre>
52 All methods return a pair&lt;double, double&gt; (the stated <b>CalibResult</b> return type is a simple typedef),
53 the first member of which contains the central value and the second the requested uncertainty (or 0 if no
54 uncertainty was requested). Note that central values are retricted to be within the physical region (between 0
55 and 1 for the MC efficiencies, and positive for the scale factors).
56</p>
57<p>
58 A few additional methods exist, based on (and internally using) the above methods:
59 <pre>
60 getEfficiency (variables, label, OP, unc, mapIndex): data efficiency calculated as MC efficiency times SF
61 getInefficiency (variables, label, OP, unc, mapIndex): 1 - data efficiency
62 getMCInefficiency (variables, label, OP, unc, mapIndex): 1 - MC efficiency
63 getInefficiencyScaleFactor (variables, label, OP, unc, mapIndex): ratio of data to MC inefficiencies
64 </pre>
65 Especially the last of these methods is likely to be of interest when data/MC scale factors are applied;
66 <a href="http://cdsweb.cern.ch/record/1269912">this note</a> provides more detail. Restrictions to the physical
67 range apply, as is the case for the data/MC scale factor and MC efficiency methods.
68</p>
69<a name="speed-up"><h3>Code speed-up using variable caching</h3></a>
70<p>
71 Internally, pointers to the calibration objects are cached. However, the string matching needed to
72 exploit this feature is slow and leads to a non-negligible CPU overhead. Since internally all pointers
73 are stored in a vector, it is possible to request the position in this vector for a given calibration
74 object. This position can then be used in subsequent alternative calls to retrieve information:
75 <pre>
76 getScaleFactor (variables, indexSF, indexEff, unc)
77 getMCEfficiency (variables, index, unc)
78 </pre>
79 Here the "index" variables replace the specification of the flavour label and operating point. They can
80 be retrieved by calling
81 <pre>
82 retrieveCalibrationIndex(label, OP, author, isSF, index)
83 </pre>
84 where the following additional variables are to be used:
85 <pre>
86 author: jet collection name
87 isSF: set to true (false) if the object concerned is a scale factor object (a MC efficiency object)
88 index: requested information
89 </pre>
90 Note that this method has a boolean return value, indicating whether or not the retrieval succeeded.
91 This return value should be checked by the user, as no checks are carried in the alternative getScaleFactor()
92 etc. methods as to the validity of the index specified. Note also that for all methods except those using
93 only the MC information, both indices need to be provided.
94</p>
95<a name="continuous"><h3>Continuous tagging</h3></a>
96<p>
97 Rather than merely inquiring whether the tag weight discriminant resulting from a given tagger satisfies a
98 given criterion, it may be useful to use more detailed information. In particular, "continuous tagging"
99 information can be made available. This is essentially a calibrated version of the (binned) tag weight
100 discriminant distribution, or to be more precise, the corresponding data/MC ratio.
101</p>
102<p>
103 The use of this information differs somewhat from the regular calibration usage, as the "transport" from
104 the calibration sample(s) to the sample used in physics becomes a non-trivial extension of the efficiency
105 and inefficiency scale factors used for the regular calibrations. Therefore, two separate methods exist
106 which should be used for such cases:
107 <pre>
108 getWeightScaleFactor (variables, label, unc, numVariation, mapIndex)
109 getWeightScaleFactor (variables, indexSF, indexEff, unc, numVariation)
110 </pre>
111 The meaning of the variables is as above.
112</p>
113<a name="ev"><h2>Eigenvector variations</h2></a>
114<p>
115 The 'basic usage' described above does not do justice to the fact that data/MC calibration scale factors
116 are derived in kinematic bins, with uncertainties not being fully correlated from bin to bin (indeed, the
117 methods themselves do not know anything about binning to start with). It is however possible to arrive at
118 a statistically more correct approach by considering 'variations'. This exploits the fact that in the case
119 of scale factor calibrations, besides statistical and total systematic uncertainties also individual
120 systematic uncertainties are stored (along with a model of their bin-to-bin correlations). The typical use
121 of this, applicable for a systematic uncertainty that is fully correlated (or anti-correlated) between bins,
122 would be to consider the effect on all calibration scale factors from a &pm; 1 standard deviation of the
123 underlying source of systematic.
124</p>
125<p>
126 In practice, this approach is a bit cumbersome due to the sometimes large number of contributions to the
127 systematic uncertainty. In addition, no infrastructure exists at present to deal with uncertainties that are
128 not, or only partially, correlated from bin to bin. An eigenvector decomposition technique allows to address
129 both of these issues at the same time.
130</p>
131<p>
132 This method starts from the scale factor covariance matrix that can be constructed on the basis of the
133 available information (uncertainties plus a model of their bin-to-bin correlations). This covariance matrix
134 can be diagonalised and its eigenvectors and corresponding eigenvalues determined. The product of an eigenvector
135 and the square root of its corresponding eigenvalue is what is referred to as an 'eigenvector variation'.
136 The number of such (statistically independent) variations equals the number of calibration bins, and the set
137 of variations is the minimal set needed to (re-)construct the covariance matrix. After this eigenvector
138 decomposition, the eigenvector variations may be used instead of the variations corresponding to the original
139 sources of uncertainty.
140</p>
141<p>
142 One further refinement can be made, related to the fact that sometimes sources of uncertainty affecting
143 the calibration analyses also affect the physics analysis in which the calibration results are used.
144 Including these sources of uncertainty in the eigenvector decomposition would not allow to correlate the
145 effect on the calibration properly with the effect on the physics analysis; therefore it is possible to
146 exclude these sources of uncertainty from being used in the construction of the covariance matrix that is
147 diagonalised, and consider their variations separately (note, per the above, that this will be done
148 correctly only in the case of full bin-to-bin correlations).
149</p>
150<p>
151 The use of the eigenvector variation is not enabled by default and must be switched on in the configuration
152 file (see <a href="#configuration">Configuration</a>).
153</p>
154<p>
155 The above can be used to obtain jet-by-jet information by a slight modification of the arguments to the
156 normal methods for information retrieval also employed for <a href="#basic">basic usage</a>:
157 <pre>
158 getScaleFactor (variables, label, OP, unc, numVariation)
159 </pre>
160 Here, the first three variables function exactly as in the basic usage case. As for the two last arguments:
161 <pre>
162 unc: this should take the value <b>SFEigen</b> or <b>SFNamed</b> for eigenvector variations or named variations, respectively.
163 numVariation: this indicates exactly which eigenvector variation or named variation is to be considered.
164 </pre>
165 In addition, it should be noticed that the methods' return values have a different meaning in this context:
166 while in normal usage they are (value, uncertainty) pairs, here they are (up, down) pairs, with up and down
167 meaning the result of an up- or downward eigenvector variation or named variation.
168</p>
169<p>
170 The number of valid variations for each type can be retrieved using
171 <pre>
172 getNumVariations(author, label, OP, Uncertainty unc)
173 </pre>
174 with "unc" set appropriately. Finally, calling
175 <pre>
176 listScaleFactorUncertainties(author, label, OP, named)
177 </pre>
178 can be used to retrieve information about uncertainties in two ways, depending on the value of the last argument:
179 <pre>
180 named = false (default): retrieve an unsorted list of all the uncertainties associated with the relevant scale factor
181 calibration object. This can be used to identify sources of uncertainty to be excluded from
182 the eigenvector decomposition.
183 named = true: the list in this case is restricted to the named uncertainties (these should have been specified
184 in the configuration file by the user) and ordered: the position in the vector constitutes the
185 link between the name and the index to be used as the "numVariation" argument to the
186 getScaleFactor() etc. method.
187 </pre>
188 Note that as in the basic usage case, the code can be sped up by replacing the jet author/label/OP specification by the appropriate
189 integer index (see the <a href="#speed-up">corresponding section</a>).
190</p>
191<a name="configuration"><h2>Configuration</h2></a>
192<p>
193 The configuration file may specify the following settings:
194</p>
195<ul>
196 <li>calibration file specification:
197 <pre>
198
199 filename: common ROOT file for calibration scale factors and MC efficiencies
200 (default: BTaggingPerformanceCalibrations.root)
201 filenameEff: separate ROOT file for MC efficiencies
202 filenameSF: separate ROOT file for calibration scale factors
203 </pre>
204 Note that it is not necessary to specify all three files. The most common use for the specification
205 of more than one file is likely to be the specification of analysis-specific MC efficiencies (the
206 efficiencies provided in the common file are expected to be sufficient for many analysis purposes,
207 but depending on the accuracy needed it may be desirable to add custom efficiencies).
208 </li>
209 <li>choice of calibration:
210 <pre>
211
212 ScaleFactorCalibrationBName: name for scale factor calibration object to be used for b-jets (default: default)
213 EfficiencyCalibrationBName: name for MC efficiencies object(s) to be used for b-jets (default: default).
214 Note that multiple (semicolon-separated) MC efficiency objects may be specified;
215 internally they will be converted to a vector. The position in the vector can
216 then be used as the 'mapIndex' argument in the methods involving MC information
217 (see e.g. the <a href="#basic">basic usage</a> section).
218 </pre>
219 Analogous keywords (with B replaced with C, T, Light) can be used to specify the objects to be used for
220 charm, tau, and light-flavour jets, respectively. Note that the "default" object should always exist and reflect
221 the recommended choice of object unless analysis-specific needs dictate otherwise.
222 </li>
223 <li>jet aliasing:
224 <pre>
225 aliases: semicolon-separated list of jet collection aliases
226 </pre>
227 The feature of jet aliasing follows the strategy also used in the CalibrationBroker Athena class.
228 It allows for the use of a calibration for different jet collection names than those actually
229 employed in physics analysis (caveat emptor!). Each alias consists of the specification
230 <pre>
231 source-&gt;target
232 </pre>
233 where source indicates the name used in physics analysis, while target is the name to be used for calibration purposes.
234 </li>
235 <li>eigenvector decomposition:
236 <pre>
237 runEigenVectorMethod: if set to true this will carry out an eigenvector decomposition, the results
238 of which can subsequently be used as calibration scale factor "variations"
239 correctly accounting for bin-to-bin correlations. For detailed usage of
240 b-tagging results (e.g. in profile likelihood fits) this is recommended (default: false)
241 excludeFromCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
242 decomposition, for all flavours.
243 excludeFromBCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
244 decomposition for b jets
245 excludeFromCCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
246 decomposition for c jets
247 excludeFromTCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
248 decomposition for tau "jets"
249 excludeFromLightCovMatrix: semicolon-separated list of uncertainties to be excluded from the eigenvector
250 decomposition for light-flavour jets
251 </pre>
252 </li>
253</ul>
254End_Html */
255//
256// CalibrationDataInterfaceROOT.cxx, (c) ATLAS Detector software //
258
261
264#include <boost/algorithm/string.hpp>
265
266#include "TMath.h"
267#include "TEnv.h"
268#include "TFile.h"
269#include "TObjString.h"
270#include <iostream>
271#include <iomanip>
272#include <cmath>
273#include <cassert>
274#include <cstring>
275
276using std::string;
277using std::cout;
278using std::cerr;
279using std::endl;
280using std::setw;
281
286using boost::trim;
287
288#ifndef __CINT__
290#endif
291
292//________________________________________________________________________________
293Analysis::CalibrationDataInterfaceROOT::CalibrationDataInterfaceROOT(const string& taggerName, const std::string& configname, const std::string& pathname) :
294 m_runEigenVectorMethod(false), m_EVStrategy(SFEigen), m_useRecommendedEVExclusions(false), m_verbose(true),
295 m_absEtaStrategy(GiveUp), m_otherStrategy(Flag)
296{
297 // Normal constructor.
298 //
299 // taggerName: this should correspond to the tagger name as used in the calibration ROOT file
300 // configname: full name of the configuration file
301 // pathname: directory specification for separate scale factor or efficiency ROOT file
302
303 m_taggerName = taggerName;
304
305 TEnv env;
306 env.ReadFile(configname.c_str(),kEnvGlobal);
307
308
309 // ROOT file containing the calibrations
310 TString filename = env.GetValue("File", "BTaggingPerformanceCalibrations.root");
311 m_filenameEff = string(env.GetValue("FileEff", "")); trim(m_filenameEff);
312 m_filenameSF = string(env.GetValue("FileSF", "")); trim(m_filenameSF);
313 if (m_filenameEff == "") {
314 m_filenameEff = pathname + filename.Data();
315 }
316 if (m_filenameSF == "") {
317 m_filenameSF = pathname + filename.Data();
318 }
319
320 if (m_verbose) {
321 cout << "=== CalibrationDataInterfaceROOT::CalibrationDataInterfaceROOT ===" << endl;
322 cout << " Config name : " << configname << endl;
323 cout << " taggerName : " << taggerName << endl;
324 cout << " Efficiency file name : " << m_filenameEff << endl
325 << " SF file name : " << m_filenameSF << endl;
326 }
327
328 m_fileEff = TFile::Open(m_filenameEff.c_str(), "READ");
329 if (m_filenameEff == m_filenameSF)
330 m_fileSF = m_fileEff;
331 else
332 m_fileSF = TFile::Open(m_filenameSF.c_str(), "READ");
333
334 if (m_verbose) {
335 TObjString* s;
336 m_fileSF->GetObject("VersionInfo/BuildNumber", s);
337 if (s) cout << " CDI file build number: " << s->GetName() << endl;
338 cout << endl;
339 }
340
341 m_flavours = { "B", "C", "T", "Light" };
342 string testPrefix(taggerName); testPrefix += ".";
343
344 // Since TEnv doesn't allow for straight retrieval of vectors of strings, expect
345 // semicolon-separated entries (semicolon because ROOT considers this as a "special"
346 // token anyway in object names).
347 string::size_type end;
348
349 // Calibration names for the efficiencies
350 std::map<string, std::vector<string> > effNames;
351 for (auto const& flavour : m_flavours) {
352 string test(testPrefix); test += "EfficiencyCalibration"; test += flavour; test += "Name";
353 effNames[flavour] = split(string(env.GetValue(test.c_str(), "default")));
354 }
355 setEffCalibrationNames(effNames);
356
357 // Calibration names for the efficiency scale factors
358 std::map<string, string> SFNames;
359 for (auto const& flavour : m_flavours) {
360 string test(testPrefix); test += "ScaleFactorCalibration"; test += flavour; test += "Name";
361 SFNames[flavour] = string(env.GetValue(test.c_str(), "default")); trim(SFNames[flavour]);
362 }
363 setSFCalibrationNames(SFNames);
364
365 // Since TEnv doesn't allow for straight retrieval of vectors of strings, expect
366 // semicolon-separated entries (semicolon because ROOT considers this as a "special"
367 // token anyway in object names).
368 // Don't prefix this since the aliases are common to all taggers (even if they are read again for each tagger).
369 string AL(env.GetValue("aliases", ""));
370 if (AL.size() > 0) {
371 do {
372 end = AL.find(";");
373 string alias = AL.substr(0, end);
374 // Each alias specification uses an arrow ("->"). Forget about entries
375 // not properly following this specification.
376 // NB: TEnv imposes a maximum string length of 1024 characters -- is this a problem?
377 string::size_type arrow = alias.find("->");
378 if (arrow == string::npos) continue;
379 string target = alias.substr(0,arrow); trim(target);
380 m_aliases[target] = alias.substr(arrow+2); trim(m_aliases[target]);
381 if (end != string::npos) AL = AL.substr(end+1);
382 } while (end != string::npos);
383 }
384
385 //run egenvector method or not?
386 string test="runEigenVectorMethod";
387 m_runEigenVectorMethod=(bool)env.GetValue(test.c_str(),0);
388
389 if (m_runEigenVectorMethod) {
390 // Retrieve the list of systematic uncertainties not to be considered when building up
391 // the full covariance matrix used for the eigenvector method.
392 // We do this in two steps: first, for backward compatibility reasons, a flavour-independent list is scanned.
393 // Second, flavour-specific lists are scanned.
394 test = "excludeFromCovMatrix";
395 std::vector<std::string> to_exclude = split(env.GetValue(test.c_str(), ""));
396 // Copy the resulting list to all flavours
397 for (auto const& flavour : m_flavours) {
398 m_excludeFromCovMatrix[flavour] = to_exclude;
399 }
400 for (auto const& flavour : m_flavours) {
401 test = "excludeFrom"; test += flavour; test += "CovMatrix";
402 to_exclude = split(env.GetValue(test.c_str(), ""));
403 // Append to the existing list
404 m_excludeFromCovMatrix[flavour].insert(m_excludeFromCovMatrix[flavour].end(), to_exclude.begin(), to_exclude.end());
405 }
406
407 unsigned int n_excluded = 0;
408 for (auto const& flavour : m_flavours) {
409 n_excluded += m_excludeFromCovMatrix[flavour].size();
410 }
411 if (m_verbose) {
412 cout << " List of uncertainties to exclude:";
413 if (n_excluded == 0) cout << " none";
414 for (auto const& flavour : m_flavours) {
415 if (m_excludeFromCovMatrix[flavour].size() > 0) {
416 cout << "\n\t" << flavour << ":\t";
417 for (unsigned int i = 0; i < m_excludeFromCovMatrix[flavour].size(); ++i) {
418 cout << m_excludeFromCovMatrix[flavour].at(i);
419 if (i+1 == m_excludeFromCovMatrix[flavour].size()) cout << "; ";
420 }
421 cout << endl;
422 }
423 }
424 cout << endl;
425 }
426
427 // The following determines whether also pre-determined (recommended) lists of uncertainties are to be excluded from EV decomposition.
428 // These lists are stored with the CalibrationDataContainers, which have not been instantiated yet (so we cannot show them at this point).
429 m_useRecommendedEVExclusions = (bool) env.GetValue("ExcludeRecommendedFromEigenVectorTreatment", false);
430
431 // determine also the eigenvector reduction strategies
432 std::map<string, EVReductionStrategy> mappings;
433 mappings["Loose"] = Loose;
434 mappings["Medium"] = Medium;
435 mappings["Tight"] = Tight;
436 for (auto const& flavour : m_flavours) {
437 test = testPrefix; test += "EigenvectorReduction"; test += flavour;
438 std::string reduction = string(env.GetValue(test.c_str(), "Loose")); trim(reduction);
439 m_EVReductions[flavour] = mappings.find(reduction) == mappings.end() ? mappings["Loose"] : mappings.find(reduction)->second;
440 }
441 }
442
443 // determine |eta| validity range
444 m_maxAbsEta = env.GetValue("MaxAbsEta", 2.5);
445 if (m_maxAbsEta < 0) m_maxAbsEta = 2.5;
446
447 // set validation / protection strategy in case an out-of-bounds eta value is specified
448 string strategy = string(env.GetValue("OutOfBoundsEta", "GiveUp")); trim(strategy);
449 if (strategy == "GiveUp") m_absEtaStrategy = GiveUp;
450 else if (strategy == "Flag") m_absEtaStrategy = Flag;
451 else if (strategy == "Ignore") m_absEtaStrategy = Ignore;
452 else {
453 cerr << "unknown |eta| extrapolation strategy: " << strategy << ", setting to GiveUp" << endl;
454 m_absEtaStrategy = GiveUp;
455 }
456
457 // set validation / protection strategy in case out-of-bounds variables are specified
458 strategy = string(env.GetValue("OutOfBoundsOther", "Flag")); trim(strategy);
459 if (strategy == "GiveUp") m_otherStrategy = GiveUp;
460 else if (strategy == "GiveUpExtrapolated") m_otherStrategy = GiveUpExtrapolated;
461 else if (strategy == "Flag") m_otherStrategy = Flag;
462 else if (strategy == "Ignore") m_otherStrategy = Ignore;
463 else {
464 cerr << "unknown general extrapolation strategy: " << strategy << ", setting to Flag" << endl;
465 m_otherStrategy = Flag;
466 }
467
468 // maximum tag weight to accept
469 m_maxTagWeight = env.GetValue("MaxTagWeight", 10.0);
470
471 // MC/MC (hadronisation) scale factors: making this user-steerable is intended to be *temporary* only
472 m_useMCMCSF = (bool) env.GetValue("useMCMCSF", 1);
473 // MC/MC (topology) scale factors: making this user-steerable is intended to be *temporary* only
474 m_useTopologyRescaling = (bool) env.GetValue("useTopologySF", 0);
475
476 if (m_verbose) cout << "======= end of CalibrationDataInterfaceROOT instantiation ========" << endl;
477}
478
479
480//________________________________________________________________________________
482 const char* fileSF, const char* fileEff,
483 const std::vector<std::string>& jetAliases,
484 const std::map<std::string, std::string>& SFNames,
485 const std::map<std::string, std::vector<std::string> >& EffNames,
486 const std::map<std::string, std::vector<std::string> >& excludeFromEV,
487 const std::map<std::string, EVReductionStrategy>& EVReductions,
488 bool useEV, Uncertainty strat, bool useMCMCSF, bool useTopologyRescaling,
489 bool useRecommendedEEVExclusions, bool verbose,
490 std::vector<std::string> flavours) :
491 m_filenameSF(fileSF), m_filenameEff(""), m_flavours(std::move(flavours)),
492 m_runEigenVectorMethod(useEV), m_EVStrategy(strat), m_EVReductions(EVReductions),
493 m_useRecommendedEVExclusions(useRecommendedEEVExclusions), m_verbose(verbose),
494 m_useMCMCSF(useMCMCSF), m_useTopologyRescaling(useTopologyRescaling),
497{
498 // Normal constructor avoiding the need for a .env file.
499 //
500 // taggerName: this should correspond to the tagger name as used in the calibration ROOT file
501 // fileSF: full path of the calibration ROOT file containing the calibration scale factors
502 // fileEff: optional full path name of a ROOT file containing additional MC efficiency maps
503 // (use a null pointer to disable the use of such additional file)
504 // flavours; This should correspond to the list of flavour labels that's used by a tagger, and
505 // which corresponds to the labels used in internal maps
506 // jetAliases: this can be used to convert jet collection names to the corresponding names in the
507 // calibration ROOT file (this may be useful as e.g. the collection names in the
508 // calibration ROOT file have the JVF criterion attached as a suffix).
509 // Each alias is specified as
510 // nameOrig->nameTarget,
511 // where nameOrig and nameTarget are the names of the input jet collection and the
512 // jet collection name as used in the calibration ROOT file, respectively.
513 // SFNames: map specifying for each of the calibration flavours ("B", "C", "T", "Light") the
514 // name of the scale factor calibration object
515 // EffNames: map specifying for each of the calibration flavours ("B", "C", "T", "Light") the
516 // names of the possibly relevant efficiency calibration objects
517 // excludeFromEV: map specifying for each of the calibration flavours ("B", "C", "T", "Light") the
518 // systematic uncertainties to be excluded from the Eigenvector variation treatment
519 // (this is used only if Eigenvector variations are used to begin with)
520 // EVReductions: Eigenvector variation reduction strategies for "B", "C", "Light" jets (again,
521 // this is only relevant if Eigenvector variations are used to begin with)
522 // useEV: switch specifying if Eigenvector variations will be used or not
523 // useMCMCSF: switch specifying if generator-dependent scale factors are to be applied or not
524
525 // Note: at present, the means to change the strategies and maximum values initialized above do not exist
526 // when using this constructor
527
528 if (m_verbose) {
529 cout << "=== CalibrationDataInterfaceROOT::CalibrationDataInterfaceROOT ===" << endl;
530 cout << " taggerName : " << taggerName.c_str() << endl;
531 cout << " Systematic strategy : ";
532 if (m_EVStrategy == Analysis::Uncertainty::SFEigen){
533 cout << "SFEigen" << endl;
534 } else if (m_EVStrategy == Analysis::Uncertainty::SFGlobalEigen){
535 cout << "SFGlobalEigen" << endl;
536 } else {
537 cout << " Other" << endl;
538 }
539 if (fileEff) cout << " Efficiency file name : " << fileEff << endl;
540 cout << " SF file name : " << fileSF << endl
541 << endl;
542 }
543
544 m_taggerName = taggerName;
545
546 m_fileSF = TFile::Open(fileSF, "READ");
547 if (fileEff && strcmp(fileSF, fileEff) != 0) {
548 m_filenameEff = string(fileEff);
549 m_fileEff = TFile::Open(fileEff, "READ");
550 } else
552
553 if (m_verbose) {
554 TObjString* s;
555 m_fileSF->GetObject("VersionInfo/BuildNumber", s);
556 if (s) cout << " CDI file build number: " << s->GetName() << endl;
557 cout << endl;
558 }
559
560 for (unsigned int i = 0; i < jetAliases.size(); ++i) {
561 // Each alias specification uses an arrow ("->"). Forget about entries
562 // not properly following this specification.
563 string::size_type arrow = jetAliases[i].find("->");
564 if (arrow == string::npos) continue;
565 m_aliases[jetAliases[i].substr(0,arrow)] = jetAliases[i].substr(arrow+2);
566 }
567
568 setEffCalibrationNames(EffNames);
569 setSFCalibrationNames(SFNames);
570
572 // if we want to run EV, then decide which one
573 // The following should hold for both eigenvector decomposition methods (SFEigen and SFGlobalEigen)
574 // The global one simply adapts itself to using the m_excludeFromCovMatrix to perform the same task
575
576 m_excludeFromCovMatrix = excludeFromEV;
577 unsigned int n_excluded = 0;
578 for (auto const& flavour : m_flavours) {
579 n_excluded += m_excludeFromCovMatrix[flavour].size();
580 }
581 if (m_verbose) {
582 cout << " List of uncertainties to exclude:";
583 if (n_excluded == 0) cout << " none";
584 for (auto const& flavour : m_flavours) {
585 if (m_excludeFromCovMatrix[flavour].size() > 0) {
586 cout << "\n\t" << flavour << ":\t";
587 for (unsigned int i = 0; i < m_excludeFromCovMatrix[flavour].size(); ++i) {
588 cout << m_excludeFromCovMatrix[flavour].at(i);
589 if (i+1 == m_excludeFromCovMatrix[flavour].size()) cout << "; ";
590 }
591 cout << endl;
592 }
593 }
594 cout << endl;
595 }
596
597 }
598
599 if (m_verbose) cout << "======= end of CalibrationDataInterfaceROOT instantiation ========" << endl;
600}
601
602//________________________________________________________________________________
604{
605 // Default constructor for PROOF purposes
606
607 m_fileEff=0;
608 m_fileSF=0;
609}
610
611//________________________________________________________________________________
621{
622 // Copy constructor. Note that the "cacheable" items aren't copied (they will be re-created if needed)
623
624 // The TFile objects cannot be copied. Therefore, create duplicate objects starting from the filenames
625 m_fileSF = TFile::Open(m_filenameSF.c_str(), "READ");
628 else
629 m_fileEff = TFile::Open(m_filenameEff.c_str(), "READ");
630}
631
632//________________________________________________________________________________
634{
635 // Destructor
636 if ((m_fileEff!=0) && (m_fileSF!=0)) {
637 if (m_fileEff == m_fileSF) {
638 m_fileEff->Close();
639 delete m_fileEff; m_fileEff = 0;
640 } else {
641 m_fileEff->Close();
642 m_fileSF->Close();
643 delete m_fileEff; m_fileEff = 0;
644 delete m_fileSF; m_fileSF = 0;
645 }
646 }
647 // delete also the stored objects (these are owned by us)
648 for (std::vector<CalibrationDataContainer*>::iterator it = m_objects.begin(); it != m_objects.end(); ++it) {
649 if (*it) {
650 delete *it; *it = 0;
651 }
652 }
653
654 for (std::map<std::string, HadronisationReferenceHelper*>::iterator it = m_refMap.begin();
655 it != m_refMap.end(); ++it) {
656 if(it->second)
657 { delete it->second; it->second=nullptr; }
658 }
659
660 // Print summary output on out-of-bounds issues
661 if (m_absEtaStrategy == Flag && m_verbose) {
662 bool found = false;
663 cout << "\t\tCalibrationDataInterfaceROOT |eta| out-of-bounds summary:" << endl;
664 for (unsigned int index = 0; index < m_mainCounters.size(); ++index)
665 if (m_etaCounters[index] > 0) {
666 found = true;
667 cout << "\t\t\t" << nameFromIndex(index) << ": " << m_etaCounters[index] << endl;
668 }
669 if (!found) cout << "\t\t\tNo issues found" << endl;
670 }
671 if (m_otherStrategy == Flag && m_verbose) {
672 bool found = false;
673 cout << "\t\tCalibrationDataInterfaceROOT object out-of-bounds summary:" << endl;
674 for (unsigned int index = 0; index < m_mainCounters.size(); ++index)
676 found = true;
677 cout << "\t\t\t" << nameFromIndex(index)
678 << " general: " << m_mainCounters[index]
679 << ", extrapolated: " << m_extrapolatedCounters[index]
680 << endl;
681 }
682 if (!found) cout << "\t\t\tNo issues found" << endl;
683 }
684}
685
686//________________________________________________________________________________
687bool
689 const std::string& OP,
690 const std::string& author,
691 bool isSF, unsigned int& index,
692 unsigned int mapIndex)
693{
694 // Retrieve the integer index corresponding to a given combination of
695 // flavour label / tagger / working point / jet collection name, and separately
696 // for calibration scale factors and MC efficiencies (all these ingredients are needed
697 // to specify fully the calibration object).
698 // In fact this method will also trigger the retrieval of the object itself, if not already
699 // done, and will cache it internally. The absence of the requested calibration object will
700 // be flagged by a false return value.
701 // This method is used internally but should also be called by users in order to exploit the
702 // "code speed-up" features documented above.
703 //
704 // label: jet flavour label
705 // OP: tagger working point
706 // author: jet collection name
707 // isSF: set to true (false) for scale factors (MC efficiencies)
708 // index: resulting index (meaningful only for a 'true' function return value)
709 // mapIndex: index to the MC efficiency map to be used
710
711 index = 0;
712
713 // construct the full name from the label, operating point, SF/Eff choice;
714 // then look up this full name
715 string name = fullName(author, OP, label, isSF, mapIndex);
716 std::map<string, unsigned int>::const_iterator it = m_objectIndices.find(name);
717 if (it == m_objectIndices.end()) {
718 // If no container is found, attempt to retrieve it here (this is so that users won't
719 // have to call the named scale factor etc. methods once just to retrieve the container).
720 string flavour = (label == "N/A") ? "Light" : label;
721 string cntname = getContainername(flavour, isSF, mapIndex);
722 if (m_verbose) std::cout << "CalibrationDataInterfaceROOT->retrieveCalibrationIndex : container name is " << cntname << std::endl;
723 retrieveContainer(flavour, OP, author, cntname, isSF, m_verbose); // Only call this if you want to retrieve a currently not available container
724 it = m_objectIndices.find(name);
725 if (it == m_objectIndices.end()) return false;
726 } else {
727 if (m_verbose) std::cout << "CalibrationDataInterfaceROOT->retrieveCalibrationIndex : container " << name << " already cached! " << std::endl;
728 }
729
730 index = it->second;
731 return true;
732}
733
734//________________________________________________________________________________
737 const string& label, const string& OP,
738 Uncertainty unc, unsigned int numVariation,
739 unsigned int mapIndex) //const
740{
741 // Scale factor retrieval identifying the requested calibration object by name.
742 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
743 // above, and will be a dummy value in case an error occurs.
744 //
745 // variables: object holding kinematic (and other) information needed to compute the result
746 // label: jet flavour label
747 // OP: tagger operating point
748 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
749 // named variations are to be computed)
750 // numVariation: variation index (in case of eigenvector or named variations)
751 // mapIndex: index to the efficiency map to be used (this is needed for MC/MC scale factor
752 // application)
753 unsigned int indexEff, indexSF;
754 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) && retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
755 cerr << "getScaleFactor: unable to find SF calibration for object " << fullName(variables.jetAuthor, OP, label, false, mapIndex) << " or SF calibration for object " << fullName(variables.jetAuthor, OP, label, true) << endl;
756 // Return a dummy result if the object is not found
758 }
759
760 Analysis::CalibResult result; // the following is SF #3
761 return (getScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, label) == Analysis::kError) ?
763}
764
765//________________________________________________________________________________
768 unsigned int indexSF, unsigned int indexEff,
769 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
770{
771 // Scale factor retrieval identifying the requested calibration object by index.
772 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
773 // above, and will be a dummy value in case an error occurs.
774 //
775 // variables: object holding kinematic (and other) information needed to compute the result
776 // indexSF: index to scale factor calibration object
777 // indexEff: index to MC efficiency object
778 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
779 // named variations are to be computed)
780 // numVariation: variation index (in case of eigenvector or named variations)
781 Analysis::CalibResult result; // the following is SF #3
782 return (getScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
784}
785
786//________________________________________________________________________________
789 unsigned int indexSF, unsigned int indexEff,
790 Uncertainty unc, unsigned int numVariation,
791 Analysis::CalibResult& result, const string& flavour)
792{
793 // Scale factor retrieval identifying the requested calibration object by index.
794 //
795 // variables: object holding kinematic (and other) information needed to compute the result
796 // indexSF: index to scale factor calibration object
797 // indexEff: index to MC efficiency object
798 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
799 // named variations are to be computed)
800 // numVariation: variation index (in case of eigenvector or named variations)
801 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
802 // A dummy value will be returned in case of an error.
803
805 if (! container) {
806 cerr << "getScaleFactor: error retrieving container!" << endl;
807 return Analysis::kError;
808 }
809
810 // perform out-of-bound check of jet eta
811 if (!checkAbsEta(variables, indexSF)) {
812 if (m_verbose)
813 cerr << "Jet |eta| is outside of the boundary!" << endl;
814 return Analysis::kRange;
815 }
816
817 // retrieve the MC/MC scale factor
818 double MCMCSF = m_useMCMCSF ? getMCMCScaleFactor(variables, indexSF, indexEff) : 1; // if we don't want to switch generator, MCMCSF = 1, as it should be
819
820 if (!m_runEigenVectorMethod && (unc == SFEigen || unc == SFNamed || unc == SFGlobalEigen))
821 {
822 cerr << " ERROR. Trying to call eigenvector method but initialization not switched on in b-tagging configuration." << endl;
823 cerr << " Please correct your configuration first. Nominal uncertainties used. " << endl;
824 }
825
826 // Procede with eigenvariations methods i.e. return the SF variations
827 if (unc == SFEigen || unc == SFNamed || unc==SFGlobalEigen) {
828 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation;
829 try {
830 eigenVariation=m_eigenVariationsMap.at(container);
831 } catch (const std::out_of_range&) {
832 cerr << " Could not retrieve eigenvector variation, while it should have been there." << endl;
833 return Analysis::kError;
834 }
835 TH1* up=0;
836 TH1* down=0;
837 bool extrapolate = false; // store if the numVariation is the extrapolation named uncertainty index
838 if (unc == SFEigen || unc==SFNamed){
839 unsigned int maxVariations = (unc == SFEigen) ? eigenVariation->getNumberOfEigenVariations() : eigenVariation->getNumberOfNamedVariations();
840 if (numVariation > maxVariations-1) {
841 cerr << "Asked for " << ((unc == SFEigen) ? "eigenvariation" : "named variation") << " number: " << numVariation << " but overall number of available variations is: " << maxVariations << endl;
842 return Analysis::kError;
843 }
844 bool isOK = eigenVariation->getEigenvectorVariation(numVariation,up,down);
845 if (!isOK) {
846 cerr << "Eigenvector object is there but cannot retrieve up and down uncertainty histograms." << endl;
847 return Analysis::kError;
848 }
849 // the 'extrapolation' uncertainty (always a named one) needs a somewhat special treatment
850 extrapolate = (unc == SFNamed) ? eigenVariation->isExtrapolationVariation(numVariation) : false;
851
852 } else if (unc == SFGlobalEigen) {
853 std::shared_ptr<CalibrationDataGlobalEigenVariations> GEV = std::dynamic_pointer_cast<CalibrationDataGlobalEigenVariations>(eigenVariation); //dynamic_cast<std::shared_ptr<CalibrationDataGlobalEigenVariations> >(eigenVariation);
854 if (not GEV){
855 cerr << "Analysis::CalibrationDataInterfaceROOT::getScaleFactor: dynamic cast failed\n";
856 return Analysis::kError;
857 }
858 unsigned int maxVariations = GEV->getNumberOfEigenVariations(flavour); // <----- This gets the number of variations of the flavour
859 if (numVariation > maxVariations-1) {
860 cerr << "Asked for global eigenvariation number: " << numVariation << " but overall number of available variations is: " << maxVariations << endl;
861 return Analysis::kError;
862 }
863 bool isOK = GEV->getEigenvectorVariation(flavour, numVariation,up,down);
864 if (!isOK) {
865 cerr << "Eigenvector object is there but cannot retrieve up and down uncertainty histograms." << endl;
866 return Analysis::kError;
867 }
868 // the 'extrapolation' uncertainty (always a named one) needs a somewhat special treatment
869 extrapolate = GEV->isExtrapolationVariation(numVariation, flavour);
870 } else {
871 std::cerr << "ERROR: you requested " << unc << " but that isn't in the set of (SFEigen, SFGlobalEigen, SFNamed) for eigenvariations. " << std::endl;
872 return Analysis::kError;
873 }
874
875 double valueUp;
876 double valueDown;
877 Analysis::CalibrationStatus statUp = container->getResult(variables, valueUp, up, extrapolate); // This is what actually retrieves results from the container
878 Analysis::CalibrationStatus statDown = container->getResult(variables, valueDown,down, extrapolate);
879
880 if (statUp == Analysis::kError || statDown == Analysis::kError)
881 return Analysis::kError;
882 if (m_otherStrategy == GiveUp)
883 assert (statUp != Analysis::kRange); // no need to test also statDown
885 assert (statUp != Analysis::kExtrapolatedRange); // no need to test also statDown
886 else if (m_otherStrategy == Flag) {
887 if (statUp == Analysis::kRange)
888 increaseCounter(indexSF);
889 else if (statUp == Analysis::kExtrapolatedRange)
891 }
892
893 result.first = MCMCSF*valueUp;
894 result.second = MCMCSF*valueDown;
895
896 // Prevent negative return values. Should the comparison be against a strict 0?
897 result.first = std::max(Analysis::CalibZERO, result.first);
898 result.second = std::max(Analysis::CalibZERO, result.second);
899
900 return statUp; // end of getScaleFactor if SFEigen, SFGlobalEigen, or SFNamed is set
901
902
903 } // The above returns the up/down varied scale factor
904 //Proceed with no-eigenvector result
905
906 // always retrieve the result itself
907 double value;
908 Analysis::CalibrationStatus status = container->getResult(variables, value);
909 if (status == Analysis::kError) {
910 cerr << "getScaleFactor: error retrieving result in non-EV context!" << endl;
911 return status;
912 }
913 if (m_otherStrategy == GiveUp){
914 assert (status != Analysis::kRange);
915 } else if (m_otherStrategy == GiveUpExtrapolated) {
916 assert (status != Analysis::kExtrapolatedRange);
917 } else if (m_otherStrategy == Flag) {
918 if (status == Analysis::kRange){
919 increaseCounter(indexSF);
920 } else if (status == Analysis::kExtrapolatedRange) {
922 }
923 }
924
925 // retrieve the statistical uncertainty if desired
926 double stat(0);
927 if (unc == Total || unc == Statistical) {
928 if (container->getStatUncertainty(variables, stat) == Analysis::kError) {
929 cerr << "getScaleFactor: error retrieving Scale factor parameter covariance matrix!" << endl;
930 return Analysis::kError;
931 }
932 }
933
934 Analysis::UncertaintyResult resSyst(0,0);
935 if (unc == Total || unc == Systematic) {
936 if (container->getSystUncertainty(variables, resSyst) == Analysis::kError) {
937 cerr << "getScaleFactor: error retrieving Scale factor parameter systematic uncertainty!" << endl;
938 return Analysis::kError;
939 }
940 } else if (unc == Extrapolation) {
941 // this uncertainty is special, since it is not normally to be combined into the overall systematic uncertainty
942 if (container->getUncertainty("extrapolation", variables, resSyst) == Analysis::kError)
943 cerr << "getScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
944 } else if (unc == TauExtrapolation) {
945 // also this uncertainty is special, since it it singles out an uncertainty relevant only for tau "jets",
946 // and some care has to be taken not to duplicate or omit uncertainties
947 if (container->getUncertainty("extrapolation from charm", variables, resSyst) == Analysis::kError)
948 cerr << "getScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
949 }
950
951 double uncertainty = combinedUncertainty(stat, resSyst);
952 result.first = MCMCSF*value;
953 result.second = MCMCSF*uncertainty;
954
955 // Prevent negative return values. Should the comparison be against a strict 0?
956 result.first = std::max(Analysis::CalibZERO, result.first);
957 return status;
958
959}
960
961//________________________________________________________________________________
964 const string& label, const string& OP,
965 Uncertainty unc, unsigned int mapIndex)
966{
967 // MC efficiency retrieval identifying the requested calibration object by name.
968 // The return value is a (value, uncertainty) pair, as documented above, and will
969 // be a dummy value in case an error occurs.
970 //
971 // variables: object holding kinematic (and other) information needed to compute the result
972 // label: jet flavour label
973 // OP: tagger operating point
974 // unc: keyword indicating what uncertainties to evaluate
975 // mapIndex: index to the efficiency map to be used
976
977 unsigned int index;
978 if (! retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, index, mapIndex)) {
979 cerr << "getMCEfficiency: unable to find Eff calibration for object " << fullName(variables.jetAuthor, OP, label, false, mapIndex) << endl;
980 // Return a dummy result if the object is not found
982 }
983
985 return (getMCEfficiency(variables, index, unc, result) == Analysis::kError) ?
987}
988
989//________________________________________________________________________________
992 unsigned int index, Uncertainty unc) //const
993{
994 // MC efficiency retrieval identifying the requested calibration object by index.
995 // The return value is a (value, uncertainty) pair, as documented above, and will
996 // be a dummy value in case an error occurs.
997 //
998 // variables: object holding kinematic (and other) information needed to compute the result
999 // index: index to calibration object
1000 // unc: keyword indicating what uncertainties to evaluate
1001
1003 return (getMCEfficiency(variables, index, unc, result) == Analysis::kError) ?
1005}
1006
1007//________________________________________________________________________________
1010 unsigned int index, Uncertainty unc,
1012{
1013 // MC efficiency retrieval identifying the requested calibration object by index.
1014 //
1015 // variables: object holding kinematic (and other) information needed to compute the result
1016 // index: index to calibration object
1017 // unc: keyword indicating what uncertainties to evaluate
1018 // result: (value, uncertainty) variation pair.
1019 // A dummy value will be returned in case of an error.
1020
1022 if (! container) return Analysis::kError;
1023
1024 // perform out-of-bound check of jet eta
1025 if (!checkAbsEta(variables, index)) {
1026 if (m_verbose)
1027 cerr << "Jet |eta| is outside of the boundary!" << endl;
1028 return Analysis::kRange;
1029 }
1030
1031
1032 // always retrieve the result itself
1033 double value;
1034 Analysis::CalibrationStatus status = container->getResult(variables, value);
1035 if (status == Analysis::kError) return status;
1036 if (m_otherStrategy == GiveUp)
1037 assert (status != Analysis::kRange); // no need to test also statDown
1038 else if (m_otherStrategy == Flag)
1039 if (status == Analysis::kRange)
1040 this->increaseCounter(index);
1041
1042 // retrieve the statistical uncertainty if desired
1043 double stat(0);
1044 if (unc == Total || unc == Statistical) {
1045 if (container->getStatUncertainty(variables, stat) == Analysis::kError) {
1046 cerr << "getMCEfficiency: error retrieving MC efficiency parameter covariance matrix!" << endl;
1047 return Analysis::kError;
1048 }
1049 }
1050
1051 // Temporary(?) hack: comment this out since the present MC results don't have "systematics" contributions
1052 // Analysis::UncertaintyResult resSyst(0,0);
1053 // if (unc == Total || unc == Systematic) {
1054 // if (container->getSystUncertainty(variables, resSyst) == Analysis::kError)
1055 // cerr << "getScaleFactor: error retrieving Scale factor parameter covariance matrix!"
1056 // << endl;
1057 // }
1058
1059 // since there is no combination of stat/syst uncertainties to be made, comment this out too
1060 double uncertainty = stat; // combinedUncertainty(stat, resSyst);
1061 result.first = std::max(0., std::min(1., value));
1062 result.second = uncertainty;
1063
1064 return status;
1065}
1066
1067//====================== efficiency retrieval ==========================================
1068
1069//________________________________________________________________________________
1072 const string& label,
1073 const string& OP, Uncertainty unc, const std::string& flavour,
1074 unsigned int numVariation, unsigned int mapIndex)
1075{
1076 // Data efficiency retrieval identifying the requested calibration objects by name.
1077 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor.
1078 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1079 // above, and will be a dummy value in case an error occurs.
1080 //
1081 // variables: object holding kinematic (and other) information needed to compute the result
1082 // label: jet flavour label
1083 // OP: tagger operating point
1084 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1085 // named variations are to be computed)
1086 // numVariation: variation index (in case of eigenvector or named variations)
1087 // mapIndex: index to the efficiency map to be used
1088
1089 unsigned int indexSF, indexEff;
1090 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) &&
1091 retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
1092 cerr << "getEfficiency: unable to find Eff calibration for object " << fullName(variables.jetAuthor, OP, label, false, mapIndex) << " or SF calibration for object " << fullName(variables.jetAuthor, OP, label, true) << endl;
1093 // Return a dummy result if the object is not found
1094 return Analysis::dummyResult;
1095 }
1096
1098 return (getEfficiency(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ? Analysis::dummyResult : result;
1099}
1100
1101//________________________________________________________________________________
1104 unsigned int indexSF, unsigned int indexEff,
1105 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
1106{
1107 // Data efficiency retrieval identifying the requested calibration objects by index.
1108 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor.
1109 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1110 // above, and will be a dummy value in case an error occurs.
1111 //
1112 // variables: object holding kinematic (and other) information needed to compute the result
1113 // indexSF: index to scale factor calibration object
1114 // indexEff: index to MC efficiency object
1115 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1116 // named variations are to be computed)
1117 // numVariation: variation index (in case of eigenvector or named variations)
1118
1120 return (getEfficiency(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
1122}
1123
1124//________________________________________________________________________________
1127 unsigned int indexSF, unsigned int indexEff,
1128 Uncertainty unc, unsigned int numVariation,
1129 Analysis::CalibResult& result, const std::string& flavour)
1130{
1131 // Data efficiency retrieval identifying the requested calibration objects by index.
1132 //
1133 // variables: object holding kinematic (and other) information needed to compute the result
1134 // indexSF: index to scale factor calibration object
1135 // indexEff: index to MC efficiency object
1136 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1137 // named variations are to be computed)
1138 // numVariation: variation index (in case of eigenvector or named variations)
1139 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1140 // A dummy value will be returned in case of an error.
1141
1142 Analysis::CalibResult sfResult;
1143 Analysis::CalibrationStatus sfStatus = getScaleFactor(variables, indexSF, indexEff, unc, numVariation, sfResult, flavour);
1144 if (sfStatus == Analysis::kError) return sfStatus;
1145 Analysis::CalibResult effResult;
1146 Analysis::CalibrationStatus effStatus= getMCEfficiency(variables, indexEff, unc, effResult);
1147 if (effStatus == Analysis::kError) return effStatus;
1148
1149 double relative = 0;
1150 double value = effResult.first;
1151 if (TMath::Abs(sfResult.first) > Analysis::CalibZERO) {
1152 value = std::min(effResult.first*sfResult.first, 1.);
1153
1154 // Treat the scale factor variation cases separately since the contents of the CalibResult are different
1155 // (e.g. 'value' above contains the upward variation)
1156 if (unc == SFEigen || unc == SFNamed) {
1157 double valueDown = effResult.first*sfResult.second;
1158 result.first = value; // up/down variataions of data-efficiency
1159 result.second = valueDown;
1160 return sfStatus;
1161 }
1162 if (value > 0.) {
1163 relative = effResult.second/effResult.first;
1164 double sfRelative = sfResult.second/sfResult.first;
1165 /*
1166 cout << "sferr=" << sfResult.second
1167 << "btag Calib relative=" << relative << " sfRelative=" << sfRelative << endl;
1168 */
1169 relative = TMath::Sqrt(sfRelative*sfRelative + relative*relative);
1170 }
1171 } else {
1172 // now never happens due to protection of SF return value:
1173 cerr << "ERROR: CalibrationDataInterfaceROOT::getEfficiency: SF null result, SF=" << sfResult.first << " MC eff=" << effResult.first << "; setting SF=1." << endl;
1174 relative = Analysis::dummyValue;
1175 }
1176
1177 result.first = value;
1178 result.second = value*relative;
1179 // "Select" the status code for the actual calibration (it is subject to more constraints)
1180 return sfStatus;
1181}
1182
1183
1184//________________________________________________________________________________
1187 const string& label,
1188 const string& OP, Uncertainty unc,
1189 unsigned int numVariation, unsigned int mapIndex)
1190{
1191 // Inefficiency scale factor retrieval identifying the requested calibration objects by name.
1192 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1193 // the inefficiency scale factor is then computed as the ratio of data to MC inefficiencies.
1194 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1195 // above, and will be a dummy value in case an error occurs.
1196 //
1197 // variables: object holding kinematic (and other) information needed to compute the result
1198 // label: jet flavour label
1199 // OP: tagger operating point
1200 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1201 // named variations are to be computed)
1202 // numVariation: variation index (in case of eigenvector or named variations)
1203 // mapIndex: index to the efficiency map to be used
1204
1205 unsigned int indexSF, indexEff;
1206 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) &&
1207 retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
1208 cerr << "getInefficiencyScaleFactor: unable to find Eff calibration for object "
1209 << fullName(variables.jetAuthor, OP, label, false, mapIndex)
1210 << " or SF calibration for object "
1211 << fullName(variables.jetAuthor, OP, label, true) << endl;
1212 // Return a dummy result if the object is not found
1213 return Analysis::dummyResult;
1214 }
1215
1217 return (getInefficiencyScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, label) == Analysis::kError) ?
1219}
1220
1221//________________________________________________________________________________
1224 unsigned int indexSF, unsigned int indexEff,
1225 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
1226{
1227 // Inefficiency scale factor retrieval identifying the requested calibration objects by index.
1228 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1229 // the inefficiency scale factor is then computed as the ratio of data to MC inefficiencies.
1230 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1231 // above, and will be a dummy value in case an error occurs.
1232 //
1233 // variables: object holding kinematic (and other) information needed to compute the result
1234 // indexSF: index to scale factor calibration object
1235 // indexEff: index to MC efficiency object
1236 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1237 // named variations are to be computed)
1238 // numVariation: variation index (in case of eigenvector or named variations)
1239
1241 return (getInefficiencyScaleFactor(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
1243}
1244
1245//________________________________________________________________________________
1248 unsigned int indexSF, unsigned int indexEff,
1249 Uncertainty unc, unsigned int numVariation,
1250 Analysis::CalibResult& result, const std::string& flavour)
1251{
1252 // Inefficiency scale factor retrieval identifying the requested calibration objects by index.
1253 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1254 // the inefficiency scale factor is then computed as the ratio of data to MC inefficiencies.
1255 //
1256 // variables: object holding kinematic (and other) information needed to compute the result
1257 // indexSF: index to scale factor calibration object
1258 // indexEff: index to MC efficiency object
1259 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1260 // named variations are to be computed)
1261 // numVariation: variation index (in case of eigenvector or named variations)
1262 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1263 // A dummy value will be returned in case of an error.
1264
1265 Analysis::CalibResult sfResult;
1266 Analysis::CalibrationStatus sfStatus = getScaleFactor(variables, indexSF, indexEff, unc, numVariation, sfResult, flavour);
1267 if (sfStatus == Analysis::kError) return sfStatus;
1268 Analysis::CalibResult effResult;
1269 Analysis::CalibrationStatus effStatus= getMCEfficiency(variables, indexEff, unc, effResult);
1270 if (effStatus == Analysis::kError) return effStatus;
1271
1272 double eff = std::min(effResult.first, 1.);
1273 // double efferr = effResult.second; // not needed as (per the code change indicated below) we are not doing anything with MC statistical uncertainties
1274 double sf = sfResult.first;
1275 double sferr = sfResult.second;
1276
1277 double val = 0.; // Analysis::dummyValue;
1278 double err = 0.; // Analysis::dummyValue;
1279 if (1. - eff > CalibZERO) {
1280 // Protect against negative scale factors
1281 val = std::max((1. - eff*sf), CalibZERO) / (1. - eff);
1282 // Treat the scale factor variation cases separately since the contents of the CalibResult are different
1283 // ('sf' and 'sferr' above contain the upward and downward variations, respectively).
1284 if (unc == SFEigen || unc == SFNamed) {
1285 double valDown = std::max((1. - eff*sferr), CalibZERO) / (1. - eff);
1286 result.first = val;
1287 result.second = valDown;
1288 return sfStatus;
1289 }
1290 // When using eigenvector (or named) variations (as above), only scale factor variations are considered.
1291 // For the sake of consistency, it has been decided (see https://its.cern.ch/jira/browse/AFT-350) to remove them also when EV variations aren't used
1292 //err = pow((1. - sf) / (1. - eff) * efferr, 2) + pow(eff*sferr, 2);
1293 err = pow(eff*sferr, 2);
1294 if (err > 0.)
1295 err = 1./(1. - eff) * TMath::Sqrt(err);
1296 // cout << "btag Calib Ineff err=" << err << endl;
1297 }
1298
1299 result.first = std::max(CalibZERO, val);
1300 result.second = err;
1301 // "Select" the status code for the actual calibration (it is subject to more constraints)
1302 return sfStatus;
1303}
1304
1305//________________________________________________________________________________
1308 const string& label,
1309 const string& OP, Uncertainty unc,
1310 unsigned int numVariation, unsigned int mapIndex)
1311{
1312 // Data inefficiency retrieval identifying the requested calibration objects by name.
1313 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1314 // the inefficiency is then computed as the 1 minus the efficiency.
1315 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1316 // above, and will be a dummy value in case an error occurs.
1317 //
1318 // variables: object holding kinematic (and other) information needed to compute the result
1319 // label: jet flavour label
1320 // OP: tagger operating point
1321 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1322 // named variations are to be computed)
1323 // numVariation: variation index (in case of eigenvector or named variations)
1324 // mapIndex: index to the efficiency map to be used
1325
1326 unsigned int indexSF, indexEff;
1327 if (! (retrieveCalibrationIndex (label, OP, variables.jetAuthor, false, indexEff, mapIndex) &&
1328 retrieveCalibrationIndex (label, OP, variables.jetAuthor, true, indexSF))) {
1329 cerr << "getInefficiency: unable to find Eff calibration for object "
1330 << fullName(variables.jetAuthor, OP, label, false, mapIndex)
1331 << " or SF calibration for object "
1332 << fullName(variables.jetAuthor, OP, label, true) << endl;
1333 // Return a dummy result if the object is not found
1334 return Analysis::dummyResult;
1335 }
1336
1338 return (getInefficiency(variables, indexSF, indexEff, unc, numVariation, result, label) == Analysis::kError) ?
1340}
1341
1342//________________________________________________________________________________
1345 unsigned int indexSF, unsigned int indexEff,
1346 Uncertainty unc, const std::string& flavour, unsigned int numVariation)
1347{
1348 // Data inefficiency retrieval identifying the requested calibration objects by index.
1349 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1350 // the inefficiency is then computed as the 1 minus the efficiency.
1351 // The return value is either a (value, uncertainty) or an (up, down) variation pair, as documented
1352 // above, and will be a dummy value in case an error occurs.
1353 //
1354 // variables: object holding kinematic (and other) information needed to compute the result
1355 // indexSF: index to scale factor calibration object
1356 // indexEff: index to MC efficiency object
1357 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1358 // named variations are to be computed)
1359 // numVariation: variation index (in case of eigenvector or named variations)
1360
1362 return (getInefficiency(variables, indexSF, indexEff, unc, numVariation, result, flavour) == Analysis::kError) ?
1364}
1365
1366//________________________________________________________________________________
1369 unsigned int indexSF, unsigned int indexEff,
1370 Uncertainty unc, unsigned int numVariation,
1371 Analysis::CalibResult& result, const std::string& flavour)
1372{
1373 // Data inefficiency retrieval identifying the requested calibration objects by index.
1374 // The data efficiency is computed as the product of MC efficiency and data/MC efficiency scale factor;
1375 // the inefficiency is then computed as the 1 minus the efficiency.
1376 //
1377 // variables: object holding kinematic (and other) information needed to compute the result
1378 // indexSF: index to scale factor calibration object
1379 // indexEff: index to MC efficiency object
1380 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1381 // named variations are to be computed)
1382 // numVariation: variation index (in case of eigenvector or named variations)
1383 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1384 // A dummy value will be returned in case of an error.
1385
1386 Analysis::CalibResult sfResult;
1387 Analysis::CalibrationStatus sfStatus = getScaleFactor(variables, indexSF, indexEff, unc, numVariation, sfResult, flavour);
1388 if (sfStatus == Analysis::kError) return sfStatus;
1389 Analysis::CalibResult effResult;
1390 Analysis::CalibrationStatus effStatus= getMCEfficiency(variables, indexEff, unc, effResult);
1391 if (effStatus == Analysis::kError) return effStatus;
1392
1393 double val = std::max(0., 1. - effResult.first * sfResult.first);
1394 double err = 0.; // Analysis::dummyValue;
1395
1396 // Bail out here if not both results are strictly positive
1397 if (effResult.first <= 0. || sfResult.first <= 0.) return Analysis::kError;
1398
1399 // Treat the scale factor variation cases separately since the contents of the CalibResult are different
1400 // (e.g. 'val' above contains the upward variation)
1401 if (unc == SFEigen || unc == SFNamed) {
1402 double valDown = std::max(0., 1. - effResult.first*sfResult.second);
1403
1404 result.first = val;
1405 result.second = valDown;
1406 } else {
1407 // safer than pow(x, 2):
1408 err = effResult.second/effResult.first*effResult.second/effResult.first
1409 + sfResult.second/sfResult.first*sfResult.second/sfResult.first;
1410 err = val*TMath::Sqrt(err);
1411
1412 result.first = std::max(0., std::min(1., val));
1413 result.second = err;
1414 }
1415
1416 // "Select" the status code for the actual calibration (it is subject to more constraints)
1417 return sfStatus;
1418}
1419
1420//________________________________________________________________________________
1423 const string& label, const string& OP,
1424 Uncertainty unc, unsigned int mapIndex)
1425{
1426 // Data inefficiency retrieval identifying the requested calibration objects by name.
1427 // The inefficiency is computed as the 1 minus the efficiency.
1428 // The return value is a (value, uncertainty), as documented above, and will be a dummy value
1429 // in case an error occurs.
1430 //
1431 // variables: object holding kinematic (and other) information needed to compute the result
1432 // label: jet flavour label
1433 // OP: tagger operating point
1434 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1435 // named variations are to be computed)
1436 // numVariation: variation index (in case of eigenvector or named variations)
1437 // mapIndex: index to the efficiency map to be used
1438
1439 Analysis::CalibResult effResult = getMCEfficiency(variables, label, OP, unc, mapIndex);
1440 return std::make_pair(std::max(0., 1. - effResult.first), effResult.second);
1441}
1442
1443//________________________________________________________________________________
1446 unsigned int index, Uncertainty unc)
1447{
1448 // MC inefficiency retrieval identifying the requested calibration object by index.
1449 // The inefficiency is computed as the 1 minus the efficiency.
1450 // The return value is a (value, uncertainty), as documented above, and will be a dummy value
1451 // in case an error occurs.
1452 //
1453 // variables: object holding kinematic (and other) information needed to compute the result
1454 // index: index to MC efficiency object
1455 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1456 // named variations are to be computed)
1457 // numVariation: variation index (in case of eigenvector or named variations)
1458
1459 Analysis::CalibResult effResult = getMCEfficiency(variables, index, unc);
1460 return std::make_pair(std::max(0., 1. - effResult.first), effResult.second);
1461}
1462
1463//________________________________________________________________________________
1464double
1466 unsigned indexSF, unsigned int indexEff) const
1467{
1468 // Retrieve the MC/MC scale factor given the set of scale factor and efficiency indices.
1469 // variables: object holding kinematic (and other) information needed to compute the result
1470 // indexSF: index to scale factor calibration object
1471 // indexEff: index to MC efficiency object
1472
1473 // If either reference doesn't exist, or if they are the same, nothing can / needs to be done.
1474 int indexSFRef = m_hadronisationReference[indexSF], indexEffRef = m_hadronisationReference[indexEff];
1475 if (indexSFRef < 0 || indexEffRef < 0 || indexSFRef == indexEffRef) return 1;
1476
1477 // Verify also that the individual efficiencies are physically meaningful.
1478 double effSFRef; m_objects[indexSFRef]->getResult(variables, effSFRef);
1479 double effEffRef; m_objects[indexEffRef]->getResult(variables, effEffRef);
1480 return (effSFRef > 0 && effEffRef > 0) ? effSFRef/effEffRef : 1;
1481}
1482
1483//________________________________________________________________________________
1486 const string& label, Uncertainty unc,
1487 unsigned int numVariation, unsigned int mapIndex)
1488{
1489 // #1
1490 // Tag weight fraction scale factor retrieval identifying the requested calibration object by name.
1491 // The return value is either a (value, uncertainty) or (if eigenvector or named variations are specified)
1492 // an (up, down) variation pair, and will be a dummy value in case an error occurs.
1493 // Note that in contrast to the "regular" (non-continuous) case, the computation of the scale factor in
1494 // general needs the (selection- or even process-specific) MC tag weight fractions, in order to rescale
1495 // scale factors. This is used to ensure that the tag weight fractions (both in data and in MC) sum up to
1496 // unity for each given kinematic bin.
1497 //
1498 // variables: object holding kinematic (and other) information needed to compute the result
1499 // label: jet flavour label
1500 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1501 // named variations are to be computed)
1502 // numVariation: variation index (in case of eigenvector or named variations)
1503 // mapIndex: index to the MC efficiency map to be used for scale factor rescaling
1504
1505 static const string cont("Continuous");
1506
1507 unsigned int indexSF, indexEff;
1508 if (! (retrieveCalibrationIndex (label, cont, variables.jetAuthor, false, indexEff, mapIndex) &&
1509 retrieveCalibrationIndex (label, cont, variables.jetAuthor, true, indexSF))) {
1510 cerr << "getWeightScaleFactor: unable to find Eff calibration for object "
1511 << fullName(variables.jetAuthor, cont, label, false, mapIndex)
1512 << " or SF calibration for object "
1513 << fullName(variables.jetAuthor, cont, label, true) << endl;
1514 return Analysis::dummyResult;
1515 }
1516
1518 return (getWeightScaleFactor(variables, indexSF, indexEff, unc, numVariation, result) == Analysis::kError) ? Analysis::dummyResult : result;
1519}
1520
1521//________________________________________________________________________________
1524 unsigned int indexSF, unsigned int indexEff,
1525 Uncertainty unc, unsigned int numVariation)
1526{
1527 // #2
1528 // Tag weight fraction scale factor retrieval identifying the requested calibration object by index.
1529 // The return value is either a (value, uncertainty) or (if eigenvector or named variations are specified)
1530 // an (up, down) variation pair, and will be a dummy value in case an error occurs.
1531 // Note that in contrast to the "regular" (non-continuous) case, the computation of the scale factor in
1532 // general needs the (selection- or even process-specific) MC tag weight fractions, in order to rescale
1533 // scale factors. This is used to ensure that the tag weight fractions (both in data and in MC) sum up to
1534 // unity for each given kinematic bin.
1535 //
1536 // variables: object holding kinematic (and other) information needed to compute the result
1537 // indexSF: index to calibration object
1538 // indexEff: index to MC tag weight
1539 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1540 // named variations are to be computed)
1541 // numVariation: variation index (in case of eigenvector or named variations)
1542
1544 return (getWeightScaleFactor(variables, indexSF, indexEff, unc, numVariation, result) == Analysis::kError) ?
1546}
1547
1548//________________________________________________________________________________
1551 unsigned int indexSF, unsigned int indexEff,
1552 Uncertainty unc, unsigned int numVariation,
1554{
1555 // #3
1556 // Tag weight fraction scale factor retrieval identifying the requested calibration object by index.
1557 // Note that in contrast to the "regular" (non-continuous) case, the computation of the scale factor in
1558 // general needs the (selection- or even process-specific) MC tag weight fractions, in order to rescale
1559 // scale factors. This is used to ensure that the tag weight fractions (both in data and in MC) sum up to
1560 // unity for each given kinematic bin.
1561 //
1562 // variables: object holding kinematic (and other) information needed to compute the result
1563 // indexSF: index to calibration object
1564 // indexEff: index to MC tag weight
1565 // unc: keyword indicating what uncertainties to evaluate (or whether eigenvector or
1566 // named variations are to be computed)
1567 // numVariation: variation index (in case of eigenvector or named variations)
1568 // result: (value, uncertainty) or (up, down) variation pair, depending on the unc value.
1569 // A dummy value will be returned in case of an error.
1571 if (! container) return Analysis::kError;
1572 CalibrationDataContainer* effContainer = m_objects[indexEff];
1573 if (! effContainer) return Analysis::kError;
1574
1575 // the first time this combination of scale factor and "efficiency" objects is given, check on the
1576 // scale factors that will result from their combination (where the computations reproduce those
1577 // shown below)
1578 checkWeightScaleFactors(indexSF, indexEff);
1579
1580 // perform out-of-bound check of jet eta
1581 if (!checkAbsEta(variables, indexSF)) {
1582 if (m_verbose)
1583 cerr << "Jet |eta| is outside of the boundary!" << endl;
1584 return Analysis::kRange;
1585 }
1586
1587 // Always retrieve the result itself
1588 double value;
1589 Analysis::CalibrationStatus status = container->getResult(variables, value);
1590 if (status == Analysis::kError) return status;
1591 if (m_otherStrategy == GiveUp) assert (status != Analysis::kRange);
1592 else if (m_otherStrategy == GiveUpExtrapolated) assert (status != Analysis::kExtrapolatedRange);
1593 else if (m_otherStrategy == Flag) {
1594 if (status == Analysis::kRange)
1595 increaseCounter(indexSF);
1596 else if (status == Analysis::kExtrapolatedRange)
1597 increaseCounter(indexSF, Extrapolated);
1598 }
1599
1600 // Retrieve the reference MC tag weight fraction (corresponding to the calibration scale factors)
1601 Analysis::UncertaintyResult refMCResult(0,0);
1602 if (container->getUncertainty("MCreference", variables, refMCResult) == Analysis::kError)
1603 return Analysis::kError;
1604 double fracMCref = refMCResult.first;
1605 // Retrieve the MC reference information, if requested (the initialisation below is to make sure
1606 // that no exceptions in the code will be needed)
1607 double fracSFref = fracMCref, fracEffref = fracMCref;
1608 if (m_useMCMCSF) {
1609 int indexSFref = m_hadronisationReference[indexSF], indexEffref = m_hadronisationReference[indexEff];
1610 if (indexSFref < 0 || indexEffref < 0) {
1611 cerr << "getWeightScaleFactor: error: generator-specific corrections requested but necessary reference containers lacking " << endl;
1612 return Analysis::kError;
1613 } else {
1614 m_objects[indexSFref]->getResult(variables, fracSFref);
1615 m_objects[indexEffref]->getResult(variables, fracEffref);
1616 if (! (fracSFref > 0. && fracEffref > 0.)) {
1617 cerr << "getWeightScaleFactor: error: invalid reference tag weight fraction " <<fracSFref <<" " <<fracEffref << std::endl;
1618 return Analysis::kError;
1619 }
1620 }
1621 }
1622
1623 // Retrieve the MC tag weight fraction for the sample we need to reweight to
1624 double fracMCnew;
1625 Analysis::CalibrationStatus effStatus = effContainer->getResult(variables, fracMCnew);
1626 if (effStatus == Analysis::kError) return effStatus;
1627 if (m_otherStrategy == GiveUp) assert (effStatus != Analysis::kRange);
1628 else if (m_otherStrategy == Flag)
1629 if (effStatus == Analysis::kRange) increaseCounter(indexEff);
1630 // since we need to divide by this quantity, check that it is well-defined
1631 if (!(fracMCnew > 0.) and m_useTopologyRescaling) {// but we only care if using topology rescaling
1632 cerr << "getWeightScaleFactor: error: null fracMCnew would lead to invalid operation" << endl;
1633 return Analysis::kError;
1634 }
1635
1636 if (!m_runEigenVectorMethod && (unc == SFEigen || unc == SFNamed)) {
1637 cerr << "getWeightScaleFactor: ERROR. Trying to call eigenvector method but initialization not switched on in b-tagging .env config file." << endl;
1638 cerr << " Please correct your .env config file first. Nominal uncertainties used. " << endl;
1639 }
1640
1641 if (unc == SFEigen || unc == SFNamed) {
1642 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation;
1643 try {
1644 eigenVariation = m_eigenVariationsMap.at(container);
1645 } catch (const std::out_of_range&) {
1646 cerr << "getWeightScaleFactor: could not retrieve eigenvector variation, while it should have been there." << endl;
1647 return Analysis::kError;
1648 }
1649 unsigned int maxVariations = (unc == SFEigen) ? eigenVariation->getNumberOfEigenVariations() : eigenVariation->getNumberOfNamedVariations();
1650 if (numVariation > maxVariations-1) {
1651 cerr << "getWeightScaleFactor: asked for " << ((unc == SFEigen) ? "eigenvariation" : "named variation") << " number: " << numVariation << " but overall number of available variations is: " << maxVariations << endl;
1652 return Analysis::kError;
1653 }
1654 TH1* up=0;
1655 TH1* down=0;
1656 bool isOK = (unc == SFEigen) ? eigenVariation->getEigenvectorVariation(numVariation,up,down) : eigenVariation->getNamedVariation(numVariation,up,down);
1657 if (!isOK) {
1658 cerr << "getWeightScaleFactor: Eigenvector object is there but cannot retrieve up and down uncertainty histograms." << endl;
1659 return Analysis::kError;
1660 }
1661 // the 'extrapolation' uncertainty (always a named one) needs a somewhat special treatment
1662 bool extrapolate = ( unc == SFNamed ) ? eigenVariation->isExtrapolationVariation(numVariation) : false;
1663
1664 double valueUp;
1665 double valueDown;
1666 Analysis::CalibrationStatus statusUp = container->getResult(variables, valueUp, up, extrapolate);
1667 Analysis::CalibrationStatus statusDown = container->getResult(variables, valueDown,down, extrapolate);
1668 if (statusUp == Analysis::kError || statusDown == Analysis::kError) return Analysis::kError;
1669
1670 // now carry out the rescaling. Protect against unphysical or suspiciously large scale factors
1671 double variationUp = valueUp - value;
1672 double variationDown = valueDown - value;
1673 // First step: from the calibration sample to its reference sample
1674 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracMCref / fracSFref);
1675 // Second step: from the calibration reference sample to the MC object's reference sample
1676 if (m_useMCMCSF) value *= (fracSFref / fracEffref);
1677 // Third step: from the MC object's reference sample to the MC sample itself
1678 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracEffref / fracMCnew);
1679 // Since all transformations of the scale factor itself are linear, the transformation of the variations is simpler.
1681 double f = (fracMCref / fracMCnew);
1682 variationUp *= f;
1683 variationDown *= f;
1684 } else if (m_useMCMCSF) {
1685 double f = (fracSFref / fracEffref);
1686 variationUp *= f;
1687 variationDown *= f;
1688 }
1689 valueUp = value + variationUp;
1690 valueDown = value + variationDown;
1691 if (valueUp < 0) {
1692 valueUp = 0; increaseCounter(indexSF, TagWeight);
1693 } else if (valueUp > m_maxTagWeight) {
1694 valueUp = m_maxTagWeight; increaseCounter(indexSF, TagWeight);
1695 }
1696 if (valueDown < 0) {
1697 valueDown = 0; increaseCounter(indexSF, TagWeight);
1698 } else if (valueDown > m_maxTagWeight) {
1699 valueDown = m_maxTagWeight; increaseCounter(indexSF, TagWeight);
1700 }
1701
1702 result.first = valueUp;
1703 result.second = valueDown;
1704 return statusUp;
1705 } //end eigenvector method
1706
1707 //Proceed with no-eigenvector result
1708
1709 // retrieve the statistical uncertainty if desired
1710 double stat(0);
1711 if (unc == Total || unc == Statistical) {
1712 if (container->getStatUncertainty(variables, stat) == Analysis::kError) {
1713 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter covariance matrix!" << endl;
1714 return Analysis::kError;
1715 }
1716 }
1717 Analysis::UncertaintyResult uncertaintyResult(0,0);
1718 if (unc == Total || unc == Systematic) {
1719 if (container->getSystUncertainty(variables, uncertaintyResult) == Analysis::kError) {
1720 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter systematic uncertainty!" << endl;
1721 return Analysis::kError;
1722 }
1723 } else if (unc == Extrapolation) {
1724 // this uncertainty is special, since it is not normally to be combined into the overall systematic uncertainty
1725 if (container->getUncertainty("extrapolation", variables, uncertaintyResult) == Analysis::kError)
1726 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
1727 } else if (unc == TauExtrapolation) {
1728 // also this uncertainty is special, since it it singles out an uncertainty relevant only for tau "jets",
1729 // and some care has to be taken not to duplicate or omit uncertainties
1730 if (container->getUncertainty("extrapolation from charm", variables, uncertaintyResult) == Analysis::kError)
1731 cerr << "getWeightScaleFactor: error retrieving Scale factor parameter extrapolation uncertainty!" << endl;
1732 }
1733
1734 double uncertainty = combinedUncertainty(stat, uncertaintyResult);
1735
1736 // Now carry out the rescaling. Again protect against unphysical or suspiciously large scale factors
1737 // First step: from the calibration sample to its reference sample
1738 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracMCref / fracSFref);
1739 // Second step: from the calibration reference sample to the MC object's reference sample
1740 if (m_useMCMCSF) value *= (fracSFref / fracEffref);
1741 // Third step: from the MC object's reference sample to the MC sample itself
1742 if (m_useTopologyRescaling) value = 1.0 + (value - 1.0) * (fracEffref / fracMCnew);
1743 if (value < 0) {
1744 value = 0; increaseCounter(indexSF, TagWeight);
1745 } else if (value > m_maxTagWeight) {
1746 value = m_maxTagWeight; increaseCounter(indexSF, TagWeight);
1747 }
1748 // Since all transformations of the scale factor itself are linear, the transformation of the uncertainty is simpler.
1750 uncertainty *= (fracMCref / fracMCnew);
1751 } else if (m_useMCMCSF) {
1752 uncertainty *= (fracSFref / fracEffref);
1753 }
1754
1755 result.first = std::max(0., value);
1756 result.second = uncertainty;
1757 // "Select" the status code for the actual calibration object (it is subject to more constraints)
1758 return status;
1759}
1760
1761//________________________________________________________________________________
1762void
1764 unsigned int indexEff)
1765{
1766 // Check the tag weight scale factors that would result from the combination of
1767 // the provided scale factor and MC tag weight objects.
1768 // The way this is done is by determining the binning that would apply to the
1769 // combination of the two individual inputs, and then by explicitly computing
1770 // the scale factors in each of these resulting bins.
1771
1772 std::vector<std::pair<unsigned int, unsigned int> >::const_iterator it = std::find(m_checkedWeightScaleFactors.begin(), m_checkedWeightScaleFactors.end(), std::make_pair(indexSF, indexEff));
1773 if (it != m_checkedWeightScaleFactors.end()) return;
1774
1775
1776 // Assume that only histogram containers are involved here (this should be the case
1777 // as at least a strict tag weight binning should be applied).
1779 if (! container) {
1780 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: error: container for object " << nameFromIndex(indexSF) << " not found!" << endl;
1781 return;
1782 } else if (! container->GetValue("MCreference")) {
1783 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: error: no MCreference histogram for object " << nameFromIndex(indexSF) << "!" << endl;
1784 return;
1785 }
1786 CalibrationDataHistogramContainer* effContainer = dynamic_cast<CalibrationDataHistogramContainer*>(m_objects[indexEff]);
1787 if (! effContainer) {
1788 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: error: container for object " << nameFromIndex(indexEff) << " not found!" << endl;
1789 return;
1790 }
1791
1792 // Retrieve the variable types and corresponding bin boundaries
1793 std::vector<unsigned int> vars = container->getVariableTypes();
1794 std::vector<unsigned int> effVars = effContainer->getVariableTypes();
1795 // Retrieve the corresponding bin boundaries
1796 std::map<unsigned int, std::vector<double> > boundaries, effBoundaries, mergedBoundaries;
1797 for (unsigned int t = 0; t < vars.size(); ++t)
1798 boundaries[vars[t]] = container->getBinBoundaries(vars[t]);
1799 for (unsigned int t = 0; t < effVars.size(); ++t)
1800 effBoundaries[effVars[t]] = effContainer->getBinBoundaries(effVars[t]);
1801
1802 // Special case: handle |eta| versus eta differences, by transforming to the latter
1803 if (boundaries.find(CalibrationDataContainer::kEta) == boundaries.end() && boundaries.find(CalibrationDataContainer::kAbsEta) != boundaries.end()) {
1805 boundaries.erase(CalibrationDataContainer::kAbsEta);
1806 }
1807 if (effBoundaries.find(CalibrationDataContainer::kEta) == effBoundaries.end() && effBoundaries.find(CalibrationDataContainer::kAbsEta) != effBoundaries.end()) {
1809 effBoundaries.erase(CalibrationDataContainer::kAbsEta);
1810 }
1811 if (boundaries.find(CalibrationDataContainer::kEta) != boundaries.end() && effBoundaries.find(CalibrationDataContainer::kEta) != effBoundaries.end()) {
1812 std::vector<double>& v = boundaries[CalibrationDataContainer::kEta];
1813 std::vector<double>& vEff = effBoundaries[CalibrationDataContainer::kEta];
1814 if (v[0] < 0 && vEff[0] >= 0) {
1815 // in this case, supplement the positive entries in vEff with their negative analogues
1816 std::vector<double> vtmp(vEff);
1817 for (std::vector<double>::iterator it = vtmp.begin(); it != vtmp.end(); ++it)
1818 if (*it > 0) vEff.insert(vEff.begin(), -(*it));
1819 } else if (v[0] >= 0 && vEff[0] < 0) {
1820 // in this case, supplement the positive entries in v with their negative analogues
1821 std::vector<double> vtmp(v);
1822 for (std::vector<double>::iterator it = vtmp.begin(); it != vtmp.end(); ++it)
1823 if (*it > 0) v.insert(v.begin(), -(*it));
1824 }
1825 }
1826
1827 // Now that the individual sets of boundaries have been determined, merge these
1828 for (unsigned int t = 0; t < vars.size(); ++t) {
1829 if (effBoundaries.find(vars[t]) == effBoundaries.end())
1830 // Variables not present in the efficiency object can go in unmodified
1831 mergedBoundaries[vars[t]] = boundaries[vars[t]];
1832 else {
1833 // Merge the boundaries for variables existing in both objects.
1834 // Take the MC array as a starting point, as it's likely to be the longest.
1835 mergedBoundaries[vars[t]] = effBoundaries[vars[t]];
1836
1837 for (std::vector<double>::iterator it = boundaries[vars[t]].begin(); it != boundaries[vars[t]].end(); ++it) {
1838 std::vector<double>::iterator itcmp = mergedBoundaries[vars[t]].begin();
1839 // Iterate until we've found a value in the target array equal to
1840 // or larger than the given element
1841 while (itcmp != mergedBoundaries[vars[t]].end() &&
1842 (! CalibrationDataContainer::isNearlyEqual(*itcmp, *it)) &&
1843 *itcmp < *it) ++itcmp;
1844 // Nothing needs to be done if the values are "nearly identical"
1845 // (or if we don't find such an element).
1846 if (itcmp == mergedBoundaries[vars[t]].end() || CalibrationDataContainer::isNearlyEqual(*itcmp, *it)) continue;
1847 // Otherwise insert the given element (this can mean adding to the end)
1848 mergedBoundaries[vars[t]].insert(itcmp, *it);
1849 }
1850 }
1851 }
1852 // Variables not present in the scale factor object still need to go in
1853 for (unsigned int t = 0; t < effVars.size(); ++t)
1854 if (boundaries.find(effVars[t]) == boundaries.end())
1855 mergedBoundaries[effVars[t]] = effBoundaries[effVars[t]];
1856
1857 // Carry out a rudimentary cross-check of the tag weight bin
1858 // (the binning used for the scale factor and MC objects should be identical).
1859 if (boundaries.find(CalibrationDataContainer::kTagWeight) == boundaries.end()) {
1860 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: " << "no tag weight axis found for object " << nameFromIndex(indexSF) << endl;
1861 } else if (effBoundaries.find(CalibrationDataContainer::kTagWeight) == effBoundaries.end()) {
1862 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: " << "no tag weight axis found for object " << nameFromIndex(indexEff) << endl;
1863 } else if (boundaries[CalibrationDataContainer::kTagWeight].size() != effBoundaries[CalibrationDataContainer::kTagWeight].size()) {
1864 cerr << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: " << "different tag weight binning for objects " << nameFromIndex(indexSF) << " (";
1865 std::vector<double>& v = boundaries[CalibrationDataContainer::kTagWeight];
1866 for (unsigned int ib = 0; ib < v.size()-1; ++ib) cerr << v[ib] << ",";
1867 cerr << v[v.size()-1] << ") and " << nameFromIndex(indexEff) << " (";
1868 v = effBoundaries[CalibrationDataContainer::kTagWeight];
1869 for (unsigned int ib = 0; ib < v.size()-1; ++ib) cerr << v[ib] << ",";
1870 cerr << v[v.size()-1] << ") do not match!" << endl;
1871 } else {
1872 // Make sure that (possibly) dummy vectors exist for _all_ known variables
1873 // (this is a mere technicality allowing to loop over all variables explicitly).
1874 mergedBoundaries.try_emplace(CalibrationDataContainer::kPt, std::vector<double>{20.,300.});
1875 mergedBoundaries.try_emplace(CalibrationDataContainer::kEta, std::vector<double>{-2.5, 2.5});
1876
1877 // Finally, carry out the cross-check that all this is about: recompute the scale factor
1878 // in each pseudo-bin
1879 if (m_verbose){
1880 cout << "CalibrationDataInterfaceROOT::checkWeightScaleFactors: cross-checking scale factors for objects " << nameFromIndex(indexSF) << " and " << nameFromIndex(indexEff) << "\n" << std::setfill('-') << std::setw(100) << "-" << endl;
1881 cout << std::setfill(' ');
1882 }
1884 std::vector<double>& vPt = mergedBoundaries[CalibrationDataContainer::kPt], vEta = mergedBoundaries[CalibrationDataContainer::kEta], vTagWeight = mergedBoundaries[CalibrationDataContainer::kTagWeight];
1885 for (unsigned int ipt = 0; ipt < vPt.size()-1; ++ipt) {
1886 x.jetPt = (vPt[ipt] + vPt[ipt+1]) * 500.; // account for MeV -> GeV conversion
1887 for (unsigned int ieta = 0; ieta < vEta.size()-1; ++ieta) {
1888 x.jetEta = (vEta[ieta] + vEta[ieta+1]) / 2.;
1889 for (unsigned int iwt = 0; iwt < vTagWeight.size()-1; ++iwt) {
1890 x.jetTagWeight = (vTagWeight[iwt] + vTagWeight[iwt+1]) / 2.;
1891 // Retrieve the central scale factor value and the old and new MC tag weight fractions
1892 double value;
1893 container->getResult(x, value);
1894 Analysis::UncertaintyResult uncertaintyResult(0,0);
1895 container->getUncertainty("MCreference", x, uncertaintyResult);
1896 double fracMCref = uncertaintyResult.first;
1897 double fracMCnew;
1898 effContainer->getResult(x, fracMCnew);
1899 // Compute the new scale factor value
1900 if (!(fracMCnew > 0.)) {
1901 cout << "\tfor (pt=" << x.jetPt << ",eta=" << x.jetEta << ",tagweight=" << x.jetTagWeight << "): invalid new MC fraction: " << fracMCnew << endl;
1902 } else {
1903 double newvalue = 1.0 + (value - 1.0) * fracMCref/fracMCnew;
1904 if (newvalue <= 0 || newvalue > m_maxTagWeight) cout << "\tfor (pt=" << x.jetPt << ",eta=" << x.jetEta << ",tagweight=" << x.jetTagWeight << "): old (value=" << value << ",MC=" << fracMCref << "), new (value=" << newvalue << ",MC=" << fracMCnew << ")" << endl;
1905 }
1906 }
1907 }
1908 }
1909 }
1910
1911 m_checkedWeightScaleFactors.push_back(std::make_pair(indexSF, indexEff));
1912}
1913
1914//________________________________________________________________________________
1915bool
1917 unsigned int index)
1918{
1919 // Check whether the jet eta value is outside the range of validity, subject to the strategy
1920 // specified in the configuration file.
1921 bool pass = true;
1922 if (m_absEtaStrategy == Ignore) return pass;
1923
1924 switch (m_absEtaStrategy) {
1925 case GiveUp:
1926 if (std::fabs(variables.jetEta) > m_maxAbsEta) {
1927 pass = false;
1928 }
1929 break;
1930 case Flag:
1931 default:
1932 if (std::fabs(variables.jetEta) > m_maxAbsEta) {
1934 }
1935
1936 }
1937 return pass;
1938}
1939
1940//________________________________________________________________________________
1941std::string
1943{
1944 // Return the object name corresponding to the given index.
1945
1946 for (std::map<std::string, unsigned int>::const_iterator it = m_objectIndices.begin();
1947 it != m_objectIndices.end(); ++it)
1948 if (it->second == index) return it->first;
1949
1950 // This should never happen..
1951 return string("");
1952}
1953
1954//________________________________________________________________________________
1955void
1957 OutOfBoundsType oob)
1958{
1959 // Internal method bumping the relevant counter out-of-bounds counter for the specified object.
1960 //
1961 // oob: further classification of out-of-bounds case
1962 // index: object index
1963
1964 // make sure the vectors are appropriately dimensioned
1965 if (index >= m_mainCounters.size()) {
1966 unsigned int minsize = (index == 0) ? 2 : 2*index;
1967 m_mainCounters.resize(minsize, 0);
1968 m_etaCounters.resize(minsize, 0);
1969 m_extrapolatedCounters.resize(minsize, 0);
1970 }
1971 switch (oob) {
1972 case Main:
1973 m_mainCounters[index]++; break;
1974 case Eta:
1975 m_etaCounters[index]++; break;
1976 case Extrapolated:
1977 default:
1979 }
1980}
1981
1982//________________________________________________________________________________
1983std::vector<string>
1985 const string& label,
1986 const string& OP,
1987 bool named)
1988{
1989 // Retrieve the sources of uncertainty relevant for the given scale factor calibration object,
1990 // identifying the object by name.
1991 //
1992 // author: jet collection name
1993 // label: jet flavour label
1994 // OP: tagger working point
1995 // named: if false, an unsorted list of sources of uncertainties will be returned.
1996 // if true, only 'named' uncertainties will be returned, and the position in
1997 // the vector that is the return value determines the 'numVariation' index
1998 // that is to be used if named variations are to be retrieved.
1999
2000 unsigned int index;
2001 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2002 // Return a dummy result if the object is not found
2003 cerr << "listScaleFactorUncertainties: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2004 std::vector<string> dummy;
2005 return dummy;
2006 }
2008}
2009
2010//________________________________________________________________________________
2011std::vector<string>
2013 const std::string& flavour, bool named)
2014{
2015 // Note: this method already works on a per-flavour basis, so passing flavour in is a simple addition
2016 // Method is called primarily from within the BTaggingEfficiencyTool - W.L.
2017 // Retrieve the sources of uncertainty relevant for the given scale factor calibration object,
2018 // identifying the object by index.
2019 //
2020 // index: index to scale factor calibration object
2021 // named: if false, an unsorted list of sources of uncertainties will be returned.
2022 // if true, only 'named' uncertainties will be returned, and the position in
2023 // the vector that is the return value determines the 'numVariation' index
2024 // that is to be used if named variations are to be retrieved.
2025
2026 std::vector<string> dummy;
2028
2029 if (container) {
2030 if (named) {
2031 // Find out which uncertainties are excluded from eigenvector construction
2032 if (! m_runEigenVectorMethod) return dummy;
2033 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation=m_eigenVariationsMap.at(container);
2034 if (m_EVStrategy == Analysis::Uncertainty::SFEigen){
2035 std::vector<string> unordered = eigenVariation->listNamedVariations(); // this is for the regular EV
2036 std::vector<string> ordered(unordered.size());
2037 for (unsigned int i = 0; i < unordered.size(); ++i) {
2038 ordered[eigenVariation->getNamedVariationIndex(unordered[i])] = unordered[i];
2039 }
2040 return ordered;
2041 } else if (m_EVStrategy == Analysis::Uncertainty::SFGlobalEigen){
2042 // here we want to get the named uncertainties from the global eigenvariations flavour container specifically...
2043 std::shared_ptr<CalibrationDataGlobalEigenVariations> GEV = std::dynamic_pointer_cast<CalibrationDataGlobalEigenVariations>(eigenVariation);
2044 std::vector<std::string> unordered = GEV->listNamedVariations(flavour);
2045 std::vector<std::string> ordered(unordered.size()); // ordered by the NAMED VARIATION (internal) ORDERING
2046 for (unsigned int i = 0; i < unordered.size(); ++i) {
2047 ordered[GEV->getNamedVariationIndex(unordered[i], flavour)] = unordered[i];
2048 }
2049 return ordered;
2050 }
2051 }
2052 return container->listUncertainties(); // return this if not named
2053 }
2054
2055 return dummy;
2056}
2057
2058//________________________________________________________________________________
2059unsigned int
2061 const std::string& label,
2062 const std::string& OP,
2063 Uncertainty unc)
2064{
2065 // Retrieve the number of eigenvector variations or named variations relevant for
2066 // the given scale factor calibration object, identifying the object by name.
2067 //
2068 // author: jet collection name
2069 // label: jet flavour label
2070 // OP: tagger working point
2071 // unc: should be set to SFEigen or SFNamed for the cases of
2072 // eigenvector variations or named variations, respectively
2073
2074 unsigned int index;
2075
2076 if (! retrieveCalibrationIndex (label, OP, author, true, index)) return 0;
2077 return getNumVariations(index, unc, label);
2078}
2079
2080//________________________________________________________________________________
2081unsigned int
2083 Uncertainty unc, const std::string& flavour)
2084{
2085 // Retrieve the number of eigenvector variations or named variations relevant for
2086 // the given scale factor calibration object, identifying the object by index.
2087 //
2088 // index: index to calibration scale factor object
2089 // unc: should be set to SFEigen or SFNamed for the cases of
2090 // eigenvector variations or named variations, respectively
2091
2092 if (! (unc == SFEigen || unc == SFNamed || unc == SFGlobalEigen)) return 0;
2094 if (! container) return 0;
2095 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation=m_eigenVariationsMap.at(container);
2096 if (unc == SFGlobalEigen){
2097 std::shared_ptr<CalibrationDataGlobalEigenVariations> GEV = std::dynamic_pointer_cast<CalibrationDataGlobalEigenVariations>(eigenVariation);
2098 return GEV->getNumberOfEigenVariations(flavour);
2099 }
2100 return (unc == SFEigen) ? eigenVariation->getNumberOfEigenVariations() : eigenVariation->getNumberOfNamedVariations();
2101}
2102
2103//________________________________________________________________________________
2104const TH1*
2106 const std::string& label,
2107 const std::string& OP)
2108{
2109 // Retrieve the actual histogrammed calibration scale factors, identifying the object by name.
2110 //
2111 // author: jet collection name
2112 // label: jet flavour label
2113 // OP: tagger working point
2114
2115 unsigned int index;
2116 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2117 // Return a dummy result if the object is not found
2118 cerr << "getBinnedScaleFactors: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2119 return 0;
2120 }
2122 return (container) ? dynamic_cast<TH1*>(container->GetValue("result")) : 0;
2123}
2124
2125//________________________________________________________________________________
2126const TObject*
2128 const std::string& label,
2129 const std::string& OP,
2130 unsigned int mapIndex)
2131{
2132 // Retrieve the actual central values object for the MC efficiences, identifying the object by name.
2133 // The object returned can be either a TH1 or a TF1; it is up to the user to determine which.
2134 //
2135 // author: jet collection name
2136 // label: jet flavour label
2137 // OP: tagger working point
2138 // mapIndex: index to the efficiency map to be used
2139
2140 unsigned int index;
2141 if (! retrieveCalibrationIndex (label, OP, author, false, index, mapIndex)) {
2142 // Return a dummy result if the object is not found
2143 cerr << "getMCEfficiencyObject: unable to find efficiency calibration for object "
2144 << fullName(author, OP, label, false, mapIndex) << endl;
2145 return 0;
2146 }
2148 return (container) ? container->GetValue("result") : 0;
2149}
2150
2151//====================== retrieval of shifted calibration object ===========================
2152
2153//________________________________________________________________________________
2154const TH1*
2156 const std::string& label,
2157 const std::string& OP,
2158 const std::string& unc,
2159 double sigmas)
2160{
2161 // Retrieve the actual histogrammed calibration scale factors, identifying the object by name
2162 // and with the scale factors shifted by the uncertainties due to the given source of uncertainty
2163 // (where bin-to-bin correlations are accounted for, i.e., shifts may be either positive or negative).
2164 //
2165 // author: jet collection name
2166 // label: jet flavour label
2167 // OP: tagger working point
2168 // unc: source of uncertainty to consider
2169 // sigmas: number of standard deviations by which to shift the scale factor central values
2170
2171 // quick sanity check
2172 if (unc == "comment" || unc == "result" || unc == "combined" || unc == "statistics") return 0;
2173
2174 unsigned int index;
2175 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2176 // Return a null result if the object is not found
2177 cerr << "getShiftedScaleFactors: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2178 return nullptr;
2179 }
2181 if (! container) return nullptr;
2182
2183 TH1* result = dynamic_cast<TH1*>(container->GetValue("result"));
2184 TH1* hunc = dynamic_cast<TH1*>(container->GetValue(unc.c_str()));
2185 // another sanity check...
2186 if ((! hunc) || (! result)) return nullptr;
2187 if (hunc->GetDimension() != result->GetDimension() || hunc->GetNbinsX() != result->GetNbinsX() ||
2188 hunc->GetNbinsX() != result->GetNbinsX() || hunc->GetNbinsX() != result->GetNbinsX())
2189 return nullptr;
2190 // also check that the uncertainty is to be treated as correlated from bin to bin
2191 // (for the variation is applied coherently, which isn't appropriate for uncertainties
2192 // that aren't correlated from bin to bin)
2193 if (! container->isBinCorrelated(unc)) return 0;
2194
2195 // if everything is consistent, the actual operation simply consists of adding histograms...
2196 std::string name(container->GetName()); name += "_"; name += unc; name += "_";
2197 TH1* shifted = dynamic_cast<TH1*>(result->Clone(name.c_str()));
2198 if (not shifted) return nullptr;
2199 shifted->Add(hunc, sigmas);
2200 return shifted;
2201}
2202//====================== run EigenVectorRecomposition method ===========================
2205 const std::string& label,
2206 const std::string& OP,
2207 unsigned int mapIndex){
2208 // run eigen vector recomposition method. If success, stored the retrieved coefficient map
2209 // in m_coefficientMap and return success. Otherwise return error and keep m_coefficientMap
2210 // untouched.
2211 // author: jet collection name
2212 // label: jet flavour label
2213 // OP: tagger working point
2214 // mapIndex: index to the MC efficiency map to be used. Should be 0?
2215 // Todo: What is mapindex?
2216 // Todo: Check the way xAODBTaggingTool initialize CDI. Check if that is the as how we are initialize CDI.
2218 cerr << "runEigenVectorRecomposition: Recomposition need to be ran with CalibrationDataInterfaceRoot initialized in eigenvector mode" << endl;
2219 return Analysis::kError;
2220 }
2221
2222 unsigned int indexSF;
2223 if (! retrieveCalibrationIndex (label, OP, author, true, indexSF, mapIndex)) {
2224 cerr << "runEigenVectorRecomposition: unable to find SF calibration for object "
2225 << fullName(author, OP, label, true) << endl;
2226 return Analysis::kError;
2227 }
2228
2229 return runEigenVectorRecomposition (label, indexSF);
2230}
2231
2234 unsigned int indexSF){
2235 // run eigen vector recomposition method. If success, stored the retrieved coefficient map
2236 // in m_coefficientMap and return success. Otherwise return error and keep m_coefficientMap
2237 // untouched.
2238 // label: jet flavour label
2239 // indexSF: index to scale factor calibration object
2241 if (! container) {
2242 cerr << "runEigenVectorRecomposition: error retrieving container!" << endl;
2243 return Analysis::kError;
2244 }
2245
2246 // Retrieve eigenvariation
2247 std::shared_ptr<CalibrationDataEigenVariations> eigenVariation;
2248 try {
2249 eigenVariation = m_eigenVariationsMap.at(container);
2250 } catch (const std::out_of_range&) {
2251 cerr << "runEigenVectorRecomposition: Could not retrieve eigenvector variation, while it should have been there." << endl;
2252 return Analysis::kError;
2253 }
2254 // Doing eigenvector recomposition
2255 std::map<std::string, std::map<std::string, float>> coefficientMap;
2256 if(!eigenVariation->EigenVectorRecomposition(label, coefficientMap))
2257 return Analysis::kError;
2258
2259 m_coefficientMap = std::move(coefficientMap);
2260 return Analysis::kSuccess;
2261}
2262
2263std::map<std::string, std::map<std::string, float>>
2265 if(m_coefficientMap.empty())
2266 cerr << "getCoefficientMap: Call runEigenVectorRecomposition() before retrieving coefficient map! " <<endl;
2267 return m_coefficientMap;
2268}
2269
2270
2271//====================== put some utility functions here ===================================
2272
2273namespace {
2274 // Construct the (diagonal) covariance matrix for the statistical uncertainties on the "ref" results
2275 TMatrixDSym getStatCovarianceMatrix(const TH1* hist) {
2276 Int_t nbinx = hist->GetNbinsX()+2, nbiny = hist->GetNbinsY()+2, nbinz = hist->GetNbinsZ()+2;
2277 Int_t rows = nbinx;
2278 if (hist->GetDimension() > 1) rows *= nbiny;
2279 if (hist->GetDimension() > 2) rows *= nbinz;
2280 TMatrixDSym stat(rows);
2281 for (Int_t binx = 1; binx < nbinx; ++binx){
2282 for (Int_t biny = 1; biny < nbiny; ++biny){
2283 for (Int_t binz = 1; binz < nbinz; ++binz) {
2284 Int_t bin = hist->GetBin(binx, biny, binz);
2285 double err = hist->GetBinError(bin);
2286 stat(bin, bin) = err*err;
2287 }
2288 }
2289 }
2290 return stat;
2291 }
2292
2293 // Construct the covariance matrix assuming that histogram "unc" contains systematic uncertainties
2294 // pertaining to the "ref" results, and that the uncertainties are fully correlated from bin to bin
2295 // (unless option "doCorrelated" is false, in which bins are assumed uncorrelated)
2296 TMatrixDSym getSystCovarianceMatrix(const TH1* ref, const TH1* unc, bool doCorrelated, const std::string& uncname, int tagWeightAxis) {
2297 Int_t nbinx = ref->GetNbinsX()+2, nbiny = ref->GetNbinsY()+2, nbinz = ref->GetNbinsZ()+2;
2298 Int_t rows = nbinx;
2299 if (ref->GetDimension() > 1) rows *= nbiny;
2300 if (ref->GetDimension() > 2) rows *= nbinz;
2301 TMatrixDSym cov(rows);
2302
2303 for(int i=0 ; i<10 ; i++){
2304 Int_t bin = unc->GetBin(1,i,1);
2305 double uncval = unc->GetBinContent(bin);
2306 cout << uncval << ", ";
2307 } cout << endl;
2308
2309 // Carry out a minimal consistency check
2310 if (unc->GetNbinsX()+2 != nbinx || unc->GetNbinsY()+2 != nbiny || unc->GetNbinsZ()+2 != nbinz || unc->GetDimension() != ref->GetDimension()) {
2311 std::cout << "getSystCovarianceMatrix: inconsistency found in histograms " << ref->GetName() << " and " << unc->GetName() << " : " << uncname << std::endl;
2312 return cov;
2313 }
2314
2315 // // the "2" below doesn't actually imply that two bins are used...
2316 // // this is just to make the loops below work
2317 // if (ref->GetDimension() <= 1) nbiny = 2;
2318 // if (ref->GetDimension() <= 2) nbinz = 2;
2319
2320 // Special case: uncertainties not correlated from bin to bin.
2321 // The exception here is for tag weight bins, which are always assumed to be fully correlated.
2322 if (! doCorrelated) {
2323 if (tagWeightAxis < 0) {
2324 // truly uncorrelated uncertainties
2325 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2326 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2327 for (Int_t binz = 1; binz < nbinz-1; ++binz) {
2328 Int_t bin = ref->GetBin(binx, biny, binz);
2329 double err = unc->GetBinContent(bin);
2330 cov(bin,bin) = err*err;
2331 }
2332 }
2333 }
2334 return cov;
2335 } else if (tagWeightAxis == 0) {
2336 // continuous histogram with tag weight X axis
2337 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2338 for (Int_t binz = 1; binz < nbinz-1; ++binz){
2339 for (Int_t binx = 1; binx < nbinx-1; ++binx) {
2340 Int_t bin = ref->GetBin(binx, biny, binz);
2341 double err = unc->GetBinContent(bin);
2342 for (Int_t binx2 = 1; binx2 < nbinx-1; ++binx2) {
2343 Int_t bin2 = ref->GetBin(binx2, biny, binz);
2344 double err2 = unc->GetBinContent(bin2);
2345 cov(bin,bin2) = err*err2;
2346 }
2347 }
2348 }
2349 }
2350 return cov;
2351 } else if (tagWeightAxis == 1) {
2352 // continuous histogram with tag weight Y axis
2353 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2354 for (Int_t binz = 1; binz < nbinz-1; ++binz){
2355 for (Int_t biny = 1; biny < nbiny-1; ++biny) {
2356 Int_t bin = ref->GetBin(binx, biny, binz);
2357 double err = unc->GetBinContent(bin);
2358 for (Int_t biny2 = 1; biny2 < nbiny-1; ++biny2) {
2359 Int_t bin2 = ref->GetBin(binx, biny2, binz);
2360 double err2 = unc->GetBinContent(bin2);
2361 cov(bin,bin2) = err*err2;
2362 }
2363 }
2364 }
2365 }
2366 return cov;
2367 } else if (tagWeightAxis == 2) {
2368 // continuous histogram with tag weight Z axis
2369 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2370 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2371 for (Int_t binz = 1; binz < nbinz-1; ++binz) {
2372 Int_t bin = ref->GetBin(binx, biny, binz);
2373 double err = unc->GetBinContent(bin);
2374 for (Int_t binz2 = 1; binz2 < nbinz-1; ++binz2) {
2375 Int_t bin2 = ref->GetBin(binx, biny, binz2);
2376 double err2 = unc->GetBinContent(bin2);
2377 cov(bin,bin2) = err*err2;
2378 }
2379 }
2380 }
2381 }
2382 return cov;
2383 }
2384 }
2385
2386 for (Int_t binx = 1; binx < nbinx-1; ++binx){
2387 for (Int_t biny = 1; biny < nbiny-1; ++biny){
2388 for (Int_t binz = 1; binz < nbinz-1; ++binz) {
2389 Int_t bin = ref->GetBin(binx, biny, binz);
2390 double err = unc->GetBinContent(bin); // <------------- For every bin in the "ref" ("result") TH1*, GetBinContents of the corresponding uncertainty bin
2391 for (Int_t binx2 = 1; binx2 < nbinx-1; ++binx2){
2392 for (Int_t biny2 = 1; biny2 < nbiny-1; ++biny2){
2393 for (Int_t binz2 = 1; binz2 < nbinz-1; ++binz2) {
2394 Int_t bin2 = ref->GetBin(binx2, biny2, binz2);
2395 double err2 = unc->GetBinContent(bin2); // <------- Grab the unc contents of every bin, and compute the covariance matrix element
2396 cov(bin, bin2) = err*err2; // <------- err1 and err2 are the uncertainty content of the bins, so "cov" is real, symmetric
2397 } // <------- "cov" would imply that the "hunc" histogram stores "x - E[x]" differences from the mean. So in the end, it computes the covariance (as a sum of these)
2398 }
2399 }
2400 }
2401 }
2402 }
2403 return cov;
2404 }
2405
2406}
2407
2408//====================== retrieval of calibration covariance matrix ========================
2409
2410//________________________________________________________________________________
2411TMatrixDSym
2413 const std::string& label,
2414 const std::string& OP,
2415 const std::string& unc)
2416{
2417 // Return the scale factor covariance matrix for the given calibration object.
2418 // This function is deprecated since its functionality is duplicated in the
2419 // CalibrationDataEigenVariations class.
2420 //
2421 // author: jet collection name
2422 // label: jet flavour label
2423 // OP: tagger working point
2424 // unc: source of uncertainty to consider
2425 // Catch issues with the specified input as early as possible
2426 TMatrixDSym dummy;
2427 if (unc == "comment" || unc == "result" || unc == "combined") return dummy;
2428
2429 unsigned int index;
2430 if (! retrieveCalibrationIndex (label, OP, author, true, index)) {
2431 // Return a dummy result if the object is not found
2432 cerr << "getScaleFactorCovarianceMatrix: unable to find SF calibration for object " << fullName(author, OP, label, true) << endl;
2433 return dummy;
2434 }
2436 if (!container) return dummy;
2437
2438 // retrieve the central calibration and its axes
2439 TH1* result = dynamic_cast<TH1*>(container->GetValue("result"));
2440 if (! result) return dummy;
2441 // "normal" case: single source of uncertainty
2442 if (unc != "all") {
2443 if (unc == "statistics") {
2444 return getStatCovarianceMatrix(result);
2445 } else {
2446 TH1* hunc = dynamic_cast<TH1*>(container->GetValue(unc.c_str()));
2447 if (! hunc) {
2448 cout << "getScaleFactorCovarianceMatrix: no uncertainty object found "
2449 << "corresponding to name " << unc << endl;
2450 return dummy;
2451 }
2452 return getSystCovarianceMatrix(result, hunc, container->isBinCorrelated(unc), unc, container->getTagWeightAxis());
2453 }
2454 }
2455
2456 // special case: complete covariance matrix. This is to be constructed
2457 // as the sum over all individual contributions.
2458 // First, treat the statistics separately (as above)
2459 TMatrixDSym cov = getStatCovarianceMatrix(result);
2460
2461 // Then loop through the list of (other) uncertainties
2462 std::vector<string> uncs = container->listUncertainties();
2463 for (unsigned int t = 0; t < uncs.size(); ++t) {
2464 if (uncs[t] == "comment" || uncs[t] == "result" || uncs[t] == "combined" ||
2465 uncs[t] == "statistics" || uncs[t]=="extrapolation" || uncs[t]=="MChadronisation" ||
2466 uncs[t]=="ReducedSets" || uncs[t]=="systematics") continue;
2467 TH1* hunc = dynamic_cast<TH1*>(container->GetValue(uncs[t].c_str()));
2468 if (not hunc) {
2469 std::cerr<<"Analysis::CalibrationDataInterfaceROOT::getScaleFactorCovarianceMatrix : dynamic cast failed\n";
2470 continue;
2471 }
2472 TMatrixDSym syst_cov = getSystCovarianceMatrix(result, hunc, container->isBinCorrelated(uncs[t]), uncs[t], container->getTagWeightAxis());
2473 cov += syst_cov;
2474 }
2475
2476 return cov;
2477}
2478
2479//________________________________________________________________________________
2480void
2481Analysis::CalibrationDataInterfaceROOT::initialize(const string& jetauthor, const string& OP, Uncertainty unc)
2482{
2483 // Preload objects necessary so that the input calibration file can be closed.
2484 // This functionality is only needed when using PROOF.
2485
2486 if((!m_fileEff)||(!m_fileSF)) {
2487 cerr << "initialize can only be called once per CalibrationDataInterfaceROOT object" << endl;
2488 return;
2489 } else {
2490 cout << "initializing BTagCalibrationDataInterfaceROOT for PROOF with jetAuthor = " << jetauthor << ", tagger = " << m_taggerName << ", operating point = " << OP << ", uncertainty = " << unc << endl;
2491 }
2492
2493 CalibrationDataVariables BTagVars;
2494 BTagVars.jetAuthor = jetauthor;
2495 BTagVars.jetPt = 100000.; //Irrelevant, just has to be valid to retrieve objects
2496 BTagVars.jetEta = 1.5; //Irrelevant, just has to be valid to retrieve objects
2497
2498 for(const auto& flavour : m_flavours){
2499 std::pair<double, double> BTagCalibResult;
2500 BTagCalibResult = getScaleFactor(BTagVars, flavour, OP, unc);
2501 std::cout << "CalibrationDataInterfaceROOT->initialize : BTagCalibResult " << std::endl;
2502
2503 std::pair<double, double> BTagCalibMCEff;
2504 BTagCalibMCEff = getMCEfficiency(BTagVars, flavour, OP, unc);
2505 std::cout << "CalibrationDataInterfaceROOT->initialize : BTagCalibMCEff " << std::endl;
2506 }
2507
2508 if (m_fileEff != m_fileSF) {
2509 m_fileEff->Close();
2510 delete m_fileEff;
2511 }
2512 m_fileSF->Close();
2513 delete m_fileSF;
2514 m_fileEff = 0; //prevents repeat deletion in destructor
2515 m_fileSF = 0; //prevents repeat deletion in destructor
2516}
2517
2518//________________________________________________________________________________
2520Analysis::CalibrationDataInterfaceROOT::retrieveContainer(const string& label, const string& OP, const string& author, const string& cntname, bool isSF, bool doPrint)
2521{
2522 // Attempt to retrieve the given container from file. Note that also the corresponding
2523 // "hadronisation" reference is retrieved (if possible and not yet done).
2524 //
2525 // dir: name of the directory containing the requested container
2526 // cntname: name of the requested container itself (not including the full path)
2527 // isSF: set to false (true) if the object is to be retrieved from the MC efficiencies
2528 // file (the calibration scale factor file). Note that it is assumed that scale
2529 // factor objects will always be retrieved from the calibration scale factor file.
2530 // doPrint: if true, print out some basic information about the successfully retrieved container
2531 // (note that this is typically steered by the m_verbose setting;
2532 // only for the retrieval of the maps used for MC/MC SF calculations, this printout is always switched off)
2533
2534 string dir = m_taggerName + "/" + getAlias(author) + "/" + OP + "/" + label;
2535 // construct the full object name
2536 string name = dir + "/" + cntname;
2537
2538 // If the object cannot be found, then each call will result in a new attempt to
2539 // retrieve the object from the ROOT file. Hopefully this will not happen too often...
2540 unsigned int idx = m_objectIndices[name] = m_objects.size();
2541 // CalibrationDataContainer* cnt =
2542 // dynamic_cast<CalibrationDataContainer*>((isSF ? m_fileSF : m_fileEff) ->Get(name.c_str()));
2544 (isSF ? m_fileSF : m_fileEff)->GetObject(name.c_str(), cnt);
2545 // If the requested object is a MC efficiency container and is not found, make a second attempt
2546 // to retrieve it from the calibration scale factor file. This will avoid the need to duplicate
2547 // efficiency containers so that the MC efficiency file needs to store only those containers
2548 // not already present in the calibration scale factor file. Of course this is meaningful only
2549 // if separate files are used to begin with.
2550 if (!isSF && !cnt && m_fileSF != m_fileEff) m_fileSF->GetObject(name.c_str(), cnt);
2551 m_objects.push_back(cnt);
2552 if (!cnt) {
2553 cerr << "btag Calib: retrieveContainer: failed to retrieve container named " << name.c_str() << " from file" << endl;
2554 return 0;
2555 }
2556
2557 // For successfully retrieved containers, also print some more information (implemented on user request)
2558 if (doPrint) {
2559 cout << "CalibrationDataInterface: retrieved container " << name << " (with comment: '" << cnt->getComment() << "' and hadronisation setting '" << cnt->getHadronisation() << "')" << endl;
2560 }
2561
2562
2563 // If the requested object is a MC efficiency container, make sure to retrieve the corresponding
2564 // calibration scale factor container first (a feature first thought to be necessary, erroneously,
2565 // but left in since this ordering should not hurt in any case).
2566 if (m_refMap.find(dir) == m_refMap.end()) {
2567 if (isSF) {
2568 // Retrieve the mapping objects from both files and merge their information using the 'helper' class.
2569 // The map resulting from this is used to retrieve the information required to compute MC/MC scale factors.
2570 string hadronisationRefs(dir + "/MChadronisation_ref");
2571 TMap* mapSF = 0; m_fileSF->GetObject(hadronisationRefs.c_str(), mapSF);
2572 TMap* mapEff = 0; if (m_fileEff != m_fileSF) m_fileEff->GetObject(hadronisationRefs.c_str(), mapEff);
2573 m_refMap[dir] = new HadronisationReferenceHelper(mapSF, mapEff);
2574 delete mapSF;
2575 delete mapEff;
2576 } else {
2577 string SFCalibName = getContainername(getBasename(dir), true);
2578 if (m_objectIndices.find(SFCalibName) == m_objectIndices.end()) retrieveContainer(label, OP, author, SFCalibName, true, doPrint);
2579 }
2580 }
2581
2582 // Attempt to find the corresponding hadronisation reference container needed for the application of
2583 // MC/MC scale factors.
2584 if (idx+1 > m_hadronisationReference.size()) m_hadronisationReference.resize(idx+1, -1);
2585 m_hadronisationReference[idx] = -1;
2586 string spec = cnt->getHadronisation();
2587 if (spec != "") {
2588 std::map<string, HadronisationReferenceHelper*>::const_iterator mapit = m_refMap.find(dir);
2589 if (mapit != m_refMap.end()) {
2590 string ref;
2591 if (mapit->second->getReference(spec, ref)) {
2592 // Retrieve the hadronisation reference if not already done. Note that the "isSF" is left unchanged:
2593 // this allows to retrieve the reference from the same file as the scale factor object. An exception
2594 // is the reference for the calibration scale factor object, which should always be obtained from
2595 // the scale factor file.
2596 // An efficiency container can be its own hadronisation reference (this is not "protected" against).
2597 string refname(dir + "/" + ref);
2598 std::map<string, unsigned int>::const_iterator it = m_objectIndices.find(refname);
2599 // If the reference cannot be found, assume that it hasn't yet been retrieved so attempt it now.
2600 if (it == m_objectIndices.end()) {
2601 // Omit the printout of container information here (the idea being that showing MC/MC SF information would confuse rather than help)
2602 retrieveContainer(label, OP, author, ref, isSF, false); it = m_objectIndices.find(refname);
2603 }
2604 m_hadronisationReference[idx] = it->second;
2605 }
2606 } else if (m_useMCMCSF) {
2607 cerr << "btag Calib: retrieveContainer: MC hadronisation reference map not found -- this should not happen!" << endl;
2608 }
2609 }
2611 // Not being able to construct the MC/MC scale factors will lead to a potential bias.
2612 // However, this is not considered sufficiently severe that we will flag it as an error.
2613 if (m_useMCMCSF){
2614 cerr << "btag Calib: retrieveContainer: warning: unable to apply MC/MC scale factors for container " << name << " with hadronisation reference = '" << spec << "'" << endl;
2615 }
2616 }
2617
2618 // Initialize the Eigenvector variation object corresponding to this object, if applicable. Notes:
2619 // - the dual use of "isSF" (both referring to the file and to the object, see above) requires another protection here
2620 // - the constructor's second argument is used to determine whether to exclude a pre-determined set of uncertainties from the EV decomposition
2621 //
2622 // We also want to separate behavior between SFEigen and SFGlobalEigen systematic strategies
2623 // The former requires a CalibrationDataEigenVariations object to be made per flavour.
2624 // The latter combines all corresponding flavours, so once it's been made for a single flavour, it's cached under all the corresponding "flavour containers"
2625 // simulataneously in m_eigenVariationsMap, and is checked for on each subsequent call to this method.
2626 if (m_runEigenVectorMethod && isSF && name.find("_SF") != string::npos) {
2627 CalibrationDataHistogramContainer* histoContainer=dynamic_cast<CalibrationDataHistogramContainer*>(cnt);
2628 if (histoContainer==0) {
2629 cerr << "Could not cast Container to a HistogramContainer. " << endl;
2630 return 0;
2631 }
2632 if (m_EVStrategy == Analysis::Uncertainty::SFEigen){
2634 std::shared_ptr<CalibrationDataEigenVariations> newEigenVariation(new CalibrationDataEigenVariations(m_filenameSF, m_taggerName, OP, author, histoContainer, m_useRecommendedEVExclusions));
2635 newEigenVariation->setVerbose(m_verbose);
2636
2637 // At this point we may also want to reduce the number of eigenvector variations.
2638 // The choices are stored with the container object; but first we need to know what flavour we are dealing with.
2639 string flavour = dir.substr(dir.find_last_of("/")+1);
2640
2641 for (const auto & entry : m_excludeFromCovMatrix[flavour]) {
2642 newEigenVariation->excludeNamedUncertainty(entry, cnt);
2643 }
2644 newEigenVariation->initialize();
2645 int to_retain = histoContainer->getEigenvectorReduction(m_EVReductions[flavour]); // returns the number of eigenvariations to retain as per the EV reduction strategy
2646 if (to_retain > -1) {
2647 if (m_verbose) cout << "btag Calib: reducing number of eigenvector variations for flavour " << flavour << " to " << to_retain << endl;
2648 // The merged variations will end up as the first entry in the specified list, i.e., as the last of the variations to be "retained"
2649 newEigenVariation->mergeVariationsFrom(size_t(to_retain-1)); // All variations stored with indices larger than this are merged
2650 } else if (m_EVReductions[flavour] != Loose) {
2651 cerr << "btag Calib: unable to retrieve eigenvector reduction information for flavour " << flavour << " and scheme " << m_EVReductions[flavour] << "; not applying any reduction" << endl;
2652 }
2653 m_eigenVariationsMap[cnt]=std::move(newEigenVariation);
2654
2656 } else if (m_EVStrategy == Analysis::Uncertainty::SFGlobalEigen) {
2658 std::map<const CalibrationDataContainer*, std::shared_ptr<CalibrationDataEigenVariations> >::iterator evit = m_eigenVariationsMap.find(cnt);
2659 // The global implementation internally combines all the "flavour containers" (containers that correspond to each other, only with different flavours)
2660 // But the CalibrationDataInterfaceROOT object doesn't need to know that, so we want to get all the flavour containers in one go here
2661 // and map them (with m_eigenVariationsMap) to the same CalibrationDataGlobalEigenVariations pointer.
2662 // Then, in methods like getScaleFactor, we call the virtual methods which will give the proper result e.g. if you want the SF for a b-jet, it'll call the
2663
2664 if (evit == m_eigenVariationsMap.end()){
2665 // now to see if it's completely empty or not
2666 if (m_eigenVariationsMap.empty()){
2667 std::shared_ptr<CalibrationDataGlobalEigenVariations> newEigenVariation(new CalibrationDataGlobalEigenVariations(m_filenameSF, m_taggerName, OP, author, m_flavours, histoContainer, m_useRecommendedEVExclusions));
2668 for (const auto & entry : m_excludeFromCovMatrix[label]) {
2669 newEigenVariation->excludeNamedUncertainty(entry, label); // <---- custom exclude named uncertainties method for global variations
2670 }
2671
2672 newEigenVariation->initialize();
2673
2674 // flavour loop to get the flavour reduction schemes and apply them
2675 for (std::string& flavour : m_flavours){
2676 int to_retain = histoContainer->getEigenvectorReduction(m_EVReductions[flavour]); // returns the number of eigenvariations to retain as per the EV reduction strategy
2677 if (to_retain > -1) {
2678 if (m_verbose) cout << "btag Calib: reducing number of eigenvector variations for flavour " << flavour << " to " << to_retain << endl;
2679 // The merged variations will end up as the first entry in the specified list, i.e., as the last of the variations to be "retained"
2680 newEigenVariation->mergeVariationsFrom(size_t(to_retain-1), flavour); // All variations stored with indices larger than this are merged
2681 } else if (m_EVReductions[flavour] != Loose) {
2682 cerr << "btag Calib: unable to retrieve eigenvector reduction information for flavour " << flavour << " and scheme " << m_EVReductions[flavour] << "; not applying any reduction" << endl;
2683 }
2684 }
2685
2686 m_eigenVariationsMap.insert({cnt, newEigenVariation});
2687 } else {
2688 // Need to point to the CDGEV object four times in the m_eigenVariationsMap to appease the CDIROOT backend design...
2689 // Ok, turns out I can't retrieve the containers from CDGEV and insert them directly, because I'd have to use the containers directly instead..
2690 // So the strategy is to just take the CGEV objects that are already in the map, and mpa the present container to it
2691 std::shared_ptr<CalibrationDataEigenVariations> previous_eigenvariation = m_eigenVariationsMap.begin()->second;
2692 m_eigenVariationsMap.insert({cnt, previous_eigenvariation});
2693 }
2694
2695 } else {
2696 std::cout << "CalibrationDataInterfaceROOT->retrieveContainer : the CDGEV object for " << name << " already exists! " << std::endl;
2697 }
2699 }
2700 }
2701
2702 return cnt;
2703}
2704
2705//________________________________________________________________________________
2706string
2708{
2709 // Return the alias for the given jet collection name, if an alias exists.
2710 // If this is not the case, the return value will simply equal the input jet collection name.
2711
2712 std::map<string,string>::const_iterator it = m_aliases.find(author);
2713 return (it == m_aliases.end()) ? author : it->second;
2714}
2715
2716//________________________________________________________________________________
2717string
2718Analysis::CalibrationDataInterfaceROOT::fullName(const string& author, const string& OP,
2719 const string& label, bool isSF,
2720 unsigned mapIndex) const
2721{
2722 // Construct the full calibration object's pathname within the calibration ROOT file.
2723 //
2724 // author: jet collection name
2725 // OP: tagger working point
2726 // label: jet flavour label
2727 // isSF: set to true (false) for scale factors (MC efficiencies)
2728 // mapIndex: index in the list of MC efficiency calibration objects
2729
2730 string flavour = (label == "N/A") ? "Light" : label;
2731 string full(m_taggerName + "/" + getAlias(author) + "/" + OP + "/" + flavour + "/");
2732 full += getContainername(flavour, isSF, mapIndex);
2733 // full += getAlias(author); full += "/";
2734 // string name = (isSF) ?
2735 // getBasename(OP, label, "_SF", true) :
2736 // getBasename(OP, label, "_Eff", false, mapIndex);
2737 // full += name;
2738 return full;
2739}
2740
2741//________________________________________________________________________________
2743{
2744 // Create the map from hadronisation specifications to reference container names for
2745 // a given ROOT file directory.
2746 //
2747 // mapSF: reference specification as extracted from calibration scale factor file
2748 // mapEff: reference specification as extracted from MC efficiency file
2749 // (null if the two files are identical)
2750
2751 // First take the scale factor file's map
2752 if (mapSF) {
2753 TMapIter next(mapSF); TObjString* spec;
2754 while ((spec = (TObjString*) next())) {
2755 TObjString* ref = (TObjString*) mapSF->GetValue(spec);
2756 m_refs[string(spec->GetName())] = string(ref->GetName());
2757 }
2758 }
2759 // Then do the same with the efficiency file's map. The result will be to override any
2760 // items from the SF file's map. An exception is made for the scale factor calibration object,
2761 // for which (for the sake of consistency) the SF reference must be retained.
2762 if (mapEff) {
2763 TMapIter next(mapEff); TObjString* spec;
2764 while ((spec = (TObjString*) next())) {
2765 TObjString* ref = (TObjString*) mapEff->GetValue(spec);
2766 m_refs[string(spec->GetName())] = string(ref->GetName());
2767 }
2768 }
2769}
2770
2771//________________________________________________________________________________
2772bool
2774 string& ref) const
2775{
2776 // Extract the reference histogram name corresponding to the given hadronisation specification (if existing).
2777 // The return value is used to indicate whether the specification could be found.
2778 //
2779 // spec: hadronisation specification
2780 // ref: container name corresponding to this specification
2781
2782 std::map<string, string>::const_iterator it = m_refs.find(spec);
2783 if (it == m_refs.end()) return false;
2784
2785 ref = it->second;
2786 return true;
2787}
const boost::regex ref(r_ef)
static const std::string hadronisationRefs("MChadronisation_ref")
ClassImp(Analysis::CalibrationDataInterfaceROOT) Analysis
std::vector< std::string > split(const std::string &str, const char token=';')
local utility function: split string into a vector of substrings separated by a specified separator,...
#define GEV
string trim(string s)
#define x
constexpr int pow(int base, int exp) noexcept
This is the interface for the objects to be stored in the calibration ROOT file.
static bool isNearlyEqual(double a, double b)
utility for comparison of doubles
std::vector< unsigned int > getVariableTypes()
utility to retrieve variable types
virtual CalibrationStatus getResult(const CalibrationDataVariables &x, double &result, TObject *obj=0, bool extrapolate=false)=0
retrieve the calibration result.
This is the class holding information for histogram-based calibration results.
virtual int getEigenvectorReduction(unsigned int choice) const
Retrieve the number of eigenvectors to be retained for the purpose of eigenvector variation reduction...
virtual std::vector< double > getBinBoundaries(unsigned int vartype)
Retrieve the bin boundaries for the specified variable type (which should be a CalibrationParametriza...
virtual CalibrationStatus getResult(const CalibrationDataVariables &x, double &result, TObject *obj=0, bool extrapolate=false)
retrieve the calibration result.
std::string m_taggerName
tagging algorithm name
void setEffCalibrationNames(const std::map< std::string, std::vector< std::string > > &names)
std::string getContainername(const std::string &flavour, bool SF, unsigned int mapIndex=0) const
auxiliary function for string concatenation
void setSFCalibrationNames(const std::map< std::string, std::string > &names)
std::string getBasename(const std::string &name) const
auxiliary function for retrieval of name within the directory
double combinedUncertainty(double stat, const std::pair< double, double > &syst) const
utility function for combination of statistical and (a priori asymmetric) systematic uncertainty.
bool getReference(const std::string &spec, std::string &ref) const
Retrieve the (full) name of the reference histogram, given the hadronisation specification.
std::map< std::string, std::string > m_refs
map from hadronisation specification to container name
This tool provides an interface to flavour tagging performance estimates.
bool m_verbose
if true, allow also for some informational (and not only error/warning) messages
std::vector< std::pair< unsigned int, unsigned int > > m_checkedWeightScaleFactors
bool m_useTopologyRescaling
specify whether or not to use MC/MC (topology) scale factors (also this steering option may be remove...
CalibResult getWeightScaleFactor(const CalibrationDataVariables &variables, const std::string &label, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
efficiency scale factor retrieval by name
std::map< std::string, unsigned int > m_objectIndices
std::map< std::string, std::string > m_aliases
Do not attempt to persistify (PROOF)
TFile * m_fileSF
Do not attempt to persistify (PROOF)
void increaseCounter(unsigned int index, OutOfBoundsType oob=Main)
bool m_useRecommendedEVExclusions
if true, exclude pre-recommended lists of uncertainties from the covariance matrix building,...
std::map< std::string, HadronisationReferenceHelper * > m_refMap
the following maps (one for each directory) specify the name of the container serving as the 'hadroni...
TFile * m_fileEff
pointer to the TFile object providing access to the calibrations
CalibResult getInefficiencyScaleFactor(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
"MC" inefficiency scale factor retrieval by name
std::string nameFromIndex(unsigned int index) const
Retrieve the name of the calibration object (container) given its index.
bool retrieveCalibrationIndex(const std::string &label, const std::string &OP, const std::string &author, bool isSF, unsigned int &index, unsigned int mapIndex=0)
Retrieve the index of the calibration object (container) starting from the label and operating point.
std::string fullName(const std::string &author, const std::string &OP, const std::string &label, bool isSF, unsigned mapIndex=0) const
@ brief construct the full object pathname from its individual components
const TH1 * getBinnedScaleFactors(const std::string &author, const std::string &label, const std::string &OP)
retrieve the binned calibration object for the given flavour label and operating point.
unsigned int getNumVariations(const std::string &author, const std::string &label, const std::string &OP, Uncertainty unc)
retrieve the number of variations relevant to the calibration object.
CalibrationDataContainer * retrieveContainer(const std::string &label, const std::string &OP, const std::string &author, const std::string &cntname, bool isSF, bool doPrint=true)
utility function taking care of object retrieval
bool m_runEigenVectorMethod
decide whether to run the eigenvector method or not
CalibrationDataInterfaceROOT()
default constructor for PROOF object retrieval
std::map< std::string, std::map< std::string, float > > m_coefficientMap
CalibrationStatus runEigenVectorRecomposition(const std::string &author, const std::string &label, const std::string &OP, unsigned int mapindex=0)
run EigenVector Recomposition method
CalibResult getInefficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
inefficiency retrieval by name
std::map< std::string, Analysis::EVReductionStrategy > m_EVReductions
Eigenvector reduction strategy (per flavour)
const TObject * getMCEfficiencyObject(const std::string &author, const std::string &label, const std::string &OP, unsigned int mapIndex=0)
retrieve the MC efficiency (central values) object for the given flavour label and operating point.
bool m_useMCMCSF
specify whether or not to use MC/MC (hadronisation) scale factors (the fact that this is steerable is...
std::vector< std::string > listScaleFactorUncertainties(const std::string &author, const std::string &label, const std::string &OP, bool named=false)
retrieve the list of "uncertainties" relevant to the calibration object.
CalibResult getScaleFactor(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, unsigned int numVariation=0, unsigned int mapIndex=0)
efficiency scale factor retrieval by name.
std::map< std::string, std::map< std::string, float > > getEigenVectorRecompositionCoefficientMap()
Get Eigenvector recomposition map after running runEigenVectorRecomposition()
std::vector< int > m_hadronisationReference
store the 'hadronisation' reference for each object (-1 means no reference found)
TMatrixDSym getScaleFactorCovarianceMatrix(const std::string &author, const std::string &label, const std::string &OP, const std::string &unc="all")
retrieve the named covariance matrix element corresponding to the binned calibration object.
CalibResult getMCInefficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc=None, unsigned int mapIndex=0)
"MC" inefficiency retrieval by name
const TH1 * getShiftedScaleFactors(const std::string &author, const std::string &label, const std::string &OP, const std::string &unc, double sigmas)
retrieve the binned calibration object for the given flavour label and operating point,...
CalibResult getMCEfficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc=None, unsigned int mapIndex=0)
"MC" efficiency retrieval by name
std::string getAlias(const std::string &author) const
associated alias retrieval method
std::map< std::string, std::vector< std::string > > m_excludeFromCovMatrix
store the uncertainties which should be excluded from building the full covariance matrix
double getMCMCScaleFactor(const CalibrationDataVariables &variables, unsigned indexSF, unsigned int indexEff) const
MC/MC scale factor retrieval.
std::vector< CalibrationDataContainer * > m_objects
cache the objects themselves (so that the user will not have to delete them after each call etc....
double m_maxAbsEta
|eta| bounds and strategy for dealing with out-of-bounds conditions
CalibResult getEfficiency(const CalibrationDataVariables &variables, const std::string &label, const std::string &OP, Uncertainty unc, const std::string &flavour, unsigned int numVariation=0, unsigned int mapIndex=0)
efficiency retrieval by name
void initialize(const std::string &jetauthor, const std::string &OP, Uncertainty unc)
initialization for PROOF usage
std::vector< unsigned int > m_etaCounters
counters for flagging out-of-bound cases
std::map< const CalibrationDataContainer *, std::shared_ptr< CalibrationDataEigenVariations > > m_eigenVariationsMap
store the eigenvector class and associate to its CalibrationDataContainer
bool checkAbsEta(const CalibrationDataVariables &variables, unsigned int index)
void checkWeightScaleFactors(unsigned int indexSF, unsigned int indexEff)
std::string m_filenameSF
in addition, store also the filenames themselves (needed for the copy constructor)
This class (struct, actually) is nothing but a light-weight container of (kinematic or other) variabl...
bool verbose
Definition hcg.cxx:73
std::string label(const std::string &format, int i)
Definition label.h:19
std::vector< std::string > split(const std::string &str, const char token=';')
local utility function: split string into a vector of substrings separated by a specified separator,...
The namespace of all packages in PhysicsAnalysis/JetTagging.
OutOfBoundsType
counter types (to be used when flagging out-of-bounds cases)
const CalibResult dummyResult(dummyValue, dummyValue)
std::pair< double, double > CalibResult
std::pair< double, double > UncertaintyResult
The following typedef is for convenience: most uncertainties can be asymmetric.
Uncertainty
specification of type information requested by the user
Definition index.py:1
STL namespace.