ATLAS Offline Software
CP::MakeSystematicsVector Class Reference

This class handles turning the list of systematics into the actual list of nuisance parameter points to evaluate.

#include <MakeSystematicsVector.h>

Classes

struct  GroupConfig
 the configuration for the given group

Public Member Functions

void testInvariant () const
 test the invariant of this object
 MakeSystematicsVector ()
 standard default constructor
const std::vector< SystematicSet > & result (const std::string &label) const
 the list of nuisance parameter points generated with the given label
void calc (const SystematicSet &sysList)
 fill in result
void addGroup (const std::string &val_label)
 finish configuration for this group and add a new one
void setPattern (const std::string &val_pattern)
 set the pattern for the current group
void setSigma (float val_sigma)
 set the number of sigmas to vary this group by
void setToys (unsigned val_toys)
 set the number of toys to run for this group
void useForNominal ()
 set this group as the default, i.e. the group containing the nominal variation

Private Member Functions

std::vector< std::map< std::string, std::vector< SystematicVariation > > > calcBaseSys (const SystematicSet &sysList)
 make the list of base systematics for calc

Private Attributes

std::map< std::string, std::vector< SystematicSet > > m_result
 the value of result
std::vector< GroupConfig > m_config
 the configuration on a per-group basis
std::string m_useForNominal
 the group for which useForNominal was set

Detailed Description

This class handles turning the list of systematics into the actual list of nuisance parameter points to evaluate.

This is meant as a placeholder for a generic tool to be developed by the statistics forum (or as a prototype for it).

For now I decided to keep it as a single class, but there are other options, e.g. the nested configuration struct (MakeSystematicsVector::GroupConfig) could be made a class that the user configures directly and then passes in. However, for now the single-class approach seems better, as it hides some of the mechanics from the user and gives me more freedom on the backend side.

Definition at line 33 of file MakeSystematicsVector.h.
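
The listings on this page suggest that the object always holds a "current" group (the last entry of m_config) and that the setters act on that group until addGroup() starts a new one. The following is a minimal, self-contained sketch of that pattern. GroupConfigSketch, GroupBuilderSketch and makeExampleConfig are hypothetical names invented for this illustration, the default value of sigma is an assumption, and the patterns are made up; this is not the ATLAS implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-group configuration. The field names
// follow the members used in the listings on this page; the default values
// are assumptions made for this sketch.
struct GroupConfigSketch
{
  std::string label;    // group label, empty for the initial group
  std::string pattern;  // regular expression, empty means "fallback group"
  float sigma = 1;      // number of sigmas to vary by (assumed default)
  unsigned toys = 0;    // 0 means regular (non-toy) variations
};

// Hypothetical builder that mimics how the setters act on the current group.
class GroupBuilderSketch
{
public:
  // start with a single unnamed group, like "m_config (1)" in the constructor
  GroupBuilderSketch () : m_config (1) {}

  // finish the current group and start a new one with the given label
  void addGroup (const std::string& label)
  {
    GroupConfigSketch config;
    config.label = label;
    m_config.push_back (std::move (config));
  }

  // all setters modify the most recently added group
  void setPattern (const std::string& pattern) { m_config.back().pattern = pattern; }
  void setSigma (float sigma) { m_config.back().sigma = sigma; }
  void setToys (unsigned toys) { m_config.back().toys = toys; }

  const std::vector<GroupConfigSketch>& groups () const { return m_config; }

private:
  std::vector<GroupConfigSketch> m_config;
};

// Example configuration: the initial unnamed group, a five-sigma group for
// small systematics, and a toy group. The patterns are made up.
inline GroupBuilderSketch makeExampleConfig ()
{
  GroupBuilderSketch builder;
  builder.addGroup ("small");
  builder.setPattern ("JET_.*");
  builder.setSigma (5);
  builder.addGroup ("toys");
  builder.setPattern (".*_TOY");
  builder.setToys (20);
  return builder;
}
```

With this layout, anything configured before the first addGroup() call applies to the initial unnamed group.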

Constructor & Destructor Documentation

◆ MakeSystematicsVector()

CP::MakeSystematicsVector::MakeSystematicsVector ( )

standard default constructor

Guarantee
no-fail

Definition at line 71 of file MakeSystematicsVector.cxx.

: m_config (1)
{
  RCU_NEW_INVARIANT (this);
}

Member Function Documentation

◆ addGroup()

void CP::MakeSystematicsVector::addGroup ( const std::string & val_label)

finish configuration for this group and add a new one

Parameters
val_label  the label for the new group
Guarantee
strong
Failures
out of memory II

Definition at line 182 of file MakeSystematicsVector.cxx.

{
  RCU_CHANGE_INVARIANT (this);
  GroupConfig config;
  config.label = val_label;
  m_config.push_back (std::move(config));
}

◆ calc()

void CP::MakeSystematicsVector::calc ( const SystematicSet & sysList)

fill in result

Parameters
sysList  the list of systematics to use, usually the list of recommended systematics from your CP tools, framework or the systematics registry
Guarantee
strong
Failures
out of memory II
configuration errors

Definition at line 93 of file MakeSystematicsVector.cxx.

{
  auto baseSys = calcBaseSys (sysList);

  std::map<std::string,std::vector<SystematicSet>> myresult;
  myresult[m_useForNominal].push_back (SystematicSet ());
  for (std::size_t group = 0; group != m_config.size(); ++ group)
  {
    const auto& config = m_config[group];

    // note: this is not just a short-cut, but also makes sure that
    // we have an entry for each label, even if there are no
    // systematics for the label
    auto& subresult = myresult[config.label];

    // this skips groups that don't match any requested systematics,
    // which is mainly important for toy systematics as you wouldn't
    // want to generate a bunch of empty systematics
    if (baseSys[group].empty())
      continue;

    if (config.toys == 0)
    {
      for (auto sys : baseSys[group])
      {
        RCU_ASSERT (!sys.second.empty());
        RCU_ASSERT (!sys.second.front().isToyEnsemble());
        if (sys.second.front().isContinuousEnsemble())
        {
          // for continuous systematics
          subresult.push_back(CP::SystematicSet());
          subresult.back().insert (CP::SystematicVariation (sys.first, config.sigma));
          subresult.push_back(CP::SystematicSet());
          subresult.back().insert (CP::SystematicVariation (sys.first, -config.sigma));
        } else if (sys.second.front().isEnsemble())
        {
          // we must have added a new kind of ensemble after I wrote
          // this code
          RCU_THROW_MSG ("unsupported ensemble systematic: " + sys.first);
        } else
        {
          // otherwise just add all of them flat
          for (const auto & mysys : sys.second)
          {
            subresult.push_back(CP::SystematicSet());
            subresult.back().insert(mysys);
          }
        }
      }
    } else
    {
      std::vector<CP::SystematicSet> toys (config.toys);

      for (auto sys : baseSys[group])
      {
        RCU_ASSERT (!sys.second.empty());
        RCU_ASSERT (sys.second.front().isEnsemble());

        if (sys.second.front().isContinuousEnsemble())
        {
          std::unique_ptr<TRandom3> random (new TRandom3);
          random->SetSeed (hash_string (sys.first));

          for (auto& toy : toys)
            toy.insert (CP::SystematicVariation (sys.first, random->Gaus (0, config.sigma)));
        } else if (sys.second.front().isToyEnsemble())
        {
          for (unsigned toy = 0; toy != config.toys; ++ toy)
            toys[toy].insert (CP::SystematicVariation::makeToyVariation (sys.first, toy + 1, config.sigma));
        } else
        {
          // we must have added a new kind of ensemble after I
          // wrote this code
          RCU_THROW_MSG ("unsupported ensemble systematic for toys: " + sys.first);
        }
      }
      for (auto& toy : toys)
        subresult.push_back (std::move (toy));
    }
  }

  m_result = std::move(myresult);
}
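
For groups without toys, the listing above turns each continuous systematic into two nuisance parameter points, one at +sigma and one at -sigma, each varying only that systematic. A minimal sketch of that expansion follows, using plain standard-library types in place of CP::SystematicSet and CP::SystematicVariation. PointSketch and expandContinuous are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One nuisance parameter point: a list of (systematic base name, value in
// sigmas). This is a plain stand-in for CP::SystematicSet, used only here.
using PointSketch = std::vector<std::pair<std::string, float>>;

// For every continuous systematic, emit one point at +sigma followed by one
// point at -sigma, each containing only that systematic.
std::vector<PointSketch> expandContinuous (const std::vector<std::string>& names,
                                           float sigma)
{
  std::vector<PointSketch> result;
  for (const auto& name : names)
  {
    PointSketch up;
    up.emplace_back (name, sigma);
    result.push_back (std::move (up));

    PointSketch down;
    down.emplace_back (name, -sigma);
    result.push_back (std::move (down));
  }
  return result;
}
```

Two systematics therefore yield four points, in the order up, down, up, down.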

◆ calcBaseSys()

std::vector< std::map< std::string, std::vector< SystematicVariation > > > CP::MakeSystematicsVector::calcBaseSys ( const SystematicSet & sysList)
private

make the list of base systematics for calc

Guarantee
strong
Failures
out of memory II

Definition at line 232 of file MakeSystematicsVector.cxx.

{
  std::map<std::string,std::vector<SystematicVariation> > basesys;
  for (const auto & sys : sysList)
  {
    basesys[sys.basename()].push_back (sys);
  }
  std::vector<std::map<std::string,std::vector<SystematicVariation> >>
    basesysList (m_config.size());
  for (auto sys : basesys)
  {
    // extract the ensemble if we have one
    SystematicVariation ensemble;
    for (const auto & mysys : sys.second)
    {
      if (mysys.isEnsemble())
      {
        if (!ensemble.empty())
          RCU_THROW_MSG ("inconsistent ensembles requested: " + ensemble.name() + " " + mysys.name());
        ensemble = mysys;
      }
    }

    // setting this beyond the valid groups in case none matches
    std::size_t group = m_config.size();
    for (std::size_t iter = 0; iter != m_config.size(); ++ iter)
    {
      if (m_config[iter].pattern.empty())
      {
        // only use empty patterns if no previous pattern already took this
        if (group == m_config.size())
        {
          if (m_config[iter].toys > 0)
          {
            if (ensemble.isToyEnsemble())
              group = iter;
          } else
          {
            if (!ensemble.isToyEnsemble())
              group = iter;
          }
        }
      } else if (RCU::match_expr (std::regex (m_config[iter].pattern.c_str()), sys.first))
      {
        if (m_config[iter].toys > 0 && ensemble.empty())
          RCU_THROW_MSG ("toys only supported for ensemble systematics");
        group = iter;
      }
    }
    if (group == m_config.size())
      RCU_THROW_MSG ("no systematics group for systematic: " + sys.first);

    if (!ensemble.empty())
    {
      basesysList[group][sys.first].push_back (std::move(ensemble));
    } else
    {
      basesysList[group][sys.first] = std::move (sys.second);
    }
  }
  return basesysList;
}
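
The group selection in the listing above can be summarized as follows: a group with a non-empty pattern claims a systematic when the pattern matches the entire base name, and a later match overrides an earlier one; a group with an empty pattern is only a fallback and is used when no earlier group has claimed the systematic and its toy setting agrees with the kind of systematic. The sketch below isolates that rule with std::regex_match standing in for RCU::match_expr. GroupSketch and selectGroup are hypothetical names, and the error handling of the original (the RCU_THROW_MSG calls) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical, reduced group configuration for this illustration.
struct GroupSketch
{
  std::string pattern;  // empty means "fallback group"
  unsigned toys;        // 0 means regular (non-toy) variations
};

// Returns the index of the selected group, or groups.size() if no group
// applies (the original code throws in that case).
std::size_t selectGroup (const std::vector<GroupSketch>& groups,
                         const std::string& basename, bool isToyEnsemble)
{
  std::size_t group = groups.size();
  for (std::size_t iter = 0; iter != groups.size(); ++iter)
  {
    if (groups[iter].pattern.empty())
    {
      // an empty pattern never overrides an earlier choice, and toy groups
      // only pick up toy ensembles (and vice versa)
      if (group == groups.size() && (groups[iter].toys > 0) == isToyEnsemble)
        group = iter;
    }
    else if (std::regex_match (basename, std::regex (groups[iter].pattern)))
    {
      // explicit patterns must match the whole name; later matches win
      group = iter;
    }
  }
  return group;
}
```

Because the match is against the whole name, a pattern such as "JET" does not select "JET_JES"; it would have to be written as "JET.*" or similar.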

◆ result()

const std::vector< SystematicSet > & CP::MakeSystematicsVector::result ( const std::string & label) const

the list of nuisance parameter points generated with the given label

Guarantee
strong
Failures
unknown label
Precondition
calc() has been called

Definition at line 80 of file MakeSystematicsVector.cxx.

{
  RCU_READ_INVARIANT (this);
  RCU_REQUIRE2 (!m_result.empty(), "calculate() has been called");
  auto iter = m_result.find (label);
  if (iter == m_result.end())
    RCU_THROW_MSG ("unknown systematics group: " + label);
  return iter->second;
}

◆ setPattern()

void CP::MakeSystematicsVector::setPattern ( const std::string & val_pattern)

set the pattern for the current group

Guarantee
strong
Failures
out of memory II

Definition at line 193 of file MakeSystematicsVector.cxx.

{
  m_config.back().pattern = val_pattern;
}

◆ setSigma()

void CP::MakeSystematicsVector::setSigma ( float val_sigma)

set the number of sigmas to vary this group by

Normally we use just +/-1 sigma variations, but if the systematics are very small, their effect can get lost in the statistical jitter. For those cases it is better to do a multi-sigma variation and then scale it back to +/-1 sigma, thereby reducing the statistical jitter introduced into the systematic. A traditional choice for these cases is a five-sigma variation.

Please note that if you do this, you normally only want to do this for small systematics for which statistical jitter is an issue. For large systematics there is a legitimate concern that a 5 sigma variation won't be 5 times the size of a 1 sigma variation, introducing a different kind of bias. To that end, if you use this, you should normally put the small systematics into a separate group from your regular systematics (which then also makes it easier for you to know which one to scale down).
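
As a worked illustration of the scaling described above: if a histogram is evaluated at the nominal point and at an n-sigma variation, the estimated one-sigma shift in each bin is (varied - nominal) / n, which is only valid to the extent that the response is linear in the nuisance parameter. The sketch below does this bin by bin; scaleToOneSigma is a hypothetical helper, not part of this class.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scale a multi-sigma shift back to one sigma, bin by bin. This assumes a
// linear response, which is the caveat discussed in the text above.
std::vector<double> scaleToOneSigma (const std::vector<double>& nominal,
                                     const std::vector<double>& varied,
                                     double sigma)
{
  assert (nominal.size() == varied.size());
  assert (sigma > 0);
  std::vector<double> shift (nominal.size());
  for (std::size_t bin = 0; bin != nominal.size(); ++bin)
    shift[bin] = (varied[bin] - nominal[bin]) / sigma;
  return shift;
}
```

For example, a five-sigma variation that moves a bin from 100 to 102.5 corresponds to an estimated one-sigma shift of 0.5.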

Guarantee
no-fail
Precondition
val_sigma > 0

Definition at line 202 of file MakeSystematicsVector.cxx.

{
  RCU_REQUIRE (val_sigma > 0);
  m_config.back().sigma = val_sigma;
}

◆ setToys()

void CP::MakeSystematicsVector::setToys ( unsigned val_toys)

set the number of toys to run for this group

This is a specialized mechanism pioneered for the e/gamma and muon scale factors. Instead of evaluating a large number of systematics separately, it lets you vary all of them at the same time repeatedly, so that instead of hundreds of systematic variations you only have to perform maybe 10 or 20 "toy" variations. You then just take the spread of the variations as the overall systematic uncertainty from the toys. There are reports that, compared to the "old" method, this can yield a factor of five reduction in systematic uncertainty without increasing the number of systematic variations to evaluate.
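
To illustrate what "varying all of them at the same time" means, the sketch below builds a set of toys in which every toy assigns a Gaussian-distributed value to every systematic. It follows the continuous-ensemble branch of the calc() listing on this page, which seeds a TRandom3 generator from a hash of the systematic name; here std::mt19937 and std::hash are swapped in as stand-ins, so the numbers will not match the real tool. ToySketch and makeToys are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <random>
#include <string>
#include <vector>

// One toy: a value (in sigmas) for each systematic that takes part.
using ToySketch = std::map<std::string, double>;

// Build numToys toys; each systematic is sampled independently from a
// Gaussian of width sigma (which must be positive).
std::vector<ToySketch> makeToys (const std::vector<std::string>& names,
                                 unsigned numToys, double sigma)
{
  std::vector<ToySketch> toys (numToys);
  for (const auto& name : names)
  {
    // seed from the name, so that a given systematic gets the same sequence
    // of values no matter which other systematics are requested
    const auto seed = static_cast<std::mt19937::result_type>
      (std::hash<std::string> () (name));
    std::mt19937 random (seed);
    std::normal_distribution<double> gaus (0.0, sigma);

    for (auto& toy : toys)
      toy[name] = gaus (random);
  }
  return toys;
}
```

Seeding per systematic keeps the toys reproducible within a given build, which matters when the same toy index has to mean the same nuisance parameter point across jobs.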

Please note that this approach requires more expert handling than "regular" systematics. The main point here is that the "toy" variations need a different post-processing than regular variations, i.e. you need to look at the spread between the output histograms for the different "toys" and use that to construct the combined systematic. The e/gamma group currently (Oct 15) provides a tool for that. For Bayesian marginalization you may alternatively consider integrating the "toy" variations directly into the integral over nuisance parameter space to extract more information than you could with a single systematic.
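
The text does not define "spread" precisely. One common choice is the sample standard deviation of the toy results in each histogram bin, and the sketch below computes that under this assumption. toySpread is a hypothetical helper and is not the e/gamma tool mentioned above, which may use a different definition.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// toys[t][bin] holds the content of the given bin for toy number t.
// Returns the sample standard deviation across toys for every bin.
std::vector<double> toySpread (const std::vector<std::vector<double>>& toys)
{
  assert (toys.size() >= 2);
  const std::size_t numBins = toys.front().size();
  std::vector<double> spread (numBins, 0.0);
  for (std::size_t bin = 0; bin != numBins; ++bin)
  {
    // mean over all toys for this bin
    double mean = 0;
    for (const auto& toy : toys)
      mean += toy[bin];
    mean /= toys.size();

    // sum of squared deviations from the mean
    double sumSq = 0;
    for (const auto& toy : toys)
      sumSq += (toy[bin] - mean) * (toy[bin] - mean);

    // divide by N - 1 for the unbiased variance estimate
    spread[bin] = std::sqrt (sumSq / (toys.size() - 1));
  }
  return spread;
}
```

With only 10 or 20 toys this estimate itself fluctuates noticeably, which is the statistical jitter discussed below.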

A general concern for all methods that reduce the number of nuisance parameters (including the "toy" approach) arises if you use profiling and are able to constrain the "toy" systematic; in that case we generally assume that such an approach is invalid. Similar precautions should be taken when used with Bayesian marginalization. This is typically of little practical concern as long as the systematic in question is small, but it is something that you should check for and that you should include in your supporting documentation.

The "toy" approach to systematics evaluation introduces an additional statistical jitter into your systematics, due to the sampling fluctuations of your "toy" variations. Naturally this uncertainty decreases if you use a larger number of "toy" variations. We currently (Oct 15) provide no recommendations for evaluating the size of that uncertainty or for determining whether your chosen number of "toys" is sufficient. Anecdotal evidence from the e/gamma group suggests that for evaluating their scale factor systematic as few as 10 or 20 "toys" may be sufficient. However, this will vary depending on your analysis and the systematics you use the "toys" for.

If you decide to go for a large number of "toy" variations it may be better to go for a "traditional" evaluation of your systematics instead, as that can cut down on the aforementioned statistical jitter by interpolating between the variations. The number of variations needed for the traditional approach to be better (or even feasible) will depend on the interpolation algorithm used and is likely to improve as we implement better interpolation algorithms.

Guarantee
no-fail

Definition at line 212 of file MakeSystematicsVector.cxx.

{
  RCU_REQUIRE (val_toys > 0);
  m_config.back().toys = val_toys;
}

◆ testInvariant()

void CP::MakeSystematicsVector::testInvariant ( ) const

test the invariant of this object

Guarantee
no-fail

Definition at line 62 of file MakeSystematicsVector.cxx.

{
  //RCU_INVARIANT (this != nullptr);
  RCU_INVARIANT (!m_config.empty());
}

◆ useForNominal()

void CP::MakeSystematicsVector::useForNominal ( )

set this group as the default, i.e. the group containing the nominal variation

Guarantee
no-fail

Definition at line 222 of file MakeSystematicsVector.cxx.

{
  m_useForNominal = m_config.back().label;
}

Member Data Documentation

◆ m_config

std::vector<GroupConfig> CP::MakeSystematicsVector::m_config
private

the configuration on a per-group basis

Definition at line 229 of file MakeSystematicsVector.h.

◆ m_result

std::map<std::string,std::vector<SystematicSet> > CP::MakeSystematicsVector::m_result
private

the value of result

Definition at line 202 of file MakeSystematicsVector.h.

◆ m_useForNominal

std::string CP::MakeSystematicsVector::m_useForNominal
private

the group for which useForNominal was set

Definition at line 234 of file MakeSystematicsVector.h.


The documentation for this class was generated from the following files: