ATLAS Offline Software
#include <DuplicateChecker.h>
Public Member Functions | |
void | testInvariant () const |
test the invariant of this object
DuplicateChecker () | |
standard constructor
const std::string & | eventInfoName () const |
the name of the EventInfo structure to use
void | setEventInfoName (const std::string &val_eventInfoName) |
set the value of eventInfoName
const std::string & | outputTreeName () const |
the name of the output tree to create, or the empty string if none is created
void | setOutputTreeName (const std::string &val_outputTreeName) |
set the value of outputTreeName
void | addKnownDuplicate (const std::string &sampleName, const std::string &fileName, Long64_t entry, number_type runNumber, number_type eventNumber) |
add a known duplicate event
void | addKnownDuplicatesFile (const std::string &duplicatesFile) |
add a file with known duplicates
IWorker * | wk () const |
description: the worker that is controlling us; guarantee: no-fail
void | book (const TH1 &hist) |
book the given histogram
TH1 * | hist (const std::string &name) const |
get the histogram with the given name
asg::SgTEvent * | evtStore () const |
get the (main) event store for this algorithm
virtual const std::string & | name () const |
Static Public Member Functions | |
static bool | processSummary (const std::string &submitdir, const std::string &treeName) |
process the summary tree from the given submission
static bool | processSummary (const SH::SampleHandler &sh, const std::string &outputFile) |
process the summary tree from the given submission
Private Types | |
typedef uint32_t | number_type |
the integer type to use for run and event numbers
Private Member Functions | |
virtual StatusCode | setupJob (Job &job) override |
effects: give the algorithm a chance to initialize the job with anything this algorithm needs.
virtual StatusCode | changeInput (bool firstFile) override |
effects: do all changes to work with a new input file, e.g. set new branch addresses.
virtual StatusCode | initialize () override |
effects: do everything that needs to be done before running the algorithm, e.g. create output n-tuples and histograms.
virtual StatusCode | execute () override |
effects: process the next event; guarantee: basic; failures: algorithm dependent
void | read_run_event_number () |
get the run and event number for the current event
ClassDef (DuplicateChecker, 1) | |
virtual StatusCode | fileExecute () |
effects: do all the processing that needs to be done once per file
virtual StatusCode | endOfFile () |
effects: do the post-processing for each input file; guarantee: basic; failures: algorithm dependent; rationale: this is mainly used for specialized services that need to save partial results for each input file
virtual StatusCode | histInitialize () |
effects: this is a pre-initialization routine that is called before changeInput is called.
virtual StatusCode | postExecute () |
effects: do the post-processing for the event; guarantee: basic; failures: algorithm dependent; rationale: this is mainly used for specialized services that need to get input from subsequent algorithms before filling their event data
virtual StatusCode | finalize () |
effects: do everything that needs to be done after completing work on this worker; guarantee: basic; failures: algorithm dependent; rationale: currently there is no use foreseen, but this routine is provided regardless
virtual StatusCode | histFinalize () |
effects: this is a post-finalization routine that is called after finalize has been called.
virtual bool | hasName (const std::string &name) const |
returns: whether this algorithm has the given name; guarantee: basic; failures: algorithm dependent; rationale: this is to allow an algorithm to be known by multiple names.
void | sysSetupJob (Job &job) |
effects: give the algorithm a chance to initialize the job with anything this algorithm needs.
Private Attributes | |
std::string | m_eventInfoName |
the value returned by eventInfoName
std::string | m_outputTreeName |
the value returned by outputTreeName
std::map< std::pair< std::string, std::string >, std::map< Long64_t, std::pair< number_type, number_type > > > | m_duplicates |
the list of known duplicates to skip
std::map< Long64_t, std::pair< number_type, number_type > > * | m_currentDuplicates = nullptr |
the list of the duplicates in the current file to skip, or the null pointer if there are none
std::set< std::pair< number_type, number_type > > | m_processed |
the list of run-event numbers already encountered
xAOD::TEvent * | m_event = nullptr |
the event we are reading from
TTree * | m_outputTree = nullptr |
the output tree, if we are creating one
std::string | m_inputFileName |
the name of the input file (connected to m_outputTree, if present)
Long64_t | m_inputFileIndex |
the index in the input file (connected to m_outputTree, if present)
number_type | m_runNumber |
the run number of the current event (connected to m_outputTree, if present)
number_type | m_eventNumber |
the event number of the current event (connected to m_outputTree, if present)
Bool_t | m_processEvent |
whether the current event is/should be processed (connected to m_outputTree, if present)
IWorker * | m_wk |
asg::SgTEvent * | m_evtStorePtr = nullptr |
the value of evtStore
asg::SgTEvent | m_evtStore |
when configured, the object returned by evtStore
MsgStream * | m_msg = nullptr |
the message stream, if it has been instantiated
std::string | m_msgName |
the algorithm name for which the message stream has been instantiated
int | m_msgLevel = 3 |
the message level configured
std::string | m_nameCache |
the cache for name
Definition at line 30 of file DuplicateChecker.h.
typedef uint32_t EL::DuplicateChecker::number_type |
private |
the integer type to use for run and event numbers
Definition at line 38 of file DuplicateChecker.h.
EL::DuplicateChecker::DuplicateChecker | ( | ) |
standard constructor
void EL::DuplicateChecker::addKnownDuplicate | ( | const std::string & | sampleName, |
const std::string & | fileName, | ||
Long64_t | entry, | ||
number_type | runNumber, | ||
number_type | eventNumber | ||
) |
add a known duplicate event
void EL::DuplicateChecker::addKnownDuplicatesFile | ( | const std::string & | duplicatesFile | ) |
add a file with known duplicates
void EL::Algorithm::book | ( | const TH1 & | hist | ) |
book the given histogram
StatusCode EL::DuplicateChecker::changeInput | ( | bool | firstFile | ) |
override, private, virtual |
effects: do all changes to work with a new input file, e.g. set new branch addresses. If firstFile is set, this method is called just before initialize() is called.
Warning: If a file is split across multiple jobs, this will be called more than once. This only happens for specific batch drivers and/or if it is explicitly configured by the user. With PROOF it could even happen multiple times within the same job, and while PROOF is no longer supported, that behavior may come back if support for a similar framework is added in the future. As such, this method should not be used for accounting that relies on being called exactly once per file; use fileExecute() if you need something that is guaranteed to be executed exactly once per input file.
Warning: The execution order of changeInput and fileExecute is currently unspecified.
guarantee: basic
failures: algorithm dependent
Reimplemented from EL::Algorithm.
EL::DuplicateChecker::ClassDef | ( | DuplicateChecker, | 1 | ) |
private |
StatusCode EL::Algorithm::endOfFile | ( | ) |
private, virtual, inherited |
effects: do the post-processing for each input file
guarantee: basic
failures: algorithm dependent
rationale: this is mainly used for specialized services that need to save partial results for each input file
Reimplemented in EL::MetricsSvc.
const std::string& EL::DuplicateChecker::eventInfoName | ( | ) | const |
asg::SgTEvent* EL::Algorithm::evtStore | ( | ) | const |
inherited |
get the (main) event store for this algorithm
This mostly mirrors the method of the same name in AthAlgorithm, so that the tutorial instructions can be followed with either kind of algorithm.
StatusCode EL::DuplicateChecker::execute | ( | ) |
override, private, virtual |
effects: process the next event
guarantee: basic
failures: algorithm dependent
Reimplemented from EL::Algorithm.
StatusCode EL::Algorithm::fileExecute | ( | ) |
private, virtual, inherited |
effects: do all the processing that needs to be done once per file
Warning: Do not expect this to be called at any particular point in execution. If a file is split between multiple jobs, this will be called in only one of these jobs and not in the others. It usually gets called before the first event in a file, but that is not guaranteed, and relying on it is a bug. Use changeInput if you need something that is guaranteed to be executed at the beginning of each input file.
Warning: The execution order of changeInput and fileExecute is currently unspecified.
guarantee: basic
failures: algorithm dependent
rationale: this is to read per-file accounting data, e.g. the list of lumi-blocks processed
Reimplemented in EL::UnitTestAlg1, EL::UnitTestAlg, EL::UnitTestAlgXAOD, and EL::MetricsSvc.
StatusCode EL::Algorithm::finalize | ( | ) |
private, virtual, inherited |
effects: do everything that needs to be done after completing work on this worker
guarantee: basic
failures: algorithm dependent
rationale: currently there is no use foreseen, but this routine is provided regardless
Reimplemented in EL::UnitTestAlg1, EL::UnitTestAlg, and EL::UnitTestAlgXAOD.
bool EL::Algorithm::hasName | ( | const std::string & | name | ) | const |
private, virtual, inherited |
returns: whether this algorithm has the given name
guarantee: basic
failures: algorithm dependent
rationale: this is to allow an algorithm to be known by multiple names. This is needed for NTupleSvc, so that it can be located with and without the output tree name.
Reimplemented in EL::NTupleSvc.
TH1* EL::Algorithm::hist | ( | const std::string & | name | ) | const |
get the histogram with the given name
StatusCode EL::Algorithm::histFinalize | ( | ) |
private, virtual, inherited |
effects: this is a post-finalization routine that is called after finalize has been called.
guarantee: basic
failures: algorithm dependent
rationale: unlike finalize(), this method is always called, even on empty input files.
Reimplemented in EL::UnitTestAlg1, EL::UnitTestAlg, EL::UnitTestAlgXAOD, and EL::MetricsSvc.
StatusCode EL::Algorithm::histInitialize | ( | ) |
private, virtual, inherited |
effects: this is a pre-initialization routine that is called before changeInput is called.
guarantee: basic
failures: algorithm dependent
rationale: unlike initialize(), this method is always called, even on empty input files, so you should create here any histograms or n-tuples that subsequent code expects.
Reimplemented in EL::UnitTestAlg1, EL::UnitTestAlg, EL::UnitTestAlgXAOD, EL::MetricsSvc, and EL::VomsProxySvc.
StatusCode EL::DuplicateChecker::initialize | ( | ) |
override, private, virtual |
effects: do everything that needs to be done before running the algorithm, e.g. create output n-tuples and histograms. This method is called only once, right after changeInput(true) is called.
guarantee: basic
failures: algorithm dependent
rationale: in principle all this work could be done in changeInput(true). However, providing this method should make it easier for users to set up all their outputs and to do so only once.
Reimplemented from EL::Algorithm.
|
inherited |
messaging interface
this is the interface to work with the standard messaging macros from AsgTools. Instead of enums I pass ints, so that I can avoid the include dependency (forward declarations are only allowed for enum classes AFAIK).
the standard message stream for this object
|
inherited |
the message stream for this object, configured for the given level
|
inherited |
whether we are configured to print messages at the given level
const std::string& EL::Algorithm::name | ( | ) | const |
virtual, inherited |
const std::string& EL::DuplicateChecker::outputTreeName | ( | ) | const |
the name of the output tree to create, or the empty string if none is created
The output tree contains a list of run and event numbers for all events, and whether they were processed by this job. This can be used to check whether duplicate events were processed (or whether we somehow eliminated events as duplicates that we shouldn't have). It can also be used to create a list of duplicate events for future processing rounds.
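The check described above can be sketched in a few lines, assuming the output tree has already been read into memory as plain rows. `SummaryRow` and `processedMoreThanOnce` are hypothetical names chosen for illustration and are not part of EL::DuplicateChecker; only the run number, the event number and the processed flag are taken from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Mirrors the private typedef number_type (uint32_t).
using number_type = std::uint32_t;
using RunEvent = std::pair<number_type, number_type>;

// Hypothetical in-memory copy of one output-tree entry.
struct SummaryRow
{
  number_type runNumber;
  number_type eventNumber;
  bool processed;   // whether the job processed this event
};

// Returns every (run, event) pair that was marked as processed more than
// once, i.e. the duplicates that slipped through.
std::vector<RunEvent> processedMoreThanOnce (const std::vector<SummaryRow>& rows)
{
  std::map<RunEvent, unsigned> counts;
  for (const SummaryRow& row : rows)
  {
    if (row.processed)
      ++counts[RunEvent (row.runNumber, row.eventNumber)];
  }

  std::vector<RunEvent> result;
  for (const auto& item : counts)
  {
    if (item.second > 1)
      result.push_back (item.first);
  }
  return result;
}
```

An empty result means that no event was processed twice. Rows whose flag is false are ignored here, so an event that was correctly skipped as a duplicate does not show up.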
StatusCode EL::Algorithm::postExecute | ( | ) |
private, virtual, inherited |
effects: do the post-processing for the event
guarantee: basic
failures: algorithm dependent
rationale: this is mainly used for specialized services that need to get input from subsequent algorithms before filling their event data
Reimplemented in EL::NTupleSvc.
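bool EL::DuplicateChecker::processSummary | ( | const SH::SampleHandler & | sh, | const std::string & | outputFile | ) |
static |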
process the summary tree from the given submission
This will create a file "duplicates" inside the submission directory that contains the list of duplicates that can be fed into future submissions to filter them out.
This version of the method provides a lower level interface, in which the list of inputs is given via a sample handler (with the tree name properly set), and the output file name freely choosable.
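bool EL::DuplicateChecker::processSummary | ( | const std::string & | submitdir, | const std::string & | treeName | ) |
static |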
process the summary tree from the given submission
This will create a file "duplicates" inside the submission directory that contains the list of duplicates that can be fed into future submissions to filter them out.
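As a rough illustration of how a list of duplicates could be derived from summary data, the sketch below keeps the first occurrence of each (run, event) pair and reports every later one. This is an assumption made for the example: the page does not state which occurrence processSummary keeps, nor the format of the "duplicates" file. `EventRecord` and `laterOccurrences` are hypothetical names; the fields match the arguments of addKnownDuplicate.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record with the same fields that addKnownDuplicate takes.
struct EventRecord
{
  std::string sampleName;
  std::string fileName;
  long long entry;            // stand-in for ROOT's Long64_t
  std::uint32_t runNumber;
  std::uint32_t eventNumber;
};

// Assumption: the first occurrence of a (run, event) pair is kept and all
// later occurrences are reported as duplicates.
std::vector<EventRecord> laterOccurrences (const std::vector<EventRecord>& records)
{
  std::set<std::pair<std::uint32_t, std::uint32_t>> seen;
  std::vector<EventRecord> duplicates;
  for (const EventRecord& record : records)
  {
    // insert().second is false if the pair was already in the set
    if (!seen.insert (std::make_pair (record.runNumber, record.eventNumber)).second)
      duplicates.push_back (record);
  }
  return duplicates;
}
```

Each returned record carries what a later job would need to pass to addKnownDuplicate: sample name, file name, entry, run number and event number.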
void EL::DuplicateChecker::read_run_event_number | ( | ) |
private |
get the run and event number for the current event
void EL::DuplicateChecker::setEventInfoName | ( | const std::string & | val_eventInfoName | ) |
|
inherited |
set the message level for the message stream for this object
void EL::DuplicateChecker::setOutputTreeName | ( | const std::string & | val_outputTreeName | ) |
StatusCode EL::DuplicateChecker::setupJob | ( | Job & | job | ) |
override, private, virtual |
effects: give the algorithm a chance to initialize the job with anything this algorithm needs. This method is automatically called before the algorithm is actually added to the job.
guarantee: basic
failures: algorithm dependent
rationale: this is currently used to give algorithms a chance to register their output datasets, but can also be used for other purposes.
Reimplemented from EL::Algorithm.
void EL::Algorithm::sysSetupJob | ( | Job & | job | ) |
private, inherited |
effects: give the algorithm a chance to initialize the job with anything this algorithm needs. This method is automatically called before the algorithm is actually added to the job.
guarantee: basic
failures: algorithm dependent
rationale: this is currently used to give algorithms a chance to register their output datasets, but can also be used for other purposes.
void EL::DuplicateChecker::testInvariant | ( | ) | const |
test the invariant of this object
IWorker* EL::Algorithm::wk | ( | ) | const |
inherited |
description: the worker that is controlling us
guarantee: no-fail
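std::map<Long64_t, std::pair<number_type, number_type> >* EL::DuplicateChecker::m_currentDuplicates = nullptr |
private |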
the list of the duplicates in the current file to skip, or the null pointer if there are none
Definition at line 183 of file DuplicateChecker.h.
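std::map<std::pair<std::string, std::string>, std::map<Long64_t, std::pair<number_type, number_type> > > EL::DuplicateChecker::m_duplicates |
private |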
the list of known duplicates to skip
Definition at line 178 of file DuplicateChecker.h.
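To make the nested container type easier to read, here is a minimal standalone sketch of the same layout: an outer map keyed by (sample name, file name) and an inner map from entry index to (run, event). The helpers `addDuplicate` and `findFile` are hypothetical and only illustrate how such a table can be filled and queried; they are not the class's actual implementation. Returning a null pointer for a file without known duplicates mirrors the documented meaning of m_currentDuplicates.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

using number_type = std::uint32_t;   // mirrors the private typedef
using entry_type = long long;        // stand-in for ROOT's Long64_t
using RunEvent = std::pair<number_type, number_type>;
using FileDuplicates = std::map<entry_type, RunEvent>;
using DuplicateTable =
  std::map<std::pair<std::string, std::string>, FileDuplicates>;

// Hypothetical helper: record one known duplicate in the table.
void addDuplicate (DuplicateTable& table,
                   const std::string& sampleName, const std::string& fileName,
                   entry_type entry, number_type run, number_type event)
{
  assert (entry >= 0);   // tree entries are counted from zero
  table[std::make_pair (sampleName, fileName)][entry] = RunEvent (run, event);
}

// Hypothetical helper: the duplicates of one file, or nullptr if the file
// has none.
const FileDuplicates* findFile (const DuplicateTable& table,
                                const std::string& sampleName,
                                const std::string& fileName)
{
  auto iter = table.find (std::make_pair (sampleName, fileName));
  return iter == table.end() ? nullptr : &iter->second;
}
```

Looking up the per-file map once when the input file changes keeps the per-event check down to a single lookup by entry index.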
xAOD::TEvent* EL::DuplicateChecker::m_event = nullptr |
private |
the event we are reading from
Definition at line 192 of file DuplicateChecker.h.
std::string EL::DuplicateChecker::m_eventInfoName |
private |
the value returned by eventInfoName
Definition at line 170 of file DuplicateChecker.h.
number_type EL::DuplicateChecker::m_eventNumber |
private |
the event number of the current event (connected to m_outputTree, if present)
Definition at line 217 of file DuplicateChecker.h.
asg::SgTEvent EL::Algorithm::m_evtStore |
mutable, private, inherited |
when configured, the object returned by evtStore
Definition at line 329 of file Algorithm.h.
asg::SgTEvent* EL::Algorithm::m_evtStorePtr = nullptr |
mutable, private, inherited |
the value of evtStore
Definition at line 325 of file Algorithm.h.
Long64_t EL::DuplicateChecker::m_inputFileIndex |
private |
the index in the input file (connected to m_outputTree, if present)
Definition at line 207 of file DuplicateChecker.h.
std::string EL::DuplicateChecker::m_inputFileName |
private |
the name of the input file (connected to m_outputTree, if present)
Definition at line 202 of file DuplicateChecker.h.
MsgStream* EL::Algorithm::m_msg = nullptr |
mutable, private, inherited |
the message stream, if it has been instantiated
Definition at line 333 of file Algorithm.h.
int EL::Algorithm::m_msgLevel = 3 |
private, inherited |
the message level configured
Definition at line 342 of file Algorithm.h.
std::string EL::Algorithm::m_msgName |
mutable, private, inherited |
the algorithm name for which the message stream has been instantiated
Definition at line 338 of file Algorithm.h.
std::string EL::Algorithm::m_nameCache |
mutable, private, inherited |
the cache for name
Definition at line 346 of file Algorithm.h.
TTree* EL::DuplicateChecker::m_outputTree = nullptr |
private |
the output tree, if we are creating one
Definition at line 197 of file DuplicateChecker.h.
std::string EL::DuplicateChecker::m_outputTreeName |
private |
the value returned by outputTreeName
Definition at line 174 of file DuplicateChecker.h.
std::set<std::pair<number_type, number_type> > EL::DuplicateChecker::m_processed |
private |
the list of run-event numbers already encountered
Definition at line 188 of file DuplicateChecker.h.
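A set of (run, event) pairs is enough to detect repeats as events arrive. The sketch below shows the idea in isolation, under the assumption that an event counts as a duplicate if its pair has been seen before. `SeenEvents` and `firstOccurrence` are hypothetical names, not members of EL::DuplicateChecker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>

using number_type = std::uint32_t;   // mirrors the private typedef

// Hypothetical helper that remembers every (run, event) pair it was shown.
class SeenEvents
{
public:
  // Returns true the first time a pair is seen and false for every
  // repeat, so a false return value marks a duplicate.
  bool firstOccurrence (number_type run, number_type event)
  {
    // std::set::insert reports through .second whether the key was new
    const bool isNew = m_processed.insert (std::make_pair (run, event)).second;
    assert (!m_processed.empty());   // the pair is stored in either case
    return isNew;
  }

  // Number of distinct (run, event) pairs seen so far.
  std::size_t size () const { return m_processed.size(); }

private:
  std::set<std::pair<number_type, number_type>> m_processed;
};
```

An ordered set gives logarithmic lookups and needs no custom hash for the pair key.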
Bool_t EL::DuplicateChecker::m_processEvent |
private |
whether the current event is/should be processed (connected to m_outputTree, if present)
Definition at line 222 of file DuplicateChecker.h.
number_type EL::DuplicateChecker::m_runNumber |
private |
the run number of the current event (connected to m_outputTree, if present)
Definition at line 212 of file DuplicateChecker.h.
IWorker* EL::Algorithm::m_wk |
private, inherited |
Definition at line 321 of file Algorithm.h.