Small class used for validating event counts between input and output files.

Public Member Functions
def	__init__ (self, executor, eventCountConf=None, eventCountConfOverwrite=False)
	Check input and output event counts.

def	eventCount (self)

def	configureCheck (self, override=False)
	Set up the parameters needed to define particular checks.

def	decide (self)
	Perform an event count check.

Private Attributes
	_executor

	_eventCount

	_eventCountConf

	_inEventDict

	_outEventDict

	_skipEvents

	_maxEvents

	_evAccEff

Detailed Description

Small class used for validating event counts between input and output files.

Definition at line 912 of file trfValidation.py.

Constructor & Destructor Documentation

◆ __init__()

def python.trfValidation.eventMatch.__init__	(	self,
		executor,
		eventCountConf = `None`,
		eventCountConfOverwrite = `False`
	)

Check input and output event counts.

Class to verify that input and output event counts are in a reasonable relationship.

Parameters

Definition at line 921 of file trfValidation.py.

     def __init__(self, executor, eventCountConf=None, eventCountConfOverwrite=False):
         self._executor = executor
         self._eventCount = None
  
         
         simEventEff = 0.995
         self._eventCountConf = {}
         self._eventCountConf['EVNT'] = {'EVNT_MRG':"match", "HITS": simEventEff, "EVNT_TR": "filter", "DAOD_TRUTH*" : "match"}
         self._eventCountConf['EVNT_TR'] = {'HITS': simEventEff}
         self._eventCountConf['HITS'] = {'RDO':"match", 'HITS_RSM': simEventEff, "HITS_MRG":"match", 'HITS_FILT': simEventEff, "RDO_FILT": "filter", "DAOD_TRUTH*" : "match", "HIST_SIM" : "match"}
         self._eventCountConf['BS'] = {'ESD': "match", 'DRAW_*':"filter", 'NTUP_*':"filter", "BS_MRG":"match", 'DESD*': "filter", 'AOD':"match", 'DAOD*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match"}
         self._eventCountConf['RDO*'] = {'ESD': "match", 'DRAW_*':"filter", 'NTUP_*':"filter", "RDO_MRG":"match", "RDO_TRIG":"match", 'AOD':"match", 'DAOD*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match", "HIST_DIGI":"match"}
         self._eventCountConf['ESD'] = {'ESD_MRG': "match", 'AOD':"match", 'DESD*':"filter", 'DAOD_*':"filter", 'NTUP_*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match"}
         self._eventCountConf['AOD'] = {'AOD_MRG' : "match", 'TAG':"match", "NTUP_*":"filter", "DAOD_*":"filter", 'NTUP_*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match"}
         self._eventCountConf['AOD_MRG'] = {'TAG':"match"}
         self._eventCountConf['DAOD_*'] = {'DAOD_*_MRG' : "match"}
         self._eventCountConf['TAG'] = {'TAG_MRG': "match"}
         self._eventCountConf['HIST'] = {'HIST_MRG': "match"}
         self._eventCountConf['NTUP_COMMON'] = {'DNTUP*': "filter"}
         self._eventCountConf['NTUP_*'] = {'NTUP_*_MRG': "match"}
         # Next one comprises special data type names for smart merging of AthenaMP worker outputs
         self._eventCountConf['POOL_MRG_INPUT'] = {'POOL_MRG_OUTPUT': "match"}
  
  
         if eventCountConf:
             if eventCountConfOverwrite is True:
                 self._eventCountConf = eventCountConf
             else:
                 self._eventCountConf.update(eventCountConf)
  
         msg.debug('Event count check configuration is: {0}'.format(self._eventCountConf))
         if hasattr(self._executor, 'name'):
             msg.debug('Event count check ready for executor {0}'.format(self._executor.name))
  
         if self._executor is not None:
             self.configureCheck(override=False)
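
The handling of eventCountConf in the listing above relies on dict.update, which replaces the whole inner dictionary for any input data type that appears in the user configuration, rather than merging individual output entries. The following is a minimal sketch of that behaviour, assuming a trimmed-down set of defaults; `build_conf` is a hypothetical helper written for illustration and is not part of trfValidation.py.

```python
def build_conf(defaults, user_conf=None, overwrite=False):
    """Hypothetical stand-in for the eventCountConf handling in __init__."""
    conf = dict(defaults)
    if user_conf:
        if overwrite is True:
            # The user configuration replaces the defaults entirely
            conf = user_conf
        else:
            # Top-level update: inner dictionaries are replaced, not merged
            conf.update(user_conf)
    return conf


# Trimmed-down defaults, for illustration only
defaults = {
    "EVNT": {"EVNT_MRG": "match", "HITS": 0.995},
    "HITS": {"RDO": "match"},
}
user_conf = {"EVNT": {"HITS": "filter"}}

merged = build_conf(defaults, user_conf)
replaced = build_conf(defaults, user_conf, overwrite=True)

print(merged["EVNT"])    # only the user-supplied EVNT entries remain
print(sorted(merged))    # the HITS defaults are still present
print(sorted(replaced))  # only the user-supplied input types remain
```

A caller who wants to change a single output check while keeping the other defaults for that input type would therefore need to supply the complete inner dictionary.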
  

Member Function Documentation

◆ configureCheck()

def python.trfValidation.eventMatch.configureCheck	(	self,
		override = `False`
	)

Set up the parameters needed to define particular checks.

Parameters

override If set, configure the checks using this dictionary, which must have the keys inEventDict, outEventDict, skipEvents, maxEvents and evAccEff.

Note: Default is to configure the checks from the associated executor

Definition at line 976 of file trfValidation.py.

     def configureCheck(self, override=False):
         if override:
             msg.info('Overriding check configuration with: {0}'.format(override))
             self._inEventDict = override['inEventDict']
             self._outEventDict = override['outEventDict']
             self._skipEvents = override['skipEvents']
             self._maxEvents = override['maxEvents']
             self._evAccEff = override['evAccEff']
         else:
             # Input data from executor
             self._inEventDict = {}
             for dataTypeName in self._executor.input:
                 try:
                     self._inEventDict[dataTypeName] = self._executor.conf.dataDictionary[dataTypeName].nentries
                     msg.debug('Input data type {0} has {1} events'.format(dataTypeName, self._inEventDict[dataTypeName]))
                 except KeyError:
                     msg.warning('Found no dataDictionary entry for input data type {0}'.format(dataTypeName))
  
             # Output data from executor
             self._outEventDict = {}
             for dataTypeName in self._executor.output:
                 try:
                     self._outEventDict[dataTypeName] = self._executor.conf.dataDictionary[dataTypeName].nentries
                     msg.debug('Output data type {0} has {1} events'.format(dataTypeName, self._outEventDict[dataTypeName]))
                 except KeyError:
                     msg.warning('Found no dataDictionary entry for output data type {0}'.format(dataTypeName))
  
             # Find if we have a skipEvents applied
             if "skipEvents" in self._executor.conf.argdict:
                 self._skipEvents = self._executor.conf.argdict['skipEvents'].returnMyValue(exe=self._executor)
             else:
                 self._skipEvents = None
  
             # Find if we have a maxEvents applied
             if "maxEvents" in self._executor.conf.argdict:
                 self._maxEvents = self._executor.conf.argdict['maxEvents'].returnMyValue(exe=self._executor)
                 if self._maxEvents == -1:
                     self._maxEvents = None
             else:
                 self._maxEvents = None
  
             # Executor substeps handling
             if self._executor.conf.totalExecutorSteps > 1 and self._executor.conf.executorStep < self._executor.conf.totalExecutorSteps - 1:
                 executorEventCounts, executorEventSkips = getExecutorStepEventCounts(self._executor)
                 self._maxEvents = executorEventCounts[self._executor.conf.executorStep]
                 self._skipEvents = executorEventSkips[self._executor.conf.executorStep]
  
             # Global eventAcceptanceEfficiency set?
             if "eventAcceptanceEfficiency" in self._executor.conf.argdict:
                 self._evAccEff = self._executor.conf.argdict['eventAcceptanceEfficiency'].returnMyValue(exe=self._executor)
                 if (self._evAccEff is None):
                     self._evAccEff = 0.99
             else:
                 self._evAccEff = 0.99
  
         msg.debug("Event check conf: {0} {1}, {2}, {3}, {4}".format(self._inEventDict, self._outEventDict, self._skipEvents,
                                                                     self._maxEvents, self._evAccEff))
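
Because the override branch indexes the dictionary directly, a missing key would raise a KeyError. The sketch below shows one way such a dictionary could be assembled and checked before use; `make_override`, `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration, and the event counts are made-up values.

```python
REQUIRED_KEYS = ("inEventDict", "outEventDict", "skipEvents", "maxEvents", "evAccEff")


def make_override(in_events, out_events, skip_events=None, max_events=None, ev_acc_eff=0.99):
    """Hypothetical helper: build a dictionary for configureCheck(override=...)."""
    return {
        "inEventDict": dict(in_events),
        "outEventDict": dict(out_events),
        "skipEvents": skip_events,
        "maxEvents": max_events,
        "evAccEff": ev_acc_eff,
    }


def missing_keys(override):
    """List the required keys that are absent from an override dictionary."""
    return [key for key in REQUIRED_KEYS if key not in override]


# Illustrative values: 1000 ESD events in, 100 skipped, 900 AOD events out
override = make_override({"ESD": 1000}, {"AOD": 900}, skip_events=100)
print(missing_keys(override))             # nothing missing
print(missing_keys({"inEventDict": {}}))  # the other four keys are reported
```

Judging from the constructor listing, an instance created with executor=None skips the automatic configuration, so calling configureCheck(override=override) on such an instance and then decide() appears to be the intended pattern for standalone checks. That reading is inferred from the code shown here rather than from separate documentation.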
  
  

◆ decide()

def python.trfValidation.eventMatch.decide ( self )

Perform an event count check.

Definition at line 1036 of file trfValidation.py.

     def decide(self):
         # We have all that we need to proceed: input and output data, skip and max events plus any efficiency factor
         # So loop over the input and output data and make our checks
         for inData, neventsInData in self._inEventDict.items():
             if not isinstance(neventsInData, int):
                 msg.warning('File size metadata for {inData} was not countable, found {neventsInData}. No event checks possible for this input data.'.format(inData=inData, neventsInData=neventsInData))
                 continue
             if inData in self._eventCountConf:
                 inDataKey = inData
             else:
                 # OK, try a glob match in this case (YMMV)
                 matchedInData = False
                 for inDataKey in self._eventCountConf:
                     if fnmatch.fnmatch(inData, inDataKey):
                         msg.info("Matched input data type {inData} to {inDataKey} by globbing".format(inData=inData, inDataKey=inDataKey))
                         matchedInData = True
                         break
                 if not matchedInData:
                     msg.warning('No defined event count match for {inData} -> {outData}, so no check(s) possible in this case.'.format(inData=inData, outData=list(self._outEventDict)))
                     continue
  
             # Now calculate the expected number of processed events for this input
             expectedEvents = neventsInData
             if self._skipEvents is not None and self._skipEvents > 0:
                 expectedEvents -= self._skipEvents
                 if expectedEvents < 0:
                     msg.warning('skipEvents was set higher than the input events in {inData}: {skipEvents} > {neventsInData}. This is not an error, but it is not a normal configuration. Expected events is now 0.'.format(inData=inData, skipEvents=self._skipEvents, neventsInData=neventsInData))
                     expectedEvents = 0
             if self._maxEvents is not None:
                 if expectedEvents < self._maxEvents:
                     if self._skipEvents is not None:
                         msg.warning('maxEvents was set higher than inputEvents-skipEvents for {inData}: {maxEvents} > {neventsInData}-{skipEvents}. This is not an error, but it is not a normal configuration. Expected events remains {expectedEvents}.'.format(inData=inData, maxEvents=self._maxEvents, neventsInData=neventsInData, skipEvents=self._skipEvents, expectedEvents=expectedEvents))
                     else:
                         msg.warning('maxEvents was set higher than inputEvents for {inData}: {maxEvents} > {neventsInData}. This is not an error, but it is not a normal configuration. Expected events remains {expectedEvents}.'.format(inData=inData, maxEvents=self._maxEvents, neventsInData=neventsInData, expectedEvents=expectedEvents))
                 else:
                     expectedEvents = self._maxEvents
             msg.debug('Expected number of processed events for {0} is {1}'.format(inData, expectedEvents))
  
             # Loop over output data - first find event count configuration
             for outData, neventsOutData in self._outEventDict.items():
                 if not isinstance(neventsOutData, int):
                     msg.warning('File size metadata for {outData} was not countable, found "{neventsOutData}". No event checks possible for this output data.'.format(outData=outData, neventsOutData=neventsOutData))
                     continue
                 if outData in self._eventCountConf[inDataKey]:
                     checkConf = self._eventCountConf[inDataKey][outData]
                     outDataKey = outData
                 else:
                     # Look for glob matches
                     checkConf = None
                     for outDataKey, outDataConf in self._eventCountConf[inDataKey].items():
                         if fnmatch.fnmatch(outData, outDataKey):
                             msg.info('Matched output data type {outData} to {outDatakey} by globbing'.format(outData=outData, outDatakey=outDataKey))
                             outDataKey = outData
                             checkConf = outDataConf
                             break
                     if not checkConf:
                         msg.warning('No defined event count match for {inData} -> {outData}, so no check possible in this case.'.format(inData=inData, outData=outData))
                         continue
                 msg.debug('Event count check for {inData} to {outData} is {checkConf}'.format(inData=inData, outData=outData, checkConf=checkConf))
  
                 # Do the check for this input/output combination
                 if checkConf == 'match':
                     # We need an exact match
                     if neventsOutData == expectedEvents:
                         msg.info("Event count check for {inData} to {outData} passed: all processed events found ({neventsOutData} output events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData))
                     else:
                         raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                          'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                 elif checkConf == 'filter':
                     if neventsOutData <= expectedEvents and neventsOutData >= 0:
                         msg.info("Event count check for {inData} to {outData} passed: found ({neventsOutData} output events selected from {expectedEvents} processed events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                     else:
                         raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                          'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected from 0 to {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                 elif checkConf == 'minEff':
                     if neventsOutData >= int(expectedEvents * self._evAccEff) and neventsOutData <= expectedEvents:
                         msg.info("Event count check for {inData} to {outData} passed: found ({neventsOutData} output events selected from {expectedEvents} processed events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                     else:
                         raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                          'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected from {minEvents} to {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData,
                                                                                                                                                                  minEvents=int(expectedEvents * self._evAccEff), expectedEvents=expectedEvents))
                 elif isinstance(checkConf, (float, int)):
                     checkConf = float(checkConf)
                     if checkConf < 0.0 or checkConf > 1.0:
                         raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                          'Event count check for {inData} to {outData} is misconfigured: the efficiency factor of {eff} is not between 0 and 1.'.format(inData=inData, outData=outData, eff=checkConf))
                     if neventsOutData >= int(expectedEvents * checkConf) and neventsOutData <= expectedEvents:
                         msg.info("Event count check for {inData} to {outData} passed: found ({neventsOutData} output events selected from {expectedEvents} processed events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                     else:
                         raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                          'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected from {minEvents} to {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData,
                                                                                                                                                                  minEvents=int(expectedEvents * checkConf), expectedEvents=expectedEvents))
                 else:
                     raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                      'Unrecognised event count configuration for {inData} to {outData}: "{conf}" is not known'.format(inData=inData, outData=outData, conf=checkConf))
             self._eventCount = expectedEvents
         return True
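
The expected number of processed events is derived from the input count, skipEvents and maxEvents before any output comparison is made. As a rough sketch of that arithmetic, with the logging left out, the calculation can be written as a standalone function; `expected_events` is a hypothetical name used only for this example.

```python
def expected_events(n_in, skip_events=None, max_events=None):
    """Hypothetical helper mirroring the expected-event arithmetic in decide()."""
    expected = n_in
    if skip_events is not None and skip_events > 0:
        # Skipping more events than are available leaves nothing to process
        expected = max(expected - skip_events, 0)
    if max_events is not None and max_events <= expected:
        # maxEvents only caps the count; it never raises it
        expected = max_events
    return expected


print(expected_events(1000))             # no skip, no cap
print(expected_events(1000, 100))        # skip only
print(expected_events(1000, 100, 500))   # cap below the remaining events
print(expected_events(1000, 100, 2000))  # cap above the remaining events
print(expected_events(50, 100))          # skip exceeds the input
```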

◆ eventCount()

def python.trfValidation.eventMatch.eventCount ( self )

Definition at line 969 of file trfValidation.py.

     def eventCount(self):
         return self._eventCount
  

Member Data Documentation

◆ _evAccEff

python.trfValidation.eventMatch._evAccEff

private

Definition at line 983 of file trfValidation.py.

◆ _eventCount

python.trfValidation.eventMatch._eventCount

private

Definition at line 923 of file trfValidation.py.

◆ _eventCountConf

python.trfValidation.eventMatch._eventCountConf

private

Note

This double dictionary is formed of INPUT data, then a dictionary of the expected event counts from different output data types. If there is no exact match for the output datatype then globbing matches are allowed. Thus self._eventCountConf[input][output] gives the test for input -> output. The dictionary recognises the following options:

match : exact match of input and output events, n_in = n_out
filter : any event count from 0 up to input events is ok, 0 <= n_out <= n_in
minEff : any event count satisfying n_in * eventAcceptanceEfficiency <= n_out <= n_in
float in range [0,1] : same as minEff with this efficiency factor

For any case where the output events can be fewer than the input ones, an integer conversion is applied, so the lower bound is rounded down, i.e., 1 * 0.5 -> 0.
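
The options above can be summarised in a small standalone function. This is a sketch only: `check_counts` is a hypothetical name, it returns a boolean instead of raising TransformValidationException, and it raises ValueError for misconfiguration where the real code raises the transform exception.

```python
def check_counts(n_out, expected, check_conf, ev_acc_eff=0.99):
    """Hypothetical helper: True if n_out is acceptable for the given check."""
    if check_conf == "match":
        return n_out == expected
    if check_conf == "filter":
        return 0 <= n_out <= expected
    if check_conf == "minEff":
        # int() truncates, so the lower bound is rounded down
        return int(expected * ev_acc_eff) <= n_out <= expected
    if isinstance(check_conf, (float, int)):
        eff = float(check_conf)
        if eff < 0.0 or eff > 1.0:
            raise ValueError("efficiency factor {0} is not between 0 and 1".format(eff))
        return int(expected * eff) <= n_out <= expected
    raise ValueError('"{0}" is not a known check'.format(check_conf))


print(check_counts(100, 100, "match"))   # exact match required
print(check_counts(0, 100, "filter"))    # anything from 0 to expected passes
print(check_counts(0, 1, 0.5))           # int(1 * 0.5) is 0, so 0 events passes
print(check_counts(49, 100, 0.5))        # below the lower bound of 50
```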

Definition at line 937 of file trfValidation.py.
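
The lookup in the double dictionary tries an exact key first and falls back to glob matching, which the decide() listing implements with fnmatch. A minimal sketch of that lookup follows, assuming a trimmed-down configuration; `find_check` is a hypothetical helper and returns None where the real code logs a warning and skips the check.

```python
import fnmatch


def find_check(conf, in_data, out_data):
    """Hypothetical helper: look up the check for in_data -> out_data."""
    # Input side: exact key first, then the first glob pattern that matches
    if in_data in conf:
        in_key = in_data
    else:
        in_key = next((key for key in conf if fnmatch.fnmatch(in_data, key)), None)
        if in_key is None:
            return None
    # Output side: same strategy within the inner dictionary
    out_conf = conf[in_key]
    if out_data in out_conf:
        return out_conf[out_data]
    for pattern, check in out_conf.items():
        if fnmatch.fnmatch(out_data, pattern):
            return check
    return None


# Trimmed-down configuration, for illustration only
conf = {"RDO*": {"ESD": "match", "DAOD*": "filter", "DAOD_PHYS": "match"}}

print(find_check(conf, "RDO_TRIG", "ESD"))    # input matched by globbing
print(find_check(conf, "RDO", "DAOD_PHYS"))   # exact output key wins over DAOD*
print(find_check(conf, "RDO", "DAOD_TOPQ1"))  # output matched by globbing
print(find_check(conf, "XYZ", "ESD"))         # no configuration found
```

When several glob patterns could match, the first one encountered in dictionary order is used, so the order in which entries are defined matters.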

◆ _executor

python.trfValidation.eventMatch._executor

private

Definition at line 922 of file trfValidation.py.

◆ _inEventDict

python.trfValidation.eventMatch._inEventDict

private

Definition at line 979 of file trfValidation.py.

◆ _maxEvents

python.trfValidation.eventMatch._maxEvents

private

Definition at line 982 of file trfValidation.py.

◆ _outEventDict

python.trfValidation.eventMatch._outEventDict

private

Definition at line 980 of file trfValidation.py.

◆ _skipEvents

python.trfValidation.eventMatch._skipEvents

private

Definition at line 981 of file trfValidation.py.

The documentation for this class was generated from the following file:

trfValidation.py
