ATLAS Offline Software
python.trfValidation.eventMatch Class Reference

Small class used for validating event counts between input and output files.


Public Member Functions

def __init__ (self, executor, eventCountConf=None, eventCountConfOverwrite=False)
 Check input and output event counts.
 
def eventCount (self)
 
def configureCheck (self, override=False)
 Set up the parameters needed to define particular checks.
 
def decide (self)
 Perform an event count check.
 

Private Attributes

 _executor
 
 _eventCount
 
 _eventCountConf
 
 _inEventDict
 
 _outEventDict
 
 _skipEvents
 
 _maxEvents
 
 _evAccEff
 

Detailed Description

Small class used for validating event counts between input and output files.

Definition at line 911 of file trfValidation.py.

Constructor & Destructor Documentation

◆ __init__()

def python.trfValidation.eventMatch.__init__ (   self,
  executor,
  eventCountConf = None,
  eventCountConfOverwrite = False 
)

Check input and output event counts.

Class to verify that input and output event counts are in a reasonable relationship.

Parameters

Definition at line 920 of file trfValidation.py.

    def __init__(self, executor, eventCountConf=None, eventCountConfOverwrite=False):
        self._executor = executor
        self._eventCount = None

        simEventEff = 0.995
        self._eventCountConf = {}
        self._eventCountConf['EVNT'] = {'EVNT_MRG':"match", "HITS": simEventEff, "EVNT_TR": "filter", "DAOD_TRUTH*" : "match"}
        self._eventCountConf['EVNT_TR'] = {'HITS': simEventEff}
        self._eventCountConf['HITS'] = {'RDO':"match", 'HITS_RSM': simEventEff, "HITS_MRG":"match", 'HITS_FILT': simEventEff, "RDO_FILT": "filter", "DAOD_TRUTH*" : "match", "HIST_SIM" : "match"}
        self._eventCountConf['BS'] = {'ESD': "match", 'DRAW_*':"filter", 'NTUP_*':"filter", "BS_MRG":"match", 'DESD*': "filter", 'AOD':"match", 'DAOD*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match"}
        self._eventCountConf['RDO*'] = {'ESD': "match", 'DRAW_*':"filter", 'NTUP_*':"filter", "RDO_MRG":"match", "RDO_TRIG":"match", 'AOD':"match", 'DAOD*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match", "HIST_DIGI":"match"}
        self._eventCountConf['ESD'] = {'ESD_MRG': "match", 'AOD':"match", 'DESD*':"filter", 'DAOD_*':"filter", 'NTUP_*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match"}
        self._eventCountConf['AOD'] = {'AOD_MRG' : "match", 'TAG':"match", "NTUP_*":"filter", "DAOD_*":"filter", 'NTUP_*':"filter", "DAOD_PHYS":"match", "DAOD_PHYSLITE":"match"}
        self._eventCountConf['AOD_MRG'] = {'TAG':"match"}
        self._eventCountConf['DAOD_*'] = {'DAOD_*_MRG' : "match"}
        self._eventCountConf['TAG'] = {'TAG_MRG': "match"}
        self._eventCountConf['HIST'] = {'HIST_MRG': "match"}
        self._eventCountConf['NTUP_COMMON'] = {'DNTUP*': "filter"}
        self._eventCountConf['NTUP_*'] = {'NTUP_*_MRG': "match"}
        # Next one comprises special data type names for smart merging of AthenaMP worker outputs
        self._eventCountConf['POOL_MRG_INPUT'] = {'POOL_MRG_OUTPUT': "match"}

        if eventCountConf:
            if eventCountConfOverwrite is True:
                self._eventCountConf = eventCountConf
            else:
                self._eventCountConf.update(eventCountConf)

        msg.debug('Event count check configuration is: {0}'.format(self._eventCountConf))
        if hasattr(self._executor, 'name'):
            msg.debug('Event count check ready for executor {0}'.format(self._executor.name))

        if self._executor is not None:
            self.configureCheck(override=False)
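
The constructor merges a user-supplied eventCountConf with dict.update, which is a shallow merge: supplying an input key that already exists replaces the whole inner dictionary for that key rather than merging rule by rule. The following is a minimal sketch of that behaviour on plain dictionaries. DEFAULT_CONF is a trimmed-down, hypothetical stand-in for the built-in table, and merge_conf is an illustrative helper, not part of trfValidation.py.

```python
import copy

# Hypothetical, trimmed-down stand-in for the built-in configuration table.
DEFAULT_CONF = {
    "EVNT": {"EVNT_MRG": "match", "HITS": 0.995},
    "HITS": {"RDO": "match"},
}


def merge_conf(default, user=None, overwrite=False):
    """Mimic the merge logic of the constructor on plain dictionaries."""
    conf = copy.deepcopy(default)  # keep the illustrative default untouched
    if user:
        if overwrite is True:
            conf = user            # user table replaces everything
        else:
            conf.update(user)      # shallow: whole inner dicts are replaced
    return conf


user_conf = {"EVNT": {"HITS": "filter"}}
merged = merge_conf(DEFAULT_CONF, user_conf)
replaced = merge_conf(DEFAULT_CONF, user_conf, overwrite=True)

print(merged["EVNT"])   # the EVNT_MRG rule is no longer present
print(list(replaced))   # only the user-supplied input types remain
```

Under this reading, a caller who wants to change a single rule while keeping the others for the same input type would need to pass the complete inner dictionary for that input type.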

Member Function Documentation

◆ configureCheck()

def python.trfValidation.eventMatch.configureCheck (   self,
  override = False 
)

Set up the parameters needed to define particular checks.

Parameters
override  If set, configure the checks using this dictionary, which must have the keys inEventDict, outEventDict, skipEvents, maxEvents and evAccEff.
Note
The default is to configure the checks from the associated executor.
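
As an illustration of the override parameter, the sketch below builds a dictionary with the five keys named above and checks that none is missing. The helper names make_override and missing_keys, and the default values chosen for them, are assumptions made for this example; they are not part of the transform code.

```python
# The five keys that configureCheck reads from an override dictionary.
REQUIRED_KEYS = ("inEventDict", "outEventDict", "skipEvents", "maxEvents", "evAccEff")


def make_override(in_events, out_events, skip_events=None, max_events=None, ev_acc_eff=0.99):
    """Build an override dictionary (hypothetical helper, for illustration only)."""
    return {
        "inEventDict": dict(in_events),    # data type name -> event count
        "outEventDict": dict(out_events),  # data type name -> event count
        "skipEvents": skip_events,
        "maxEvents": max_events,
        "evAccEff": ev_acc_eff,
    }


def missing_keys(override):
    """Return the required keys that are absent from an override dictionary."""
    return [key for key in REQUIRED_KEYS if key not in override]


override = make_override({"HITS": 1000}, {"RDO": 1000}, skip_events=0)
print(missing_keys(override))
```

Because the method tests the argument with a plain truth check, an empty dictionary behaves like the default False and the executor-based configuration is used instead.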

Definition at line 975 of file trfValidation.py.

    def configureCheck(self, override=False):
        if override:
            msg.info('Overriding check configuration with: {0}'.format(override))
            self._inEventDict = override['inEventDict']
            self._outEventDict = override['outEventDict']
            self._skipEvents = override['skipEvents']
            self._maxEvents = override['maxEvents']
            self._evAccEff = override['evAccEff']
        else:
            # Input data from executor
            self._inEventDict = {}
            for dataTypeName in self._executor.input:
                try:
                    self._inEventDict[dataTypeName] = self._executor.conf.dataDictionary[dataTypeName].nentries
                    msg.debug('Input data type {0} has {1} events'.format(dataTypeName, self._inEventDict[dataTypeName]))
                except KeyError:
                    msg.warning('Found no dataDictionary entry for input data type {0}'.format(dataTypeName))

            # Output data from executor
            self._outEventDict = {}
            for dataTypeName in self._executor.output:
                try:
                    self._outEventDict[dataTypeName] = self._executor.conf.dataDictionary[dataTypeName].nentries
                    msg.debug('Output data type {0} has {1} events'.format(dataTypeName, self._outEventDict[dataTypeName]))
                except KeyError:
                    msg.warning('Found no dataDictionary entry for output data type {0}'.format(dataTypeName))

            # Find if we have a skipEvents applied
            if "skipEvents" in self._executor.conf.argdict:
                self._skipEvents = self._executor.conf.argdict['skipEvents'].returnMyValue(exe=self._executor)
            else:
                self._skipEvents = None

            # Find if we have a maxEvents applied
            if "maxEvents" in self._executor.conf.argdict:
                self._maxEvents = self._executor.conf.argdict['maxEvents'].returnMyValue(exe=self._executor)
                if self._maxEvents == -1:
                    self._maxEvents = None
            else:
                self._maxEvents = None

            # Executor substeps handling
            if self._executor.conf.totalExecutorSteps > 1 and self._executor.conf.executorStep < self._executor.conf.totalExecutorSteps - 1:
                executorEventCounts, executorEventSkips = getExecutorStepEventCounts(self._executor)
                self._maxEvents = executorEventCounts[self._executor.conf.executorStep]
                self._skipEvents = executorEventSkips[self._executor.conf.executorStep]

            # Global eventAcceptanceEfficiency set?
            if "eventAcceptanceEfficiency" in self._executor.conf.argdict:
                self._evAccEff = self._executor.conf.argdict['eventAcceptanceEfficiency'].returnMyValue(exe=self._executor)
                if (self._evAccEff is None):
                    self._evAccEff = 0.99
            else:
                self._evAccEff = 0.99

        msg.debug("Event check conf: {0} {1}, {2}, {3}, {4}".format(self._inEventDict, self._outEventDict, self._skipEvents,
                                                                    self._maxEvents, self._evAccEff))


◆ decide()

def python.trfValidation.eventMatch.decide (   self)

Perform an event count check.

Definition at line 1035 of file trfValidation.py.

    def decide(self):
        # We have all that we need to proceed: input and output data, skip and max events plus any efficiency factor
        # So loop over the input and output data and make our checks
        for inData, neventsInData in self._inEventDict.items():
            if not isinstance(neventsInData, int):
                msg.warning('File size metadata for {inData} was not countable, found {neventsInData}. No event checks possible for this input data.'.format(inData=inData, neventsInData=neventsInData))
                continue
            if inData in self._eventCountConf:
                inDataKey = inData
            else:
                # OK, try a glob match in this case (YMMV)
                matchedInData = False
                for inDataKey in self._eventCountConf:
                    if fnmatch.fnmatch(inData, inDataKey):
                        msg.info("Matched input data type {inData} to {inDataKey} by globbing".format(inData=inData, inDataKey=inDataKey))
                        matchedInData = True
                        break
                if not matchedInData:
                    msg.warning('No defined event count match for {inData} -> {outData}, so no check(s) possible in this case.'.format(inData=inData, outData=list(self._outEventDict)))
                    continue

            # Now calculate the expected number of processed events for this input
            expectedEvents = neventsInData
            if self._skipEvents is not None and self._skipEvents > 0:
                expectedEvents -= self._skipEvents
                if expectedEvents < 0:
                    msg.warning('skipEvents was set higher than the input events in {inData}: {skipEvents} > {neventsInData}. This is not an error, but it is not a normal configuration. Expected events is now 0.'.format(inData=inData, skipEvents=self._skipEvents, neventsInData=neventsInData))
                    expectedEvents = 0
            if self._maxEvents is not None:
                if expectedEvents < self._maxEvents:
                    if self._skipEvents is not None:
                        msg.warning('maxEvents was set higher than inputEvents-skipEvents for {inData}: {maxEvents} > {neventsInData}-{skipEvents}. This is not an error, but it is not a normal configuration. Expected events remains {expectedEvents}.'.format(inData=inData, maxEvents=self._maxEvents, neventsInData=neventsInData, skipEvents=self._skipEvents, expectedEvents=expectedEvents))
                    else:
                        msg.warning('maxEvents was set higher than inputEvents for {inData}: {maxEvents} > {neventsInData}. This is not an error, but it is not a normal configuration. Expected events remains {expectedEvents}.'.format(inData=inData, maxEvents=self._maxEvents, neventsInData=neventsInData, expectedEvents=expectedEvents))
                else:
                    expectedEvents = self._maxEvents
            msg.debug('Expected number of processed events for {0} is {1}'.format(inData, expectedEvents))

            # Loop over output data - first find event count configuration
            for outData, neventsOutData in self._outEventDict.items():
                if not isinstance(neventsOutData, int):
                    msg.warning('File size metadata for {outData} was not countable, found "{neventsOutData}". No event checks possible for this output data.'.format(outData=outData, neventsOutData=neventsOutData))
                    continue
                if outData in self._eventCountConf[inDataKey]:
                    checkConf = self._eventCountConf[inDataKey][outData]
                    outDataKey = outData
                else:
                    # Look for glob matches
                    checkConf = None
                    for outDataKey, outDataConf in self._eventCountConf[inDataKey].items():
                        if fnmatch.fnmatch(outData, outDataKey):
                            msg.info('Matched output data type {outData} to {outDatakey} by globbing'.format(outData=outData, outDatakey=outDataKey))
                            outDataKey = outData
                            checkConf = outDataConf
                            break
                    if not checkConf:
                        msg.warning('No defined event count match for {inData} -> {outData}, so no check possible in this case.'.format(inData=inData, outData=outData))
                        continue
                msg.debug('Event count check for {inData} to {outData} is {checkConf}'.format(inData=inData, outData=outData, checkConf=checkConf))

                # Do the check for this input/output combination
                if checkConf == 'match':
                    # We need an exact match
                    if neventsOutData == expectedEvents:
                        msg.info("Event count check for {inData} to {outData} passed: all processed events found ({neventsOutData} output events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData))
                    else:
                        raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                         'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                elif checkConf == 'filter':
                    if neventsOutData <= expectedEvents and neventsOutData >= 0:
                        msg.info("Event count check for {inData} to {outData} passed: found ({neventsOutData} output events selected from {expectedEvents} processed events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                    else:
                        raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                         'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected from 0 to {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                elif checkConf == 'minEff':
                    if neventsOutData >= int(expectedEvents * self._evAccEff) and neventsOutData <= expectedEvents:
                        msg.info("Event count check for {inData} to {outData} passed: found ({neventsOutData} output events selected from {expectedEvents} processed events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                    else:
                        raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                         'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected from {minEvents} to {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData,
                                                                                                                                                                                                                     minEvents=int(expectedEvents * self._evAccEff), expectedEvents=expectedEvents))
                elif isinstance(checkConf, (float, int)):
                    checkConf = float(checkConf)
                    if checkConf < 0.0 or checkConf > 1.0:
                        raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                         'Event count check for {inData} to {outData} is misconfigured: the efficiency factor of {eff} is not between 0 and 1.'.format(inData=inData, outData=outData, eff=checkConf))
                    if neventsOutData >= int(expectedEvents * checkConf) and neventsOutData <= expectedEvents:
                        msg.info("Event count check for {inData} to {outData} passed: found ({neventsOutData} output events selected from {expectedEvents} processed events)".format(inData=inData, outData=outData, neventsOutData=neventsOutData, expectedEvents=expectedEvents))
                    else:
                        raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                         'Event count check for {inData} to {outData} failed: found {neventsOutData} events, expected from {minEvents} to {expectedEvents}'.format(inData=inData, outData=outData, neventsOutData=neventsOutData,
                                                                                                                                                                                                                     minEvents=int(expectedEvents * checkConf), expectedEvents=expectedEvents))
                else:
                    raise trfExceptions.TransformValidationException(trfExit.nameToCode('TRF_EXEC_VALIDATION_EVENTCOUNT'),
                                                                     'Unrecognised event count configuration for {inData} to {outData}: "{conf}" is not known'.format(inData=inData, outData=outData, conf=checkConf))
            self._eventCount = expectedEvents
        return True
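
Two steps in decide() can be read in isolation: resolving a data type name to a configuration key (exact match first, then the first glob pattern that matches), and computing the expected number of processed events from the input count, skipEvents and maxEvents. The sketch below restates both on plain values, without logging. The function names resolve_key and expected_events are assumptions introduced for this example.

```python
import fnmatch


def resolve_key(name, conf):
    """Return the key of conf that applies to name, or None if nothing matches.

    An exact key wins; otherwise the first key that matches as a glob
    pattern (in dictionary order) is used.
    """
    if name in conf:
        return name
    for pattern in conf:
        if fnmatch.fnmatch(name, pattern):
            return pattern
    return None


def expected_events(n_in, skip_events=None, max_events=None):
    """Expected processed events after applying skipEvents, then maxEvents."""
    expected = n_in
    if skip_events is not None and skip_events > 0:
        # Skipping more events than exist leaves zero, not a negative count.
        expected = max(expected - skip_events, 0)
    if max_events is not None and expected >= max_events:
        # maxEvents only caps the count; it never raises it.
        expected = max_events
    return expected


print(resolve_key("RDO_TRIG", {"ESD": {}, "RDO*": {}}))
print(expected_events(1000, skip_events=100, max_events=500))
```

Since the glob fallback stops at the first matching pattern, the order in which overlapping patterns appear in the configuration dictionary can affect which rule is chosen.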

◆ eventCount()

def python.trfValidation.eventMatch.eventCount (   self)

Definition at line 968 of file trfValidation.py.

    def eventCount(self):
        return self._eventCount


Member Data Documentation

◆ _evAccEff

python.trfValidation.eventMatch._evAccEff
private

Definition at line 982 of file trfValidation.py.

◆ _eventCount

python.trfValidation.eventMatch._eventCount
private

Definition at line 922 of file trfValidation.py.

◆ _eventCountConf

python.trfValidation.eventMatch._eventCountConf
private
Note
This double dictionary is keyed first by INPUT data type, then by output data type; each inner value is the event count check expected for that output. If there is no exact match for the output data type, glob matches are allowed. Thus self._eventCountConf[input][output] gives the test for input -> output. The dictionary recognises the following options:
  • match : exact match of input and output events, n_in = n_out
  • filter : any event count from 0 up to the input events is OK, 0 <= n_out <= n_in
  • minEff : any event count satisfying n_in * eventAcceptanceEfficiency <= n_out <= n_in
  • float in range [0,1] : same as minEff, with this value as the efficiency factor
In any case where the output events can be fewer than the input ones, an integer conversion is applied to the lower bound, so the result is rounded down, e.g. 1 * 0.5 -> 0.
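
The four options can be summarised as a single predicate. This is a minimal sketch under stated assumptions: count_ok is a hypothetical helper, it returns a boolean and raises ValueError where the real class raises a transform validation exception, and n_expected stands for the input count after any skipEvents and maxEvents adjustment.

```python
def count_ok(check, n_out, n_expected, ev_acc_eff=0.99):
    """Return True if n_out satisfies the given check against n_expected."""
    if check == "match":
        return n_out == n_expected
    if check == "filter":
        return 0 <= n_out <= n_expected
    if check == "minEff":
        # Lower bound is truncated to an integer, i.e. rounded down.
        return int(n_expected * ev_acc_eff) <= n_out <= n_expected
    if isinstance(check, (float, int)):
        eff = float(check)
        if eff < 0.0 or eff > 1.0:
            raise ValueError("efficiency factor {0} is not between 0 and 1".format(eff))
        return int(n_expected * eff) <= n_out <= n_expected
    raise ValueError("unrecognised event count configuration: {0!r}".format(check))


print(count_ok("match", 10, 10))
print(count_ok("filter", 11, 10))
print(count_ok(0.5, 0, 1))  # lower bound int(1 * 0.5) is 0
```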

Definition at line 936 of file trfValidation.py.

◆ _executor

python.trfValidation.eventMatch._executor
private

Definition at line 921 of file trfValidation.py.

◆ _inEventDict

python.trfValidation.eventMatch._inEventDict
private

Definition at line 978 of file trfValidation.py.

◆ _maxEvents

python.trfValidation.eventMatch._maxEvents
private

Definition at line 981 of file trfValidation.py.

◆ _outEventDict

python.trfValidation.eventMatch._outEventDict
private

Definition at line 979 of file trfValidation.py.

◆ _skipEvents

python.trfValidation.eventMatch._skipEvents
private

Definition at line 980 of file trfValidation.py.


The documentation for this class was generated from the following file:
  • trfValidation.py