ATLAS Offline Software
python.trfReports.trfJobReport Class Reference

Class holding a transform job report. More...

Inheritance diagram for python.trfReports.trfJobReport:
Collaboration diagram for python.trfReports.trfJobReport:

Public Member Functions

def __init__ (self, parentTrf)
 Constructor. More...
 
def python (self, fast=False, fileReport=defaultFileReport)
 Generate the python transform job report. More...
 
def classicEltree (self, fast=False)
 Classic metadata.xml report. More...
 
def classicPython (self, fast=False)
 Classic Tier 0 metadata python object. More...
 
def roundoff (self, value)
 
def __str__ (self)
 String representation of the job report. More...
 
def json (self, fast=False)
 Method which returns a JSON representation of a report. More...
 
def writeJSONReport (self, filename, sort_keys=True, indent=2, fast=False, fileReport=defaultFileReport)
 
def writeTxtReport (self, filename, dumpEnv=True, fast=False, fileReport=defaultFileReport)
 
def writeGPickleReport (self, filename, fast=False)
 
def writeClassicXMLReport (self, filename, fast=False)
 
def writePilotPickleReport (self, filename, fast=False, fileReport=defaultFileReport)
 

Private Attributes

 _trf
 
 _precisionDigits
 
 _dbDataTotal
 
 _dbTimeTotal
 
 _dataDictionary
 

Static Private Attributes

 _reportVersion
 This is the version counter for transform job reports; any changes to the format must be reflected by incrementing this. More...
 
 _metadataKeyMap
 
 _maxMsgLen
 
 _truncationMsg
 

Detailed Description

Class holding a transform job report.

Definition at line 117 of file trfReports.py.

Constructor & Destructor Documentation

◆ __init__()

def python.trfReports.trfJobReport.__init__ (   self,
  parentTrf 
)

Constructor.

Parameters
parentTrf: Mandatory link to the transform this job report represents

Definition at line 127 of file trfReports.py.

127  def __init__(self, parentTrf):
128  super(trfJobReport, self).__init__()
129  self._trf = parentTrf
130  self._precisionDigits = 3
131  self._dbDataTotal = 0
132  self._dbTimeTotal = 0.0
133 
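As a usage illustration only: the report is normally created by the transform itself, which passes itself as parentTrf so that the report can read the transform's state when a report is generated. A minimal sketch of that pattern; the import path and the surrounding class are assumptions, not taken from this page:

 # Sketch only: assumes the usual PyJobTransforms package layout.
 from PyJobTransforms.trfReports import trfJobReport

 class mySimpleTransform(object):
     """Illustrative stand-in for a real transform object."""
     def __init__(self):
         self.name = 'mySimpleTransform'
         # The job report keeps a back-reference to its parent transform and
         # reads attributes such as exitCode, argdict and _dataDictionary
         # from it only when a report is actually generated.
         self._report = trfJobReport(parentTrf=self)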

Member Function Documentation

◆ __str__()

def python.trfReports.trfReport.__str__ (   self)
inherited

String representation of the job report.

Uses the pprint module to output the python object as text.

Note
This is a 'property', so no fast option is available

Definition at line 47 of file trfReports.py.

47  def __str__(self):
48  return pprint.pformat(self.python())
49 

◆ classicEltree()

def python.trfReports.trfJobReport.classicEltree (   self,
  fast = False 
)

Classic metadata.xml report.

Reimplemented from python.trfReports.trfReport.

Definition at line 253 of file trfReports.py.

253  def classicEltree(self, fast = False):
254  trfTree = ElementTree.Element('POOLFILECATALOG')
255  # Extract some executor parameters here
256  for exeKey in ('preExec', 'postExec', 'preInclude', 'postInclude'):
257  if exeKey in self._trf.argdict:
258  for substep, pyfrag in self._trf.argdict[exeKey].value.items():
259  if substep == 'all':
260  ElementTree.SubElement(trfTree, 'META', type = 'string', name = exeKey, value = str(pyfrag))
261  else:
262  ElementTree.SubElement(trfTree, 'META', type = 'string', name = exeKey + '_' + substep, value = str(pyfrag))
263  for exeKey in ('autoConfiguration', 'AMIConfig', 'AMITag'):
264  if exeKey in self._trf.argdict:
265  if exeKey in self._metadataKeyMap:
266  classicName = self._metadataKeyMap[exeKey]
267  else:
268  classicName = exeKey
269  ElementTree.SubElement(trfTree, 'META', type = 'string', name = classicName,
270  value = str(self._trf.argdict[exeKey].value))
271 
272  # Now add information about output files
273  for dataArg in self._trf._dataDictionary.values():
274  if isinstance(dataArg, list): # Always skip lists from the report (auxiliary files)
275  continue
276  if dataArg.io == 'output':
277  for fileEltree in trfFileReport(dataArg).classicEltreeList(fast = fast):
278  trfTree.append(fileEltree)
279 
280  return trfTree
281 
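The return value is a plain xml.etree.ElementTree.Element rooted at POOLFILECATALOG, with one META child per recorded key plus the per-file elements. A sketch of turning it into text directly with the standard library, without the pretty printing used by writeClassicXMLReport; report is assumed to be a trfJobReport instance:

 import xml.etree.ElementTree as ElementTree

 # Serialise the classic metadata tree to a string; fast=True asks for the
 # quickest possible report, as for the other report methods on this page.
 xmlText = ElementTree.tostring(report.classicEltree(fast=True),
                                encoding='unicode')
 print(xmlText)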

◆ classicPython()

def python.trfReports.trfJobReport.classicPython (   self,
  fast = False 
)

Classic Tier 0 metadata python object.

Metadata in python nested dictionary form, which will produce a Tier 0 .gpickle when pickled

Reimplemented from python.trfReports.trfReport.

Definition at line 285 of file trfReports.py.

285  def classicPython(self, fast = False):
286  # Things we can get directly from the transform
287  trfDict = {'jobInputs' : [], # Always empty?
288  'jobOutputs' : [], # Filled in below...
289  'more' : {'Machine' : 'unknown'},
290  'trfAcronym' : trfExit.codeToName(self._trf.exitCode),
291  'trfCode' : self._trf.exitCode,
292  'trfExitCode' : self._trf.exitCode,
293  }
294 
295  if self._trf.lastExecuted is not None:
296  trfDict.update({'athAcronym' : self._trf.lastExecuted.errMsg,
297  'athCode' : self._trf.lastExecuted.rc})
298 
299 
300  # Emulate the NEEDCHECK behaviour
301  if hasattr(self._trf, '_executorPath'):
302  for executor in self._trf._executorPath:
303  if hasattr(executor, '_logScan') and self._trf.exitCode == 0:
304  if executor._logScan._levelCounter['FATAL'] > 0 or executor._logScan._levelCounter['CRITICAL'] > 0:
305  # This should not happen!
306  msg.warning('Found FATAL/CRITICAL errors and exit code 0 - reseting to TRF_LOGFILE_FAIL')
307  self._trf.exitCode = trfExit.nameToCode('TRF_LOGFILE_FAIL')
308  trfDict['trfAcronym'] = 'TRF_LOGFILE_FAIL'
309  elif executor._logScan._levelCounter['ERROR'] > 0:
310  msg.warning('Found errors in logfile scan - changing exit acronymn to NEEDCHECK.')
311  trfDict['trfAcronym'] = 'NEEDCHECK'
312 
313  # Now add files
314  fileArgs = self._trf.getFiles(io = 'output')
315  for fileArg in fileArgs:
316  # N.B. In the original Tier 0 gpickles there was executor
317  # information added for each file (such as autoConfiguration, preExec).
318  # However, Luc tells me it is ignored, so let's not bother.
319  trfDict['jobOutputs'].extend(trfFileReport(fileArg).classicPython(fast = fast))
320  # AMITag and friends is added per-file, but it's known only to the transform, so set it here:
321  for argdictKey in ('AMITag', 'autoConfiguration',):
322  if argdictKey in self._trf.argdict:
323  trfDict['jobOutputs'][-1]['more']['metadata'][argdictKey] = self._trf.argdict[argdictKey].value
324  # Mangle substep arguments back to the old format
325  for substepKey in ('preExec', 'postExec', 'preInclude', 'postInclude'):
326  if substepKey in self._trf.argdict:
327  for substep, values in self._trf.argdict[substepKey].value.items():
328  if substep == 'all':
329  trfDict['jobOutputs'][-1]['more']['metadata'][substepKey] = values
330  else:
331  trfDict['jobOutputs'][-1]['more']['metadata'][substepKey + '_' + substep] = values
332 
333  # Now retrieve the input event count
334  nentries = 'UNKNOWN'
335  for fileArg in self._trf.getFiles(io = 'input'):
336  thisArgNentries = fileArg.nentries
337  if isinstance(thisArgNentries, int):
338  if nentries == 'UNKNOWN':
339  nentries = thisArgNentries
340  elif thisArgNentries != nentries:
341  msg.warning('Found a file with different event count than others: {0} != {1} for {2}'.format(thisArgNentries, nentries, fileArg))
342  # Take highest number?
343  if thisArgNentries > nentries:
344  nentries = thisArgNentries
345  trfDict['nevents'] = nentries
346 
347  # Tier 0 expects the report to be in a top level dictionary under the prodsys key
348  return {'prodsys' : trfDict}
349 
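For orientation, the structure returned can be sketched as follows; the concrete values are made up, only the key layout follows the code above:

 classic = report.classicPython(fast=True)
 # Illustrative layout of the result (values are made up):
 # {
 #     'prodsys': {
 #         'jobInputs':   [],                      # always empty here
 #         'jobOutputs':  [{...}, {...}],          # one dict per output file
 #         'more':        {'Machine': 'unknown'},
 #         'trfAcronym':  'OK',
 #         'trfCode':     0,
 #         'trfExitCode': 0,
 #         'nevents':     1000,
 #     }
 # }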

◆ json()

def python.trfReports.trfReport.json (   self,
  fast = False 
)
inherited

Method which returns a JSON representation of a report.

Parameters
fast: Boolean which forces the fastest possible report to be written

Calls json.dumps on the python representation

Definition at line 58 of file trfReports.py.

58  def json(self, fast = False):
59  return json.dumps(self.python, type)
60 

◆ python()

def python.trfReports.trfJobReport.python (   self,
  fast = False,
  fileReport = defaultFileReport 
)

Generate the python transform job report.

Parameters
type: The general type of this report (e.g. fast)
fileReport: Dictionary giving the type of report to make for each type of file. This dictionary must have all io types as keys; valid values are: None - skip this io type; 'full' - provide all details; 'name' - only dataset and filename will be reported on.
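As an illustration of those rules, a fileReport dictionary could look like the sketch below; the exact contents of defaultFileReport are not reproduced here, only the expected shape, and jobReport is an assumed trfJobReport instance:

 # One entry per io type; 'full', 'name' or None as described above.
 fileReport = {'input': 'name', 'output': 'full', 'temporary': None}
 reportDict = jobReport.python(fast=False, fileReport=fileReport)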

Reimplemented from python.trfReports.trfReport.

Definition at line 140 of file trfReports.py.

140  def python(self, fast = False, fileReport = defaultFileReport):
141  myDict = {'name': self._trf.name,
142  'reportVersion': self._reportVersion,
143  'cmdLine': ' '.join(shQuoteStrings(sys.argv)),
144  'exitAcronym': trfExit.codeToName(self._trf.exitCode),
145  'exitCode': self._trf.exitCode,
146  'created': isodate(),
147  'resource': {'executor': {}, 'transform': {}},
148  'files': {}
149  }
150  if len(self._trf.exitMsg) > self._maxMsgLen:
151  myDict['exitMsg'] = self._trf.exitMsg[:self._maxMsgLen-len(self._truncationMsg)] + self._truncationMsg
152  myDict['exitMsgExtra'] = self._trf.exitMsg[self._maxMsgLen-len(self._truncationMsg):]
153  else:
154  myDict['exitMsg'] = self._trf.exitMsg
155  myDict['exitMsgExtra'] = ""
156 
157  # Iterate over files
158  for fileType in ('input', 'output', 'temporary'):
159  if fileReport[fileType]:
160  myDict['files'][fileType] = []
161  # Should have a dataDictionary, unless something went wrong very early...
162  for dataType, dataArg in self._trf._dataDictionary.items():
163  if isinstance(dataArg, list): # Always skip lists from the report (auxiliary files)
164  continue
165  if dataArg.auxiliaryFile: # Always skip auxiliary files from the report
166  continue
167  if fileReport[dataArg.io]:
168  entry = {"type": dataType}
169  entry.update(trfFileReport(dataArg).python(fast = fast, type = fileReport[dataArg.io]))
170  # Suppress RAW if all subfiles had nentries == 0
171  if 'subFiles' in entry and len(entry['subFiles']) == 0 and isinstance(dataArg, trfArgClasses.argBSFile) :
172  msg.info('No subFiles for entry {0}, suppressing from report.'.format(entry['argName']))
173  else:
174  myDict['files'][dataArg.io].append(entry)
175 
176  # We report on all executors, in execution order
177  myDict['executor'] = []
178  if hasattr(self._trf, '_executorPath'):
179  for executionStep in self._trf._executorPath:
180  exe = self._trf._executorDictionary[executionStep['name']]
181  myDict['executor'].append(trfExecutorReport(exe).python(fast = fast))
182  # Executor resources are gathered here to unify where this information is held
183  # and allow T0/PanDA to just store this JSON fragment on its own
184  myDict['resource']['executor'][exe.name] = exeResourceReport(exe, self)
185  for mergeStep in exe.myMerger:
186  myDict['resource']['executor'][mergeStep.name] = exeResourceReport(mergeStep, self)
187  if self._dbDataTotal > 0 or self._dbTimeTotal > 0:
188  myDict['resource']['dbDataTotal'] = self._dbDataTotal
189  myDict['resource']['dbTimeTotal'] = self.roundoff(self._dbTimeTotal)
190  # Resource consumption
191  reportTime = os.times()
192 
193  # Calculate total cpu time we used -
194  myCpuTime = reportTime[0] + reportTime[1]
195  childCpuTime = reportTime[2] + reportTime[3]
196  wallTime = reportTime[4] - self._trf.transformStart[4]
197  cpuTime = myCpuTime
198  cpuTimeTotal = 0
199  cpuTimePerWorker = myCpuTime
200  maxWorkers = 1
201  msg.debug('Raw cpu resource consumption: transform {0}, children {1}'.format(myCpuTime, childCpuTime))
202  # Reduce childCpuTime by times reported in the executors (broken for MP...?)
203  for exeName, exeReport in myDict['resource']['executor'].items():
204  if 'mpworkers' in exeReport:
205  if exeReport['mpworkers'] > maxWorkers : maxWorkers = exeReport['mpworkers']
206  try:
207  msg.debug('Subtracting {0}s time for executor {1}'.format(exeReport['cpuTime'], exeName))
208  childCpuTime -= exeReport['cpuTime']
209  except TypeError:
210  pass
211  try:
212  cpuTime += exeReport['cpuTime']
213  cpuTimeTotal += exeReport['total']['cpuTime']
214  if 'cpuTimePerWorker' in exeReport:
215  msg.debug('Adding {0}s to cpuTimePerWorker'.format(exeReport['cpuTimePerWorker']))
216  cpuTimePerWorker += exeReport['cpuTimePerWorker']
217  else:
218  msg.debug('Adding nonMP cpuTime {0}s to cpuTimePerWorker'.format(exeReport['cpuTime']))
219  cpuTimePerWorker += exeReport['cpuTime']
220  except TypeError:
221  pass
222 
223  msg.debug('maxWorkers: {0}, cpuTimeTotal: {1}, cpuTimePerWorker: {2}'.format(maxWorkers, cpuTime, cpuTimePerWorker))
224  reportGenerationCpuTime = reportGenerationWallTime = None
225  if self._trf.outFileValidationStop and reportTime:
226  reportGenerationCpuTime = calcCpuTime(self._trf.outFileValidationStop, reportTime)
227  reportGenerationWallTime = calcWallTime(self._trf.outFileValidationStop, reportTime)
228 
229  myDict['resource']['transform'] = {'cpuTime': self.roundoff(myCpuTime),
230  'cpuTimeTotal': self.roundoff(cpuTimeTotal),
231  'externalCpuTime': self.roundoff(childCpuTime),
232  'wallTime': self.roundoff(wallTime),
233  'transformSetup': {'cpuTime': self.roundoff(self._trf.transformSetupCpuTime),
234  'wallTime': self.roundoff(self._trf.transformSetupWallTime)},
235  'inFileValidation': {'cpuTime': self.roundoff(self._trf.inFileValidationCpuTime),
236  'wallTime': self.roundoff(self._trf.inFileValidationWallTime)},
237  'outFileValidation': {'cpuTime': self.roundoff(self._trf.outFileValidationCpuTime),
238  'wallTime': self.roundoff(self._trf.outFileValidationWallTime)},
239  'reportGeneration': {'cpuTime': self.roundoff(reportGenerationCpuTime),
240  'wallTime': self.roundoff(reportGenerationWallTime)}, }
241  if self._trf.processedEvents:
242  myDict['resource']['transform']['processedEvents'] = self._trf.processedEvents
243  myDict['resource']['transform']['trfPredata'] = self._trf.trfPredata
244  # check for division by zero for fast jobs, unit tests
245  if wallTime > 0:
246  myDict['resource']['transform']['cpuEfficiency'] = round(cpuTime/maxWorkers/wallTime, 4)
247  myDict['resource']['transform']['cpuPWEfficiency'] = round(cpuTimePerWorker/wallTime, 4)
248  myDict['resource']['machine'] = machineReport().python(fast = fast)
249 
250  return myDict
251 
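The two efficiency figures at the end follow directly from the quantities accumulated above; a worked example with made-up numbers for an 8-worker AthenaMP job:

 # Hypothetical numbers, for illustration only.
 cpuTime = 6400.0          # transform CPU plus summed executor CPU
 cpuTimePerWorker = 850.0  # transform CPU plus per-worker executor CPU
 maxWorkers = 8
 wallTime = 1000.0

 cpuEfficiency = round(cpuTime / maxWorkers / wallTime, 4)   # 0.8
 cpuPWEfficiency = round(cpuTimePerWorker / wallTime, 4)     # 0.85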

◆ roundoff()

def python.trfReports.trfJobReport.roundoff (   self,
  value 
)

Definition at line 352 of file trfReports.py.

352  def roundoff(self, value):
353  return round(value, self._precisionDigits) if (value is not None) else value
354 
355 
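With the constructor's _precisionDigits of 3 this behaves as sketched below; None is passed through unchanged, which is why the resource block above can feed it optional timings:

 report.roundoff(12.34567)   # -> 12.346
 report.roundoff(None)       # -> None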

◆ writeClassicXMLReport()

def python.trfReports.trfReport.writeClassicXMLReport (   self,
  filename,
  fast = False 
)
inherited

Definition at line 104 of file trfReports.py.

104  def writeClassicXMLReport(self, filename, fast = False):
105  with open(filename, 'w') as report:
106  print(prettyXML(self.classicEltree(fast = fast), poolFileCatalogFormat = True), file=report)
107 

◆ writeGPickleReport()

def python.trfReports.trfReport.writeGPickleReport (   self,
  filename,
  fast = False 
)
inherited

Definition at line 100 of file trfReports.py.

100  def writeGPickleReport(self, filename, fast = False):
101  with open(filename, 'wb') as report:
102  pickle.dump(self.classicPython(fast = fast), report)
103 

◆ writeJSONReport()

def python.trfReports.trfReport.writeJSONReport (   self,
  filename,
  sort_keys = True,
  indent = 2,
  fast = False,
  fileReport = defaultFileReport 
)
inherited

Definition at line 71 of file trfReports.py.

71  def writeJSONReport(self, filename, sort_keys = True, indent = 2, fast = False, fileReport = defaultFileReport):
72  with open(filename, 'w') as report:
73  try:
74  if not self._dataDictionary:
75  self._dataDictionary = self.python(fast=fast, fileReport=fileReport)
76 
77  json.dump(self._dataDictionary, report, sort_keys = sort_keys, indent = indent)
78  except TypeError as e:
79  # TypeError means we had an unserialisable object - re-raise as a trf internal
80  message = 'TypeError raised during JSON report output: {0!s}'.format(e)
81  msg.error(message)
82  raise trfExceptions.TransformReportException(trfExit.nameToCode('TRF_INTERNAL_REPORT_ERROR'), message)
83 
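A minimal usage sketch; the filename is illustrative (jobReport.json is merely a conventional choice), the import path assumes the usual PyJobTransforms layout, and the exception type is the one raised in the code above:

 from PyJobTransforms import trfExceptions

 try:
     report.writeJSONReport('jobReport.json', sort_keys=True, indent=2)
 except trfExceptions.TransformReportException as e:
     # Raised when the python report contains something json cannot serialise.
     print('Report writing failed: {0}'.format(e))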

◆ writePilotPickleReport()

def python.trfReports.trfReport.writePilotPickleReport (   self,
  filename,
  fast = False,
  fileReport = defaultFileReport 
)
inherited

Definition at line 108 of file trfReports.py.

108  def writePilotPickleReport(self, filename, fast = False, fileReport = defaultFileReport):
109  with open(filename, 'w') as report:
110  if not self._dataDictionary:
111  self._dataDictionary = self.python(fast = fast, fileReport = fileReport)
112 
113  pickle.dump(self._dataDictionary, report)
114 
115 

◆ writeTxtReport()

def python.trfReports.trfReport.writeTxtReport (   self,
  filename,
  dumpEnv = True,
  fast = False,
  fileReport = defaultFileReport 
)
inherited

Definition at line 84 of file trfReports.py.

84  def writeTxtReport(self, filename, dumpEnv = True, fast = False, fileReport = defaultFileReport):
85  with open(filename, 'w') as report:
86  if not self._dataDictionary:
87  self._dataDictionary = self.python(fast = fast, fileReport = fileReport)
88 
89  print('# {0} file generated on'.format(self.__class__.__name__), isodate(), file=report)
90  print(pprint.pformat(self._dataDictionary), file=report)
91  if dumpEnv:
92  print('# Environment dump', file=report)
93  eKeys = list(os.environ)
94  eKeys.sort()
95  for k in eKeys:
96  print('%s=%s' % (k, os.environ[k]), file=report)
97  print('# Machine report', file=report)
98  print(pprint.pformat(machineReport().python(fast = fast)), file=report)
99 

Member Data Documentation

◆ _dataDictionary

python.trfReports.trfReport._dataDictionary
privateinherited

Definition at line 41 of file trfReports.py.

◆ _dbDataTotal

python.trfReports.trfJobReport._dbDataTotal
private

Definition at line 131 of file trfReports.py.

◆ _dbTimeTotal

python.trfReports.trfJobReport._dbTimeTotal
private

Definition at line 132 of file trfReports.py.

◆ _maxMsgLen

python.trfReports.trfJobReport._maxMsgLen
staticprivate

Definition at line 122 of file trfReports.py.

◆ _metadataKeyMap

python.trfReports.trfJobReport._metadataKeyMap
staticprivate

Definition at line 121 of file trfReports.py.

◆ _precisionDigits

python.trfReports.trfJobReport._precisionDigits
private

Definition at line 130 of file trfReports.py.

◆ _reportVersion

python.trfReports.trfJobReport._reportVersion
staticprivate

This is the version counter for transform job reports; any changes to the format must be reflected by incrementing this.

Definition at line 120 of file trfReports.py.

◆ _trf

python.trfReports.trfJobReport._trf
private

Definition at line 129 of file trfReports.py.

◆ _truncationMsg

python.trfReports.trfJobReport._truncationMsg
staticprivate

Definition at line 123 of file trfReports.py.


The documentation for this class was generated from the following file:
trfReports.py