ATLAS Offline Software
python.trfArgClasses.argBZ2File Class Reference

TarBZ filetype.


Public Member Functions

def prodsysDescription (self)

def io (self)

def io (self, value)

def dataset (self)

def dataset (self, value)

def value (self)
 Argument value getter.

def value (self, value)
 Argument value setter.

def multipleOK (self)
 multipleOK getter.

def multipleOK (self, value)
 multipleOK value setter.

def mergeTargetSize (self)
 mergeTargetSize value getter.

def mergeTargetSize (self, value)
 mergeTargetSize value setter.

def executor (self)
 Executor status getter.

def valueSetter (self, value)
 Set the argFile value, but allow parameters here.

def orignalName (self)

def originalName (self, value)

def type (self)

def type (self, value)

def subtype (self)

def subtype (self, value)

def name (self)
 Name getter.

def name (self, value)
 Name setter.

def auxiliaryFile (self)

def metadata (self)
 Returns the complete metadata dictionary.

def nentries (self)
 Return total number of events in all constituent files.

def getnentries (self, fast=False)
 Explicit getter, offering fast switch.

def getMetadata (self, files=None, metadataKeys=None, maskMetadataKeys=None, populate=True, flush=False)
 Return specific keys for specific files.

def getSingleMetadata (self, fname, metadataKey, populate=True, flush=False)
 Convenience function to extract a single metadata key for a single file.

def isCached (self, files=None, metadataKeys=None)
 Test if certain metadata elements are already cached.

def __str__ (self)
 String representation of a file argument.

def append (self, addme)
 Append a value to the list.

def __repr__ (self)
 Repr conversion.

def isRunarg (self)
 Return runarg status.

def __eq__ (self, other)
 Comparison is based on value attribute.

def __nq__ (self, other)

def __lt__ (self, other)

def __gt__ (self, other)
 

Public Attributes

 io
 
 dataset
 
 value
 

Private Member Functions

def _getIntegrity (self, files)
 File integrity checker.

def _resetMetadata (self, files=[])
 Resets the metadata of all files in this instance.

def _readMetadata (self, files, metadataKeys)
 Check metadata is in the cache or generate it if it's missing.

def _setMetadata (self, files=None, metadataKeys={})
 Set metadata values into the cache.

def _getDatasetFromFilename (self, reset=False)
 Look for the dataset name using the dataset#filename Tier-0 convention.

def _getSize (self, files)
 Determines the size of files.

def _generateGUID (self, files)
 Generate a GUID on demand, as this file type has no intrinsic GUID.

def _exists (self, files)
 Try to determine if a file actually exists.

def _mergeArgs (self, argdict, copyArgs=None)
 Utility to strip arguments which should not be passed to the selfMerge methods of our child classes.
 

Private Attributes

 _dataset
 
 _urlType
 
 _type
 
 _subtype
 
 _guid
 
 _mergeTargetSize
 
 _auxiliaryFile
 
 _originalName
 
 _exe
 
 _metadataKeys
 
 _fileMetadata
 
 _io
 Input file globbing and expansion.
 
 _multipleOK
 
 _value
 
 _name
 
 _splitter
 
 _supressEmptyStrings
 
 _runarg
 

Detailed Description

TarBZ filetype.

Definition at line 1798 of file trfArgClasses.py.

Member Function Documentation

◆ __eq__()

def python.trfArgClasses.argument.__eq__ (   self,
  other 
)
inherited

Comparison is based on value attribute.

Definition at line 161 of file trfArgClasses.py.

161  def __eq__(self,other):
162  return self.value == other.value
163 

◆ __gt__()

def python.trfArgClasses.argument.__gt__ (   self,
  other 
)
inherited

Definition at line 170 of file trfArgClasses.py.

170  def __gt__(self, other):
171  return self.value > other.value
172 

◆ __lt__()

def python.trfArgClasses.argument.__lt__ (   self,
  other 
)
inherited

Definition at line 167 of file trfArgClasses.py.

167  def __lt__(self, other):
168  return self.value < other.value
169 

◆ __nq__()

def python.trfArgClasses.argument.__nq__ (   self,
  other 
)
inherited

Definition at line 164 of file trfArgClasses.py.

164  def __nq__(self, other):
165  return self.value != other.value
166 
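The four comparison methods above all delegate to the value attribute. One detail is worth illustrating: Python looks up `__ne__` for the `!=` operator, so a method named `__nq__` is never called by it. In Python 3 the default `__ne__` inverts `__eq__`, so `!=` still behaves as expected. The class `Arg` below is a hypothetical stand-in written for this sketch, not part of trfArgClasses.

```python
class Arg:
    """Hypothetical stand-in that mirrors the comparison methods listed above."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __nq__(self, other):
        # Not a special method name: the != operator never calls this.
        raise AssertionError("never reached via the != operator")

    def __lt__(self, other):
        return self.value < other.value

    def __gt__(self, other):
        return self.value > other.value


a, b = Arg(["x.bz2"]), Arg(["y.bz2"])
same = Arg(["x.bz2"])
# != falls back to the inverse of __eq__ (Python 3 default __ne__).
results = (a == same, a != b, a < b, b > a)
```

Because the values are lists of file names, ordering follows Python's element-wise list comparison.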

◆ __repr__()

def python.trfArgClasses.argList.__repr__ (   self)
inherited

Repr conversion.

Return a python parsable string

Reimplemented from python.trfArgClasses.argument.

Definition at line 409 of file trfArgClasses.py.

409  def __repr__(self):
410  return '[' + ','.join([ repr(s) for s in self._value ]) + ']'
411 
412 
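As a small illustration of "Python parsable", the same expression can be applied to a plain list and read back with `ast.literal_eval`. The helper name `list_repr` and the file names are made up for this sketch.

```python
import ast


def list_repr(values):
    """Same expression as the __repr__ listing above, applied to a plain list."""
    return '[' + ','.join([repr(s) for s in values]) + ']'


text = list_repr(["a.tar.bz2", "b.tar.bz2"])
# The string is a valid Python list literal, so it can be parsed back safely.
round_trip = ast.literal_eval(text)
```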

◆ __str__()

def python.trfArgClasses.argFile.__str__ (   self)
inherited

String representation of a file argument.

Reimplemented from python.trfArgClasses.argList.

Definition at line 1216 of file trfArgClasses.py.

1216  def __str__(self):
1217  return "{0}={1} (Type {2}, Dataset {3}, IO {4})".format(self.name, self.value, self.type, self.dataset, self.io)
1218 
1219 

◆ _exists()

def python.trfArgClasses.argFile._exists (   self,
  files 
)
privateinherited

Try to determine if a file actually exists...

For a POSIX file, just call stat; for anything else, call TFile.Open. A small optimisation is to retrieve the file_size metadatum at the same time.

Definition at line 1180 of file trfArgClasses.py.

1180  def _exists(self, files):
1181      import re
1182      msg.debug('Testing existance for {0}'.format(files))
1183      def split_filelist(fn):
1184          if self.io != 'output':
1185              return [fn]
1186          file_split_regex = re.compile(r"(.+)\[(.+)](.+)")
1187          if ('[' in fn) and (']' in fn):
1188              match = file_split_regex.match(fn)
1189              return [f"{match.group(1)}{it}{match.group(3)}" for it in match.group(2).split(',')]
1190          else:
1191              return [fn]
1192      for fname in files:
1193          file_list = split_filelist(fname)
1194          if self._urlType == 'posix':
1195              try:
1196                  size = map(lambda fn: os.stat(fn).st_size, file_list)
1197                  self._fileMetadata[fname]['file_size'] = sum(size)
1198                  self._fileMetadata[fname]['_exists'] = True
1199                  msg.debug('POSIX file {0} exists (or all elements of list)'.format(fname))
1200              except OSError as e:
1201                  msg.error('Got exception {0!s} raised while stating file {1} (or some element of list) - probably it does not exist'.format(e, fname))
1202                  self._fileMetadata[fname]['_exists'] = False
1203          else:
1204              # OK, let's see if ROOT can do it...
1205              msg.debug('Calling ROOT TFile.GetSize on {0} (or elements of list)'.format(fname))
1206              size = map(ROOTGetSize, file_list)
1207              if None in size:
1208                  self._fileMetadata[fname]['_exists'] = False
1209                  msg.error('Non-POSIX file {0} (or element of list) could not be opened - probably it does not exist'.format(fname))
1210              else:
1211                  msg.debug('Non-POSIX file {0} (or all elements of list) exists'.format(fname))
1212                  self._fileMetadata[fname]['file_size'] = sum(size)
1213                  self._fileMetadata[fname]['_exists'] = True
1214 
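The bracket expansion used for output file names can be tried in isolation. The sketch below is a standalone approximation: `expand_bracket_list` and `total_size` are names invented here, the check on self.io is left out, and a name whose brackets do not match the pattern is returned unchanged (an assumption of this sketch). `total_size` also shows one way to test an iterable for None and sum it, since in Python 3 a `map` object is consumed by an `in` test and a later `sum` would then see no items.

```python
import re

_SPLIT = re.compile(r"(.+)\[(.+)](.+)")


def expand_bracket_list(fn):
    """Expand 'pre[a,b]post' into ['preapost', 'prebpost']; otherwise return [fn]."""
    match = _SPLIT.match(fn) if ('[' in fn and ']' in fn) else None
    if match is None:
        return [fn]
    return [f"{match.group(1)}{it}{match.group(3)}" for it in match.group(2).split(',')]


def total_size(sizes):
    """Return the summed size, or None if any element is None."""
    sizes = list(sizes)  # materialise first so the iterable can be scanned twice
    if None in sizes:
        return None
    return sum(sizes)
```

For example, `expand_bracket_list("out._[0001,0002].root")` yields the two file names `out._0001.root` and `out._0002.root`.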

◆ _generateGUID()

def python.trfArgClasses.argFile._generateGUID (   self,
  files 
)
privateinherited

Generate a GUID on demand, as this file type has no intrinsic GUID.

Uses uuid.uuid4() to generate the GUID.

Note
This generation method is superseded in any file type that has an intrinsic GUID (e.g. BS or POOL files).

Definition at line 1169 of file trfArgClasses.py.

1169  def _generateGUID(self, files):
1170      for fname in files:
1171          msg.debug('Generating a GUID for file {0}'.format(fname))
1172          self._fileMetadata[fname]['file_guid'] = str(uuid.uuid4()).upper()
1173 
1174 
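A minimal standalone sketch of the same convention (a random version-4 UUID, rendered as an upper-case string) is shown below. The function name `generate_guids` and the file names are illustrative only.

```python
import uuid


def generate_guids(files):
    """Return {filename: GUID}, each GUID an upper-case uuid4 string."""
    return {fname: str(uuid.uuid4()).upper() for fname in files}


guids = generate_guids(["a.tar.bz2", "b.tar.bz2"])
```

Each call produces fresh random values, so GUIDs generated this way are not reproducible between runs.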

◆ _getDatasetFromFilename()

def python.trfArgClasses.argFile._getDatasetFromFilename (   self,
  reset = False 
)
privateinherited

Look for the dataset name using the dataset#filename Tier-0 convention.

At the moment all files must be in the same dataset if one is specified. (To change this, the dataset would need to become a per-file metadatum.)

Note
dsn#lfn notation must be used for all input values, and all dsn values must be the same.

Definition at line 1093 of file trfArgClasses.py.

1093  def _getDatasetFromFilename(self, reset = False):
1094      if reset:
1095          self._dataset = None
1096      newValue = []
1097      for filename in self._value:
1098          if filename.find('#') > -1:
1099              (dataset, fname) = filename.split('#', 1)
1100              newValue.append(fname)
1101              msg.debug('Current dataset: {0}; New dataset {1}'.format(self._dataset, dataset))
1102              if self._dataset and (self._dataset != dataset):
1103                  raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_DATASET'),
1104                                                            'Found inconsistent dataset assignment in argFile setup: %s != %s' % (self._dataset, dataset))
1105              self._dataset = dataset
1106      if len(newValue) == 0:
1107          return
1108      elif len(newValue) != len (self._value):
1109          raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_DATASET'),
1110                                                    'Found partial dataset assignment in argFile setup from {0} (dsn#lfn notation must be uniform for all inputs)'.format(self._value))
1111      self._value = newValue
1112 
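The splitting rules in the listing can be sketched as a pure function. This is an illustration under assumptions: `split_dataset` is a made-up name, and `ValueError` stands in for the TransformArgException raised by the real method.

```python
def split_dataset(values, dataset=None):
    """Strip a 'dsn#' prefix from every entry and return (dataset, filenames).

    Raises ValueError (a stand-in for TransformArgException) when datasets
    disagree or when only some entries carry a prefix.
    """
    stripped = []
    for filename in values:
        if '#' in filename:
            dsn, fname = filename.split('#', 1)
            if dataset and dataset != dsn:
                raise ValueError(f"inconsistent dataset: {dataset} != {dsn}")
            dataset = dsn
            stripped.append(fname)
    if not stripped:
        # No entry used dsn#lfn notation: leave the values untouched.
        return dataset, list(values)
    if len(stripped) != len(values):
        raise ValueError("dsn#lfn notation must be uniform for all inputs")
    return dataset, stripped
```

Note that only the first `#` separates the dataset from the file name, so a file name may itself contain further `#` characters.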

◆ _getIntegrity()

def python.trfArgClasses.argBZ2File._getIntegrity (   self,
  files 
)
private

File integrity checker.

The integrity check reads each file to the end through bz2.BZ2File in 1 MiB chunks; a file passes if it can be read without error.

Reimplemented from python.trfArgClasses.argFile.

Definition at line 1799 of file trfArgClasses.py.

1799  def _getIntegrity(self, files):
1800      for fname in files:
1801          # bz2 only supports 'with' from python 2.7
1802          try:
1803              f = bz2.BZ2File(fname, 'r')
1804              while True:
1805                  chunk = len(f.read(1024*1024))
1806                  msg.debug('Read {0} bytes from {1}'.format(chunk, fname))
1807                  if chunk == 0:
1808                      break
1809              self._fileMetadata[fname]['integrity'] = True
1810              f.close()
1811          except OSError as e:
1812              msg.error('Got exception {0!s} raised while checking integrity of file {1}'.format(e, fname))
1813              self._fileMetadata[fname]['integrity'] = False
1814 
1815 
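The same read-to-the-end check can be written with a context manager so the file is closed even when reading fails. The sketch below is not the ATLAS implementation: `bz2_is_readable` is a hypothetical helper, and catching `EOFError` is an addition of this sketch, made because Python's bz2 module raises it when a compressed stream ends prematurely.

```python
import bz2


def bz2_is_readable(path, chunk_size=1024 * 1024):
    """Return True if the whole bz2 stream at `path` decompresses cleanly."""
    try:
        with bz2.BZ2File(path, 'r') as f:
            # Read until an empty chunk signals the end of the stream.
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError):
        # OSError: missing file or invalid data; EOFError: truncated stream.
        return False
```

Reading in fixed-size chunks keeps memory use bounded regardless of the decompressed size.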

◆ _getSize()

def python.trfArgClasses.argFile._getSize (   self,
  files 
)
privateinherited

Determines the size of files.

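For POSIX files the size is obtained with os.stat; for any other URL type ROOTGetSize is called. The result is cached under the 'size' key of self._fileMetadata.

Parameters
files: List of paths to the files for which the size is determined.
Returns
None (internal self._fileMetadata cache is updated)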

Definition at line 1117 of file trfArgClasses.py.

1117  def _getSize(self, files):
1118      for fname in files:
1119          if self._urlType == 'posix':
1120              try:
1121                  self._fileMetadata[fname]['size'] = os.stat(fname).st_size
1122              except OSError as e:
1123                  msg.error('Got exception {0!s} raised while stating file {1}'.format(e, fname))
1124                  self._fileMetadata[fname]['size'] = None
1125          else:
1126              # OK, let's see if ROOT can do it...
1127              msg.debug('Calling ROOT TFile.GetSize({0})'.format(fname))
1128              self._fileMetadata[fname]['size'] = ROOTGetSize(fname)
1129 
1130 

◆ _mergeArgs()

def python.trfArgClasses.argFile._mergeArgs (   self,
  argdict,
  copyArgs = None 
)
privateinherited

Utility to strip arguments which should not be passed to the selfMerge methods of our child classes.

Parameters
copyArgs: If None, copy all arguments by default; otherwise copy only the listed keys

Definition at line 1224 of file trfArgClasses.py.

1224  def _mergeArgs(self, argdict, copyArgs=None):
1225  if copyArgs:
1226  myargdict = {}
1227  for arg in copyArgs:
1228  if arg in argdict:
1229  myargdict[arg] = copy.copy(argdict[arg])
1230 
1231  else:
1232  myargdict = copy.copy(argdict)
1233  # Never do event count checks for self merging
1234  myargdict['checkEventCount'] = argSubstepBool('False', runarg=False)
1235  newopts = []
1236  if 'athenaopts' in myargdict:
1237  # Need to ensure that "nprocs" is not passed to merger
1238  # and prevent multiple '--threads' options when there are multiple sub-steps in 'athenopts'
1239  for subStep in myargdict['athenaopts'].value:
1240  hasNprocs = False
1241  hasNthreads = False
1242  for opt in myargdict['athenaopts'].value[subStep]:
1243  if opt.startswith('--nprocs'):
1244  hasNprocs = True
1245  continue
1246  # Keep at least one '--threads'
1247  elif opt.startswith('--threads'):
1248  hasNthreads = True
1249  if opt in newopts:
1250  continue
1251  newopts.append(opt)
1252  # If we have hybrid MP+MT job make sure --threads is not passed to merger
1253  if hasNprocs and hasNthreads:
1254  tmpopts = []
1255  for opt in newopts:
1256  if opt.startswith('--threads'):
1257  continue
1258  tmpopts.append(opt)
1259  newopts = tmpopts
1260  myargdict['athenaopts'] = argSubstepList(newopts, runarg=False)
1261  return myargdict
1262 
1263 
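The option filtering in the second half of the listing can be sketched on plain data. Because the listing has lost its indentation, the nesting below is an assumption of this sketch, and `filter_merge_opts` is a made-up name that takes a plain dictionary of sub-step name to option list instead of an argSubstepList.

```python
def filter_merge_opts(opts_by_substep):
    """Flatten per-substep athena options for a merge job.

    '--nprocs' is always dropped, a repeated '--threads' option is kept once,
    and '--threads' is dropped as well once a substep carries both flags.
    """
    newopts = []
    for opts in opts_by_substep.values():
        has_nprocs = has_nthreads = False
        for opt in opts:
            if opt.startswith('--nprocs'):
                has_nprocs = True
                continue
            if opt.startswith('--threads'):
                has_nthreads = True
                if opt in newopts:
                    continue
            newopts.append(opt)
        # Hybrid MP+MT substep: do not pass --threads to the merger either.
        if has_nprocs and has_nthreads:
            newopts = [o for o in newopts if not o.startswith('--threads')]
    return newopts
```

Only `--threads` options are de-duplicated in this reading; any other option repeated across sub-steps appears once per sub-step in the result.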

◆ _readMetadata()

def python.trfArgClasses.argFile._readMetadata (   self,
  files,
  metadataKeys 
)
privateinherited

Check metadata is in the cache or generate it if it's missing.

Returns
Nothing is returned directly; the self._fileMetadata cache is updated in place. Unknown keys are set to 'UNDEFINED', and keys for files that do not exist are set to None.

Definition at line 1000 of file trfArgClasses.py.

1000  def _readMetadata(self, files, metadataKeys):
1001  msg.debug('Retrieving metadata keys {1!s} for files {0!s}'.format(files, metadataKeys))
1002  for fname in files:
1003  if fname not in self._fileMetadata:
1004  self._fileMetadata[fname] = {}
1005  for fname in files:
1006  # Always try for a simple existence test first before producing misleading error messages
1007  # from metadata populator functions
1008  if '_exists' not in self._fileMetadata[fname]:
1009  self._metadataKeys['_exists'](files)
1010  if self._fileMetadata[fname]['_exists'] is False:
1011  # N.B. A log ERROR message has printed by the existence test, so do not repeat that news here
1012  for key in metadataKeys:
1013  if key != '_exists':
1014  self._fileMetadata[fname][key] = None
1015  else:
1016  # OK, file seems to exist at least...
1017  for key in metadataKeys:
1018  if key not in self._metadataKeys:
1019  msg.debug('Metadata key {0} is unknown for {1}'.format(key, self.__class__.__name__))
1020  self._fileMetadata[fname][key] = 'UNDEFINED'
1021  else:
1022  if key in self._fileMetadata[fname]:
1023  msg.debug('Found cached value for {0}:{1} = {2!s}'.format(fname, key, self._fileMetadata[fname][key]))
1024  else:
1025  msg.debug('No cached value for {0}:{1}. Calling generator function {2} ({3})'.format(fname, key, self._metadataKeys[key].__name__, self._metadataKeys[key]))
1026  try:
1027  # For efficiency call this routine with all files we have
1028  msg.info("Metadata generator called to obtain {0} for {1}".format(key, files))
1029  self._metadataKeys[key](files)
1030  except trfExceptions.TransformMetadataException as e:
1031  msg.error('Calling {0!s} raised an exception: {1!s}'.format(self._metadataKeys[key].__name__, e))
1032  if key not in self._fileMetadata[fname]:
1033  msg.warning('Call to function {0} for {1} file {2} failed to populate metadata key {3}'.format(self._metadataKeys[key].__name__, self.__class__.__name__, fname, key))
1034  self._fileMetadata[fname][key] = None
1035  msg.debug('Now have {0}:{1} = {2!s}'.format(fname, key, self._fileMetadata[fname][key]))
1036 
1037 
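The cache-or-generate pattern in the listing can be reduced to a few lines. The sketch below is a simplification under assumptions: `read_metadata` is a made-up function, the existence test and all logging are omitted, and generators are plain callables that write into the cache for every file they are given.

```python
def read_metadata(cache, generators, files, keys):
    """Fill cache[fname][key] for every file/key pair, calling generators lazily."""
    for fname in files:
        entry = cache.setdefault(fname, {})
        for key in keys:
            if key not in generators:
                entry[key] = 'UNDEFINED'     # key unknown for this file type
            elif key not in entry:
                generators[key](files)       # one call fills the cache for all files
                entry.setdefault(key, None)  # generator did not populate: record None
    return cache
```

Calling the generator with the full file list means that a well-behaved generator runs once per key rather than once per file.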

◆ _resetMetadata()

def python.trfArgClasses.argFile._resetMetadata (   self,
  files = [] 
)
privateinherited

Resets the metadata of all files in this instance.

Metadata dictionary entry is reset for any files given (default all files) and any files that are no longer in this instance have any metadata removed (useful for self merging).

Note
Metadata is set to {}, except for the case when an explicit GUID option was given

Definition at line 917 of file trfArgClasses.py.

917  def _resetMetadata(self, files=[]):
918  if files == [] or '_fileMetadata' not in dir(self):
919  self._fileMetadata = {}
920  for fname in self.value:
921  self._fileMetadata[fname] = {}
922  else:
923  for fname in files:
924  if fname in self.value:
925  self._fileMetadata[fname] = {}
926  elif fname in self._fileMetadata:
927  del self._fileMetadata[fname]
928  msg.debug('Metadata dictionary now {0}'.format(self._fileMetadata))
929 
930  # If we have the special guid option, then manually try to set GUIDs we find
931  if self._guid is not None:
932  msg.debug('Now trying to set file GUID metadata using {0}'.format(self._guid))
933  for fname, guid in self._guid.items():
934  if fname in self._value:
935  self._fileMetadata[fname]['file_guid'] = guid
936  else:
937  msg.warning('Explicit GUID {0} was passed for file {1}, but this file is not a member of this instance'.format(guid, fname))
938 

◆ _setMetadata()

def python.trfArgClasses.argFile._setMetadata (   self,
  files = None,
  metadataKeys = {} 
)
privateinherited

Set metadata values into the cache.

Manually sets the metadata cache values to the values given in the metadata key dictionary here. This is useful for setting values to make checks on file metadata handling.

Note
To really suppress any external function calls that gather metadata be careful to also set the _exists metadatum to True.
Warning
No checks are done on the values or keys given here, so you'd better know what you are doing.
Parameters
files: Files to set metadata for (None means "all")
metadataKeys: Dictionary with metadata keys and values

Definition at line 1048 of file trfArgClasses.py.

1048  def _setMetadata(self, files=None, metadataKeys={}):
1049  if files is None:
1050  files = self._value
1051  for fname in files:
1052  if fname not in self._fileMetadata:
1053  self._fileMetadata[fname] = {}
1054  for k, v in metadataKeys.items():
1055  msg.debug('Manualy setting {0} for file {1} to {2}'.format(k, fname, v))
1056  self._fileMetadata[fname][k] = v
1057 
1058 

◆ append()

def python.trfArgClasses.argList.append (   self,
  addme 
)
inherited

Append a value to the list.

Parameters
addme: Item to add

Definition at line 398 of file trfArgClasses.py.

398  def append(self, addme):
399  self._value.append(addme)
400 

◆ auxiliaryFile()

def python.trfArgClasses.argFile.auxiliaryFile (   self)
inherited

Definition at line 878 of file trfArgClasses.py.

878  def auxiliaryFile(self):
879  return self._auxiliaryFile
880 

◆ dataset() [1/2]

def python.trfArgClasses.argFile.dataset (   self)
inherited

Definition at line 818 of file trfArgClasses.py.

818  def dataset(self):
819  return self._dataset
820 

◆ dataset() [2/2]

def python.trfArgClasses.argFile.dataset (   self,
  value 
)
inherited

Definition at line 822 of file trfArgClasses.py.

822  def dataset(self, value):
823  self._dataset = value
824 

◆ executor()

def python.trfArgClasses.argFile.executor (   self)
inherited

Executor status getter.

Definition at line 638 of file trfArgClasses.py.

638  def executor(self):
639  return self._exe
640 

◆ getMetadata()

def python.trfArgClasses.argFile.getMetadata (   self,
  files = None,
  metadataKeys = None,
  maskMetadataKeys = None,
  populate = True,
  flush = False 
)
inherited

Return specific keys for specific files.

Parameters
files: List of files to return metadata for (default: all files in this instance)
metadataKeys: Keys to return (default: all keys valid for this class of files)
maskMetadataKeys: Keys to NOT return (useful when metadataKeys is left as default)
populate: Whether missing keys should be generated by calling the population subroutines
flush: Whether cached data should be flushed and the generators rerun

Definition at line 945 of file trfArgClasses.py.

945  def getMetadata(self, files = None, metadataKeys = None, maskMetadataKeys = None, populate = True, flush = False):
946  # Normalise the files and keys parameter
947  if files is None:
948  files = self._value
949  elif isinstance(files, str):
950  files = (files,)
951  msg.debug('getMetadata will examine these files: {0!s}'.format(files))
952 
953  if metadataKeys is None:
954  metadataKeys = list(self._metadataKeys)
955  elif isinstance(metadataKeys, str):
956  metadataKeys = [metadataKeys,]
957  if maskMetadataKeys is not None:
958  metadataKeys = [k for k in metadataKeys if k not in maskMetadataKeys]
959  msg.debug('getMetadata will retrieve these keys: {0!s}'.format(metadataKeys))
960 
961  if flush is True:
962  msg.debug('Flushing cached metadata values')
963  self._resetMetadata()
964 
965  if populate is True:
966  msg.debug('Checking metadata values')
967  self._readMetadata(files, metadataKeys)
968 
969  metadata = {}
970  for fname in files:
971  metadata[fname] = {}
972  for mdkey in metadataKeys:
973  try:
974  metadata[fname][mdkey] = self._fileMetadata[fname][mdkey]
975  except KeyError:
976  # This should not happen, unless we skipped populating
977  if populate:
978  msg.error('Did not find metadata key {0!s} for file {1!s} - setting to None'.format(mdkey, fname))
979  metadata[fname][mdkey] = None
980  return metadata
981 
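The first part of the listing normalises its arguments so that a single string, a sequence, or None are all accepted. A standalone sketch of that step follows. The function `normalise_request` and its `all_files` and `all_keys` parameters are inventions of this sketch, standing in for self._value and self._metadataKeys.

```python
def normalise_request(files, metadata_keys, mask_keys, all_files, all_keys):
    """Return (files, keys) in the form the rest of getMetadata iterates over."""
    if files is None:
        files = all_files
    elif isinstance(files, str):
        files = (files,)
    if metadata_keys is None:
        metadata_keys = list(all_keys)
    elif isinstance(metadata_keys, str):
        metadata_keys = [metadata_keys]
    if mask_keys is not None:
        # Masked keys are removed after defaults have been filled in.
        metadata_keys = [k for k in metadata_keys if k not in mask_keys]
    return files, metadata_keys
```

Wrapping a lone string matters because iterating over a bare string would otherwise visit it character by character.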

◆ getnentries()

def python.trfArgClasses.argFile.getnentries (   self,
  fast = False 
)
inherited

Explicit getter, offering fast switch.

Definition at line 894 of file trfArgClasses.py.

894  def getnentries(self, fast=False):
895      totalEvents = 0
896      for fname in self._value:
897          events = self.getSingleMetadata(fname=fname, metadataKey='nentries', populate = not fast)
898          if events is None:
899              msg.debug('Got events=None for file {0} - returning None for this instance'.format(fname))
900              return None
901          if events == 'UNDEFINED':
902              msg.debug('Got events=UNDEFINED for file {0} - returning UNDEFINED for this instance'.format(fname))
903              return 'UNDEFINED'
904          if not isinstance(events, int):
905              msg.warning('Got unexpected events metadata for file {0}: {1!s} - returning None for this instance'.format(fname, events))
906              return None
907          totalEvents += events
908
909      return totalEvents
910 
911 
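The aggregation rule in the listing is that the first file without a usable integer count decides the result for the whole instance. A minimal sketch over a plain list of per-file counts (the name `total_entries` is hypothetical):

```python
def total_entries(per_file_counts):
    """Sum per-file event counts; None or 'UNDEFINED' short-circuits the total."""
    total = 0
    for events in per_file_counts:
        if events is None:
            return None
        if events == 'UNDEFINED':
            return 'UNDEFINED'
        if not isinstance(events, int):
            return None  # unexpected metadata type
        total += events
    return total
```

Callers therefore need to handle three kinds of result: an integer, None, and the string 'UNDEFINED'.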

◆ getSingleMetadata()

def python.trfArgClasses.argFile.getSingleMetadata (   self,
  fname,
  metadataKey,
  populate = True,
  flush = False 
)
inherited

Convenience function to extract a single metadata key for a single file.

Retrieves a single metadata item for a single file, returning it directly

Returns
Single metadata value
Parameters
fname: File to return metadata for
metadataKey: Key to return
populate: Whether a missing key should be generated by calling the population subroutines
flush: Whether cached data should be flushed and the generator rerun

Definition at line 989 of file trfArgClasses.py.

989  def getSingleMetadata(self, fname, metadataKey, populate = True, flush = False):
990  if not (isinstance(fname, str) and isinstance(metadataKey, str)):
991  raise trfExceptions.TransformInternalException(trfExit.nameToCode('TRF_INTERNAL'),
992  'Illegal call to getSingleMetadata function: {0!s} {1!s}'.format(fname, metadataKey))
993  md = self.getMetadata(files = fname, metadataKeys = metadataKey, populate = populate, flush = flush)
994  return md[fname][metadataKey]
995 
996 

◆ io() [1/2]

def python.trfArgClasses.argFile.io (   self)
inherited

Definition at line 807 of file trfArgClasses.py.

807  def io(self):
808  return (self._io)
809 

◆ io() [2/2]

def python.trfArgClasses.argFile.io (   self,
  value 
)
inherited

Definition at line 811 of file trfArgClasses.py.

811  def io(self, value):
812  if value not in ('input', 'output', 'temporary'):
813  raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_RUNTIME_ERROR'),
814  'File arguments must be specified as input, output or temporary - got {0}'.format(value))
815  self._io = value
816 

◆ isCached()

def python.trfArgClasses.argFile.isCached (   self,
  files = None,
  metadataKeys = None 
)
inherited

Test if certain metadata elements are already cached.

Will test for a cached value for all files and all keys given, aborting as soon as it finds a single uncached value.

Parameters
files: Files to check (defaults to all files)
metadataKeys: Keys to check (defaults to all keys)
Returns
True if all keys are cached for all files, False otherwise

Definition at line 1065 of file trfArgClasses.py.

1065  def isCached(self, files = None, metadataKeys = None):
1066  msg.debug('Testing for cached values for files {0} and keys {1}'.format(files, metadataKeys))
1067  if files is None:
1068  files = self._value
1069  elif isinstance(files, str):
1070  files = (files,)
1071  if metadataKeys is None:
1072  metadataKeys = list(self._metadataKeys)
1073  elif isinstance(metadataKeys, str):
1074  metadataKeys = (metadataKeys,)
1075 
1076  isCachedFlag = True
1077  for fname in files:
1078  for key in metadataKeys:
1079  if key not in self._fileMetadata[fname]:
1080  isCachedFlag = False
1081  break
1082  if isCachedFlag is False:
1083  break
1084 
1085  return isCachedFlag
1086 
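The nested loop with early exit in the listing is equivalent to a single `all()` over every file and key pair. The sketch below assumes a plain dictionary in place of self._fileMetadata; `is_cached` is a made-up name. As in the listing, a file that is absent from the cache raises KeyError instead of returning False.

```python
def is_cached(file_metadata, files, keys):
    """True only if every key is present for every file."""
    # all() stops at the first missing key, like the break statements in the listing.
    return all(key in file_metadata[fname] for fname in files for key in keys)
```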

◆ isRunarg()

def python.trfArgClasses.argument.isRunarg (   self)
inherited

Return runarg status.

Definition at line 134 of file trfArgClasses.py.

134  def isRunarg(self):
135  return self._runarg
136 

◆ mergeTargetSize() [1/2]

def python.trfArgClasses.argFile.mergeTargetSize (   self)
inherited

mergeTargetSize value getter

Definition at line 613 of file trfArgClasses.py.

613  def mergeTargetSize(self):
614  return self._mergeTargetSize
615 

◆ mergeTargetSize() [2/2]

def python.trfArgClasses.argFile.mergeTargetSize (   self,
  value 
)
inherited

mergeTargetSize value setter

Definition at line 618 of file trfArgClasses.py.

618  def mergeTargetSize(self, value):
619  if value is None:
620  self._mergeTargetSize = 0
621  else:
622  self._mergeTargetSize = value
623 

◆ metadata()

def python.trfArgClasses.argFile.metadata (   self)
inherited

Returns the complete metadata dictionary.

Note
Populates the whole metadata dictionary for this instance

Definition at line 884 of file trfArgClasses.py.

884  def metadata(self):
885  self.getMetadata()
886  return self._fileMetadata
887 

◆ multipleOK() [1/2]

def python.trfArgClasses.argFile.multipleOK (   self)
inherited

multipleOK getter

Returns
Current value

Definition at line 603 of file trfArgClasses.py.

603  def multipleOK(self):
604  return self._multipleOK
605 

◆ multipleOK() [2/2]

def python.trfArgClasses.argFile.multipleOK (   self,
  value 
)
inherited

multipleOK value setter

Definition at line 608 of file trfArgClasses.py.

608  def multipleOK(self, value):
609  self._multipleOK = value
610 

◆ name() [1/2]

def python.trfArgClasses.argFile.name (   self)
inherited

Name getter.

Reimplemented from python.trfArgClasses.argument.

Definition at line 851 of file trfArgClasses.py.

851  def name(self):
852  return self._name
853 

◆ name() [2/2]

def python.trfArgClasses.argFile.name (   self,
  value 
)
inherited

Name setter.

Note
This property setter will also set the type and subtype of the argFile if they are not yet set. This means that for most arguments the type and subtype are automatically set correctly.

Reimplemented from python.trfArgClasses.argument.

Definition at line 859 of file trfArgClasses.py.

859  def name(self, value):
860      self._name = value
861      m = re.match(r'(input|output|tmp.)([A-Za-z0-9_]+?)(File)?$', value)
862      if m:
863          msg.debug("ArgFile name setter matched this: {0}".format(m.groups()))
864          if self._type is None:
865              dtype = m.group(2).split('_', 1)[0]
866              # But DRAW/DESD/DAOD are really just RAW, ESD, AOD in format
867              if re.match(r'D(RAW|ESD|AOD)', dtype):
868                  dtype = dtype[1:]
869              msg.debug("Autoset data type to {0}".format(dtype))
870              self._type = dtype
871          if self._subtype is None:
872              msg.debug("Autoset data subtype to {0}".format(m.group(2)))
873              self._subtype = m.group(2)
874      else:
875          msg.debug("ArgFile name setter did not match against '{0}'".format(value))
876 
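To see what the regular expression extracts, the type and subtype inference can be run on its own. The sketch reuses the two patterns from the listing; the function name `infer_types` and the argument names in the usage note are illustrative, not taken from a real transform.

```python
import re

_NAME = re.compile(r'(input|output|tmp.)([A-Za-z0-9_]+?)(File)?$')


def infer_types(name):
    """Return (type, subtype) inferred from an argument name, or None if no match."""
    m = _NAME.match(name)
    if not m:
        return None
    subtype = m.group(2)
    dtype = subtype.split('_', 1)[0]
    if re.match(r'D(RAW|ESD|AOD)', dtype):
        dtype = dtype[1:]  # DRAW/DESD/DAOD share the RAW/ESD/AOD format
    return dtype, subtype
```

With these patterns, a name such as `inputDRAW_ZMUMUFile` gives type `RAW` and subtype `DRAW_ZMUMU`, while a name without an input, output or tmp prefix gives no match.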

◆ nentries()

def python.trfArgClasses.argFile.nentries (   self)
inherited

Return total number of events in all constituent files.

Definition at line 890 of file trfArgClasses.py.

890  def nentries(self):
891  return self.getnentries()
892 

◆ originalName()

def python.trfArgClasses.argFile.originalName (   self,
  value 
)
inherited

Definition at line 830 of file trfArgClasses.py.

830  def originalName(self, value):
831  self._originalName = value
832 

◆ orignalName()

def python.trfArgClasses.argFile.orignalName (   self)
inherited

Definition at line 826 of file trfArgClasses.py.

826  def orignalName(self):
827  return self._originalName
828 

◆ prodsysDescription()

def python.trfArgClasses.argBZ2File.prodsysDescription (   self)

Reimplemented from python.trfArgClasses.argFile.

Reimplemented in python.trfArgClasses.argFTKIPFile.

Definition at line 1817 of file trfArgClasses.py.

1817  def prodsysDescription(self):
1818      desc=super(argBZ2File, self).prodsysDescription
1819      return desc
1820 
1821 

◆ subtype() [1/2]

def python.trfArgClasses.argFile.subtype (   self)
inherited

Definition at line 842 of file trfArgClasses.py.

842  def subtype(self):
843  return self._subtype
844 

◆ subtype() [2/2]

def python.trfArgClasses.argFile.subtype (   self,
  value 
)
inherited

Definition at line 846 of file trfArgClasses.py.

846  def subtype(self, value):
847  self._subtype = value
848 

◆ type() [1/2]

def python.trfArgClasses.argFile.type (   self)
inherited

Definition at line 834 of file trfArgClasses.py.

834  def type(self):
835  return self._type
836 

◆ type() [2/2]

def python.trfArgClasses.argFile.type (   self,
  value 
)
inherited

Definition at line 838 of file trfArgClasses.py.

838  def type(self, value):
839  self._type = value
840 

◆ value() [1/2]

def python.trfArgClasses.argFile.value (   self)
inherited

Argument value getter.

Returns
Current value

Reimplemented from python.trfArgClasses.argList.

Definition at line 591 of file trfArgClasses.py.

591  def value(self):
592  return self._value
593 

◆ value() [2/2]

def python.trfArgClasses.argFile.value (   self,
  value 
)
inherited

Argument value setter.

Calls the valueSetter function with the standard options

Reimplemented from python.trfArgClasses.argList.

Definition at line 597 of file trfArgClasses.py.

597  def value(self, value):
598  self.valueSetter(value)
599 

◆ valueSetter()

def python.trfArgClasses.argFile.valueSetter (   self,
  value 
)
inherited

Set the argFile value, but allow parameters here.

Note
Normally Athena takes only a single value for an output file, but when AthenaMP runs it can produce multiple output files; this is allowed by setting allowMultiOutputs = True.
The setter also protects against the same file being added multiple times.
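The duplicate protection mentioned in the note amounts to an order-preserving de-duplication of the file list. A minimal sketch, with `dedupe_keep_order` as a hypothetical helper name and without the warning messages the setter logs:

```python
def dedupe_keep_order(filenames):
    """Drop repeated filenames while keeping first-seen order."""
    seen = []
    for fname in filenames:
        if fname not in seen:
            seen.append(fname)
    return seen
```

Keeping the first-seen order matters because the first entry of the resulting list is later used to determine the URL type for the whole instance.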

Definition at line 645 of file trfArgClasses.py.

645     def valueSetter(self, value):
646 
647         if isinstance(value, (list, tuple)):
648             if len(value) > 0 and isinstance(value[0], dict): # Tier-0 style expanded argument with metadata
649                 self._value=[]
650                 for myfile in value:
651                     try:
652                         self._value.append(myfile['lfn'])
653                         self._resetMetadata(files = [myfile['lfn']])
654                     except KeyError:
655                         raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_CONV_FAIL'),
656                                                                   'Filename (key "lfn") not found in Tier-0 file dictionary: {0}'.format(myfile))
657                     for k, v in myfile.items():
658                         if k == 'guid':
659                             self._setMetadata([myfile['lfn']], {'file_guid': v})
660                         elif k == 'events':
661                             self._setMetadata([myfile['lfn']], {'nentries': v})
662                         elif k == 'checksum':
663                             self._setMetadata([myfile['lfn']], {'checksum': v})
664                         elif k == 'dsn':
665                             if not self._dataset:
666                                 self.dataset = v
667                             elif self.dataset != v:
668                                 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_DATASET'),
669                                                                           'Inconsistent dataset names in Tier-0 dictionary: {0} != {1}'.format(self.dataset, v))
670             else:
671                 self._value = list(value)
672                 self._getDatasetFromFilename(reset = False)
673                 self._resetMetadata()
674         elif value is None:
675             self._value = []
676             return
677         else:
678             try:
679                 if value.lower().startswith('lfn'):
680                     # Resolve physical filename using pool file catalog.
681                     from PyUtils.PoolFile import file_name
682                     protocol, pfn = file_name(value)
683                     self._value = [pfn]
684                     self._getDatasetFromFilename(reset = False)
685                     self._resetMetadata()
686                 else:
687                     # Don't split output filename if it contains a list in square brackets
688                     if self._io == 'output' and ('[' in value) and (']' in value):
689                         self._value = [value]
690                     else:
691                         self._value = value.split(self._splitter)
692                     self._getDatasetFromFilename(reset = False)
693                     self._resetMetadata()
694             except (AttributeError, TypeError):
695                 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_CONV_FAIL'),
696                                                           'Failed to convert %s to a list' % str(value))
697 
698 
699         deDuplicatedValue = []
700         for fname in self._value:
701             if fname not in deDuplicatedValue:
702                 deDuplicatedValue.append(fname)
703             else:
704                 msg.warning("Removing duplicated file {0} from file list".format(fname))
705         if len(self._value) != len(deDuplicatedValue):
706             self._value = deDuplicatedValue
707             msg.warning('File list after duplicate removal: {0}'.format(self._value))
708 
709         # Find our URL type (if we actually have files!)
710         # At the moment this is assumed to be the same for all files in this instance
711         # although in principle one could mix different access methods in the one input file type
712         if len(self._value) > 0:
713             self._urlType = urlType(self._value[0])
714         else:
715             self._urlType = None
716 
717 
718         if self._io == 'input':
719 
723             if self._urlType == 'posix':
724                 msg.debug('Found POSIX filesystem input - activating globbing')
725                 newValue = []
726                 for filename in self._value:
727                     # Simple case
728                     globbedFiles = glob.glob(filename)
729                     if len(globbedFiles) == 0: # No files globbed for this 'filename' argument.
730                         raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
731                                                                   'Input file argument {0} globbed to NO input files - probably the file(s) are missing'.format(filename))
732 
733                     globbedFiles.sort()
734                     newValue.extend(globbedFiles)
735 
736                 self._value = newValue
737                 msg.debug ('File input is globbed to %s' % self._value)
738 
739             elif self._urlType == 'root':
740                 msg.debug('Found root filesystem input - activating globbing')
741                 newValue = []
742                 for filename in self._value:
743                     if str(filename).startswith("root"):
744                         msg.debug('Found input file name starting with "root," setting XRD_RUNFORKHANDLER=1, which enables fork handlers for xrootd in direct I/O')
745                         os.environ["XRD_RUNFORKHANDLER"] = "1"
746                     if str(filename).startswith("https") or str(filename).startswith("davs") or not(str(filename).endswith('/')) and '*' not in filename and '?' not in filename:
747                         msg.debug('Seems that only one file was given: {0}'.format(filename))
748                         newValue.extend(([filename]))
749                     else:
750                         # Hopefully this recognised wildcards...
751                         path = filename
752                         fileMask = ''
753                         if '*' in filename or '?' in filename:
754                             msg.debug('Split input into path for listdir() and a filemask to select available files.')
755                             path = filename[0:filename.rfind('/')+1]
756                             msg.debug('path: {0}'.format(path))
757                             fileMask = filename[filename.rfind('/')+1:len(filename)]
758                             msg.debug('Will select according to: {0}'.format(fileMask))
759 
760                         cmd = ['/afs/cern.ch/project/eos/installation/atlas/bin/eos.select' ]
761                         if not os.access ('/afs/cern.ch/project/eos/installation/atlas/bin/eos.select', os.X_OK ):
762                             raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
763                                                                       'No execute access to "eos.select" - could not glob EOS input files.')
764 
765                         cmd.extend(['ls'])
766                         cmd.extend([path])
767 
768                         myFiles = []
769                         try:
770                             proc = subprocess.Popen(args = cmd,bufsize = 1, shell = False, stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
771                             rc = proc.wait()
772                             output = proc.stdout.readlines()
773                             if rc!=0:
774                                 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
775                                                                           'EOS list command ("{0!s}") failed: rc {1}, output {2}'.format(cmd, rc, output))
776                             msg.debug("eos returned: {0}".format(output))
777                             for line in output:
778                                 if "root" in line:
779                                     myFiles += [str(path)+str(line.rstrip('\n'))]
780 
781                             patt = re.compile(fileMask.replace('*','.*').replace('?','.'))
782                             for srmFile in myFiles:
783                                 if fileMask != '':
784                                     if(patt.search(srmFile)) is not None:
785                                         #if fnmatch.fnmatch(srmFile, fileMask):
786                                         msg.debug('match: %s',srmFile)
787                                         newValue.extend(([srmFile]))
788                                 else:
789                                     newValue.extend(([srmFile]))
790 
791                             msg.debug('Selected files: %s', newValue)
792                         except (AttributeError, TypeError, OSError):
793                             raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_RUNTIME_ERROR'),
794                                                                       'Failed to convert %s to a list' % str(value))
795                 if len(self._value) > 0 and len(newValue) == 0:
796                     # Woops - no files!
797                     raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
798                                                               'Input file argument(s) {0!s} globbed to NO input files - ls command failed')
799                 self._value = newValue
800                 msg.debug ('File input is globbed to %s' % self._value)
801         # Check if multiple outputs are ok for this object
802         elif self._multipleOK is False and len(self._value) > 1:
803             raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_OUTPUT_FILE_ERROR'),
804                                                       'Multiple file arguments are not supported for {0} (was given: {1}'.format(self, self._value))
805 

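The root branch of the listing turns the file mask into a regular expression by replacing '*' with '.*' and '?' with '.', then selects names with an unanchored search(). The sketch below repeats that substitution on made-up file names and contrasts it with fnmatch.fnmatch, the alternative that appears commented out in the listing. The helper name mask_to_pattern and the sample names are assumptions for illustration only.

```python
import fnmatch
import re


def mask_to_pattern(file_mask):
    """Same substitution as in the listing: '*' -> '.*', '?' -> '.'."""
    return re.compile(file_mask.replace('*', '.*').replace('?', '.'))


# Made-up names standing in for the entries returned by the listing command
names = ["data.0001.root", "data.0002.root", "mydata.0001.root.tmp"]
mask = "data.*.root"

patt = mask_to_pattern(mask)
# search() is unanchored and '.' in the mask stays a regex wildcard,
# so the pattern may match in the middle of a longer name
by_regex = [n for n in names if patt.search(n) is not None]
# fnmatch matches the whole name against the shell-style mask
by_fnmatch = [n for n in names if fnmatch.fnmatch(n, mask)]

print(by_regex)
print(by_fnmatch)
```

With these sample names the regex selection is the more permissive of the two, because it also accepts the name that merely contains a match. In the listing the pattern is applied to the path joined with the file name, so the same looseness extends to the directory part.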
Member Data Documentation

◆ _auxiliaryFile

python.trfArgClasses.argFile._auxiliaryFile
privateinherited

Definition at line 553 of file trfArgClasses.py.

◆ _dataset

python.trfArgClasses.argFile._dataset
privateinherited

Definition at line 547 of file trfArgClasses.py.

◆ _exe

python.trfArgClasses.argFile._exe
privateinherited

Definition at line 559 of file trfArgClasses.py.

◆ _fileMetadata

python.trfArgClasses.argFile._fileMetadata
privateinherited

Definition at line 574 of file trfArgClasses.py.

◆ _guid

python.trfArgClasses.argFile._guid
privateinherited

Definition at line 551 of file trfArgClasses.py.

◆ _io

python.trfArgClasses.argFile._io
privateinherited

Input file globbing and expansion.

Definition at line 576 of file trfArgClasses.py.
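The POSIX branch of valueSetter() expands each input argument with glob.glob(), fails if an argument matches nothing, and sorts each group of matches before appending it. A minimal standalone sketch of that behaviour follows; expand_posix_inputs is a hypothetical name, and FileNotFoundError stands in for the transform exception used in the real code.

```python
import glob
import os
import tempfile


def expand_posix_inputs(patterns):
    """Hypothetical helper: glob each pattern, keeping the argument order."""
    expanded = []
    for pattern in patterns:
        matches = glob.glob(pattern)
        if len(matches) == 0:
            raise FileNotFoundError(
                "Input file argument {0} globbed to NO input files".format(pattern))
        matches.sort()  # glob.glob() returns matches in arbitrary order
        expanded.extend(matches)
    return expanded


with tempfile.TemporaryDirectory() as tmpdir:
    # Create a few empty files to glob over
    for name in ("b.root", "a.root", "c.txt"):
        open(os.path.join(tmpdir, name), "w").close()
    found = expand_posix_inputs([os.path.join(tmpdir, "*.root")])
    print([os.path.basename(f) for f in found])
```

Sorting happens per argument, so the order in which the arguments were given is preserved while the files inside each expansion come out in a reproducible order.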

◆ _mergeTargetSize

python.trfArgClasses.argFile._mergeTargetSize
privateinherited

Definition at line 552 of file trfArgClasses.py.

◆ _metadataKeys

python.trfArgClasses.argFile._metadataKeys
privateinherited
Note
Variable listing the set of file metadata that corresponds to this class. The key is the metadata variable name; the value is the function to call to populate or refresh that metadata value. The function must take a single parameter, the list of files to get metadata for, and must return a metadata dictionary: {file1 : {key1: value1, key2: value2}, file2: ...}. Keys that start with _ are for transform internal use and should not appear in jobReports.

Definition at line 569 of file trfArgClasses.py.
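The contract described in the note can be sketched as follows, assuming two invented metadata keys. All names here (get_file_size, get_extension, metadata_keys, collect_metadata, for_job_report) are hypothetical and only illustrate the shape of the mapping and of the returned dictionaries; they are not the functions registered by the transform classes.

```python
import os
import tempfile


def get_file_size(files):
    """Populate 'file_size'; returns {file: {key: value}} as the note requires."""
    return {fname: {"file_size": os.path.getsize(fname)} for fname in files}


def get_extension(files):
    """Populate '_extension', an internal key (leading underscore)."""
    return {fname: {"_extension": os.path.splitext(fname)[1]} for fname in files}


# Key is the metadata name, value is the function that fills it
metadata_keys = {"file_size": get_file_size, "_extension": get_extension}


def collect_metadata(files, keys):
    """Call the registered function for each key and merge the results per file."""
    merged = {fname: {} for fname in files}
    for key in keys:
        for fname, values in metadata_keys[key](files).items():
            merged[fname].update(values)
    return merged


def for_job_report(metadata):
    """Drop keys starting with '_', which are for internal use only."""
    return {fname: {k: v for k, v in values.items() if not k.startswith("_")}
            for fname, values in metadata.items()}


with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
    handle.write(b"hello")
    path = handle.name

metadata = collect_metadata([path], ["file_size", "_extension"])
report = for_job_report(metadata)
os.remove(path)
print(metadata[path], report[path])
```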

◆ _multipleOK

python.trfArgClasses.argFile._multipleOK
privateinherited

Definition at line 577 of file trfArgClasses.py.

◆ _name

python.trfArgClasses.argFile._name
privateinherited

Definition at line 860 of file trfArgClasses.py.

◆ _originalName

python.trfArgClasses.argFile._originalName
privateinherited

Definition at line 554 of file trfArgClasses.py.

◆ _runarg

python.trfArgClasses.argument._runarg
privateinherited

Definition at line 110 of file trfArgClasses.py.

◆ _splitter

python.trfArgClasses.argList._splitter
privateinherited

Definition at line 357 of file trfArgClasses.py.

◆ _subtype

python.trfArgClasses.argFile._subtype
privateinherited

Definition at line 550 of file trfArgClasses.py.

◆ _supressEmptyStrings

python.trfArgClasses.argList._supressEmptyStrings
privateinherited

Definition at line 358 of file trfArgClasses.py.

◆ _type

python.trfArgClasses.argFile._type
privateinherited

Definition at line 549 of file trfArgClasses.py.

◆ _urlType

python.trfArgClasses.argFile._urlType
privateinherited
Note
TODO: Non-posix URLs. The problem is not so much the [] expansion as the invisible .N attempt number, which can only be dealt with through a listdir() functionality. N.B. Current transforms only do globbing on posix fs too (see trfutil.expandStringToList()).

Definition at line 548 of file trfArgClasses.py.

◆ _value

python.trfArgClasses.argFile._value
privateinherited
Note
First do parsing of string vs. lists to get list of files
Check for duplicates (N.B. preserve the order, just remove the duplicates)

Definition at line 649 of file trfArgClasses.py.
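The order-preserving duplicate check mentioned in the note can be sketched on its own as below. remove_duplicates is a hypothetical helper name; the loop mirrors the one in valueSetter(), except that dropped names are collected instead of logged.

```python
def remove_duplicates(files):
    """Return (unique files in first-seen order, list of dropped duplicates)."""
    unique, dropped = [], []
    for fname in files:
        if fname not in unique:
            unique.append(fname)
        else:
            dropped.append(fname)
    return unique, dropped


unique, dropped = remove_duplicates(["b.root", "a.root", "b.root", "c.root"])
print(unique)   # ['b.root', 'a.root', 'c.root']
print(dropped)  # ['b.root']
```

On Python 3.7 or later, list(dict.fromkeys(files)) gives the same first-seen ordering without the quadratic membership test, at the cost of not reporting which names were dropped.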

◆ dataset

python.trfArgClasses.argFile.dataset
inherited

Definition at line 666 of file trfArgClasses.py.

◆ io

python.trfArgClasses.argFile.io
inherited

Definition at line 557 of file trfArgClasses.py.

◆ value

python.trfArgClasses.argument.value
inherited
Note
We have a default of None here, but all derived classes should definitely have their own value setter and translate this value to something sensible for their underlying value type. N.B. As most argument classes use this default constructor, it must call the @value.setter function!

Definition at line 118 of file trfArgClasses.py.
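The pattern the note describes, a base constructor that assigns through the value property so that a derived setter gets to translate the raw value, can be sketched as follows. Argument and IntArgument are hypothetical stand-ins, not the transform classes themselves.

```python
class Argument:
    """Hypothetical base class; the constructor goes through the property setter."""

    def __init__(self, value=None):
        # Assigning to the property calls the setter of the instantiated class
        self.value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, value):
        self._value = value


class IntArgument(Argument):
    """Hypothetical derived class translating the raw value to an int."""

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, value):
        # Translate the None default into something sensible for this type
        self._value = 0 if value is None else int(value)


print(IntArgument("42").value)
print(IntArgument().value)
```

Had the base constructor written to self._value directly, the derived conversion would be skipped for values passed at construction time.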


The documentation for this class was generated from the following file: