ATLAS Offline Software
python.trfArgClasses.argBZ2File Class Reference

TarBZ filetype.


Public Types

typedef HLT::TypeInformation::for_each_type_c< typename EDMLIST::map, my_functor, my_result<>, my_arg< HLT::TypeInformation::get_cont, CONTAINER > >::type result

Public Member Functions

 prodsysDescription (self)
 io (self)
 io (self, value)
 dataset (self)
 dataset (self, value)
 value (self)
 Argument value getter.
 value (self, value)
 Argument value setter.
 name (self)
 Name getter.
 name (self, value)
 Name setter.
 type (self)
 type (self, value)
 multipleOK (self)
 multipleOK getter
 multipleOK (self, value)
 multipleOK value setter
 mergeTargetSize (self)
 mergeTargetSize value getter
 mergeTargetSize (self, value)
 mergeTargetSize value setter
 executor (self)
 Executor status getter.
 valueSetter (self, value)
 Set the argFile value, but allow parameters here.
 orignalName (self)
 originalName (self, value)
 subtype (self)
 subtype (self, value)
 auxiliaryFile (self)
 metadata (self)
 Returns the whole kit and caboodle...
 nentries (self)
 Return total number of events in all constituent files.
 getnentries (self, fast=False)
 Explicit getter, offering fast switch.
 getMetadata (self, files=None, metadataKeys=None, maskMetadataKeys=None, populate=True, flush=False)
 Return specific keys for specific files.
 getSingleMetadata (self, fname, metadataKey, populate=True, flush=False)
 Convenience function to extract a single metadata key for a single file.
 isCached (self, files=None, metadataKeys=None)
 Test if certain metadata elements are already cached.
 __str__ (self)
 String representation of a file argument.
 append (self, addme)
 Append a value to the list.
 __repr__ (self)
 Repr conversion.
 isRunarg (self)
 Return runarg status.
 __eq__ (self, other)
 Comparison is based on value attribute.
 __nq__ (self, other)
 __lt__ (self, other)
 __gt__ (self, other)

Public Attributes

 io = io
 dataset = v
 type

Protected Member Functions

 _getIntegrity (self, files)
 File integrity checker.
 _resetMetadata (self, files=[])
 Resets all metadata files in this instance.
 _readMetadata (self, files, metadataKeys)
 Check metadata is in the cache or generate it if it's missing.
 _setMetadata (self, files=None, metadataKeys={})
 Set metadata values into the cache.
 _getDatasetFromFilename (self, reset=False)
 Look for dataset name in dataset#filename Tier0 convention.
 _getSize (self, files)
 Determines the size of files.
 _generateGUID (self, files)
 Generate a GUID on demand - no intrinsic for this file type.
 _exists (self, files)
 Try to determine if a file actually exists...
 _mergeArgs (self, argdict, copyArgs=None)
 Utility to strip arguments which should not be passed to the selfMerge methods of our child classes.

Protected Attributes

 _dataset = None
str _urlType = None
 Input file globbing and expansion.
dict _type = type
 _subtype = subtype
 _guid = guid
int _mergeTargetSize = mergeTargetSize
 _auxiliaryFile = auxiliaryFile
 _originalName = None
 _exe = executor
dict _metadataKeys
dict _fileMetadata = {}
str _io = 'input'
 Input file globbing and expansion.
bool _multipleOK = True
 _splitter = splitter
 _supressEmptyStrings = supressEmptyStrings
 _runarg = runarg
 _name = name
 _value = value

Detailed Description

TarBZ filetype.

Definition at line 1798 of file trfArgClasses.py.

Member Typedef Documentation

◆ result

Definition at line 90 of file EDM_MasterSearch.h.

Member Function Documentation

◆ __eq__()

python.trfArgClasses.argument.__eq__ ( self,
other )
inherited

Comparison is based on value attribute.

Definition at line 161 of file trfArgClasses.py.

def __eq__(self, other):
    return self.value == other.value

◆ __gt__()

python.trfArgClasses.argument.__gt__ ( self,
other )
inherited

Definition at line 170 of file trfArgClasses.py.

def __gt__(self, other):
    return self.value > other.value

◆ __lt__()

python.trfArgClasses.argument.__lt__ ( self,
other )
inherited

Definition at line 167 of file trfArgClasses.py.

def __lt__(self, other):
    return self.value < other.value

◆ __nq__()

python.trfArgClasses.argument.__nq__ ( self,
other )
inherited

Definition at line 164 of file trfArgClasses.py.

def __nq__(self, other):
    return self.value != other.value

◆ __repr__()

python.trfArgClasses.argList.__repr__ ( self)
inherited

Repr conversion.

Return a python parsable string

Definition at line 409 of file trfArgClasses.py.

def __repr__(self):
    return '[' + ','.join([ repr(s) for s in self._value ]) + ']'

◆ __str__()

python.trfArgClasses.argFile.__str__ ( self)
inherited

String representation of a file argument.

Definition at line 1216 of file trfArgClasses.py.

def __str__(self):
    return "{0}={1} (Type {2}, Dataset {3}, IO {4})".format(self.name, self.value, self.type, self.dataset, self.io)

◆ _exists()

python.trfArgClasses.argFile._exists ( self,
files )
protected inherited

Try to determine if a file actually exists...

For a POSIX file, just call stat; for anything else call TFile.Open. A small optimisation is to retrieve the file_size metadatum at the same time.

Parameters

files: List of paths to test for existence

Returns
None (internal self._fileMetadata cache is updated)

Definition at line 1180 of file trfArgClasses.py.

def _exists(self, files):
    import re
    msg.debug('Testing existance for {0}'.format(files))
    def split_filelist(fn):
        if self.io != 'output':
            return [fn]
        file_split_regex = re.compile(r"(.+)\[(.+)](.+)")
        if ('[' in fn) and (']' in fn):
            match = file_split_regex.match(fn)
            return [f"{match.group(1)}{it}{match.group(3)}" for it in match.group(2).split(',')]
        else:
            return [fn]
    for fname in files:
        file_list = split_filelist(fname)
        if self._urlType == 'posix':
            try:
                size = map(lambda fn: os.stat(fn).st_size, file_list)
                self._fileMetadata[fname]['file_size'] = sum(size)
                self._fileMetadata[fname]['_exists'] = True
                msg.debug('POSIX file {0} exists (or all elements of list)'.format(fname))
            except OSError as e:
                msg.error('Got exception {0!s} raised while stating file {1} (or some element of list) - probably it does not exist'.format(e, fname))
                self._fileMetadata[fname]['_exists'] = False
        else:
            # OK, let's see if ROOT can do it...
            msg.debug('Calling ROOT TFile.GetSize on {0} (or elements of list)'.format(fname))
            size = map(ROOTGetSize, file_list)
            if None in size:
                self._fileMetadata[fname]['_exists'] = False
                msg.error('Non-POSIX file {0} (or element of list) could not be opened - probably it does not exist'.format(fname))
            else:
                msg.debug('Non-POSIX file {0} (or all elements of list) exists'.format(fname))
                self._fileMetadata[fname]['file_size'] = sum(size)
                self._fileMetadata[fname]['_exists'] = True
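The bracket-list expansion performed by the nested split_filelist helper can be tried in isolation. The following is a minimal sketch, not part of trfArgClasses: the function name expand_bracket_list is hypothetical, and unlike the listing it returns the name unchanged when the pattern does not match.

```python
import re

# Same pattern as in the listing: prefix, bracketed comma list, suffix.
_BRACKET_LIST = re.compile(r"(.+)\[(.+)](.+)")


def expand_bracket_list(filename):
    """Expand 'pre[a,b]post' into ['preapost', 'prebpost'] (hypothetical helper)."""
    match = _BRACKET_LIST.match(filename)
    if match is None:
        # Assumption of this sketch: names without a usable bracket list pass through.
        return [filename]
    prefix, items, suffix = match.groups()
    return [f"{prefix}{item}{suffix}" for item in items.split(",")]
```

In the class itself the expansion is applied only when the argument's io is 'output'; input names are always treated as a single file.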

◆ _generateGUID()

python.trfArgClasses.argFile._generateGUID ( self,
files )
protected inherited

Generate a GUID on demand - no intrinsic for this file type.

Use uuid.uuid4() call to generate a GUID

Note
This generation method will be superseded in any file type which actually has an intrinsic GUID (e.g. BS or POOL files)

Definition at line 1169 of file trfArgClasses.py.

def _generateGUID(self, files):
    for fname in files:
        msg.debug('Generating a GUID for file {0}'.format(fname))
        self._fileMetadata[fname]['file_guid'] = str(uuid.uuid4()).upper()
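As a standalone illustration of the GUID format this produces, the sketch below uses only the standard uuid module. The function name generate_guid is hypothetical and is not part of the documented class.

```python
import uuid


def generate_guid():
    """Return a random (version 4) UUID as an upper-case string (hypothetical helper)."""
    return str(uuid.uuid4()).upper()
```

Each call yields a fresh value, so a GUID generated this way identifies a file only for as long as the metadata cache entry holding it is kept.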

◆ _getDatasetFromFilename()

python.trfArgClasses.argFile._getDatasetFromFilename ( self,
reset = False )
protected inherited

Look for dataset name in dataset#filename Tier0 convention.

At the moment all files must be in the same dataset if it's specified. (To change this dataset will need to become a per-file metadatum.)

Note
dsn#lfn notation must be used for all input values and all dsn values must be the same
Parameters

reset: If True then forget previous dataset setting. Default is False.

Returns
None. Side effect is to set self._dataset and to strip the dataset prefix from the entries of self._value.

Definition at line 1093 of file trfArgClasses.py.

def _getDatasetFromFilename(self, reset = False):
    if reset:
        self._dataset = None
    newValue = []
    for filename in self._value:
        if filename.find('#') > -1:
            (dataset, fname) = filename.split('#', 1)
            newValue.append(fname)
            msg.debug('Current dataset: {0}; New dataset {1}'.format(self._dataset, dataset))
            if self._dataset and (self._dataset != dataset):
                raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_DATASET'),
                    'Found inconsistent dataset assignment in argFile setup: %s != %s' % (self._dataset, dataset))
            self._dataset = dataset
    if len(newValue) == 0:
        return
    elif len(newValue) != len (self._value):
        raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_DATASET'),
            'Found partial dataset assignment in argFile setup from {0} (dsn#lfn notation must be uniform for all inputs)'.format(self._value))
    self._value = newValue
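The dsn#lfn rules above (all or none of the inputs carry a dataset, and all datasets agree) can be sketched as a free function. This is an illustrative sketch only: split_dataset is a hypothetical name, and it raises ValueError where the class raises TransformArgException.

```python
def split_dataset(values):
    """Split 'dsn#lfn' entries into (dataset, [lfn, ...]) (hypothetical helper).

    Returns (None, values) when no entry uses the dsn#lfn notation.
    """
    dataset = None
    filenames = []
    for value in values:
        if "#" in value:
            dsn, lfn = value.split("#", 1)
            if dataset is not None and dataset != dsn:
                raise ValueError(f"inconsistent dataset: {dataset} != {dsn}")
            dataset = dsn
            filenames.append(lfn)
    if not filenames:
        return None, list(values)
    if len(filenames) != len(values):
        raise ValueError("dsn#lfn notation must be uniform for all inputs")
    return dataset, filenames
```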

◆ _getIntegrity()

python.trfArgClasses.argBZ2File._getIntegrity ( self,
files )
protected

File integrity checker.

The integrity check reads each file to the end through bz2.BZ2File in 1 MiB chunks; a file passes if it can be read without raising an OSError.

Parameters

files: List of paths to the files for which the integrity is determined

Returns
None (internal self._fileMetadata cache is updated)

Reimplemented from python.trfArgClasses.argFile.

Definition at line 1799 of file trfArgClasses.py.

def _getIntegrity(self, files):
    for fname in files:
        # bz2 only supports 'with' from python 2.7
        try:
            f = bz2.BZ2File(fname, 'r')
            while True:
                chunk = len(f.read(1024*1024))
                msg.debug('Read {0} bytes from {1}'.format(chunk, fname))
                if chunk == 0:
                    break
            self._fileMetadata[fname]['integrity'] = True
            f.close()
        except OSError as e:
            msg.error('Got exception {0!s} raised while checking integrity of file {1}'.format(e, fname))
            self._fileMetadata[fname]['integrity'] = False
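The same read-to-the-end check can be written as a small standalone function. This is a sketch under stated assumptions, not the class's implementation: bz2_is_readable is a hypothetical name, it uses a with block so the handle is closed on failure, and it also catches EOFError, which Python 3's bz2 module raises for a truncated stream.

```python
import bz2


def bz2_is_readable(path, chunk_size=1024 * 1024):
    """Return True if the whole bz2 file at 'path' decompresses (hypothetical helper)."""
    try:
        with bz2.open(path, "rb") as handle:
            # Decompress chunk by chunk; an empty read marks the end of the data.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError):
        # OSError: missing file or invalid data; EOFError: truncated stream.
        return False
    return True
```

Reading in fixed-size chunks keeps memory use bounded regardless of the decompressed size of the file.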

◆ _getSize()

python.trfArgClasses.argFile._getSize ( self,
files )
protected inherited

Determines the size of files.

For POSIX files the size is taken from os.stat; for any other URL type ROOTGetSize is called. The result is cached under the 'size' key.

Parameters
files: List of paths to the files for which the size is determined.
Returns
None (internal self._fileMetadata cache is updated)

Definition at line 1117 of file trfArgClasses.py.

def _getSize(self, files):
    for fname in files:
        if self._urlType == 'posix':
            try:
                self._fileMetadata[fname]['size'] = os.stat(fname).st_size
            except OSError as e:
                msg.error('Got exception {0!s} raised while stating file {1}'.format(e, fname))
                self._fileMetadata[fname]['size'] = None
        else:
            # OK, let's see if ROOT can do it...
            msg.debug('Calling ROOT TFile.GetSize({0})'.format(fname))
            self._fileMetadata[fname]['size'] = ROOTGetSize(fname)

◆ _mergeArgs()

python.trfArgClasses.argFile._mergeArgs ( self,
argdict,
copyArgs = None )
protected inherited

Utility to strip arguments which should not be passed to the selfMerge methods of our child classes.

Parameters
copyArgs: If None copy all arguments by default, otherwise only copy the listed keys

Definition at line 1224 of file trfArgClasses.py.

def _mergeArgs(self, argdict, copyArgs=None):
    if copyArgs:
        myargdict = {}
        for arg in copyArgs:
            if arg in argdict:
                myargdict[arg] = copy.copy(argdict[arg])

    else:
        myargdict = copy.copy(argdict)
    # Never do event count checks for self merging
    myargdict['checkEventCount'] = argSubstepBool('False', runarg=False)
    newopts = []
    if 'athenaopts' in myargdict:
        # Need to ensure that "nprocs" is not passed to merger
        # and prevent multiple '--threads' options when there are multiple sub-steps in 'athenopts'
        for subStep in myargdict['athenaopts'].value:
            hasNprocs = False
            hasNthreads = False
            for opt in myargdict['athenaopts'].value[subStep]:
                if opt.startswith('--nprocs'):
                    hasNprocs = True
                    continue
                # Keep at least one '--threads'
                elif opt.startswith('--threads'):
                    hasNthreads = True
                    if opt in newopts:
                        continue
                newopts.append(opt)
            # If we have hybrid MP+MT job make sure --threads is not passed to merger
            if hasNprocs and hasNthreads:
                tmpopts = []
                for opt in newopts:
                    if opt.startswith('--threads'):
                        continue
                    tmpopts.append(opt)
                newopts = tmpopts
        myargdict['athenaopts'] = argSubstepList(newopts, runarg=False)
    return myargdict
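The option filtering in the second half of this method can be illustrated without the transform argument classes. The sketch below is an approximation under assumptions: filter_merge_options is a hypothetical name, and the input is a plain dict mapping sub-step names to lists of option strings rather than an argSubstepList.

```python
def filter_merge_options(substep_opts):
    """Collect athena options suitable for a merge job (hypothetical helper).

    '--nprocs' options are always dropped, a repeated '--threads' option is
    kept once, and '--threads' is dropped entirely for a sub-step that
    requests both multi-process and multi-thread running.
    """
    kept = []
    for opts in substep_opts.values():
        has_nprocs = has_threads = False
        for opt in opts:
            if opt.startswith("--nprocs"):
                has_nprocs = True
                continue
            if opt.startswith("--threads"):
                has_threads = True
                if opt in kept:
                    continue
            kept.append(opt)
        if has_nprocs and has_threads:
            kept = [opt for opt in kept if not opt.startswith("--threads")]
    return kept
```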

◆ _readMetadata()

python.trfArgClasses.argFile._readMetadata ( self,
files,
metadataKeys )
protected inherited

Check metadata is in the cache or generate it if it's missing.

Returns
Dictionary of files with metadata; 'UNDEFINED' is returned for any unknown keys

Definition at line 1000 of file trfArgClasses.py.

def _readMetadata(self, files, metadataKeys):
    msg.debug('Retrieving metadata keys {1!s} for files {0!s}'.format(files, metadataKeys))
    for fname in files:
        if fname not in self._fileMetadata:
            self._fileMetadata[fname] = {}
    for fname in files:
        # Always try for a simple existence test first before producing misleading error messages
        # from metadata populator functions
        if '_exists' not in self._fileMetadata[fname]:
            self._metadataKeys['_exists'](files)
        if self._fileMetadata[fname]['_exists'] is False:
            # N.B. A log ERROR message has printed by the existence test, so do not repeat that news here
            for key in metadataKeys:
                if key != '_exists':
                    self._fileMetadata[fname][key] = None
        else:
            # OK, file seems to exist at least...
            for key in metadataKeys:
                if key not in self._metadataKeys:
                    msg.debug('Metadata key {0} is unknown for {1}'.format(key, self.__class__.__name__))
                    self._fileMetadata[fname][key] = 'UNDEFINED'
                else:
                    if key in self._fileMetadata[fname]:
                        msg.debug('Found cached value for {0}:{1} = {2!s}'.format(fname, key, self._fileMetadata[fname][key]))
                    else:
                        msg.debug('No cached value for {0}:{1}. Calling generator function {2} ({3})'.format(fname, key, self._metadataKeys[key].__name__, self._metadataKeys[key]))
                        try:
                            # For efficiency call this routine with all files we have
                            msg.info("Metadata generator called to obtain {0} for {1}".format(key, files))
                            self._metadataKeys[key](files)
                        except trfExceptions.TransformMetadataException as e:
                            msg.error('Calling {0!s} raised an exception: {1!s}'.format(self._metadataKeys[key].__name__, e))
                        if key not in self._fileMetadata[fname]:
                            msg.warning('Call to function {0} for {1} file {2} failed to populate metadata key {3}'.format(self._metadataKeys[key].__name__, self.__class__.__name__, fname, key))
                            self._fileMetadata[fname][key] = None
                        msg.debug('Now have {0}:{1} = {2!s}'.format(fname, key, self._fileMetadata[fname][key]))

◆ _resetMetadata()

python.trfArgClasses.argFile._resetMetadata ( self,
files = [] )
protected inherited

Resets all metadata files in this instance.

Metadata dictionary entry is reset for any files given (default all files) and any files that are no longer in this instance have any metadata removed (useful for self merging).

Note
Metadata is set to {}, except for the case when an explicit GUID option was given

Definition at line 917 of file trfArgClasses.py.

def _resetMetadata(self, files=[]):
    if files == [] or '_fileMetadata' not in dir(self):
        self._fileMetadata = {}
        for fname in self.value:
            self._fileMetadata[fname] = {}
    else:
        for fname in files:
            if fname in self.value:
                self._fileMetadata[fname] = {}
            elif fname in self._fileMetadata:
                del self._fileMetadata[fname]
    msg.debug('Metadata dictionary now {0}'.format(self._fileMetadata))

    # If we have the special guid option, then manually try to set GUIDs we find
    if self._guid is not None:
        msg.debug('Now trying to set file GUID metadata using {0}'.format(self._guid))
        for fname, guid in self._guid.items():
            if fname in self._value:
                self._fileMetadata[fname]['file_guid'] = guid
            else:
                msg.warning('Explicit GUID {0} was passed for file {1}, but this file is not a member of this instance'.format(guid, fname))

◆ _setMetadata()

python.trfArgClasses.argFile._setMetadata ( self,
files = None,
metadataKeys = {} )
protected inherited

Set metadata values into the cache.

Manually sets the metadata cache values to the values given in the metadata key dictionary here. This is useful for setting values to make checks on file metadata handling.

Note
To really suppress any external function calls that gather metadata be careful to also set the _exists metadatum to True.
Warning
No checks are done on the values or keys given here, so you'd better know what you are doing.
Parameters
files: Files to set metadata for (None means "all")
metadataKeys: Dictionary with metadata keys and values

Definition at line 1048 of file trfArgClasses.py.

def _setMetadata(self, files=None, metadataKeys={}):
    if files is None:
        files = self._value
    for fname in files:
        if fname not in self._fileMetadata:
            self._fileMetadata[fname] = {}
        for k, v in metadataKeys.items():
            msg.debug('Manualy setting {0} for file {1} to {2}'.format(k, fname, v))
            self._fileMetadata[fname][k] = v

◆ append()

python.trfArgClasses.argList.append ( self,
addme )
inherited

Append a value to the list.

Parameters
addme: Item to add

Definition at line 398 of file trfArgClasses.py.

def append(self, addme):
    self._value.append(addme)

◆ auxiliaryFile()

python.trfArgClasses.argFile.auxiliaryFile ( self)
inherited

Definition at line 878 of file trfArgClasses.py.

def auxiliaryFile(self):
    return self._auxiliaryFile

◆ dataset() [1/2]

python.trfArgClasses.argFile.dataset ( self)
inherited

Definition at line 818 of file trfArgClasses.py.

def dataset(self):
    return self._dataset

◆ dataset() [2/2]

python.trfArgClasses.argFile.dataset ( self,
value )
inherited

Definition at line 822 of file trfArgClasses.py.

def dataset(self, value):
    self._dataset = value

◆ executor()

python.trfArgClasses.argFile.executor ( self)
inherited

Executor status getter.

Definition at line 638 of file trfArgClasses.py.

def executor(self):
    return self._exe

◆ getMetadata()

python.trfArgClasses.argFile.getMetadata ( self,
files = None,
metadataKeys = None,
maskMetadataKeys = None,
populate = True,
flush = False )
inherited

Return specific keys for specific files.

Parameters
files: List of files to return metadata for (default - all files in this instance)
metadataKeys: Keys to return (default - all keys valid for this class of files)
maskMetadataKeys: Keys to NOT return (useful when metadataKeys is left as default)
populate: If missing keys should be generated by calling the population subroutines
flush: If cached data should be flushed and the generators rerun

Definition at line 945 of file trfArgClasses.py.

def getMetadata(self, files = None, metadataKeys = None, maskMetadataKeys = None, populate = True, flush = False):
    # Normalise the files and keys parameter
    if files is None:
        files = self._value
    elif isinstance(files, str):
        files = (files,)
    msg.debug('getMetadata will examine these files: {0!s}'.format(files))

    if metadataKeys is None:
        metadataKeys = list(self._metadataKeys)
    elif isinstance(metadataKeys, str):
        metadataKeys = [metadataKeys,]
    if maskMetadataKeys is not None:
        metadataKeys = [k for k in metadataKeys if k not in maskMetadataKeys]
    msg.debug('getMetadata will retrieve these keys: {0!s}'.format(metadataKeys))

    if flush is True:
        msg.debug('Flushing cached metadata values')
        self._resetMetadata()

    if populate is True:
        msg.debug('Checking metadata values')
        self._readMetadata(files, metadataKeys)

    metadata = {}
    for fname in files:
        metadata[fname] = {}
        for mdkey in metadataKeys:
            try:
                metadata[fname][mdkey] = self._fileMetadata[fname][mdkey]
            except KeyError:
                # This should not happen, unless we skipped populating
                if populate:
                    msg.error('Did not find metadata key {0!s} for file {1!s} - setting to None'.format(mdkey, fname))
                metadata[fname][mdkey] = None
    return metadata

◆ getnentries()

python.trfArgClasses.argFile.getnentries ( self,
fast = False )
inherited

Explicit getter, offering fast switch.

Definition at line 894 of file trfArgClasses.py.

def getnentries(self, fast=False):
    totalEvents = 0
    for fname in self._value:
        events = self.getSingleMetadata(fname=fname, metadataKey='nentries', populate = not fast)
        if events is None:
            msg.debug('Got events=None for file {0} - returning None for this instance'.format(fname))
            return None
        if events == 'UNDEFINED':
            msg.debug('Got events=UNDEFINED for file {0} - returning UNDEFINED for this instance'.format(fname))
            return 'UNDEFINED'
        if not isinstance(events, int):
            msg.warning('Got unexpected events metadata for file {0}: {1!s} - returning None for this instance'.format(fname, events))
            return None
        totalEvents += events

    return totalEvents
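The way a single missing or undefined per-file count determines the result for the whole instance can be shown on plain values. This is a minimal sketch: total_entries is a hypothetical name, and it takes the per-file counts directly instead of looking them up through getSingleMetadata.

```python
def total_entries(per_file_counts):
    """Sum per-file event counts with the propagation rules above (hypothetical helper)."""
    total = 0
    for count in per_file_counts:
        if count is None:
            return None          # one unknown count makes the total unknown
        if count == "UNDEFINED":
            return "UNDEFINED"   # key not defined for this file type
        if not isinstance(count, int):
            return None          # unexpected metadata type
        total += count
    return total
```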

◆ getSingleMetadata()

python.trfArgClasses.argFile.getSingleMetadata ( self,
fname,
metadataKey,
populate = True,
flush = False )
inherited

Convenience function to extract a single metadata key for a single file.

Retrieves a single metadata item for a single file, returning it directly

Returns
Single metadata value
Parameters
fname: File to return metadata for
metadataKey: Key to return
populate: If missing key should be generated by calling the population subroutines
flush: If cached data should be flushed and the generator rerun

Definition at line 989 of file trfArgClasses.py.

def getSingleMetadata(self, fname, metadataKey, populate = True, flush = False):
    if not (isinstance(fname, str) and isinstance(metadataKey, str)):
        raise trfExceptions.TransformInternalException(trfExit.nameToCode('TRF_INTERNAL'),
            'Illegal call to getSingleMetadata function: {0!s} {1!s}'.format(fname, metadataKey))
    md = self.getMetadata(files = fname, metadataKeys = metadataKey, populate = populate, flush = flush)
    return md[fname][metadataKey]

◆ io() [1/2]

python.trfArgClasses.argFile.io ( self)
inherited

Definition at line 807 of file trfArgClasses.py.

def io(self):
    return (self._io)

◆ io() [2/2]

python.trfArgClasses.argFile.io ( self,
value )
inherited

Definition at line 811 of file trfArgClasses.py.

def io(self, value):
    if value not in ('input', 'output', 'temporary'):
        raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_RUNTIME_ERROR'),
            'File arguments must be specified as input, output or temporary - got {0}'.format(value))
    self._io = value

◆ isCached()

python.trfArgClasses.argFile.isCached ( self,
files = None,
metadataKeys = None )
inherited

Test if certain metadata elements are already cached.

Will test for a cached value for all files and all keys given, aborting as soon as it finds a single uncached value.

Parameters
files: Files to check (defaults to all files)
metadataKeys: Keys to check (defaults to all keys)
Returns
True if all keys are cached for all files, False otherwise

Definition at line 1065 of file trfArgClasses.py.

def isCached(self, files = None, metadataKeys = None):
    msg.debug('Testing for cached values for files {0} and keys {1}'.format(files, metadataKeys))
    if files is None:
        files = self._value
    elif isinstance(files, str):
        files = (files,)
    if metadataKeys is None:
        metadataKeys = list(self._metadataKeys)
    elif isinstance(metadataKeys, str):
        metadataKeys = (metadataKeys,)

    isCachedFlag = True
    for fname in files:
        for key in metadataKeys:
            if key not in self._fileMetadata[fname]:
                isCachedFlag = False
                break
        if isCachedFlag is False:
            break

    return isCachedFlag
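The early-exit cache test reduces to a nested membership check over a dict of dicts. The sketch below is illustrative only: all_cached is a hypothetical name, and, as an assumption that differs from the listing, a file with no cache entry at all is reported as uncached instead of raising KeyError.

```python
def all_cached(cache, files, keys):
    """Return True only if every key is present for every file (hypothetical helper)."""
    for fname in files:
        entry = cache.get(fname, {})
        for key in keys:
            if key not in entry:
                return False  # stop at the first uncached value
    return True
```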

◆ isRunarg()

python.trfArgClasses.argument.isRunarg ( self)
inherited

Return runarg status.

Definition at line 134 of file trfArgClasses.py.

def isRunarg(self):
    return self._runarg

◆ mergeTargetSize() [1/2]

python.trfArgClasses.argFile.mergeTargetSize ( self)
inherited

mergeTargetSize value getter

Definition at line 613 of file trfArgClasses.py.

def mergeTargetSize(self):
    return self._mergeTargetSize

◆ mergeTargetSize() [2/2]

python.trfArgClasses.argFile.mergeTargetSize ( self,
value )
inherited

mergeTargetSize value setter

Definition at line 618 of file trfArgClasses.py.

def mergeTargetSize(self, value):
    if value is None:
        self._mergeTargetSize = 0
    else:
        self._mergeTargetSize = value

◆ metadata()

python.trfArgClasses.argFile.metadata ( self)
inherited

Returns the whole kit and caboodle...

Note
Populates the whole metadata dictionary for this instance

Definition at line 884 of file trfArgClasses.py.

def metadata(self):
    self.getMetadata()
    return self._fileMetadata

◆ multipleOK() [1/2]

python.trfArgClasses.argFile.multipleOK ( self)
inherited

multipleOK getter

Returns
Current value

Definition at line 603 of file trfArgClasses.py.

def multipleOK(self):
    return self._multipleOK

◆ multipleOK() [2/2]

python.trfArgClasses.argFile.multipleOK ( self,
value )
inherited

multipleOK value setter

Definition at line 608 of file trfArgClasses.py.

def multipleOK(self, value):
    self._multipleOK = value

◆ name() [1/2]

python.trfArgClasses.argFile.name ( self)
inherited

Name getter.

Reimplemented from python.trfArgClasses.argument.

Definition at line 851 of file trfArgClasses.py.

def name(self):
    return self._name

◆ name() [2/2]

python.trfArgClasses.argFile.name ( self,
value )
inherited

Name setter.

Note
This property setter will also set the type and subtype of the argFile if they are not yet set. This means that for most arguments the type and subtype are automatically set correctly.

Reimplemented from python.trfArgClasses.argument.

Definition at line 859 of file trfArgClasses.py.

def name(self, value):
    self._name = value
    m = re.match(r'(input|output|tmp.)([A-Za-z0-9_]+?)(File)?$', value)
    if m:
        msg.debug("ArgFile name setter matched this: {0}".format(m.groups()))
        if self._type is None:
            dtype = m.group(2).split('_', 1)[0]
            # But DRAW/DESD/DAOD are really just RAW, ESD, AOD in format
            if re.match(r'D(RAW|ESD|AOD)', dtype):
                dtype = dtype[1:]
            msg.debug("Autoset data type to {0}".format(dtype))
            self._type = dtype
        if self._subtype is None:
            msg.debug("Autoset data subtype to {0}".format(m.group(2)))
            self._subtype = m.group(2)
    else:
        msg.debug("ArgFile name setter did not match against '{0}'".format(value))
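How the type and subtype follow from an argument name can be checked with the same regular expressions outside the class. This is a minimal sketch: infer_type_and_subtype is a hypothetical name, and it always returns both values, whereas the setter only fills in a type or subtype that is still None.

```python
import re

# Same pattern as in the listing: io prefix, data format, optional 'File' suffix.
_NAME_PATTERN = re.compile(r'(input|output|tmp.)([A-Za-z0-9_]+?)(File)?$')


def infer_type_and_subtype(arg_name):
    """Return (type, subtype) deduced from an argument name (hypothetical helper)."""
    m = _NAME_PATTERN.match(arg_name)
    if m is None:
        return None, None
    subtype = m.group(2)
    dtype = subtype.split('_', 1)[0]
    # DRAW/DESD/DAOD are treated as RAW, ESD, AOD in format.
    if re.match(r'D(RAW|ESD|AOD)', dtype):
        dtype = dtype[1:]
    return dtype, subtype
```

For example, an argument named inputDRAW_ZMUMUFile would get type RAW and subtype DRAW_ZMUMU under these rules.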

◆ nentries()

python.trfArgClasses.argFile.nentries ( self)
inherited

Return total number of events in all constituent files.

Definition at line 890 of file trfArgClasses.py.

def nentries(self):
    return self.getnentries()

◆ originalName()

python.trfArgClasses.argFile.originalName ( self,
value )
inherited

Definition at line 830 of file trfArgClasses.py.

def originalName(self, value):
    self._originalName = value

◆ orignalName()

python.trfArgClasses.argFile.orignalName ( self)
inherited

Definition at line 826 of file trfArgClasses.py.

def orignalName(self):
    return self._originalName

◆ prodsysDescription()

python.trfArgClasses.argBZ2File.prodsysDescription ( self)

Reimplemented from python.trfArgClasses.argFile.

Reimplemented in python.trfArgClasses.argFTKIPFile.

Definition at line 1817 of file trfArgClasses.py.

def prodsysDescription(self):
    desc=super(argBZ2File, self).prodsysDescription
    return desc

◆ subtype() [1/2]

python.trfArgClasses.argFile.subtype ( self)
inherited

Definition at line 842 of file trfArgClasses.py.

def subtype(self):
    return self._subtype

◆ subtype() [2/2]

python.trfArgClasses.argFile.subtype ( self,
value )
inherited

Definition at line 846 of file trfArgClasses.py.

def subtype(self, value):
    self._subtype = value

◆ type() [1/2]

python.trfArgClasses.argFile.type ( self)
inherited

Definition at line 834 of file trfArgClasses.py.

def type(self):
    return self._type

◆ type() [2/2]

python.trfArgClasses.argFile.type ( self,
value )
inherited

Definition at line 838 of file trfArgClasses.py.

def type(self, value):
    self._type = value

◆ value() [1/2]

python.trfArgClasses.argFile.value ( self)
inherited

Argument value getter.

Returns
Current value

Reimplemented from python.trfArgClasses.argList.

Definition at line 591 of file trfArgClasses.py.

def value(self):
    return self._value

◆ value() [2/2]

python.trfArgClasses.argFile.value ( self,
value )
inherited

Argument value setter.

Calls the valueSetter function with the standard options

Reimplemented from python.trfArgClasses.argList.

Definition at line 597 of file trfArgClasses.py.

def value(self, value):
    self.valueSetter(value)

◆ valueSetter()

python.trfArgClasses.argFile.valueSetter ( self,
value )
inherited

Set the argFile value, but allow parameters here.

Note
Normally athena only takes a single value for an output file, but when AthenaMP runs it can produce multiple output files - this is allowed by setting allowMultiOutputs = True
The setter protects against the same file being added multiple times

Definition at line 645 of file trfArgClasses.py.

def valueSetter(self, value):

    if isinstance(value, (list, tuple)):
        if len(value) > 0 and isinstance(value[0], dict): # Tier-0 style expanded argument with metadata
            self._value=[]
            for myfile in value:
                try:
                    self._value.append(myfile['lfn'])
                    self._resetMetadata(files = [myfile['lfn']])
                except KeyError:
                    raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_CONV_FAIL'),
                        'Filename (key "lfn") not found in Tier-0 file dictionary: {0}'.format(myfile))
                for k, v in myfile.items():
                    if k == 'guid':
                        self._setMetadata([myfile['lfn']], {'file_guid': v})
                    elif k == 'events':
                        self._setMetadata([myfile['lfn']], {'nentries': v})
                    elif k == 'checksum':
                        self._setMetadata([myfile['lfn']], {'checksum': v})
                    elif k == 'dsn':
                        if not self._dataset:
                            self.dataset = v
                        elif self.dataset != v:
                            raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_DATASET'),
                                'Inconsistent dataset names in Tier-0 dictionary: {0} != {1}'.format(self.dataset, v))
        else:
            self._value = list(value)
            self._getDatasetFromFilename(reset = False)
            self._resetMetadata()
    elif value is None:
        self._value = []
        return
    else:
        try:
            if value.lower().startswith('lfn'):
                # Resolve physical filename using pool file catalog.
                from PyUtils.PoolFile import file_name
                protocol, pfn = file_name(value)
                self._value = [pfn]
                self._getDatasetFromFilename(reset = False)
                self._resetMetadata()
            else:
                # Don't split output filename if it contains a list in square brackets
                if self._io == 'output' and ('[' in value) and (']' in value):
                    self._value = [value]
                else:
                    self._value = value.split(self._splitter)
                self._getDatasetFromFilename(reset = False)
                self._resetMetadata()
        except (AttributeError, TypeError):
            raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_ARG_CONV_FAIL'),
                'Failed to convert %s to a list' % str(value))


    deDuplicatedValue = []
    for fname in self._value:
        if fname not in deDuplicatedValue:
            deDuplicatedValue.append(fname)
        else:
            msg.warning("Removing duplicated file {0} from file list".format(fname))
    if len(self._value) != len(deDuplicatedValue):
        self._value = deDuplicatedValue
        msg.warning('File list after duplicate removal: {0}'.format(self._value))

    # Find our URL type (if we actually have files!)
    # At the moment this is assumed to be the same for all files in this instance
    # although in principle one could mix different access methods in the one input file type
    if len(self._value) > 0:
        self._urlType = urlType(self._value[0])
    else:
        self._urlType = None


    if self._io == 'input':

        if self._urlType == 'posix':
            msg.debug('Found POSIX filesystem input - activating globbing')
            newValue = []
            for filename in self._value:
                # Simple case
                globbedFiles = glob.glob(filename)
                if len(globbedFiles) == 0: # No files globbed for this 'filename' argument.
                    raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
                        'Input file argument {0} globbed to NO input files - probably the file(s) are missing'.format(filename))

                globbedFiles.sort()
                newValue.extend(globbedFiles)

            self._value = newValue
            msg.debug ('File input is globbed to %s' % self._value)

        elif self._urlType == 'root':
            msg.debug('Found root filesystem input - activating globbing')
            newValue = []
            for filename in self._value:
                if str(filename).startswith("root"):
                    msg.debug('Found input file name starting with "root," setting XRD_RUNFORKHANDLER=1, which enables fork handlers for xrootd in direct I/O')
                    os.environ["XRD_RUNFORKHANDLER"] = "1"
                if str(filename).startswith("https") or str(filename).startswith("davs") or not(str(filename).endswith('/')) and '*' not in filename and '?' not in filename:
                    msg.debug('Seems that only one file was given: {0}'.format(filename))
748 newValue.extend(([filename]))
749 else:
750 # Hopefully this recognised wildcards...
751 path = filename
752 fileMask = ''
753 if '*' in filename or '?' in filename:
754 msg.debug('Split input into path for listdir() and a filemask to select available files.')
755 path = filename[0:filename.rfind('/')+1]
756 msg.debug('path: {0}'.format(path))
757 fileMask = filename[filename.rfind('/')+1:len(filename)]
758 msg.debug('Will select according to: {0}'.format(fileMask))
759
760 cmd = ['/afs/cern.ch/project/eos/installation/atlas/bin/eos.select' ]
761 if not os.access ('/afs/cern.ch/project/eos/installation/atlas/bin/eos.select', os.X_OK ):
762 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
763 'No execute access to "eos.select" - could not glob EOS input files.')
764
765 cmd.extend(['ls'])
766 cmd.extend([path])
767
768 myFiles = []
769 try:
770 proc = subprocess.Popen(args = cmd,bufsize = 1, shell = False, stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
771 rc = proc.wait()
772 output = proc.stdout.readlines()
773 if rc!=0:
774 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
775 'EOS list command ("{0!s}") failed: rc {1}, output {2}'.format(cmd, rc, output))
776 msg.debug("eos returned: {0}".format(output))
777 for line in output:
778 if "root" in line:
779 myFiles += [str(path)+str(line.rstrip('\n'))]
780
781 patt = re.compile(fileMask.replace('*','.*').replace('?','.'))
782 for srmFile in myFiles:
783 if fileMask != '':
784 if(patt.search(srmFile)) is not None:
785 #if fnmatch.fnmatch(srmFile, fileMask):
786 msg.debug('match: %s',srmFile)
787 newValue.extend(([srmFile]))
788 else:
789 newValue.extend(([srmFile]))
790
791 msg.debug('Selected files: %s', newValue)
792 except (AttributeError, TypeError, OSError):
793 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_RUNTIME_ERROR'),
794 'Failed to convert %s to a list' % str(value))
795 if len(self._value) > 0 and len(newValue) == 0:
796 # Woops - no files!
797 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_INPUT_FILE_ERROR'),
798 'Input file argument(s) {0!s} globbed to NO input files - ls command failed')
799 self._value = newValue
800 msg.debug ('File input is globbed to %s' % self._value)
801 # Check if multiple outputs are ok for this object
802 elif self._multipleOK is False and len(self._value) > 1:
803 raise trfExceptions.TransformArgException(trfExit.nameToCode('TRF_OUTPUT_FILE_ERROR'),
804 'Multiple file arguments are not supported for {0} (was given: {1}'.format(self, self._value))
805
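
The first half of the listing above turns whatever was passed in into an ordered list of file names and then drops duplicates while keeping the first occurrence. The sketch below mirrors only that part, outside the class. The function name normalise_file_list and the default splitter "," are assumptions made for illustration; the real method uses self._splitter and also resets metadata and dataset information, which is omitted here.

```python
def normalise_file_list(value, splitter=","):
    """Illustrative only: return an ordered, duplicate-free list of file names."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        names = list(value)
    else:
        # Strings are split on the splitter character (assumed to be ',').
        names = value.split(splitter)

    unique = []
    for name in names:
        # Keep the first occurrence so the original ordering is preserved.
        if name not in unique:
            unique.append(name)
    return unique


result = normalise_file_list("a.tar.bz2,b.tar.bz2,a.tar.bz2")
```

A list with a membership test is quadratic in the number of files, but it preserves order and matches what the listing does; for typical argument lists this is unlikely to matter.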

Member Data Documentation

◆ _auxiliaryFile

python.trfArgClasses.argFile._auxiliaryFile = auxiliaryFile
protectedinherited

Definition at line 554 of file trfArgClasses.py.

◆ _dataset

python.trfArgClasses.argFile._dataset = None
protectedinherited

Definition at line 548 of file trfArgClasses.py.

◆ _exe

python.trfArgClasses.argFile._exe = executor
protectedinherited

Definition at line 560 of file trfArgClasses.py.

◆ _fileMetadata

python.trfArgClasses.argFile._fileMetadata = {}
protectedinherited

Definition at line 575 of file trfArgClasses.py.

◆ _guid

python.trfArgClasses.argFile._guid = guid
protectedinherited

Definition at line 552 of file trfArgClasses.py.

◆ _io

python.trfArgClasses.argFile._io = 'input'
protectedinherited

Input file globbing and expansion.

Definition at line 577 of file trfArgClasses.py.

◆ _mergeTargetSize

int python.trfArgClasses.argFile._mergeTargetSize = mergeTargetSize
protectedinherited

Definition at line 553 of file trfArgClasses.py.

◆ _metadataKeys

dict python.trfArgClasses.argFile._metadataKeys
protectedinherited
Initial value:
= {'file_size': self._getSize,
'integrity': self._getIntegrity,
'file_guid': self._generateGUID,
'_exists': self._exists,
}
Note
This variable lists the file metadata that corresponds to this class. The key is the metadata variable name; the value is the function to call to populate or refresh that metadata value. The function must take a single parameter, the list of files to get metadata for, and must return a metadata dictionary: {file1: {key1: value1, key2: value2}, file2: ...}. Keys that start with _ are for transform internal use and should not appear in jobReports.

Definition at line 570 of file trfArgClasses.py.
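
The contract described in the note (each key maps to a function that takes a list of files and returns a per-file dictionary) can be sketched as follows. This is a minimal standalone illustration, not the class's own code: get_size, exists, METADATA_KEYS, collect_metadata and report_view are hypothetical names, and the real _getSize and _exists methods may behave differently.

```python
import os


def get_size(files):
    """Hypothetical populator: one 'file_size' entry per file (None if missing)."""
    return {f: {"file_size": os.path.getsize(f) if os.path.exists(f) else None}
            for f in files}


def exists(files):
    """Hypothetical populator for an internal key (leading underscore)."""
    return {f: {"_exists": os.path.exists(f)} for f in files}


# Key is the metadata name, value is the function that fills it in.
METADATA_KEYS = {"file_size": get_size, "_exists": exists}


def collect_metadata(files, keys):
    """Call the populator for each requested key and merge the results per file."""
    merged = {f: {} for f in files}
    for key in keys:
        for fname, values in METADATA_KEYS[key](files).items():
            merged[fname].update(values)
    return merged


def report_view(metadata):
    """Drop keys starting with '_', which are meant for internal use only."""
    return {f: {k: v for k, v in md.items() if not k.startswith("_")}
            for f, md in metadata.items()}


md = collect_metadata(["/nonexistent/file.tar.bz2"], ["file_size", "_exists"])
```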

◆ _multipleOK

bool python.trfArgClasses.argFile._multipleOK = True
protectedinherited

Definition at line 578 of file trfArgClasses.py.

◆ _name

python.trfArgClasses.argument._name = name
protectedinherited

Definition at line 111 of file trfArgClasses.py.

◆ _originalName

python.trfArgClasses.argFile._originalName = None
protectedinherited

Definition at line 555 of file trfArgClasses.py.

◆ _runarg

python.trfArgClasses.argument._runarg = runarg
protectedinherited

Definition at line 110 of file trfArgClasses.py.

◆ _splitter

python.trfArgClasses.argList._splitter = splitter
protectedinherited

Definition at line 357 of file trfArgClasses.py.

◆ _subtype

python.trfArgClasses.argFile._subtype = subtype
protectedinherited

Definition at line 551 of file trfArgClasses.py.

◆ _supressEmptyStrings

python.trfArgClasses.argList._supressEmptyStrings = supressEmptyStrings
protectedinherited

Definition at line 358 of file trfArgClasses.py.

◆ _type

python.trfArgClasses.argFile._type = type
protectedinherited

Definition at line 550 of file trfArgClasses.py.

◆ _urlType

str python.trfArgClasses.argFile._urlType = None
protectedinherited

Input file globbing and expansion.

Note
TODO: Non-posix URLs. The problem is not so much the [] expansion as the invisible .N attempt number, which can only be dealt with using listdir() functionality. N.B. Current transforms only do globbing on posix fs too (see trfutil.expandStringToList()).

Definition at line 549 of file trfArgClasses.py.
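
For non-posix inputs, valueSetter splits a wildcarded name into a directory part to list and a file mask to filter with. The sketch below shows that split and the filtering step in isolation. It uses fnmatch.fnmatchcase on the bare file name, whereas valueSetter builds a regular expression from the mask and searches the full path, so the two can select different files in edge cases. The function names and the example URL are assumptions made for illustration.

```python
import fnmatch


def split_path_and_mask(filename):
    """Split 'dir/mask' at the last '/', but only when a wildcard is present."""
    if "*" in filename or "?" in filename:
        cut = filename.rfind("/") + 1
        return filename[:cut], filename[cut:]
    return filename, ""


def select(listing, mask):
    """Filter a directory listing by mask; an empty mask keeps everything."""
    if mask == "":
        return list(listing)
    return [name for name in listing if fnmatch.fnmatchcase(name, mask)]


path, mask = split_path_and_mask("root://eos.example//data/run*.root")
chosen = select(["run1.root", "run2.log", "xrun3.root"], mask)
```

Matching with fnmatch anchors the mask to the whole name, so "xrun3.root" is rejected here; an unanchored regular expression search would accept it.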

◆ _value

python.trfArgClasses.argument._value = value
protectedinherited

Definition at line 130 of file trfArgClasses.py.

◆ dataset

python.trfArgClasses.argFile.dataset = v
inherited

Definition at line 666 of file trfArgClasses.py.
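
Line 666 is the point in valueSetter where the dataset is taken from the 'dsn' key of a Tier-0 style dictionary. The following sketch, under simplifying assumptions, shows how such a list of dictionaries maps onto file names, metadata and a single dataset name. unpack_tier0 and KEY_MAP are hypothetical names, and ValueError stands in for trfExceptions.TransformArgException.

```python
# Tier-0 dictionary key -> metadata key, following the branches in valueSetter.
KEY_MAP = {"guid": "file_guid", "events": "nentries", "checksum": "checksum"}


def unpack_tier0(entries):
    """Illustrative only: return (files, metadata, dataset) from Tier-0 dicts."""
    files, metadata, dataset = [], {}, None
    for entry in entries:
        if "lfn" not in entry:
            raise ValueError('Filename (key "lfn") not found: {0}'.format(entry))
        lfn = entry["lfn"]
        files.append(lfn)
        metadata[lfn] = {KEY_MAP[k]: v for k, v in entry.items() if k in KEY_MAP}

        dsn = entry.get("dsn")
        if dsn is not None:
            if not dataset:
                dataset = dsn  # first dataset name seen wins
            elif dataset != dsn:
                raise ValueError(
                    "Inconsistent dataset names: {0} != {1}".format(dataset, dsn))
    return files, metadata, dataset


files, metadata, dataset = unpack_tier0(
    [{"lfn": "f1.tar.bz2", "guid": "G-1", "events": 10, "dsn": "data.example"}])
```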

◆ io

python.trfArgClasses.argFile.io = io
inherited

Definition at line 558 of file trfArgClasses.py.

◆ type

python.trfArgClasses.argFile.type
inherited

Definition at line 1217 of file trfArgClasses.py.


The documentation for this class was generated from the following file: