ATLAS Offline Software
python.PoolFile.PoolFile Class Reference

Public Member Functions

def __init__ (self, fileName, verbose=True)
 
def fileInfos (self)
 
def checkFile (self, sorting=PoolRecord.Sorter.DiskSize)
 
def detailedDump (self, bufferName=None)
 

Public Attributes

 keys
 first we try to fetch the DataHeader
 
 dataHeader
 try to also handle non-T/P separated DataHeaders (from old files)...
 
 augNames
 
 dataHeaderA
 
 data
 
 verbose
 
 poolFile
 
 ROOT
 

Private Member Functions

def __openPoolFile (self, fileName)
 
def __processFile (self)
 

Private Attributes

 _fileInfos
 

Detailed Description

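A simple class to retrieve information about the content of a POOL file.
Ideally it would be independent of the underlying technology used to create
the POOL file (Db, ROOT, ...).
For now it takes the easy, if crude, route of going straight to the
ROOT 'API'.
The constructor also accepts a shelve database holding a previously saved
report; in that case no ROOT file is opened and detailedDump() is not available.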

Definition at line 519 of file PoolFile.py.

Constructor & Destructor Documentation

◆ __init__()

def python.PoolFile.PoolFile.__init__(self, fileName, verbose=True)

Definition at line 528 of file PoolFile.py.

528     def __init__(self, fileName, verbose=True):
529         object.__init__(self)
530
531         self._fileInfos = None
532         self.keys = None
533         self.dataHeader = PoolRecord("DataHeader", 0, 0, 0,
534                                      nEntries = 0,
535                                      dirType = "T")
536         self.augNames = set()
537         self.dataHeaderA = {}
538         self.data = []
539         self.verbose = verbose
540
541         # get the "final" file name (handles all kind of protocols)
542         try:
543             protocol, fileName = file_name(fileName)
544         except Exception as err:
545             print("## warning: problem opening PoolFileCatalog:\n%s"%err)
546             import traceback
547             traceback.print_exc()  # print_exc() reads the active exception itself
548             pass
549
550         self.poolFile = None
551         dbFileName = whichdb( fileName )
552         if dbFileName not in ( None, '' ):
553             if self.verbose is True:
554                 print("## opening file [%s]..." % str(fileName))
555             db = shelve.open( fileName, 'r' )
556             if self.verbose is True:
557                 print("## opening file [OK]")
558             report = db['report']
559             self._fileInfos = report['fileInfos']
560             self.dataHeader = report['dataHeader']
561             self.data = report['data']
562         else:
563             if self.verbose is True:
564                 print("## opening file [%s]..." % str(fileName))
565             self.__openPoolFile( fileName )
566             if self.verbose is True:
567                 print("## opening file [OK]")
568             self.__processFile()
569
570         return
571
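
The constructor picks between a shelve report and a ROOT file depending on what whichdb returns. The following is a minimal standard-library sketch of that dispatch, assuming Python 3, where whichdb lives in the dbm module. The helper name load_report and the report contents are hypothetical and not part of PoolFile.py.

```python
import dbm
import os
import shelve
import tempfile


def load_report(path):
    """Return the stored 'report' entry if `path` is a shelve database, else None.

    Hypothetical helper mirroring the `whichdb(...) not in (None, '')` test.
    """
    # whichdb gives None for a missing/unreadable file, '' for an unrecognised
    # format, and a dbm module name for a database it can identify.
    if dbm.whichdb(path) in (None, ""):
        return None
    with shelve.open(path, "r") as db:
        return db["report"]


tmpdir = tempfile.mkdtemp()

# A shelve database holding a made-up report with the three keys the constructor reads.
shelf_path = os.path.join(tmpdir, "report")
saved = {"fileInfos": {"name": "example.pool.root", "size": 2048},
         "dataHeader": "header-record",
         "data": ["record-1", "record-2"]}
with shelve.open(shelf_path, "c") as db:
    db["report"] = saved

# A plain text file, which whichdb does not recognise as a database.
text_path = os.path.join(tmpdir, "plain.txt")
with open(text_path, "w") as f:
    f.write("not a database file\n")

print(load_report(shelf_path))
print(load_report(text_path))
```

In the class itself the second case falls through to __openPoolFile() and __processFile() instead of returning None.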

Member Function Documentation

◆ __openPoolFile()

def python.PoolFile.PoolFile.__openPoolFile(self, fileName)
private

Definition at line 572 of file PoolFile.py.

572     def __openPoolFile(self, fileName):
573         # hack to prevent ROOT from loading graphic libraries and hence bother
574         # our fellow Mac users
575         if self.verbose is True:
576             print("## importing ROOT...")
577         import PyUtils.RootUtils as ru
578         ROOT = ru.import_root()
579         self.ROOT = ROOT
580         if self.verbose is True:
581             print("## importing ROOT... [DONE]")
582         # prevent ROOT from being too verbose
583         rootMsg = ShutUp()
584         rootMsg.mute()
585         ROOT.gErrorIgnoreLevel = ROOT.kFatal
586
587         poolFile = None
588         try:
589             poolFile = ROOT.TFile.Open( fileName, PoolOpts.READ_MODE )
590         except Exception as e:
591             rootMsg.unMute()
592             print("## Failed to open file [%s] !!" % fileName)
593             print("## Reason:")
594             print(e)
595             print("## Bailing out...")
596             raise IOError("Could not open file [%s]" % fileName)
597
598         rootMsg.unMute()
599
600         if poolFile is None:
601             print("## Failed to open file [%s] !!" % fileName)
602             msg = "Could not open file [%s]" % fileName
603             raise IOError(msg)
604
605         self.poolFile = poolFile
606         assert self.poolFile.IsOpen() and not self.poolFile.IsZombie(), \
607             "Invalid POOL file or a Zombie one"
608         self._fileInfos = {
609             'name' : self.poolFile.GetName(),
610             'size' : self.poolFile.GetSize(),
611         }
612         return
613

◆ __processFile()

def python.PoolFile.PoolFile.__processFile(self)
private

Definition at line 614 of file PoolFile.py.

614     def __processFile(self):
615
616         for name in {PoolOpts.TTreeNames.DataHeader, PoolOpts.RNTupleNames.DataHeader}:
617             dhKey = self.poolFile.FindKey( name )
618             if dhKey:
619                 obj = self.poolFile.Get( name )
620                 if isinstance(obj, self.ROOT.TTree):
621                     nEntries = obj.GetEntries()
622                 elif isRNTuple(obj):
623                     try:
624                         nEntries = self.ROOT.Experimental.RNTupleReader.Open(obj).GetNEntries()
625                     except AttributeError:
626                         # ROOT 6.36 and later
627                         nEntries = self.ROOT.RNTupleReader.Open(obj).GetNEntries()
628                 else:
629                     raise NotImplementedError(f"Keys of type {type(obj)!r} not supported")
630                 break
631         else:
632             nEntries = 0
633
634         keys = []
635         containers = []
636         for k in self.poolFile.GetListOfKeys():
637             keyname = k.GetName()
638             obj = self.poolFile.Get( keyname )
639             if isinstance(obj, self.ROOT.TTree):
640                 containerName = obj.GetName()
641                 nEntries = obj.GetEntries()
642                 dirType = "T"
643             elif isRNTuple(obj):
644                 try:
645                     reader = self.ROOT.Experimental.RNTupleReader.Open(obj)
646                 except AttributeError:
647                     # ROOT 6.36 and later
648                     reader = self.ROOT.RNTupleReader.Open(obj)
649                 containerName = reader.GetDescriptor().GetName()
650                 nEntries = reader.GetNEntries()
651                 dirType = "N"
652             else:
653                 raise NotImplementedError(f"Keys of type {type(obj)!r} not supported")
654             if containerName not in containers:
655                 keys.append(k)
656                 containers.append(containerName)
657                 pass
658             if keyname.startswith(PoolOpts.POOL_HEADER) and not keyname.endswith('Form'):
659                 self.dataHeaderA[PoolOpts.augmentationName(keyname)] = \
660                     PoolRecord("DataHeader", 0, 0, 0,
661                                nEntries = nEntries,
662                                dirType = dirType)
663
664         keys.sort (key = lambda x: x.GetName())
665         self.keys = keys
666         del containers
667
668         for k in keys:
669             obj = self.poolFile.Get( k.GetName() )
670             if isinstance(obj, self.ROOT.TTree):
671                 name = obj.GetName()
672             elif isRNTuple(obj):
673                 try:
674                     inspector = self.ROOT.Experimental.RNTupleInspector.Create(obj)
675                 except AttributeError:
676                     inspector = self.ROOT.RNTupleInspector.Create(obj)
677                 name = inspector.GetDescriptor().GetName()
678
679             if PoolOpts.isDataHeader(name):
680                 contName = "DataHeader"
681                 if isinstance(obj, self.ROOT.TTree):
682                     memSize = obj.GetTotBytes() / Units.kb
683                     diskSize = obj.GetZipBytes() / Units.kb
684                     memSizeNoZip = 0.0
685                     if diskSize < 0.001:
686                         memSizeNoZip = memSize
687                     nEntries = obj.GetEntries()
688
690                     dhBranchNames = [
691                         br.GetName() for br in obj.GetListOfBranches()
692                         if br.GetName().count("DataHeader_p") > 0
693                     ]
694                     if len(dhBranchNames) == 1:
695                         dhBranch = obj.GetBranch(dhBranchNames[0])
696                         typeName = dhBranch.GetClassName()
697                         if not typeName and (leaf := dhBranch.GetListOfLeaves().At(0)):
698                             typeName = leaf.GetTypeName()
699                         poolRecord = retrieveBranchInfos(
700                             dhBranch,
701                             PoolRecord( contName, 0., 0., 0.,
702                                         nEntries,
703                                         dirType = "T",
704                                         typeName = typeName ),
705                             ident = " "
706                         )
707                     else:
708                         poolRecord = PoolRecord(contName, memSize, diskSize, memSizeNoZip,
709                                                 nEntries,
710                                                 dirType = "T")
711
712                     self.dataHeader = poolRecord
713                 elif isRNTuple(obj):
714                     diskSize = inspector.GetCompressedSize() / Units.kb
715                     memSize = inspector.GetUncompressedSize() / Units.kb
716
717                     memSizeNoZip = 0.0
718                     if diskSize < 0.001:
719                         memSizeNoZip = memSize
720                     nEntries = inspector.GetDescriptor().GetNEntries()
721                     poolRecord = PoolRecord(contName, memSize, diskSize, memSizeNoZip,
722                                             nEntries,
723                                             dirType = "N")
724                     self.dataHeader = poolRecord
725             elif PoolOpts.isData(name):
726                 if isinstance(obj, self.ROOT.TTree):
727                     if not hasattr(obj, 'GetListOfBranches'):
728                         continue
729                     branches = obj.GetListOfBranches()
730                     dirType = "T"
731                     if name in (PoolOpts.EVENT_DATA, PoolOpts.META_DATA):
732                         dirType = "B"
733                     for branch in branches:
734                         poolRecord = retrieveBranchInfos(
735                             branch,
736                             make_pool_record(branch, dirType),
737                             ident = " "
738                         )
739                         poolRecord.augName = PoolOpts.augmentationName(name)
740                         self.augNames.add(poolRecord.augName)
741                         self.data += [ poolRecord ]
742                 elif isRNTuple(obj):
743                     descriptor = inspector.GetDescriptor()
744                     dirType = "N"
745                     if name in {PoolOpts.RNTupleNames.EventData, PoolOpts.RNTupleNames.MetaData}:
746                         dirType = "F"
747                     fieldZeroId = descriptor.GetFieldZeroId()
748                     for fieldDescriptor in descriptor.GetFieldIterable(fieldZeroId):
749                         fieldId = fieldDescriptor.GetId()
750                         fieldTreeInspector = inspector.GetFieldTreeInspector(fieldId)
751                         diskSize = fieldTreeInspector.GetCompressedSize() / Units.kb
752                         memSize = fieldTreeInspector.GetUncompressedSize() / Units.kb
753                         typeName = fieldDescriptor.GetTypeName()
754                         fieldName = fieldDescriptor.GetFieldName()
755                         poolRecord = PoolRecord(fieldName, memSize, diskSize, memSize,
756                                                 descriptor.GetNEntries(),
757                                                 dirType=dirType,
758                                                 typeName=typeName)
759                         poolRecord.augName = PoolOpts.augmentationName(name)
760                         self.augNames.add(poolRecord.augName)
761                         self.data += [ poolRecord ]
762             # loop over keys
763
764         return
765
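
The listing repeats one pattern three times: try the class under ROOT.Experimental and, on AttributeError, fall back to the same name directly under ROOT (the comments attribute the latter to ROOT 6.36 and later). The following is a minimal sketch of that lookup factored into a helper. It uses SimpleNamespace objects as stand-ins for the ROOT module, so it runs without ROOT. The helper name resolve_class is hypothetical and does not exist in PoolFile.py.

```python
from types import SimpleNamespace


def resolve_class(root, name):
    """Return root.Experimental.<name> if present, else root.<name>.

    Hypothetical helper; `root` stands in for the imported ROOT module.
    """
    try:
        return getattr(root.Experimental, name)
    except AttributeError:
        # Either there is no Experimental namespace or it lacks the class.
        return getattr(root, name)


# Stand-ins for two ROOT layouts (plain strings replace the real classes).
old_layout = SimpleNamespace(
    Experimental=SimpleNamespace(RNTupleReader="experimental reader"))
new_layout = SimpleNamespace(
    Experimental=SimpleNamespace(),
    RNTupleReader="stable reader")

print(resolve_class(old_layout, "RNTupleReader"))
print(resolve_class(new_layout, "RNTupleReader"))
```

Resolving the class before calling Open() or Create() on it would keep an AttributeError raised inside those calls from being mistaken for a missing namespace.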

◆ checkFile()

def python.PoolFile.PoolFile.checkFile(self, sorting=PoolRecord.Sorter.DiskSize)

Definition at line 774 of file PoolFile.py.

774     def checkFile(self, sorting = PoolRecord.Sorter.DiskSize):
775         if self.verbose is True:
776             print(self.fileInfos())
777             if len(self.augNames) > 1:
778                 for aug in self.augNames:
779                     if len(aug) > 0:
780                         print( "Nbr %s Events: %i" % (aug, self.dataHeaderA[aug].nEntries) )
781
782
783         data = self.data
784         if sorting in PoolRecord.Sorter.allowedValues():
785             import operator
786             data.sort(key = operator.attrgetter(sorting) )
787
788         def _get_val(x, dflt=-999.):
789             if PoolOpts.FAST_MODE:
790                 return dflt
791             return x
792
793         totMemSize = _get_val(self.dataHeader.memSize, dflt=0.)
794         totDiskSize = self.dataHeader.diskSize
795
796         def _safe_div(num,den):
797             if float(den) == 0.:
798                 return 0.
799             return num/den
800
801         if self.verbose is True:
802             print("")
803             print("="*80)
804             print(PoolOpts.HDR_FORMAT % ( "Mem Size", "Disk Size","Size/Evt",
805                                           "MissZip/Mem","items",
806                                           "(X) Container Name (X=Tree|Branch)" ))
807             print("="*80)
808
809             print(PoolOpts.ROW_FORMAT % (
810                 _get_val (self.dataHeader.memSize),
811                 self.dataHeader.diskSize,
812                 _safe_div(self.dataHeader.diskSize,float(self.dataHeader.nEntries)),
813                 _get_val (_safe_div(self.dataHeader.memSizeNoZip,
814                                     self.dataHeader.memSize)),
815                 self.dataHeader.nEntries,
816                 "("+self.dataHeader.dirType+") "+self.dataHeader.name
817                 ))
818             print("-"*80)
819
820         totMemSizeA = {}
821         totDiskSizeA = {}
822         for d in data:
823             totMemSize += 0. if PoolOpts.FAST_MODE else d.memSize
824             totDiskSize += d.diskSize
825             memSizeNoZip = d.memSizeNoZip/d.memSize if d.memSize != 0. else 0.
826             aug = d.augName
827             totMemSizeA[aug] = totMemSizeA.get(aug,0.) + d.memSize
828             totDiskSizeA[aug] = totDiskSizeA.get(aug,0.) + d.diskSize
829             if self.verbose is True:
830                 print(PoolOpts.ROW_FORMAT % (
831                     _get_val (d.memSize),
832                     d.diskSize,
833                     _safe_div(d.diskSize, float(self.dataHeader.nEntries)),
834                     _get_val (memSizeNoZip),
835                     d.nEntries,
836                     "("+d.dirType+") "+d.name
837                     ))
838
839         if self.verbose is True:
840             print("="*80)
841             if len(self.augNames) > 1:
842                 augs = sorted(self.augNames)
843                 for a in augs:
844                     print(PoolOpts.ROW_FORMAT % (
845                         totMemSizeA[a], totDiskSizeA[a],
846                         _safe_div(totDiskSizeA[a], float(self.dataHeaderA[a].nEntries)),
847                         0.0,
848                         self.dataHeaderA[a].nEntries,
849                         "Aug Stream: " + ('MAIN' if a=='' else a)
850                         ))
851                 print("-"*80)
852             print(PoolOpts.ROW_FORMAT % (
853                 totMemSize, totDiskSize,
854                 _safe_div(totDiskSize, float(self.dataHeader.nEntries)),
855                 0.0, self.dataHeader.nEntries,
856                 "TOTAL (POOL containers)"
857                 ))
858             print("="*80)
859             if PoolOpts.FAST_MODE:
860                 print("::: warning: FAST_MODE was enabled: some columns' content ",)
861                 print("is meaningless...")
862         return
863
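
The bookkeeping in checkFile() comes down to three steps: sort the records by a named attribute, accumulate sizes per augmentation stream, and divide by the event count with a guard against zero. The following is a minimal sketch of those steps with made-up numbers. The Record tuple is a hypothetical stand-in for PoolRecord, and summarize is not a function of PoolFile.py.

```python
import operator
from collections import namedtuple

# Hypothetical stand-in for PoolRecord with only the fields used below.
Record = namedtuple("Record", "name memSize diskSize augName")


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return 0.0 if float(den) == 0.0 else num / den


def summarize(records, n_events, sorting="diskSize"):
    """Sort records by attribute, total disk size per stream, size per event."""
    ordered = sorted(records, key=operator.attrgetter(sorting))
    disk_by_aug = {}
    for rec in ordered:
        disk_by_aug[rec.augName] = disk_by_aug.get(rec.augName, 0.0) + rec.diskSize
    total_disk = sum(disk_by_aug.values())
    return ordered, disk_by_aug, safe_div(total_disk, n_events)


records = [
    Record("Jets", 30.0, 12.0, ""),         # '' is shown as the MAIN stream
    Record("EventInfo", 10.0, 4.0, ""),
    Record("Muons", 8.0, 2.0, "AUG1"),
]
ordered, disk_by_aug, disk_per_event = summarize(records, n_events=4)
print([r.name for r in ordered], disk_by_aug, disk_per_event)
```

Unlike the listing, which sorts self.data in place, this sketch leaves its input list untouched.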

◆ detailedDump()

def python.PoolFile.PoolFile.detailedDump(self, bufferName=None)

Definition at line 864 of file PoolFile.py.

864     def detailedDump(self, bufferName = None ):
865         if self.poolFile is None or \
866            self.keys is None:
867             print("Can't perform a detailedDump with a shelve file as input !")
868             return
869
870         if bufferName is None:
871             bufferName = "/dev/stdout"
872         out = open( bufferName, "w" )
873         sys.stdout.flush()
874         save_stdout_fileno = os.dup (sys.stdout.fileno())
875         os.dup2( out.fileno(), sys.stdout.fileno() )
876
877         out.write( "#" * 80 + os.linesep )
878         out.write( "## detailed dump" + os.linesep )
879         out.flush()
880
881         for key in self.keys:
882             tree = key.ReadObj()
883             name = tree.GetName()
884
885             if PoolOpts.isDataHeader(name) or \
886                PoolOpts.isData(name):
887                 try:
888                     print ("=== [%s] ===" % name, file=sys.stderr)
889                     tree.Print()
890                 except Exception as err:
891                     print ("Caught:",err, file=sys.stderr)
892                     print (sys.exc_info()[0], file=sys.stderr)
893                     print (sys.exc_info()[1], file=sys.stderr)
894                     pass
895                 pass
896             pass
897         out.write( "#" * 80 + os.linesep )
898         out.flush()
899         out.write( "#" * 80 + os.linesep )
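
The listing duplicates the stdout descriptor with os.dup before pointing it at the output file with os.dup2, so that output written at the descriptor level also lands in the file. The excerpt above ends before showing what is done with the saved descriptor. The following is a minimal sketch of the same technique that restores the descriptor on exit. It works on an arbitrary descriptor, so it can be tried without touching the real stdout. The name redirect_fd is hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def redirect_fd(fd, path):
    """Temporarily make descriptor `fd` write to the file at `path`."""
    saved = os.dup(fd)                  # keep a handle on the original target
    try:
        with open(path, "w") as out:
            os.dup2(out.fileno(), fd)   # fd now refers to the file at `path`
            yield
    finally:
        os.dup2(saved, fd)              # point fd back at the original target
        os.close(saved)


tmpdir = tempfile.mkdtemp()
main_path = os.path.join(tmpdir, "main.txt")
dump_path = os.path.join(tmpdir, "dump.txt")

fd = os.open(main_path, os.O_WRONLY | os.O_CREAT)
os.write(fd, b"before\n")
with redirect_fd(fd, dump_path):
    os.write(fd, b"redirected\n")       # lands in dump.txt
os.write(fd, b"after\n")                # lands in main.txt again
os.close(fd)
```

When the descriptor belongs to a buffered Python stream such as sys.stdout, the stream should be flushed before the redirection, as the listing does with sys.stdout.flush().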

◆ fileInfos()

def python.PoolFile.PoolFile.fileInfos(self)

Definition at line 766 of file PoolFile.py.

766     def fileInfos(self):
767         return os.linesep.join( [
768             "File:" + self._fileInfos['name'],
769             "Size: %12.3f kb" % (self._fileInfos['size'] / Units.kb),
770             "Nbr Events: %i" % self.dataHeader.nEntries
771             ] )
772
773
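
The following is a minimal sketch of the three-line summary that fileInfos() builds. The value of Units.kb is not shown on this page, so the constant KB = 1024.0 below is an assumption. The function name and sample values are also hypothetical.

```python
import os

KB = 1024.0  # assumption: stand-in for Units.kb, whose value is not shown here


def format_file_infos(name, size_bytes, n_events):
    """Build a three-line summary in the same layout as fileInfos()."""
    return os.linesep.join([
        "File:" + name,
        "Size: %12.3f kb" % (size_bytes / KB),
        "Nbr Events: %i" % n_events,
    ])


summary = format_file_infos("example.pool.root", 2048, 10)
print(summary)
```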

Member Data Documentation

◆ _fileInfos

python.PoolFile.PoolFile._fileInfos
private

Definition at line 531 of file PoolFile.py.

◆ augNames

python.PoolFile.PoolFile.augNames

Definition at line 536 of file PoolFile.py.

◆ data

python.PoolFile.PoolFile.data

Definition at line 538 of file PoolFile.py.

◆ dataHeader

python.PoolFile.PoolFile.dataHeader

try to also handle non-T/P separated DataHeaders (from old files)...

Definition at line 533 of file PoolFile.py.

◆ dataHeaderA

python.PoolFile.PoolFile.dataHeaderA

Definition at line 537 of file PoolFile.py.

◆ keys

python.PoolFile.PoolFile.keys

first we try to fetch the DataHeader

Definition at line 532 of file PoolFile.py.

◆ poolFile

python.PoolFile.PoolFile.poolFile

Definition at line 550 of file PoolFile.py.

◆ ROOT

python.PoolFile.PoolFile.ROOT

Definition at line 579 of file PoolFile.py.

◆ verbose

python.PoolFile.PoolFile.verbose

Definition at line 539 of file PoolFile.py.


The documentation for this class was generated from the following file:
PoolFile.py

Helper functions from python.PoolFile referenced in the listings above:
python.PoolFile.isRNTuple
def isRNTuple(obj)
Definition: PoolFile.py:36
python.PoolFile.file_name
def file_name(fname)
Definition: PoolFile.py:321
python.PoolFile.retrieveBranchInfos
def retrieveBranchInfos(branch, poolRecord, ident="")
Definition: PoolFile.py:420
python.PoolFile.make_pool_record
def make_pool_record(branch, dirType)
Definition: PoolFile.py:441
Definition: LArMinBiasAlgConfig.py:65