ATLAS Offline Software
python.PoolFile.PoolFile Class Reference

Public Member Functions

def __init__ (self, fileName, verbose=True)
 
def fileInfos (self)
 
def checkFile (self, sorting=PoolRecord.Sorter.DiskSize)
 
def detailedDump (self, bufferName=None)
 

Public Attributes

 keys
 Sorted list of ROOT keys, one per distinct container, filled by __processFile(); stays None for shelve input.
 
 dataHeader
 PoolRecord describing the DataHeader container; non-T/P separated DataHeaders (from old files) are handled as well.
 
 augNames
 
 dataHeaderA
 
 data
 
 verbose
 
 poolFile
 
 ROOT
 

Private Member Functions

def __openPoolFile (self, fileName)
 
def __processFile (self)
 

Private Attributes

 _fileInfos
 

Detailed Description

A simple class to retrieve information about the content of a POOL file.
It should be independent of the underlying technology used to create this
POOL file (Db, ROOT, ...).
Right now, we are using the easy and lousy solution: going straight to the
ROOT 'API'.

Definition at line 519 of file PoolFile.py.
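
A minimal usage sketch of the public interface listed above. It only relies on members documented on this page (the constructor, fileInfos(), dataHeader and data). The helper name summarize_pool_file is hypothetical, and the class is passed in as a parameter so that the helper itself has no hard dependency on ROOT or an ATLAS release. The import path PyUtils.PoolFile in the trailing comment is an assumption; this page only names the class as python.PoolFile.PoolFile.

```python
def summarize_pool_file(path, pool_file_cls):
    """Open `path` quietly and return (file info text, event count, disk sizes).

    `pool_file_cls` is expected to behave like PoolFile: a constructor taking
    (fileName, verbose), a fileInfos() method, and the `dataHeader` / `data`
    attributes documented on this page.
    """
    pool_file = pool_file_cls(path, verbose=False)
    # One entry per record in `data`: container name -> size on disk.
    sizes = {rec.name: rec.diskSize for rec in pool_file.data}
    return pool_file.fileInfos(), pool_file.dataHeader.nEntries, sizes


# Inside an ATLAS release one might call it like this (assumed import path,
# hypothetical file name):
#
#   from PyUtils.PoolFile import PoolFile
#   info, n_events, sizes = summarize_pool_file("my.pool.root", PoolFile)
#   print(info)
```

Passing verbose=False keeps the constructor from printing its progress messages, which is convenient when the result is consumed by other code rather than read on a terminal.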

Constructor & Destructor Documentation

◆ __init__()

def python.PoolFile.PoolFile.__init__(self, fileName, verbose=True)

Definition at line 528 of file PoolFile.py.

528     def __init__(self, fileName, verbose=True):
529         object.__init__(self)
530
531         self._fileInfos = None
532         self.keys = None
533         self.dataHeader = PoolRecord("DataHeader", 0, 0, 0,
534                                      nEntries = 0,
535                                      dirType = "T")
536         self.augNames = set()
537         self.dataHeaderA = {}
538         self.data = []
539         self.verbose = verbose
540
541         # get the "final" file name (handles all kinds of protocols)
542         try:
543             protocol, fileName = file_name(fileName)
544         except Exception as err:
545             print("## warning: problem opening PoolFileCatalog:\n%s"%err)
546             import traceback
547             traceback.print_exc()  # first parameter is `limit`, not the exception
548             pass
549
550         self.poolFile = None
551         dbFileName = whichdb( fileName )
552         if dbFileName not in ( None, '' ):
553             if self.verbose is True:
554                 print("## opening file [%s]..." % str(fileName))
555             db = shelve.open( fileName, 'r' )
556             if self.verbose is True:
557                 print("## opening file [OK]")
558             report = db['report']
559             self._fileInfos = report['fileInfos']
560             self.dataHeader = report['dataHeader']
561             self.data = report['data']
562         else:
563             if self.verbose is True:
564                 print("## opening file [%s]..." % str(fileName))
565             self.__openPoolFile( fileName )
566             if self.verbose is True:
567                 print("## opening file [OK]")
568             self.__processFile()
569
570         return
571 
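The constructor above chooses between two kinds of input: a shelve database holding a previously saved 'report', or a ROOT file. The following is a minimal, self-contained sketch of that dispatch. It assumes that whichdb is the standard-library dbm.whichdb (the listing does not show its import). The helper names classify_input, write_report and read_report are hypothetical, and the dbm.dumb backend is forced only to keep the example portable.

```python
import dbm
import dbm.dumb
import shelve


def classify_input(path):
    """Return 'shelve' if `path` is a recognised dbm database, else 'root'.

    dbm.whichdb returns None for a file that cannot be read and '' for an
    unrecognised format; those are the two values the constructor treats as
    "not a shelve".
    """
    return "shelve" if dbm.whichdb(path) not in (None, "") else "root"


def write_report(path, report):
    """Store `report` under the 'report' key (dbm.dumb backend, for portability)."""
    with shelve.Shelf(dbm.dumb.open(path, "c")) as db:
        db["report"] = report


def read_report(path):
    """Read the 'report' entry back, mirroring the shelve branch of the constructor."""
    with shelve.open(path, "r") as db:
        return db["report"]
```

In the shelve branch only fileInfos, dataHeader and data are restored, so poolFile and keys stay None. That is why detailedDump() refuses to run on shelve input.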

Member Function Documentation

◆ __openPoolFile()

def python.PoolFile.PoolFile.__openPoolFile(self, fileName)
private

Definition at line 572 of file PoolFile.py.

572     def __openPoolFile(self, fileName):
573         # hack to prevent ROOT from loading graphic libraries and hence bother
574         # our fellow Mac users
575         if self.verbose is True:
576             print("## importing ROOT...")
577         import PyUtils.RootUtils as ru
578         ROOT = ru.import_root()
579         self.ROOT = ROOT
580         if self.verbose is True:
581             print("## importing ROOT... [DONE]")
582         # prevent ROOT from being too verbose
583         rootMsg = ShutUp()
584         rootMsg.mute()
585         ROOT.gErrorIgnoreLevel = ROOT.kFatal
586
587         poolFile = None
588         try:
589             poolFile = ROOT.TFile.Open( fileName, PoolOpts.READ_MODE )
590         except Exception as e:
591             rootMsg.unMute()
592             print("## Failed to open file [%s] !!" % fileName)
593             print("## Reason:")
594             print(e)
595             print("## Bailing out...")
596             raise IOError("Could not open file [%s]" % fileName)
597
598         rootMsg.unMute()
599
600         if poolFile is None:
601             print("## Failed to open file [%s] !!" % fileName)
602             msg = "Could not open file [%s]" % fileName
603             raise IOError(msg)
604
605         self.poolFile = poolFile
606         assert self.poolFile.IsOpen() and not self.poolFile.IsZombie(), \
607             "Invalid POOL file or a Zombie one"
608         self._fileInfos = {
609             'name' : self.poolFile.GetName(),
610             'size' : self.poolFile.GetSize(),
611         }
612         return
613 

◆ __processFile()

def python.PoolFile.PoolFile.__processFile(self)
private

Definition at line 614 of file PoolFile.py.

614     def __processFile(self):
615
616         for name in {PoolOpts.TTreeNames.DataHeader, PoolOpts.RNTupleNames.DataHeader}:
617             dhKey = self.poolFile.FindKey( name )
618             if dhKey:
619                 obj = self.poolFile.Get( name )
620                 if isinstance(obj, self.ROOT.TTree):
621                     nEntries = obj.GetEntries()
622                 elif isRNTuple(obj):
623                     nEntries = self.ROOT.Experimental.RNTupleReader.Open(obj).GetNEntries()
624                 else:
625                     raise NotImplementedError(f"Keys of type {type(obj)!r} not supported")
626                 break
627         else:
628             nEntries = 0
629
630         keys = []
631         containers = []
632         for k in self.poolFile.GetListOfKeys():
633             keyname = k.GetName()
634             obj = self.poolFile.Get( keyname )
635             if isinstance(obj, self.ROOT.TTree):
636                 containerName = obj.GetName()
637                 nEntries = obj.GetEntries()
638                 dirType = "T"
639             elif isRNTuple(obj):
640                 reader = self.ROOT.Experimental.RNTupleReader.Open(obj)
641                 containerName = reader.GetDescriptor().GetName()
642                 nEntries = reader.GetNEntries()
643                 dirType = "N"
644             else:
645                 raise NotImplementedError(f"Keys of type {type(obj)!r} not supported")
646             if containerName not in containers:
647                 keys.append(k)
648                 containers.append(containerName)
649                 pass
650             if keyname.startswith(PoolOpts.POOL_HEADER) and not keyname.endswith('Form'):
651                 self.dataHeaderA[PoolOpts.augmentationName(keyname)] = \
652                     PoolRecord("DataHeader", 0, 0, 0,
653                                nEntries = nEntries,
654                                dirType = dirType)
655
656         keys.sort (key = lambda x: x.GetName())
657         self.keys = keys
658         del containers
659
660         for k in keys:
661             obj = self.poolFile.Get( k.GetName() )
662             if isinstance(obj, self.ROOT.TTree):
663                 name = obj.GetName()
664             elif isRNTuple(obj):
665                 inspector = self.ROOT.Experimental.RNTupleInspector.Create(obj)
666                 name = inspector.GetDescriptor().GetName()
667
668             if PoolOpts.isDataHeader(name):
669                 contName = "DataHeader"
670                 if isinstance(obj, self.ROOT.TTree):
671                     memSize = obj.GetTotBytes() / Units.kb
672                     diskSize = obj.GetZipBytes() / Units.kb
673                     memSizeNoZip = 0.0
674                     if diskSize < 0.001:
675                         memSizeNoZip = memSize
676                     nEntries = obj.GetEntries()
677
679                     dhBranchNames = [
680                         br.GetName() for br in obj.GetListOfBranches()
681                         if br.GetName().count("DataHeader_p") > 0
682                     ]
683                     if len(dhBranchNames) == 1:
684                         dhBranch = obj.GetBranch(dhBranchNames[0])
685                         typeName = dhBranch.GetClassName()
686                         if not typeName and (leaf := dhBranch.GetListOfLeaves().At(0)):
687                             typeName = leaf.GetTypeName()
688                         poolRecord = retrieveBranchInfos(
689                             dhBranch,
690                             PoolRecord( contName, 0., 0., 0.,
691                                         nEntries,
692                                         dirType = "T",
693                                         typeName = typeName ),
694                             ident = " "
695                         )
696                     else:
697                         poolRecord = PoolRecord(contName, memSize, diskSize, memSizeNoZip,
698                                                 nEntries,
699                                                 dirType = "T")
700
701                     self.dataHeader = poolRecord
702                 elif isRNTuple(obj):
703                     diskSize = inspector.GetCompressedSize() / Units.kb
704                     memSize = inspector.GetUncompressedSize() / Units.kb
705
706                     memSizeNoZip = 0.0
707                     if diskSize < 0.001:
708                         memSizeNoZip = memSize
709                     nEntries = inspector.GetDescriptor().GetNEntries()
710                     poolRecord = PoolRecord(contName, memSize, diskSize, memSizeNoZip,
711                                             nEntries,
712                                             dirType = "N")
713                     self.dataHeader = poolRecord
714             elif PoolOpts.isData(name):
715                 if isinstance(obj, self.ROOT.TTree):
716                     if not hasattr(obj, 'GetListOfBranches'):
717                         continue
718                     branches = obj.GetListOfBranches()
719                     dirType = "T"
720                     if name in (PoolOpts.EVENT_DATA, PoolOpts.META_DATA):
721                         dirType = "B"
722                     for branch in branches:
723                         poolRecord = retrieveBranchInfos(
724                             branch,
725                             make_pool_record(branch, dirType),
726                             ident = " "
727                         )
728                         poolRecord.augName = PoolOpts.augmentationName(name)
729                         self.augNames.add(poolRecord.augName)
730                         self.data += [ poolRecord ]
731                 elif isRNTuple(obj):
732                     descriptor = inspector.GetDescriptor()
733                     dirType = "N"
734                     if name in {PoolOpts.RNTupleNames.EventData, PoolOpts.RNTupleNames.MetaData}:
735                         dirType = "F"
736                     fieldZeroId = descriptor.GetFieldZeroId()
737                     for fieldDescriptor in descriptor.GetFieldIterable(fieldZeroId):
738                         fieldId = fieldDescriptor.GetId()
739                         fieldTreeInspector = inspector.GetFieldTreeInspector(fieldId)
740                         diskSize = fieldTreeInspector.GetCompressedSize() / Units.kb
741                         memSize = fieldTreeInspector.GetUncompressedSize() / Units.kb
742                         typeName = fieldDescriptor.GetTypeName()
743                         fieldName = fieldDescriptor.GetFieldName()
744                         poolRecord = PoolRecord(fieldName, memSize, diskSize, memSize,
745                                                 descriptor.GetNEntries(),
746                                                 dirType=dirType,
747                                                 typeName=typeName)
748                         poolRecord.augName = PoolOpts.augmentationName(name)
749                         self.augNames.add(poolRecord.augName)
750                         self.data += [ poolRecord ]
751             # loop over keys
752
753         return
754 
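The second loop of __processFile() keeps only the first key seen for each container name and then sorts the survivors by key name. A minimal sketch of that step on plain Python objects, without ROOT, is shown below. FakeKey is a hypothetical stand-in for a ROOT key, and unique_sorted_keys is an illustrative name, not part of PoolFile.

```python
from collections import namedtuple

# Hypothetical stand-in for a ROOT key: its own name plus the name of the
# container (tree or RNTuple) it points to.
FakeKey = namedtuple("FakeKey", ["name", "container"])


def unique_sorted_keys(keys):
    """Keep the first key per container name, then sort the result by key name."""
    seen = set()
    kept = []
    for key in keys:
        if key.container not in seen:
            seen.add(key.container)
            kept.append(key)
    kept.sort(key=lambda k: k.name)
    return kept


keys = [
    FakeKey("POOLContainer", "POOLContainer"),
    FakeKey("CollectionTree", "CollectionTree"),
    FakeKey("CollectionTree", "CollectionTree"),  # repeated container name
]
print([k.name for k in unique_sorted_keys(keys)])
```

The listing tracks seen containers in a list; the sketch uses a set, which gives the same result with constant-time membership checks.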

◆ checkFile()

def python.PoolFile.PoolFile.checkFile(self, sorting=PoolRecord.Sorter.DiskSize)

Definition at line 763 of file PoolFile.py.

763     def checkFile(self, sorting = PoolRecord.Sorter.DiskSize):
764         if self.verbose is True:
765             print(self.fileInfos())
766             if len(self.augNames) > 1:
767                 for aug in self.augNames:
768                     if len(aug) > 0:
769                         print( "Nbr %s Events: %i" % (aug, self.dataHeaderA[aug].nEntries) )
770
771
772         data = self.data
773         if sorting in PoolRecord.Sorter.allowedValues():
774             import operator
775             data.sort(key = operator.attrgetter(sorting) )
776
777         def _get_val(x, dflt=-999.):
778             if PoolOpts.FAST_MODE:
779                 return dflt
780             return x
781
782         totMemSize = _get_val(self.dataHeader.memSize, dflt=0.)
783         totDiskSize = self.dataHeader.diskSize
784
785         def _safe_div(num,den):
786             if float(den) == 0.:
787                 return 0.
788             return num/den
789
790         if self.verbose is True:
791             print("")
792             print("="*80)
793             print(PoolOpts.HDR_FORMAT % ( "Mem Size", "Disk Size","Size/Evt",
794                                           "MissZip/Mem","items",
795                                           "(X) Container Name (X=Tree|Branch)" ))
796             print("="*80)
797
798             print(PoolOpts.ROW_FORMAT % (
799                 _get_val (self.dataHeader.memSize),
800                 self.dataHeader.diskSize,
801                 _safe_div(self.dataHeader.diskSize,float(self.dataHeader.nEntries)),
802                 _get_val (_safe_div(self.dataHeader.memSizeNoZip,
803                                     self.dataHeader.memSize)),
804                 self.dataHeader.nEntries,
805                 "("+self.dataHeader.dirType+") "+self.dataHeader.name
806                 ))
807             print("-"*80)
808
809         totMemSizeA = {}
810         totDiskSizeA = {}
811         for d in data:
812             totMemSize += 0. if PoolOpts.FAST_MODE else d.memSize
813             totDiskSize += d.diskSize
814             memSizeNoZip = d.memSizeNoZip/d.memSize if d.memSize != 0. else 0.
815             aug = d.augName
816             totMemSizeA[aug] = totMemSizeA.get(aug,0.) + d.memSize
817             totDiskSizeA[aug] = totDiskSizeA.get(aug,0.) + d.diskSize
818             if self.verbose is True:
819                 print(PoolOpts.ROW_FORMAT % (
820                     _get_val (d.memSize),
821                     d.diskSize,
822                     _safe_div(d.diskSize, float(self.dataHeader.nEntries)),
823                     _get_val (memSizeNoZip),
824                     d.nEntries,
825                     "("+d.dirType+") "+d.name
826                     ))
827
828         if self.verbose is True:
829             print("="*80)
830             if len(self.augNames) > 1:
831                 augs = sorted(self.augNames)
832                 for a in augs:
833                     print(PoolOpts.ROW_FORMAT % (
834                         totMemSizeA[a], totDiskSizeA[a],
835                         _safe_div(totDiskSizeA[a], float(self.dataHeaderA[a].nEntries)),
836                         0.0,
837                         self.dataHeaderA[a].nEntries,
838                         "Aug Stream: " + ('MAIN' if a=='' else a)
839                         ))
840                 print("-"*80)
841             print(PoolOpts.ROW_FORMAT % (
842                 totMemSize, totDiskSize,
843                 _safe_div(totDiskSize, float(self.dataHeader.nEntries)),
844                 0.0, self.dataHeader.nEntries,
845                 "TOTAL (POOL containers)"
846                 ))
847             print("="*80)
848             if PoolOpts.FAST_MODE:
849                 print("::: warning: FAST_MODE was enabled: some columns' content ",)
850                 print("is meaningless...")
851         return
852 
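The bookkeeping in checkFile() boils down to three steps: sort the records by one attribute, accumulate totals overall and per augmentation stream, and divide by the event count with a guard against zero. The sketch below reproduces those steps on a hypothetical Record dataclass that stands in for PoolRecord. It is a simplification: it ignores FAST_MODE and the DataHeader's own contribution to the totals, and the names Record, safe_div and summarize are illustrative only.

```python
import operator
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for PoolRecord with only the fields used here."""
    name: str
    memSize: float
    diskSize: float
    augName: str = ""


def safe_div(num, den):
    """Divide, returning 0.0 instead of raising when the denominator is zero."""
    return 0.0 if float(den) == 0.0 else num / den


def summarize(records, n_events, sort_by="diskSize"):
    """Sort records by `sort_by` and accumulate total and per-stream disk sizes."""
    ordered = sorted(records, key=operator.attrgetter(sort_by))
    per_stream = {}
    total = 0.0
    for rec in ordered:
        total += rec.diskSize
        per_stream[rec.augName] = per_stream.get(rec.augName, 0.0) + rec.diskSize
    return {
        "order": [rec.name for rec in ordered],
        "total_disk": total,
        "disk_per_event": safe_div(total, n_events),
        "per_stream": per_stream,
    }
```

The zero guard matters for files without events: the DataHeader then reports zero entries and the per-event column is shown as 0 rather than raising ZeroDivisionError.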

◆ detailedDump()

def python.PoolFile.PoolFile.detailedDump(self, bufferName=None)

Definition at line 853 of file PoolFile.py.

853     def detailedDump(self, bufferName = None ):
854         if self.poolFile is None or \
855            self.keys is None:
856             print("Can't perform a detailedDump with a shelve file as input !")
857             return
858
859         if bufferName is None:
860             bufferName = "/dev/stdout"
861         out = open( bufferName, "w" )
862         sys.stdout.flush()
863         save_stdout_fileno = os.dup (sys.stdout.fileno())
864         os.dup2( out.fileno(), sys.stdout.fileno() )
865
866         out.write( "#" * 80 + os.linesep )
867         out.write( "## detailed dump" + os.linesep )
868         out.flush()
869
870         for key in self.keys:
871             tree = key.ReadObj()
872             name = tree.GetName()
873
874             if PoolOpts.isDataHeader(name) or \
875                PoolOpts.isData(name):
876                 try:
877                     print ("=== [%s] ===" % name, file=sys.stderr)
878                     tree.Print()
879                 except Exception as err:
880                     print ("Caught:",err, file=sys.stderr)
881                     print (sys.exc_info()[0], file=sys.stderr)
882                     print (sys.exc_info()[1], file=sys.stderr)
883                     pass
884                 pass
885             pass
886         out.write( "#" * 80 + os.linesep )
887         out.flush()
888         out.write( "#" * 80 + os.linesep )
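
detailedDump() redirects output at the file-descriptor level (os.dup and os.dup2) rather than by reassigning sys.stdout, so that text written by compiled code such as tree.Print() also lands in the output file. The listing shown here stops at line 888 and does not show how the saved descriptor is used afterwards. The sketch below is therefore an assumption about one reasonable way to complete the pattern, wrapped in a hypothetical context manager named stdout_redirected_to, and it uses descriptor 1 directly instead of sys.stdout.fileno().

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def stdout_redirected_to(path, stdout_fd=1):
    """Point descriptor `stdout_fd` at `path` inside the block, then restore it."""
    sys.stdout.flush()                       # do not let earlier output leak into the file
    saved_fd = os.dup(stdout_fd)             # keep a handle on the original stdout
    with open(path, "w") as out:
        os.dup2(out.fileno(), stdout_fd)     # descriptor now refers to the file
        try:
            yield out
        finally:
            sys.stdout.flush()               # push buffered Python output to the file
            os.dup2(saved_fd, stdout_fd)     # restore the original stdout
            os.close(saved_fd)               # drop the temporary duplicate
```

Anything written to descriptor 1 inside the with block, whether by print() or by a C++ library, ends up in the file; after the block, output goes to the original stdout again.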

◆ fileInfos()

def python.PoolFile.PoolFile.fileInfos(self)

Definition at line 755 of file PoolFile.py.

755     def fileInfos(self):
756         return os.linesep.join( [
757             "File:" + self._fileInfos['name'],
758             "Size: %12.3f kb" % (self._fileInfos['size'] / Units.kb),
759             "Nbr Events: %i" % self.dataHeader.nEntries
760             ] )
761 
762 

Member Data Documentation

◆ _fileInfos

python.PoolFile.PoolFile._fileInfos
private

Definition at line 531 of file PoolFile.py.

◆ augNames

python.PoolFile.PoolFile.augNames

Definition at line 536 of file PoolFile.py.

◆ data

python.PoolFile.PoolFile.data

Definition at line 538 of file PoolFile.py.

◆ dataHeader

python.PoolFile.PoolFile.dataHeader

PoolRecord describing the DataHeader container; non-T/P separated DataHeaders (from old files) are handled as well.

Definition at line 533 of file PoolFile.py.

◆ dataHeaderA

python.PoolFile.PoolFile.dataHeaderA

Definition at line 537 of file PoolFile.py.

◆ keys

python.PoolFile.PoolFile.keys

Sorted list of ROOT keys, one per distinct container, filled by __processFile(); stays None for shelve input.

Definition at line 532 of file PoolFile.py.

◆ poolFile

python.PoolFile.PoolFile.poolFile

Definition at line 550 of file PoolFile.py.

◆ ROOT

python.PoolFile.PoolFile.ROOT

Definition at line 579 of file PoolFile.py.

◆ verbose

python.PoolFile.PoolFile.verbose

Definition at line 539 of file PoolFile.py.


The documentation for this class was generated from the following file:
PoolFile.py