ATLAS Offline Software
python.PoolFile.PoolFile Class Reference

Public Member Functions

def __init__ (self, fileName, verbose=True)
 
def fileInfos (self)
 
def checkFile (self, sorting=PoolRecord.Sorter.DiskSize)
 
def detailedDump (self, bufferName=sys.stdout.name)
 

Public Attributes

 keys
 first we try to fetch the DataHeader
 
 dataHeader
 try to also handle non-T/P separated DataHeaders (from old files)...
 
 augNames
 
 dataHeaderA
 
 data
 
 verbose
 
 poolFile
 
 ROOT
 

Private Member Functions

def __openPoolFile (self, fileName)
 
def __processFile (self)
 

Private Attributes

 _fileInfos
 

Detailed Description

A simple class to retrieve information about the content of a POOL file.
It should be abstracted from the underlying technology used to create this
POOL file (Db, ROOT, ...).
Right now, we are using the easy and lousy solution: going straight to the
ROOT 'API'.

Definition at line 511 of file PoolFile.py.

Constructor & Destructor Documentation

◆ __init__()

def python.PoolFile.PoolFile.__init__ (   self,
  fileName,
  verbose = True 
)

Definition at line 520 of file PoolFile.py.

    def __init__(self, fileName, verbose=True):
        object.__init__(self)

        self._fileInfos = None
        self.keys = None
        self.dataHeader = PoolRecord("DataHeader", 0, 0, 0,
                                     nEntries = 0,
                                     dirType = "T")
        self.augNames = set()
        self.dataHeaderA = {}
        self.data = []
        self.verbose = verbose

        # get the "final" file name (handles all kinds of protocols)
        try:
            protocol, fileName = file_name(fileName)
        except Exception as err:
            print("## warning: problem opening PoolFileCatalog:\n%s"%err)
            import traceback
            # print_exc() takes no exception argument: it reports the
            # exception currently being handled
            traceback.print_exc()
            pass

        self.poolFile = None
        dbFileName = whichdb( fileName )
        if dbFileName not in ( None, '' ):
            # the input is a shelve (dbm) file holding a cached report
            if self.verbose is True:
                print("## opening file [%s]..." % str(fileName))
            db = shelve.open( fileName, 'r' )
            if self.verbose is True:
                print("## opening file [OK]")
            report = db['report']
            self._fileInfos = report['fileInfos']
            self.dataHeader = report['dataHeader']
            self.data = report['data']
        else:
            # the input is a POOL file: open it with ROOT and scan it
            if self.verbose is True:
                print("## opening file [%s]..." % str(fileName))
            self.__openPoolFile( fileName )
            if self.verbose is True:
                print("## opening file [OK]")
            self.__processFile()

        return

Member Function Documentation

◆ __openPoolFile()

def python.PoolFile.PoolFile.__openPoolFile (   self,
  fileName 
)
private

Definition at line 564 of file PoolFile.py.

    def __openPoolFile(self, fileName):
        # hack to prevent ROOT from loading graphic libraries and hence
        # bothering our fellow Mac users
        if self.verbose is True:
            print("## importing ROOT...")
        import PyUtils.RootUtils as ru
        ROOT = ru.import_root()
        self.ROOT = ROOT
        if self.verbose is True:
            print("## importing ROOT... [DONE]")
        # prevent ROOT from being too verbose
        rootMsg = ShutUp()
        rootMsg.mute()
        ROOT.gErrorIgnoreLevel = ROOT.kFatal

        poolFile = None
        try:
            poolFile = ROOT.TFile.Open( fileName, PoolOpts.READ_MODE )
        except Exception as e:
            rootMsg.unMute()
            print("## Failed to open file [%s] !!" % fileName)
            print("## Reason:")
            print(e)
            print("## Bailing out...")
            raise IOError("Could not open file [%s]" % fileName)

        rootMsg.unMute()

        if poolFile is None:
            print("## Failed to open file [%s] !!" % fileName)
            msg = "Could not open file [%s]" % fileName
            raise IOError(msg)

        self.poolFile = poolFile
        assert self.poolFile.IsOpen() and not self.poolFile.IsZombie(), \
            "Invalid POOL file or a Zombie one"
        self._fileInfos = {
            'name' : self.poolFile.GetName(),
            'size' : self.poolFile.GetSize(),
        }
        return

◆ __processFile()

def python.PoolFile.PoolFile.__processFile (   self)
private

Definition at line 606 of file PoolFile.py.

    def __processFile(self):

        for name in {PoolOpts.TTreeNames.DataHeader, PoolOpts.RNTupleNames.DataHeader}:
            dhKey = self.poolFile.FindKey( name )
            if dhKey:
                obj = self.poolFile.Get( name )
                if isinstance(obj, self.ROOT.TTree):
                    nEntries = obj.GetEntries()
                elif isinstance(obj, self.ROOT.Experimental.RNTuple):
                    nEntries = self.ROOT.Experimental.RNTupleReader.Open(obj).GetNEntries()
                else:
                    raise NotImplementedError(f"Keys of type {type(obj)!r} not supported")
                break
        else:
            nEntries = 0

        keys = []
        containers = []
        for k in self.poolFile.GetListOfKeys():
            keyname = k.GetName()
            obj = self.poolFile.Get( keyname )
            if isinstance(obj, self.ROOT.TTree):
                containerName = obj.GetName()
                nEntries = obj.GetEntries()
                dirType = "T"
            elif isinstance(obj, self.ROOT.Experimental.RNTuple):
                reader = self.ROOT.Experimental.RNTupleReader.Open(obj)
                containerName = reader.GetDescriptor().GetName()
                nEntries = reader.GetNEntries()
                dirType = "N"
            else:
                raise NotImplementedError(f"Keys of type {type(obj)!r} not supported")
            if containerName not in containers:
                keys.append(k)
                containers.append(containerName)
                pass
            if keyname.startswith(PoolOpts.POOL_HEADER) and not keyname.endswith('Form'):
                self.dataHeaderA[PoolOpts.augmentationName(keyname)] = \
                    PoolRecord("DataHeader", 0, 0, 0,
                               nEntries = nEntries,
                               dirType = dirType)

        keys.sort (key = lambda x: x.GetName())
        self.keys = keys
        del containers

        for k in keys:
            obj = self.poolFile.Get( k.GetName() )
            if isinstance(obj, self.ROOT.TTree):
                name = obj.GetName()
            elif isinstance(obj, self.ROOT.Experimental.RNTuple):
                reader = self.ROOT.Experimental.RNTupleReader.Open(obj)
                name = reader.GetDescriptor().GetName()

            if PoolOpts.isDataHeader(name):
                contName = "DataHeader"
                if isinstance(obj, self.ROOT.TTree):
                    memSize = obj.GetTotBytes() / Units.kb
                    diskSize = obj.GetZipBytes() / Units.kb
                    memSizeNoZip = 0.0
                    if diskSize < 0.001:
                        memSizeNoZip = memSize
                    nEntries = obj.GetEntries()

                    # try to also handle non-T/P separated DataHeaders
                    # (from old files)...
                    dhBranchNames = [
                        br.GetName() for br in obj.GetListOfBranches()
                        if br.GetName().count("DataHeader_p") > 0
                    ]
                    if len(dhBranchNames) == 1:
                        dhBranch = obj.GetBranch(dhBranchNames[0])
                        typeName = dhBranch.GetClassName()
                        if not typeName and (leaf := dhBranch.GetListOfLeaves().At(0)):
                            typeName = leaf.GetTypeName()
                        poolRecord = retrieveBranchInfos(
                            dhBranch,
                            PoolRecord( contName, 0., 0., 0.,
                                        nEntries,
                                        dirType = "T",
                                        typeName = typeName ),
                            ident = " "
                        )
                    else:
                        poolRecord = PoolRecord(contName, memSize, diskSize, memSizeNoZip,
                                                nEntries,
                                                dirType = "T")

                    self.dataHeader = poolRecord
                elif isinstance(obj, self.ROOT.Experimental.RNTuple):
                    reader = self.ROOT.Experimental.RNTupleReader.Open(obj)
                    inspector = self.ROOT.Experimental.RNTupleInspector.Create(obj)
                    diskSize = inspector.GetCompressedSize() / Units.kb
                    memSize = inspector.GetUncompressedSize() / Units.kb

                    memSizeNoZip = 0.0
                    if diskSize < 0.001:
                        memSizeNoZip = memSize
                    nEntries = reader.GetNEntries()
                    poolRecord = PoolRecord(contName, memSize, diskSize, memSizeNoZip,
                                            nEntries,
                                            dirType = "N")
                    self.dataHeader = poolRecord
            elif PoolOpts.isData(name):
                if isinstance(obj, self.ROOT.TTree):
                    if not hasattr(obj, 'GetListOfBranches'):
                        continue
                    branches = obj.GetListOfBranches()
                    dirType = "T"
                    if name in (PoolOpts.EVENT_DATA, PoolOpts.META_DATA):
                        dirType = "B"
                    for branch in branches:
                        poolRecord = retrieveBranchInfos(
                            branch,
                            make_pool_record(branch, dirType),
                            ident = " "
                        )
                        poolRecord.augName = PoolOpts.augmentationName(name)
                        self.augNames.add(poolRecord.augName)
                        self.data += [ poolRecord ]
                elif isinstance(obj, self.ROOT.Experimental.RNTuple):
                    reader = self.ROOT.Experimental.RNTupleReader.Open(obj)
                    descriptor = reader.GetDescriptor()
                    inspector = self.ROOT.Experimental.RNTupleInspector.Create(obj)
                    dirType = "N"
                    if name in {PoolOpts.RNTupleNames.EventData, PoolOpts.RNTupleNames.MetaData}:
                        dirType = "F"
                    fieldZeroId = descriptor.GetFieldZeroId()
                    for fieldDescriptor in descriptor.GetFieldIterable(fieldZeroId):
                        fieldId = fieldDescriptor.GetId()
                        fieldTreeInspector = inspector.GetFieldTreeInspector(fieldId)
                        diskSize = fieldTreeInspector.GetCompressedSize() / Units.kb
                        memSize = fieldTreeInspector.GetUncompressedSize() / Units.kb
                        fieldDescriptor = fieldTreeInspector.GetDescriptor()
                        typeName = fieldDescriptor.GetTypeName()
                        fieldName = fieldDescriptor.GetFieldName()
                        poolRecord = PoolRecord(fieldName, memSize, diskSize, memSize,
                                                descriptor.GetNEntries(),
                                                dirType=dirType,
                                                typeName=typeName)
                        poolRecord.augName = PoolOpts.augmentationName(name)
                        self.augNames.add(poolRecord.augName)
                        self.data += [ poolRecord ]
            # loop over keys

        return
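
The second loop of `__processFile` keeps only the first key seen for each container name and then sorts the surviving keys by name. Below is a small standalone sketch of that filtering step. `FakeKey` is a hypothetical stand-in for a ROOT key object, and the idea that duplicates come from several cycles of one tree is an assumption made for the example; the source does not state why duplicates occur.

```python
from collections import namedtuple

# Hypothetical stand-in for a ROOT TKey: only a name and a cycle number.
FakeKey = namedtuple("FakeKey", ["name", "cycle"])


def unique_keys(all_keys):
    """Keep the first key per container name, then sort the result by name."""
    seen = []   # container names already recorded, in order of appearance
    keys = []
    for k in all_keys:
        if k.name not in seen:
            keys.append(k)
            seen.append(k.name)
    keys.sort(key=lambda k: k.name)
    return keys


listing = [
    FakeKey("POOLContainer", 2),
    FakeKey("CollectionTree", 5),
    FakeKey("POOLContainer", 1),   # same container again: dropped
]
kept = unique_keys(listing)
```

With this input, `kept` holds two keys, `CollectionTree` followed by `POOLContainer`, and the `POOLContainer` entry is the one that appeared first in the listing.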

◆ checkFile()

def python.PoolFile.PoolFile.checkFile (   self,
  sorting = PoolRecord.Sorter.DiskSize 
)

Definition at line 760 of file PoolFile.py.

    def checkFile(self, sorting = PoolRecord.Sorter.DiskSize):
        if self.verbose is True:
            print(self.fileInfos())
            if len(self.augNames) > 1:
                for aug in self.augNames:
                    if len(aug) > 0:
                        print( "Nbr %s Events: %i" % (aug, self.dataHeaderA[aug].nEntries) )

        data = self.data
        if sorting in PoolRecord.Sorter.allowedValues():
            import operator
            data.sort(key = operator.attrgetter(sorting) )

        def _get_val(x, dflt=-999.):
            # in FAST_MODE some sizes are not computed: report a default
            if PoolOpts.FAST_MODE:
                return dflt
            return x

        totMemSize = _get_val(self.dataHeader.memSize, dflt=0.)
        totDiskSize = self.dataHeader.diskSize

        def _safe_div(num,den):
            # avoid ZeroDivisionError for files without entries
            if float(den) == 0.:
                return 0.
            return num/den

        if self.verbose is True:
            print("")
            print("="*80)
            print(PoolOpts.HDR_FORMAT % ( "Mem Size", "Disk Size","Size/Evt",
                                          "MissZip/Mem","items",
                                          "(X) Container Name (X=Tree|Branch)" ))
            print("="*80)

            print(PoolOpts.ROW_FORMAT % (
                _get_val (self.dataHeader.memSize),
                self.dataHeader.diskSize,
                _safe_div(self.dataHeader.diskSize,float(self.dataHeader.nEntries)),
                _get_val (_safe_div(self.dataHeader.memSizeNoZip,
                                    self.dataHeader.memSize)),
                self.dataHeader.nEntries,
                "("+self.dataHeader.dirType+") "+self.dataHeader.name
            ))
            print("-"*80)

        totMemSizeA = {}
        totDiskSizeA = {}
        for d in data:
            totMemSize += 0. if PoolOpts.FAST_MODE else d.memSize
            totDiskSize += d.diskSize
            memSizeNoZip = d.memSizeNoZip/d.memSize if d.memSize != 0. else 0.
            aug = d.augName
            totMemSizeA[aug] = totMemSizeA.get(aug,0.) + d.memSize
            totDiskSizeA[aug] = totDiskSizeA.get(aug,0.) + d.diskSize
            if self.verbose is True:
                print(PoolOpts.ROW_FORMAT % (
                    _get_val (d.memSize),
                    d.diskSize,
                    _safe_div(d.diskSize, float(self.dataHeader.nEntries)),
                    _get_val (memSizeNoZip),
                    d.nEntries,
                    "("+d.dirType+") "+d.name
                ))

        if self.verbose is True:
            print("="*80)
            if len(self.augNames) > 1:
                augs = sorted(self.augNames)
                for a in augs:
                    print(PoolOpts.ROW_FORMAT % (
                        totMemSizeA[a], totDiskSizeA[a],
                        _safe_div(totDiskSizeA[a], float(self.dataHeaderA[a].nEntries)),
                        0.0,
                        self.dataHeaderA[a].nEntries,
                        "Aug Stream: " + ('MAIN' if a=='' else a)
                    ))
                print("-"*80)
            print(PoolOpts.ROW_FORMAT % (
                totMemSize, totDiskSize,
                _safe_div(totDiskSize, float(self.dataHeader.nEntries)),
                0.0, self.dataHeader.nEntries,
                "TOTAL (POOL containers)"
            ))
            print("="*80)
            if PoolOpts.FAST_MODE:
                print("::: warning: FAST_MODE was enabled: some columns' content "
                      "is meaningless...")
        return
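
`checkFile` accumulates memory and disk sizes per augmentation stream in two dictionaries and guards every per-event division against a zero entry count. The sketch below isolates that bookkeeping. The `Record` tuple and the sample numbers are invented for illustration; the real `PoolRecord` class carries more fields.

```python
from collections import namedtuple

# Hypothetical, reduced stand-in for PoolRecord.
Record = namedtuple("Record", ["name", "memSize", "diskSize", "augName"])


def safe_div(num, den):
    """Return num/den, or 0.0 when the denominator is zero."""
    if float(den) == 0.:
        return 0.
    return num / den


def totals_per_stream(records):
    """Sum memSize and diskSize separately for each augmentation name."""
    tot_mem, tot_disk = {}, {}
    for r in records:
        tot_mem[r.augName] = tot_mem.get(r.augName, 0.) + r.memSize
        tot_disk[r.augName] = tot_disk.get(r.augName, 0.) + r.diskSize
    return tot_mem, tot_disk


records = [
    Record("A", 10., 4., ""),       # '' is the main stream ('MAIN' in the report)
    Record("B", 6., 2., ""),
    Record("C", 8., 3., "AUG1"),
]
mem, disk = totals_per_stream(records)
size_per_event = safe_div(disk[""], 3)   # main stream, 3 events
```

An empty file (zero entries) then yields a per-event size of 0 instead of raising `ZeroDivisionError`.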

◆ detailedDump()

def python.PoolFile.PoolFile.detailedDump (   self,
  bufferName = sys.stdout.name 
)

Definition at line 850 of file PoolFile.py.

    def detailedDump(self, bufferName = sys.stdout.name ):
        if self.poolFile is None or \
           self.keys is None:
            print("Can't perform a detailedDump with a shelve file as input !")
            return

        if bufferName == sys.stdout.name:
            bufferName = "/dev/stdout"
        out = open( bufferName, "w" )
        sys.stdout.flush()
        save_stdout_fileno = os.dup (sys.stdout.fileno())
        os.dup2( out.fileno(), sys.stdout.fileno() )

        out.write( "#" * 80 + os.linesep )
        out.write( "## detailed dump" + os.linesep )
        out.flush()

        for key in self.keys:
            tree = key.ReadObj()
            name = tree.GetName()

            if PoolOpts.isDataHeader(name) or \
               PoolOpts.isData(name):
                try:
                    print ("=== [%s] ===" % name, file=sys.stderr)
                    tree.Print()
                except Exception as err:
                    print ("Caught:",err, file=sys.stderr)
                    print (sys.exc_info()[0], file=sys.stderr)
                    print (sys.exc_info()[1], file=sys.stderr)
                    pass
                pass
            pass
        out.write( "#" * 80 + os.linesep )
        out.flush()
        out.write( "#" * 80 + os.linesep )

◆ fileInfos()

def python.PoolFile.PoolFile.fileInfos (   self)

Definition at line 752 of file PoolFile.py.

    def fileInfos(self):
        return os.linesep.join( [
            "File:" + self._fileInfos['name'],
            "Size: %12.3f kb" % (self._fileInfos['size'] / Units.kb),
            "Nbr Events: %i" % self.dataHeader.nEntries
        ] )

Member Data Documentation

◆ _fileInfos

python.PoolFile.PoolFile._fileInfos
private

Definition at line 523 of file PoolFile.py.

◆ augNames

python.PoolFile.PoolFile.augNames

Definition at line 528 of file PoolFile.py.

◆ data

python.PoolFile.PoolFile.data

Definition at line 530 of file PoolFile.py.

◆ dataHeader

python.PoolFile.PoolFile.dataHeader

try to also handle non-T/P separated DataHeaders (from old files)...

Definition at line 525 of file PoolFile.py.

◆ dataHeaderA

python.PoolFile.PoolFile.dataHeaderA

Definition at line 529 of file PoolFile.py.

◆ keys

python.PoolFile.PoolFile.keys

first we try to fetch the DataHeader

Definition at line 524 of file PoolFile.py.

◆ poolFile

python.PoolFile.PoolFile.poolFile

Definition at line 542 of file PoolFile.py.

◆ ROOT

python.PoolFile.PoolFile.ROOT

Definition at line 571 of file PoolFile.py.

◆ verbose

python.PoolFile.PoolFile.verbose

Definition at line 531 of file PoolFile.py.


The documentation for this class was generated from the following file:
PoolFile.py

Helper functions referenced in the listings above:
def file_name(fname): defined at line 313 of file PoolFile.py
def retrieveBranchInfos(branch, poolRecord, ident=""): defined at line 412 of file PoolFile.py
def make_pool_record(branch, dirType): defined at line 433 of file PoolFile.py