ATLAS Offline Software
AthenaMT service to collect trigger cost data from all threads and summarise it at the end of the event.
#include <TrigCostSvc.h>
| Return type | Member function | Description |
| | TrigCostSvc (const std::string &name, ISvcLocator *pSvcLocator) | Standard ATLAS Service constructor. |
| virtual | ~TrigCostSvc () | Destructor. |
| virtual StatusCode | initialize () override | Initialise, create enough storage to store m_eventSlots. |
| virtual StatusCode | finalize () override | Finalize, act on m_saveHashes. |
| virtual StatusCode | startEvent (const EventContext &context, const bool enableMonitoring=true) override | Implementation of ITrigCostSvc::startEvent. |
| virtual StatusCode | processAlg (const EventContext &context, const std::string &caller, const AuditType type) override | Implementation of ITrigCostSvc::processAlg. |
| virtual StatusCode | endEvent (const EventContext &context, SG::WriteHandle< xAOD::TrigCompositeContainer > &costOutputHandle, SG::WriteHandle< xAOD::TrigCompositeContainer > &rosOutputHandle) override | Implementation of ITrigCostSvc::endEvent. |
| virtual bool | isMonitoredEvent (const EventContext &context, const bool includeMultiSlot=true) const override | |
| virtual StatusCode | monitorROS (const EventContext &context, robmonitor::ROBDataMonitorStruct payload) override | Implementation of ITrigCostSvc::monitorROS. |
| virtual StatusCode | generateTimeoutReport (const EventContext &context, std::string &report) override | |
| virtual StatusCode | discardEvent (const EventContext &context) override | Discard a cost monitored event. |

| Type | Data member | Description |
| size_t | m_eventSlots | Number of concurrent processing slots. |
| std::unique_ptr< std::atomic< bool >[] > | m_eventMonitored | Used to cache if the event in a given slot is being monitored. |
| std::unique_ptr< std::shared_mutex[] > | m_slotMutex | Used to control and protect whole-table operations. |
| std::mutex | m_globalMutex | Used to protect all-slot modifications. |
| TrigCostDataStore< AlgorithmPayload > | m_algStartInfo | Thread-safe store of algorithm start payload. |
| TrigCostDataStore< TrigTimeStamp > | m_algStopTime | Thread-safe store of algorithm stop times. |
| TrigCostDataStore< std::vector< robmonitor::ROBDataMonitorStruct > > | m_rosData | Thread-safe store of ROS data. |
| tbb::concurrent_hash_map< std::thread::id, AlgorithmIdentifier, ThreadHashCompare > | m_threadToAlgMap | Keeps track of what is running right now in each thread. |
| std::unordered_map< uint32_t, uint32_t > | m_threadToCounterMap | Map thread's hash ID to a counting numeral. |
| size_t | m_threadCounter | Count how many unique thread IDs we have seen. |
| Gaudi::Property< bool > | m_monitorAllEvents {this, "MonitorAllEvents", false, "Monitor every HLT event, e.g. for offline validation."} | |
| Gaudi::Property< bool > | m_enableMultiSlot {this, "EnableMultiSlot", false, "Monitored events in the MasterSlot collect data from events running in other slots."} | |
| Gaudi::Property< bool > | m_saveHashes {this, "SaveHashes", false, "Store a copy of the hash dictionary for easier debugging"} | |
| Gaudi::Property< size_t > | m_masterSlot {this, "MasterSlot", 0, "The slot responsible for saving MultiSlot data"} | |
| Gaudi::Property< std::string > | m_costSupervisorAlgName {this, "CostSupervisorAlgName", "TrigCostSupervisorAlg", "The name of cost monitoring supervising algorithm, starting at the begining of the event"} | |
| Gaudi::Property< std::string > | m_costFinalizeAlgName {this, "CostFinalizeAlgName", "TrigCostFinalizeAlg", "The name of cost monitoring finalize algorithm, starting at the end of the event"} | |
AthenaMT service to collect trigger cost data from all threads and summarise it at the end of the event.
The main hooks into this service are:
- HLTSeeding: to clear the internal storage and flag the event for monitoring.
- TrigCostAuditor: to inform the service when algorithms start and stop executing.
- HLTROBDataProviderSvc: to inform the service about requests for data ROBs.
- HLTSummaryAlg: to inform the service when the HLT has finished, and to receive the persistent payload.
Definition at line 36 of file TrigCostSvc.h.
◆ TrigCostSvc()
TrigCostSvc::TrigCostSvc (const std::string & name, ISvcLocator * pSvcLocator)
Standard ATLAS Service constructor.
- Parameters
-
[in] | name | The service's name |
[in] | pSvcLocator | A pointer to a service location service |
Definition at line 14 of file TrigCostSvc.cxx.
15 base_class(name, pSvcLocator),
◆ ~TrigCostSvc()
TrigCostSvc::~TrigCostSvc ()
virtual
Destructor.
Currently nothing to delete.
Definition at line 31 of file TrigCostSvc.cxx.
◆ checkSlot()
StatusCode TrigCostSvc::checkSlot (const EventContext & context) const
private
Sanity check that the job is respecting the number of slots which were declared at config time.
- Parameters
-
[in] | context | The event context |
- Returns
- Success if the access to the m_eventMonitored array is in range, Failure if the access request would overflow
Definition at line 506 of file TrigCostSvc.cxx.
509 return StatusCode::FAILURE;
511 return StatusCode::SUCCESS;
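The range check described above can be sketched in isolation. This is a minimal illustration, not the ATLAS implementation: the `Status` enum stands in for `StatusCode`, and the slot index and slot count are passed as plain integers rather than read from an `EventContext` and `m_eventSlots`.

```cpp
#include <cstddef>

// Hypothetical stand-in for StatusCode.
enum class Status { Success, Failure };

// A slot index is only usable if it addresses an element of the
// per-slot arrays, which hold exactly nSlots entries.
Status checkSlot(std::size_t slot, std::size_t nSlots) {
  return slot < nSlots ? Status::Success : Status::Failure;
}
```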
◆ discardEvent()
StatusCode TrigCostSvc::discardEvent (const EventContext & context)
override, virtual
Discard a cost monitored event.
- Parameters
-
[in] | context | The event context |
Definition at line 484 of file TrigCostSvc.cxx.
487 ATH_MSG_DEBUG("All events are monitored - event will not be discarded");
488 return StatusCode::SUCCESS;
494 std::unique_lock lockUnique( m_slotMutex[ context.slot() ] );
501 return StatusCode::SUCCESS;
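The visible lines suggest the following shape: return early when every event is monitored, otherwise take the slot's mutex exclusively and reset the slot. The sketch below illustrates that pattern with standard-library types only. The class `SlotMonitorFlags` and its methods are hypothetical, and the assumption that discarding amounts to clearing the per-slot monitored flag is not confirmed by the listing.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the per-slot state held by the service.
class SlotMonitorFlags {
public:
  explicit SlotMonitorFlags(std::size_t nSlots, bool monitorAllEvents = false)
    : m_monitorAll(monitorAllEvents),
      m_monitored(std::make_unique<std::atomic<bool>[]>(nSlots)),
      m_slotMutex(std::make_unique<std::shared_mutex[]>(nSlots)) {
    for (std::size_t i = 0; i < nSlots; ++i) m_monitored[i] = monitorAllEvents;
  }

  void flag(std::size_t slot) { m_monitored[slot] = true; }
  bool isMonitored(std::size_t slot) const { return m_monitored[slot]; }

  // Returns true if the slot's flag was reset, false if nothing was discarded.
  bool discard(std::size_t slot) {
    if (m_monitorAll) return false;  // all events are monitored: keep the event
    std::unique_lock<std::shared_mutex> lock(m_slotMutex[slot]);  // exclusive access to the slot
    m_monitored[slot] = false;
    return true;
  }

private:
  bool m_monitorAll;
  std::unique_ptr<std::atomic<bool>[]> m_monitored;
  std::unique_ptr<std::shared_mutex[]> m_slotMutex;
};
```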
◆ endEvent()
Implementation of ITrigCostSvc::endEvent.
- Parameters
-
[in] | context | The event context |
[out] | costOutputHandle | Write handle to fill with execution summary if the event was monitored |
[out] | rosOutputHandle | Write handle to fill with ROS requests summary if the event was monitored |
Definition at line 206 of file TrigCostSvc.cxx.
211 return StatusCode::SUCCESS;
226 std::unique_lock lockUnique( m_slotMutex[ context.slot() ] );
235 tbb::concurrent_hash_map<AlgorithmIdentifier, TrigTimeStamp, AlgorithmIdentifierHashCompare>::const_accessor stopTimeAcessor;
239 eventStopTime = stopTimeAcessor->second.microsecondsSinceEpoch();
248 tbb::concurrent_hash_map<AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_accessor startAcessor;
252 eventStartTime = startAcessor->second.m_algStartTime.microsecondsSinceEpoch();
257 tbb::concurrent_hash_map< AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_iterator beginIt;
258 tbb::concurrent_hash_map< AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_iterator endIt;
259 tbb::concurrent_hash_map< AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_iterator it;
264 std::map<size_t, size_t> aiToHandleIndex;
265 for (it = beginIt; it != endIt; ++it) {
273 tbb::concurrent_hash_map<AlgorithmIdentifier, TrigTimeStamp, AlgorithmIdentifierHashCompare>::const_accessor stopTimeAcessor;
277 stopTime = stopTimeAcessor->second.microsecondsSinceEpoch();
300 if (stopTime > eventStopTime) {
302 << " truncating its ending time stamp from " << stopTime << " to " << eventStopTime);
303 stopTime = eventStopTime;
309 if (stopTime < eventStartTime) {
315 << " truncating its starting time stamp from " << startTime << " to " << eventStartTime);
324 const uint32_t threadID = static_cast<uint32_t>( std::hash< std::thread::id >()(ap.m_algThreadID) );
329 const std::unordered_map<uint32_t, uint32_t>::const_iterator mapIt = m_threadToCounterMap.find(threadID);
334 threadEnumerator = mapIt->second;
350 aiToHandleIndex[ai.m_hash] = costOutputHandle->size() - 1;
353 typedef tbb::concurrent_hash_map< AlgorithmIdentifier, std::vector<robmonitor::ROBDataMonitorStruct>, AlgorithmIdentifierHashCompare>::const_iterator ROBConstIt;
359 for (ROBConstIt it = beginRob; it != endRob; ++it) {
360 size_t aiHash = it->first.m_hash;
362 if (aiToHandleIndex.count(aiHash) == 0) {
372 std::vector<uint32_t> robs_id;
373 std::vector<uint32_t> robs_size;
374 std::vector<unsigned> robs_history;
375 std::vector<unsigned short> robs_status;
383 robs_id.push_back(rob.second.rob_id);
384 robs_size.push_back(rob.second.rob_size);
385 robs_history.push_back(rob.second.rob_history);
386 robs_status.push_back(rob.second.isStatusOk());
393 result &= tc->setDetail<std::vector<uint32_t>>("robs_size", robs_size);
394 result &= tc->setDetail<std::vector<unsigned>>("robs_history", robs_history);
395 result &= tc->setDetail<std::vector<unsigned short>>("robs_status", robs_status);
418 return StatusCode::SUCCESS;
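The excerpt shows algorithm time stamps being truncated to the event's own start and stop times. The sketch below illustrates that clipping on plain microsecond values. The `Interval` struct and `clipToEvent` function are hypothetical names, and dropping intervals that lie entirely outside the event window is an assumption made here, since the corresponding lines are not visible in the listing.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical pair of time stamps in microseconds since epoch.
struct Interval {
  uint64_t start;
  uint64_t stop;
};

// Clip an algorithm's execution interval to the monitored event's window.
// Returns nothing if the algorithm ran entirely outside the window (assumed policy).
std::optional<Interval> clipToEvent(Interval alg, Interval event) {
  if (alg.stop < event.start || alg.start > event.stop) return std::nullopt;
  if (alg.start < event.start) alg.start = event.start;  // started before the event began
  if (alg.stop > event.stop) alg.stop = event.stop;      // stopped after the event ended
  return alg;
}
```

Clipping of this kind matters when a monitored event also collects data from algorithms running in other slots, whose execution need not fall inside the monitored event's time window.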
◆ finalize()
StatusCode TrigCostSvc::finalize ()
override, virtual
Finalize, act on m_saveHashes.
Definition at line 65 of file TrigCostSvc.cxx.
69 ATH_MSG_INFO("Calling hashes2file, saving dump of job's HLT hashing dictionary to disk.");
71 return StatusCode::SUCCESS;
◆ generateTimeoutReport()
StatusCode TrigCostSvc::generateTimeoutReport (const EventContext & context, std::string & report)
override, virtual
Generate a timeout report listing the most time-consuming algorithms.
- Parameters
-
[in] | context | The event context |
[out] | report | Created report with algorithms and times (in ms) |
Definition at line 423 of file TrigCostSvc.cxx.
429 return StatusCode::SUCCESS;
432 std::unique_lock lockUnique(m_slotMutex[context.slot()]);
434 tbb::concurrent_hash_map< AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_iterator beginIt;
435 tbb::concurrent_hash_map< AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_iterator endIt;
436 tbb::concurrent_hash_map< AlgorithmIdentifier, AlgorithmPayload, AlgorithmIdentifierHashCompare>::const_iterator it;
440 std::map<uint64_t, std::string, std::greater<uint64_t>> timeToAlgMap;
442 for (it = beginIt; it != endIt; ++it) {
447 if (ai.m_realSlot != context.slot()) continue;
452 tbb::concurrent_hash_map<AlgorithmIdentifier, TrigTimeStamp, AlgorithmIdentifierHashCompare>::const_accessor stopTimeAcessor;
456 stopTime = stopTimeAcessor->second.microsecondsSinceEpoch();
461 if (stopTime == 0) continue;
467 report = "Timeout detected with the following algorithms consuming the most time: ";
469 for(const std::pair<const uint64_t, std::string>& p : timeToAlgMap){
473 if (algCounter >= 5){
479 return StatusCode::SUCCESS;
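The listing sorts algorithms by elapsed time using a `std::map` keyed on the duration with `std::greater`, then reports at most five entries. A standalone sketch of that idea follows. The function name `makeTimeoutReport`, its input type, and the `name (N ms)` entry format are assumptions for illustration; only the report prefix, the map type, and the limit of five come from the excerpt.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Build a report of the slowest algorithms from (name, duration in microseconds) pairs.
std::string makeTimeoutReport(const std::vector<std::pair<std::string, uint64_t>>& algTimesUs,
                              std::size_t maxAlgs = 5) {
  // Keys are ordered from largest to smallest duration.
  // Note: two algorithms with an identical duration share a key, so the later one overwrites.
  std::map<uint64_t, std::string, std::greater<uint64_t>> timeToAlgMap;
  for (const auto& alg : algTimesUs) timeToAlgMap[alg.second] = alg.first;

  std::string report = "Timeout detected with the following algorithms consuming the most time: ";
  std::size_t algCounter = 0;
  for (const std::pair<const uint64_t, std::string>& p : timeToAlgMap) {
    if (algCounter >= maxAlgs) break;
    if (algCounter > 0) report += ", ";
    report += p.second + " (" + std::to_string(p.first / 1000) + " ms)";  // microseconds to ms
    ++algCounter;
  }
  return report;
}
```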
◆ getROIID()
int32_t TrigCostSvc::getROIID (const EventContext & context)
private
Internal function to return the RoI ID from an extended event context.
- Parameters
-
[in] | context | The event context |
- Returns
- RoIId from the ATLAS extended event context, or AlgorithmIdentifier::s_noView = -1 if there is no RoI identifier
Definition at line 516 of file TrigCostSvc.cxx.
519 if (roi) return static_cast<int32_t>(roi->roiId());
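The fallback behaviour can be shown with a small stand-in. In this sketch `RoiDescriptor` is a hypothetical replacement for the real RoI descriptor class, and the pointer is passed in directly instead of being looked up through the extended event context.

```cpp
#include <cstdint>

// Hypothetical minimal RoI descriptor exposing only the ID.
struct RoiDescriptor {
  unsigned int id;
  unsigned int roiId() const { return id; }
};

// Sentinel for "not running in a view", mirroring AlgorithmIdentifier::s_noView = -1.
constexpr int16_t s_noView = -1;

// Return the RoI ID if a descriptor is available, otherwise the sentinel.
int32_t getRoiId(const RoiDescriptor* roi) {
  if (roi) return static_cast<int32_t>(roi->roiId());
  return s_noView;
}
```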
◆ initialize()
StatusCode TrigCostSvc::initialize ()
override, virtual
Initialise, create enough storage to store m_eventSlots.
Definition at line 39 of file TrigCostSvc.cxx.
44 ATH_MSG_WARNING("numConcurrentEvents() == 0. This is a misconfiguration, probably coming from running from pickle. "
45 "Setting local m_eventSlots to a 'large' number until this is fixed to allow the job to proceed.");
60 return StatusCode::SUCCESS;
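The warning text implies that a slot count of zero is replaced by a large default before the per-slot storage is allocated. The sketch below illustrates that sequence. `SlotStorage`, `makeSlotStorage`, and the fallback value of 100 are hypothetical; the listing does not show which number the service actually uses.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>

// Hypothetical bundle of per-slot storage.
struct SlotStorage {
  std::size_t nSlots;
  std::unique_ptr<std::atomic<bool>[]> monitored;
};

// Allocate one "is monitored" flag per event slot.
// fallbackSlots is an assumed placeholder for the 'large' number mentioned in the warning.
SlotStorage makeSlotStorage(std::size_t numConcurrentEvents, std::size_t fallbackSlots = 100) {
  const std::size_t n = (numConcurrentEvents == 0) ? fallbackSlots : numConcurrentEvents;
  SlotStorage storage{n, std::make_unique<std::atomic<bool>[]>(n)};
  for (std::size_t i = 0; i < n; ++i) storage.monitored[i].store(false);  // nothing monitored yet
  return storage;
}
```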
◆ isMonitoredEvent()
bool TrigCostSvc::isMonitoredEvent (const EventContext & context, const bool includeMultiSlot = true) const
override, virtual
- Returns
- If the current context is flagged as being monitored.
- Parameters
-
[in] | context | The event context |
Definition at line 526 of file TrigCostSvc.cxx.
◆ monitor()
Internal call to save monitoring data for a given AlgorithmIdentifier.
- Parameters
-
[in] | context | The event context |
[in] | ai | The AlgorithmIdentifier key to store |
[in] | now | The timestamp to store (among other values) |
[in] | type | The type of the audit event to store |
- Returns
- Success if the data are saved
Definition at line 140 of file TrigCostSvc.cxx.
142 if (type == AuditType::Before) {
146 std::this_thread::get_id(),
148 static_cast<uint32_t>(context.slot())
159 } else if (type == AuditType::After) {
165 ATH_MSG_ERROR("Only expecting AuditType::Before or AuditType::After");
166 return StatusCode::FAILURE;
170 return StatusCode::SUCCESS;
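The branch structure above can be sketched as follows: a Before audit stores a start payload (time stamp, thread ID, slot), an After audit stores the stop time, and anything else is an error. All types here are simplified stand-ins. `CostTables` uses ordinary `std::map` keyed on the algorithm name instead of the thread-safe `TrigCostDataStore`, and the `Other` enumerator is a hypothetical value added only to exercise the error branch.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <thread>

enum class AuditType { Before, After, Other };  // Other is hypothetical
enum class Status { Success, Failure };         // stand-in for StatusCode

// Simplified start payload: when, on which thread, and in which slot the algorithm began.
struct StartPayload {
  uint64_t startTime;
  std::thread::id threadID;
  uint32_t slot;
};

// Simplified, non-thread-safe replacement for the service's data stores.
struct CostTables {
  std::map<std::string, StartPayload> algStartInfo;
  std::map<std::string, uint64_t> algStopTime;
};

Status monitor(CostTables& tables, const std::string& alg, uint64_t now, uint32_t slot, AuditType type) {
  if (type == AuditType::Before) {
    tables.algStartInfo[alg] = StartPayload{now, std::this_thread::get_id(), slot};
  } else if (type == AuditType::After) {
    tables.algStopTime[alg] = now;
  } else {
    return Status::Failure;  // only Before and After are expected
  }
  return Status::Success;
}
```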
◆ monitorROS()
Implementation of ITrigCostSvc::monitorROS.
- Parameters
-
[in] | context | The event context |
[in] | payload | ROB data to be associated with ROS |
Definition at line 176 of file TrigCostSvc.cxx.
183 tbb::concurrent_hash_map<std::thread::id, AlgorithmIdentifier, ThreadHashCompare>::const_accessor acc;
186 ATH_MSG_WARNING("Cannot find algorithm on this thread (id=" << std::this_thread::get_id() << "). Request "<< payload << " won't be monitored");
187 return StatusCode::SUCCESS;
190 theAlg = acc->second;
196 std::shared_lock lockShared( m_slotMutex[ context.slot() ] );
200 return StatusCode::SUCCESS;
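The excerpt attributes a ROS request to whichever algorithm is currently registered for the calling thread, and skips the request if no algorithm is found. A minimal sketch of that lookup is given below. `RosMonitor` is a hypothetical type, `std::map` replaces the TBB concurrent hash map, an `int` stands in for `robmonitor::ROBDataMonitorStruct`, and no locking is shown.

```cpp
#include <map>
#include <string>
#include <thread>
#include <vector>

// Hypothetical, single-threaded stand-in for the thread-to-algorithm bookkeeping.
struct RosMonitor {
  std::map<std::thread::id, std::string> threadToAlg;  // what is running on each thread
  std::map<std::string, std::vector<int>> rosData;     // requests recorded per algorithm

  // Returns true if the request was attributed to an algorithm, false if it was skipped.
  bool monitorROS(std::thread::id tid, int payload) {
    const auto it = threadToAlg.find(tid);
    if (it == threadToAlg.end()) return false;  // no algorithm on this thread: not monitored
    rosData[it->second].push_back(payload);
    return true;
  }
};
```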
◆ processAlg()
StatusCode TrigCostSvc::processAlg (const EventContext & context, const std::string & caller, const AuditType type)
override, virtual
Implementation of ITrigCostSvc::processAlg.
- Parameters
-
[in] | context | The event context |
[in] | caller | Name of the algorithm to audit CPU usage for |
[in] | type | If we are Before or After the algorithm's execution |
Definition at line 105 of file TrigCostSvc.cxx.
113 std::shared_lock lockShared( m_slotMutex[ context.slot() ] );
121 << (type == AuditType::Before ? "BEGAN" : "ENDED") << " at " << now.microsecondsSinceEpoch());
135 return StatusCode::SUCCESS;
◆ startEvent()
StatusCode TrigCostSvc::startEvent (const EventContext & context, const bool enableMonitoring = true)
override, virtual
Implementation of ITrigCostSvc::startEvent.
- Parameters
-
[in] | context | The event context |
[in] | enableMonitoring | Sets if the event should be monitored or not. Not monitoring will save CPU |
- Returns
- Success unless monitoring is enabled and the service's data stores can not be cleared for some reason
Definition at line 76 of file TrigCostSvc.cxx.
84 std::unique_lock lockUnique( m_slotMutex[ context.slot() ] );
100 return StatusCode::SUCCESS;
◆ m_algStartInfo
Thread-safe store of algorithm start payload.
Definition at line 151 of file TrigCostSvc.h.
◆ m_algStopTime
Thread-safe store of algorithm stop times.
Definition at line 152 of file TrigCostSvc.h.
◆ m_costFinalizeAlgName
◆ m_costSupervisorAlgName
◆ m_enableMultiSlot
Gaudi::Property<bool> TrigCostSvc::m_enableMultiSlot {this, "EnableMultiSlot", false, "Monitored events in the MasterSlot collect data from events running in other slots."}
private
◆ m_eventMonitored
std::unique_ptr< std::atomic<bool>[] > TrigCostSvc::m_eventMonitored
private
Used to cache if the event in a given slot is being monitored.
Definition at line 148 of file TrigCostSvc.h.
◆ m_eventSlots
size_t TrigCostSvc::m_eventSlots
private
◆ m_globalMutex
std::mutex TrigCostSvc::m_globalMutex
private
Used to protect all-slot modifications.
Definition at line 150 of file TrigCostSvc.h.
◆ m_masterSlot
Gaudi::Property<size_t> TrigCostSvc::m_masterSlot {this, "MasterSlot", 0, "The slot responsible for saving MultiSlot data"}
private
◆ m_monitorAllEvents
Gaudi::Property<bool> TrigCostSvc::m_monitorAllEvents {this, "MonitorAllEvents", false, "Monitor every HLT event, e.g. for offline validation."}
private
◆ m_rosData
◆ m_saveHashes
Gaudi::Property<bool> TrigCostSvc::m_saveHashes {this, "SaveHashes", false, "Store a copy of the hash dictionary for easier debugging"}
private
◆ m_slotMutex
std::unique_ptr< std::shared_mutex[] > TrigCostSvc::m_slotMutex
private
Used to control and protect whole-table operations.
Definition at line 149 of file TrigCostSvc.h.
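The member functions on this page take `m_slotMutex` in shared mode when recording individual entries (processAlg, monitorROS) and in exclusive mode for whole-table work (startEvent, endEvent, discardEvent, generateTimeoutReport). The sketch below shows that reader/writer split for a single slot. `SlotStore` is hypothetical, and the inner `std::mutex` is only a stand-in for the per-entry thread safety that the real data stores provide themselves.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical single-slot store illustrating shared vs exclusive locking.
class SlotStore {
public:
  // Per-entry write: any number of threads may hold the shared table lock at once.
  void record(const std::string& alg, uint64_t timestamp) {
    std::shared_lock<std::shared_mutex> tableLock(m_tableMutex);
    std::lock_guard<std::mutex> entryLock(m_entryMutex);  // stand-in for the store's own locking
    m_data[alg] = timestamp;
  }

  // Whole-table operation: waits until no recorder holds the shared lock.
  // Returns the number of entries removed.
  std::size_t clear() {
    std::unique_lock<std::shared_mutex> tableLock(m_tableMutex);
    const std::size_t n = m_data.size();
    m_data.clear();
    return n;
  }

private:
  std::shared_mutex m_tableMutex;
  std::mutex m_entryMutex;
  std::map<std::string, uint64_t> m_data;
};
```

With this split, recording threads do not block one another on the table lock, while a clear or read-out of the whole table never observes a half-written slot.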
◆ m_threadCounter
size_t TrigCostSvc::m_threadCounter
private
Count how many unique thread IDs we have seen.
Definition at line 158 of file TrigCostSvc.h.
◆ m_threadToAlgMap
Keeps track of what is running right now in each thread.
Definition at line 155 of file TrigCostSvc.h.
◆ m_threadToCounterMap
std::unordered_map<uint32_t, uint32_t> TrigCostSvc::m_threadToCounterMap
private
Map thread's hash ID to a counting numeral.
Definition at line 157 of file TrigCostSvc.h.
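Together, `m_threadToCounterMap` and `m_threadCounter` turn opaque thread hashes into small consecutive numbers. The following sketch shows one way such an enumeration can work. The class `ThreadEnumerator` is hypothetical and is not thread-safe by itself; it is assumed here that the service guards the equivalent update with a mutex such as `m_globalMutex`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <unordered_map>

// Hypothetical helper mapping thread hashes to consecutive counters (0, 1, 2, ...).
class ThreadEnumerator {
public:
  // Hash the thread ID and truncate it to 32 bits, as in the endEvent excerpt.
  uint32_t enumerate(std::thread::id id) {
    return enumerateHash(static_cast<uint32_t>(std::hash<std::thread::id>()(id)));
  }

  // Return the counter for a known hash, or assign the next free counter to a new one.
  uint32_t enumerateHash(uint32_t hash) {
    const auto it = m_threadToCounterMap.find(hash);
    if (it != m_threadToCounterMap.end()) return it->second;
    const uint32_t counter = static_cast<uint32_t>(m_threadCounter++);
    m_threadToCounterMap.emplace(hash, counter);
    return counter;
  }

  std::size_t uniqueThreads() const { return m_threadCounter; }

private:
  std::unordered_map<uint32_t, uint32_t> m_threadToCounterMap;
  std::size_t m_threadCounter = 0;
};
```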
The documentation for this class was generated from the following files:
- TrigCostSvc.h
- TrigCostSvc.cxx