ATLAS Offline Software
AthCUDA::KernelRunnerSvc Class Reference

Service used for executing AthCUDA::IKernelTask tasks.

#include <KernelRunnerSvc.h>

Public Member Functions

void setTaskFinished ()
 
Interface inherited from IService
virtual StatusCode initialize () override
 Initialise the service.

virtual StatusCode finalize () override
 Finalise the service.

Interface inherited from AthCUDA::IKernelRunnerSvc
virtual StatusCode execute (std::unique_ptr< IKernelTask > task) override
 Execute a user specified kernel task.
 

Private Attributes

std::atomic_int m_kernelsInFlight
 The current number of kernels being executed.

std::atomic_uint m_totalTasks
 The number of tasks executed during the job in total.

std::atomic_uint m_gpuTasks
 The number of tasks sent to the GPU during the job.

std::unique_ptr< KernelRunnerSvcImpl > m_impl
 Implementation helper object.

Service properties
Gaudi::Property< int > m_nKernels
 The number of streams to use.

ServiceHandle< IStreamPoolSvc > m_streamPoolSvc
 Service managing the CUDA streams.
 

Detailed Description

Service used for executing AthCUDA::IKernelTask tasks.

It allows the user to execute a configured number of tasks in parallel on a GPU, while "overflow" tasks are executed on the CPU instead.
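
The overflow policy can be pictured with a small standalone model. This is a minimal sketch and not the service itself: `OverflowModel` and its member functions are hypothetical names, there is no Gaudi or CUDA dependency, and the increment of the in-flight counter on offload is an assumption, since the listings on this page do not show where `m_kernelsInFlight` is modified.

```cpp
#include <atomic>

// Hypothetical stand-in for the service's bookkeeping; NOT the real class.
class OverflowModel {
public:
   // nKernels plays the role of the "NParallelKernels" property.
   explicit OverflowModel( int nKernels ) : m_nKernels( nKernels ) {}

   // Returns true if the task would be offloaded to the GPU,
   // false if it would be run on the CPU in the caller thread.
   bool submit() {
      ++m_totalTasks;
      // Same limit check as in execute(): a positive limit that has been
      // reached sends the task to the CPU.
      if( ( m_nKernels > 0 ) && ( m_kernelsInFlight.load() >= m_nKernels ) ) {
         return false;
      }
      // ASSUMPTION: the in-flight counter is incremented on offload.
      ++m_kernelsInFlight;
      ++m_gpuTasks;
      return true;
   }

   // ASSUMPTION: called once per finished GPU task to free up a slot.
   void taskFinished() { --m_kernelsInFlight; }

   unsigned totalTasks() const { return m_totalTasks.load(); }
   unsigned gpuTasks() const { return m_gpuTasks.load(); }

private:
   const int m_nKernels;
   std::atomic_int m_kernelsInFlight{ 0 };
   std::atomic_uint m_totalTasks{ 0 };
   std::atomic_uint m_gpuTasks{ 0 };
};
```

With a limit of 2, a third back-to-back `submit()` falls back to the CPU, and the next one after a `taskFinished()` call is offloaded again. The check and the increment are separate atomic operations in this sketch, so concurrent callers could briefly exceed the limit.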

Author
Attila Krasznahorkay <Attila.Krasznahorkay@cern.ch>

Definition at line 33 of file KernelRunnerSvc.h.

Member Function Documentation

◆ execute()

StatusCode AthCUDA::KernelRunnerSvc::execute ( std::unique_ptr< IKernelTask > task )
override virtual

Execute a user specified kernel task.

If a GPU is available at runtime, a CUDA stream is free, and fewer than the configured number of kernels are already in flight, this function offloads the calculation to the GPU and returns right away. The caller is expected to return control of the calling thread to the framework, since the kernel task notifies the framework when it finishes.

If a GPU is not available for any reason, the function executes the task on the CPU in the caller's thread, and returns only once the task has finished.

Parameters
task  The task to be executed on the CPU or GPU
Returns
A code describing what happened to the task

Definition at line 72 of file KernelRunnerSvc.cxx.

 72  {
 73
 74     // Make sure that we received a valid task.
 75     if( task.get() == nullptr ) {
 76        ATH_MSG_ERROR( "Invalid task object received" );
 77        return StatusCode::FAILURE;
 78     }
 79
 80     // One way or another, we will execute this task.
 81     ++m_totalTasks;
 82
 83     // Check if a GPU is available, and no other thread is launching a GPU
 84     // calculation right now.
 85     if( ( ! m_impl ) || m_streamPoolSvc->isEmpty() ||
 86         ( ( m_nKernels.value() > 0 ) &&
 87           ( m_kernelsInFlight.load() >= m_nKernels.value() ) ) ) {
 88
 89        // If so, let's just execute the task in the current thread.
 90        ATH_MSG_VERBOSE( "Executing a task on the CPU" );
 91        StreamHolder dummy;
 92        if( task->finished( task->execute( dummy ),
 93                            IKernelTask::Synchronous ) != 0 ) {
 94           ATH_MSG_ERROR( "Failed to execute task in the caller thread!" );
 95           return StatusCode::FAILURE;
 96        }
 97
 98        // Return gracefully.
 99        return StatusCode::SUCCESS;
100     }
101
102     // If we got here, we need to schedule the task for execution on the/a
103     // GPU.
104
105     // Give the task to the implementation object to launch it.
106     ATH_MSG_VERBOSE( "Executing an offloaded task" );
108     ++m_gpuTasks;
109     m_impl->execute( std::move( task ) );
110
111     // Return gracefully.
112     return StatusCode::SUCCESS;
113  }
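
The fallback condition at lines 85 to 87 of the listing can be isolated as a pure function, which makes the individual cases easier to reason about. The sketch below is illustrative only: `Target` and `chooseTarget` are hypothetical names, and the two boolean parameters stand in for "`m_impl` is set" and `m_streamPoolSvc->isEmpty()`.

```cpp
// Hypothetical helper mirroring the CPU/GPU decision of execute().
enum class Target { CPU, GPU };

Target chooseTarget( bool hasImpl, bool streamPoolEmpty,
                     int nKernels, int kernelsInFlight ) {
   // Run on the CPU if there is no implementation object, no free stream,
   // or a positive kernel limit has already been reached.
   if( ( ! hasImpl ) || streamPoolEmpty ||
       ( ( nKernels > 0 ) && ( kernelsInFlight >= nKernels ) ) ) {
      return Target::CPU;
   }
   // Otherwise the task is a candidate for offloading.
   return Target::GPU;
}
```

Note that the limit only applies when `nKernels` is positive, so in this reading a non-positive value never triggers the "too many kernels in flight" fallback on its own.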

◆ finalize()

StatusCode AthCUDA::KernelRunnerSvc::finalize ( )
override virtual

Finalise the service.

Definition at line 51 of file KernelRunnerSvc.cxx.

51  {
52
53     // Destroy the implementation object.
54     m_impl.reset();
55
56     // Tell the user what happened.
57     ATH_MSG_INFO( " o All task(s) executed: " << m_totalTasks.load() );
58     const double percentage =
59        ( m_totalTasks != 0 ?
60          ( static_cast< double >( m_gpuTasks.load() ) /
61            static_cast< double >( m_totalTasks.load() ) * 100.0 ) : 0.0 );
62     ATH_MSG_INFO( " o GPU task(s) executed: " << m_gpuTasks.load() << " ("
63                   << percentage << "%)" );
64
65     // Finalise the base class.
67
68     // Return gracefully.
69     return StatusCode::SUCCESS;
70  }
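
The summary printed by this function reports the GPU share of all tasks, guarding against a division by zero when no task was run. A minimal standalone version of that arithmetic could look as follows; `gpuPercentage` is a hypothetical helper name, and plain `unsigned` values replace the atomic counters.

```cpp
// Hypothetical helper reproducing the percentage computed in finalize().
double gpuPercentage( unsigned gpuTasks, unsigned totalTasks ) {
   // With no tasks at all the share is reported as 0% rather than
   // dividing by zero.
   return ( totalTasks != 0 ?
            ( static_cast< double >( gpuTasks ) /
              static_cast< double >( totalTasks ) * 100.0 ) : 0.0 );
}
```

For example, 1 GPU task out of 4 gives 25%, and a job that ran no tasks gives 0%.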

◆ initialize()

StatusCode AthCUDA::KernelRunnerSvc::initialize ( )
override virtual

Initialise the service.

Definition at line 18 of file KernelRunnerSvc.cxx.

18  {
19
20     // Reset the internal counter(s).
22     m_totalTasks = 0;
23     m_gpuTasks = 0;
24
25     // If no devices are available or no kernels are allowed to run on the
26     // GPU, then don't even set up the implementation object.
27     if( ( Info::instance().nDevices() == 0 ) ||
28         ( m_nKernels.value() == 0 ) ) {
29        ATH_MSG_INFO( "Will run everything on the CPU." );
30        return StatusCode::SUCCESS;
31     }
32
33     // Access the stream pool service.
34     ATH_CHECK( m_streamPoolSvc.retrieve() );
35
36     // Create the implementation object.
37     m_impl = std::make_unique< KernelRunnerSvcImpl >( *m_streamPoolSvc,
38                                                       *this );
39
40     // Tell the user what happened.
41     std::ostringstream str;
42     str << Info::instance();
43     ATH_MSG_INFO( "Started service for running " << m_nKernels.value()
44                   << " GPU kernel(s) in parallel on device(s):\n"
45                   << str.str() );
46
47     // Return gracefully.
48     return StatusCode::SUCCESS;
49  }
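
The early return at lines 27 to 31 decides whether the GPU machinery is set up at all. The sketch below restates that decision as a standalone predicate; `gpuSetupNeeded` is a hypothetical name, and the `int` type used for the device count is an assumption about what `Info::instance().nDevices()` returns.

```cpp
// Hypothetical predicate mirroring the CPU-only shortcut in initialize().
bool gpuSetupNeeded( int nDevices, int nKernels ) {
   // No device, or a kernel limit of exactly zero, means that everything
   // runs on the CPU and no implementation object is created.
   if( ( nDevices == 0 ) || ( nKernels == 0 ) ) {
      return false;
   }
   return true;
}
```

Only a limit of exactly 0 disables offloading here. As far as the listings on this page show, a negative value still sets up the implementation object, and `execute()` then applies no upper limit on the kernels in flight.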

◆ setTaskFinished()

void AthCUDA::KernelRunnerSvc::setTaskFinished ( )

Definition at line 115 of file KernelRunnerSvc.cxx.

115  {
116
117     // Update the internal counter.
119     return;
120  }

Member Data Documentation

◆ m_gpuTasks

std::atomic_uint AthCUDA::KernelRunnerSvc::m_gpuTasks
private

The number of tasks sent to the GPU during the job.

Definition at line 97 of file KernelRunnerSvc.h.

◆ m_impl

std::unique_ptr< KernelRunnerSvcImpl > AthCUDA::KernelRunnerSvc::m_impl
private

Implementation helper object.

Definition at line 100 of file KernelRunnerSvc.h.

◆ m_kernelsInFlight

std::atomic_int AthCUDA::KernelRunnerSvc::m_kernelsInFlight
private

The current number of kernels being executed.

Definition at line 92 of file KernelRunnerSvc.h.

◆ m_nKernels

Gaudi::Property< int > AthCUDA::KernelRunnerSvc::m_nKernels
private
Initial value:
{ this, "NParallelKernels", 2,
"The number of CUDA kernels to execute in parallel" }

The number of streams to use.

Definition at line 81 of file KernelRunnerSvc.h.

◆ m_streamPoolSvc

ServiceHandle< IStreamPoolSvc > AthCUDA::KernelRunnerSvc::m_streamPoolSvc
private
Initial value:
{ this, "StreamPoolSvc",
"AthCUDA::StreamPoolSvc",
"The AthCUDA::StreamPoolSvc instance to use" }

Service managing the CUDA streams.

Definition at line 85 of file KernelRunnerSvc.h.

◆ m_totalTasks

std::atomic_uint AthCUDA::KernelRunnerSvc::m_totalTasks
private

The number of tasks executed during the job in total.

Definition at line 95 of file KernelRunnerSvc.h.


The documentation for this class was generated from the following files:
KernelRunnerSvc.h
KernelRunnerSvc.cxx