Service used for executing AthCUDA::IKernelTask tasks.

#include <KernelRunnerSvc.h>

Public Member Functions
void	setTaskFinished ()

Interface inherited from IService
virtual StatusCode	initialize () override
	Initialise the service.

virtual StatusCode	finalize () override
	Finalise the service.

Interface inherited from AthCUDA::IKernelRunnerSvc
virtual StatusCode	execute (std::unique_ptr< IKernelTask > task) override
	Execute a user specified kernel task.

Private Attributes
std::atomic_int	m_kernelsInFlight
	The current number of kernels being executed.

std::atomic_uint	m_totalTasks
	The number of tasks executed during the job in total.

std::atomic_uint	m_gpuTasks
	The number of tasks sent to the GPU during the job.

std::unique_ptr< KernelRunnerSvcImpl >	m_impl
	Implementation helper object.

Service properties
Gaudi::Property< int >	m_nKernels
	The number of CUDA kernels to execute in parallel.

ServiceHandle< IStreamPoolSvc >	m_streamPoolSvc
	Service managing the CUDA streams.

Detailed Description

Service used for executing AthCUDA::IKernelTask tasks.

It allows the user to execute a configured number of tasks in parallel on a GPU, while "overflow" tasks are executed on the CPU instead.

Author: Attila Krasznahorkay Attila.Krasznahorkay@cern.ch

Definition at line 33 of file KernelRunnerSvc.h.
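
The "overflow" policy described above can be sketched as a stand-alone predicate. This is only an illustration modelled on the condition in execute(); the function runsOnCpu() and its parameter names are hypothetical and not part of AthCUDA, and the meaning of an "empty" stream pool is assumed to be whatever IStreamPoolSvc::isEmpty() reports.

```cpp
#include <cassert>

// Hypothetical helper, not part of AthCUDA. It mirrors the condition that
// KernelRunnerSvc::execute() uses to choose between the CPU and the GPU.
//
//   hasGpuBackend : the implementation helper object was created
//   poolIsEmpty   : the stream pool reported itself as empty
//   maxKernels    : value of the "NParallelKernels" property
//   inFlight      : number of kernels currently being executed
bool runsOnCpu( bool hasGpuBackend, bool poolIsEmpty, int maxKernels,
                int inFlight ) {

   assert( inFlight >= 0 );

   // Without a GPU backend or a usable stream pool, only the CPU is left.
   if( ( ! hasGpuBackend ) || poolIsEmpty ) {
      return true;
   }

   // A positive limit caps the number of kernels in flight. For
   // non-positive values this condition applies no cap at all.
   return ( ( maxKernels > 0 ) && ( inFlight >= maxKernels ) );
}
```

With the default limit of 2, a third task arriving while two kernels are in flight would be treated as overflow and run on the CPU.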

Member Function Documentation

◆ execute()

StatusCode AthCUDA::KernelRunnerSvc::execute ( std::unique_ptr< IKernelTask > task )

override virtual

Execute a user specified kernel task.

If a GPU is available at runtime and fewer than the configured number of kernels are currently in flight, this function offloads the calculation to the GPU and returns right away. The caller is expected to return control of the calling thread to the framework, since the kernel task notifies the framework when it finishes.

If no GPU is available for any reason, or the limit on parallel kernels has been reached, the function executes the task on the CPU in the caller's thread and returns only once the task has finished.

Parameters

task	The task to be executed on the CPU or GPU

Returns: A code describing what happened to the task

Definition at line 72 of file KernelRunnerSvc.cxx.

    StatusCode KernelRunnerSvc::execute( std::unique_ptr< IKernelTask > task ) {
  
       // Make sure that we received a valid task.
       if( task.get() == nullptr ) {
          ATH_MSG_ERROR( "Invalid task object received" );
          return StatusCode::FAILURE;
       }
  
       // One way or another, we will execute this task.
       ++m_totalTasks;
  
       // Check whether the task has to fall back to the CPU: there is no
       // GPU implementation object, the stream pool is empty, or the limit
       // on parallel kernels has been reached.
       if( ( ! m_impl ) || m_streamPoolSvc->isEmpty() ||
           ( ( m_nKernels.value() > 0 ) &&
             ( m_kernelsInFlight.load() >= m_nKernels.value() ) ) ) {
  
          // If so, let's just execute the task in the current thread.
          ATH_MSG_VERBOSE( "Executing a task on the CPU" );
          StreamHolder dummy;
          if( task->finished( task->execute( dummy ),
                              IKernelTask::Synchronous ) != 0 ) {
             ATH_MSG_ERROR( "Failed to execute task in the caller thread!" );
             return StatusCode::FAILURE;
          }
  
          // Return gracefully.
          return StatusCode::SUCCESS;
       }
  
       // If we got here, we need to schedule the task for execution on the/a
       // GPU.
  
       // Give the task to the implementation object to launch it.
       ATH_MSG_VERBOSE( "Executing an offloaded task" );
       ++m_kernelsInFlight;
       ++m_gpuTasks;
       m_impl->execute( std::move( task ) );
  
       // Return gracefully.
       return StatusCode::SUCCESS;
    }
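
The two execution paths can be illustrated without any CUDA or Athena dependency. The following is a minimal sketch, assuming a simplified task type: MockTask and MockRunner are hypothetical stand-ins (the real IKernelTask::execute() takes a StreamHolder, and the real asynchronous work is done by KernelRunnerSvcImpl), and a plain queue plays the role of the GPU.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for AthCUDA::IKernelTask.
struct MockTask {
   enum Mode { Synchronous, Asynchronous };
   std::function< int() > work;                 // returns 0 on success
   std::function< void( int, Mode ) > finished; // completion notification
};

// Hypothetical stand-in for the service; a queue replaces the GPU.
class MockRunner {
public:
   // Returns true if the task was queued for later execution, false if
   // it already ran to completion in the caller's thread.
   bool execute( std::unique_ptr< MockTask > task, bool gpuFree ) {
      assert( task != nullptr );
      if( ! gpuFree ) {
         // CPU path: run and notify before returning.
         task->finished( task->work(), MockTask::Synchronous );
         return false;
      }
      // "GPU" path: hand the task over and return right away.
      m_pending.push( std::move( task ) );
      return true;
   }

   // Stand-in for the GPU finishing its queued work at a later time.
   void drain() {
      while( ! m_pending.empty() ) {
         std::unique_ptr< MockTask > task = std::move( m_pending.front() );
         m_pending.pop();
         task->finished( task->work(), MockTask::Asynchronous );
      }
   }

private:
   std::queue< std::unique_ptr< MockTask > > m_pending;
};
```

In the synchronous case the notification has already happened by the time execute() returns; in the offloaded case it only arrives later, which is why the caller should not wait for it.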

◆ finalize()

StatusCode AthCUDA::KernelRunnerSvc::finalize ( )

override virtual

Finalise the service.

Definition at line 51 of file KernelRunnerSvc.cxx.

                                         {
  
       // Destroy the implementation object.
       m_impl.reset();
  
       // Tell the user what happened.
       ATH_MSG_INFO( " o All task(s) executed: " << m_totalTasks.load() );
       const double percentage =
          ( m_totalTasks != 0 ?
            ( static_cast< double >( m_gpuTasks.load() ) /
              static_cast< double >( m_totalTasks.load() ) * 100.0 ) : 0.0 );
       ATH_MSG_INFO( " o GPU task(s) executed: " << m_gpuTasks.load() << " ("
                     << percentage << "%)" );
  
       // Finalise the base class.
       ATH_CHECK( Service::finalize() );
  
       // Return gracefully.
       return StatusCode::SUCCESS;
    }
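
The summary statistic printed above guards against a division by zero when no task was executed. A minimal sketch of the same arithmetic, with the hypothetical free function gpuPercentage() standing in for the inline expression:

```cpp
#include <cassert>

// Hypothetical helper, not part of AthCUDA: share of tasks that were
// sent to the GPU, in percent. Returns 0 when no task was executed.
double gpuPercentage( unsigned int gpuTasks, unsigned int totalTasks ) {

   assert( gpuTasks <= totalTasks );

   if( totalTasks == 0 ) {
      return 0.0;
   }
   return ( static_cast< double >( gpuTasks ) /
            static_cast< double >( totalTasks ) * 100.0 );
}
```

For example, 1 GPU task out of 4 tasks in total gives 25%.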

◆ initialize()

StatusCode AthCUDA::KernelRunnerSvc::initialize ( )

override virtual

Initialise the service.

Definition at line 18 of file KernelRunnerSvc.cxx.

                                           {
  
       // Reset the internal counter(s).
       m_kernelsInFlight = 0;
       m_totalTasks = 0;
       m_gpuTasks = 0;
  
       // If no devices are available or no kernels are allowed to run on the
       // GPU, then don't even set up the implementation object.
       if( ( Info::instance().nDevices() == 0 ) ||
           ( m_nKernels.value() == 0 ) ) {
          ATH_MSG_INFO( "Will run everything on the CPU." );
          return StatusCode::SUCCESS;
       }
  
       // Access the stream pool service.
       ATH_CHECK( m_streamPoolSvc.retrieve() );
  
       // Create the implementation object.
       m_impl = std::make_unique< KernelRunnerSvcImpl >( *m_streamPoolSvc,
                                                         *this );
  
       // Tell the user what happened.
       std::ostringstream str;
       str << Info::instance();
       ATH_MSG_INFO( "Started service for running " << m_nKernels.value()
                     << " GPU kernel(s) in parallel on device(s):\n"
                     << str.str() );
  
       // Return gracefully.
       return StatusCode::SUCCESS;
    }
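
The code above leaves m_impl empty when nothing can run on a GPU, and execute() later treats an empty pointer as "CPU only". A minimal sketch of that pattern, assuming a hypothetical MockGpuHelper type and makeHelper() factory in place of KernelRunnerSvcImpl:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the implementation helper object.
struct MockGpuHelper {
   int nKernels;
};

// Creates the helper only if at least one device exists and GPU kernels
// are allowed; a null pointer signals that everything runs on the CPU.
std::unique_ptr< MockGpuHelper > makeHelper( int nDevices, int nKernels ) {

   assert( nDevices >= 0 );

   if( ( nDevices == 0 ) || ( nKernels == 0 ) ) {
      return nullptr;
   }
   return std::make_unique< MockGpuHelper >( MockGpuHelper{ nKernels } );
}
```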

◆ setTaskFinished()

void AthCUDA::KernelRunnerSvc::setTaskFinished ( )

Definition at line 115 of file KernelRunnerSvc.cxx.

                                          {
  
       // Update the internal counter.
       --m_kernelsInFlight;
       return;
    }
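
The counter decremented here is the one incremented in execute() before a task is offloaded. The sketch below shows that lifecycle under the assumption that the helper object, which receives a reference to the service on construction, is the one calling setTaskFinished(); the quoted code does not show the caller. MockService and MockHelper are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical owner of the in-flight counter.
class MockService {
public:
   // Called when a task is handed over for GPU execution.
   void launched() { ++m_kernelsInFlight; }
   // Called when an offloaded task has completed.
   void setTaskFinished() { --m_kernelsInFlight; }
   int kernelsInFlight() const { return m_kernelsInFlight.load(); }

private:
   std::atomic_int m_kernelsInFlight{ 0 };
};

// Hypothetical helper that reports completion back to its owner.
class MockHelper {
public:
   explicit MockHelper( MockService& svc ) : m_svc( svc ) {}
   void onKernelDone() {
      assert( m_svc.kernelsInFlight() > 0 );
      m_svc.setTaskFinished();
   }

private:
   MockService& m_svc;
};
```

An atomic counter keeps the individual increments and decrements safe when completions arrive on a different thread than the launches.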

Member Data Documentation

◆ m_gpuTasks

std::atomic_uint AthCUDA::KernelRunnerSvc::m_gpuTasks

private

The number of tasks sent to the GPU during the job.

Definition at line 97 of file KernelRunnerSvc.h.

◆ m_impl

std::unique_ptr< KernelRunnerSvcImpl > AthCUDA::KernelRunnerSvc::m_impl

private

Implementation helper object.

Definition at line 100 of file KernelRunnerSvc.h.

◆ m_kernelsInFlight

std::atomic_int AthCUDA::KernelRunnerSvc::m_kernelsInFlight

private

The current number of kernels being executed.

Definition at line 92 of file KernelRunnerSvc.h.

◆ m_nKernels

Gaudi::Property< int > AthCUDA::KernelRunnerSvc::m_nKernels

private

Initial value:

{ this, "NParallelKernels", 2,
         "The number of CUDA kernels to execute in parallel" }

The number of CUDA kernels to execute in parallel. A value of 0 makes the service run everything on the CPU.

Definition at line 81 of file KernelRunnerSvc.h.

◆ m_streamPoolSvc

ServiceHandle< IStreamPoolSvc > AthCUDA::KernelRunnerSvc::m_streamPoolSvc

private

Initial value:

{ this, "StreamPoolSvc",
         "AthCUDA::StreamPoolSvc",
         "The AthCUDA::StreamPoolSvc instance to use" }

Service managing the CUDA streams.

Definition at line 85 of file KernelRunnerSvc.h.

◆ m_totalTasks

std::atomic_uint AthCUDA::KernelRunnerSvc::m_totalTasks

private

The number of tasks executed during the job in total.

Definition at line 95 of file KernelRunnerSvc.h.

The documentation for this class was generated from the following files:

KernelRunnerSvc.h
KernelRunnerSvc.cxx
