Interface for GPU kernel size optimization (allowing adjustment of kernel sizes to the properties of the available device).

#include <IGPUKernelSizeOptimizer.h>

Public Types
enum	SpecialSizeHints { CooperativeLaunch = -1 }

Public Member Functions
virtual void	register_kernel (const std::string &kernel_name, void *kernel, const int blocksize_hint, const int gridsize_hint, const int max_total_threads)
	Register a kernel with a specific name.

virtual void	register_kernels (const std::string &tool_name, const int number, void **kernels, const int *blocksize_hints, const int *gridsize_hints, const int *max_total_threads, const int offset=0)=0
	Register a set of kernels that can be referred back to with a name and a number.

virtual CUDAKernelLaunchConfiguration	get_launch_configuration (const std::string &name, const int number=0, const int dynamic_memory=0) const =0
	Retrieve the (hopefully optimal) kernel launch configuration.

virtual bool	can_use_cooperative_groups () const =0
	Whether the device + environment in use support cooperative groups.

virtual bool	can_use_dynamic_parallelism () const =0
	Whether the device + environment in use support dynamic parallelism.

virtual bool	use_minimal_kernel_sizes () const
	Whether to avoid oversizing kernels and instead (if possible) launch kernels with the exact number of threads...

virtual	~IGPUKernelSizeOptimizer ()=default

Detailed Description

Interface for GPU kernel size optimization (allowing adjustment of kernel sizes to the properties of the available device).

Author: Nuno Fernandes nuno.dos.santos.fernandes@cern.ch

Date: 03 August 2023

This class should remain independent of any Athena-centric includes, so that it can be straightforwardly used from within .cu files.

Definition at line 28 of file IGPUKernelSizeOptimizer.h.

Member Enumeration Documentation

◆ SpecialSizeHints

enum IGPUKernelSizeOptimizer::SpecialSizeHints

Enumerator
CooperativeLaunch

Definition at line 32 of file IGPUKernelSizeOptimizer.h.

   {
     CooperativeLaunch = -1
   };

Constructor & Destructor Documentation

◆ ~IGPUKernelSizeOptimizer()

virtual IGPUKernelSizeOptimizer::~IGPUKernelSizeOptimizer ( )

virtual default

Member Function Documentation

◆ can_use_cooperative_groups()

virtual bool IGPUKernelSizeOptimizer::can_use_cooperative_groups ( ) const

pure virtual

Whether the device + environment in use support cooperative groups.

◆ can_use_dynamic_parallelism()

virtual bool IGPUKernelSizeOptimizer::can_use_dynamic_parallelism ( ) const

pure virtual

Whether the device + environment in use support dynamic parallelism.

◆ get_launch_configuration()

virtual CUDAKernelLaunchConfiguration IGPUKernelSizeOptimizer::get_launch_configuration	(	const std::string &	name,
		const int	number = `0`,
		const int	dynamic_memory = `0`
	)		const

pure virtual

Retrieve the (hopefully optimal) kernel launch configuration.
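This page does not say how an implementation turns the registered hints into a launch configuration, nor what CUDAKernelLaunchConfiguration contains. As a rough illustration only, the sketch below shows one plausible host-side heuristic. It assumes that max_total_threads is an upper bound on the useful number of threads and that the hints are upper bounds on block and grid size. `LaunchConfigSketch` and `clamp_launch` are hypothetical names and are not part of the documented interface.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a one-dimensional launch configuration.
// The real CUDAKernelLaunchConfiguration is not described on this page.
struct LaunchConfigSketch
{
  int grid_x;
  int block_x;
};

// Assumed heuristic (not taken from the documented class):
// never use more threads per block than there are threads in total,
// and never launch more blocks than are needed to cover them.
inline LaunchConfigSketch clamp_launch(const int blocksize_hint,
                                       const int gridsize_hint,
                                       const int max_total_threads)
{
  assert(blocksize_hint > 0 && gridsize_hint > 0 && max_total_threads > 0);

  const int block = std::min(blocksize_hint, max_total_threads);

  // Ceiling division: blocks required to cover max_total_threads.
  const int blocks_needed = (max_total_threads + block - 1) / block;

  const int grid = std::min(gridsize_hint, blocks_needed);

  return LaunchConfigSketch{grid, block};
}
```

A real implementation would more likely consult the device (for example through the CUDA occupancy queries) rather than rely on the hints alone; the sketch only shows how the three registered numbers could interact.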

◆ register_kernel()

virtual void IGPUKernelSizeOptimizer::register_kernel	(	const std::string &	kernel_name,
		void *	kernel,
		const int	blocksize_hint,
		const int	gridsize_hint,
		const int	max_total_threads
	)

inline virtual

Register a kernel with a specific name.

Definition at line 39 of file IGPUKernelSizeOptimizer.h.

   {
     this->register_kernels(kernel_name, 1, &kernel, &blocksize_hint, &gridsize_hint, &max_total_threads, 0);
   }
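The forwarding above works because the address of each by-value parameter can be treated as an array of length one. The sketch below reproduces that forwarding in a reduced stand-in for the interface and records what register_kernels receives. `RegistrationSketch` and `RecordingOptimizer` are hypothetical names introduced for illustration; only the two member signatures are taken from this page.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for the registration part of the interface.
struct RegistrationSketch
{
  virtual void register_kernels(const std::string & tool_name, const int number,
                                void ** kernels, const int * blocksize_hints,
                                const int * gridsize_hints, const int * max_total_threads,
                                const int offset = 0) = 0;

  // Same forwarding as the documented inline definition:
  // each parameter's address serves as a one-element array.
  virtual void register_kernel(const std::string & kernel_name, void * kernel,
                               const int blocksize_hint, const int gridsize_hint,
                               const int max_total_threads)
  {
    this->register_kernels(kernel_name, 1, &kernel, &blocksize_hint,
                           &gridsize_hint, &max_total_threads, 0);
  }

  virtual ~RegistrationSketch() = default;
};

// Hypothetical implementation that only records the first entry of each call.
struct RecordingOptimizer : RegistrationSketch
{
  struct Call
  {
    std::string name;
    int number;
    void * first_kernel;
    int first_blocksize;
    int first_gridsize;
    int first_max_threads;
    int offset;
  };

  std::vector<Call> calls;

  void register_kernels(const std::string & tool_name, const int number,
                        void ** kernels, const int * blocksize_hints,
                        const int * gridsize_hints, const int * max_total_threads,
                        const int offset) override
  {
    assert(number >= 1); // the sketch reads element 0 of every array
    calls.push_back({tool_name, number, kernels[0], blocksize_hints[0],
                     gridsize_hints[0], max_total_threads[0], offset});
  }
};
```

Note that the pointers passed on are only valid for the duration of the call, so an implementation of register_kernels has to copy the values it wants to keep, as the recorder does here.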

◆ register_kernels()

virtual void IGPUKernelSizeOptimizer::register_kernels	(	const std::string &	tool_name,
		const int	number,
		void **	kernels,
		const int *	blocksize_hints,
		const int *	gridsize_hints,
		const int *	max_total_threads,
		const int	offset = `0`
	)

pure virtual

Register a set of kernels that can be referred back to with a name and a number.

Uses C-style arrays for more immediate CUDA compatibility. The arrays kernels, blocksize_hints and gridsize_hints are assumed to hold number entries each, and the kernels are numbered starting from the optional offset.
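To make the numbering concrete, here is a minimal sketch of how an implementation might store the registered entries so that they can later be found by name and number. It assumes that entry i of each array is stored under number i + offset and that max_total_threads also holds number entries (the page does not state the latter explicitly). `RegistrySketch`, `KernelEntrySketch` and `find` are hypothetical names, not part of the documented interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of what was registered for one kernel.
struct KernelEntrySketch
{
  void * kernel;
  int blocksize_hint;
  int gridsize_hint;
  int max_total_threads;
};

// Hypothetical registry; it mirrors the documented signature
// but does not derive from the real interface.
class RegistrySketch
{
 public:
  void register_kernels(const std::string & tool_name, const int number,
                        void ** kernels, const int * blocksize_hints,
                        const int * gridsize_hints, const int * max_total_threads,
                        const int offset = 0)
  {
    assert(number >= 0 && offset >= 0);

    std::vector<KernelEntrySketch> & entries = m_entries[tool_name];

    // Grow the per-name table so that numbers offset .. offset + number - 1 exist.
    const std::size_t needed = static_cast<std::size_t>(number) + static_cast<std::size_t>(offset);
    if (entries.size() < needed)
    {
      entries.resize(needed, KernelEntrySketch{nullptr, 0, 0, 0});
    }

    // Assumption: array element i is kernel number i + offset.
    for (int i = 0; i < number; ++i)
    {
      entries[static_cast<std::size_t>(i + offset)] =
        KernelEntrySketch{kernels[i], blocksize_hints[i], gridsize_hints[i], max_total_threads[i]};
    }
  }

  // Returns nullptr if the name or the number is unknown.
  const KernelEntrySketch * find(const std::string & name, const int number = 0) const
  {
    const auto it = m_entries.find(name);
    if (it == m_entries.end() || number < 0 ||
        static_cast<std::size_t>(number) >= it->second.size())
    {
      return nullptr;
    }
    return &it->second[static_cast<std::size_t>(number)];
  }

 private:
  std::map<std::string, std::vector<KernelEntrySketch>> m_entries;
};
```

With this layout, a tool could register its kernels in several batches under the same name by passing a different offset each time.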

◆ use_minimal_kernel_sizes()

virtual bool IGPUKernelSizeOptimizer::use_minimal_kernel_sizes ( ) const

inline virtual

Whether to avoid oversizing kernels and instead (if possible) launch kernels with the exact number of threads...

Definition at line 74 of file IGPUKernelSizeOptimizer.h.

   {
     //Testing shows that, at least on the devices we use,
     //we only lose performance by dyn-par'ing our way to do this.
     return false;
   }
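Since the default simply returns false, a derived class can opt in by overriding the function. The sketch below assumes, based only on the comment above, that minimal kernel sizes depend on dynamic parallelism being available. `SizePolicySketch` and `OptInPolicy` are hypothetical names, and the stand-in interface contains just the two functions needed for the illustration.

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for the interface.
struct SizePolicySketch
{
  virtual bool can_use_dynamic_parallelism() const = 0;

  // Same default as the documented inline definition.
  virtual bool use_minimal_kernel_sizes() const
  {
    return false;
  }

  virtual ~SizePolicySketch() = default;
};

// Hypothetical implementation that opts in to minimal kernel sizes
// only when dynamic parallelism is reported as available.
class OptInPolicy : public SizePolicySketch
{
 public:
  OptInPolicy(const bool has_dynamic_parallelism, const bool prefer_minimal)
    : m_has_dynamic_parallelism(has_dynamic_parallelism),
      m_prefer_minimal(prefer_minimal)
  {
  }

  bool can_use_dynamic_parallelism() const override
  {
    return m_has_dynamic_parallelism;
  }

  bool use_minimal_kernel_sizes() const override
  {
    // Assumption: exact-size launches are only possible with dynamic parallelism.
    return m_has_dynamic_parallelism && m_prefer_minimal;
  }

 private:
  bool m_has_dynamic_parallelism;
  bool m_prefer_minimal;
};
```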

The documentation for this class was generated from the following file:

IGPUKernelSizeOptimizer.h
