ATLAS Offline Software
Vectorization helpers. More...
#include "CxxUtils/features.h"
#include "CxxUtils/inline_hints.h"
#include <cstdlib>
#include <cstring>
#include <type_traits>
#include "CxxUtils/vec_fb.h"
Classes | |
struct | CxxUtils::vecDetail::vec_typedef< T, N > |
Check the type and the size of the vector. More... | |
struct | CxxUtils::vecDetail::vec_type< VEC > |
Deduce the element type from a vectorized type. More... | |
struct | CxxUtils::vecDetail::vec_mask_type< VEC > |
Deduce the type of the mask returned by relational operations, for a vectorized type. More... | |
struct | CxxUtils::vecDetail::bool_pack_helper::bool_pack<... > |
Namespaces | |
CxxUtils | |
CxxUtils::vecDetail | |
CxxUtils::vecDetail::bool_pack_helper | |
Helper for static asserts for argument packs. | |
Macros | |
#define | WANT_VECTOR_FALLBACK 0 |
Typedefs | |
template<bool... bs> | |
using | CxxUtils::vecDetail::bool_pack_helper::all_true = std::is_same< bool_pack< bs..., true >, bool_pack< true, bs... > > |
template<typename T , size_t N> | |
using | CxxUtils::vec = typename vecDetail::vec_typedef< T, N >::type |
Define a nice alias for the vectorized type. More... | |
template<class VEC > | |
using | CxxUtils::vec_type_t = typename vecDetail::vec_type< VEC >::type |
Define a nice alias for the element type of a vectorized type. More... | |
template<class VEC > | |
using | CxxUtils::vec_mask_type_t = typename vecDetail::vec_mask_type< VEC >::type |
Define a nice alias for the mask type for a vectorized type. More... | |
Functions | |
template<class VEC > | |
constexpr ATH_ALWAYS_INLINE size_t | CxxUtils::vec_size () |
Return the number of elements in a vectorized type. More... | |
template<class VEC > | |
constexpr ATH_ALWAYS_INLINE size_t | CxxUtils::vec_size (const VEC &) |
Return the number of elements in a vectorized type. More... | |
template<typename VEC , typename T > | |
ATH_ALWAYS_INLINE void | CxxUtils::vbroadcast (VEC &v, T x) |
Copy a scalar to each element of a vectorized type. More... | |
template<typename VEC > | |
ATH_ALWAYS_INLINE void | CxxUtils::vload (VEC &dst, vec_type_t< VEC > const *src) |
template<typename VEC > | |
ATH_ALWAYS_INLINE void | CxxUtils::vstore (vec_type_t< VEC > *dst, const VEC &src) |
template<typename VEC > | |
ATH_ALWAYS_INLINE void | CxxUtils::vselect (VEC &dst, const VEC &a, const VEC &b, const vec_mask_type_t< VEC > &mask) |
template<typename VEC > | |
ATH_ALWAYS_INLINE void | CxxUtils::vmin (VEC &dst, const VEC &a, const VEC &b) |
template<typename VEC > | |
ATH_ALWAYS_INLINE void | CxxUtils::vmax (VEC &dst, const VEC &a, const VEC &b) |
template<typename VEC > | |
ATH_ALWAYS_INLINE bool | CxxUtils::vany (const VEC &mask) |
template<typename VEC > | |
ATH_ALWAYS_INLINE bool | CxxUtils::vnone (const VEC &mask) |
template<typename VEC > | |
ATH_ALWAYS_INLINE bool | CxxUtils::vall (const VEC &mask) |
template<typename VEC1 , typename VEC2 > | |
ATH_ALWAYS_INLINE void | CxxUtils::vconvert (VEC1 &dst, const VEC2 &src) |
Fill dst with the result of a static_cast of each element of src. More... | |
template<size_t... Indices, typename VEC , typename VEC1 > | |
ATH_ALWAYS_INLINE void | CxxUtils::vpermute (VEC1 &dst, const VEC &src) |
vpermute function. More... | |
template<size_t... Indices, typename VEC , typename VEC1 > | |
ATH_ALWAYS_INLINE void | CxxUtils::vpermute2 (VEC1 &dst, const VEC &src1, const VEC &src2) |
vpermute2 function. More... | |
Vectorization helpers.
This file provides some helpers for writing vectorized code in C++.
A vectorized type may be named as CxxUtils::vec<T, N>. Here T is the element type, which should be an elementary integer or floating-point type. N is the number of elements in the vector; it should be a power of 2. This will either be a built-in vector type if the vector_size attribute is supported or a fallback C++ class intended to be (mostly) functionally equivalent (see vec_fb.h).
The GCC, clang and fallback vector types support: ++, --, +, -, *, /, %, =, &, |, ^, ~, >>, <<, !, &&, ||, ==, !=, >, <, >=, <=, sizeof, and initialization from brace-enclosed lists.
Furthermore, the GCC and clang vector types support the ternary operator.
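As an illustration of the built-in operators, the following is a minimal sketch that does not include the ATLAS header. It assumes GCC or clang and defines a local stand-in for CxxUtils::vec with the vector_size attribute; the names vec_typedef, vec and demo_operators are local to this example, and the fallback class from vec_fb.h is not exercised.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-in for CxxUtils::vec<T, N> (an assumption for this sketch,
// not the ATLAS header). Requires the vector_size attribute (GCC or clang).
template <typename T, std::size_t N>
struct vec_typedef {
  static_assert((N & (N - 1)) == 0, "N must be a power of 2");
  using type __attribute__((vector_size(N * sizeof(T)))) = T;
};
template <typename T, std::size_t N>
using vec = typename vec_typedef<T, N>::type;

// Exercises brace initialization, arithmetic, bitwise and relational
// operators. Returns true if every lane has the expected value.
inline bool demo_operators() {
  using vec4i = vec<std::int32_t, 4>;
  static_assert(sizeof(vec4i) == 4 * sizeof(std::int32_t), "four packed lanes");

  const vec4i a = {1, 2, 3, 4};
  const vec4i b = {10, 20, 30, 40};
  const vec4i two = {2, 2, 2, 2};
  const vec4i low_bits = {0xF, 0xF, 0xF, 0xF};

  const vec4i fma = a * b + 1;    // element-wise; the scalar 1 is broadcast
  const vec4i masked = b & low_bits;
  const auto gt = a > two;        // each lane is 0 (false) or -1 (true)

  bool ok = fma[0] == 11 && fma[1] == 41 && fma[2] == 91 && fma[3] == 161;
  ok = ok && masked[0] == 10 && masked[1] == 4 && masked[2] == 14 && masked[3] == 8;
  ok = ok && gt[0] == 0 && gt[1] == 0 && gt[2] == -1 && gt[3] == -1;
  return ok;
}
```

The relational result is itself a vector, which is the kind of mask that CxxUtils::vec_mask_type_t names.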
We also support some additional operations.
Deducing useful types:
  CxxUtils::vec_type_t<VEC> is the element type of VEC.
  CxxUtils::vec_mask_type_t<VEC> is the vector type returned by relational operations.
Deducing the number of elements in a vectorized type:
  CxxUtils::vec_size<VEC>() is the number of elements in VEC.
  CxxUtils::vec_size(const VEC&) is the number of elements in VEC.
Initializing with a value:
  CxxUtils::vbroadcast(VEC& v, T x) initializes each element of v with x.
Load from/store to array:
  CxxUtils::vload(VEC& dst, const vec_type_t<VEC>* src) loads elements from src to dst.
  CxxUtils::vstore(vec_type_t<VEC>* dst, const VEC& src) stores elements from src to dst.
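The broadcast, load and store helpers above are typically combined in a loop over an array. The sketch below shows that pattern under stated assumptions: it targets GCC or clang, uses a local stand-in for CxxUtils::vec, and the sketch_* functions are hypothetical reimplementations written for this example (a lane loop and std::memcpy), not the code in vec.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <utility>

// Local stand-in for CxxUtils::vec<T, N>; assumes GCC or clang.
template <typename T, std::size_t N>
struct vec_typedef {
  using type __attribute__((vector_size(N * sizeof(T)))) = T;
};
template <typename T, std::size_t N>
using vec = typename vec_typedef<T, N>::type;

// Hypothetical helper playing the role of CxxUtils::vec_type_t.
template <typename VEC>
using elem_t = std::remove_cv_t<
    std::remove_reference_t<decltype(std::declval<const VEC&>()[0])>>;

// Copy the scalar x to every lane of v.
template <typename VEC, typename T>
void sketch_vbroadcast(VEC& v, T x) {
  VEC tmp = {};
  for (std::size_t i = 0; i < sizeof(VEC) / sizeof(elem_t<VEC>); ++i)
    tmp[i] = static_cast<elem_t<VEC>>(x);
  v = tmp;
}

// Load sizeof(VEC) bytes from src; memcpy places no alignment demand on src.
template <typename VEC>
void sketch_vload(VEC& dst, const elem_t<VEC>* src) {
  std::memcpy(&dst, src, sizeof(VEC));
}

// Store all lanes of src to dst.
template <typename VEC>
void sketch_vstore(elem_t<VEC>* dst, const VEC& src) {
  std::memcpy(dst, &src, sizeof(VEC));
}

// Multiply n floats in place by factor, four lanes at a time.
inline void scale_array(float* data, std::size_t n, float factor) {
  assert(data != nullptr || n == 0);
  using vec4f = vec<float, 4>;
  vec4f f = {};
  sketch_vbroadcast(f, factor);
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    vec4f v = {};
    sketch_vload(v, data + i);
    v *= f;
    sketch_vstore(data + i, v);
  }
  for (; i < n; ++i) data[i] *= factor;  // scalar tail for leftover elements
}
```

The scalar tail loop handles array lengths that are not a multiple of the vector width.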
Basic algorithms:
  CxxUtils::vselect(VEC& dst, const VEC& a, const VEC& b, const vec_mask_type_t<VEC>& mask) copies elements from a or b to dst, depending on the value of mask: dst[i] = mask[i] ? a[i] : b[i].
  CxxUtils::vmin(VEC& dst, const VEC& a, const VEC& b) sets dst[i] to min(a[i], b[i]).
  CxxUtils::vmax(VEC& dst, const VEC& a, const VEC& b) sets dst[i] to max(a[i], b[i]).
Bool reductions:
  CxxUtils::vany(const VEC& mask) returns true if at least one value in mask is true.
  CxxUtils::vnone(const VEC& mask) returns true if all values in mask are false.
  CxxUtils::vall(const VEC& mask) returns true if all values in mask are true.
Conversions/casting:
  CxxUtils::vconvert(VEC1& dst, const VEC2& src) fills dst with the result of a static_cast of every element of src to the element type of dst: dst[i] = static_cast<vec_type_t<VEC1>>(src[i]).
Permutations:
  The destination is a vector with the same element type as the source vector(s), but with an element count equal to the number of indices specified.
  CxxUtils::vpermute<mask>(VEC1& dst, const VEC& src) fills dst with a permutation of src according to mask. mask is a list of integers that specifies the elements that should be extracted from src and returned in dst: dst[i] = src[mask[i]], where mask[i] is the ith integer in mask.
  CxxUtils::vpermute2<mask>(VEC1& dst, const VEC& src1, const VEC& src2) fills dst with a permutation of src1 and src2 according to mask. mask is a list of integers that specifies the elements that should be extracted from src1 and src2. With N the number of elements in each source vector, an index i in the interval [0,N) indicates that element number i from the first input vector should be placed in the corresponding position in the result vector. An index i in the interval [N,2N) indicates that element number i-N from the second input vector should be placed in the corresponding position in the result vector.
For good performance the user should use vector types that fit the size of the ISA, e.g. 128 bits wide for SSE, 256 bits wide for AVX, etc.
Specifying a combination that is not valid for the current architecture causes the compiler to synthesize the instructions using a narrower mode. But this might not always produce optimal code for all operations.
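To make the width guidance concrete, the following sketch works out how many elements fill a register of a given width. The helper lanes_for_register is hypothetical (it is not part of CxxUtils), and the checks assume 8-bit bytes, a 4-byte float and an 8-byte double.

```cpp
#include <cstddef>

// Hypothetical helper: number of elements of type T that fill a register
// that is register_bits wide. Assumes 8-bit bytes.
template <typename T>
constexpr std::size_t lanes_for_register(std::size_t register_bits) {
  return register_bits / (8 * sizeof(T));
}

// 128-bit registers (e.g. SSE): 4 floats or 2 doubles per vector.
static_assert(lanes_for_register<float>(128) == 4, "128-bit: 4 floats");
static_assert(lanes_for_register<double>(128) == 2, "128-bit: 2 doubles");

// 256-bit registers (e.g. AVX): 8 floats or 4 doubles per vector.
static_assert(lanes_for_register<float>(256) == 8, "256-bit: 8 floats");
static_assert(lanes_for_register<double>(256) == 4, "256-bit: 4 doubles");
```

Under these assumptions, a type with four float elements matches a 128-bit register, while one with eight float elements matches a 256-bit register and would be split into narrower operations on a target that only provides 128-bit registers.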
Consider using Function Multiversioning (CxxUtils/features.h) if you really need to target multiple ISAs efficiently.
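The lane-by-lane semantics given above for the selection, reduction, conversion and permutation helpers can be spelled out with plain loops. The sketch below is a reference model only: it assumes GCC or clang, uses a local stand-in for CxxUtils::vec, and the ref_* functions, elem_t and lanes are hypothetical names written for this example. The real helpers in vec.h may well be implemented differently, for instance with compiler built-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Local stand-in for CxxUtils::vec<T, N>; assumes GCC or clang.
template <typename T, std::size_t N>
struct vec_typedef {
  using type __attribute__((vector_size(N * sizeof(T)))) = T;
};
template <typename T, std::size_t N>
using vec = typename vec_typedef<T, N>::type;

// Hypothetical helpers: element type and lane count of a vector type.
template <typename VEC>
using elem_t = std::remove_cv_t<
    std::remove_reference_t<decltype(std::declval<const VEC&>()[0])>>;
template <typename VEC>
constexpr std::size_t lanes = sizeof(VEC) / sizeof(elem_t<VEC>);

// dst[i] = mask[i] ? a[i] : b[i]
template <typename VEC, typename MASK>
void ref_vselect(VEC& dst, const VEC& a, const VEC& b, const MASK& mask) {
  VEC tmp = {};
  for (std::size_t i = 0; i < lanes<VEC>; ++i) tmp[i] = mask[i] ? a[i] : b[i];
  dst = tmp;
}

// True if at least one lane of mask is non-zero.
template <typename MASK>
bool ref_vany(const MASK& mask) {
  for (std::size_t i = 0; i < lanes<MASK>; ++i)
    if (mask[i]) return true;
  return false;
}

// dst[i] = static_cast<element type of VEC1>(src[i])
template <typename VEC1, typename VEC2>
void ref_vconvert(VEC1& dst, const VEC2& src) {
  VEC1 tmp = {};
  for (std::size_t i = 0; i < lanes<VEC1>; ++i)
    tmp[i] = static_cast<elem_t<VEC1>>(src[i]);
  dst = tmp;
}

// dst[i] = src[Indices[i]]
template <std::size_t... Indices, typename VEC, typename VEC1>
void ref_vpermute(VEC1& dst, const VEC& src) {
  constexpr std::size_t idx[] = {Indices...};
  VEC1 tmp = {};
  for (std::size_t i = 0; i < sizeof...(Indices); ++i) tmp[i] = src[idx[i]];
  dst = tmp;
}

// Indices in [0, N) pick from src1; indices in [N, 2N) pick from src2.
template <std::size_t... Indices, typename VEC, typename VEC1>
void ref_vpermute2(VEC1& dst, const VEC& src1, const VEC& src2) {
  constexpr std::size_t idx[] = {Indices...};
  constexpr std::size_t N = lanes<VEC>;
  VEC1 tmp = {};
  for (std::size_t i = 0; i < sizeof...(Indices); ++i)
    tmp[i] = idx[i] < N ? src1[idx[i]] : src2[idx[i] - N];
  dst = tmp;
}

// Runs each reference helper once; returns true if all lanes match.
inline bool demo_helpers() {
  using vec4f = vec<float, 4>;
  using vec4i = vec<std::int32_t, 4>;
  using vec2f = vec<float, 2>;
  const vec4f a = {1, 8, 3, 6};
  const vec4f b = {5, 4, 7, 2};
  const auto lt = a < b;  // lanes: true, false, true, false

  vec4f lo = {};
  ref_vselect(lo, a, b, lt);  // picks the smaller lane, like vmin
  bool ok = lo[0] == 1 && lo[1] == 4 && lo[2] == 3 && lo[3] == 2;
  ok = ok && ref_vany(lt) && !ref_vany(a < a);

  const vec4f f = {1.7f, -2.2f, 3.0f, 4.9f};
  vec4i t = {};
  ref_vconvert(t, f);  // static_cast truncates toward zero
  ok = ok && t[0] == 1 && t[1] == -2 && t[2] == 3 && t[3] == 4;

  vec4f rev = {};
  ref_vpermute<3, 2, 1, 0>(rev, a);
  ok = ok && rev[0] == 6 && rev[1] == 3 && rev[2] == 8 && rev[3] == 1;

  vec2f pair = {};
  ref_vpermute2<0, 4>(pair, a, b);  // two indices, so a two-element result
  return ok && pair[0] == 1 && pair[1] == 5;
}
```

Each helper writes through a temporary so that dst may alias one of the inputs.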
Definition in file vec.h.