ATLAS Offline Software
vec.h File Reference

Vectorization helpers. More...

#include "CxxUtils/features.h"
#include "CxxUtils/inline_hints.h"
#include <cstdlib>
#include <cstring>
#include <type_traits>
#include "CxxUtils/vec_fb.h"

Classes

struct  CxxUtils::vecDetail::vec_typedef< T, N >
 Check the type and the size of the vector. More...
struct  CxxUtils::vecDetail::vec_type< VEC >
 Deduce the element type from a vectorized type. More...
struct  CxxUtils::vecDetail::vec_mask_type< VEC >
 Deduce the type of the mask returned by relational operations, for a vectorized type. More...

Namespaces

namespace  CxxUtils
namespace  CxxUtils::vecDetail
namespace  CxxUtils::vecDetail::bool_pack_helper
 Helper for static asserts for argument packs.

Macros

#define WANT_VECTOR_FALLBACK   0

Typedefs

template<bool... bs>
using CxxUtils::vecDetail::bool_pack_helper::all_true = std::is_same<bool_pack<bs..., true>, bool_pack<true, bs...>>
template<typename T, size_t N>
using CxxUtils::vec = typename vecDetail::vec_typedef<T,N>::type
 Define a nice alias for the vectorized type.
template<class VEC>
using CxxUtils::vec_type_t = typename vecDetail::vec_type<VEC>::type
 Define a nice alias for the element type of a vectorized type.
template<class VEC>
using CxxUtils::vec_mask_type_t = typename vecDetail::vec_mask_type<VEC>::type
 Define a nice alias for the mask type for a vectorized type.

Functions

template<class VEC>
ATH_ALWAYS_INLINE constexpr size_t CxxUtils::vec_size ()
 Return the number of elements in a vectorized type.
template<class VEC>
ATH_ALWAYS_INLINE constexpr size_t CxxUtils::vec_size (const VEC &)
 Return the number of elements in a vectorized type.
template<typename VEC, typename T>
ATH_ALWAYS_INLINE void CxxUtils::vbroadcast (VEC &v, T x)
 Copy a scalar to each element of a vectorized type.
template<typename VEC>
ATH_ALWAYS_INLINE void CxxUtils::vload (VEC &dst, vec_type_t< VEC > const *src)
template<typename VEC>
ATH_ALWAYS_INLINE void CxxUtils::vstore (vec_type_t< VEC > *dst, const VEC &src)
template<typename VEC>
ATH_ALWAYS_INLINE void CxxUtils::vselect (VEC &dst, const VEC &a, const VEC &b, const vec_mask_type_t< VEC > &mask)
template<typename VEC>
ATH_ALWAYS_INLINE void CxxUtils::vmin (VEC &dst, const VEC &a, const VEC &b)
template<typename VEC>
ATH_ALWAYS_INLINE void CxxUtils::vmax (VEC &dst, const VEC &a, const VEC &b)
template<typename VEC>
ATH_ALWAYS_INLINE bool CxxUtils::vany (const VEC &mask)
template<typename VEC>
ATH_ALWAYS_INLINE bool CxxUtils::vnone (const VEC &mask)
template<typename VEC>
ATH_ALWAYS_INLINE bool CxxUtils::vall (const VEC &mask)
template<typename VEC1, typename VEC2>
ATH_ALWAYS_INLINE void CxxUtils::vconvert (VEC1 &dst, const VEC2 &src)
 Fill dst with the result of a static_cast of each element of src.
template<size_t... Indices, typename VEC, typename VEC1>
ATH_ALWAYS_INLINE void CxxUtils::vpermute (VEC1 &dst, const VEC &src)
 Fill dst with the elements of src selected by Indices.
template<size_t... Indices, typename VEC, typename VEC1>
ATH_ALWAYS_INLINE void CxxUtils::vpermute2 (VEC1 &dst, const VEC &src1, const VEC &src2)
 Fill dst with the elements of src1 and src2 selected by Indices.

Detailed Description

Vectorization helpers.

Author
scott snyder snyder@bnl.gov
Christos Anastopoulos (helper methods)
Date
Mar, 2020

gcc and clang provide built-in types for writing vectorized code, using the vector_size attribute. This usually results in code that is much easier to read and more portable than one would get using intrinsics directly. However, it is still non-standard, and some operations are awkward to express.

This file provides some helpers for writing vectorized code in C++.

A vectorized type may be named as CxxUtils::vec<T, N>. Here T is the element type, which should be an elementary integer or floating-point type. N is the number of elements in the vector; it should be a power of 2. This will be either a built-in vector type, if the vector_size attribute is supported, or a fallback C++ class intended to be (mostly) functionally equivalent (see vec_fb.h).

The GCC, clang and fallback vector types support: ++, --, +, -, *, /, %, =, &, |, ^, ~, >>, <<, !, &&, ||, ==, !=, >, <, >=, <=, sizeof, and initialization from brace-enclosed lists.

Furthermore the GCC and clang vector types support the ternary operator.
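As a rough illustration of the built-in vector types and operators described above, the sketch below declares a four-float vector directly with the vector_size attribute instead of going through CxxUtils::vec<float, 4>, so that it compiles without the ATLAS headers. The names vec4f, scale_and_shift, lane_sum and operators_example are hypothetical and not part of vec.h; GCC or clang is assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CxxUtils::vec<float, 4>: four floats, 16 bytes.
// vector_size is a GCC/clang extension, not standard C++.
typedef float vec4f __attribute__((vector_size(4 * sizeof(float))));

// Element-wise multiply-add: the arithmetic operators act on every lane.
inline vec4f scale_and_shift(vec4f x, float scale, float shift)
{
  vec4f s = {scale, scale, scale, scale};  // brace-enclosed initialization
  vec4f o = {shift, shift, shift, shift};
  return x * s + o;
}

// Individual lanes can be read with ordinary subscripting.
inline float lane_sum(const vec4f& v)
{
  float total = 0;
  for (std::size_t i = 0; i < 4; ++i) total += v[i];
  return total;
}

// Usage sketch: {1,2,3,4} * 2 + 1 gives {3,5,7,9}.
inline void operators_example()
{
  vec4f x = {1.f, 2.f, 3.f, 4.f};
  vec4f y = scale_and_shift(x, 2.f, 1.f);
  assert(y[0] == 3.f && y[1] == 5.f && y[2] == 7.f && y[3] == 9.f);
}
```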

We also support some additional operations.

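Deducing useful types:

  • CxxUtils::vec_type_t<VEC> is the element type of VEC.
  • CxxUtils::vec_mask_type_t<VEC> is the mask type returned by relational operations on VEC.

Deducing the number of elements in a vectorized type:

  • CxxUtils::vec_size<VEC>() is the number of elements in VEC.
  • CxxUtils::vec_size (const VEC&) is the number of elements in VEC.

Initializing with a value:

  • CxxUtils::vbroadcast (VEC& v, T x) copies x to each element of v.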
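The type-deduction, size and broadcast helpers listed in the Typedefs and Functions sections (vec_type_t, vec_size, vbroadcast) could behave roughly like the following self-contained sketch. It is not the vec.h implementation: elem_type_t, lane_count and broadcast are hypothetical names, and deducing the element type from the subscript expression is an assumption that holds for GCC and clang built-in vector types.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for CxxUtils::vec<int, 4>.
typedef int vec4i __attribute__((vector_size(4 * sizeof(int))));

// Element type, deduced from what subscripting the vector yields.
template <class VEC>
using elem_type_t =
  std::remove_cv_t<std::remove_reference_t<decltype(std::declval<VEC&>()[0])>>;

// Number of lanes: total size divided by the size of one element.
template <class VEC>
constexpr std::size_t lane_count()
{
  return sizeof(VEC) / sizeof(elem_type_t<VEC>);
}

// Copy the scalar x into every lane of v.
template <class VEC, class T>
inline void broadcast(VEC& v, T x)
{
  for (std::size_t i = 0; i < lane_count<VEC>(); ++i)
    v[i] = static_cast<elem_type_t<VEC>>(x);
}

// Usage sketch: fill a four-lane vector with the value 7.
inline void broadcast_example()
{
  vec4i v = {0, 0, 0, 0};
  broadcast(v, 7);
  assert(v[0] == 7 && v[3] == 7);
}
```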

Load from/store to array:

  • CxxUtils::vload (VEC& dst, const vec_type_t<VEC>* src) loads elements from src to dst.
  • CxxUtils::vstore (vec_type_t<VEC>* dst, const VEC& src) stores elements from src to dst.

Basic Algorithms:

  • CxxUtils::vselect (VEC& dst, const VEC& a, const VEC& b, const vec_mask_type_t<VEC>& mask) copies elements from a or b to dst, depending on the value of mask: dst[i] = mask[i] ? a[i] : b[i]
  • CxxUtils::vmin (VEC& dst, const VEC& a, const VEC& b) sets dst[i] = min(a[i], b[i])
  • CxxUtils::vmax (VEC& dst, const VEC& a, const VEC& b) sets dst[i] = max(a[i], b[i])
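The load/store and select semantics listed above can be sketched without the ATLAS headers as follows. This is a lane-by-lane reference sketch, not the vec.h implementation: load4, store4, select4 and clamp_array4 are hypothetical names, the use of memcpy for loads and stores is an assumption, and the mask is taken as whatever type the comparison operator returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for CxxUtils::vec<float, 4>.
typedef float vec4f __attribute__((vector_size(4 * sizeof(float))));

// Load four floats from an array into a vector, and store them back.
inline void load4(vec4f& dst, const float* src) { std::memcpy(&dst, src, sizeof(vec4f)); }
inline void store4(float* dst, const vec4f& src) { std::memcpy(dst, &src, sizeof(vec4f)); }

// dst[i] = mask[i] ? a[i] : b[i], written out lane by lane.
template <class MASK>
inline void select4(vec4f& dst, const vec4f& a, const vec4f& b, const MASK& mask)
{
  for (std::size_t i = 0; i < 4; ++i) {
    if (mask[i]) dst[i] = a[i];
    else         dst[i] = b[i];
  }
}

// Clamp four array elements to [lo, hi]; the two selects play the role
// of an element-wise max followed by an element-wise min.
inline void clamp_array4(float* data, float lo, float hi)
{
  vec4f v;
  vec4f vlo = {lo, lo, lo, lo};
  vec4f vhi = {hi, hi, hi, hi};
  load4(v, data);
  select4(v, vlo, v, v < vlo);  // lanes below lo become lo
  select4(v, vhi, v, v > vhi);  // lanes above hi become hi
  store4(data, v);
}

// Usage sketch.
inline void clamp_example()
{
  float data[4] = {-1.f, 0.5f, 2.f, 0.25f};
  clamp_array4(data, 0.f, 1.f);
  assert(data[0] == 0.f && data[2] == 1.f);
}
```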

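Bool reductions:

  • CxxUtils::vany (const VEC& mask) returns true if any element of mask is true.
  • CxxUtils::vnone (const VEC& mask) returns true if no element of mask is true.
  • CxxUtils::vall (const VEC& mask) returns true if all elements of mask are true.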
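The reductions vany, vnone and vall declared in the Functions list collapse a mask vector into a single bool. The following lane-by-lane sketch shows one plausible reading of their semantics, treating a nonzero lane as true; any_lane, all_lanes, no_lane and exceeds are hypothetical names and this is not the vec.h implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CxxUtils::vec<int, 4>.
typedef int vec4i __attribute__((vector_size(4 * sizeof(int))));

// True if at least one lane of the mask is nonzero.
template <class MASK>
inline bool any_lane(const MASK& mask)
{
  const std::size_t n = sizeof(MASK) / sizeof(mask[0]);
  for (std::size_t i = 0; i < n; ++i)
    if (mask[i]) return true;
  return false;
}

// True if every lane of the mask is nonzero.
template <class MASK>
inline bool all_lanes(const MASK& mask)
{
  const std::size_t n = sizeof(MASK) / sizeof(mask[0]);
  for (std::size_t i = 0; i < n; ++i)
    if (!mask[i]) return false;
  return true;
}

// True if no lane of the mask is nonzero.
template <class MASK>
inline bool no_lane(const MASK& mask) { return !any_lane(mask); }

// Usage sketch: does any element of v exceed limit?
inline bool exceeds(const vec4i& v, int limit)
{
  vec4i lim = {limit, limit, limit, limit};
  return any_lane(v > lim);  // the comparison yields a per-lane mask
}

inline void reductions_example()
{
  vec4i v = {1, 2, 3, 4};
  assert(exceeds(v, 3));
  assert(!exceeds(v, 4));
}
```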

Conversions/Casting:

  • CxxUtils::vconvert (VEC1& dst, const VEC2& src) fills dst with the result of a static_cast of every element of src to the element type of dst: dst[i] = static_cast<vec_type_t<VEC1>>(src[i])
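The element-wise static_cast described above can be written out lane by lane, as in this minimal sketch. The names to_float4 and to_int4 are hypothetical and the loop is only a reference for the semantics; recent GCC and clang versions also provide __builtin_convertvector for this purpose, which the sketch deliberately avoids so that it builds on older compilers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for CxxUtils::vec<int, 4> and CxxUtils::vec<float, 4>.
typedef int   vec4i __attribute__((vector_size(4 * sizeof(int))));
typedef float vec4f __attribute__((vector_size(4 * sizeof(float))));

// dst[i] = static_cast<float>(src[i])
inline void to_float4(vec4f& dst, const vec4i& src)
{
  for (std::size_t i = 0; i < 4; ++i) dst[i] = static_cast<float>(src[i]);
}

// dst[i] = static_cast<int>(src[i]); as with a scalar cast, this truncates toward zero.
inline void to_int4(vec4i& dst, const vec4f& src)
{
  for (std::size_t i = 0; i < 4; ++i) dst[i] = static_cast<int>(src[i]);
}

// Usage sketch.
inline void convert_example()
{
  vec4f f = {2.9f, -2.9f, 0.f, 40.f};
  vec4i i = {0, 0, 0, 0};
  to_int4(i, f);
  assert(i[0] == 2 && i[1] == -2 && i[3] == 40);
}
```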

Permutations:

The destination is a vector with the same element type as the source vector(s), but with an element count equal to the number of indices specified.

  • CxxUtils::vpermute<mask> (VEC1& dst, const VEC& src) fills dst with a permutation of src according to mask. mask is a list of integers that specifies the elements to extract from src: dst[i] = src[mask[i]], where mask[i] is the ith integer in the mask.
  • CxxUtils::vpermute2<mask> (VEC1& dst, const VEC& src1, const VEC& src2) fills dst with a permutation of src1 and src2 according to mask. mask is a list of integers that specifies the elements to extract from src1 and src2, where N is the number of elements in each source vector. An index i in the interval [0,N) places element i of src1 in the corresponding position of dst. An index i in the interval [N,2N) places element i-N of src2 in the corresponding position of dst.
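The index rules for vpermute and vpermute2 can be checked against the following lane-by-lane sketch. It mirrors the template parameter order shown in the Functions list but is not the vec.h implementation: permute and permute2 are hypothetical names, and the sources are copied first so that dst may alias a source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CxxUtils::vec<int, 4>.
typedef int vec4i __attribute__((vector_size(4 * sizeof(int))));

// dst[i] = src[Indices[i]]
template <std::size_t... Indices, class VEC, class VEC1>
inline void permute(VEC1& dst, const VEC& src)
{
  const std::size_t idx[] = {Indices...};
  const VEC s = src;  // copy, in case dst and src are the same object
  for (std::size_t i = 0; i < sizeof...(Indices); ++i) dst[i] = s[idx[i]];
}

// Indices in [0,N) select from src1, indices in [N,2N) select from src2.
template <std::size_t... Indices, class VEC, class VEC1>
inline void permute2(VEC1& dst, const VEC& src1, const VEC& src2)
{
  const std::size_t idx[] = {Indices...};
  const VEC s1 = src1;
  const VEC s2 = src2;
  const std::size_t N = sizeof(VEC) / sizeof(s1[0]);
  for (std::size_t i = 0; i < sizeof...(Indices); ++i) {
    if (idx[i] < N) dst[i] = s1[idx[i]];
    else            dst[i] = s2[idx[i] - N];
  }
}

// Usage sketch: reverse one vector, then interleave two.
inline void permute_example()
{
  vec4i a = {10, 20, 30, 40};
  vec4i b = {50, 60, 70, 80};
  vec4i r = {0, 0, 0, 0};
  permute<3, 2, 1, 0>(r, a);       // r = {40, 30, 20, 10}
  assert(r[0] == 40 && r[3] == 10);
  permute2<0, 4, 1, 5>(r, a, b);   // r = {10, 50, 20, 60}
  assert(r[1] == 50 && r[3] == 60);
}
```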

For good performance, use vector types that match the register width of the target ISA, e.g. 128 bits for SSE or 256 bits for AVX.

Specifying a combination that is not valid for the current architecture causes the compiler to synthesize the instructions using a narrower mode, which might not produce optimal code for all operations.

Consider using Function Multiversioning (CxxUtils/features.h) if you really need to target multiple ISAs efficiently.
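To make the width guidance concrete, the element count N that fills a register is the register width in bits divided by the element size in bits. The helper lanes_for below is hypothetical (not part of vec.h) and assumes 8-bit bytes, a 4-byte float and an 8-byte double.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements of type T that fit in a register `bits` wide.
template <class T>
constexpr std::size_t lanes_for(std::size_t bits)
{
  return bits / (8 * sizeof(T));
}

// 128-bit SSE registers hold 4 floats or 2 doubles.
static_assert(lanes_for<float>(128) == 4, "SSE: 4 floats");
static_assert(lanes_for<double>(128) == 2, "SSE: 2 doubles");

// 256-bit AVX registers hold 8 floats or 4 doubles.
static_assert(lanes_for<float>(256) == 8, "AVX: 8 floats");
static_assert(lanes_for<double>(256) == 4, "AVX: 4 doubles");

// A 128-bit vector of floats declared with the matching lane count;
// with the ATLAS headers this would correspond to CxxUtils::vec<float, 4>.
typedef float sse_floats __attribute__((vector_size(lanes_for<float>(128) * sizeof(float))));
static_assert(sizeof(sse_floats) == 16, "128 bits");
```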

Definition in file vec.h.

Macro Definition Documentation

WANT_VECTOR_FALLBACK

#define WANT_VECTOR_FALLBACK   0

Definition at line 143 of file vec.h.