Pack/unpack floating-point data from/to a given number of bits.

#include <FloatPacker.h>

Public Types
typedef uint32_t	Packdest
	Type into which we pack.

Public Member Functions
	FloatPacker (int nbits, int nmantissa, double scale=1, bool is_signed=true, bool round=false)
	Constructor.

Packdest	pack (double src, std::string *err=nullptr) const
	Pack a value.

double	unpack (Packdest val, std::string *err=nullptr) const
	Unpack the value `VAL`.

Private Attributes
int	m_nmantissa
	Number of bits in the mantissa + sign bit.

double	m_scale
	Scale factor for stored numbers.

double	m_invscale
	Inverse of scale.

bool	m_is_signed
	Should we use a sign bit?

bool	m_round
	Should we round instead of truncating?

int	m_npack
	Number of bits in mantissa (exclusive of any sign bit).

Packdest	m_npack_ones
	Mask with that many low bits set.

Packdest	m_signmask
	Mask containing the sign bit (or 0 if there's no sign bit).

int	m_nexp
	Number of exponent bits.

Packdest	m_nexp_ones
	Mask with that many low bits set.

int	m_min_exp
	Minimum exponent value.

int	m_max_exp
	Maximum exponent value.

Detailed Description

Pack/unpack floating-point data from/to a given number of bits.

The format is specified by the following parameters.

nbits - The total number of bits in the representation.
scale - Scale factor to apply before storing.
nmantissa - The number of bits to use for the mantissa and sign bit.
is_signed - Flag to tell if we should use a sign bit.
round - Flag to tell if we should round or truncate.

From these we derive:

npack = nmantissa, if is_signed is false.
      = nmantissa-1, if is_signed is true.
nexp = nbits - nmantissa

The format consists of, in order from high bits to low bits:

A sign bit, if is_signed is true.
nexp bits of exponent information.
npack bits of mantissa.

The number is stored in normalized form, with an exponent bias of 2^(nexp-1). But if the (biased) exponent is zero, then the mantissa is stored in denormalized form. If nexp==0, this gives a fixed-point representation in the range [0,1). 0 is represented by all bits 0; if we have a sign bit, we can also represent -0 by all bits 0 except for the sign bit.
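
The derived quantities above can be sketched in a few lines. This is a minimal illustration, not the class's code: `Layout`, `low_ones` and `derive_layout` are hypothetical names, and the bias is taken directly from the 2^(nexp-1) statement above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the derived format parameters.
struct Layout {
  int npack;                 // mantissa bits, excluding any sign bit
  int nexp;                  // exponent bits
  std::uint32_t signmask;    // sign bit mask, or 0 if unsigned
  std::uint32_t npack_ones;  // mask with the low npack bits set
  int bias;                  // exponent bias, 2^(nexp-1); 0 if nexp == 0
};

// Mask with the low n bits set, for n in [0, 32].
static std::uint32_t low_ones(int n)
{
  return n >= 32 ? ~std::uint32_t{0} : ((std::uint32_t{1} << n) - 1);
}

// Derive the layout from the user-supplied parameters.
// Assumes 1 <= nbits <= 32 and that nmantissa leaves npack >= 1.
static Layout derive_layout(int nbits, int nmantissa, bool is_signed)
{
  Layout l{};
  l.npack = is_signed ? nmantissa - 1 : nmantissa;
  l.nexp = nbits - nmantissa;
  l.signmask = is_signed ? (std::uint32_t{1} << (nbits - 1)) : 0;
  l.npack_ones = low_ones(l.npack);
  l.bias = l.nexp > 0 ? (1 << (l.nexp - 1)) : 0;
  return l;
}
```

For example, `derive_layout(16, 11, true)` describes a 16-bit format with one sign bit, 5 exponent bits and 10 mantissa bits.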

Definition at line 57 of file FloatPacker.h.

Member Typedef Documentation

◆ Packdest

typedef uint32_t CxxUtils::FloatPacker::Packdest

Type into which we pack.

Definition at line 61 of file FloatPacker.h.

Constructor & Destructor Documentation

◆ FloatPacker()

CxxUtils::FloatPacker::FloatPacker	(	int	nbits,
		int	nmantissa,
		double	scale = `1`,
		bool	is_signed = `true`,
		bool	round = `false`
	)

Constructor.

Parameters

nbits	The number of bits in the packed representation.
nmantissa	The number of bits to use for the mantissa and sign bit.
scale	Divide the input number by this before packing.
is_signed	If true, then one mantissa bit is used for a sign.
round	If true, numbers will be rounded. Otherwise, they will be truncated.
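
To illustrate what the `round` flag changes, here is a simplified sketch of truncating versus rounding a mantissa fraction down to `npack` bits. It is not the code used by `pack()`: the helper names are hypothetical, the fraction is assumed to be left-aligned in 32 bits with the implicit leading 1 not stored, and rounding is plain round-half-up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: kept fraction bits plus the (possibly bumped) exponent.
struct Rounded {
  std::uint32_t frac;
  int exponent;
};

// Truncation: simply discard everything below the lowest kept bit.
// Assumes 1 <= npack <= 31.
static std::uint32_t truncate_fraction(std::uint32_t frac, int npack)
{
  const std::uint32_t lsb = std::uint32_t{1} << (32 - npack);
  return frac & ~(lsb - 1);
}

// Rounding: if the first discarded bit is set, add one unit in the last
// kept place. A carry out of the fraction means 1.111... became 10.000...,
// so the fraction wraps to zero and the exponent goes up by one.
static Rounded round_fraction(std::uint32_t frac, int exponent, int npack)
{
  const std::uint32_t lsb = std::uint32_t{1} << (32 - npack);
  const std::uint32_t roundbit = lsb >> 1;
  if ((frac & roundbit) != 0) {
    const std::uint32_t sum = frac + lsb;  // unsigned wrap-around is well defined
    if (sum < frac)
      exponent += 1;
    frac = sum;
  }
  return {frac & ~(lsb - 1), exponent};
}
```

With `npack` = 8, the fraction `0x12800000` truncates to `0x12000000` but rounds to `0x13000000`; the all-ones fraction `0xFFFFFFFF` rounds to `0` with the exponent increased by one.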

Definition at line 212 of file FloatPacker.cxx.

   : m_nmantissa (nmantissa),
     m_scale (scale),
     m_is_signed (is_signed),
     m_round (round)
 {
   // scale==0 means not to scale.
   // Use that instead of 1 since it's faster to test for 0.
   if (scale == 1 || scale == 0)
     m_invscale = 0;
   else {
     // Avoid spurious div-zero FPEs with clang.
     CXXUTILS_TRAPPING_FP;
     m_invscale = 1. / m_scale;
   }
  
   // Set up other cached values.
   m_npack = m_nmantissa;
   if (m_is_signed)
     --m_npack;
  
   m_npack_ones = ones<Packdest> (m_npack);
  
   // Sign bit mask.
   if (m_is_signed)
     m_signmask = 1U << (nbits - 1);
   else
     m_signmask = 0;
  
   // Number of exponent bits.
   m_nexp = nbits - m_nmantissa;
   m_nexp_ones = ones<Packdest> (m_nexp);
  
   // Minimum exponent value.
   m_min_exp = min_int (m_nexp);
  
   // Maximum exponent value.
   m_max_exp = max_int (m_nexp);
  
   if (m_npack < 1 || m_npack > nbits)
     throw std::runtime_error ("Bad number of mantissa bits.");
 }

Member Function Documentation

◆ pack()

FloatPacker::Packdest CxxUtils::FloatPacker::pack	(	double	src,
		std::string *	err = `nullptr`
	)		const

Pack a value.

Parameters

src	Value to pack.
err	If non-null, then this string will be set to a description of any error that occurs.

Returns: The packed value.

For now, we convert floats to doubles before packing.

Definition at line 270 of file FloatPacker.cxx.

 {
   double_or_int d;
   d.d.d = src;
  
   // Fast-path for zero.  (Purely an optimization.)
   // Note: can't use a double compare here.  On some architectures (eg, MIPS)
   // a denormal will compare equal to zero.
   if (d.i[0] == 0 && d.i[1] == 0)
     return 0;
  
   // Check for NaN and infinity.
   if (d.d.ieee.exponent == ieee754_double_exponent_mask) {
     if (err) {
       std::ostringstream os;
       os << "Bad float number: " << src << " (" << std::setbase(16) << d.i[0]
          << " " << d.i[1] << ")";
       *err = os.str();
     }
     d.d.d = 0;
   }
  
   if (m_invscale)
     d.d.d *= m_invscale;
  
   bool was_negative = false;
   if (d.d.ieee.negative != 0) {
     if (m_is_signed) {
       was_negative = true;
       d.d.d = -d.d.d;
     }
     else {
       // Don't complain on -0.
       if (d.d.d < 0 && err) {
         std::ostringstream os;
         os << "Float overflow during packing: " << src;
         *err = os.str();
       }
       d.d.d = 0;
     }
   }
  
   // Check for zero again.
   // (Also need to preserve the sign; the scale division may
   // have underflowed.)
   if (d.i[0] == 0 && d.i[1] == 0) {
     if (was_negative)
       return m_signmask;
     else
       return 0;
   }
  
   // Get packdest_bits bits of mantissa.
  
   Packdest mantissa =
     (d.d.ieee.mantissa0 << (packdest_bits -
                           ieee754_double_mantissa0_bits)) |
     (d.d.ieee.mantissa1 >>
      (ieee754_double_mantissa1_bits -
       (packdest_bits - ieee754_double_mantissa0_bits)));
  
   // Get the unbiased exponent.
   int exponent =
     static_cast<int> (d.d.ieee.exponent) - ieee754_double_bias;
  
   // Do rounding, if requested.
   if (m_round) {
     Packdest lsbmask = (1 << (packdest_bits - m_npack));
     int roundbit;
     Packdest roundmask;
     if (lsbmask > 1) {
       roundbit = (mantissa & (lsbmask >> 1));
       roundmask = ~ static_cast<Packdest> (roundbit - 1);
     }
     else {
       roundbit = (d.d.ieee.mantissa1 &
                   ((1 << ((ieee754_double_mantissa1_bits -
                            (packdest_bits -
                             ieee754_double_mantissa0_bits)) - 1))));
       roundmask = ~ static_cast<Packdest> (0);
     }
  
     if (roundbit != 0) {
       // Handle the case where it would overflow.
       if ((mantissa & roundmask) == roundmask) {
         mantissa >>= 1;
         mantissa |= roundmask;
         exponent += 1;
       }
  
       mantissa += lsbmask;
     }
   }
  
   // If the number is too large, complain, and reset to the largest number.
   if (exponent > m_max_exp) {
     if (err) {
       std::ostringstream os;
       os << "Float overflow during packing: " << src;
       *err = os.str();
     }
     exponent = m_max_exp;
     mantissa = static_cast<Packdest> (~0);
   }
  
   // Handle denormals.  (We've already handled the zero case.)
   if (exponent == - ieee754_double_bias)
     renormalize_denormal (exponent, mantissa);
  
   // If the number is too small, denormalize, or underflow to 0.
   underflow_to_denormal (m_min_exp, m_round ? m_npack: 0, exponent, mantissa);
  
   // Pack in the mantissa bits.
   Packdest dest = mantissa >> (packdest_bits - m_npack);
  
   // The exponent, if desired.
   if (m_nexp > 0)
     dest |= ((exponent - m_min_exp) << m_npack);
  
   // And the optional sign bit.
   if (was_negative)
     dest |= m_signmask;
  
   return dest;
 }
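
The zero fast-path in the listing above tests the raw bits of the double rather than comparing against 0.0. A minimal standalone sketch of the same idea follows; it assumes a 64-bit IEEE 754 `double` and uses `std::memcpy` instead of the union in the listing. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// True only for +0.0, i.e. when every bit of the representation is zero.
// Unlike `x == 0`, this does not match -0.0, and it cannot be fooled by
// hardware that treats denormals as zero in floating-point comparisons.
static bool is_all_zero_bits(double x)
{
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return bits == 0;
}
```

This is why `pack()` can return 0 immediately for +0.0 while still letting -0.0 and denormals continue through the sign and exponent handling.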

◆ unpack()

double CxxUtils::FloatPacker::unpack	(	Packdest	val,
		std::string *	err = `nullptr`
	)		const

Unpack the value VAL.

Parameters

val	The packed data. It should start with the low bit, and any extraneous bits should have been masked off.
err	If non-null, then this string will be set to a description of any error that occurs.

Definition at line 404 of file FloatPacker.cxx.

 {
   // Fast-path for 0.
   if (val == 0)
     return 0;
  
   // Break apart the packed value.
   bool was_negative = false;
   if ((val & m_signmask) != 0)
     was_negative = true;
  
   double d;
  
   // Fast path for fixed-point representations.
   if (m_nexp == 0) {
     Packdest mantissa = (val & m_npack_ones);
     d = mantissa / ((double)m_npack_ones + 1);
     if (was_negative)
       d *= -1;
   }
   else {
     // Get the mantissa.
     Packdest mantissa = (val & m_npack_ones) << (packdest_bits - m_npack);
  
     // General case.
     // Get the exponent.
     int exponent = ((val >> m_npack) & m_nexp_ones);
     exponent += m_min_exp; // unbias.
  
     ieee754_double dd;
  
     // Handle denormals.
     if (exponent == m_min_exp) {
       // Maybe it was -0?
       if (mantissa == 0) {
         dd.d = 0;
         if (was_negative)
           dd.ieee.negative = 1;
         return dd.d;
       }
  
       renormalize_denormal (exponent, mantissa);
     }
  
     // Complain about overflow.
     if (exponent >= max_int (ieee754_double_exponent_bits)) {
       if (err) {
         std::ostringstream os;
         os << "Overflow while unpacking float; exponent: " << exponent;
         *err = os.str();
       }
       exponent = max_int (ieee754_double_exponent_bits) + 1;
       mantissa = 0; // Infinity.
     }
  
     // Underflow into denormal.
     underflow_to_denormal ( - ieee754_double_bias, 0,
                             exponent, mantissa);
  
     // Pack into a double.
     dd.ieee.negative = was_negative ? 1 : 0;
     dd.ieee.exponent = exponent + ieee754_double_bias;
     dd.ieee.mantissa0 =
       (mantissa >> (packdest_bits - ieee754_double_mantissa0_bits));
     dd.ieee.mantissa1 =
       (mantissa << (ieee754_double_mantissa0_bits -
                     (packdest_bits - ieee754_double_mantissa1_bits)));
     d = dd.d;
   }
  
   // Set the result.
   if (m_scale)
     d *= m_scale;
   return d;
 }
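
The fixed-point branch (`m_nexp == 0`) of the listing above can be isolated as a small standalone function. This is a sketch under stated assumptions, not the class's code: `unpack_fixed` is a hypothetical name, the format parameters are passed explicitly, and the scale is always applied.

```cpp
#include <cassert>
#include <cstdint>

// Decode a fixed-point value: the low npack bits are a fraction in [0, 1),
// optionally negated by the sign bit, then multiplied by the scale.
// Assumes 1 <= npack <= 31; pass signmask == 0 for an unsigned format.
static double unpack_fixed(std::uint32_t val, int npack,
                           std::uint32_t signmask, double scale)
{
  const std::uint32_t npack_ones = (std::uint32_t{1} << npack) - 1;
  double d = (val & npack_ones) / (static_cast<double>(npack_ones) + 1);
  if ((val & signmask) != 0)
    d = -d;
  return d * scale;
}
```

With 8 mantissa bits and a sign bit at `0x100`, the packed value `0x80` decodes to 0.5 and `0x180` to -0.5; a scale of 10 turns `0x40` (0.25) into 2.5.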

Member Data Documentation

◆ m_invscale

double CxxUtils::FloatPacker::m_invscale

private

Inverse of scale.

Definition at line 111 of file FloatPacker.h.

◆ m_is_signed

bool CxxUtils::FloatPacker::m_is_signed

private

Should we use a sign bit?

Definition at line 114 of file FloatPacker.h.

◆ m_max_exp

int CxxUtils::FloatPacker::m_max_exp

private

Maximum exponent value.

Definition at line 138 of file FloatPacker.h.

◆ m_min_exp

int CxxUtils::FloatPacker::m_min_exp

private

Minimum exponent value.

Definition at line 135 of file FloatPacker.h.

◆ m_nexp

int CxxUtils::FloatPacker::m_nexp

private

Number of exponent bits.

Definition at line 129 of file FloatPacker.h.

◆ m_nexp_ones

Packdest CxxUtils::FloatPacker::m_nexp_ones

private

Mask with the low m_nexp bits set.

Definition at line 132 of file FloatPacker.h.

◆ m_nmantissa

int CxxUtils::FloatPacker::m_nmantissa

private

Number of bits in the mantissa + sign bit.

Definition at line 105 of file FloatPacker.h.

◆ m_npack

int CxxUtils::FloatPacker::m_npack

private

Number of bits in mantissa (exclusive of any sign bit).

Definition at line 120 of file FloatPacker.h.

◆ m_npack_ones

Packdest CxxUtils::FloatPacker::m_npack_ones

private

Mask with the low m_npack bits set.

Definition at line 123 of file FloatPacker.h.

◆ m_round

bool CxxUtils::FloatPacker::m_round

private

Should we round instead of truncating?

Definition at line 117 of file FloatPacker.h.

◆ m_scale

double CxxUtils::FloatPacker::m_scale

private

Scale factor for stored numbers.

Definition at line 108 of file FloatPacker.h.

◆ m_signmask

Packdest CxxUtils::FloatPacker::m_signmask

private

Mask containing the sign bit (or 0 if there's no sign bit).

Definition at line 126 of file FloatPacker.h.

The documentation for this class was generated from the following files:
