ATLAS Offline Software
CxxUtils::FloatPacker Class Reference

Pack/unpack floating-point data from/to a given number of bits.

#include <FloatPacker.h>

Public Types

typedef uint32_t Packdest
 Type into which we pack.

Public Member Functions

 FloatPacker (int nbits, int nmantissa, double scale=1, bool is_signed=true, bool round=false)
 Constructor.
Packdest pack (double src, std::string *err=nullptr) const
 Pack a value.
double unpack (Packdest val, std::string *err=nullptr) const
 Unpack the value VAL.

Private Attributes

int m_nmantissa
 Number of bits in the mantissa + sign bit.
double m_scale
 Scale factor for stored numbers.
double m_invscale
 Inverse of scale.
bool m_is_signed
 Should we use a sign bit?
bool m_round
 Should we round instead of truncating?
int m_npack
 Number of bits in mantissa (exclusive of any sign bit).
Packdest m_npack_ones
 Mask with that many low bits set.
Packdest m_signmask
 Mask containing the sign bit (or 0 if there's no sign bit).
int m_nexp
 Number of exponent bits.
Packdest m_nexp_ones
 Mask with that many low bits set.
int m_min_exp
 Minimum exponent value.
int m_max_exp
 Maximum exponent value.

Detailed Description

Pack/unpack floating-point data from/to a given number of bits.

The format is specified by the following parameters.

  • nbits - The total number of bits in the representation.
  • scale - Scale factor to apply before storing.
  • nmantissa - The number of bits to use for the mantissa and sign bit.
  • is_signed - Flag to tell if we should use a sign bit.
  • round - Flag to tell if we should round or truncate.

From these we derive:

  • npack = nmantissa if is_signed is false; nmantissa-1 if is_signed is true.
  • nexp = nbits - nmantissa

The format consists of, in order from high bits to low bits:

  • A sign bit, if is_signed is true.
  • nexp bits of exponent information.
  • npack bits of mantissa.

The number is stored in normalized form, with an exponent bias of 2^(nexp-1). But if the (biased) exponent is zero, then the mantissa is stored in denormalized form. If nexp==0, this gives a fixed-point representation in the range [0,1). 0 is represented by all bits 0; if we have a sign bit, we can also represent -0 by all bits 0 except for the sign bit.
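
The layout described above can be sketched as follows. This is only an illustration of the derived quantities and the high-to-low field order; the names `lowOnes`, `Layout`, `deriveLayout`, `Fields` and `splitFields` are hypothetical and are not part of FloatPacker.

```cpp
#include <cassert>
#include <cstdint>

using Packdest = std::uint32_t;

// Mask with the low n bits set, for n in [0, 32].
constexpr Packdest lowOnes(int n) {
  return n >= 32 ? ~Packdest{0} : ((Packdest{1} << n) - 1);
}

// Hypothetical container for the quantities derived from the format parameters.
struct Layout {
  int npack;            // mantissa bits, excluding any sign bit
  int nexp;             // exponent bits
  Packdest signmask;    // mask of the sign bit, or 0 if unsigned
  Packdest npack_ones;  // mask of the mantissa field
  Packdest nexp_ones;   // mask of the exponent field, once shifted down
};

inline Layout deriveLayout(int nbits, int nmantissa, bool is_signed) {
  assert(nbits >= 1 && nbits <= 32);
  const int npack = is_signed ? nmantissa - 1 : nmantissa;
  const int nexp = nbits - nmantissa;
  assert(npack >= 1 && npack <= nbits && nexp >= 0);
  const Packdest signmask = is_signed ? (Packdest{1} << (nbits - 1)) : Packdest{0};
  return Layout{npack, nexp, signmask, lowOnes(npack), lowOnes(nexp)};
}

// The three fields of a packed word, from high bits to low bits.
struct Fields {
  bool negative;
  Packdest exponent;  // still biased
  Packdest mantissa;
};

inline Fields splitFields(Packdest val, const Layout& l) {
  return Fields{(val & l.signmask) != 0,
                (val >> l.npack) & l.nexp_ones,
                val & l.npack_ones};
}
```

For instance, nbits=16 and nmantissa=12 with a sign bit give one sign bit, four exponent bits and eleven mantissa bits.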

Definition at line 57 of file FloatPacker.h.

Member Typedef Documentation

◆ Packdest

Type into which we pack.

Definition at line 61 of file FloatPacker.h.

Constructor & Destructor Documentation

◆ FloatPacker()

CxxUtils::FloatPacker::FloatPacker ( int nbits,
int nmantissa,
double scale = 1,
bool is_signed = true,
bool round = false )

Constructor.

Parameters
nbits - The number of bits in the packed representation.
nmantissa - The number of bits to use for the mantissa and sign bit.
scale - Divide the input number by this before packing.
is_signed - If true, then one mantissa bit is used for a sign.
round - If true, numbers will be rounded. Otherwise, they will be truncated.
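
The handling of the scale parameter can be sketched in isolation. The sketch assumes, as the constructor, pack() and unpack() listings on this page suggest, that a scale of 1 or 0 means "no scaling" and that a zero inverse is used as the "skip" marker. The names `Scaling`, `makeScaling`, `beforePacking` and `afterUnpacking` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical pair mirroring m_scale and m_invscale.
struct Scaling {
  double scale;
  double invscale;  // 0 means "do not scale when packing"
};

inline Scaling makeScaling(double scale) {
  assert(std::isfinite(scale));
  Scaling s{scale, 0.0};
  // Only compute the inverse when scaling is actually needed.
  if (scale != 1 && scale != 0)
    s.invscale = 1.0 / scale;
  return s;
}

// Applied to the input before packing: divide by scale via the cached inverse.
inline double beforePacking(double src, const Scaling& s) {
  return s.invscale != 0 ? src * s.invscale : src;
}

// Applied to the result after unpacking: multiply by scale unless it is 0.
inline double afterUnpacking(double d, const Scaling& s) {
  return s.scale != 0 ? d * s.scale : d;
}
```

With a scale of 4, an input of 8 is packed as 2 and unpacked back to 8.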

Definition at line 212 of file FloatPacker.cxx.

FloatPacker::FloatPacker (int nbits,
                          int nmantissa,
                          double scale /*= 1*/,
                          bool is_signed /*= true*/,
                          bool round /*= false*/)
  : m_nmantissa (nmantissa),
    m_scale (scale),
    m_is_signed (is_signed),
    m_round (round)
{
  // scale==0 means not to scale.
  // Use that instead of 1 since it's faster to test for 0.
  if (scale == 1 || scale == 0)
    m_invscale = 0;
  else {
    // Avoid spurious div-zero FPEs with clang.
    CXXUTILS_TRAPPING_FP;
    m_invscale = 1. / m_scale;
  }

  // Set up other cached values.
  m_npack = m_nmantissa;
  if (m_is_signed)
    --m_npack;

  m_npack_ones = ones<Packdest> (m_npack);

  // Sign bit mask.
  if (m_is_signed)
    m_signmask = 1U << (nbits - 1);
  else
    m_signmask = 0;

  // Number of exponent bits.
  m_nexp = nbits - m_nmantissa;
  m_nexp_ones = ones<Packdest> (m_nexp);

  // Minimum exponent value.
  m_min_exp = min_int (m_nexp);

  // Maximum exponent value.
  m_max_exp = max_int (m_nexp);

  if (m_npack < 1 || m_npack > nbits)
    throw std::runtime_error ("Bad number of mantissa bits.");
}
References:

  • constexpr T ones(unsigned int n) - Return a bit mask with the lower n bits set. Definition at ones.h:25.
  • #define CXXUTILS_TRAPPING_FP - Definition at trapping_fp.h:24.
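
The exponent range set up by the constructor can be illustrated with a small sketch. The helpers min_int and max_int are not shown on this page; the sketch assumes they return the limits of a signed integer of the given width, which is consistent with the exponent bias of 2^(nexp-1) stated in the detailed description. The names `minExp`, `maxExp`, `storedExponent` and `unbiasedExponent` are hypothetical, and only nexp >= 1 is covered.

```cpp
#include <cassert>

// Assumed behaviour of min_int: smallest value of a signed nexp-bit integer.
inline int minExp(int nexp) {
  assert(nexp >= 1 && nexp <= 30);
  return -(1 << (nexp - 1));
}

// Assumed behaviour of max_int: largest value of a signed nexp-bit integer.
inline int maxExp(int nexp) {
  assert(nexp >= 1 && nexp <= 30);
  return (1 << (nexp - 1)) - 1;
}

// Biasing as done when packing: subtract the minimum exponent.
inline unsigned storedExponent(int exponent, int nexp) {
  assert(exponent >= minExp(nexp) && exponent <= maxExp(nexp));
  return static_cast<unsigned>(exponent - minExp(nexp));
}

// Unbiasing as done when unpacking: add the minimum exponent back.
inline int unbiasedExponent(unsigned stored, int nexp) {
  return static_cast<int>(stored) + minExp(nexp);
}
```

With four exponent bits, exponents from -8 to 7 map to stored values 0 to 15, and an exponent of 0 is stored as 8, which is the bias.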

Member Function Documentation

◆ pack()

FloatPacker::Packdest CxxUtils::FloatPacker::pack ( double src,
std::string * err = nullptr ) const

Pack a value.

Parameters
src - Value to pack.
err - If non-null, then this string will be set to a description of any error that occurs.
Returns
The packed value.

For now, we convert floats to doubles before packing.

Definition at line 270 of file FloatPacker.cxx.

FloatPacker::Packdest
FloatPacker::pack (double src, std::string* err /*= nullptr*/) const
{
  double_or_int d;
  d.d.d = src;

  // Fast-path for zero.  (Purely an optimization.)
  // Note: can't use a double compare here.  On some architectures (e.g., MIPS)
  // a denormal will compare equal to zero.
  if (d.i[0] == 0 && d.i[1] == 0)
    return 0;

  // Check for NaN and infinity.
  if (d.d.ieee.exponent == ieee754_double_exponent_mask) {
    if (err) {
      std::ostringstream os;
      os << "Bad float number: " << src << " (" << std::setbase(16) << d.i[0]
         << " " << d.i[1] << ")";
      *err = os.str();
    }
    d.d.d = 0;
  }

  if (m_invscale)
    d.d.d *= m_invscale;

  bool was_negative = false;
  if (d.d.ieee.negative != 0) {
    if (m_is_signed) {
      was_negative = true;
      d.d.d = -d.d.d;
    }
    else {
      // Don't complain on -0.
      if (d.d.d < 0 && err) {
        std::ostringstream os;
        os << "Float overflow during packing: " << src;
        *err = os.str();
      }
      d.d.d = 0;
    }
  }

  // Check for zero again.
  // (Also need to preserve the sign; the scale division may
  // have underflowed.)
  if (d.i[0] == 0 && d.i[1] == 0) {
    if (was_negative)
      return m_signmask;
    else
      return 0;
  }

  // Get packdest_bits bits of mantissa.

  Packdest mantissa =
    (d.d.ieee.mantissa0 << (packdest_bits -
                            ieee754_double_mantissa0_bits)) |
    (d.d.ieee.mantissa1 >>
     (ieee754_double_mantissa1_bits -
      (packdest_bits - ieee754_double_mantissa0_bits)));

  // Get the unbiased exponent.
  int exponent =
    static_cast<int> (d.d.ieee.exponent) - ieee754_double_bias;

  // Do rounding, if requested.
  if (m_round) {
    Packdest lsbmask = (1 << (packdest_bits - m_npack));
    int roundbit;
    Packdest roundmask;
    if (lsbmask > 1) {
      roundbit = (mantissa & (lsbmask >> 1));
      roundmask = ~ static_cast<Packdest> (roundbit - 1);
    }
    else {
      roundbit = (d.d.ieee.mantissa1 &
                  ((1 << ((ieee754_double_mantissa1_bits -
                           (packdest_bits -
                            ieee754_double_mantissa0_bits)) - 1))));
      roundmask = ~ static_cast<Packdest> (0);
    }

    if (roundbit != 0) {
      // Handle the case where it would overflow.
      if ((mantissa & roundmask) == roundmask) {
        mantissa >>= 1;
        mantissa |= roundmask;
        exponent += 1;
      }

      mantissa += lsbmask;
    }
  }

  // If the number is too large, complain, and reset to the largest number.
  if (exponent > m_max_exp) {
    if (err) {
      std::ostringstream os;
      os << "Float overflow during packing: " << src;
      *err = os.str();
    }
    exponent = m_max_exp;
    mantissa = static_cast<Packdest> (~0);
  }

  // Handle denormals.  (We've already handled the zero case.)
  if (exponent == - ieee754_double_bias)
    renormalize_denormal (exponent, mantissa);

  // If the number is too small, denormalize, or underflow to 0.
  underflow_to_denormal (m_min_exp, m_round ? m_npack: 0, exponent, mantissa);

  // Pack in the mantissa bits.
  Packdest dest = mantissa >> (packdest_bits - m_npack);

  // The exponent, if desired.
  if (m_nexp > 0)
    dest |= ((exponent - m_min_exp) << m_npack);

  // And the optional sign bit.
  if (was_negative)
    dest |= m_signmask;

  return dest;
}
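
The two bit-level checks at the start of pack() (exact zero, and NaN or infinity) can be illustrated on their own. The listing uses a union to inspect the bits; the sketch below swaps in std::memcpy instead and assumes a 64-bit IEEE 754 double. The names `bitsOf`, `isAllZeroBits` and `isNanOrInf` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Copy the object representation of a double into a 64-bit integer.
inline std::uint64_t bitsOf(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  std::uint64_t u;
  std::memcpy(&u, &x, sizeof u);
  return u;
}

// True only for +0.0: -0.0 and denormals have at least one bit set,
// so they are not mistaken for zero the way a floating-point compare might.
inline bool isAllZeroBits(double x) {
  return bitsOf(x) == 0;
}

// NaN and infinity both have all 11 exponent bits set.
inline bool isNanOrInf(double x) {
  return ((bitsOf(x) >> 52) & 0x7FFu) == 0x7FFu;
}
```

Testing the bit pattern rather than comparing against 0.0 keeps -0 and denormals on the general path, where the sign and the small magnitude are still handled.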

◆ unpack()

double CxxUtils::FloatPacker::unpack ( Packdest val,
std::string * err = nullptr ) const

Unpack the value VAL.

Parameters
val - The packed data. It should start with the low bit, and any extraneous bits should have been masked off.
err - If non-null, then this string will be set to a description of any error that occurs.

Definition at line 404 of file FloatPacker.cxx.

double FloatPacker::unpack (Packdest val, std::string* err /*= nullptr*/) const
{
  // Fast-path for 0.
  if (val == 0)
    return 0;

  // Break apart the packed value.
  bool was_negative = false;
  if ((val & m_signmask) != 0)
    was_negative = true;

  double d;

  // Fast path for fixed-point representations.
  if (m_nexp == 0) {
    Packdest mantissa = (val & m_npack_ones);
    d = mantissa / ((double)m_npack_ones + 1);
    if (was_negative)
      d *= -1;
  }
  else {
    // Get the mantissa.
    Packdest mantissa = (val & m_npack_ones) << (packdest_bits - m_npack);

    // General case.
    // Get the exponent.
    int exponent = ((val >> m_npack) & m_nexp_ones);
    exponent += m_min_exp; // unbias.

    ieee754_double dd;

    // Handle denormals.
    if (exponent == m_min_exp) {
      // Maybe it was -0?
      if (mantissa == 0) {
        dd.d = 0;
        if (was_negative)
          dd.ieee.negative = 1;
        return dd.d;
      }

      renormalize_denormal (exponent, mantissa);
    }

    // Complain about overflow.
    if (exponent >= max_int (ieee754_double_exponent_bits)) {
      if (err) {
        std::ostringstream os;
        os << "Overflow while unpacking float; exponent: " << exponent;
        *err = os.str();
      }
      exponent = max_int (ieee754_double_exponent_bits) + 1;
      mantissa = 0; // Infinity.
    }

    // Underflow into denormal.
    underflow_to_denormal ( - ieee754_double_bias, 0,
                            exponent, mantissa);

    // Pack into a double.
    dd.ieee.negative = was_negative ? 1 : 0;
    dd.ieee.exponent = exponent + ieee754_double_bias;
    dd.ieee.mantissa0 =
      (mantissa >> (packdest_bits - ieee754_double_mantissa0_bits));
    dd.ieee.mantissa1 =
      (mantissa << (ieee754_double_mantissa0_bits -
                    (packdest_bits - ieee754_double_mantissa1_bits)));
    d = dd.d;
  }

  // Set the result.
  if (m_scale)
    d *= m_scale;
  return d;
}
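
The fixed-point fast path of unpack() (the nexp == 0 branch) is simple enough to reproduce as a standalone sketch. The function name `unpackFixed` and its parameter list are hypothetical; it assumes that with no exponent bits the sign bit, if present, sits directly above the npack mantissa bits.

```cpp
#include <cassert>
#include <cstdint>

using Packdest = std::uint32_t;

// Decode a packed fixed-point value: mantissa / 2^npack, then sign, then scale.
inline double unpackFixed(Packdest val, int npack, bool is_signed, double scale) {
  assert(npack >= 1 && npack <= (is_signed ? 31 : 32));
  const Packdest npack_ones =
    npack >= 32 ? ~Packdest{0} : ((Packdest{1} << npack) - 1);
  const Packdest signmask = is_signed ? (Packdest{1} << npack) : Packdest{0};

  // All bits zero decodes to 0 regardless of scale.
  if (val == 0)
    return 0;

  // The mantissa is a fraction in [0, 1).
  double d = (val & npack_ones) / (static_cast<double>(npack_ones) + 1);
  if ((val & signmask) != 0)
    d *= -1;

  // A scale of 0 means "do not scale".
  if (scale != 0)
    d *= scale;
  return d;
}
```

For example, with eleven mantissa bits and a sign bit, the packed value 0x400 decodes to 0.5 and 0xC00 decodes to -0.5.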

Member Data Documentation

◆ m_invscale

double CxxUtils::FloatPacker::m_invscale
private

Inverse of scale.

Definition at line 111 of file FloatPacker.h.

◆ m_is_signed

bool CxxUtils::FloatPacker::m_is_signed
private

Should we use a sign bit?

Definition at line 114 of file FloatPacker.h.

◆ m_max_exp

int CxxUtils::FloatPacker::m_max_exp
private

Maximum exponent value.

Definition at line 138 of file FloatPacker.h.

◆ m_min_exp

int CxxUtils::FloatPacker::m_min_exp
private

Minimum exponent value.

Definition at line 135 of file FloatPacker.h.

◆ m_nexp

int CxxUtils::FloatPacker::m_nexp
private

Number of exponent bits.

Definition at line 129 of file FloatPacker.h.

◆ m_nexp_ones

Packdest CxxUtils::FloatPacker::m_nexp_ones
private

Mask with that many low bits set.

Definition at line 132 of file FloatPacker.h.

◆ m_nmantissa

int CxxUtils::FloatPacker::m_nmantissa
private

Number of bits in the mantissa + sign bit.

Definition at line 105 of file FloatPacker.h.

◆ m_npack

int CxxUtils::FloatPacker::m_npack
private

Number of bits in mantissa (exclusive of any sign bit).

Definition at line 120 of file FloatPacker.h.

◆ m_npack_ones

Packdest CxxUtils::FloatPacker::m_npack_ones
private

Mask with that many low bits set.

Definition at line 123 of file FloatPacker.h.

◆ m_round

bool CxxUtils::FloatPacker::m_round
private

Should we round instead of truncating?

Definition at line 117 of file FloatPacker.h.

◆ m_scale

double CxxUtils::FloatPacker::m_scale
private

Scale factor for stored numbers.

Definition at line 108 of file FloatPacker.h.

◆ m_signmask

Packdest CxxUtils::FloatPacker::m_signmask
private

Mask containing the sign bit (or 0 if there's no sign bit).

Definition at line 126 of file FloatPacker.h.


The documentation for this class was generated from the following files: