Using integer arithmetic to calculate a value for assignment to a floating-point variable may lead to loss of information. This can be avoided by converting one of the integers in the expression to a floating-point type. When converting integers to floating-point values, it is important to be aware that there may be loss of precision (see INT33-J. Do not cast numeric types to wider floating-point types without range checking).
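
The precision loss mentioned above can be illustrated with a minimal sketch. The class and method names here are hypothetical and not part of the guideline; the sketch only assumes that a `float` carries a 24-bit significand, so not every `int` above 2^24 has an exact `float` representation.

```java
// Hypothetical demo class; names and values are illustrative only.
final class IntToFloatPrecision {

    // Returns true if the int value is unchanged after conversion to float.
    // The comparison is done in long so that values near Integer.MAX_VALUE
    // are not hidden by the saturating float-to-int conversion.
    static boolean roundTripsThroughFloat(int value) {
        float asFloat = value;          // implicit widening conversion, may round
        return (long) asFloat == value;
    }

    public static void main(String[] args) {
        int exact = 1 << 24;            // 16777216, exactly representable as float
        int inexact = (1 << 24) + 1;    // 16777217, rounds to 16777216 as float

        System.out.println(roundTripsThroughFloat(exact));    // prints true
        System.out.println(roundTripsThroughFloat(inexact));  // prints false
    }
}
```

A check of this kind is one possible way to decide whether a conversion is safe for a given value; converting to `double` instead avoids the problem for every `int`, though not for every `long`.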

...

In this noncompliant code example, the division and multiplication operations take place on integers, and the results are then converted to floating-point. Consequently, the floating-point variables d, e, and f are initialized incorrectly because the operations take place before the values of a, b, and c are converted to floating-point. The result of each division is truncated toward zero, and the multiplication overflows.

Code Block
bgColor#FFCCCC
short a = 533;
int b = 6789;
long c = 4664382371590123456L;

float d = a / 7;    // d is 76.0 (truncated)
double e = b / 30;  // e is 226.0 (truncated)
double f = c * 2;   // f is -9.1179793305293046E18 because of overflow
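
As a rough, runnable illustration of the same pitfalls, the following sketch wraps the integer operations in two hypothetical helper methods (the names are assumptions made for this example). It also shows that Java integer division truncates toward zero, which matters for negative operands.

```java
// Hypothetical demo class; not part of the original guideline.
final class IntegerArithmeticPitfalls {

    // The division is performed on ints; only the truncated result is widened.
    static double truncatedQuotient(int dividend, int divisor) {
        return dividend / divisor;
    }

    // The multiplication is performed on longs and may silently wrap around
    // before the result is widened to double.
    static double wrappedProduct(long x, long y) {
        return x * y;
    }

    public static void main(String[] args) {
        System.out.println(truncatedQuotient(6789, 30)); // prints 226.0
        System.out.println(truncatedQuotient(-7, 2));    // prints -3.0 (toward zero)

        // Same operands as variable c above: the product exceeds Long.MAX_VALUE,
        // so the wrapped result is negative.
        System.out.println(wrappedProduct(4664382371590123456L, 2L) < 0); // prints true
    }
}
```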

Compliant Solution (Floating Point Literal)

This compliant solution eliminates the errors in initialization by ensuring that at least one of the operands of each operation is of a floating-point type.

Code Block
bgColor#CCCCFF
short a = 533;
int b = 6789;
long c = 4664382371590123456L;

float d = a / 7.0f;       // d is 76.14286
double e = b / 30.;       // e is 226.3
double f = (double)c * 2; // f is 9.328764743180247E18
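
One detail worth spelling out is where the cast goes. The sketch below uses hypothetical method names to contrast casting an operand (the operation is carried out in floating-point) with casting the finished integer result (the truncation has already happened).

```java
// Hypothetical demo class; names are illustrative only.
final class CastPlacement {

    // Cast applied to an operand: the division is a floating-point division.
    static double average(int sum, int count) {
        return (double) sum / count;
    }

    // Cast applied to the result: the division is still an integer division,
    // so the fractional part is lost before the conversion.
    static double averageCastTooLate(int sum, int count) {
        return (double) (sum / count);
    }

    public static void main(String[] args) {
        System.out.println(average(7, 2));            // prints 3.5
        System.out.println(averageCastTooLate(7, 2)); // prints 3.0
    }
}
```

Because a cast binds more tightly than `/`, `(double) sum / count` converts `sum` first; the extra parentheses in the second method are what move the cast after the division.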

Compliant Solution

This compliant solution eliminates the initialization errors by first storing the integers in the floating-point variables and then performing the arithmetic operations. This ensures that at least one of the operands is a floating-point number, and consequently the operation is performed on floating-point numbers.

Code Block
bgColor#CCCCFF
short a = 533;
int b = 6789;
long c = 4664382371590123456L;

float d = a;
double e = b;
double f = c;

d /= 7;   // d is 76.14286
e /= 30;  // e is 226.3
f *= 2;  // f is 9.328764743180247E18

...

FLP31-EX1: If it is the programmer's intention to have the operation use integers before the conversion (obviating the need for a call to the floor() method, for example), it should be clearly documented to help future maintainers understand that this behavior is intentional.
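
A minimal sketch of how such documentation might look follows. The pagination scenario, class name, and method name are assumptions chosen for illustration; the point is only that the comment states the intent explicitly.

```java
// Hypothetical example of documenting intentional integer arithmetic.
final class Pagination {

    /**
     * Returns the number of completely filled pages as a double.
     * Assumes itemCount is non-negative and pageSize is positive.
     */
    static double fullPages(int itemCount, int pageSize) {
        // Intentional integer division (FLP31-EX1): the fractional part is
        // deliberately discarded before the conversion to double, so no
        // separate floor() call is needed for non-negative operands.
        int pages = itemCount / pageSize;
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(fullPages(25, 10)); // prints 2.0
    }
}
```

Storing the integer result in a named `int` variable before the conversion is one way to make the intent visible in the code itself as well as in the comment.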

...

Improper conversions between integers and floating-point values may yield unexpected results, especially as a result of precision loss. In some cases, these unexpected results may involve overflow or undefined behavior.

...