The C programming language provides the ability to use floating-point numbers for calculations. The C Standard specifies requirements on a conforming implementation for floating-point numbers but makes few guarantees about the specific underlying floating-point representation because of the existence of competing floating-point systems.
By definition, a floating-point number is of finite precision and, regardless of the underlying implementation, is prone to errors associated with rounding. (See FLP01-C. Take care in rearranging floating-point expressions and FLP02-C. Avoid using floating-point numbers when precise computation is needed.)
The most common floating-point system is specified by the IEEE 754 standard. An older floating-point system is the IBM floating-point representation (sometimes called IBM/370). These systems have different precisions and ranges of representable values. As a result, they do not represent all of the same values, are not binary compatible, and have different associated error rates.
Because of a lack of guarantees on the specifics of the underlying floating-point system, no assumptions can be made about either precision or range. Even if code is not intended to be portable, the chosen compiler's behavior must be well understood at all compiler optimization levels.
Here is a simple illustration of precision limitations. The following code prints the decimal representation of 1/3 to 50 decimal places. Ideally, it would print 50 numeral 3s:
On 64-bit Linux, with GCC 4.1, it produces
On 64-bit Windows, with Microsoft Visual Studio 2012, it produces
Additionally, compilers may treat floating-point variables differently under different levels of optimization [Gough 2005]:
When compiled on an IA-32 Linux machine with GCC 3.4.4 at optimization level 1 or higher, or on an IA-64 Windows machine with Microsoft Visual Studio 2012 in Debug or Release mode, this code reports that the comparison succeeds.
On an IA-32 Linux machine with GCC 3.4.4 with optimization turned off, this code reports that the comparison fails.
The reason for this behavior is that Linux uses the internal extended precision mode of the x87 floating-point unit (FPU) on IA-32 machines for increased accuracy during computation. When the result is stored into memory by the assignment to c, the FPU automatically rounds the result to fit into a double. The value read back from memory now compares unequally to the internal representation, which has extended precision. Windows does not use the extended precision mode, so all computation is done with double precision, and there are no differences in precision between values stored in memory and those internal to the FPU. For GCC, compiling at optimization level 1 or higher eliminates the unnecessary store into memory, so all computation happens within the FPU with extended precision [Gough 2005].
The standard macro FLT_EPSILON, defined in <float.h> (GCC and compatible compilers also predefine it as __FLT_EPSILON__), can be used to evaluate whether two floating-point values are close enough to be considered equivalent given the granularity of floating-point operations for a given implementation. FLT_EPSILON represents the difference between 1 and the least value greater than 1 that is representable as a float. The granularity of a floating-point operation is determined by multiplying the operand with the larger absolute value by FLT_EPSILON.
On all tested platforms, this code prints
For double precision and long double precision floating-point values, use a similar approach using the DBL_EPSILON and LDBL_EPSILON constants from <float.h> (predefined by GCC as __DBL_EPSILON__ and __LDBL_EPSILON__), respectively.
Consider using numerical analysis to properly understand the numerical properties of the problem.
Failing to understand the limitations of floating-point numbers can result in unexpected computational results and exceptional conditions, possibly resulting in a violation of data integrity.
|Floating-point expressions shall not be tested for equality or inequality|
|Absorption of float operand||One addition or subtraction operand is absorbed by the other operand|
Search for vulnerabilities resulting from the violation of this recommendation on the CERT website.
|[Gough 2005]||Section 8.6, "Floating-Point Issues"|
|[Hatton 1995]||Section 2.7.3, "Floating-Point Misbehavior"|
|[IEEE 754 2006]|
|[Lockheed Martin 2005]||AV Rule 202, Floating-point variables shall not be tested for exact equality or inequality|