INT02. Understand integer conversion rules

Type conversions occur explicitly as the result of a cast or implicitly as required by an operation. While conversions are generally required for the correct execution of a program, they can also lead to lost or misinterpreted data.

The C99 standard rules define how C compilers handle conversions. These rules include integer promotions, integer conversion rank, and the usual arithmetic conversions.

Integer Promotions

Integer types smaller than int are promoted when an operation is performed on them. If all values of the original type can be represented as an int, the value of the smaller type is converted to an int; otherwise, it is converted to an unsigned int.

Integer promotions are applied as part of the usual arithmetic conversions to certain argument expressions, operands of the unary +, -, and ~ operators, and operands of the shift operators. The following code fragment illustrates the use of integer promotions:

char c1, c2;
c1 = c1 + c2;

Integer promotions require the promotion of each variable (c1 and c2) to int size. The two ints are added and the sum truncated to fit into the char type.

Integer promotions are performed to avoid arithmetic errors resulting from the overflow of intermediate values. For example:

char cresult, c1, c2, c3;
c1 = 100;
c2 = 90;
c3 = --120;
cresult = c1 + c2 + c3;

In this example, the value of c1 is added to the value of c2. The sum of these values is then added to the value of c3 (according to operator precedence rules). The addition of c1 and c2 would result in an overflow of the signed char type because the result of the operation exceeds the maximum size of signed char. Because of integer promotions, however, c1, c2, and c3 are each converted to integers and the overall expression is successfully evaluated. The resulting value is then truncated and stored in cresult. Because the result is in the range of the signed char type, the truncation does not result in lost data.

Integer promotions have a number of interesting consequences. For example, adding two small integer types always results in a value of type signed int or unsigned int, and the actual operation takes place in this type. Also, applying the bitwise negation operator ~ to an unsigned char (on IA-32) results in a negative value of type signed int because the value is zero-extended to 32 bits.

Integer Conversion Rank

Every integer type has an integer conversion rank that determines how conversions are performed. The following rules for determining integer conversion rank are defined in C99.

No two different signed integer types have the same rank, even if they have the same representation.
The rank of a signed integer type is greater than the rank of any signed integer type with less precision.
The rank of long long int is greater than the rank of long int, which is greater than the rank of int, which is greater than the rank of short int, which is greater than the rank of signed char.
The rank of any unsigned integer type is equal to the rank of the corresponding signed integer type, if any.
The rank of any standard integer type is greater than the rank of any extended integer type with the same width.
The rank of char is equal to the rank of signed char and unsigned char.
The rank of any extended signed integer type relative to another extended signed integer type with the same precision is implementation defined but still subject to the other rules for determining the integer conversion rank.
For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.

The integer conversion rank is used in the usual arithmetic conversions to determine what conversions need to take place to support an operation on mixed integer types.

Usual Arithmetic Conversions

The usual arithmetic conversions are a set of rules that provides a mechanism to yield a common type when both operands of a binary operator are balanced to a common type or the second and third arguments of the conditional operator ( ? : ) are balanced to a common type. Balancing conversions involve two operands of different types, and one or both operands may be converted. Many operators that accept arithmetic operands perform conversions using the usual arithmetic conversions. After integer promotions are performed on both operands, the following rules are applied to the promoted operands.

If both operands have the same type, no further conversion is needed.
If both operands are of the same integer type (signed or unsigned), the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
If the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type is converted to the type of the operand with unsigned integer type.
If the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type. Specific operations can add to or modify the semantics of the usual arithmetic operations.

Example

In the following example, assume the following code is compiled and executed on IA-32:

signed char sc = SCHAR_MAX;
unsigned char uc = UCHAR_MAX;
signed long long sll = sc + uc;

Both the signed char sc and the unsigned char uc are subject to integer promotions in this example. Because all values of the original types can be represented as int, both values are automatically converted to int as part of the integer promotions. Further conversions are possible, if the types of these variables are not equivalent as a result of the "usual arithmetic conversions." The actual addition operation in this case takes place between the two 32-bit int values. This operation is not influenced by the fact that the resulting value is stored in a signed long long integer. The 32-bit value resulting from the addition is simply sign-extended to 64-bits after the addition operation has concluded.

Assuming that the precision of signed char is 7 bits and the precision of unsigned char is 8 bits, this operation is perfectly safe. However, if the compiler represents the signed char and unsigned char types using 31 and 32 bit precision (respectively), the variable uc would need be converted to unsigned int instead of signed int. As a result of the usual arithmetic conversions, the signed int is converted to unsigned and the addition takes place between the two unsigned int values. Also, because uc is equal to UCHAR_MAX, which is equal to UINT_MAX in this example, the addition will result in an overflow. The resulting value is then zero-extended to fit into the 64-bit storage allocated by sll.

Consequences

Misunderstanding integer conversion rules can lead to integer errors, which in turn can lead to exploitable vulnerabilites.

References

Seacord 05 Chapter 5 Integers
ISO/IEC 9899-1999 6.3 Conversions

Space shortcuts

Page tree