
The internal representations of bit-field structures have several properties (such as internal padding) that are implementation-defined. Additionally, bit-field structures have several implementation-defined constraints:

  • The alignment of bit-fields in the storage unit (for example, the bit-fields may be allocated from the high end or the low end of the storage unit)
  • Whether or not bit-fields can overlap a storage unit boundary

Consequently, it is impossible to write portable, safe code that makes assumptions regarding the layout of bit-field structure members.

Noncompliant Code Example (Bit-Field Alignment)

Bit-fields allow flags or other integer values with small ranges to be packed together, which can improve the storage efficiency of structures. Compilers typically allocate consecutive bit-field structure members into the same int-sized storage unit, as long as they fit completely into that storage unit. However, the order of allocation within a storage unit is implementation-defined. Some implementations are right-to-left: the first member occupies the low-order position of the storage unit. Others are left-to-right: the first member occupies the high-order position of the storage unit. Calculations that depend on the order of bits within a storage unit may produce different results on different implementations.

Consider the following structure made up of four 8-bit bit-field members:

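Code Block
struct bf {
  unsigned int m1 : 8;
  unsigned int m2 : 8;
  unsigned int m3 : 8;
  unsigned int m4 : 8;
}; /* 32 bits total */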

C provides a storage-compaction capability for structure members, in which each member occupies only a specified number of bits. Such a member is known as a bit-field. Bit-fields can be useful for reducing the storage needed for a large array of structures. They are also useful for defining various hardware interfaces which specify the individual bits within a machine word.

In portable code, do not depend upon the allocation order of bit-fields in memory. Of course, in machine-specific non-portable code one knows exactly how the bit-fields are laid out, and the internal details can be inspected with bitwise operations.

Consider the representation of time-of-day in hours, minutes, seconds and milliseconds. Bit-fields provide one way to represent such times:

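Code Block

typedef struct time_day {
  unsigned h1 : 2;  /* tens digit of hours     {0:2} */
  unsigned h2 : 4;  /* units digit of hours    {0:9} */
  unsigned m1 : 3;  /* tens digit of minutes   {0:5} */
  unsigned m2 : 4;  /* units digit of minutes  {0:9} */
  unsigned s1 : 3;  /* tens digit of seconds   {0:5} */
  unsigned s2 : 4;  /* units digit of seconds  {0:9} */
  unsigned f1 : 4;  /* first digit of fraction  {0:9} */
  unsigned f2 : 4;  /* second digit of fraction {0:9} */
  unsigned f3 : 4;  /* third digit of fraction  {0:9} */
} TIME_DAY;  /* 32 bits total */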

The last millisecond of the day is
23:59:59.999 (hh:mm:ss.fff)
Each member (bit-field) is declared to be unsigned (int); this is the only bit-field type that is guaranteed to be portable to all current compilers. Each member is declared with only as many bits as are necessary to represent the possible digits at its position in the time representation. For example, h1 (the first digit of hours) needs only two bits to represent its possible values (0, 1, and 2), and the largest members need only four bits to represent the ten digits 0 through 9. The total number of bits is 32.
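Because each member has room for more values than its digit range allows (h1 can hold 3, and a 4-bit member can hold 15), code that fills in such a structure may want to check the ranges explicitly. The following is a minimal sketch of such a check; the function name time_day_is_valid is hypothetical and is not part of the original text.

```c
typedef struct time_day {
  unsigned h1 : 2;  /* tens digit of hours     {0:2} */
  unsigned h2 : 4;  /* units digit of hours    {0:9} */
  unsigned m1 : 3;  /* tens digit of minutes   {0:5} */
  unsigned m2 : 4;  /* units digit of minutes  {0:9} */
  unsigned s1 : 3;  /* tens digit of seconds   {0:5} */
  unsigned s2 : 4;  /* units digit of seconds  {0:9} */
  unsigned f1 : 4;  /* first digit of fraction  {0:9} */
  unsigned f2 : 4;  /* second digit of fraction {0:9} */
  unsigned f3 : 4;  /* third digit of fraction  {0:9} */
} TIME_DAY;

/*
 * Hypothetical helper: returns 1 if every digit is within its documented
 * range and the hour does not exceed 23, otherwise returns 0.
 */
int time_day_is_valid(const TIME_DAY *t) {
  if (t->h1 > 2 || t->h2 > 9) return 0;
  if (t->h1 == 2 && t->h2 > 3) return 0;  /* 24 through 29 are not hours */
  if (t->m1 > 5 || t->m2 > 9) return 0;
  if (t->s1 > 5 || t->s2 > 9) return 0;
  return t->f1 <= 9 && t->f2 <= 9 && t->f3 <= 9;
}
```

The check uses only member names, so it does not depend on how the implementation lays the bit-fields out.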

Consecutive bit-field members are allocated by the compiler to the same int-sized word, as long as they fit completely. Thus, on a 32-bit machine, a TIME_DAY object will occupy exactly one int-sized word. On a 16-bit machine, the first five members (totalling 16 bits) will fit into one int-sized word, and the last four members will fit into an immediately following word. Such an exact fit is rare, however. Add another member such as "day-of-year" to the structure, and the nice size-fitting property disappears. Thus, bit-fields are useful for storage-saving only if they occupy most or all of the space of an int, and if the storage-saving property is to be reasonably portable, they must occupy most of the space in a 32-bit integer.
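The effect of adding a member can be observed with sizeof. The sketch below is illustrative only: struct time_and_day and the helper unused_bits are hypothetical names, and the sizes reported depend on the implementation.

```c
#include <limits.h>
#include <stddef.h>

typedef struct time_day {
  unsigned h1 : 2;
  unsigned h2 : 4;
  unsigned m1 : 3;
  unsigned m2 : 4;
  unsigned s1 : 3;
  unsigned s2 : 4;
  unsigned f1 : 4;
  unsigned f2 : 4;
  unsigned f3 : 4;
} TIME_DAY;  /* 32 bits of members */

/* Hypothetical variant with a "day-of-year" member added. */
struct time_and_day {
  unsigned h1 : 2;
  unsigned h2 : 4;
  unsigned m1 : 3;
  unsigned m2 : 4;
  unsigned s1 : 3;
  unsigned s2 : 4;
  unsigned f1 : 4;
  unsigned f2 : 4;
  unsigned f3 : 4;
  unsigned day : 9;  /* day of year {0:365} */
};  /* 41 bits of members */

/* Bits of storage in an object that no declared member uses. */
size_t unused_bits(size_t type_size, size_t member_bits) {
  return type_size * CHAR_BIT - member_bits;
}
```

Comparing unused_bits(sizeof(TIME_DAY), 32) with unused_bits(sizeof(struct time_and_day), 41) shows how much storage the extra member costs on a given implementation.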

The order of allocation within a word is different in different implementations. Some implementations are "right-to-left": the first member occupies the low-order position of the word. Most PDP-11 and VAX compilers allocate right-to-left. Following the convention that the low-order bit of a word is on the right, the right-to-left allocation would look like this:

Code Block

f3	f2	f1	s2	s1	m2	m1	h2	h1

Most other implementations are "left-to-right":

Code Block

h1	h2	m1	m2	s1	s2	f1	f2	f3

A union provides a convenient way to say what is going on:

Code Block

typedef union time_overlay /* MACHINE DEPENDENT */
{
  struct time_day time_as_fields;
  long time_as_long;
} TIME_OVERLAY;

TIME_OVERLAY time_port;

This allows bitwise operations like time_port.time_as_long & 0xF00 as well as providing access via bit-field names like time_port.time_as_fields.h1.

Specifying a field size of zero causes any subsequent allocations to begin on a new word boundary. Unnamed bit-fields are allowed; they occupy space but are inaccessible, which is useful for "padding" within a structure.
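As an illustration of zero-width and unnamed bit-fields, the following declarations are a minimal sketch. The structure names are hypothetical, and the sizes that result are implementation-defined.

```c
struct packed_flags {
  unsigned a : 4;
  unsigned b : 4;  /* typically shares a storage unit with a */
};

struct split_flags {
  unsigned a : 4;
  unsigned   : 0;  /* zero width: the next bit-field starts a new storage unit */
  unsigned b : 4;
};

struct padded_flags {
  unsigned a : 4;
  unsigned   : 4;  /* unnamed: reserves 4 bits that cannot be accessed */
  unsigned b : 4;
};
```

Unnamed bit-fields do not take part in initialization, so an initializer such as {1, 2} for struct padded_flags sets a and b. On many implementations sizeof(struct split_flags) is larger than sizeof(struct packed_flags), but portable code should not rely on either value.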

Because most C machines do not support bit-addressing, the "address-of" (&) operator is not allowed upon bit-field members.
Aside from these complications, bit-fields can be treated just like any other structure member. The following declaration

Code Block

#include "time_day.h"
struct time_day last_msec = {2, 3, 5, 9, 5, 9, 9, 9, 9};

initializes last_msec to the last millisecond of the day. Similarly, the following code

Code Block

struct time_day now;

/* ... */
if (now.h1 == 0 || (now.h1 == 1 && now.h2 < 2))

tests whether now is less than "noon."
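Access by member name extends naturally to arithmetic on the whole time value. The sketch below converts a TIME_DAY to milliseconds since midnight using only member names, so it does not depend on the allocation order. The function name time_day_to_msec is hypothetical and is not part of the original text.

```c
typedef struct time_day {
  unsigned h1 : 2;  /* tens digit of hours     {0:2} */
  unsigned h2 : 4;  /* units digit of hours    {0:9} */
  unsigned m1 : 3;  /* tens digit of minutes   {0:5} */
  unsigned m2 : 4;  /* units digit of minutes  {0:9} */
  unsigned s1 : 3;  /* tens digit of seconds   {0:5} */
  unsigned s2 : 4;  /* units digit of seconds  {0:9} */
  unsigned f1 : 4;  /* first digit of fraction  {0:9} */
  unsigned f2 : 4;  /* second digit of fraction {0:9} */
  unsigned f3 : 4;  /* third digit of fraction  {0:9} */
} TIME_DAY;

/* Hypothetical helper: milliseconds elapsed since 00:00:00.000. */
unsigned long time_day_to_msec(const TIME_DAY *t) {
  unsigned long hours    = t->h1 * 10UL + t->h2;
  unsigned long minutes  = t->m1 * 10UL + t->m2;
  unsigned long seconds  = t->s1 * 10UL + t->s2;
  unsigned long fraction = t->f1 * 100UL + t->f2 * 10UL + t->f3;
  return ((hours * 60UL + minutes) * 60UL + seconds) * 1000UL + fraction;
}
```

For the last millisecond of the day, 23:59:59.999, this yields 86399999.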
If we wish to use the TIME_DAY structure for an "information-hiding" purpose, so that it could be changed without affecting the programs that use it, we should employ the "leading-underscore" convention mentioned earlier in Section 6.2:

Code Block

typedef struct _time_day {
  unsigned _h1 : 2;  /* tens digit of hours     {0:2} */
  unsigned _h2 : 4;  /* units digit of hours    {0:9} */
  unsigned _m1 : 3;  /* tens digit of minutes   {0:5} */
  unsigned _m2 : 4;  /* units digit of minutes  {0:9} */
  unsigned _s1 : 3;  /* tens digit of seconds   {0:5} */
  unsigned _s2 : 4;  /* units digit of seconds  {0:9} */
  unsigned _f1 : 4;  /* first digit of fraction  {0:9} */
  unsigned _f2 : 4;  /* second digit of fraction {0:9} */
  unsigned _f3 : 4;  /* third digit of fraction  {0:9} */
} TIME_DAY;  /* 32 bits total */

Right-to-left implementations will allocate struct bf as one storage unit with this format:

Code Block
m4   m3   m2   m1

Conversely, left-to-right implementations will allocate struct bf as one storage unit with this format:

Code Block
m1   m2   m3   m4
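Which of the two formats an implementation uses can be observed, for diagnostic purposes only, by copying the object representation into an unsigned int. The following is a minimal sketch that assumes unsigned int is 32 bits wide; the function name bf_storage_with_m1_set is hypothetical. Production code should not branch on the result.

```c
#include <string.h>

struct bf {
  unsigned int m1 : 8;
  unsigned int m2 : 8;
  unsigned int m3 : 8;
  unsigned int m4 : 8;
}; /* 32 bits total */

/*
 * Hypothetical probe: sets only m1 to 0xFF and returns the storage unit
 * read back as an unsigned int. Assumes a 32-bit unsigned int.
 */
unsigned int bf_storage_with_m1_set(void) {
  struct bf data = {0};
  unsigned int raw = 0;
  size_t n = sizeof raw < sizeof data ? sizeof raw : sizeof data;

  data.m1 = 0xFF;
  memcpy(&raw, &data, n);  /* copy the bytes; no pointer type punning */
  return raw;
}
```

Under the stated assumption, a result of 0x000000FF indicates that m1 occupies the low-order position (right-to-left), and 0xFF000000 indicates that it occupies the high-order position (left-to-right).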

The following code behaves differently depending on whether the implementation is left-to-right or right-to-left:

Code Block
struct bf {
  unsigned int m1 : 8;
  unsigned int m2 : 8;
  unsigned int m3 : 8;
  unsigned int m4 : 8;
}; /* 32 bits total */

void function(void) {
  struct bf data;
  unsigned char *ptr;

  data.m1 = 0;
  data.m2 = 0;
  data.m3 = 0;
  data.m4 = 0;
  ptr = (unsigned char *)&data;
  (*ptr)++; /* Can increment data.m1 or data.m4 */
}

Compliant Solution (Bit-Field Alignment)

This compliant solution is explicit in which fields it modifies:

Code Block
struct bf {
  unsigned int m1 : 8;
  unsigned int m2 : 8;
  unsigned int m3 : 8;
  unsigned int m4 : 8;
}; /* 32 bits total */

void function(void) {
  struct bf data;
  data.m1 = 0;
  data.m2 = 0;
  data.m3 = 0;
  data.m4 = 0;
  data.m1++;
}
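When a program genuinely needs a fixed position for each field within a word (for example, to match an external format), one commonly used alternative to bit-fields is explicit shifting and masking on a fixed-width unsigned integer. The following is a sketch of that alternative technique, not part of the compliant solution above; the names get_field and set_field, and the choice that index 0 is the low-order byte, are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical accessors for four 8-bit fields packed into a uint32_t.
 * Field index 0 is defined here to be the low-order byte; valid indexes
 * are 0 through 3.
 */
uint32_t get_field(uint32_t word, unsigned index) {
  return (word >> (8u * index)) & 0xFFu;
}

uint32_t set_field(uint32_t word, unsigned index, uint32_t value) {
  unsigned shift = 8u * index;
  uint32_t mask = (uint32_t)0xFF << shift;
  /* Clear the old field, then insert the low 8 bits of the new value. */
  return (word & ~mask) | ((value & 0xFFu) << shift);
}
```

Because the shifts operate on values rather than on the object representation, the position of each field within the word is the same on every implementation.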

Noncompliant Code Example (Bit-Field Overlap)

This noncompliant code example declares bit-fields of 6 and 4 bits. Assuming 8 bits to a byte, it is implementation-defined whether each bit-field is contained within its own byte or whether the bit-fields are packed together so that one of them is split across multiple bytes.

Code Block
struct bf {
  unsigned int m1 : 6;
  unsigned int m2 : 4;
};

void function(void) {
  unsigned char *ptr;
  struct bf data;
  data.m1 = 0;
  data.m2 = 0;
  ptr = (unsigned char *)&data;
  ptr++;
  *ptr += 1; /* What does this increment? */
}

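If each bit-field lives within its own byte, then m2 (or m1, depending on alignment) is incremented by 1. If the bit-fields are indeed packed across 8-bit bytes, then m2 might be incremented by 4. For example, with right-to-left allocation on a little-endian machine, m1 occupies bits 0 through 5 and m2 occupies bits 6 through 9, so the low-order bit of the second byte is bit 2 of m2.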
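The outcome on a particular implementation can be observed with a small diagnostic probe. The sketch below repeats the noncompliant access on purpose and reports what changed; the function name probe_second_byte is hypothetical, and the code is meant for experimentation only, not as a pattern to follow.

```c
#include <string.h>

struct bf {
  unsigned int m1 : 6;
  unsigned int m2 : 4;
};

/*
 * Hypothetical probe: zeroes a struct bf, increments its second byte, and
 * reports the resulting member values through the out-parameters.
 * The values reported are implementation-defined.
 */
void probe_second_byte(unsigned *m1_out, unsigned *m2_out) {
  struct bf data;
  unsigned char *ptr;

  memset(&data, 0, sizeof data);  /* zero every byte, including padding */
  ptr = (unsigned char *)&data;
  ptr[1] += 1;                    /* the layout-dependent modification */
  *m1_out = data.m1;
  *m2_out = data.m2;
}
```

An implementation that packs the members right-to-left on a little-endian machine would be expected to report m1 as 0 and m2 as 4; other implementations may report different values.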

Compliant Solution (Bit-Field Overlap)

This compliant solution is explicit in which fields it modifies:

Code Block
struct bf {
  unsigned int m1 : 6;
  unsigned int m2 : 4;
};

void function(void) {
  struct bf data;
  data.m1 = 0;
  data.m2 = 0;
  data.m2 += 1;
}

...

Risk Assessment

Making invalid assumptions about the type of type-cast data, especially bit-fields or their layout, can result in unexpected data values or unexpected program flow.

| Rule / Recommendation | Severity | Likelihood | Remediation Cost | Detectable | Repairable | Priority | Level |
|---|---|---|---|---|---|---|---|
| INT11-A | 1 (low) | 1 (unlikely) | 2 (medium) | | | P2 | L3 |
| EXP11-C | Medium | Probable | | No | No | P4 | L3 |

Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| Astrée | | | Supported: Astrée reports runtime errors resulting from invalid assumptions. |
| Compass/ROSE | | | Can detect violations of this recommendation. Specifically, it reports violations if a pointer to one object is type cast to the pointer of a different object and the pointed-to object of the (type cast) pointer is then modified arithmetically. |
| Helix QAC | | C0310, C0751 | |
| LDRA tool suite | | 554 S | Fully implemented |
| Polyspace Bug Finder | | CERT C: Rec. EXP11-C | Checks for bit fields accessed using pointer. |

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this recommendation on the CERT website.

References

Related Guidelines

Bibliography

[Plum 1985] Rule 6-5: In portable code, do not depend upon the allocation order of bit-fields within a word


...

[ISO/IEC 9899-1999] Section 6.7.2, "Type specifiers"
[MISRA 04] Rule 3.5
[Plum 85] Rule 6-5