Comment: REM Cost Reform

The EOF macro represents a negative value that is used to indicate that the file is exhausted and no data remains when reading data from a file. EOF is an example of an in-band error indicator. In-band error indicators are problematic to work with, and the creation of new in-band error indicators is discouraged by ERR02-C. Avoid in-band error indicators.

The byte I/O functions fgetc(), getc(), and getchar() all read a character from a stream and return it as an int. (See STR00-C. Represent characters using an appropriate type.) If the stream is at the end of the file, the end-of-file indicator for the stream is set and the function returns EOF. If a read error occurs, the error indicator for the stream is set and the function returns EOF. If these functions succeed, they cast the character returned into an unsigned char.

Because EOF is negative, it should not match any unsigned character value. However, this is only true for implementations where the int type is wider than char. On an implementation where int and char have the same width, a character-reading function can read and return a valid character that has the same bit-pattern as EOF. This could occur, for example, if an attacker inserted a value that looked like EOF into the file or data stream to alter the behavior of the program.
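As a minimal sketch of the common case, the following hypothetical helper (read_back_ff() is not part of the original text) writes the byte 0xFF to a temporary file and reads it back with fgetc(). It assumes an implementation where int is wider than char, so the byte should come back as the positive value 255 rather than as EOF.

```c
#include <stdio.h>

/*
 * Hypothetical helper for illustration only.
 * Writes the single byte 0xFF to a temporary file, reads it back with
 * fgetc(), and returns whatever fgetc() returned. Returns EOF if the
 * temporary file could not be created or written.
 */
int read_back_ff(void) {
  FILE *fp = tmpfile();
  if (fp == NULL) {
    return EOF;
  }

  int result = EOF;
  if (fputc(0xFF, fp) != EOF) {
    rewind(fp);          /* reposition to the start before reading */
    result = fgetc(fp);  /* the byte is returned as an unsigned char converted to int */
  }

  fclose(fp);
  return result;
}
```

Where int is wider than char, the returned value is distinct from EOF. Where the two types have the same width, that distinction is no longer guaranteed.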

The C Standard requires only that the int type be able to represent a maximum value of +32767 and that a char type be no larger than an int. Although uncommon, this situation can result in the integer constant expression EOF being indistinguishable from a valid character; that is, (int)(unsigned char)65535 == -1. Consequently, failing to use feof() and ferror() to detect end-of-file and file errors can result in incorrectly identifying the EOF character on rare implementations where sizeof(int) == sizeof(char).

This problem is much more common when reading wide characters. The fgetwc(), getwc(), and getwchar() functions all return a value of type wint_t. This value can represent the next wide character read, or it can represent WEOF, which indicates end-of-file for wide character streams. On most implementations, the wchar_t type has the same width as wint_t, and these functions can return a character indistinguishable from WEOF.

In the UTF-16 character set, 0xFFFF is guaranteed not to be a character, which allows WEOF to be represented as the value -1. In 16-bit EUC (Extended UNIX Code), the high-order byte can never be 0xFF, so a conflict cannot occur. Similarly, all UTF-32 characters are positive when viewed as a signed 32-bit integer. All widely used character sets are designed with at least one value that does not represent a character. Consequently, it would require a custom character set designed without consideration of the C programming language for this problem to occur with wide characters or with ordinary characters that are as wide as int. See STR00-C. Represent characters using an appropriate type for more information on the proper use of character types.

The C Standard feof() and ferror() functions are not subject to the problems associated with character and integer sizes and should be used to verify end-of-file and file errors for susceptible implementations [Kettlewell 2002]. Calling both functions on each iteration of a loop adds significant overhead, so a good strategy is to temporarily trust EOF and WEOF within the loop but verify them with feof() and ferror() following the loop.
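The trust-then-verify strategy described above might be sketched as follows. The function name count_bytes() and its return codes are assumptions made for illustration and do not come from the rule itself.

```c
#include <stdio.h>

/*
 * Hypothetical helper for illustration only.
 * Counts the bytes remaining in stream and stores the total in *count.
 * Returns 0 on a genuine end-of-file, -1 on a read error, and 1 if the
 * loop stopped on a valid character that merely compares equal to EOF.
 */
int count_bytes(FILE *stream, unsigned long *count) {
  int c;
  *count = 0;

  /* Inside the loop, trust EOF to keep the per-character cost low. */
  while ((c = fgetc(stream)) != EOF) {
    ++*count;
  }

  /* After the loop, verify why it stopped. */
  if (ferror(stream)) {
    return -1;
  }
  if (feof(stream)) {
    return 0;
  }
  return 1;  /* neither indicator is set: the "EOF" was a real character */
}
```

A caller that receives 1 could resume reading or treat the input as malformed, depending on the application.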

Noncompliant Code Example

This noncompliant code example loops while the character c is not EOF:

Code Block
#include <stdio.h>
 
void func(void) {
  int c;
 
  do {
    c = getchar();
  } while (c != EOF);
}

Although EOF is guaranteed to be negative and distinct from the value of any unsigned character, it is not guaranteed to be different from any such value when converted to an int. Consequently, when int has the same width as char, this loop may terminate prematurely.

Compliant Solution (Portable)

This compliant solution uses feof() and ferror() to test whether the EOF was an actual character or a real EOF because of end-of-file or errors:

Code Block
#include <stdio.h>
 
void func(void) {
  int c;
 
  do {
    c = getchar();
  } while (c != EOF || (!feof(stdin) && !ferror(stdin)));
}

Noncompliant Code Example (Nonportable)

This noncompliant code example uses an assertion to ensure that the code is executed only on architectures where int is wider than char and EOF is guaranteed not to be a valid character value. (See INT35-C. Use correct integer precisions.) However, this code example is noncompliant because the variable c is declared as a char rather than an int, making it possible for a valid character value to compare equal to the value of the EOF macro when char is signed because of sign extension:

Code Block
#include <assert.h>
#include <limits.h>
#include <stdio.h>
 
void func(void) {
  char c;
  /* Reject implementations where int is no wider than char */
  static_assert(UCHAR_MAX < UINT_MAX, "FIO34-C violation");

  do {
    c = getchar();
  } while (c != EOF);
}

Assuming that a char is a signed 8-bit type and an int is a 32-bit type, if getchar() returns the character value '\xff' (decimal 255), it will be interpreted as EOF because this value is sign-extended to 0xFFFFFFFF (the value of EOF) to perform the comparison. (See STR34-C. Cast characters to unsigned char before converting to larger integer sizes.)
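The sign-extension effect can be illustrated with a small sketch. Both function names are hypothetical, and the sketch assumes an 8-bit two's complement signed char and an implementation where EOF is -1, which is common but not required by the C Standard.

```c
#include <stdio.h>

/*
 * Hypothetical helper for illustration only.
 * Stores the byte value in a signed char, as the noncompliant code does
 * when char is signed, then compares it with EOF.
 */
int matches_eof_as_signed_char(int byte) {
  signed char c = (signed char)byte;  /* 0xFF typically becomes -1 here */
  return c == EOF;                    /* c is promoted to int with sign extension */
}

/*
 * Hypothetical helper for illustration only.
 * Keeps the value in an int, as returned by getchar(), before comparing.
 */
int matches_eof_as_int(int byte) {
  int c = byte;                       /* 0xFF stays 255 */
  return c == EOF;
}
```

Under the stated assumptions, the byte 0xFF matches EOF only in the first function, which is why the return value of getchar() should be kept in an int.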

Compliant Solution (Nonportable)

This compliant solution declares c to be an int. Consequently, the loop will terminate only when the file is exhausted.

Code Block
#include <assert.h>
#include <stdio.h>
#include <limits.h>
 
void func(void) {
  int c;
  /* Reject implementations where int is no wider than char */
  static_assert(UCHAR_MAX < UINT_MAX, "FIO34-C violation");

  do {
    c = getchar();
  } while (c != EOF);
}

...

In this noncompliant example, the result of the call to the C standard library function getwc() is stored into a variable of type wchar_t, and is subsequently compared with WEOF:

Code Block
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>
 
enum { BUFFER_SIZE = 32 };

void g(void) {
  wchar_t buf[BUFFER_SIZE];
  wchar_t wc;
  size_t i = 0;
  
  while ((wc = getwc(stdin)) != L'\n' && wc != WEOF) {
    if (i < (BUFFER_SIZE - 1)) {
      buf[i++] = wc;
    }
  }
  
  buf[i] = L'\0';
}

This code suffers from two problems. First, the value returned by getwc() is immediately converted to wchar_t before being compared with WEOF. Second, there is no check to ensure that wint_t is wider than wchar_t. Both of these problems make it possible for an attacker to terminate the loop prematurely by supplying the wide-character value matching WEOF in the file.

...

This compliant solution declares wc to be a wint_t to match the integer type returned by getwc(). Furthermore, it does not rely on WEOF to determine end-of-file definitively.

Code Block
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>
 
enum { BUFFER_SIZE = 32 };

void g(void) {
  wchar_t buf[BUFFER_SIZE];
  wint_t wc;
  size_t i = 0;
  
  while ((wc = getwc(stdin)) != L'\n' && wc != WEOF) {
    if (i < BUFFER_SIZE - 1) {
      buf[i++] = wc;
    }
  }

  /* A newline ends the line normally; otherwise confirm that WEOF was genuine */
  if (wc == L'\n' || feof(stdin) || ferror(stdin)) {
    buf[i] = L'\0';
  } else {
    /* Received a wide character that resembles WEOF; handle error */
  }
}

Exceptions

FIO34-C-EX1: A number of C functions do not return characters but can return EOF as a status code. These functions include fclose(), fflush(), fputs(), fscanf(), puts(), scanf(), sscanf(), vfscanf(), and vscanf(). These return values can be compared to EOF without validating the result.
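As a brief sketch of this exception, the hypothetical helper below (write_line() is an illustrative name) compares the status codes returned by fputs() and fflush() directly with EOF. Because these functions never return character data, no follow-up call to feof() or ferror() is needed to interpret the result.

```c
#include <stdio.h>

/*
 * Hypothetical helper for illustration only.
 * Writes msg to stream and flushes it.
 * Returns 0 on success and -1 if either call reports failure.
 */
int write_line(FILE *stream, const char *msg) {
  /* fputs() returns EOF only as a status code, never as a character. */
  if (fputs(msg, stream) == EOF) {
    return -1;
  }

  /* fflush() likewise reports failure by returning EOF. */
  if (fflush(stream) == EOF) {
    return -1;
  }

  return 0;
}
```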

Risk Assessment

Incorrectly assuming characters from a file cannot match EOF or WEOF has resulted in significant vulnerabilities, including command injection attacks. (See the *CA-1996-22 advisory.)

| Rule | Severity | Likelihood | Detectable | Repairable | Priority | Level |
|---|---|---|---|---|---|---|
| FIO34-C | High | Probable | Yes | Yes | P18 | L1 |

...


Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| Astrée | | conversion_overflow, essential-type-assign | Soundly supported |
| Axivion Bauhaus Suite | | CertC-FIO34 | |
| CodeSonar | | LANG.CAST.COERCE | Coercion alters value |
| Compass/ROSE | | | |
| Coverity | 6.5 | CHAR_IO | Identifies defects when the return value of fgetc(), getc(), or getchar() is incorrectly assigned to a char instead of an int. Coverity Prevent cannot discover all violations of this rule, so further verification is necessary. |
| Cppcheck Premium | | premium-cert-fio34-c | |
| ECLAIR | 1.2 | CC2.FIO34 | Partially implemented |
| Fortify SCA | 5.0 | | |
| Helix QAC | | C2676, C2678; C++2676, C++2678, C++3001, C++3010, C++3051, C++3137, C++3717 | |
| Klocwork | | CWARN.CMPCHR.EOF | |
| LDRA tool suite | | 662 S | Fully implemented |
| Parasoft C/C++test | | CERT_C-FIO34-a | The macro EOF should be compared with the unmodified return value from the Standard Library function |
| Polyspace Bug Finder | | CERT C: Rule FIO34-C | Checks for character values absorbed into EOF (rule partially covered). Can detect violations of this rule with CERT C Rule Pack. |
| Splint | 3.1.1 | | |
| RuleChecker | | essential-type-assign | Supported |
 

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

Key here (explains table format and definitions)

CERT-CWE Mapping Notes

Key here for mapping notes

CWE-197 and FIO34-C

Independent( FLP34-C, INT31-C) FIO34-C = Subset( INT31-C)

Therefore: FIO34-C = Subset( CWE-197)

Bibliography

[Kettlewell 2002] Section 1.2, "<stdio.h> and Character Types"
[NIST 2006] SAMATE Reference Dataset Test Case ID 000-000-088
[Summit 2005] Question 12.2

 


