Signed character data must be converted to unsigned char before being assigned or converted to a larger signed type. Because compilers have the latitude to define char to have the same range, representation, and behavior as either signed char or unsigned char, this rule should be applied to both signed char and (plain) char characters.

This rule is only applicable in cases where the character data may contain values that can be interpreted as negative values. For example, if the char type is represented by a two's complement 8-bit value, any character value greater than +127 is interpreted as a negative value.
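The difference between the two conversion paths can be sketched in a few lines. This is an illustrative fragment only: the helper names widen_direct and widen_via_uchar are hypothetical, signed char is used so the behavior does not depend on how the implementation defines plain char, and an 8-bit char (UCHAR_MAX of 255) is assumed.

```c
/* Hypothetical helper: widen directly, so a negative value is sign-extended. */
int widen_direct(signed char ch) {
  return ch;
}

/* Hypothetical helper: convert to unsigned char first, then widen.
   Assumes an 8-bit char, so the result is always in the range 0..255. */
int widen_via_uchar(signed char ch) {
  return (unsigned char)ch;
}
```

With these helpers, the byte 0xC8 (stored in a signed char as -56) widens to -56 on the direct path and to 200 on the unsigned char path, while values in the range 0 to 127 are the same on both paths.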

This rule is a generalization of rule STR37-C. Arguments to character handling functions must be representable as an unsigned char.

Noncompliant Code Example

This noncompliant code example is taken from a vulnerability in bash versions 1.14.6 and earlier that resulted in the release of CERT Advisory CA-1996-22. This vulnerability resulted from the sign extension of character data referenced by the string pointer in the yy_string_get() function in the parse.y module of the bash source code:

static int yy_string_get() {
  register char *string;
  register int c;

  string = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist, or is empty, EOF found. */
  if (string && *string) {
    c = *string++;
    bash_input.location.string = string;
  }
  return (c);
}

The string variable is used to traverse the character string containing the command line to be parsed. As characters are retrieved from this pointer, they are stored in a variable of type int. For compilers in which the char type defaults to signed char, this value is sign-extended when assigned to the int variable. For character code 255 decimal (-1 in two's complement form), this sign extension results in the value -1 being assigned to the integer, which is indistinguishable from EOF.

This problem was repaired by explicitly declaring the string variable as unsigned char.

static int yy_string_get() {
  register unsigned char *string;
  register int c;

  string = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist, or is empty, EOF found. */
  if (string && *string) {
    c = *string++;
    bash_input.location.string = string;
  }
  return (c);
}

This solution, however, is in violation of recommendation STR04-C. Use plain char for characters in the basic character set.

Compliant Solution

In this compliant solution, the result of the expression *string++ is cast to (unsigned char) before assignment to the int variable c.

static int yy_string_get() {
  register char *string;
  register int c;

  string = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist, or is empty, EOF found. */
  if (string && *string) {
    /* cast to unsigned type */
    c = (unsigned char)*string++;

    bash_input.location.string = string;
  }
  return (c);
}
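The effect of the cast can be observed in isolation with a small sketch. The two reader functions below are hypothetical stand-ins modeled on the pattern above, not the bash source; signed char is used so the result does not depend on the implementation's choice for plain char, and EOF is assumed to be -1, as it is on common implementations.

```c
#include <stdio.h>  /* EOF */

/* Hypothetical reader: returns the next character, or EOF at the end of
   the string. The byte 0xFF is sign-extended to -1. */
int next_char_sign_extended(const signed char **cursor) {
  int c = EOF;
  if (*cursor && **cursor) {
    c = *(*cursor)++;
  }
  return c;
}

/* Hypothetical reader: same logic, but the character is converted to
   unsigned char before it is widened to int. */
int next_char_as_uchar(const signed char **cursor) {
  int c = EOF;
  if (*cursor && **cursor) {
    c = (unsigned char)*(*cursor)++;
  }
  return c;
}
```

Given the bytes 'a', 0xFF, 'b', the first reader reports EOF at the second byte even though a character remains, whereas the second reader returns 255 there and reports EOF only at the terminating null character.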

Noncompliant Code Example

In this noncompliant code example, casting *s to unsigned int does not help: on implementations where char is signed, a negative value is first promoted to int with its sign preserved and then converted to unsigned int, producing a value far in excess of UCHAR_MAX. The function consequently violates VOID Guarantee that array indices are within the valid range, leading to undefined behavior.

#include <limits.h>

static const char table[UCHAR_MAX + 1] = { /* ... */ };

int first_not_in_table(const char *str) {
  const char *s = str;
  for (; *s; ++s) {
    /* Noncompliant: a negative *s converts to a huge unsigned index. */
    if (table[(unsigned)*s] != *s)
      return s - str;
  }
  return -1;
}
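The size of the resulting index can be checked with a short sketch. The helper names index_via_unsigned and index_via_uchar are hypothetical, signed char is used to make the result independent of the implementation's plain char, and an 8-bit char is assumed.

```c
#include <limits.h>

/* Hypothetical helper: the cast used in the noncompliant example.
   A negative ch is promoted to int, then wraps modulo UINT_MAX + 1. */
unsigned index_via_unsigned(signed char ch) {
  return (unsigned)ch;
}

/* Hypothetical helper: converting to unsigned char first keeps the
   index in the range 0..UCHAR_MAX. */
unsigned index_via_uchar(signed char ch) {
  return (unsigned char)ch;
}
```

For the value -56, the first helper yields UINT_MAX - 55, which is well beyond the end of a table indexed by character value, while the second yields 200.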

Compliant Solution

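This compliant solution casts the char value to unsigned char before it is implicitly promoted to a larger type, so the index is always in the range 0 to UCHAR_MAX. The return type is also changed to ptrdiff_t, the type of the pointer difference s - str.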

#include <limits.h>
#include <stddef.h>

static const char table[UCHAR_MAX + 1] = { /* ... */ };

ptrdiff_t first_not_in_table(const char *str) {
  const char *s = str;
  for (; *s; ++s) {
    /* The index is always within 0..UCHAR_MAX. */
    if (table[(unsigned char)*s] != *s)
      return s - str;
  }
  return -1;
}

Risk Assessment

This is a subtle error that results in a disturbingly broad range of potentially severe vulnerabilities.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| STR34-C | medium | probable | medium | P8 | L2 |

Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| | 9.7.1 | 434 S | Fully Implemented |
| Fortify SCA | V. 5.0 | | Can detect violations of this rule with CERT C Rule Pack. |
| Compass/ROSE | | | Can detect violations of this rule when checking for violations of guideline INT07-C. Use only explicitly signed or unsigned char type for numeric values. |
| GCC | 2.95 and later | -Wchar-subscripts | Detects objects of type char used as array indices. |
| | 1.2 | charcast | Fully Implemented |

Related Vulnerabilities

CVE-2009-0887 results from a violation of this rule. In Linux PAM (up to version 1.0.3), the libpam implementation of strtok casts a (potentially signed) character to an integer, for use as an index to an array. An attacker can exploit this by inputting a string with non-ASCII characters, causing the cast to result in a negative index and accessing memory outside of the array [xorl 2009].
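The safe form of this kind of table lookup can be sketched as follows. This is not the libpam code: build_delim_table and is_delim are hypothetical helpers that only illustrate indexing a delimiter table through unsigned char, so that non-ASCII bytes cannot produce a negative index.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: mark every byte of delims in a lookup table
   that has one entry per possible unsigned char value. */
void build_delim_table(unsigned char table[UCHAR_MAX + 1],
                       const char *delims) {
  memset(table, 0, UCHAR_MAX + 1);
  for (; *delims; ++delims) {
    table[(unsigned char)*delims] = 1;
  }
}

/* Hypothetical helper: nonzero if ch is a delimiter. The cast keeps the
   index within 0..UCHAR_MAX even when plain char is signed. */
int is_delim(const unsigned char table[UCHAR_MAX + 1], char ch) {
  return table[(unsigned char)ch];
}
```

Under these assumptions, a delimiter set containing the byte 0xE9 is handled the same way whether plain char is signed or unsigned.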

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

CERT C++ Secure Coding Standard: STR34-CPP. Cast characters to unsigned types before converting to larger integer sizes

ISO/IEC 9899:1999 Section 6.2.5, "Types"
MISRA Rule 6.1, "The plain char type shall be used only for the storage and use of character values."

MITRE CWE: CWE-704, "Incorrect Type Conversion or Cast"

Bibliography

[xorl 2009] "CVE-2009-0887: Linux-PAM Signedness Issue"

