
The C Standard identifies the following distinct situations in which undefined behavior (UB) can arise as a result of invalid pointer operations:

| UB | Description | Example Code |
|----|-------------|--------------|
| 46 | Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object. | Forming Out-of-Bounds Pointer, Null Pointer Arithmetic |
| 47 | Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated. | Dereferencing Past the End Pointer, Using Past the End Index |
| 49 | An array subscript is out of range, even if an object is apparently accessible with the given subscript (for example, in the lvalue expression a[1][7] given the declaration int a[4][5]). | Apparently Accessible Out-of-Range Index |
| 62 | An attempt is made to access, or generate a pointer to just past, a flexible array member of a structure when the referenced object provides no elements for that array. | Pointer Past Flexible Array Member |
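
As a hedged illustration of undefined behavior 49: for int a[4][5], the expression a[1][7] would address the same storage as a[2][2] on a flat layout (1 * 5 + 7 = 12 = 2 * 5 + 2), yet the subscript 7 is out of range for a row of 5 elements. The following minimal sketch shows one way to reach an element from a flat position without ever using an out-of-range subscript. The names A_ROWS, A_COLS, element_at(), and element_at_selftest() are hypothetical and are not part of the C Standard or of this rule.

```c
#include <assert.h>
#include <stddef.h>

enum { A_ROWS = 4, A_COLS = 5 };  /* Hypothetical names for the bounds */

static int a[A_ROWS][A_COLS];

/*
 * Returns a pointer to the element at flat (row-major) position k,
 * or NULL if k is outside the array. Both subscripts stay in range.
 */
int *element_at(size_t k) {
  if (k >= (size_t)A_ROWS * A_COLS) {
    return NULL;
  }
  return &a[k / A_COLS][k % A_COLS];
}

/* Illustrative checks only. */
void element_at_selftest(void) {
  assert(element_at(12) == &a[2][2]);  /* Not a[1][7] */
  assert(element_at(19) == &a[3][4]);  /* Last element */
  assert(element_at(20) == NULL);      /* Out of range */
}
```

Splitting the flat position into a row and a column keeps each subscript within the bounds of its own dimension, which is what undefined behavior 49 requires.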

Noncompliant Code Example (Forming Out-of-Bounds Pointer)

In this noncompliant code example, the function f() attempts to validate the index before using it as an offset into the statically allocated table of integers. However, the function fails to reject negative index values. When index is less than zero, the addition expression in the return statement has undefined behavior 46. On some implementations, the addition alone can trigger a hardware trap. On others, the addition may produce a result that triggers a hardware trap when dereferenced. Still other implementations may produce a dereferenceable pointer that points to an object distinct from table. Using such a pointer to access the object may lead to information exposure or cause the wrong object to be modified.

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int *f(int index) {
  if (index < TABLESIZE) {
    return table + index;
  }
  return NULL;
}

Compliant Solution

One compliant solution is to detect and reject invalid values of index if using them in pointer arithmetic would result in an invalid pointer:

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int *f(int index) {
  if (index >= 0 && index < TABLESIZE) {
    return table + index;
  }
  return NULL;
}

Compliant Solution

Another slightly simpler and potentially more efficient compliant solution is to use an unsigned type to avoid having to check for negative values while still rejecting out-of-bounds positive values of index:

#include <stddef.h>
 
enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int *f(size_t index) {
  if (index < TABLESIZE) {
    return table + index;
  }
  return NULL;
}
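
The reason a single comparison suffices can be checked directly: when a caller passes a negative int, the conversion to size_t yields a very large value, which fails the index < TABLESIZE test. The sketch below repeats the function under the hypothetical name lookup() and exercises the boundary cases. The names lookup() and lookup_selftest() are illustrative assumptions, not part of the rule.

```c
#include <assert.h>
#include <stddef.h>

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

/* Same logic as f() above, under a hypothetical name. */
int *lookup(size_t index) {
  if (index < TABLESIZE) {
    return table + index;
  }
  return NULL;
}

/* Illustrative boundary checks only. */
void lookup_selftest(void) {
  int negative = -1;

  assert(lookup(0) == &table[0]);
  assert(lookup(TABLESIZE - 1) == &table[TABLESIZE - 1]);
  assert(lookup(TABLESIZE) == NULL);
  /* -1 converts to SIZE_MAX, which the bound check rejects. */
  assert(lookup((size_t)negative) == NULL);
}
```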

Noncompliant Code Example (Dereferencing Past-the-End Pointer)

This noncompliant code example shows the flawed logic in the Windows Distributed Component Object Model (DCOM) Remote Procedure Call (RPC) interface that was exploited by the W32.Blaster.Worm. The error is that the while loop in the GetMachineName() function (used to extract the host name from a longer string) is not sufficiently bounded. When the character array pointed to by pwszTemp does not contain the backslash character among the first MAX_COMPUTERNAME_LENGTH_FQDN + 1 elements, the final valid iteration of the loop will dereference a past-the-end pointer, resulting in exploitable undefined behavior 47. In this case, the actual exploit allowed the attacker to inject executable code into a running program. Economic damage from the Blaster worm has been estimated to be at least $525 million [Pethia 2003].

For a discussion of this programming error in the Common Weakness Enumeration database, see CWE-119, "Improper Restriction of Operations within the Bounds of a Memory Buffer," and CWE-121, "Stack-based Buffer Overflow" [MITRE 2013].

error_status_t _RemoteActivation(
      /* ... */, WCHAR *pwszObjectName, ... ) {
   *phr = GetServerPath(
              pwszObjectName, &pwszObjectName);
    /* ... */
}

HRESULT GetServerPath(
  WCHAR *pwszPath, WCHAR **pwszServerPath ){
  WCHAR *pwszFinalPath = pwszPath;
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1];
  hr = GetMachineName(pwszPath, wszMachineName);
  *pwszServerPath = pwszFinalPath;
}

HRESULT GetMachineName(
  WCHAR *pwszPath,
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1])
{
  pwszServerName = wszMachineName;
  LPWSTR pwszTemp = pwszPath + 2;
  while (*pwszTemp != L'\\')
    *pwszServerName++ = *pwszTemp++;
  /* ... */
}

Compliant Solution

In this compliant solution, the while loop in the GetMachineName() function is bounded so that the loop terminates when a backslash character is found, the null-terminating character (L'\0') is discovered, or the end of the buffer is reached. This code does not result in a buffer overflow even if no backslash character is found in the string referenced by pwszPath.

HRESULT GetMachineName(
  wchar_t *pwszPath,
  wchar_t wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1])
{
  wchar_t *pwszServerName = wszMachineName;
  wchar_t *pwszTemp = pwszPath + 2;
  wchar_t *end_addr
    = pwszServerName + MAX_COMPUTERNAME_LENGTH_FQDN;
  while ( (*pwszTemp != L'\\')
     &&  ((*pwszTemp != L'\0'))
     && (pwszServerName < end_addr) )
  {
    *pwszServerName++ = *pwszTemp++;
  }

  /* ... */
}

This compliant solution is for illustrative purposes and is not necessarily the solution implemented by Microsoft. This particular solution may not be correct because there is no guarantee that a backslash is found.
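
The same bounding idea can be tried outside the Windows API. The following is a minimal portable sketch, not the code used by Microsoft: DEMO_NAME_LEN is a hypothetical stand-in for MAX_COMPUTERNAME_LENGTH_FQDN, copy_machine_name() and its self-test are invented names, and the source pointer is assumed to be already positioned past any leading characters. Unlike the excerpt above, the sketch also null-terminates the destination and reports whether a backslash was actually found.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for MAX_COMPUTERNAME_LENGTH_FQDN. */
enum { DEMO_NAME_LEN = 8 };

/*
 * Copies characters from src into dst until a backslash, a null
 * character, or the capacity of dst is reached, then null-terminates.
 * Returns 1 if a backslash ended the copy and 0 otherwise.
 */
int copy_machine_name(const wchar_t *src,
                      wchar_t dst[DEMO_NAME_LEN + 1]) {
  wchar_t *out = dst;
  const wchar_t *end = dst + DEMO_NAME_LEN; /* Keep one slot for L'\0' */

  while (*src != L'\\' && *src != L'\0' && out < end) {
    *out++ = *src++;
  }
  *out = L'\0';
  return *src == L'\\';
}

/* Illustrative checks only. */
void copy_machine_name_selftest(void) {
  wchar_t name[DEMO_NAME_LEN + 1];

  assert(copy_machine_name(L"host\\share", name) == 1);
  assert(wcscmp(name, L"host") == 0);

  /* No backslash within the limit: the copy is truncated, not overrun. */
  assert(copy_machine_name(L"averyverylongname", name) == 0);
  assert(wcslen(name) == DEMO_NAME_LEN);
}
```

Returning a status lets the caller distinguish a complete host name from a truncated one, which addresses the case where no backslash is found.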

Noncompliant Code Example (Using Past-the-End Index)

Similar to the dereferencing-past-the-end-pointer error, the function insert_in_table() in this noncompliant code example uses an otherwise valid index to attempt to store a value in an element just past the end of an array.

First, the function incorrectly validates the index pos against the size of the buffer. When pos is initially equal to size, the function attempts to store value in a memory location just past the end of the buffer.

Second, when the index is greater than size, the function modifies size before growing the size of the buffer. If the call to realloc() fails to increase the size of the buffer, the next call to the function with a value of pos equal to or greater than the original value of size will again attempt to store value in a memory location just past the end of the buffer or beyond.

Third, the function violates INT30-C. Ensure that unsigned integer operations do not wrap, which could lead to wrapping when 1 is added to pos or when size is multiplied by the size of int.

For a discussion of this programming error in the Common Weakness Enumeration database, see CWE-122, "Heap-based Buffer Overflow," and CWE-129, "Improper Validation of Array Index" [MITRE 2013].

#include <stdlib.h>
 
static int *table = NULL;
static size_t size = 0;

int insert_in_table(size_t pos, int value) {
  if (size < pos) {
    int *tmp;
    size = pos + 1;
    tmp = (int *)realloc(table, sizeof(*table) * size);
    if (tmp == NULL) {
      return -1;   /* Failure */
    }
    table = tmp;
  }

  table[pos] = value;
  return 0;
}

Compliant Solution

This compliant solution correctly validates the index pos by using the <= relational operator, ensures the multiplication will not overflow, and avoids modifying size until it has verified that the call to realloc() was successful:

#include <stdint.h>
#include <stdlib.h>
 
static int *table = NULL;
static size_t size = 0;

int insert_in_table(size_t pos, int value) {
  if (size <= pos) {
    if ((SIZE_MAX - 1 < pos) ||
        ((pos + 1) > SIZE_MAX / sizeof(*table))) {
      return -1;
    }
 
    int *tmp = (int *)realloc(table, sizeof(*table) * (pos + 1));
    if (tmp == NULL) {
      return -1;
    }
    /* Modify size only after realloc() succeeds */
    size  = pos + 1;
    table = tmp;
  }

  table[pos] = value;
  return 0;
}
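
The two wrap checks in the compliant solution can be isolated in a small helper to make the arithmetic easier to inspect. This is a sketch under stated assumptions: index_fits() and index_fits_selftest() are hypothetical names, and elem_size is assumed to be nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if an array of (pos + 1) elements of elem_size bytes each
 * has a byte size representable in size_t, and 0 otherwise.
 * elem_size must be nonzero.
 */
int index_fits(size_t pos, size_t elem_size) {
  if (pos == SIZE_MAX) {
    return 0;  /* pos + 1 would wrap to 0 */
  }
  return (pos + 1) <= SIZE_MAX / elem_size;
}

/* Illustrative checks only. */
void index_fits_selftest(void) {
  assert(index_fits(0, sizeof(int)));
  assert(index_fits(99, sizeof(int)));
  assert(!index_fits(SIZE_MAX, sizeof(int)));
  /* One element too many for the multiplication to stay in range. */
  assert(!index_fits(SIZE_MAX / sizeof(int), sizeof(int)));
}
```

Dividing SIZE_MAX by the element size, rather than multiplying first, means the check itself cannot wrap.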

Noncompliant Code Example (Apparently Accessible Out-of-Range Index)

This noncompliant code example declares matrix to consist of 7 rows and 5 columns in row-major order. The function init_matrix iterates over all 35 elements in an attempt to initialize each to the value given by the function argument x. However, the loop bounds are swapped: the outer index i, which is used as the row subscript, is bounded by COLS, and the inner index j, which is used as the column subscript, is bounded by ROWS. When the value of j reaches the value COLS during the first iteration of the outer loop, the function attempts to access element matrix[0][5]. Because the type of matrix is int[7][5], the j subscript is out of range, and the access has undefined behavior 49.

#include <stddef.h>
#define COLS 5
#define ROWS 7
static int matrix[ROWS][COLS];

void init_matrix(int x) {
  for (size_t i = 0; i < COLS; i++) {
    for (size_t j = 0; j < ROWS; j++) {
      matrix[i][j] = x;
    }
  }
}

Compliant Solution

This compliant solution avoids using out-of-range indices by initializing matrix elements in the same row-major order as multidimensional objects are declared in C:

#include <stddef.h>
#define COLS 5
#define ROWS 7
static int matrix[ROWS][COLS];

void init_matrix(int x) {
  for (size_t i = 0; i < ROWS; i++) {
    for (size_t j = 0; j < COLS; j++) {
      matrix[i][j] = x;
    }
  }
}
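
One way to make swapped bounds less likely is to derive each loop bound from the array object it indexes instead of naming ROWS and COLS by hand. The following is a sketch of that idea; the ARRAY_LEN macro is a hypothetical helper and works only on array objects, not on pointers.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 5
#define ROWS 7
static int matrix[ROWS][COLS];

/* Number of elements in an array object (not valid for pointers). */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

void init_matrix(int x) {
  /* ARRAY_LEN(matrix) is the row count; ARRAY_LEN(matrix[i]) is the
     column count. sizeof does not evaluate matrix[i]. */
  for (size_t i = 0; i < ARRAY_LEN(matrix); i++) {
    for (size_t j = 0; j < ARRAY_LEN(matrix[i]); j++) {
      matrix[i][j] = x;
    }
  }
}
```

Because each bound is computed from the dimension that the subscript applies to, changing the declaration of matrix does not require touching the loops.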

Noncompliant Code Example (Pointer Past Flexible Array Member)

In this noncompliant code example, the function find() attempts to iterate over the elements of the flexible array member buf, starting with the second element. However, because function g() does not allocate any storage for the member, the expression first++ in find() attempts to form a pointer just past the end of buf when there are no elements. This attempt is undefined behavior 62. (See MSC21-C. Use robust loop termination conditions for more information.)

#include <stdlib.h>
 
struct S {
  size_t len;
  char buf[];  /* Flexible array member */
};

const char *find(const struct S *s, int c) {
  const char *first = s->buf;
  const char *last  = s->buf + s->len;

  while (first++ != last) { /* Undefined behavior */
    if (*first == (unsigned char)c) {
      return first;
    }
  }
  return NULL;
}
 
void g(void) {
  struct S *s = (struct S *)malloc(sizeof(struct S));
  if (s == NULL) {
    /* Handle error */
  }
  s->len = 0;
  find(s, 'a');
}

Compliant Solution

This compliant solution avoids incrementing the pointer unless a value past the pointer's current value is known to exist:

#include <stdlib.h>
 
struct S {
  size_t len;
  char buf[];  /* Flexible array member */
};

const char *find(const struct S *s, int c) {
  const char *first = s->buf;
  const char *last  = s->buf + s->len;

  /* Advance only while elements remain, and never dereference last */
  while (first != last && ++first != last) {
    if (*first == (unsigned char)c) {
      return first;
    }
  }
  return NULL;
}
 
void g(void) {
  struct S *s = (struct S *)malloc(sizeof(struct S));
  if (s == NULL) {
    /* Handle error */
  }
  s->len = 0;
  find(s, 'a');
}
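
When the flexible array member is meant to hold elements, the allocation must add room for them beyond sizeof(struct S). The sketch below shows one hedged way to do that; make_s() and make_s_selftest() are hypothetical names, and the size check guards the addition against wrapping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct S {
  size_t len;
  char buf[];  /* Flexible array member */
};

/*
 * Hypothetical helper: allocates a struct S whose buf holds a copy of
 * n bytes from data. Returns NULL on overflow or allocation failure.
 */
struct S *make_s(const char *data, size_t n) {
  struct S *s;

  if (n > SIZE_MAX - sizeof(struct S)) {
    return NULL;  /* sizeof(struct S) + n would wrap */
  }
  s = malloc(sizeof(struct S) + n);
  if (s == NULL) {
    return NULL;
  }
  s->len = n;
  if (n > 0) {
    memcpy(s->buf, data, n);
  }
  return s;
}

/* Illustrative checks only. */
void make_s_selftest(void) {
  struct S *s = make_s("abc", 3);
  assert(s != NULL);
  assert(s->len == 3);
  assert(memcmp(s->buf, "abc", 3) == 0);
  free(s);
}
```

Recording len at allocation time keeps the element count and the storage in agreement, so a pointer formed as s->buf + s->len never exceeds one past the last element.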

Noncompliant Code Example (Null Pointer Arithmetic)

This noncompliant code example is similar to an Adobe Flash Player vulnerability that was first exploited in 2008. This code allocates a block of memory and initializes it with some data. The data does not belong at the beginning of the block, which is left uninitialized. Instead, it is placed offset bytes within the block. The function ensures that the data fits within the allocated block. 

#include <string.h>
#include <stdlib.h>

char *init_block(size_t block_size, size_t offset,
                 char *data, size_t data_size) {
  char *buffer = malloc(block_size);
  if (data_size > block_size || block_size - data_size < offset) {
    /* Data won't fit in buffer, handle error */
  }
  memcpy(buffer + offset, data, data_size);
  return buffer;
}

This function fails to check if the allocation succeeds, which is a violation of ERR33-C. Detect and handle standard library errors. If the allocation fails, then malloc() returns a null pointer. The null pointer is added to offset and passed as the destination argument to memcpy(). Because a null pointer does not point to a valid object, the result of the pointer arithmetic is undefined behavior 46.

An attacker who can supply the arguments to this function can exploit it to execute arbitrary code. This can be accomplished by providing an overly large value for block_size, which causes malloc() to fail and return a null pointer. The offset argument will then serve as the destination address to the call to memcpy(). The attacker can specify the data and data_size arguments to provide the address and length of the address, respectively, that the attacker wishes to write into the memory referenced by offset. The overall result is that the call to memcpy() can be exploited by an attacker to overwrite an arbitrary memory location with an attacker-supplied address, typically resulting in arbitrary code execution.

Compliant Solution (Null Pointer Arithmetic)

This compliant solution ensures that the call to malloc() succeeds:

#include <string.h>
#include <stdlib.h>

char *init_block(size_t block_size, size_t offset,
                 char *data, size_t data_size) {
  char *buffer = malloc(block_size);
  if (NULL == buffer) {
    /* Handle error */
  }
  if (data_size > block_size || block_size - data_size < offset) {
    /* Data won't fit in buffer, handle error */
  }
  memcpy(buffer + offset, data, data_size);
  return buffer;
}
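
The error-handling placeholders above can be filled in several ways. The following sketch shows one possibility, assuming that returning NULL is an acceptable way to report both kinds of failure; the name init_block_checked() is hypothetical. It validates the layout before allocating so that nothing has to be freed on bad input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant of init_block() that returns NULL when the data
 * does not fit or when the allocation fails. The caller owns the
 * returned block and must free() it.
 */
char *init_block_checked(size_t block_size, size_t offset,
                         const char *data, size_t data_size) {
  char *buffer;

  /* Validate first, so no allocation is made for bad arguments. */
  if (data_size > block_size || block_size - data_size < offset) {
    return NULL;
  }
  buffer = malloc(block_size);
  if (buffer == NULL) {
    return NULL;  /* No pointer arithmetic on a null pointer */
  }
  if (data_size > 0) {
    memcpy(buffer + offset, data, data_size);
  }
  return buffer;
}

/* Illustrative checks only. */
void init_block_checked_selftest(void) {
  char *b = init_block_checked(16, 4, "abcd", 4);
  assert(b != NULL);
  assert(memcmp(b + 4, "abcd", 4) == 0);
  free(b);

  /* offset + data_size exceeds block_size */
  assert(init_block_checked(8, 6, "abcd", 4) == NULL);
}
```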

Risk Assessment

Writing to out-of-range pointers or array subscripts can result in a buffer overflow and the execution of arbitrary code with the permissions of the vulnerable process. Reading from out-of-range pointers or array subscripts can result in unintended information disclosure.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|------|----------|------------|------------------|----------|-------|
| ARR30-C | High | Likely | High | P9 | L2 |

Automated Detection

Tool

Version

Checker

Description

Astrée 17.04i

pointered-deallocation

array-index-range

null-dereferencing

Partially checked

CodeSonar 4.4

LANG.MEM.BO
LANG.MEM.BU

LANG.MEM.TBA

LANG.MEM.TO
LANG.MEM.TU

LANG.STRUCT.PBB
LANG.STRUCT.PPE

BADFUNC.BO.*

Buffer overrun
Buffer underrun

Tainted buffer access

Type overrun
Type underrun

Pointer before beginning of object
Pointer past end of object

A collection of warning classes that report uses of library functions prone to internal buffer overflows.

Compass/ROSE  

Could be configured to catch violations of this rule. The way to catch the noncompliant code example is to first hunt for example code that follows this pattern:

   for (LPWSTR pwszTemp = pwszPath + 2; *pwszTemp != L'\\'; *pwszTemp++)

In particular, the iteration variable is a pointer, it gets incremented, and the loop condition does not set an upper bound on the pointer. Once this case is handled, ROSE can handle cases like the real noncompliant code example, which has effectively the same semantics, just different syntax.

Coverity

2017.07

OVERRUN

NEGATIVE_RETURNS

ARRAY_VS_SINGLETON

BUFFER_SIZE

Can detect the access of memory past the end of a memory buffer/array

Can detect when the loop bound may become negative

Can detect the out-of-bound read/write to array allocated statically or dynamically

Can detect buffer overflows

Klocwork 2017

ABV.ANY_SIZE_ARRAY
ABV.GENERAL
ABV.STACK
ABV.TAINTED
ABV.UNICODE.BOUND_MAP
ABV.UNICODE.FAILED_MAP
ABV.UNICODE.NNTS_MAP
ABV.UNICODE.SELF_MAP
ABV.UNKNOWN_SIZE
NNTS.MIGHT
NNTS.MUST
NNTS.TAINTED
SV.STRBO.BOUND_COPY.OVERFLOW
SV.STRBO.BOUND_COPY.UNTERM
SV.STRBO.BOUND_SPRINTF
SV.TAINTED.ALLOC_SIZE
SV.TAINTED.CALL.INDEX_ACCESS
SV.TAINTED.CALL.LOOP_BOUND
SV.TAINTED.INDEX_ACCESS
SV.TAINTED.LOOP_BOUND
SV.UNBOUND_STRING_INPUT.CIN

SV.UNBOUND_STRING_INPUT.FUNC

 
LDRA tool suite 9.7.1

45 D, 47 S, 476 S, 489 S, 64 X, 66 X, 68 X, 69 X, 70 X, 71 X, 79 X

Partially implemented

Parasoft C/C++test 9.5

BD-PB-ARRAY

Partially implemented

Polyspace Bug Finder R2016a

Array access out of bounds, Array access with tainted index, Pointer access out of bounds, Pointer dereference with tainted offset, Use of tainted pointer

Array index outside bounds during array access

Array index from unsecure source possibly outside array bounds

Pointer dereferenced outside its bounds

Offset is from an unsecure source and dereference may be out of bounds

Pointer from an unsecure source may be NULL or point to unknown memory

PRQA QA-C 9.3

2840, 2841, 2842, 2843, 2844, 2930, 2931, 2932, 2933, 2934, 2950,
2951, 2952, 2953

Partially implemented

PRQA QA-C++ 4.1

2820, 2821, 2822, 2823, 2824, 2840, 2841, 2842, 2843, 2844, 2930,
2931, 2932, 2950, 2951, 2952

Partially implemented

Cppcheck 1.66

arrayIndexOutOfBounds, outOfBounds, negativeIndex, arrayIndexThenCheck, arrayIndexOutOfBoundsCond, possibleBufferAccessOutOfBounds

Context sensitive analysis of array index, pointers, etc.

Array index out of bounds

Buffer overflow when calling various functions memset,strcpy,..

Warns about condition (a[i] == 0 && i < unknown_value) and recommends that (i < unknown_value && a[i] == 0) is used instead

Detects unsafe code when array is accessed before/after it is tested if the array index is out of bounds

Related Vulnerabilities

CVE-2008-1517 results from a violation of this rule. Before Mac OS X version 10.5.7, the XNU kernel accessed an array at an unverified user-input index, allowing an attacker to execute arbitrary code by passing an index greater than the length of the array and therefore accessing memory outside the array [xorl 2009].

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

ISO/IEC TR 24772:2013

Arithmetic Wrap-Around Error [FIF]
Unchecked Array Indexing [XYZ]

ISO/IEC TS 17961

Forming or using out-of-bounds pointers or array subscripts [invptr]

MITRE CWE

CWE-119, Improper Restriction of Operations within the Bounds of a Memory Buffer
CWE-122, Heap-based Buffer Overflow
CWE-123, Write-what-where Condition
CWE-125, Out-of-bounds Read
CWE-129, Improper Validation of Array Index
CWE-788, Access of Memory Location after End of Buffer

MISRA C:2012

Rule 18.1 (required)

Bibliography

[Finlay 2003]
[Microsoft 2003]
[Pethia 2003]
[Seacord 2013b] Chapter 1, "Running with Scissors"
[Viega 2005] Section 5.2.13, "Unchecked Array Indexing"
[xorl 2009] "CVE-2008-1517: Apple Mac OS X (XNU) Missing Array Index Validation"

 


6 Comments

  1. The Compass/ROSE entry on this page is verbose and weird, relative to the other tool entries.

  2. The "Compliant Solution (Using Past-the-End Index)" still violates "INT30-C. Ensure that unsigned integer operations do not wrap": if pos is exactly SIZE_MAX, (pos+1) will wrap to 0, and realloc(table,0) will be called. What happens next is implementation-defined ("MEM04-C. Beware of zero-length allocations"):

    • If realloc(table,0) frees table and returns NULL, this will violate "MEM30-C. Do not access freed memory" the next time insert_in_table() is called (because the static variables table and size are not updated).
    • If realloc(table,0) returns a 0-sized chunk of memory, table and size are correctly updated (size becomes 0), but there is an immediate out-of-bounds write when value is stored to table[pos] (pos is SIZE_MAX).
    • If realloc(table,0) returns NULL as an error, but does not free table, nothing bad will happen because the static variables table and size are left untouched (does such an implementation actually exist?).
    1. You are correct, this was still missing a check for the case where pos == SIZE_MAX. I've corrected now. Thank you!

  3. The "Null Pointer Arithmetic" examples (both Noncompliant and Compliant) still violate "ARR30-C. Do not form or use out-of-bounds pointers or array subscripts": the block_size - data_size > offset check should be block_size - data_size < offset. Right now, the memcpy() can overflow buffer because the check guarantees that offset + data_size >= block_size.

    1. Thank you for pointing this out, I've corrected the mistake.

  4. Great, thanks to all of you for your hard work!