The incorrect use of arrays has traditionally been a source of exploitable vulnerabilities. Elements referenced within an array using the subscript operator \[\] are not checked unless the programmer provides adequate bounds checking. As a result, the expression {{array \[pos\] = value}} can be used by an attacker to transfer control to arbitrary code.

If the attacker can control the values of both {{pos}} and {{value}} in the expression {{array \[pos\] = value}}, he or she can perform an arbitrary write (overwrite other storage locations with contents of his or her choice).  The consequences range from changing a variable used to determine what permissions the program grants to executing arbitrary code with the permissions of the vulnerable process.  Arrays are also a common source of buffer overflows when iterators exceed the bounds of the array.
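
One way to keep {{pos}} from reaching storage outside the array is to validate it before the write. The following is a minimal sketch rather than a complete defense; the function name {{store_checked}} and its return convention are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: stores value at array[pos] only when pos is a
 * valid index. Returns 0 on success and -1 if the write is rejected.
 */
int store_checked(int *array, size_t len, size_t pos, int value) {
  if (array == NULL || pos >= len) {
    return -1; /* Reject the write instead of corrupting memory */
  }
  array[pos] = value;
  return 0;
}
```

Because {{pos}} has the unsigned type {{size_t}}, a negative value converted to it becomes very large and fails the same check.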

An array is a series of objects, all of which are the same size and type. Each object in an array is called an _array element_. The entire array is stored contiguously in memory (that is, there are no gaps between elements). Arrays are commonly used to represent a sequence of elements where random access is important but there is little or no need to insert new elements into the sequence (which can be an expensive operation with arrays).

Arrays containing a constant number of elements can be declared as follows:
{code}
enum { ARRAY_SIZE = 12 };
int dis[ARRAY_SIZE];
{code}
These statements allocate storage for an array of 12 integers referenced by {{dis}}. Arrays are indexed from {{0..n-1}} (where {{n}} represents an array bound). Arrays can also be declared as follows:
{code}
int ita[];
{code}
This is called an _incomplete type_ because the size is unknown. If an array of unknown size is initialized, its size is determined by the largest indexed element with an explicit initializer. At the end of its initializer list, the array no longer has incomplete type.
{code}
int ita[] = { 1, 2 };
{code}
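The effect of the initializer on the size of the array can be observed with {{sizeof}}. This sketch assumes C99 designated initializers; the names {{two}}, {{sparse}}, and {{ARRAY_COUNT}} are chosen here for illustration only.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer) */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

int two[] = { 1, 2 };        /* Completed with 2 elements */
int sparse[] = { [5] = 1 };  /* Largest initialized index is 5: 6 elements */

size_t two_count(void) { return ARRAY_COUNT(two); }
size_t sparse_count(void) { return ARRAY_COUNT(sparse); }
```

Elements of {{sparse}} without an explicit initializer are set to zero.
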
While these declarations work when the size of the array is known at compilation time, an array cannot be declared in this fashion when its size can be determined only at runtime. The C99 standard adds support for variable-length arrays, that is, arrays whose size is determined at runtime. Before the introduction of variable-length arrays in C99, however, these "arrays" were typically implemented as pointers to their respective element types allocated using {{malloc()}}, as shown in this example:
{code}
int *dat = (int *)malloc(ARRAY_SIZE * sizeof(int));
{code}
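For comparison, a C99 variable-length array might look like the following sketch. The function {{sum_of_squares}} is hypothetical, and the code assumes a compiler that supports variable-length arrays (they became an optional feature in C11).

```c
#include <stddef.h>

/*
 * Hypothetical example: sums the squares of 0..n-1 using a
 * variable-length array whose size is known only at runtime.
 * The caller is assumed to pass a small, nonzero n, because a VLA
 * with automatic storage cannot report allocation failure.
 */
int sum_of_squares(size_t n) {
  int vla[n]; /* Size determined at runtime (C99) */
  int sum = 0;
  for (size_t i = 0; i < n; i++) {
    vla[i] = (int)(i * i);
  }
  for (size_t i = 0; i < n; i++) {
    sum += vla[i];
  }
  return sum;
}
```

Unlike storage obtained from {{malloc()}}, the storage for {{vla}} is released automatically when the function returns.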

It is important to retain any pointer value returned by {{malloc()}} so that the referenced memory may eventually be deallocated. One possible way of preserving such a value is to declare the pointer itself {{const}} so that it cannot be modified after initialization.
{code}
int * const dat = (int * const)malloc(
  ARRAY_SIZE * sizeof(int)
);
/* ... */
free(dat);
/* dat is const-qualified, so it cannot be reassigned (for example, to NULL) */
{code}
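
A fuller sketch of the allocate, initialize, and free sequence follows. It adds checks that the fragments above omit: the size computation is guarded against wraparound, and the result of {{malloc()}} is tested. The function name {{make_filled}} is an assumption for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocates an array of n ints and sets each
 * element to value. Returns NULL on failure. The caller owns the
 * result and must eventually pass it to free().
 */
int *make_filled(size_t n, int value) {
  if (n == 0 || n > SIZE_MAX / sizeof(int)) {
    return NULL; /* Reject empty or overflowing requests */
  }
  int * const dat = malloc(n * sizeof(int));
  if (dat == NULL) {
    return NULL; /* Allocation failed */
  }
  for (size_t i = 0; i < n; i++) {
    dat[i] = value; /* dat itself is never modified */
  }
  return dat;
}
```

A caller would test the result against {{NULL}}, use the array, and then call {{free()}} on the same pointer value.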

Both {{dis}} and {{dat}} arrays can be initialized as follows:
{code}
for (i = 0; i < ARRAY_SIZE; i++) {
   dis[i] = 42; /* Assigns 42 to each element */
   /* ... */
}
{code}
The {{dat}} array can also be initialized as follows:
{code}
for (i = 0; i < ARRAY_SIZE; i++) {
   *dat = 42;
   dat++;
}
dat -= ARRAY_SIZE;
{code}
The {{dis}} identifier cannot be incremented, so the expression {{dis+\+}} results in a fatal compilation error. Both arrays can be initialized as follows:
{code}
int *p = dis;
for (i = 0; i < ARRAY_SIZE; i++)  {
  *p = 42; /* Assigns 42 to each element */
  p++;
}
{code}
The variable {{p}} is declared as a pointer to an integer and then incremented in the loop. This technique can be used to initialize both arrays and is a better style of programming than incrementing the pointer to the array because it does not change the pointer to the start of the array.
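
The same technique can be packaged as a function that works for either kind of array. This is a sketch only; the name {{fill}} and its parameters are assumptions, not part of the original example.

```c
#include <stddef.h>

/*
 * Hypothetical helper: sets each of the n elements starting at
 * start to value by advancing a separate cursor pointer.
 */
void fill(int *start, size_t n, int value) {
  int *p = start; /* Cursor; start itself is left unchanged */
  for (size_t i = 0; i < n; i++) {
    *p = value;
    p++;
  }
}
```

With the declarations shown earlier, both {{fill(dis, ARRAY_SIZE, 42)}} and {{fill(dat, ARRAY_SIZE, 42)}} would be valid calls, because {{dis}} is converted to a pointer to its first element when passed as an argument.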

Array subscripts {{\[\]}} and pointers are closely related. The expression {{dis\[i\]}} is equivalent to {{\*(dis+i)}}. In other words, if {{dis}} is an array object (equivalently, a pointer to the initial element of an array object) and {{i}} is an integer, {{dis\[i\]}} designates the {{i{}}}th element of {{dis}} (counting from zero). In fact, because {{\*(dis+i)}} can be expressed as {{\*(i+dis)}}, the expression {{dis\[i\]}} can also be written as {{i\[dis\]}}, although doing so is not encouraged.
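
These equivalences can be checked directly. The function below is a hypothetical sketch; it assumes that the caller passes an index that is valid for the array.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the three spellings of element access yield the
 * same value and the subscript designates the same object as the
 * pointer arithmetic. The caller must ensure i is a valid index.
 */
bool same_element(const int *arr, size_t i) {
  return arr[i] == *(arr + i)
      && arr[i] == i[arr]      /* Legal but discouraged */
      && &arr[i] == arr + i;   /* Same address */
}
```

For any valid index, {{same_element}} returns {{true}}, because the subscript expression is defined in terms of the pointer addition.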

The initial element of an array is accessed using an index of zero; for example, {{dat\[0\]}} references the first element of the {{dat}} array. The {{dat}} identifier points to the start of the array, so adding zero is inconsequential: when {{i}} is zero, {{\*(dat+i)}} is equivalent to {{\*(dat+0)}}, which is equivalent to {{\*(dat)}}.

h2. Risk Assessment

Arrays are a common source of vulnerabilities in C language programs because they are frequently used but not always fully understood.
|| Recommendation || Severity || Likelihood || Remediation Cost || Priority || Level ||
| ARR00-C | high | probable | high | {color:#cc9900}{*}P6{*}{color} | {color:#cc9900}{*}L2{*}{color} |

h3. Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the [CERT website|https://www.kb.cert.org/vulnotes/bymetric?searchview&query=FIELD+KEYWORDS+contains+ARR00-C].

h2. References

\[[ISO/IEC 9899:1999|AA. C References#ISO/IEC 9899-1999]\] Section 6.7.5.2, "Array declarators"
\[[MITRE 07|AA. C References#MITRE 07]\] [CWE ID 119|http://cwe.mitre.org/data/definitions/119.html], "Failure to Constrain Operations within the Bounds of an Allocated Memory Buffer," and [CWE ID 129|http://cwe.mitre.org/data/definitions/129.html], "Unchecked Array Indexing"
