Do not add or subtract an integer to a pointer if the resulting value does not refer to an element within the array (or to the nonexistent element just after the last element of the array). According to the C++ Standard ISO/IEC 14882-2003, section 5.7:
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
If the pointer resulting from the addition (or subtraction) is outside of the bounds of the array, an overflow has occurred and the result is undefined.
Likewise, do not add or subtract an integer to an iterator if the resulting value does not refer to an element within the container (or to the nonexistent element just after the last element in the container). The C++ Standard, section 24.1, paragraph 5, says:
Just as a regular pointer to an array guarantees that there is a pointer value pointing past the last element of the array, so for any iterator type there is an iterator value that points past the last element of a corresponding container. These values are called past-the-end values. Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable.
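The valid range described above can be expressed as a check on element positions: for an array or container of count elements, positions 0 through count - 1 are elements, and position count is the past-the-end value. The following is a minimal sketch under that reading; offset_stays_in_range is a hypothetical helper introduced here for illustration, not something defined by the standard or by this rule.

```cpp
#include <cstddef>

// Hypothetical helper, not part of the standard library or of this rule.
// Returns true when moving `delta` elements from position `pos` stays within
// [0, count]: either a real element or the one-past-the-end position.
bool offset_stays_in_range(std::size_t count, std::size_t pos,
                           std::ptrdiff_t delta) {
    if (pos > count) {
        return false;  // the starting position is already invalid
    }
    if (delta >= 0) {
        // Forward move: compare against the distance left to one past the end.
        return static_cast<std::size_t>(delta) <= count - pos;
    }
    // Backward move: compute |delta| without negating the most negative value.
    const std::size_t back = static_cast<std::size_t>(-(delta + 1)) + 1;
    return back <= pos;
}
```

A caller would perform the check first and form the pointer or iterator (for example, base + pos + delta) only when it returns true, so that no out-of-range value is ever computed.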
Noncompliant Code Example (Arrays)
In this noncompliant code example, a pointer is set to reference the start of an array. Array elements are accessed sequentially within the for loop. The array pointer ip is incremented on each iteration.
int ar[20];
for (int *ip = &ar[0]; ip < &ar[21]; ip++) {
*ip = 0;
}
C++2003 guarantees that it is permissible to use the address of ar[20] even though no such element exists. However, in this noncompliant code example, the bound of the array is incorrectly specified, and consequently the reference to &ar[21] constitutes undefined behavior. On the final iteration of the loop, ip equals &ar[20], so the assignment *ip = 0 writes past the end of the array, and the expression ip++ (which adds 1 to ip) also overflows.
This code also suffers from using "magic numbers," described in DCL06-CPP. Use meaningful symbolic constants to represent literal values in program logic. When replacing the numbers with constants, a developer is likely to catch the invalid array bounds in the for statement.
Compliant Solution (Arrays)
This compliant solution fixes the problem from the previous noncompliant code example by using the common idiom sizeof(ar)/sizeof(ar[0]) to determine the actual number of elements in the array. This idiom works only when the definition of the array is visible (see [ARR01-CPP. Do not apply the sizeof operator to a pointer when taking the size of an array]).
int ar[20];
for (int *ip = &ar[0]; ip < &ar[sizeof(ar)/sizeof(ar[0])]; ip++) {
*ip = 0;
}
C++2003 guarantees that it is permissible to use the address of ar[sizeof(ar)/sizeof(ar[0])] even though no such element exists. This allows you to use it for checks in loops like the one in this Compliant Solution. The guarantee extends only to one element beyond the end of an array and no further [Banahan 03].
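One possible variation, sketched here as an assumption rather than as part of the rule, lets the compiler deduce the element count from the array type. The function zero_fill is a hypothetical name. Because the parameter is a reference to an array, passing a plain pointer fails to compile, which guards against the sizeof pitfall mentioned above.

```cpp
#include <cstddef>

// Hypothetical helper: N is deduced from the array type, so the loop bound
// cannot drift out of sync with the array's declaration.
template <std::size_t N>
void zero_fill(int (&ar)[N]) {
    // ar + N is the one-past-the-end address. It is compared against but
    // never dereferenced.
    for (int* ip = ar; ip != ar + N; ++ip) {
        *ip = 0;
    }
}
```

With int ar[20]; declared in scope, the call zero_fill(ar); visits exactly the 20 elements and no literal bound appears in the loop.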
Noncompliant Code Example (Vectors)
In this noncompliant code example, an iterator is set to reference the beginning of a vector. Vector elements are accessed sequentially within the for loop. The iterator ip is incremented on each iteration.
vector<int> ar( 20, 0);
vector<int>::iterator ip = ar.begin();
for (int i = 1; i <= 22; i++) {
*ip++ = 1;
}
C++2003 guarantees that it is permissible for the iterator to refer to the 21st element of the vector, even though the vector has only 20 elements. However, the loop runs 22 times. The 21st iteration dereferences that past-the-end iterator, which is undefined, and then increments the iterator to refer to the 22nd element of the vector, which is also undefined. The 22nd iteration repeats both errors.
This code also suffers from using "magic numbers," described in DCL06-CPP. Use meaningful symbolic constants to represent literal values in program logic. When replacing the numbers with constants, a developer is likely to catch the invalid bounds in the for statement.
Compliant Solution (Vectors)
This compliant solution fixes the problem from the previous noncompliant code example by using the ranges ar.begin() and ar.end() to determine how many iterations should be executed.
vector<int> ar( 20, 0);
for (vector<int>::iterator ip = ar.begin(); ip < ar.end(); ip++) {
*ip = 1;
}
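Because the vector::end() member function returns an iterator pointing one element past the end of the vector, it functions as a useful loop terminator. The iterator is incremented only in the for statement; incrementing it again in the loop body would skip elements and could step over ar.end().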
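When an iterator must move by more than one position, the same bound can be checked before the arithmetic is performed. The sketch below assumes a std::vector<int> and an iterator that already refers into that vector; try_advance is a hypothetical helper, not a standard library function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: advances `it` by n positions only if the result is an
// element of v or v.end(). Assumes `it` already refers into v.
bool try_advance(const std::vector<int>& v,
                 std::vector<int>::const_iterator& it, std::size_t n) {
    // Distance from the current position to the past-the-end iterator.
    const std::size_t remaining = static_cast<std::size_t>(v.end() - it);
    if (n > remaining) {
        return false;  // it + n would go beyond v.end(); leave `it` unchanged
    }
    it += static_cast<std::vector<int>::difference_type>(n);
    return true;
}
```

The comparison is made on the distance rather than on it + n, so an out-of-range iterator is never formed.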
Noncompliant Code Example (Linear Address Space)
Pointer arithmetic can result in undefined behavior if the pointer operand and the resulting pointer do not refer to the same array object (or one past the last element of the array object). The standard gives compiler implementations broad latitude in how they deal with undefined behavior (see MSC15-CPP. Do not depend on undefined behavior), including ignoring the situation completely with unpredictable results.
In this noncompliant code example, the programmer is trying to determine if a pointer added to a length will wrap around the end of memory.
char *buf;
size_t len = 1 << 30;
/* Check for overflow */
if (buf + len < buf) {
len = -(size_t)buf-1;
}
This code resembles the test for wraparound from the sprint() function as implemented for the Plan 9 operating system. If buf + len < buf evaluates to true, len is assigned the remaining space minus 1 byte. However, because the expression buf + len < buf constitutes undefined behavior, compilers can assume this condition will never occur and optimize out the entire conditional statement. In gcc versions 4.2 and later, for example, code that performs checks for wrapping that depend on undefined behavior (such as the code in this noncompliant code example) is optimized away; no object code to perform the check appears in the resulting executable program [[VU#162289]]. This is of special concern because it often results in the silent elimination of code that was inserted to provide a safety or security check. For gcc version 4.2.4 and later, this optimization may be disabled with the -fno-strict-overflow option.
Compliant Solution (Linear Address Space)
In this compliant solution, both references to buf are cast to size_t. Because size_t is an unsigned type, C++2003 guarantees that it has modulo behavior.
char *buf;
size_t len = 1 << 30;
/* Check for overflow */
if ((size_t)buf+len < (size_t)buf) {
len = -(size_t)buf-1;
}
This compliant solution works on architectures that provide a linear address space. Some word-oriented machines are likely to produce a word address with the high-order bits used as a byte selector, in which case this solution will fail. Consequently, this is not a portable solution.
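Where the end of the buffer is known, a different technique avoids the wraparound test altogether: compare the requested length against the space that remains instead of adding it to the pointer. This is a sketch of that alternative, not the method used above. The helper clamp_to_remaining and its end parameter are assumptions introduced for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: `cur` and `end` must point into (or one past the end
// of) the same buffer, with cur <= end. Returns len limited to the number of
// bytes between cur and end.
std::size_t clamp_to_remaining(const char* cur, const char* end,
                               std::size_t len) {
    // Subtracting two pointers into the same array is well defined.
    const std::size_t remaining = static_cast<std::size_t>(end - cur);
    return len < remaining ? len : remaining;
}
```

Because no pointer outside the buffer is ever computed, this form does not depend on a linear address space.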
Noncompliant Code Example (Pointer Addition)
Another interesting case is shown in this noncompliant code example. The expression buf + n may wrap for large values of n, resulting in undefined behavior.
int process_array(char *buf, size_t n) {
return buf + n < buf + 100;
}
This is an example of how optimization may actually help improve security. When compiled using GCC 4.3.0 with the -O2 option, for example, the expression buf + n < buf + 100 is optimized to n < 100, eliminating the possibility of wrapping. This code example is still noncompliant, because it is not safe to rely on compiler optimizations for security.
Compliant Solution (Pointer Addition)
In this compliant solution, the "optimization" is performed by hand.
int process_array(char *buf, size_t n) {
return n < 100;
}
Risk Assessment
If adding or subtracting an integer to a pointer yields a value that refers neither to an element of the array nor to one past the last element of the array object, the behavior is undefined. In practice, this frequently leads to a buffer overflow or buffer underrun, which can often be exploited to run arbitrary code. Iterators and STL containers exhibit the same behavior and caveats as pointers and arrays.
| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| ARR38-CPP | high | likely | medium | P18 | L1 |
Automated Detection
Compass/ROSE could detect violations of this rule. At a minimum, it can catch the following NCCEs:
- The array NCCE can be caught by the invalid array reference, since the index is a compile-time constant.
- The linear address space NCCE is a case of ptr + int < ptr. This is always a violation, because wraparound is not guaranteed behavior for pointers (it is guaranteed only for unsigned integers).
- The pointer addition NCCE is a case of ptr + int1 < ptr + int2. This is not always a violation (we don't know the valid range of ptr), but it can always be converted to int1 < int2, so we should always consider it a violation.
Klocwork Version 8.0.4.16 can detect violations of this rule with the ABR checker.
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
Other Languages
This rule appears in the C Secure Coding Standard as ARR38-C. Do not add or subtract an integer to a pointer if the resulting value does not refer to a valid array element.
References
[[Banahan 03]] Section 5.3, "Pointers," and Section 5.7, "Expressions involving pointers"
[[ISO/IEC 14882-2003]] Section 18.7
[[MITRE 07]] CWE ID 129, "Unchecked Array Indexing"
[[VU#162289]]