Dereferencing a null pointer is undefined behavior.

On many platforms, dereferencing a null pointer results in abnormal program termination, but this is not required by the standard. See "Clever Attack Exploits Fully-Patched Linux Kernel" [Goodin 2009] for an example of a code execution exploit that resulted from a null pointer dereference.

Noncompliant Code Example

This noncompliant code example is derived from a real-world example taken from a vulnerable version of the libpng library as deployed on a popular ARM-based cell phone [Jack 2007]. The  libpng library allows applications to read, create, and manipulate PNG (Portable Network Graphics) raster image files. The libpng library implements its own wrapper to malloc() that returns a null pointer on error or on being passed a 0-byte-length argument.

This code also violates ERR33-C. Detect and handle standard library errors.

#include <png.h> /* From libpng */
#include <string.h>
 
void func(png_structp png_ptr, int length, const void *user_data) { 
  png_charp chunkdata;
  chunkdata = (png_charp)png_malloc(png_ptr, length + 1);
  /* ... */
  memcpy(chunkdata, user_data, length);
  /* ... */
}

If length has the value −1, the addition yields 0, and png_malloc() subsequently returns a null pointer, which is assigned to chunkdata. The chunkdata pointer is later used as a destination argument in a call to memcpy(), resulting in user-defined data overwriting memory starting at address 0. In the case of the ARM and XScale architectures, the 0x0 address is mapped in memory and serves as the exception vector table; consequently, dereferencing 0x0 did not cause an abnormal program termination.

Compliant Solution

This compliant solution ensures that the pointer returned by png_malloc() is not null. It also uses the unsigned type size_t to pass the length parameter, ensuring that negative values are not passed to func().

This solution also ensures that the user_data pointer is not null. Passing a null pointer to memcpy() would produce undefined behavior, even if the number of bytes to copy were 0. The user_data pointer could be invalid in other ways, such as pointing to freed memory. However, there is no portable way to verify that the pointer is valid, other than checking for null.

#include <png.h> /* From libpng */
#include <stdint.h> /* For SIZE_MAX */
#include <string.h>

void func(png_structp png_ptr, size_t length, const void *user_data) {
  png_charp chunkdata;
  if (length == SIZE_MAX) {
    /* Handle error: length + 1 would wrap around to 0 */
  }
  chunkdata = (png_charp)png_malloc(png_ptr, length + 1);
  if (NULL == chunkdata) {
    /* Handle error */
  }
  if (NULL == user_data) {
    /* Handle error */
  }
  /* ... */
  memcpy(chunkdata, user_data, length);
  /* ... */
}
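
The SIZE_MAX test guards the length + 1 computation: with an unsigned size_t, adding 1 to SIZE_MAX wraps to 0, which would bring back the zero-byte allocation from the noncompliant example. The following is a minimal sketch of that check in isolation. The helper name alloc_size_with_terminator is hypothetical and is not part of libpng.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libpng function): computes length + 1 for a
 * buffer that needs one extra byte, and reports failure instead of letting
 * the unsigned addition wrap around to 0.
 */
bool alloc_size_with_terminator(size_t length, size_t *out_size) {
  if (NULL == out_size) {
    return false;  /* Nowhere to store the result */
  }
  if (length == SIZE_MAX) {
    return false;  /* length + 1 would wrap to 0 */
  }
  *out_size = length + 1;
  return true;
}
```

A caller would pass the computed size to the allocator only when the helper returns true, so the wraparound case never reaches the allocation.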

Noncompliant Code Example

In this noncompliant code example, input_str is copied into dynamically allocated memory referenced by c_str. If malloc() fails, it returns a null pointer that is assigned to c_str. When c_str is dereferenced in memcpy(), the program exhibits undefined behavior.  Additionally, if input_str is a null pointer, the call to strlen() dereferences a null pointer, also resulting in undefined behavior. This code also violates ERR33-C. Detect and handle standard library errors.

#include <string.h>
#include <stdlib.h>
 
void f(const char *input_str) {
  size_t size = strlen(input_str) + 1;
  char *c_str = (char *)malloc(size);
  memcpy(c_str, input_str, size);
  /* ... */
  free(c_str);
  c_str = NULL;
  /* ... */
}

Compliant Solution

This compliant solution ensures that both input_str and the pointer returned by malloc() are not null: 

#include <string.h>
#include <stdlib.h>
 
void f(const char *input_str) {
  size_t size;
  char *c_str;
 
  if (NULL == input_str) {
    /* Handle error */
  }
  
  size = strlen(input_str) + 1;
  c_str = (char *)malloc(size);
  if (NULL == c_str) {
    /* Handle error */
  }
  memcpy(c_str, input_str, size);
  /* ... */
  free(c_str);
  c_str = NULL;
  /* ... */
}
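
The /* Handle error */ placeholders must keep control from reaching the dereference. One possible shape, sketched under the assumption that the caller can accept a null return value as the error signal, is shown here. The name copy_string is hypothetical and is used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant of f(): returns a heap-allocated copy of input_str,
 * or NULL if input_str is null or the allocation fails. The caller owns
 * the returned buffer and must free() it.
 */
char *copy_string(const char *input_str) {
  size_t size;
  char *c_str;

  if (NULL == input_str) {
    return NULL;  /* Never reaches strlen() with a null pointer */
  }

  size = strlen(input_str) + 1;
  c_str = (char *)malloc(size);
  if (NULL == c_str) {
    return NULL;  /* Never reaches memcpy() with a null destination */
  }
  memcpy(c_str, input_str, size);
  return c_str;
}
```

The caller must in turn test the returned pointer before using it; otherwise the null pointer dereference simply moves one level up the call chain.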

Noncompliant Code Example

This noncompliant code example is from a version of drivers/net/tun.c and affects Linux kernel 2.6.30 [Goodin 2009]:

static unsigned int tun_chr_poll(struct file *file, poll_table *wait)  {
  struct tun_file *tfile = file->private_data;
  struct tun_struct *tun = __tun_get(tfile);
  struct sock *sk = tun->sk;
  unsigned int mask = 0;

  if (!tun)
    return POLLERR;

  DBG(KERN_INFO "%s: tun_chr_poll\n", tun->dev->name);

  poll_wait(file, &tun->socket.wait, wait);

  if (!skb_queue_empty(&tun->readq))
    mask |= POLLIN | POLLRDNORM;

  if (sock_writeable(sk) ||
     (!test_and_set_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
     sock_writeable(sk)))
    mask |= POLLOUT | POLLWRNORM;

  if (tun->dev->reg_state != NETREG_REGISTERED)
    mask = POLLERR;

  tun_put(tun);
  return mask;
}

The sk pointer is initialized to tun->sk before checking if tun is a null pointer. Because null pointer dereferencing is undefined behavior, the compiler (GCC in this case) can optimize away the if (!tun) check because it is performed after tun->sk is accessed, implying that tun is non-null. As a result, this noncompliant code example is vulnerable to a null pointer dereference exploit, because null pointer dereferencing can be permitted on several platforms, for example, by using mmap(2) with the MAP_FIXED flag on Linux and Mac OS X, or by using the shmat() POSIX function with the SHM_RND flag [Liu 2009].

Compliant Solution

This compliant solution eliminates the null pointer dereference by initializing sk to tun->sk following the null pointer check. It also adds assertions to document that certain other pointers must not be null.

static unsigned int tun_chr_poll(struct file *file, poll_table *wait)  {
  assert(file);
  struct tun_file *tfile = file->private_data;
  struct tun_struct *tun = __tun_get(tfile);
  struct sock *sk;
  unsigned int mask = 0;

  if (!tun)
    return POLLERR;
  assert(tun->dev);
  sk = tun->sk;
  assert(sk);
  assert(sk->sk_socket);
  /* The remaining code is omitted because it is unchanged... */
}
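
The same ordering rule can be shown outside the kernel. The following is a minimal sketch using hypothetical types (struct tunnel and struct device are stand-ins, not the kernel's definitions): each pointer is tested before any member is read through it, so the compiler has no earlier dereference from which to infer that the pointer is non-null.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct device {
  int reg_state;
};

struct tunnel {
  struct device *dev;
};

enum { STATE_ERROR = -1 };

/*
 * Returns the device's reg_state, or STATE_ERROR if either pointer on the
 * path is null. Every member access comes after the corresponding null test.
 */
int tunnel_state(const struct tunnel *tun) {
  const struct device *dev;

  if (NULL == tun) {
    return STATE_ERROR;
  }
  dev = tun->dev;  /* Safe: tun was checked above */
  if (NULL == dev) {
    return STATE_ERROR;
  }
  return dev->reg_state;
}
```

Moving the assignment dev = tun->dev above the first test would reintroduce the pattern from the noncompliant example.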

Risk Assessment

Dereferencing a null pointer is undefined behavior, which typically results in abnormal program termination. In some situations, however, dereferencing a null pointer can lead to the execution of arbitrary code [Jack 2007, van Sprundel 2006]. The indicated severity is for this more severe case; on platforms where it is not possible to exploit a null pointer dereference to execute arbitrary code, the actual severity is low.

Rule | Severity | Likelihood | Remediation Cost | Priority | Level
EXP34-C | High | Likely | Medium | P18 | L1

Automated Detection

Tool | Version | Checker | Description
Astrée | 18.10 | null-dereferencing | Fully checked
Axivion Bauhaus Suite | 6.9.0 | CertC-EXP34 |
CodeSonar | 5.0p0 | LANG.MEM.NPD; LANG.STRUCT.NTAD; LANG.STRUCT.UPD | Null pointer dereference; Null test after dereference; Unchecked parameter dereference
Compass/ROSE | | | Can detect violations of this rule. In particular, ROSE ensures that any pointer returned by malloc(), calloc(), or realloc() is first checked for NULL before being used (otherwise, it is free()-ed). ROSE does not handle cases where an allocation is assigned to an lvalue that is not a variable (such as a struct member or C++ function call returning a reference).
Coverity | 2017.07 | CHECKED_RETURN; NULL_RETURNS; REVERSE_INULL; FORWARD_NULL | Finds instances where a pointer is checked against NULL and then later dereferenced; Identifies functions that can return a null pointer but are not checked; Identifies code that dereferences a pointer and then checks the pointer against NULL; Can find the instances where NULL is explicitly dereferenced or a pointer is checked against NULL but then dereferenced anyway. Coverity Prevent cannot discover all violations of this rule, so further verification is necessary.
Cppcheck | 1.66 | nullPointer, nullPointerDefaultArg, nullPointerRedundantCheck | Context sensitive analysis; Detects when NULL is dereferenced (Array of pointers is not checked. Pointer members in structs are not checked.); Finds instances where a pointer is checked against NULL and then later dereferenced; Identifies code that dereferences a pointer and then checks the pointer against NULL; Does not guess that return values from malloc(), strchr(), etc., can be NULL (The return value from malloc() is NULL only if there is OOM and the dev might not care to handle that. The return value from strchr() is often NULL, but the dev might know that a specific strchr() function call will not return NULL.)
Klocwork | 2018 | NPD.CHECK.CALL.MIGHT; NPD.CHECK.CALL.MUST; NPD.CHECK.MIGHT; NPD.CHECK.MUST; NPD.CONST.CALL; NPD.CONST.DEREF; NPD.FUNC.CALL.MIGHT; NPD.FUNC.CALL.MUST; NPD.FUNC.MIGHT; NPD.FUNC.MUST; NPD.GEN.CALL.MIGHT; NPD.GEN.CALL.MUST; NPD.GEN.MIGHT; NPD.GEN.MUST; RNPD.CALL; RNPD.DEREF |
LDRA tool suite | 9.7.1 | 45 D, 123 D, 128 D, 129 D, 130 D, 131 D, 652 S | Fully implemented
Parasoft C/C++test | 10.4.1 | CERT_C-EXP34-a | Avoid null pointer dereferencing
Parasoft Insure++ | | | Runtime analysis
Polyspace Bug Finder | R2018a | Arithmetic operation with NULL pointer; Invalid use of standard library memory routine; Null pointer; Use of tainted pointer; MISRA C:2012 Directive 4.14 | Arithmetic operation performed on NULL pointer; Standard library memory function called with invalid arguments; NULL pointer dereferenced; Pointer from an unsecure source may be NULL or point to unknown memory; The validity of values received from external sources shall be checked.
PRQA QA-C++ | | 2810, 2811, 2812, 2813, 2814, 2820, 2821, 2822, 2823, 2824 |
PRQA QA-C | 9.5 | 2810, 2811, 2812, 2813, 2814, 2820, 2821, 2822, 2823, 2824 | Fully implemented
PVS-Studio | 6.23 | V522, V595, V664, V713, V1004 |
SonarQube C/C++ Plugin | 3.11 | S2259 |
Splint | 3.1.1 | |


Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

Key here (explains table format and definitions)

Taxonomy | Taxonomy item | Relationship
CERT Oracle Secure Coding Standard for Java | EXP01-J. Do not use a null in a case where an object is required | Prior to 2018-01-12: CERT: Unspecified Relationship
ISO/IEC TR 24772:2013 | Pointer Casting and Pointer Type Changes [HFC] | Prior to 2018-01-12: CERT: Unspecified Relationship
ISO/IEC TR 24772:2013 | Null Pointer Dereference [XYH] | Prior to 2018-01-12: CERT: Unspecified Relationship
ISO/IEC TS 17961 | Dereferencing an out-of-domain pointer [nullref] | Prior to 2018-01-12: CERT: Unspecified Relationship
CWE 2.11 | CWE-476, NULL Pointer Dereference | 2017-07-06: CERT: Exact

CERT-CWE Mapping Notes

Key here for mapping notes

CWE-690 and EXP34-C

EXP34-C = Union(CWE-690, list), where list =

  • Dereferencing null pointers that were not returned by a function


CWE-252 and EXP34-C

Intersection( CWE-252, EXP34-C) = Ø

EXP34-C is a common consequence of ignoring function return values, but it is a distinct error, and can occur in other scenarios too.

Bibliography

[Goodin 2009]
[Jack 2007]
[Liu 2009]
[van Sprundel 2006]
[Viega 2005] Section 5.2.18, "Null-Pointer Dereference"



31 Comments

  1. Should that be: if (size >= SIZE_MAX) {

    1. I believe in this case, either expression would work.

      SIZE_MAX is the largest possible value that a size_t could take, so it is not possible to have anything larger than SIZE_MAX.

      The test was added to catch the possibly theoretical situation where the length of input_str was somehow the maximum size for size_t, and adding one to this size in the malloc expression (to allocate space for the trailing null byte) results in an integer overflow.

      I say "theoretical" because I have not successfully produced strings of this length in testing.

      1. Ah, gotcha. That makes sense. Yeah, I suspect once it's possible to allocate 2+gigs contiguously in a mainstream install of a modern OS, we'll see a frenzy of new vulnerabilities come out. The 4gig boundary will probably be important too with unsigned int in LP64, but since size_t will be 64-bit, there will have to be some truncation that compilers will be able to warn on. (I think you cover that in a different rule.) The above check can't hurt, as I guess you could have a system with a 32-bit size_t that had a ton of memory and had some crazy banking/selector scheme with pointers. It also reinforces the notion to the reader that any time you see arithmetic in an allocation expression, you need to think about corner-cases.

    2. I added a comment to explain that SIZE_MAX is the limit of size_t

  2. In my experience, there are reasons to check for a NULL pointer other than dereferencing it.

    A common memory-leak idiom is reallocating storage and assigning its address to a pointer that already points to allocated storage. The correct idiom is to only allocate storage if the pointer is currently NULL. But nowhere in that particular idiom would a NULL pointer necessarily be dereferenced.

  3. The article easily misleads the reader into believing that ensuring pointer validity boils down to checking for pointer being not equal to NULL. Unfortunately the problem is much more complex, and generally unsolvable within standard C. Consider the following example:

    void f(int *x)
    {
      *x = 12;
    }
    
    void g(void)
    {
      int x, *p = &x;
      f(p+1);
    }
    

    There's no way f can check whether x points into valid memory or not. Using platform-specific means (e.g. parsing /proc/self/maps under linux) one might find out whether the pointer points into mapped memory, but this is still not a guarantee of validity because it is very coarse-grained – see again the above example. IMHO, the rule title should be changed to something less general.

    1. That's true.  I've changed it to say null pointer instead of invalid pointer.

  4. It is useful to have a function with portable interface but platform-dependent implementation:

    extern bool invalid(const void *);
    ...
    assert(!invalid(p)); // or whatever

    Typical implementation:

    bool invalid(const void *p) {
    extern char _etext;
    return p == NULL || (char *)p < &_etext;
    }

    Note that it doesn't know how to check for non-heap, non-stack.  Many platforms can support testing for those also.

    The idea is not to guarantee validity, but to catch a substantial number of problems that could occur.

  5. Made code more compliant with other rules.

    At this point we define size as strlen(input_str) + 1. Since SIZE_MAX represents the largest possible object, the largest possible string would then be SIZE_MAX-1 characters long (excluding '\0'). So the SIZE_MAX check was unnecessary.

  6. This is a matter of style, and also of following a code walkthrough. In the compliant version

    We have mask = 0;

    Then below, first change to mask  is

    mask |= POLLIN | POLLRDNORM;

     I like to make source code checking a little quicker by putting parentheses around the arguments to |= or &=, as in

    mask |= (POLLOUT | POLLWRNORM);

     

     

  7. The final NCCE is actually more insidious than it seems at first.  Because null pointer dereferencing is UB, the if (!tun) check can be elided entirely by the optimizer (since the tun->sk implies that tun must be non-null).

    http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

  8. The 2nd NCCE/CS pair seems redundant with the first NCCE/CS pair.

    1. One could argue that all code examples would be redundant with the first pair.  (smile)  In this case, the difference is the assumption that malloc() always returns non-null for the second NCCE, whereas the first NCCE has the malloc() abstracted away.

  9. I suggest that this topic needs to include calloc() and realloc()   Refer to Linux man pages online  for more enlightenment about malloc(), and friends.  

    I believe that dereferencing NULL should not crash the system and should not allow a write to a NULL pointer area, but should always set errno. If I am a hacker, could I trap a null failure that would force a memory dump? Could I capture the dump and glean much security information from it? The null pointer check for writing or dereferencing should be a compiler flag or library setting.

    1. This rule applies to all null pointers, regardless of which function returned them. 

      Believing that dereferencing NULL shouldn't crash the system doesn't make it true.  I guess you could write a proposal to modify the C Standard, but our coding standard is meant to provide guidance for the existing language.

  10. Solution 1 looks like today's solution and tomorrow's problem. The int is changed to size_t, and if the size_t parameter is zero, one word is allocated. Then we hit memcpy with length 0. A length of zero is probably an unusable condition for this function.

    1. There are other problems with this code, as is noted in the rule. But passing 0 to memcpy() is not one of them. The standard will simply copy 0 bytes...which is essentially a no-op. (C11, S7.24.2.1)


      1. That interpretation of the standard is not supported universally. See C17 7.1.4p1, which says, in part:

        Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow:

        If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after default argument promotion) not expected by a function with a variable number of arguments, the behavior is undefined.

        The issue is: memcpy() and friends do not explicitly state that a null pointer is a valid pointer value, even if the number of bytes to copy is 0.

      2. Isn't it easier just to check the valid range of length? I doubt that a "length" of zero is a valid parameter, and although there is no copy, we still see a memory allocation. It looks like a logic bug, which can cause a memory leak.

        1. Aaron, don't confuse Vladimir :)

          A non-null but invalid pointer passed to memcpy() can indeed cause undefined behavior, but that is not the issue in the noncompliant code...the pointer will either be valid or null. And the compliant solution guarantees that the pointer will be valid if the code calls memcpy().

          The issue of passing n=0 to memcpy() is distinct from null or invalid pointers. Best to cite C11 s7.24.2.1 here:

          The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

          Clearly the standard enumerates 1 case of undefined behavior, but makes no special mention of n=0. (In contrast, the case of passing 0 bytes to malloc is addressed in C11.) 0 is certainly within the 'domain of the function' (a phrase defined by mathematics but not by C11), as copying 0 bytes is well-understood (although silly).

          I would therefore assert that a platform whose memcpy() did anything besides a no-op when given n=0 and valid source/destination pointers was not C-standards-compliant.

          1. I would therefore assert that a platform whose memcpy() did anything besides a no-op when given n=0 and valid source/destination pointers was not C-standards-compliant.

            Your assertion is not backed by the wording in the standard, nor by common implementer understanding. It's even called out explicitly in C17 7.24.1p2:

            Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.

            Note that 7.1.4 explicitly states that a null pointer is not a valid pointer argument. The value 0 for the number of bytes to copy is not what causes the UB, it's the null pointer value which triggers it.

            Optimizers are optimizing based on this latitude and have been for years. See the "Null pointer checks may be optimized away more aggressively" section in https://gcc.gnu.org/gcc-4.9/porting_to.html as an example with one common implementation.


            1. Aaron:
              I suspect we are talking past each other. So let me be more precise in my wording:

              I assert that a platform whose memcpy() did anything besides copy zero bytes when given n=0 and valid src and dest pointers was not C-standards-compliant. By 'valid pointers' I mean that both src and dest pointers are not null and they both point to non-overlapping arrays containing at least n bytes each.

              The n=0 is a mildly interesting edge case: Clearly a pointer that points to at least one valid byte could be used as the src or dest pointer to a call to memcpy(..., 0). I suppose there is a question of "Is a pointer that points to 0 bytes valid?" that we haven't considered here: I'd guess null pointers are not valid, even though they point to 0 bytes. Likewise, pointers to freed memory are not valid. I would also guess that pointers that point to the one-past-the-end of an array are also invalid. I'd guess WG14 has considered these questions, but I haven't until now :)

              Finally, there is the matter of the compliant solution. Which ensures that the chunkdata pointer is valid, but makes no such check to the user_data pointer. I suppose we can check that that is not null, but we cannot check that it is valid (in any portable way).

              1. I assert that a platform whose memcpy() did anything besides copy zero bytes when given n=0 and valid src and dest pointers was not C-standards-compliant. By 'valid pointers' I mean that both src and dest pointers are not null and they both point to non-overlapping arrays containing at least n bytes each.

                Phew, we're agreed here. Thank you for clarifying your assertion until I understood it properly.

                I'd guess null pointers are not valid, even though they point to 0 bytes.

                Correct; a null pointer is not a valid pointer for the C library functions.

                Finally, there is the matter of the compliant solution. Which ensures that the chunkdata pointer is valid, but makes no such check to the user_data pointer. I suppose we can check that that is not null, but we cannot check that it is valid (in any portable way).

                I think that checking for user_data being NULL would be an improvement to the CS so long as there is an explicit mention that user_data being NULL is invalid even if length == 0.

                1. I think that checking for user_data being NULL would be an improvement to the CS so long as there is an explicit mention that user_data being NULL is invalid even if length == 0.

                  Agreed. I've made this change.

                  1. Thanks, David! Small typo nit: "such as if i t pointed to freed memory" meant to say "if it" instead (removing whitespace).

        2. Vladimir:

          To be precise, once length is changed to a size_t and cannot take negative values, it cannot have an invalid value. 0 is a valid value as far as memcpy() is concerned, and malloc() has special language concerning malloc(0). So no checking of the length is necessary (besides preventing integer overflow, which the compliant solution does).

  11. Why does the second compliant example permit using possibly-null pointers? Shouldn't the function check all pointers before dereferencing them or passing them to another function?

    static unsigned int tun_chr_poll(struct file *file, poll_table *wait)  {
      if (!file) 
        // handle error  
      struct tun_file *tfile = file->private_data;
      if (!tfile)
        // handle error
    /* The remaining code is omitted because it is unchanged... */
    }
    1. Good question!  That noncompliant code example (it's currently the 3rd) came from the Linux kernel, whose source is publicly available.

      1. Off by one error: It is the third example. But the problem also exists in the compliant version, so I'm not so sure that it's really compliant. 

        1. Agreed. I added an assertion to that compliant code example.