The C11 Standard [ISO/IEC 9899:2011] introduced a new term: temporary lifetime. Modifying an object with temporary lifetime is undefined behavior. According to subclause 6.2.4, paragraph 8:

A non-lvalue expression with structure or union type, where the structure or union contains a member with array type (including, recursively, members of all contained structures and unions) refers to an object with automatic storage duration and temporary lifetime. Its lifetime begins when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends when the evaluation of the containing full expression or full declarator ends. Any attempt to modify an object with temporary lifetime results in undefined behavior.

This definition differs from the C99 Standard (which defines modifying the result of a function call or accessing it after the next sequence point as undefined behavior) because a temporary object's lifetime ends when the evaluation of the containing full expression or full declarator ends, so the result of a function call can be accessed anywhere within that full expression. This extension to the lifetime of a temporary also removes a quiet change to C90 and improves compatibility with C++.
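As a minimal sketch of the C11 behavior described above, assuming a C11-or-later implementation, the following code reads the array member of a returned structure entirely within one full expression. The names `struct Greeting`, `make_greeting`, `greeting_length`, and `check_greeting_length` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration: a struct with an array member. */
struct Greeting { char text[8]; };

/* Returns the struct by value, so the call expression is a non-lvalue. */
struct Greeting make_greeting(void) {
  struct Greeting g = { "Hello" };
  return g;
}

/*
 * Under C11, the temporary object lives until the end of the full
 * expression strlen(make_greeting().text), so strlen() may read it.
 * Under C99, the same read occurs after a sequence point and is undefined.
 */
size_t greeting_length(void) {
  return strlen(make_greeting().text);
}

/* Small self-check showing the result expected from the sketch above. */
void check_greeting_length(void) {
  assert(greeting_length() == 5);
}
```

The read is acceptable in C11 only because it completes before the full expression ends; a pointer to `text` that is kept past that point would refer to an object whose lifetime has ended.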

C functions may not return arrays; however, functions can return a pointer to an array or a struct or union that contains arrays. Consequently, if a function call returns by value a struct or union containing an array, do not modify those arrays within the expression containing the function call. Do not access an array returned by a function after the next sequence point or after the evaluation of the containing full expression or full declarator ends.
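One way to follow this guidance, sketched here under stated assumptions, is to store the returned structure in a named object before taking a pointer into its array. The names `struct Name`, `make_name`, and `copy_name` are hypothetical; the pattern to avoid appears only inside a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration: a struct with an array member. */
struct Name { char text[8]; };

struct Name make_name(void) {
  struct Name n = { "world" };
  return n;
}

/*
 * Copies the name into dst (capacity dst_size, which must be nonzero) and
 * returns the number of characters copied, truncating if necessary.
 */
size_t copy_name(char *dst, size_t dst_size) {
  assert(dst != NULL && dst_size > 0);

  /*
   * Pattern to avoid:
   *   const char *p = make_name().text;  // temporary's lifetime ends here
   *   strlen(p);                         // p no longer refers to a live object
   */

  /* The named object below lives until copy_name() returns. */
  struct Name stored = make_name();
  size_t len = strlen(stored.text);
  if (len >= dst_size) {
    len = dst_size - 1;
  }
  memcpy(dst, stored.text, len);
  dst[len] = '\0';
  return len;
}
```

A caller might write `char buf[8]; copy_name(buf, sizeof buf);` and then use `buf` freely, because `buf` has ordinary automatic storage duration rather than temporary lifetime.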

Noncompliant Code Example (C99)

This noncompliant code example conforms to the C11 Standard; however, it fails to conform to C99. If compiled with a C99-conforming implementation, this code has undefined behavior because the sequence point preceding the call to printf() comes between the calls to salutation() and addressee() and the access by printf() of the strings in the returned objects.

#include <stdio.h>

struct X { char a[8]; };

struct X salutation(void) {
  struct X result = { "Hello" };
  return result;
}

struct X addressee(void) {
  struct X result = { "world" };
  return result;
}

int main(void) {
  printf("%s, %s!\n", salutation().a, addressee().a);
  return 0;
}

Compliant Solution

This compliant solution stores the structures returned by the calls to salutation() and addressee() before calling the printf() function. Consequently, this program conforms to both C99 and C11.

#include <stdio.h>

struct X { char a[8]; };
 
struct X salutation(void) {
  struct X result = { "Hello" };
  return result;
}

struct X addressee(void) {
  struct X result = { "world" };
  return result;
}

int main(void) {
  struct X my_salutation = salutation();
  struct X my_addressee = addressee();
 
  printf("%s, %s!\n", my_salutation.a, my_addressee.a);
  return 0;
}

Noncompliant Code Example

This noncompliant code example attempts to retrieve an array and increment the array's first value. The array is part of a struct that is returned by a function call. Consequently, the array has temporary lifetime, and modifying the array is undefined behavior.

#include <stdio.h>

struct X { int a[6]; };

struct X addressee(void) {
  struct X result = { { 1, 2, 3, 4, 5, 6 } };
  return result;
}

int main(void) {
  printf("%x", ++(addressee().a[0]));
  return 0;
}

Compliant Solution

This compliant solution stores the structure returned by the call to addressee() as my_x before calling the printf() function. When the array is modified, its lifetime is no longer temporary but matches the lifetime of the block in main().

#include <stdio.h>

struct X { int a[6]; };

struct X addressee(void) {
  struct X result = { { 1, 2, 3, 4, 5, 6 } };
  return result;
}

int main(void) {
  struct X my_x = addressee();
  printf("%x", ++(my_x.a[0]));
  return 0;
}

Risk Assessment

Attempting to modify an array or access it after its lifetime expires may result in erroneous program behavior.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| EXP35-C | Low | Probable | Medium | P4 | L3 |

Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| Axivion Bauhaus Suite | 6.9.0 | CertC-EXP35 | |
| LDRA tool suite | 9.7.1 | 642 S, 42 D, 77 D | Enhanced Enforcement |
| Parasoft C/C++test | 10.4.2 | CERT_C-EXP35-a | Do not access an array in the result of a function call |
| Polyspace Bug Finder | R2019b | CERT-C: Rule EXP35-C | Checks for accesses on objects with temporary lifetime (rule fully covered) |
| PRQA QA-C | 9.5 | 0450 [U], 0455 [U], 0459 [U], 0465 [U], 0465 [U] | |
| Splint | 3.1.1 | | |



Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

| Taxonomy | Taxonomy item | Relationship |
|---|---|---|
| ISO/IEC TR 24772:2013 | Dangling References to Stack Frames [DCM] | Prior to 2018-01-12: CERT: Unspecified Relationship |
| ISO/IEC TR 24772:2013 | Side-effects and Order of Evaluation [SAM] | Prior to 2018-01-12: CERT: Unspecified Relationship |

Bibliography

[ISO/IEC 9899:2011] 6.2.4, "Storage Durations of Objects"



15 Comments

  1. I at least agree with Hal. These examples in juxtaposition do nothing to help me understand the problem being solved.

    I think the issue is made unnecessarily complex through the use of the compiler internal terminology "sequence point." Programmers do not know what they are, much less where they occur. If we are attempting to teach them something about them, then annotating the examples with some sort of sequence point clues would help. If not, the use of the term obscures the point being made. I'd stick with pure programmer-aware terminologies.

    1. The term "sequence point" is not "compiler internal terminology." It is the terminology used by the C standard. AFAIK, there isn't any other correct terminology. Sequence points are basically the places where order of execution is specified. As a programmer, you do need to know about them, although not necessarily what they are called. E.g., if you say f( x );g( y );, even as a programmer, you generally want to know that f will be called before g is called.

      Anyways, if you were reading the rules in order (or even the titles), you would have come across [EXP30-C. Do not depend on order of evaluation between sequence points] first, which does explain what a sequence point is. If you know of better terminology, would you mind suggesting it?

  2. The statement "This program has undefined behavior because there is a sequence point before printf() is called, and printf() accesses the result of the call to addressee()." is not an adequate explanation of the failure here. If you replace "addressee().a" with "addressee()", the explanation would be the same, but the code would be correct. It's the ".a" that's wrong and the explanation should reflect that. There must be a sequence point between addressee() and addressee().a that's the issue.

    1. It's not the ".a" that is wrong. It's the fact that the printf() function will be trying to access the value returned by addressee() before the previous sequence point.

      Perhaps you are confused because they are using a function call as a sequence point. Would it be more clear if they instead used semicolon?

      char *s = addressee().a;
      /* the next line accesses the return value from the previous sequence point */
      char c = s[3]; 
      

      I think this is a worse example, because it also violates [DCL30-C. Do not refer to an object outside of its lifetime]

  3. When I compile this code with GCC version 4.1 and the -Wall switch, I do not get a warning about this bug. I only get a warning that the format string expects type 'char *' but the argument has type 'char[6]'.

    1. I can reproduce this on gcc 3.4.4.

      Then again, this could just be a case of a less-than-informative warning. Perhaps we should explain a bit more exactly what is going on?

  4. printf() does not access the return value from addressee().
    main() does that, and passes its member 'a' to printf().

  5. The code examples don't support the rule. Not sure why the NCCE segfaults, but I suspect it is due to the array, not to anything regarding sequence points.

    First off, the examples are evidence of something. And I also get the silly compiler warning from gcc that Arbob and Alex report on the NCCE:

    foo.c:11: warning: format '%s' expects type 'char *', but argument 2 has type 'char[6]'
    

    This is, of course, specific to printf() and its ilk. In fact, gcc is very sticky about passing the 'temporary array' addressee().a to functions...usually it returns an error and rejects typecasting events.

    At first I thought the problem is that arrays are glorified pointers and the NCCE violates DCL30-C. Declare objects with appropriate storage durations. But as a counterexample, the following code compiles w/o warning and works properly:

    #include <stdio.h>
    
    struct X { char a[6]; };
    
    struct X addressee() {
      struct X result = { "world" };
      return result;
    }
    
    int main(void) {
      printf("Hello, %s!\n", &(addressee().a[0]));
      return 0;
    }
    

    Also, if struct X uses something besides arrays, the code also compiles cleanly and works.

    #include <stdio.h>
    
    struct X { char a; };
    
    struct X addressee() {
      struct X result = { '!' };
      return result;
    }
    
    int main(void) {
      printf("Hello, World%c\n", addressee().a);
      return 0;
    }
    

    So I think the NCCE illustrates something bad about arrays, but don't know what. No rule in the Arrays section seems to apply.

    On a side note, the NCCE may be bad C, but it seems to be valid C++. It compiles cleanly under g++ and runs correctly. I suspect C++ treats temporary values differently than C. For instance, in C++ a function can return a reference to a variable, the result being that you can use a function call as an lvalue. So I suspect C++ is more thorough in its treatment of temporary values (or of arrays, whichever this is about).

    1. C++ has different semantics. In C++ temporaries (like return values) are preserved until the end of evaluation of the containing full expression or full declarator.

      1. OK, having looked at this example more, I am convinced that the problem is better formulated in terms of arrays. C99 section 6.9.1, paragraph 3 explicitly states that functions may not return arrays. The NCCE violates the spirit of this, but not the letter, as its function returns an array wrapped inside a struct.

        The NCCE behaves the same even if the array being returned lives on the heap, not the stack (e.g., created with malloc()). Also, I can't recreate the problem without arrays; e.g., a struct containing a struct works perfectly.

        The webpage "Extending the Lifetime of Temporary Objects" provides the reference for this rule, claiming a sequence point after the addressee() function and printf() is responsible for the behavior.

        But the NCCE is not bad because of sequence points, and I think the C99 standard is being misinterpreted here. Obviously a function's return value must be legitimate across at least one sequence point, otherwise you couldn't do foo( bar()), since there is a sequence point between the bar() call and the foo() call. There is no sequence point in referring to the array within the struct; the only sequence points (by definition) are the call to addressee() and the call to printf().

        I haven't found a definitive reference forbidding the NCCE.

        As for gcc, it does not seem to properly convert the array to a pointer in the NCCE, which is why the NCCE crashes, but works if we wrap the array in a &array[0] expression, as in my previous comment.

        One telling clue about this is that gcc won't compile the program if you replace 'printf' with some other function taking a char* or char array. I get the impression they were trying to prevent some array casts at the compiler level, and doing other array casts right, and the NCCE was just a loophole they never changed (probably because it's claimed to violate sequence points). As noted in the rule, this is not a problem with MSVC.

        So this may just be a gcc bug. If it merits a rule, it would be "Don't pass as a function argument an array that is a member of a struct returned by another function."

        I guess my big question is: Can anyone cite a reference (that isn't about sequence points) saying why the NCCE is bad?

        1. Clark Nelson sez:

          >> Perhaps I am confused over the meaning of this paragraph, from
          >> C99 Section 6.5.2.2 says 1999:
          >>
          >> If an attempt is made to modify the result of a function call or
          >> to access it after the next sequence point, the behavior is
          >> undefined.
          >>
          >> It would seem to me that this paragraph renders a simple cascading
          >> function call like foo( bar()) illegal, because there is a sequence
          >> point before the call to foo() and after the call to bar(), and that
          >> sequence point 'kills' the return value of bar().
          >
          > You have to remember that in C, all arguments are passed by value, so what foo receives is not the object returned by bar, but a copy thereof. So it's not the same object, so there's no undefined behavior.
          >
          >> So why is foo( bar()) legal, but foo( bar().a) illegal? (assuming
          >> bar() returns a struct with an array member 'a')?
          >
          > Because (bar().a) implicitly takes the address of an object returned by bar, so if foo dereferences that pointer, it is accessing the actual object returned by bar after a sequence point.
          >
          >> Rob pointed me to your proposal to amend this paragraph in the C
          >> standard:
          >>
          >> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1285.htm
          >>
          >> AFAICT this would kill our rule, as the non-compliant code example
          >> would become perfectly legal C (it is already legal C++). But, as you
          >> know, your proposal is not part of the C standard yet, so how do we
          >> cope in the meantime?
          >
          > That's a judgment call. Theoretically, even after a new C standard comes out, there will still be compilers around that won't yet conform to it. So from some perspective, it will still be reasonable to suggest to programmers that they avoid references to rvalue arrays. On the other hand, no one has yet found a C compiler that actually generates code that doesn't satisfy the C++ rule, which suggests that references to rvalue arrays are practically pretty safe.
          >
          > Actually, I take it back: in pre-C99 mode, GCC does something entirely unexpected with rvalue arrays. At least on IA-32, if an rvalue array expression is used as an argument to a function, the whole array is passed by value, i.e. copied into the argument block of the called function. No matter what the standard says, it might be reasonable to warn people away from that behavior.
          >
          > Clark

          1. After studying Clark's responses above to my questions, I tested the NCCE again. The code still coredumps on gcc version 4.2.3 (on an AMD64-bit Ubuntu box), but it works perfectly if I add a --std=c99 flag to the compile command, which instructs GCC to adhere to C99 as much as possible.

            Clark's last two paragraphs sum up the situation pretty effectively. I'll just add:

            • As mentioned above, the problem is with array rvalues, so the rule should focus on arrays.
            • Most compilers already 'do the right thing' and compile the NCCE so that it runs correctly. So does GCC 4.2, but only with --std=c99.
            • It's our call if we wish to maintain this rule or drop it. It will be obsolete someday, but isn't yet.
  6. At some point, we need to look into the Automated Detection section; the NCCEs all compile cleanly with gcc 4.8.1 in -Wall -Weverything -pedantic mode.

    1. Clang also does not catch instances of this rule. I've removed the GCC row from the table; I don't believe it catches this rule currently (5.2.0).