The C99 \[[ISO/IEC 9899:1999|AA. C References#ISO/IEC 9899-1999]\] function {{strtok()}} is a string tokenization function that takes two arguments: an initial string to be parsed and a pointer to a const-qualified string of delimiter characters. It returns a pointer to the first character of a token, or a null pointer if there is no token.
The first time {{strtok()}} is called, the string to be parsed into tokens and the delimiter string are passed as arguments. The {{strtok()}} function skips any leading delimiters, parses the string up to the next delimiter character, replaces that character in place with a null byte ({{'\0'}}), and returns the address of the first character in the token. Subsequent calls to {{strtok()}}, made with a null pointer as the first argument, begin parsing immediately after the most recently placed null character.
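The in-place replacement described above can be observed directly. The following is a minimal sketch, assuming a writable, null-terminated buffer; the helper name {{count_tokens_in_place()}} is hypothetical and exists only for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library): tokenizes buf in place
 * with strtok() and returns the number of tokens found.
 * buf must be a writable, null-terminated array.
 */
size_t count_tokens_in_place(char *buf, const char *delims) {
  size_t count = 0;
  char *token;

  /* The first call passes the buffer; later calls pass NULL to continue. */
  for (token = strtok(buf, delims); token != NULL;
       token = strtok(NULL, delims)) {
    count++;
  }
  return count;
}
```

For a buffer initialized as {{char buf[] = "a:b:c";}}, calling {{count_tokens_in_place(buf, ":")}} returns 3 and leaves the bytes {{'a', '\0', 'b', '\0', 'c', '\0'}} in the array, so {{strlen(buf)}} is 1 afterward.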
Because {{strtok()}} modifies the string it parses, the original contents are lost and the string can no longer be used in its original form. If you need to preserve the original string, copy it into a buffer and pass the address of the buffer to {{strtok()}} instead of the original string.
In this example, {{strtok()}} is used to parse its first argument, the value of the {{PATH}} environment variable, into colon-delimited tokens, and each token is printed on its own line. Assume that {{PATH}} is {{"/usr/bin:/bin:/usr/sbin:/sbin"}}.
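    char *token;
    char *path = getenv("PATH");

    if (path == NULL) {
      /* handle error */
    }

    /* strtok() writes null bytes directly into the environment string */
    for (token = strtok(path, ":"); token != NULL; token = strtok(NULL, ":")) {
      puts(token);
    }

    printf("PATH: %s\n", path);
    /* PATH is now just "/usr/bin" */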
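After the loop ends, the string that {{path}} points to has been modified to {{"/usr/bin\0/bin\0/usr/sbin\0/sbin\0"}}. This is a problem because {{path}} now reads as just {{/usr/bin}} and because the environment variable {{PATH}} has been changed unintentionally, which can affect any later code that relies on its value. In addition, C99 states that the string returned by {{getenv()}} shall not be modified by the program.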
In this solution, the string being tokenized is copied into a temporary buffer, which is not referenced after the calls to {{strtok()}}:
    char *token;
    char *path = getenv("PATH");
    /* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */

    if (path == NULL) {
      /* handle error */
    }

    char *copy = (char *)malloc(strlen(path) + 1);
    if (copy == NULL) {
      /* handle error */
    }
    strcpy(copy, path);

    /* Tokenize the copy so the environment string is left untouched */
    for (token = strtok(copy, ":"); token != NULL; token = strtok(NULL, ":")) {
      puts(token);
    }

    free(copy);
    copy = NULL;

    printf("PATH: %s\n", path);
    /* PATH is still "/usr/bin:/bin:/usr/sbin:/sbin" */
Another possibility is to provide your own implementation of {{strtok()}} that does not modify the initial arguments.
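One way such a replacement might look is sketched below. It is built on the standard {{strspn()}} and {{strcspn()}} functions and reports each token as a start pointer plus a length instead of writing null bytes into the input. The name {{next_token()}} and its interface are assumptions made for illustration, not a standard or CERT-provided API.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical non-modifying tokenizer (a sketch, not a standard function).
 * *cursor points at the unparsed remainder of the string. On success the
 * function returns a pointer to the start of the next token, stores its
 * length in *len, and advances *cursor past the token. It returns NULL
 * when no tokens remain. The input string is never written to.
 */
const char *next_token(const char **cursor, const char *delims, size_t *len) {
  const char *start;

  if (cursor == NULL || *cursor == NULL || delims == NULL || len == NULL) {
    return NULL;
  }

  /* Skip leading delimiters, as strtok() does. */
  start = *cursor + strspn(*cursor, delims);
  if (*start == '\0') {
    *cursor = start;
    return NULL;
  }

  /* The token runs up to the next delimiter or the end of the string. */
  *len = strcspn(start, delims);
  *cursor = start + *len;
  return start;
}
```

Because the tokens are not null-terminated, a caller would print one with a bounded format such as {{printf("%.*s\n", (int)len, tok)}}. The parse position lives in the caller's cursor instead of hidden static state, and the delimiter that ended each token remains readable at {{tok[len]}}.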
To quote the Linux Programmer's Manual (man) page on {{strtok(3)}} \[[Linux 07|AA. C References#Linux 07]\]:
Never use this function. This function modifies its first argument. The identity of the delimiting character is lost. This function cannot be used on constant strings.
Improper use of {{strtok()}} is likely to result in truncated data, producing unexpected results later in program execution.
Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
STR06-A | medium | likely | medium | P12 | L1 |
Fortify SCA Version 5.0 is able to detect violations of this recommendation.
Compass/ROSE can detect violations of this recommendation.
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
\[[ISO/IEC 9899:1999|AA. C References#ISO/IEC 9899-1999]\] Section 7.21.5.8, "The {{strtok}} function"
\[[Linux 07|AA. C References#Linux 07]\] [strtok(3)|http://www.kernel.org/doc/man-pages/online/pages/man3/strtok.3.html]