...
| Wiki Markup |
|---|
The following function from \[[Viega 2003|AA. Bibliography#Viega 03]\] detects invalid character sequences in a string but does not reject non-minimal forms. It returns {{1}} if the string is composed only of legitimate sequences; otherwise, it returns {{0}}. |
| Code Block |
|---|
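int spc_utf8_isvalid(const unsigned char *input) {
  int nb;  /* number of continuation bytes expected after the lead byte */
  int i;
  const unsigned char *c;

  for (c = input; *c; c += (nb + 1)) {
    if (!(*c & 0x80)) nb = 0;                /* 0xxxxxxx: single-byte (ASCII) */
    else if ((*c & 0xc0) == 0x80) return 0;  /* continuation byte with no lead byte */
    else if ((*c & 0xe0) == 0xc0) nb = 1;
    else if ((*c & 0xf0) == 0xe0) nb = 2;
    else if ((*c & 0xf8) == 0xf0) nb = 3;
    else if ((*c & 0xfc) == 0xf8) nb = 4;
    else if ((*c & 0xfe) == 0xfc) nb = 5;
    else return 0;                           /* 0xfe and 0xff are never lead bytes */

    /* Check c[1]..c[nb] without modifying nb, which the loop increment
       still needs. A terminating NUL fails the test, so the scan never
       reads past the end of the string. */
    for (i = 1; i <= nb; i++)
      if ((c[i] & 0xc0) != 0x80) return 0;
  }
  return 1;
}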
|
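Because the function above accepts non-minimal forms, an overlong encoding such as {{0xC0 0xAF}} (a two-byte spelling of {{/}}) passes the check. The following is a minimal sketch of a stricter validator, not taken from \[Viega 2003\]: the name {{utf8_is_strict}} is hypothetical, and the sketch assumes the four-byte limit, the exclusion of surrogates, and the U+10FFFF ceiling of current UTF-8 definitions. It decodes each sequence and compares the result with the smallest code point that requires that sequence length.

```c
/* Hypothetical helper (illustrative only): returns 1 if input is
 * NUL-terminated UTF-8 with no overlong forms, no surrogate code points,
 * and no code point above U+10FFFF; otherwise returns 0. */
int utf8_is_strict(const unsigned char *input) {
  const unsigned char *c = input;

  while (*c) {
    unsigned long cp;   /* decoded code point */
    unsigned long min;  /* smallest code point that needs this length */
    int nb;             /* continuation bytes expected after the lead byte */
    int i;

    if (*c < 0x80)                { cp = *c;        nb = 0; min = 0x0; }
    else if ((*c & 0xe0) == 0xc0) { cp = *c & 0x1f; nb = 1; min = 0x80; }
    else if ((*c & 0xf0) == 0xe0) { cp = *c & 0x0f; nb = 2; min = 0x800; }
    else if ((*c & 0xf8) == 0xf0) { cp = *c & 0x07; nb = 3; min = 0x10000; }
    else return 0;  /* stray continuation byte or unsupported lead byte */

    for (i = 1; i <= nb; i++) {
      /* A terminating NUL fails this test, so truncated input is rejected. */
      if ((c[i] & 0xc0) != 0x80) return 0;
      cp = (cp << 6) | (c[i] & 0x3f);
    }

    if (cp < min) return 0;                      /* non-minimal (overlong) form */
    if (cp >= 0xd800 && cp <= 0xdfff) return 0;  /* UTF-16 surrogate range */
    if (cp > 0x10ffff) return 0;                 /* beyond the Unicode range */

    c += nb + 1;
  }
  return 1;
}
```

Under these assumptions, {{utf8_is_strict((const unsigned char *)"\xc0\xaf")}} returns {{0}}, while the plain string {{"/"}} is accepted. Comparing the decoded value against a per-length minimum keeps the overlong check in one place rather than spreading it across special cases for individual lead bytes.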
...
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
Related Guidelines
CERT C++ Secure Coding Standard: MSC10-CPP. Character Encoding - UTF8 Related Issues
\[[ISO/IEC 10646:2003|AA. Bibliography#ISO/IEC 10646-2003]\]
\[[ISO/IEC PDTR 24772|AA. Bibliography#ISO/IEC PDTR 24772]\] "AJN Choice of Filenames and other External Identifiers"
MITRE CWE: CWE-176, "Failure to Handle Unicode Encoding"
MITRE CWE: CWE-116, "Improper Encoding or Escaping of Output"
Bibliography
| Wiki Markup |
|---|
\[[Kuhn 2006|AA. Bibliography#Kuhn 06]\] \[[MITRE 2007|AA. Bibliography#MITRE 07]\] [CWE ID 176|http://cwe.mitre.org/data/definitions/176.html], "Failure to Handle Unicode Encoding," [CWE ID 116|http://cwe.mitre.org/data/definitions/116.html], "Improper Encoding or Escaping of Output" \[[Pike 1993|AA. Bibliography#Pike 93]\] \[[Unicode 2006|AA. Bibliography#Unicode 06]\] \[[Viega 2003|AA. Bibliography#Viega 03]\] Section 3.12, "Detecting Illegal UTF-8 Characters" \[[Wheeler 2003|AA. Bibliography#Wheeler 03]\] \[[Yergeau 1998|AA. Bibliography#Yergeau 98]\] |
...