...
Security-Related Issues
According to RFC 2279: UTF-8, a transformation format of ISO 10646 [Yergeau 1998]:
Implementors of UTF-8 need to consider the security aspects of how they handle invalid UTF-8 sequences. It is conceivable that, in some circumstances, an attacker would be able to exploit an incautious UTF-8 parser by sending it an octet sequence that is not permitted by the UTF-8 syntax.
A particularly subtle form of this attack can be carried out against a parser that performs security-critical validity checks against the UTF-8 encoded form of its input, but interprets certain invalid octet sequences as characters. For example, a parser might prohibit the null character when encoded as the single-octet sequence 00, but allow the invalid two-octet sequence C0 80 and interpret it as a null character. Another example might be a parser which prohibits the octet sequence 2F 2E 2E 2F ("/../"), yet permits the invalid octet sequence 2F C0 AE 2E 2F.
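The overlong forms quoted above (C0 80 for the null character, C0 AE for ".") can be rejected before any security check runs. The following is a minimal sketch, not a definitive implementation. The function name `utf8_is_valid` is hypothetical, and the sketch assumes the stricter rules of RFC 3629 (the successor to RFC 2279): sequences of at most four octets, no surrogates, and no code points above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of any standard library).
 * Returns true only if buf[0..len) is well-formed UTF-8 under the
 * assumed RFC 3629 rules. Overlong forms such as C0 80 and C0 AE,
 * surrogates, truncated sequences, and code points above U+10FFFF
 * are all rejected.
 */
bool utf8_is_valid(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char lead = buf[i];
        size_t need;   /* number of continuation octets expected */
        uint32_t cp;   /* code point being decoded */
        uint32_t min;  /* smallest code point legal for this length */

        if (lead < 0x80) {            /* single-octet (ASCII) */
            i++;
            continue;
        } else if ((lead & 0xE0) == 0xC0) {
            need = 1; cp = lead & 0x1F; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            need = 2; cp = lead & 0x0F; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            need = 3; cp = lead & 0x07; min = 0x10000;
        } else {
            return false;             /* stray continuation or bad lead */
        }

        if (len - i - 1 < need) {
            return false;             /* sequence is truncated */
        }
        for (size_t k = 1; k <= need; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80) {
                return false;         /* not a continuation octet */
            }
            cp = (cp << 6) | (uint32_t)(c & 0x3F);
        }

        if (cp < min) {
            return false;             /* overlong encoding */
        }
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return false;             /* UTF-16 surrogate */
        }
        if (cp > 0x10FFFF) {
            return false;             /* outside the Unicode range */
        }
        i += need + 1;
    }
    return true;
}
```

In this sketch, validation is meant to run first, so that later checks (for example, a search for the octets 2F 2E 2E 2F) operate on input in which every character has exactly one possible encoding.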
...
Failing to properly handle UTF-8-encoded data can result in a data integrity violation or a denial-of-service attack.
| Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| MSC10-C | Medium | Unlikely | High | P2 | L3 |
Automated Detection
| Tool | Version | Checker | Description |
|---|---|---|---|
| | | 176 S | Fully implemented |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
...
| CERT C++ Secure Coding Standard | MSC10-CPP. Character encoding: UTF8-related issues |
| MITRE CWE | CWE-176, Failure to handle Unicode encoding; CWE-116, Improper encoding or escaping of output |
Bibliography
| [ISO/IEC 10646:2012] | |
| [Kuhn 2006] | UTF-8 and Unicode FAQ for Unix/Linux |
| [Pike 1993] | "Hello World" |
| [Unicode 2006] | |
| [Viega 2003] | Section 3.12, "Detecting Illegal UTF-8 Characters" |
| [Wheeler 2003] | Secure Programmer: Call Components Safely |
| [Yergeau 1998] | RFC 2279 |
...