...

Although UTF-8 originated with the Plan 9 developers \[[Pike 93|AA. C References#Pike 93]\], Plan 9's own support covers only the low 16-bit range. More generally, many "Unicode" systems support only the low 16-bit range, not the full 31-bit ISO 10646 code space \[[ISO/IEC 10646:2003(E)|AA. C References#ISO/IEC 10646-2003]\].

...

According to \[[Yergeau 98|AA. C References#Yergeau 98]\]:

...

[Corrigendum #1: UTF-8 Shortest Form|http://www.unicode.org/versions/corrigendum1.html] to the Unicode Standard \[[Unicode 06|AA. C References#Unicode 06]\] describes modifications to Version 3.0 of The Unicode Standard necessary to define what is meant by the shortest form.
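
To make the shortest-form requirement concrete, the following is a minimal sketch of a check that rejects non-minimal (overlong) sequences. It is not taken from any of the cited sources: the function name {{utf8_is_shortest_form}} and the {{min_cp}} table are hypothetical, and the sketch assumes the 1- to 4-byte forms and the U+10FFFF upper bound of the later RFC 3629 rather than the full 31-bit code space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the cited sources.
 * Returns 1 if the len bytes at s are well-formed UTF-8 in shortest form;
 * otherwise returns 0. Assumes sequences of at most 4 bytes. */
int utf8_is_shortest_form(const unsigned char *s, size_t len) {
  /* Smallest code point that legitimately needs 1, 2, 3, or 4 bytes. */
  static const uint32_t min_cp[5] = {0, 0x0, 0x80, 0x800, 0x10000};
  size_t i = 0;

  while (i < len) {
    unsigned char b = s[i];
    size_t n;    /* total bytes in this sequence */
    uint32_t cp; /* decoded code point */

    if (b < 0x80)                { n = 1; cp = b; }
    else if ((b & 0xE0) == 0xC0) { n = 2; cp = b & 0x1F; }
    else if ((b & 0xF0) == 0xE0) { n = 3; cp = b & 0x0F; }
    else if ((b & 0xF8) == 0xF0) { n = 4; cp = b & 0x07; }
    else return 0; /* stray continuation byte or unsupported lead byte */

    if (n > len - i) return 0; /* sequence is truncated */

    for (size_t k = 1; k < n; k++) {
      if ((s[i + k] & 0xC0) != 0x80) return 0; /* not a continuation byte */
      cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
    }

    if (cp < min_cp[n]) return 0;               /* non-minimal (overlong) form */
    if (cp > 0x10FFFF) return 0;                /* beyond the assumed upper bound */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogate code point */

    i += n;
  }
  return 1;
}
```

Under these assumptions, the two-byte sequence {{0xC0 0xAF}} decodes to U+002F ({{/}}), which fits in one byte, so the function returns {{0}}; the two-byte encoding of U+00E9, {{0xC3 0xA9}}, is already in shortest form and is accepted.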

...

The following function from \[[Viega 03|AA. C References#Viega 03]\] detects invalid character sequences in a string but does not reject non-minimal forms. It returns {{1}} if the string is composed only of legitimate sequences; otherwise it returns {{0}}.

...

\[[ISO/IEC 10646:2003|AA. C References#ISO/IEC 10646-2003]\]
\[[ISO/IEC PDTR 24772|AA. C References#ISO/IEC PDTR 24772]\] "AJN Choice of Filenames and other External Identifiers"
\[[Kuhn 06|AA. C References#Kuhn 06]\]
\[[MITRE 07|AA. C References#MITRE 07]\] [CWE ID 176|http://cwe.mitre.org/data/definitions/176.html], "Failure to Handle Unicode Encoding," and [CWE ID 116|http://cwe.mitre.org/data/definitions/116.html], "Improper Encoding or Escaping of Output"
\[[Pike 93|AA. C References#Pike 93]\]
\[[Unicode 06|AA. C References#Unicode 06]\]  
\[[Viega 03|AA. C References#Viega 03]\] Section 3.12, "Detecting Illegal UTF-8 Characters"
\[[Wheeler 03|AA. C References#Wheeler 03]\]
\[[Yergeau 98|AA. C References#Yergeau 98]\]

...
