(THIS CODING RULE OR GUIDELINE IS UNDER CONSTRUCTION)

According to the "UTF-8 and UTF-16 Strings" section of [JNI Tips], Java strings are encoded in UTF-16 and are not null-terminated. A Java string may contain \u0000 anywhere in its contents, so native code must obtain the length of the string explicitly (for example, with GetStringLength) rather than scanning for a terminator.
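The effect of an embedded \u0000 can be sketched without a JVM. The following is a minimal illustration, not JNI code: `utf16_unit` is a stand-in for JNI's `jchar`, and both function names are hypothetical. In real native code the buffer and its length would typically come from GetStringChars and GetStringLength.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for JNI's jchar (one unsigned 16-bit UTF-16 code
   unit), so that the sketch compiles without jni.h. */
typedef uint16_t utf16_unit;

/* WRONG for Java strings: treats the first 0 code unit as a terminator.
   If the data contains no 0 unit at all, this reads past the buffer. */
size_t length_assuming_terminator(const utf16_unit *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Counts occurrences of `target`, trusting only the explicit length. */
size_t count_unit(const utf16_unit *s, size_t len, utf16_unit target) {
    assert(s != NULL || len == 0);
    size_t hits = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == target) {
            hits++;
        }
    }
    return hits;
}
```

For the five code units of the Java string "ab\u0000cd", `length_assuming_terminator` reports 2, so any processing based on it never sees "cd". Passing the explicit length of 5 to `count_unit` examines the whole string.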

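JNI also provides methods that work with Modified UTF-8 (see [API 2013], Interface DataInput, section "Modified UTF-8"). The advantage of Modified UTF-8 is that it encodes \u0000 as the two bytes 0xc0 0x80 instead of 0x00, so the encoded data never contains a zero byte. This allows the use of C-style null-terminated strings that can be handled by the usual string functions. However, arbitrary UTF-8 data cannot be expected to work correctly in JNI. Data passed to the NewStringUTF function must be in Modified UTF-8 format, which differs from standard UTF-8 both in its encoding of \u0000 and in its encoding of supplementary characters (each is written as a surrogate pair of two three-byte sequences rather than as one four-byte sequence). Character data read from a file or stream must therefore not be passed to NewStringUTF unless it is known to be valid Modified UTF-8 (7-bit ASCII without zero bytes is a compatible subset); otherwise it must first be converted to Modified UTF-8.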
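One possible form of that conversion step is sketched below. It is an illustration under stated assumptions rather than a definitive implementation: the names `utf8_to_mutf8`, `put_three_bytes`, `is_continuation`, and `MUTF8_ERROR` are hypothetical, the input is taken as a byte buffer with an explicit length, and malformed input is rejected rather than repaired.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MUTF8_ERROR ((size_t)-1)   /* hypothetical error sentinel */

static int is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

/* Writes one 16-bit value in the range 0x0800..0xFFFF as three bytes. */
static void put_three_bytes(unsigned char *out, uint32_t unit) {
    out[0] = (unsigned char)(0xE0 | (unit >> 12));
    out[1] = (unsigned char)(0x80 | ((unit >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (unit & 0x3F));
}

/* Converts `len` bytes of standard UTF-8 in `src` to Modified UTF-8 in
   `dst` (capacity `cap`), appending a terminating zero byte. Returns the
   number of bytes written, not counting the terminator, or MUTF8_ERROR
   if the input is malformed or `dst` is too small. */
size_t utf8_to_mutf8(const unsigned char *src, size_t len,
                     unsigned char *dst, size_t cap) {
    assert(dst != NULL && (src != NULL || len == 0));
    size_t i = 0, o = 0;
    while (i < len) {
        unsigned char b = src[i];
        size_t in_len, out_len;
        if (b < 0x80) { in_len = 1; out_len = (b == 0x00) ? 2 : 1; }
        else if (b >= 0xC2 && b <= 0xDF) { in_len = 2; out_len = 2; }
        else if ((b & 0xF0) == 0xE0) { in_len = 3; out_len = 3; }
        else if (b >= 0xF0 && b <= 0xF4) { in_len = 4; out_len = 6; }
        else return MUTF8_ERROR;            /* invalid lead byte */

        if (len - i < in_len) return MUTF8_ERROR;        /* truncated input */
        if (out_len + 1 > cap - o) return MUTF8_ERROR;   /* no room left */
        for (size_t k = 1; k < in_len; k++) {
            if (!is_continuation(src[i + k])) return MUTF8_ERROR;
        }

        if (b == 0x00) {
            /* U+0000 becomes the two-byte form, so no zero byte is emitted. */
            dst[o] = 0xC0;
            dst[o + 1] = 0x80;
        } else if (in_len == 4) {
            /* Supplementary character: re-encode as a surrogate pair. */
            uint32_t cp = ((uint32_t)(b & 0x07) << 18)
                        | ((uint32_t)(src[i + 1] & 0x3F) << 12)
                        | ((uint32_t)(src[i + 2] & 0x3F) << 6)
                        | (uint32_t)(src[i + 3] & 0x3F);
            if (cp < 0x10000 || cp > 0x10FFFF) return MUTF8_ERROR;
            cp -= 0x10000;
            put_three_bytes(dst + o, 0xD800 | (cp >> 10));
            put_three_bytes(dst + o + 3, 0xDC00 | (cp & 0x3FF));
        } else {
            if (in_len == 3) {
                /* Reject overlong forms and encoded surrogates. */
                uint32_t cp = ((uint32_t)(b & 0x0F) << 12)
                            | ((uint32_t)(src[i + 1] & 0x3F) << 6)
                            | (uint32_t)(src[i + 2] & 0x3F);
                if (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF)) {
                    return MUTF8_ERROR;
                }
            }
            for (size_t k = 0; k < in_len; k++) dst[o + k] = src[i + k];
        }
        i += in_len;
        o += out_len;
    }
    if (o >= cap) return MUTF8_ERROR;       /* no room for the terminator */
    dst[o] = 0x00;
    return o;
}
```

An output buffer of `2 * len + 1` bytes is always large enough, because the worst case is a zero byte expanding to two bytes. On success the result is a null-terminated Modified UTF-8 string of the kind NewStringUTF expects. For example, the standard UTF-8 bytes F0 9F 98 80 (U+1F600) are rewritten as ED A0 BD ED B8 80.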

Noncompliant Code Example

This noncompliant code example uses the wrong character encoding, which produces erroneous results.

Compliant Solution

In this compliant solution ...

Risk Assessment

Native code that assumes Java strings are null-terminated may truncate strings that contain \u0000 or read past the end of the character data, producing erroneous results.

Rule	Severity	Likelihood	Remediation Cost	Priority	Level
JNI04-J	Low	Probable	High	P2	L3

Automated Detection

It is not feasible to automatically detect that the wrong character encoding is assumed in handling Java strings.

Bibliography

JNI Tips	UTF-8 and UTF-16 Strings
API 2013	Modified UTF-8

JNI04-J. Do not assume that Java strings are null-terminated
