...
This noncompliant code example attempts to convert the byte array representing a BigInteger into a String. Because some of the bytes do not denote valid characters, the resulting String representation loses information. Converting the String back to a BigInteger produces a different value.

    BigInteger x = new BigInteger("530500452766");
    // convert x to a String
    byte[] byteArray = x.toByteArray();
    String s = new String(byteArray);
    // convert s back to a BigInteger
    byteArray = s.getBytes();
    x = new BigInteger(byteArray);

When this program was run on a Linux platform where the default character encoding is US-ASCII, the string s had the value {?J??, because some of the bytes did not map to printable characters. When s was converted back to a BigInteger, x had the value 149830058370101340468658109.
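The loss does not depend on the platform default; it can be reproduced with any charset in which some of the bytes are malformed. The following is a minimal sketch, assuming UTF-8 is named explicitly in place of the default encoding. The class and method names are hypothetical and not part of the original example.

```java
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, used only to illustrate the lossy round trip.
final class LossyRoundTripDemo {

    // Decodes the BigInteger's raw bytes as text in the given charset,
    // encodes that text back to bytes, and rebuilds a BigInteger from them.
    static BigInteger roundTripThroughString(BigInteger value, Charset charset) {
        byte[] raw = value.toByteArray();
        // Bytes that are malformed in this charset are silently replaced here.
        String s = new String(raw, charset);
        return new BigInteger(s.getBytes(charset));
    }

    public static void main(String[] args) {
        BigInteger x = new BigInteger("530500452766");
        BigInteger y = roundTripThroughString(x, StandardCharsets.UTF_8);
        System.out.println("original:   " + x);
        System.out.println("round trip: " + y);
        System.out.println("preserved:  " + x.equals(y));
    }
}
```

For this input the final line reports false, because three of the five bytes of x are not valid UTF-8 and are replaced during decoding.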
...
Do not try to convert the String object to a byte array to obtain the original BigInteger. Character-encoded data may yield a byte array that, when converted to a BigInteger, results in a completely different value.

    BigInteger x = new BigInteger("530500452766");
    String s = x.toString(); // valid character data
    try {
      byte[] byteArray = s.getBytes("UTF8");
      // ns prints as "530500452766"
      String ns = new String(byteArray, "UTF8");
      // construct the original BigInteger
      BigInteger x1 = new BigInteger(ns);
    } catch (UnsupportedEncodingException ex) {
      // handle error
    }

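On Java 7 and later, the constants in java.nio.charset.StandardCharsets avoid the checked UnsupportedEncodingException. The following is a minimal sketch of the same text-based round trip under that assumption; the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; it mirrors the toString-based approach above.
final class BigIntegerTextRoundTrip {

    // Encodes the decimal text form of the value, which is valid character data.
    static byte[] toBytes(BigInteger value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the text and parses it back into a BigInteger.
    static BigInteger fromBytes(byte[] encoded) {
        return new BigInteger(new String(encoded, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        BigInteger x = new BigInteger("530500452766");
        BigInteger x1 = fromBytes(toBytes(x));
        System.out.println(x.equals(x1)); // prints "true"
    }
}
```

Because the decimal digits and the minus sign are all ASCII characters, they are representable in UTF-8 and the value survives the round trip.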
Noncompliant Code Example
This noncompliant code example corrupts the data when string contains characters that cannot be represented in the specified charset.

    // Corrupts data on errors
    public static byte[] toCodePage(String charset, String string)
      throws UnsupportedEncodingException {
      return string.getBytes(charset);
    }

    // Fails to detect corrupt data
    public static String fromCodePage(String charset, byte[] bytes)
      throws UnsupportedEncodingException {
      return new String(bytes, charset);
    }

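The silent corruption is easy to observe. The sketch below uses the String.getBytes(Charset) overload, whose documentation specifies that unmappable characters are replaced with the charset's default replacement bytes; the class and method names are hypothetical, and US-ASCII is chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing that encoding errors go unreported.
final class SilentReplacementDemo {

    // Encodes text as US-ASCII and decodes it again; no exception is thrown
    // even when the text contains characters that US-ASCII cannot represent.
    static String asciiRoundTrip(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.US_ASCII);
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) has no US-ASCII representation.
        System.out.println(asciiRoundTrip("caf\u00e9")); // prints "caf?"
    }
}
```

The caller receives a well-formed string and has no indication that a character was lost.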
Compliant Solution
The java.nio.charset.CharsetEncoder class can transform a sequence of 16-bit Unicode characters into a sequence of bytes in a specific charset, while the java.nio.charset.CharsetDecoder class can reverse the procedure [API 2006].
This compliant solution uses the CharsetEncoder and CharsetDecoder classes to handle encoding conversions.

    public static byte[] toCodePage(String charset, String string)
      throws IOException {
      Charset cs = Charset.forName(charset);
      CharsetEncoder coder = cs.newEncoder();
      ByteBuffer bytebuf = coder.encode(CharBuffer.wrap(string));
      byte[] bytes = new byte[bytebuf.limit()];
      bytebuf.get(bytes);
      return bytes;
    }

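The decoding direction can be handled the same way with CharsetDecoder. The following is a sketch of what a fromCodePage counterpart might look like, not the original author's code; the class name and the isDecodable helper are hypothetical. A newly created decoder reports malformed input and unmappable characters, so its decode convenience method throws CharacterCodingException rather than substituting replacement characters.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Hypothetical class holding a strict decoding counterpart to toCodePage.
final class StrictDecoding {

    // Decodes bytes in the named charset and throws if any byte sequence
    // is malformed or unmappable, instead of silently replacing it.
    static String fromCodePage(String charset, byte[] bytes)
            throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName(charset).newDecoder();
        CharBuffer chars = decoder.decode(ByteBuffer.wrap(bytes));
        return chars.toString();
    }

    // Convenience helper (an assumption for this sketch): reports whether
    // the bytes decode cleanly in the named charset.
    static boolean isDecodable(String charset, byte[] bytes) {
        try {
            fromCodePage(charset, bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isDecodable("US-ASCII", new byte[] {0x41}));        // prints "true"
        System.out.println(isDecodable("US-ASCII", new byte[] {(byte) 0x84})); // prints "false"
    }
}
```

CharacterCodingException is a subclass of IOException, so a method declared with throws IOException, as in toCodePage above, can propagate it unchanged.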
Noncompliant Code Example
...

    // Corrupts data on errors
    public static void toFile(String charset, String filename,
                              String string) throws IOException {
      FileOutputStream stream = new FileOutputStream(filename, true);
      OutputStreamWriter writer = new OutputStreamWriter(stream, charset);
      writer.write(string, 0, string.length());
      writer.close();
    }

Compliant Solution
This compliant solution uses the CharsetEncoder class to perform the required function.
...