...

The trail-byte ranges overlap the ranges of both single-byte and lead-byte characters. As a result, if a multibyte character is split across a buffer boundary, its bytes can be interpreted differently: each fragment is decoded according to the bytes it happens to contain rather than as part of the original character [Phillips 05].

See also FIO03-J. Specify the character encoding while performing file or network IO.
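
The buffer-boundary problem described above can be reproduced with a short sketch. The class name `SplitMultibyteDemo` and the helper `decodeChunks` are hypothetical names introduced only for illustration. The sketch also assumes UTF-8 rather than an encoding with overlapping trail-byte ranges: in UTF-8 each broken fragment decodes to the replacement character U+FFFD, whereas in an encoding with overlapping ranges the fragments could instead decode to valid but wrong characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitMultibyteDemo {
  // Hypothetical helper: decodes the two halves of the array independently,
  // as code that converts each filled buffer to a String on its own would.
  static String decodeChunks(byte[] bytes, int splitAt) {
    String first = new String(
        Arrays.copyOfRange(bytes, 0, splitAt), StandardCharsets.UTF_8);
    String second = new String(
        Arrays.copyOfRange(bytes, splitAt, bytes.length), StandardCharsets.UTF_8);
    return first + second;
  }

  public static void main(String[] args) {
    String original = "a\u00e9b"; // U+00E9 occupies two bytes in UTF-8
    byte[] bytes = original.getBytes(StandardCharsets.UTF_8); // 61 C3 A9 62

    // Boundary between characters: the text survives.
    System.out.println(decodeChunks(bytes, 1).equals(original));

    // Boundary between C3 and A9: the character is split and cannot
    // be recovered by decoding each half separately.
    System.out.println(decodeChunks(bytes, 2).equals(original));
  }
}
```

Decoding only after all bytes have been collected, or using a decoder that carries incomplete sequences over to the next read, avoids the mismatch.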

A third issue arises from the behavior of the String class constructor. According to the Java API [API 06] for the String class:

...

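import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public static String readBytes(FileInputStream fis) throws IOException
{
  byte[] data = new byte[1024];
  DataInputStream dis = new DataInputStream(fis);
  // readFully() blocks until the whole array is filled and throws
  // EOFException if the stream ends first, so no partial read is decoded.
  dis.readFully(data);
  // Decode only after all bytes are available, using an explicit charset.
  String str = new String(data, "UTF-8");
  return str;
}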
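
Because `readFully()` requires exactly 1024 bytes, input of unknown length needs a different approach. The following is a minimal sketch of one alternative, not the guideline's own method: it lets an `InputStreamReader` perform the decoding, since the reader carries an incomplete byte sequence over to its next read. The class name `ReaderDecodeSketch`, the helper `readAll`, and the `bufChars` parameter are hypothetical names chosen for illustration, and UTF-8 is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ReaderDecodeSketch {
  // Hypothetical helper: decodes the entire stream as UTF-8.
  // Closing the Reader also closes the underlying stream.
  static String readAll(InputStream in, int bufChars) throws IOException {
    StringBuilder sb = new StringBuilder();
    try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
      char[] buf = new char[bufChars];
      int n;
      // The buffer holds decoded chars, not raw bytes, so a multibyte
      // character is never cut in half at a buffer boundary.
      while ((n = r.read(buf)) != -1) {
        sb.append(buf, 0, n);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) throws IOException {
    byte[] bytes = "a\u00e9b".getBytes(StandardCharsets.UTF_8);
    // Even a one-char buffer reproduces the original text.
    String decoded = readAll(new ByteArrayInputStream(bytes), 1);
    System.out.println(decoded.equals("a\u00e9b"));
  }
}
```

The trade-off is that the caller receives characters rather than bytes, which suits text processing but not code that must inspect the raw byte stream.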

...