The contracts of the read methods for the InputStream and Reader families are complicated. According to the Java API [API 2006] for the class InputStream, the read(byte[], int, int) method provides the following behavior:

The default implementation of this method blocks until the requested amount of input data len has been read, end of file is detected, or an exception is thrown. Subclasses are encouraged to provide a more efficient implementation of this method.

However, the read(byte[]) method states that it:

Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer. The number of bytes read is, at most, equal to the length of b.

The read() methods return as soon as they find available input data. As a result, these methods can stop reading data before the array is filled, because there might not be enough data available to fill the array.
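The short-read behavior is easy to reproduce. The following is a minimal sketch, not part of the original guideline: ChunkedInputStream, PartialReadDemo, and singleRead are hypothetical names, and the wrapper simply caps each call at a fixed number of bytes to imitate a socket or pipe that delivers data in pieces.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class PartialReadDemo {
  // Hypothetical wrapper: hands out at most maxChunk bytes per call,
  // imitating a source that delivers its data in pieces.
  static class ChunkedInputStream extends FilterInputStream {
    private final int maxChunk;

    ChunkedInputStream(InputStream in, int maxChunk) {
      super(in);
      this.maxChunk = maxChunk;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      // Delegate to the wrapped stream, but never ask for more than maxChunk
      return super.read(b, off, Math.min(len, maxChunk));
    }
  }

  // Returns the count reported by one read(byte[]) call on a fresh buffer.
  static int singleRead(InputStream in, int bufferSize) throws IOException {
    byte[] buffer = new byte[bufferSize];
    return in.read(buffer);
  }

  public static void main(String[] args) throws IOException {
    // Ten bytes are available, but each call delivers at most four
    InputStream in = new ChunkedInputStream(
        new ByteArrayInputStream(new byte[10]), 4);
    System.out.println(singleRead(in, 10)); // prints 4
  }
}
```

Although ten bytes are available and the buffer has room for all of them, the first call reports only four, so code that assumes a full buffer would process six stale bytes.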

Ignoring the result returned by the read() methods is a violation of rule EXP00-J. Do not ignore values returned by methods. Security issues can arise even when return values are considered, because the default behavior of the read() methods lacks any guarantee that the entire buffer array is filled. Consequently, when using read() to fill an array, the program must check the return value of read() and handle the case where the array is only partially filled. In such cases, the program may try to fill the rest of the array, work only with the subset of the array that was filled, or throw an exception.
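The three ways of handling a partially filled array can be sketched as follows. This is an illustration under stated assumptions rather than the guideline's own code: the class ReadHandlingSketch and the methods fill, filledPart, and fillOrThrow are hypothetical names chosen for this example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ReadHandlingSketch {
  // Option 1: keep calling read() until the array is full or the stream ends.
  // Returns the number of bytes actually stored in buffer.
  static int fill(InputStream in, byte[] buffer) throws IOException {
    int offset = 0;
    while (offset < buffer.length) {
      int n = in.read(buffer, offset, buffer.length - offset);
      if (n == -1) {
        break; // end of stream before the array was full
      }
      offset += n;
    }
    return offset;
  }

  // Option 2: work only with the part of the array that was filled.
  static byte[] filledPart(InputStream in, byte[] buffer) throws IOException {
    int filled = fill(in, buffer);
    return Arrays.copyOf(buffer, filled);
  }

  // Option 3: treat a short read as an error.
  static void fillOrThrow(InputStream in, byte[] buffer) throws IOException {
    int filled = fill(in, buffer);
    if (filled < buffer.length) {
      throw new EOFException(
          "expected " + buffer.length + " bytes but got " + filled);
    }
  }
}
```

The standard library's DataInputStream.readFully(byte[]) behaves much like the third option: it either fills the array or throws EOFException.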

...

The programmer's misunderstanding of the general contract of the read() method can result in failure to read the intended data in full. A single call may return fewer than 1024 bytes even though additional data is still available from the InputStream.

Compliant Solution (Multiple Calls to read())

This compliant solution reads all the desired bytes into its buffer, accounting for the total number of bytes read and adjusting the remaining bytes' offset, consequently ensuring that the required data is read in full. It also avoids splitting multibyte encoded characters across buffers by deferring construction of the result string until the data has been fully read. (See IDS10-J. Do not assume every character in a string is the same size for more information.)

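import java.io.IOException;
import java.io.InputStream;

public static String readBytes(InputStream in) throws IOException {
  int offset = 0;     // number of bytes stored in data so far
  int bytesRead = 0;  // result of the most recent read() call
  byte[] data = new byte[1024];
  // Keep reading until the buffer is full or end of stream (-1) is reached
  while (offset < data.length
      && (bytesRead = in.read(data, offset, data.length - offset)) != -1) {
    offset += bytesRead;
  }
  // Decode only the bytes actually read, not the unused tail of the buffer
  String str = new String(data, 0, offset, "UTF-8");
  return str;
}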
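The remark about multibyte characters can be illustrated with a small sketch. It is not part of the original guideline, and SplitCharacterDemo, decodePerChunk, and decodeWhole are hypothetical names; the example assumes UTF-8 input and compares decoding each chunk separately with decoding after all the bytes have been collected.

```java
import java.nio.charset.StandardCharsets;

class SplitCharacterDemo {
  // Decodes the two halves of data separately, as code would do if it
  // built a String after every partial read, then joins the results.
  static String decodePerChunk(byte[] data, int split) {
    String head = new String(data, 0, split, StandardCharsets.UTF_8);
    String tail = new String(data, split, data.length - split,
        StandardCharsets.UTF_8);
    return head + tail;
  }

  // Decodes all the bytes in one step, after reading has finished.
  static String decodeWhole(byte[] data) {
    return new String(data, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // U+00E9 occupies two bytes in UTF-8
    byte[] data = "\u00e9".getBytes(StandardCharsets.UTF_8);
    System.out.println(decodeWhole(data).equals("\u00e9"));       // prints true
    System.out.println(decodePerChunk(data, 1).equals("\u00e9")); // prints false
  }
}
```

When the chunk boundary falls inside the two-byte sequence, neither half is valid UTF-8 on its own, so the decoder substitutes replacement characters and the original character is lost.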

...

Related Guidelines

MITRE CWE

CWE-135, "Incorrect Calculation of Multi-Byte String Length"

Bibliography

[API 2006] Class InputStream, DataInputStream

[Chess 2007] 8.1 Handling Errors with Return Codes

[Harold 1999] Chapter 7: Data Streams, Reading Byte Arrays

[Phillips 2005]

...
