I found some strange behavior in the implementation of the clojure.java.io/copy function:
https://github.com/clojure/clojure/blob/ee1b606ad066ac8df2efd4a6b8d0d365c206f5bf/src/clj/clojure/java/io.clj#L391
(defn copy
  "Copies input to output. Returns nil or throws IOException.
  Input may be an InputStream, Reader, File, byte[], char[], or String.
  Output may be an OutputStream, Writer, or File.
  Options are key/value pairs and may be one of
    :buffer-size  buffer size to use, default is 1024.
    :encoding     encoding to use if converting between
                  byte and char streams.
  Does not close any streams except those it opens itself
  (on a File)."
  {:added "1.2"}
  [input output & opts]
  (do-copy input output (when opts (apply hash-map opts))))
When copying from an InputStream to an OutputStream, the actual copying is implemented here:
https://github.com/clojure/clojure/blob/ee1b606ad066ac8df2efd4a6b8d0d365c206f5bf/src/clj/clojure/java/io.clj#L306
(defmethod do-copy [InputStream OutputStream] [^InputStream input ^OutputStream output opts]
  (let [buffer (make-array Byte/TYPE (buffer-size opts))]
    (loop []
      (let [size (.read input buffer)] ;;; XXX point 1
        (when (pos? size) ;;; XXX point 2
          (do (.write output buffer 0 size)
              (recur)))))))
Here the .read call at point 1 is InputStream.read(byte[]): https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#read(byte[])
The Javadoc states the following:
Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer. This method blocks until input data is available, end of file is detected, or an exception is thrown.
If the length of b is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at the end of the file, the value -1 is returned; otherwise, at least one byte is read and stored into b.
This means that a return value of -1 signals end of stream, while a return value of 0 does not. However, the condition at point 2 in the code above stops the loop whenever .read returns a value smaller than 1.
Now consider a case where .read returns this sequence of values in consecutive calls:
1024, 0, 1024, 201, -1
clojure.java.io/copy then copies only the first 1024 bytes, even though the whole stream has 2249 bytes. Is this the intended behavior? Should the condition at point 2 be (not (neg? size))?
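Here is a minimal sketch that reproduces the difference outside of clojure.java.io. It is not the library code: scripted-stream, copy-loop and copied-bytes are hypothetical helpers written only for this illustration, and the loop is parameterized by the check used at point 2. The stream does not carry real data; it only reports the scripted sizes, so the output is all zero bytes and only the byte count is meaningful.

```clojure
(ns example.zero-read
  (:import (java.io InputStream OutputStream ByteArrayOutputStream)))

(defn scripted-stream
  "Hypothetical helper: an InputStream whose read(byte[]) returns the
  given sizes one per call, then -1 (end of stream). Only the byte[]
  overload is supported; the buffer contents are left untouched."
  [sizes]
  (let [remaining (atom (seq sizes))]
    (proxy [InputStream] []
      (read [buf]
        (if-let [[n & more] @remaining]
          (do (reset! remaining more) n)
          -1)))))

(defn copy-loop
  "Same shape as the loop quoted above, but keeps going while
  (continue? size) is true, so both checks can be compared."
  [continue? ^InputStream input ^OutputStream output]
  (let [buffer (byte-array 1024)]
    (loop []
      (let [size (.read input buffer)]
        (when (continue? size)
          (.write output buffer 0 size)
          (recur))))))

(defn copied-bytes
  "Runs copy-loop over a scripted stream and returns the number of
  bytes that reached the output."
  [continue? sizes]
  (let [out (ByteArrayOutputStream.)]
    (copy-loop continue? (scripted-stream sizes) out)
    (.size out)))

;; Check currently at point 2: stops at the first 0.
(copied-bytes pos? [1024 0 1024 201])
;; => 1024

;; Proposed check: stops only at -1.
(copied-bytes (complement neg?) [1024 0 1024 201])
;; => 2249
```

One trade-off of the (not (neg? size)) variant: a stream that keeps returning 0 without ever reaching end of stream would make the loop spin instead of terminating early.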
I also posted this issue to Google Groups: https://groups.google.com/forum/#!topic/clojure/XzpPPXXhgM4