I spent a lot of time today trying to find and fix an issue that turned out to be a fun discovery in the end.
The following Java error occurred when loading a PDF file from a URL stream:
java.io.IOException: missing CR
    at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:405)
    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:572)
    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3066)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3060)
This looked like a Java library bug, since the Java version was a bit old, so my first idea was to replace the code with Apache HttpClient-based code to load the URL. That produced a very similar error:
java.io.IOException: CRLF expected at end of chunk: 121/79
    at org.apache.commons.httpclient.ChunkedInputStream.readCRLF(Unknown Source)
    at org.apache.commons.httpclient.ChunkedInputStream.nextChunk(Unknown Source)
    at org.apache.commons.httpclient.ChunkedInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.commons.httpclient.AutoCloseInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at org.apache.commons.httpclient.AutoCloseInputStream.read(Unknown Source)
Since this was a Windows machine and the requests went through localhost, my next attempt was to use a different network interface. The result was similar.
After some searching I found a nice tool, RawCap (http://www.netresec.com/?page=RawCap), which does not require any installation and can capture even localhost traffic into a pcap-compatible file that can then be inspected in Wireshark.
The result was strange. Opening the capture in Wireshark on my machine showed [7540 bytes missing in capture file] in the TCP stream. This coincided with a lot of packets flagged [TCP ZeroWindow] and [TCP ZeroWindowProbe].
Since this was a VMware install and I had previously had trouble with VMware software switches, I assumed this was related to the network card configuration. However, it also happened on localhost.
After some more investigation I realized this only happened when several requests were made in parallel. I confirmed this by looking at the code.
The code contained a worker pool. Each worker thread constructed the URL, then used it to open and return an InputStream:
DataSource source = new URLDataSource(reportUrl);
return source.getInputStream();
However, all the results were consumed in sequence. While one InputStream was being read, the server kept sending data on all the other requests, but the client was not reading that data fast enough. As a result the TCP window did indeed drop to 0 and the strange error occurred.
Of course the solution was to fully read the data in each worker.
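A minimal sketch of what that can look like, not the exact code I ended up with. The names ReportFetcher, readFully and fetch are made up for illustration, and I use URL.openStream() here in place of URLDataSource only to keep the example free of the javax.activation dependency. The idea is the same: the worker drains the connection completely, closes it, and hands back a stream over the in-memory copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper class, named for illustration only.
class ReportFetcher {

    // Reads the stream until end-of-stream and returns everything as a byte array.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Meant to be called from a worker thread: download the whole response,
    // close the connection, and return a stream backed by memory instead of
    // the socket, so a slow consumer can no longer stall the server.
    static InputStream fetch(URL reportUrl) throws IOException {
        try (InputStream in = reportUrl.openStream()) {
            return new ByteArrayInputStream(readFully(in));
        }
    }
}
```

The trade-off is that every report is held in memory until it is consumed, which should be fine for reasonably sized PDF files but would need a temporary file or a bounded pool for very large downloads.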