If the above is true then the best answer by far is to remove chunking. The core functions that matter can be written better, there are transducers and ham-fisted's lazy-noncaching namespace both of which provide lazy, noncaching programming model which is just faster than lazy-caching-threadsafe.