
+4 votes
in core.async

This change (https://github.com/clojure/core.async/commit/3429e3e1f1d49403bf9608b36dbd6715ffe4dd4f) significantly changes the number of threads created by a pipeline (0 new threads vs. N new threads) and removes the only difference between pipeline and pipeline-blocking while keeping both as separate functions. That seems odd, so I am trying to understand its purpose.

  1. I have seen Alex say he found a bug with a blocking operation in pipeline and fixed it by making pipeline and pipeline-blocking act the same, so I think that refers to this commit.

  2. What I don't understand is where the blocking operation is in the old version. The only candidate I see is the use of >!! in the process function, but it is called once on a channel with a buffer of size 1 that is created immediately before and closed right after, so no other operations can be pending on it. It might as well be a put!, which never blocks.

If #1 really is the reason behind the commit, I don't understand it in light of #2. If #1 is not the reason behind the commit, does anyone know what it is?

Is #2 straight up wrong? Is there some way that the >!! can block given a non-shared channel with space in its buffer?

Is this about blocking xforms? I thought that was caveat emptor.

Will pipeline be marked as deprecated since it now has the same behavior as pipeline-blocking?

1 Answer

+3 votes
Best answer

As a general policy, blocking operations should not be performed in the scope of a go block, and that includes the core.async blocking operations like >!!. So, yes - the problem here is that the pipeline code violates this policy. In practice, you're correct that in this particular case using >!! on an empty channel of size 1 won't block (but it still violates the intent of the policy).
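To make the point about the fresh, buffered channel concrete, here is a minimal sketch. It is not the actual pipeline source; the function names `blocking-put-roundtrip` and `nonblocking-put-roundtrip` are hypothetical and only mimic the shape described in the question (create a private channel with a buffer of 1, put once, close, read back).

```clojure
(require '[clojure.core.async :as a])

(defn blocking-put-roundtrip
  "Hypothetical helper: a private channel with a buffer of 1, a single >!!,
  then close! and read the value back."
  [v]
  (let [res       (a/chan 1)
        accepted? (a/>!! res v)] ; the buffer has room, so this returns without waiting
    (a/close! res)
    ;; A value buffered before close! can still be taken afterwards.
    [accepted? (a/<!! res)]))

(defn nonblocking-put-roundtrip
  "Same shape, but using put!, which never blocks the calling thread."
  [v]
  (let [res       (a/chan 1)
        accepted? (a/put! res v)]
    (a/close! res)
    [accepted? (a/<!! res)]))
```

Both versions behave the same on a channel that nobody else touches; the difference is only that >!! is a blocking operation by contract, which is what the policy is about.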

Using put! would be another option. However, there is also a bigger conceptual issue: pipeline purports to use "parallelism n", but by putting those tasks in go blocks, you are actually subject to the max parallelism of the go thread pool, which is typically 8. That presumes no one else is using the go block thread pool, so it could be even less than that. Secondly, if the user happens to do a blocking operation in pipeline (which they shouldn't), they can easily lock up the whole system.
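The parallelism cap can be illustrated with plain JVM executors. This is only an analogy, not core.async's implementation: it assumes a fixed pool of 8 threads as a stand-in for the go thread pool, and the helper name `all-ran-concurrently?` is hypothetical. Each task waits until all n tasks are running at the same time, so the check only succeeds if the pool really provides n-way parallelism.

```clojure
(import '(java.util.concurrent Callable CountDownLatch Executors
                               ExecutorService TimeUnit))

(defn all-ran-concurrently?
  "Submits n tasks that each wait (up to 1 second) until all n are running
  at once. Returns true only if every task saw all n running together.
  Shuts the pool down afterwards."
  [^ExecutorService pool n]
  (let [latch (CountDownLatch. n)
        task  (fn []
                (.countDown latch)
                (.await latch 1000 TimeUnit/MILLISECONDS))]
    (try
      (let [futures (mapv (fn [_] (.submit pool ^Callable task)) (range n))]
        (every? #(true? (.get %)) futures))
      (finally
        (.shutdownNow pool)))))

;; Fixed pool of 8 (assumed stand-in for the go pool): 32-way parallelism is not reached.
(def fixed-result  (all-ran-concurrently? (Executors/newFixedThreadPool 8) 32))

;; Caching pool: grows to one thread per task, so all 32 run together.
(def cached-result (all-ran-concurrently? (Executors/newCachedThreadPool) 32))
```

With the fixed pool, asking for "parallelism 32" quietly degrades to at most 8 tasks in flight, and any task that blocks holds one of those 8 threads for as long as it waits.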

Switching to the same strategy as pipeline-blocking (a separate caching thread pool) addresses the parallelism issue. However, this may not be the last change in this regard - we may still switch to an explicit fixed-size compute thread pool rather than a caching thread pool, which would separate these again. So, no plans to deprecate anything - the user is still stating an intent that makes an important difference here.
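As a usage sketch of "stating an intent": the two functions take the same arguments, so the choice between them is purely a declaration of what kind of work the transducer does. The helper `run-pipeline` is hypothetical, and `to-chan!` assumes a reasonably recent core.async release (older releases only have `to-chan`).

```clojure
(require '[clojure.core.async :as a])

(defn run-pipeline
  "Hypothetical helper: feeds coll through pipeline-fn with parallelism n
  and transducer xf, and returns the results as a vector."
  [pipeline-fn n xf coll]
  (let [in  (a/to-chan! coll)
        out (a/chan)]
    (pipeline-fn n out xf in)   ; closes out once in is exhausted
    (a/<!! (a/into [] out))))

;; Compute-bound work: state that intent with pipeline.
(def computed
  (run-pipeline a/pipeline 4 (map inc) (range 10)))

;; Work that may block (simulated here with a sleep): use pipeline-blocking.
(def fetched
  (run-pipeline a/pipeline-blocking 4
                (map (fn [x] (Thread/sleep 10) (inc x)))
                (range 10)))
```

Both calls preserve input order in the output, so callers can switch between them without changing downstream code if the thread pool strategy behind either one changes later.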

...