Welcome! Please see the About page for a little more info on how this works.

+14 votes
in Java Interop by
closed by

Asking this here because I can't post to https://clojure.atlassian.net/jira/software/c/projects/CLJ/issues/CLJ-2771.

I run a startup and we ran headlong into this synchronized block when using Java 19 virtual threads:

https://github.com/clojure/clojure/blob/4090f405466ea90bbaf3addbe41f0a6acb164dbb/src/jvm/clojure/lang/Delay.java#L35

This took significant engineering effort and sanity to debug, as -Djdk.tracePinnedThreads reported precisely nothing.

As it stands, we're needing to reimplement a virtual-thread-friendly (non-synchronized) clojure.core/delay, and as many clojure.core functions as we need to ensure virtual threads don't get pinned.

As always, I'm amazed by the Clojure team's work! Just wanted to highlight the virtual-thread-unfriendliness of some of its API and get it prioritized if possible, since virtual threads are extremely useful.

Thanks!

closed with the note: Done in 1.12.0-alpha5

2 Answers

+2 votes
by
 
Best answer

Clojure 1.12.0-alpha5 is now available and modifies lazy-seq and delay to use locks instead of synchronized.

0 votes
by

Thanks, definitely interested in preparing Clojure for virtual threads. I'm curious what you had in your delay?

by
Great to hear! Initially it was postprocessing of an HTTP response (which itself was a promise returned from HttpKit). Then, to eliminate some variables, I removed the HTTP request, added an unrealized "dummy" promise, and made the `delay` body simply `(deref p <timeout> nil)`. Low values of `<timeout>` — 1-10 ms — almost never showed any issues. Higher values would more and more often result in "virtual thread starvation" due to pinning.

Overriding `clojure.lang.Delay` with a custom `Delay` implementation using `ReentrantLock` fixed the problem. Happy to share the Java class we have — it passes https://github.com/clojure/clojure/blob/master/test/clojure/test_clojure/delays.clj on both platform and virtual threads. (It also has an optimization which makes it avoid referencing a `volatile` value field on `deref` if it can reference an unsynchronized one.)
by
The main thing that's bad about synchronized in virtual threads is synchronized around blocking, so definitely delay around that seems like a possible issue. I've looked at all the synchronization in core and there are really just a handful that seem problematic (this is one).

"Avoid referencing a `volatile` value field on `deref` if it can reference an unsynchronized one" sounds like a classic concurrency mistake in Java (ala double checked locking), but might be ok depending on usage. Just keep in mind there are no guarantees any other thread will see changes to shared non-volatile, non-synchronized state.
by
Agreed about `synchronized`.

Good observation about a possible concurrency mistake. Here's the relevant code. Particularly `deref` uses the `volatile`/unsynchronized strategy:

```
public class Delay implements IDeref, IPending {
  volatile Object value;
  Object unsynchronizedValue;
  ReentrantLock lock;

  private static class ExceptionalValue {
    private final Object v;

    public ExceptionalValue (Object v) {
      this.v = v;
    }
  }

  private static class PendingValue {
    private final Object v;

    public PendingValue (Object v) {
      this.v = v;
    }
  }

  public Delay(IFn f){
    Object v = new PendingValue(f);
    unsynchronizedValue = v;
    value = v;
    lock = new ReentrantLock();
  }

  private static Object getOrThrow(Object v) {
    if (v instanceof ExceptionalValue) {
      return Util.sneakyThrow((Throwable)(((ExceptionalValue)v).v));
    } else {
      return v;
    }
  }

  public Object deref() {
    if (unsynchronizedValue instanceof PendingValue) {
      // Volatile read
      Object v = value;

      if (v instanceof PendingValue) {
        try {
          lock.lock();

          // Volatile read after lock held, in case it changed
          Object vAfter = value;

          if (vAfter instanceof PendingValue) {
            IFn f = (IFn)(((PendingValue)vAfter).v);

            Object vComputed = null;
            try {
              vComputed = f.invoke();
            } catch (Throwable t) {
              vComputed = new ExceptionalValue(t);
            }

            unsynchronizedValue = vComputed;
            value = vComputed;
            return getOrThrow(vComputed);
          } else {
            return getOrThrow(vAfter);
          }
        } finally {
          lock.unlock();
        }
      } else {
        return v;
      }
    } else {
      return getOrThrow(unsynchronizedValue);
    }
  }

  ... // `force`, `isRealized` implemented here
}
```
by
You write unsynchronizedValue under the lock, but the read is not under the lock, so there is no guarantee that other threads actually ever see that write. This is a thread visibility issue  similar to the one with the classic double checked locking pattern. I would not try to cheat in this way, but it would probably be useful to use a ReentrantReadWriteLock here.
by
You're right — there should be no expectation that readers *must* pick up a change to `unsynchronizedValue`. That said, if a thread fails to pick up the change, it falls back to reading  `value` (which is `volatile`), which is guaranteed to always be up to date. In our tests, something like 40% of reads picked up `unsynchronizedValue` before ever reading `value`, so to that extent (and based on other internal perf tests) the optimization appears to be valuable. In all cases `deref()` yielded the correct result.

`ReentrantReadWriteLock` is an interesting suggestion to explore — thank you!
by
I would like to understand what the original problematic code was doing. Was the “post processing of http response” itself running further IO? Like fanning out requests?
by
Our use case for Virtual Threads was, as you suspected, I/O. (I say "was" because virtual threads — as we used them, anyway — were too unreliable. They would cause deadlocks that were punishing to debug, and cost us probably a full month of dev time, all included. We've since moved to using a custom `future` macro on top of a large threadpool.) We were doing an HTTP request fan-out with `http-kit` and then doing post-processing on the responses. We needed all of that to 1) not block the calling thread while 2) not exhausting the `clojure.core/future` threadpool. We figured that, while we could (and did) use a workaround, virtual threads could be a nice solution, so we reached to it.

Essentially here's the code for what we were doing:
```clojure
;; In a library function
(defn request-and-post-process [request]
  (let [response (http-kit/request request)]
    ;; We use a `delay` to keep it a derefable output while avoiding using a thread
    (delay (post-process @response))))

;; In a business logic function
;; Will be `deref`ed upstream
(defn async-concurrently-make-requests [requests]
  (vthread
    (->> requests
         ;; Concurrently kick off all requests
         (mapv #(vthread @(request-and-post-process %)))
         ;; Aggregate
         (mapv deref))))
```

Notice that `clojure.core/delay` is used. Turns out that `clojure.lang.Delay` uses `synchronized` in its `deref` method (see Clojure Q&A thread), which pins the virtual threads. `deref`ing the `delay` makes the calling virtual thread wait for as long as the HTTP request and post-processing takes to run, which can take 30 seconds in some cases. We were seeing not only pinning behavior here, but deadlock, which was unexpected. (The pinning is an issue for `clojure.lang.Delay`; the deadlock is an issue for the underlying JVM implementation of virtual threads, as far as I can tell. I believe there's an open ticket for it on bugs.openjdk.org.) To avoid the pinning, we used our custom `ReentrantLock`-based `delay`. However, there was other pinning going on in e.g. a JSON deserialization library that seems to have landed us in the deadlock zone despite the `delay` enhancement.

Let me know if all this answers your question! Happy to elaborate further.
by
Regarding the deadlock issue that seems to be a bug in the JVM, has that been resolved in JDK 21 or is this just a bug in the JDK 19 preview of virtual threads?
...