0 votes
in Refs, agents, atoms by

Hi,

I've come across some strange behavior while playing around with refs at the REPL.
So I have a ref with a simple watcher function:

(defn watcher [_ r o n]
  (printf "ref: %s old: %s new: %s time: %d\n" @r o n (System/currentTimeMillis))
  (flush))

(def x (ref 2))
(add-watch x nil watcher) 

I then evaluate the following expression two times in a row:

(future (dosync (ensure x) (Thread/sleep 3000) (alter x * 2)))

I would expect the watcher function to print the ref's value twice, about 3 seconds apart.
But nothing happens.
When I then try to access the ref with another command such as

(dosync (ensure x) (alter x inc))

The REPL blocks completely. Once, I did get the expected results several minutes later; another time, I got this error message:

Execution error at reftest.core/eval24449 (form-init12758928819079876959.clj:34).
Transaction failed after reaching retry limit

What I find perplexing is that everything seems to work just fine when I evaluate this let-block:

(let [x (ref 2)]
  (add-watch x nil watcher)
  (println "start:" (deref x))
  (future (dosync (ensure x) (Thread/sleep 3000) (alter x * 2)))
  (future (dosync (ensure x) (Thread/sleep 3000) (alter x * 2)))
  (dosync (ensure x) (alter x inc))
  (println "end:" @x))

Is the behavior at the REPL some kind of bug or am I doing something wrong?
How can I make sure that such STM transactions don't fail in real code?

by
You can confirm what Alex wrote by sprinkling the `dosync` body with print statements. And the `let` variant can also hang if you add e.g. a 100 ms sleep between calls to `future`. Although that's probably more of a happenstance than a rule.

1 Answer

+2 votes
by
selected by
 
Best answer

I assume both transactions are failing and retrying multiple times. Each retry takes at least 3 seconds because of the sleep inside the transaction, so this goes on until one of them happens to get the timing right or hits the retry limit.
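
One way to check this assumption is to count how often the transaction body actually runs. The following is only a sketch: `attempts` and `counted-double!` are hypothetical names, the sleep is shortened to 200 ms, and `ensure` is left out so that the example finishes quickly. The exact number of attempts depends on timing, so it is printed rather than predicted.

```clojure
(def x (ref 2))
(def attempts (atom 0)) ; hypothetical counter, not part of the original code

(defn counted-double!
  "Doubles the ref and counts every run of the transaction body."
  [r]
  (dosync
    (swap! attempts inc)   ; side effect on purpose: runs once per attempt
    (Thread/sleep 200)     ; deliberately blocking, to provoke a conflict
    (alter r * 2)))

(def f1 (future (counted-double! x)))
(def f2 (future (counted-double! x)))

;; Wait for both transactions to commit.
@f1
@f2

(println "value:" @x "attempts:" @attempts)
```

If `attempts` ends up higher than 2, at least one transaction body was run again after a conflict.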

You should never do blocking operations in your transactions.
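
A minimal sketch of how the code from the question could be restructured, assuming the sleep stands in for slow work that does not need to read the ref. The helper name `slow-double!` is hypothetical and the sleep is shortened to 300 ms.

```clojure
(def x (ref 2))

(defn slow-double!
  "Hypothetical helper: does the blocking work outside the transaction,
  then doubles the ref in a short dosync body."
  [r]
  (Thread/sleep 300)        ; stands in for the blocking work; runs exactly once
  (dosync (alter r * 2)))   ; short body, cheap to retry on conflict

(def f1 (future (slow-double! x)))
(def f2 (future (slow-double! x)))

;; Wait for both futures before incrementing, so the order is fixed.
@f1
@f2
(dosync (alter x inc))

@x ; => 9
```

Because the transaction body only contains the `alter`, a retry costs almost nothing and the retry limit is unlikely to be reached.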

by
Also, you don't need to `ensure` if you're altering the same ref in the transaction.
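
As a rough illustration of where `ensure` does fit: it is meant for a ref that a transaction reads but does not write. The refs `limit` and `total` and the function `add-up-to-limit!` below are hypothetical and not taken from the question.

```clojure
(def limit (ref 10))  ; hypothetical ref that is only read in the transaction
(def total (ref 0))   ; hypothetical ref that is written in the transaction

(defn add-up-to-limit!
  "Adds n to total unless that would exceed limit. Returns the new total,
  or nil when nothing was added."
  [n]
  (dosync
    ;; ensure protects `limit` from modification by other transactions
    ;; and returns its in-transaction value.
    (let [max-total (ensure limit)]
      ;; `total` is altered in this transaction, so it needs no ensure.
      (when (<= (+ @total n) max-total)
        (alter total + n)))))

(add-up-to-limit! 4) ; => 4
(add-up-to-limit! 4) ; => 8
(add-up-to-limit! 4) ; => nil, since 12 would exceed the limit

@total ; => 8
```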
by
Hello Alex, thank you for the warning about doing blocking operations.

I used `ensure` above because I found that when I leave it out, the third transaction increments the ref before either of the previous transactions doubles it. For example, the code below prints a 12 at the end:
 
(let [x (ref 2)]
  (add-watch x nil watcher)
  (future (dosync (Thread/sleep 3000) (alter x * 2)))
  (future (dosync (Thread/sleep 3000) (alter x * 2)))
  (Thread/sleep 100) ; to make really sure the other transactions are running
  (dosync (alter x inc))
  (println "end:" (deref x) (quot (System/currentTimeMillis) 1000)))

Anyway, after doing some more reading on refs and STM, and thinking more about my project's business logic, I've come to the conclusion that one or two atoms are probably more appropriate for my use case than the dozens of refs I had initially envisioned.
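
For comparison, a sketch of the same doubling and incrementing with a single atom, assuming the state fits into one value. The watch key `:log` is an arbitrary choice. Note that the function passed to `swap!` may also be called more than once, so it should stay free of side effects and blocking calls.

```clojure
(def x (atom 2))

;; Watch functions on atoms receive the same four arguments as on refs.
(add-watch x :log
           (fn [_ _ old new]
             (println "old:" old "new:" new)))

(def f1 (future (swap! x * 2)))
(def f2 (future (swap! x * 2)))

;; Wait for both futures before incrementing, so the order is fixed.
@f1
@f2
(swap! x inc)

@x ; => 9
```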
...