Welcome! Please see the About page for a little more info on how this works.

+1 vote
in Clojure by

The bug CLJ-2094 demonstrates an edge case with protocols stemming from functions like partial eagerly evaluating their arguments and thus retaining outdated references. I ran into this recently and found that the ticket had been declined because of "normal Clojure evaluation rules".

I think there's an easy fix for protocols that adds very little overhead and allows protocol-related functions to work in both eager and lazy evaluation contexts. Would the team be interested in a patch for this?

by
Perhaps you can provide a bit more information about the "easy fix" and the impact it might have on existing code so that folks have a bit more insight into what the patch might actually involve?
by
agree with Sean, can you share the idea?
by
edited by
lol in the past y’all generally prefer I check before I start slinging patches, so I thought I’d post a gut check before shoving code in your face.

I’ve signed the CLA:

https://github.com/NoahTheDuke/clojure/pull/4

In short: deref the var on the protocol in the couple of protocol functions that look at `:imps` so any updates are seen.
by
In my benchmarks with criterium, it's roughly +5ns per invocation.
by
If I move the calls to the inner-most points of use, the performance cost is lessened as well.

1 Answer

0 votes
by

This question is more of a proposed solution than a problem and that makes it difficult for me to file it somewhere as a ticket. I'm not sure if CLJ-2094 is the actual problem you care about or if it's something else? If something else, please share.

Trying to put words to the suggested solution here... When finding a protocol method match, look up the current protocol value via the var (which may have been updated with new extensions), rather than using the protocol value that was passed (as it may no longer represent the current state).

But the protocol already has mechanisms to reset its state when extensions are added to the protocol, so I think that problem has already been largely solved. The only way the protocol state would not match the current state is that the protocol state you have is no longer connected. And in this case, I think that's because it is the protocol function var holds a function instance, which has a cache, which has the protocol, but the function instance itself is old (the var holds a new function instance).

So you are using a dead function, but doing a lookup during method invocation to recover the live function's equivalent state. Adding a var lookup in the invocation call path is I think a non-starter as the whole protocol design is trying to avoid that (that's why we have a cache which can be dumped on modification). The alternate solution here is to push var lookup of the function to the caller. And that's essentially what you get with an anonymous function implementation of the spec in CLJ-2094 (vs partial).

Again, I'm not sure what problem we're actually trying to solve here. In CLJ-2094, the problem was that the function was captured by partial. The solution is - don't capture the value. We could also consider some alternative to partial that retained the var lookup - you could imagine a partial-var that took a var rather than a function evaluated from a symbol. The workaround is to use wrap in a function to delay the evaluation. Is this enough of a problem to warrant something new? Don't know.

by
Thanks for the detailed description of the situation, Alex. I appreciate the clarity of your word choices.

> Again, I'm not sure what problem we're actually trying to solve here. In CLJ-2094, the problem was that the function was captured by partial. The solution is - don't capture the value.

The problem I'm trying to solve is to make using dead protocol functions work the same as live protocol functions. It's not obvious that protocols, which can be changed by side-effecting functions (`extend`), aren't actually reliant on global state. This can lead to subtle errors when defining protocols in one namespace and extending them in other namespaces if there the protocol is evaluated in between requiring the later namespaces. (Which is to say that at my job we have run into this issue.)

Compare to multimethods and spec, which do have global state and aren't reliant on the order that they're evaluated:

    (defmulti example (fn [a b] [(:type a) (:type b)]))

    (def partial-multi-2094 (partial example {:type :foo}))

    (defmethod example [:foo :bar] [a b] {:a a :b b})

    (partial-multi-2094 {:type :bar :extra :keys})
    ;=> {:a {:type :foo} :b {:type :bar :extra :keys}}

    (require '[clojure.spec.alpha :as s])

    (def partial-valid-2094 (partial s/valid? ::clj-2094))

    (s/def ::clj-2094 (fn [x] true))

    (partial-valid-2094 :a)
    ;=> true

> The solution is - don't capture the value.

Obviously, this is the current solution and one that works in many other instances (such as functions themselves). I think this is a place where the inconsistency causes undue pain and could be positively changed.

You said that derefing the var is a no-go because of the effort that has been put into the method cache. Is your fear that my proposed changes would be used in the generated methods or similar? My proposed change only affects the protocol helper functions (`extends?`, `extenders`, `find-protocol-method`, and `satisfies?`) and doesn't interact with the base protocol method builders, so the existing method cache would continue to work as they do currently.

Thanks again for taking the time to write your detailed comment. I really value the effort you put in to these discussions.
...