The test below shows that protocol methods (e.g. time-from-tweet) cannot be serialized. The attempt fails with a java.io.NotSerializableException: clojure.lang.MethodImplCache
 at java.io.ObjectOutputStream.writeObject0 (ObjectOutputStream.java:1183)
    java.io.ObjectOutputStream.defaultWriteFields (ObjectOutputStream.java:1547)
    java.io.ObjectOutputStream.writeSerialData (ObjectOutputStream.java:1508)
    java.io.ObjectOutputStream.writeOrdinaryObject (ObjectOutputStream.java:1431)
    java.io.ObjectOutputStream.writeObject0 (ObjectOutputStream.java:1177)
    java.io.ObjectOutputStream.writeObject (ObjectOutputStream.java:347)
    sparkling.protocol_test$serialize.invoke (protocol_test.clj:11)


This is the actual test:

(ns sparkling.protocol-test
  (:require [clojure.test :refer :all])
  (:import [java.io ObjectInputStream ByteArrayInputStream ObjectOutputStream ByteArrayOutputStream]))


(defn- serialize
  "Serializes a single object, returning a byte array."
  [v]
  (with-open [bout (ByteArrayOutputStream.)
              oos (ObjectOutputStream. bout)]
    (.writeObject oos v)
    (.flush oos)
    (.toByteArray bout)))

(defn- deserialize
  "Deserializes and returns a single object from the given byte array."
  [bytes]
  (with-open [ois (-> bytes ByteArrayInputStream. ObjectInputStream.)]
    (.readObject ois)))


(defprotocol timestamped
  (time-from-tweet [item]))

(defrecord tweet [username tweet timestamp]
  timestamped
  (time-from-tweet [_]
    timestamp))

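(deftest sequable-serialization
  (testing "Serialization of function"
    (let [item identity]
      ;; `is` takes one assertion form; a second argument is only a message.
      ;; Deserialization yields a new instance, so check the type, not equality.
      (is (fn? (-> item serialize deserialize)))))

  (testing "Serialization of protocol method"
    (let [item time-from-tweet]
      (is (fn? (-> item serialize deserialize))))))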

6 Answers

Comment made by: chrisbetz

BTW: The same is true for multimethods. Here the exception is java.io.NotSerializableException: clojure.lang.MultiFn
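
A minimal sketch of the multimethod case, for anyone who wants to reproduce it. The multimethod `shape-area` and the helper `try-serialize` are hypothetical names made up for this illustration; only the exception type and message come from the comment above.

```clojure
(import '(java.io ByteArrayOutputStream ObjectOutputStream NotSerializableException))

(defn serialize
  "Serializes a single object with plain Java serialization, returning a byte array."
  [v]
  (with-open [bout (ByteArrayOutputStream.)
              oos  (ObjectOutputStream. bout)]
    (.writeObject oos v)
    (.flush oos)
    (.toByteArray bout)))

;; Hypothetical multimethod, for illustration only.
(defmulti shape-area :shape)
(defmethod shape-area :square [{:keys [side]}] (* side side))

(defn try-serialize
  "Returns :ok if v serializes, otherwise the message of the
  NotSerializableException (the name of the offending class)."
  [v]
  (try
    (serialize v)
    :ok
    (catch NotSerializableException e
      (.getMessage e))))

;; A multimethod is a clojure.lang.MultiFn, which is the class named in the
;; exception reported above.
(try-serialize shape-area)
```

The helper returns the exception message instead of rethrowing so the result can be inspected at the REPL.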

Comment made by: alexmiller

I don't think we expect functions to be serializable in this way. Both protocols and multimethods effectively have runtime state based on what implementations have extended them. What would it mean to serialize these functions? Would you serialize them with whatever implementations have been loaded at that point? Or with none? Both seem problematic to me. Regular functions are closures and can capture the state of their environment. I think better answers are either AOT or, for regular functions, something like the serializable-fn library.
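
To make the "runtime state" point concrete, here is a small sketch. The protocol `Timestamped` and its method `timestamp-of` are hypothetical names used only for illustration; the sketch shows that what a protocol method can do depends on which implementations happen to be loaded when it is called.

```clojure
;; Hypothetical protocol, for illustration only.
(defprotocol Timestamped
  (timestamp-of [item]))

;; No implementation for String has been loaded yet.
(def extended-before? (extends? Timestamped String))

;; Loading an implementation later changes what the protocol method can handle.
(extend-protocol Timestamped
  String
  (timestamp-of [s] (Long/parseLong s)))

(def extended-after? (extends? Timestamped String))

;; The same call would have thrown IllegalArgumentException before the
;; extend-protocol form above was evaluated.
(timestamp-of "42")
```

A serialized copy of the method would have to decide which of these two states it carries, which is the ambiguity raised above.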

Comment made by: chrisbetz

Hi,

thanks for the comments. First, some background: I'm developing Sparkling, a Clojure API for Apache Spark. To distribute code in the cluster it depends on AOT-compiled functions, so yes, you cannot simply serialize an arbitrary function; it needs to be AOT'd. Serialization gives us support for the current bindings etc., and everything works as expected. So AFunction is serializable for a reason, and so are other implementations of AFn/IFn; everything works well.

Regarding the state of protocols and multimethods: I think it's conceptually the same as the state of functions (which function definition is current, since the var might be bound multiple times, etc.) and the closures given in bindings. As the user of a protocol, I have no reason to believe that a protocol method differs from a function. In fact, (ifn? protocol-method) also returns true.

serializable-fn was not intended for over-the-wire serialization in the first place. It has problems with collections of functions in the bindings of the serializable function, and in the context of Spark it pollutes PermGen by creating classes for the same function over and over again.

I think I'm fine for the moment, as I can wrap the protocol method in a function, but I still believe that this is a bug.

Regards

Chris

Comment made by: chrisbetz

Actually, this is the code snippet from clojure.lang.AFunction (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java) causing the pain:

```java
public abstract class AFunction extends AFn implements IObj, Comparator, Fn, Serializable {

    public volatile MethodImplCache __methodImplCache;
    // ...
}
```

AFunction is serializable, but MethodImplCache is not. I'm not sure if it's enough to mark it as transient, because I did not check where initialization happens.
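
The field can also be inspected from Clojure. The following is a rough sketch, not a statement about how a fix should look: the protocol `Timestamped` and the helper `method-impl-cache` are hypothetical names, and the sketch assumes (as the exception suggests) that the field is populated for protocol methods and left nil for ordinary functions.

```clojure
;; Hypothetical protocol, for illustration only.
(defprotocol Timestamped
  (timestamp-of [item]))

(defn method-impl-cache
  "Reads the public __methodImplCache field of a function object."
  [^clojure.lang.AFunction f]
  (.-__methodImplCache f))

;; An ordinary function carries no cache, so default serialization has
;; nothing problematic to write for this field.
(nil? (method-impl-cache identity))

;; A protocol method carries a MethodImplCache instance, which does not
;; implement java.io.Serializable.
(let [cache (method-impl-cache timestamp-of)]
  [(class cache) (instance? java.io.Serializable cache)])
```

This only shows where the non-serializable object hangs off the function; it does not answer whether marking the field transient would be sufficient.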

Comment made by: chrisbetz

My comment per mail got lost in SMTP-nirvana: there's an easy workaround. Wrap the protocol method in a function. That will do the trick, at the cost of uglifying your code ;)
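
A minimal sketch of that workaround. The names `Timestamped`, `timestamp-of`, `Tweet`, and `timestamp-of-fn` are hypothetical and chosen for illustration. It assumes the wrapper's compiled class is available to whatever JVM deserializes it: the same process here, or AOT-compiled code on the classpath in a cluster setting such as the one described earlier.

```clojure
(import '(java.io ByteArrayInputStream ByteArrayOutputStream
                  ObjectInputStream ObjectOutputStream))

(defn serialize
  "Serializes a single object, returning a byte array."
  [v]
  (with-open [bout (ByteArrayOutputStream.)
              oos  (ObjectOutputStream. bout)]
    (.writeObject oos v)
    (.flush oos)
    (.toByteArray bout)))

(defn deserialize
  "Deserializes and returns a single object from the given byte array."
  [bytes]
  (with-open [ois (ObjectInputStream. (ByteArrayInputStream. bytes))]
    (.readObject ois)))

;; Hypothetical protocol and record, for illustration only.
(defprotocol Timestamped
  (timestamp-of [item]))

(defrecord Tweet [username text timestamp]
  Timestamped
  (timestamp-of [_] timestamp))

;; The workaround: an ordinary function that merely calls the protocol method.
;; The wrapper is a plain AFunction without a MethodImplCache instance attached.
(defn timestamp-of-fn [item]
  (timestamp-of item))

;; Round-trip the wrapper instead of the protocol method itself.
(def round-tripped (-> timestamp-of-fn serialize deserialize))

(round-tripped (->Tweet "chris" "hello" 1234))
```

The wrapper resolves the protocol method when it is called, so it dispatches against whatever implementations are loaded in the deserializing JVM.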

Reference: https://clojure.atlassian.net/browse/CLJ-1701 (reported by alex+import)
...