+2 votes
in Clojure by
closed by

Hi, I’ve come across the following when updating to Clojure 1.11.0. In our application we have some places where we create v5 (name-based, SHA-1 hash) UUIDs using the danlentz/clj-uuid library. After the upgrade, the v5 UUIDs produced are different. For example:

(def ^:const +namespace+ #uuid "50d94d91-a1cf-422d-9586-4ddacf6df176")

(clj-uuid/v5 +namespace+ :some-keyword) 

;; Clojure 1.10.3
=> #uuid "d30e9c3c-ced2-534e-a6b8-ecf784fb0785"

;; Clojure 1.11.0
=> #uuid "a16f6719-952a-55b9-b71b-b15dd263665b"

After some trial and error, it seems that the local part argument (i.e. :some-keyword) is causing the difference: within the clj-uuid/v5 fn the keyword object is converted to a byte array, and that byte array now appears to be different. (If I use a String instead of a keyword for the local part argument, the UUID produced is consistent before and after the Clojure upgrade.)
The UUIDs produced are used in downstream systems, and it would be quite difficult to have them handle the change. Is there any way I could get exactly the same UUIDs as before?

closed with the note: Fixed in 1.11.1

2 Answers

0 votes
by

Hello. Thank you for the report. Just to clarify -- the version of Java used between tests is the same, yes?

by
Yep the same version of Java (openjdk version "17.0.2")
0 votes
by
edited by

Just rehashing conversation from elsewhere here for the record - it seems that using a Keyword for the local part of the name will cause clj-uuid to serialize the object to bytes using Java serialization.

Clojure 1.11 made some additive backwards-compatible changes in Keyword (to improve arity exception reporting) and the binary serialization of keywords thus changed between 1.10 and 1.11.

We do not guarantee binary serializability of Clojure objects between releases, so the expectation here that these would be identical is incorrect. There are several ways this could be addressed in clj-uuid if desired: by providing custom serialization for Keyword specifically in the UUIDNameBytes protocol (https://github.com/danlentz/clj-uuid/blob/master/src/clj_uuid.clj#L557), or by printing with pr to a string and converting that string to bytes instead of using binary serialization to an object stream, etc.
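To illustrate the "print to a string, then to bytes" idea, here is a minimal Java sketch of a name-based (v5) UUID built from UTF-8 bytes of a printed name. The class and method names (StableNameUuid, v5, v5FromPrinted) are made up for this example and are not part of clj-uuid, and I am not claiming this matches how clj-uuid encodes strings. It also will not reproduce UUIDs that were previously derived from a serialized Keyword; it only shows that the input bytes no longer depend on the Clojure version.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

final class StableNameUuid {
    // RFC 4122 version 5: SHA-1 over the namespace bytes followed by the name bytes.
    static UUID v5(UUID namespace, byte[] name) {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        sha1.update(ByteBuffer.allocate(16)
                .putLong(namespace.getMostSignificantBits())
                .putLong(namespace.getLeastSignificantBits())
                .array());
        sha1.update(name);
        byte[] hash = sha1.digest();                 // 20 bytes; only the first 16 are used
        hash[6] = (byte) ((hash[6] & 0x0f) | 0x50);  // set the version nibble to 5
        hash[8] = (byte) ((hash[8] & 0x3f) | 0x80);  // set the RFC 4122 variant bits
        ByteBuffer buf = ByteBuffer.wrap(hash, 0, 16);
        return new UUID(buf.getLong(), buf.getLong());
    }

    // Hypothetical helper: the caller passes the printed form, e.g. ":some-keyword".
    static UUID v5FromPrinted(UUID namespace, String printed) {
        return v5(namespace, printed.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID ns = UUID.fromString("50d94d91-a1cf-422d-9586-4ddacf6df176");
        System.out.println(v5FromPrinted(ns, ":some-keyword"));
    }
}
```

The name bytes here depend only on the characters of the printed name, so the result is the same on any Clojure or Java version.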

It's possible we could "fix" this specific case by setting the serialVersionUID of clojure.lang.Keyword to the value it had in 1.10 (as these objects could be binary compatible), and we will consider that a bit more. But even if we did that, I would recommend using something more stable for clj-uuid local name parts (like strings).
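For anyone unfamiliar with what "setting the serialVersionUID" means, here is a small Java sketch. The classes Unpinned and Pinned and the UID value are invented for the example; they are not the real Keyword class or its real UID. When a Serializable class does not declare serialVersionUID, Java computes one from the class's shape (name, interfaces, members), so adding members can change it. Declaring the field pins the value regardless of later changes.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class PinnedSerialVersion {
    // No explicit UID: Java derives one from the class structure,
    // so it can change when fields or methods are added.
    static class Unpinned implements Serializable {
        final String name;
        Unpinned(String name) { this.name = name; }
    }

    // Explicit UID: stays the same even if compatible members are added later.
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 0x0123456789ABCDEFL; // hypothetical value
        final String name;
        Pinned(String name) { this.name = name; }
    }

    // Returns the UID that Java serialization would write for this class.
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("computed: " + uidOf(Unpinned.class));
        System.out.println("pinned:   " + uidOf(Pinned.class));
    }
}
```

The JDK's serialver tool reports the same value for a compiled class, which is one way to find out what an older release of a class used.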

by
Thanks Alex, makes sense. Will raise an issue on clj-uuid and see if there's a way forward there as well.
by
It may be possible to fix your specific issue by creating Keyword-like and Symbol-like Java classes (with the same non-transient fields) and setting the serialVersionUID in them to the 1.10 values. I believe you could then (stably) replicate the serialized bytes. You could then override the protocol in clj-uuid to use these pseudo classes for Keyword serialization. You will be stuck with that hack in perpetuity, but it may at least give you a path to migrate. Or, if you have a small number of keyword namespaces, you could even hardcode the keyword-to-bytes mapping.
by
Thanks Alex, I will give it a go. Out of interest, why would a Symbol class need to be created if it hasn't been changed in 1.11.0?
by
The Keyword class has a field that holds a Symbol, so serialization will include both classes.
by
Got it, thanks
by
Hi all. Yes, I definitely wish I'd special-cased a stable byte array conversion for a broader range of objects, keywords in particular.

Just to understand better about the "pseudo Keyword/Symbol" class idea, could these serialize in the same way without sharing the same (conflicting) classname? Even if we serialize identically (.writeObject) my read of ObjectOutputStream https://docs.oracle.com/javase/7/docs/api/java/io/ObjectOutputStream.html looks disappointing -- it serializes "the class of the object, the class signature, and the values of all non-transient fields".  Would we also need a specialized ObjectOutputStream I guess?

I've been looking at the byte array serialization of various namespaced keywords. It might be possible to recreate this, but at least at first glance it's not as simple as just dropping in the bytes of the keyword name and namespace.
by
Yeah, I think it would need the same class names. It is possible to load alternate classes in a classloader (even a dynamic classloader) but that's getting to be a pretty elaborate workaround.
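To make the class-name point concrete, here is a small Java sketch that serializes an object and then looks inside the resulting bytes. The Name class, its UID value, and the helper methods are all invented for this example. It relies on the Java serialization stream format, in which a class descriptor contains the class name followed by the 8-byte serialVersionUID, so both become part of any hash taken over those bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class StreamContents {
    // Stand-in for a class like Keyword; the UID is a hypothetical value.
    static class Name implements Serializable {
        private static final long serialVersionUID = 0x1122334455667788L;
        final String value;
        Name(String value) { this.value = value; }
    }

    // Serializes an object with standard Java serialization.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns the index of the first occurrence of needle in haystack, or -1.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    // Reads the 8 bytes that follow the class name in the class descriptor.
    static long uidAfterClassName(byte[] stream, String className) {
        byte[] name = className.getBytes(StandardCharsets.US_ASCII);
        int at = indexOf(stream, name);
        if (at < 0) throw new IllegalArgumentException("class name not found in stream");
        return ByteBuffer.wrap(stream, at + name.length, 8).getLong();
    }

    public static void main(String[] args) {
        byte[] stream = serialize(new Name("some-keyword"));
        String className = Name.class.getName();
        System.out.println("class name offset: "
                + indexOf(stream, className.getBytes(StandardCharsets.US_ASCII)));
        System.out.println("UID in stream: "
                + Long.toHexString(uidAfterClassName(stream, className)));
    }
}
```

Because both values sit in the stream, a look-alike class with a different name produces different bytes even when its fields and UID match.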
by
I got something working, but I had to copy the clojure.lang.Keyword class (as you say, it did need to have the exact same class name) and add a serialVersionUID whose value equals what was automatically generated for the previous version of the class in 1.10.0.

I started looking for a way (or hack) to give the copy a different class name and then override it to “clojure.lang.Keyword” as part of the serialization logic, so that the custom class is only used for UUID generation and doesn’t conflict, but didn’t have much luck. For example, https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/io/ObjectStreamClass.java#L708 is where the name is written as part of ObjectStreamClass. There’s also a .writeObjectOverride method on ObjectOutputStream, but re-implementing the default writeObject wasn’t very appealing.
by
FYI, I think we are going to do a 1.11.1 and pin the serialVersionUIDs for Keyword and ArraySeq back to the 1.10.3 versions. In case that helps decide on a delayed upgrade path.
by
In terms of the delayed upgrade path for uuid, are there any opinions on the best way to handle this?  Force an incompatible (but stable) serialization as optional?  Default?
by
I don't think there is a good standard way to handle this other than to put a compatibility note on the library. There are things you could do with version conditionals, but all those options are pretty gross and affect perf.
by
Clojure 1.11.1-rc1 is now available - please test and report back!
by
Many thanks, looks good!
...