For anyone who may be following this, I expanded the benchmarking, separating out the laziness change from the transducer variations. I also added a new variation for completeness.
To be honest, the best-performing functions surprised me. I did not expect the persistent set to perform as well as it did, nor did I expect the transient variations to perform worse for large sequences. However, both results made sense once I considered the shape of the data (large sequences with a small percentage of unique values): when most elements are duplicates, few insertions actually happen, so presumably there is little work for transients to speed up.
I was also surprised that the transient set approach that skips the call to `contains?` performed as well as it did. Although it avoids part of the code path of an earlier function, I am still surprised that it did not slow down on large sequences the way the `distinct-transient` function did.
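
For readers who have not looked at the benchmark code, here is a minimal sketch of the two ideas discussed above. It is not the benchmarked code: the function names are hypothetical, both versions are eager rather than lazy or transducer-based, and detecting a new element by comparing `count` before and after `conj!` is only one assumed way of skipping `contains?`.

```clojure
;; Illustrative sketches only; names and structure are assumptions,
;; not the functions that were actually benchmarked.

(defn distinct-persistent-sketch
  "Eager distinct using a persistent set and an explicit contains? check."
  [coll]
  (loop [seen #{}
         out  []
         xs   (seq coll)]
    (if xs
      (let [x (first xs)]
        (if (contains? seen x)
          ;; Duplicate: nothing is added to either collection.
          (recur seen out (next xs))
          ;; First occurrence: record it and keep it in the output.
          (recur (conj seen x) (conj out x) (next xs))))
      out)))

(defn distinct-transient-no-contains-sketch
  "Eager distinct using a transient set without calling contains?.
  Assumption: an element is new if conj! increased the set's count."
  [coll]
  (loop [seen (transient #{})
         out  (transient [])
         xs   (seq coll)]
    (if xs
      (let [x      (first xs)
            before (count seen)
            seen   (conj! seen x)]
        (if (== before (count seen))
          ;; Count unchanged, so x was already in the set.
          (recur seen out (next xs))
          (recur seen (conj! out x) (next xs))))
      (persistent! out))))
```

In the first sketch a duplicate costs only a lookup, while in the second every element goes through `conj!` whether or not it is new, which is the trade-off the surprise above is about.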