Skip to content

Faster HashCode

Justin Conklin edited this page Aug 24, 2017 · 1 revision

Let's say we want a more performant integer array hashCode implementation in Clojure (disregarding our knowledge of java.util.Arrays/hashCode for the moment). We could simply write a Java helper class, but it's pretty easy to just write it in JVM bytecode.

(require '[insn.clojure :as bc])

(bc/defn hashcode ^long [arr]
  [[:aload 1] [:ifnull :L/NULL]                ;; if arr == null then goto NULL
   [:aload 1] [:checkcast [:int]] [:astore 1]  ;; arr = (int[]) arr
   [:aload 1] [:arraylength]      [:istore 2]  ;; len = arr.length
   [:ldc 0]   [:istore 3]                      ;; i = 0
   [:ldc 1]   [:istore 4]                      ;; h = 1
   [:mark :L/LOOP]
   [:iload 2]  [:iload 3] [:if-icmpeq :L/RET]  ;; if len == i then goto RET
   [:ldc 31]   [:iload 4] [:imul]              ;; push h * 31
   [:aload 1]  [:iload 3] [:iaload]            ;; push arr[i]
   [:iadd]     [:istore 4]                     ;; add pushed values, store into h
   [:iinc 3 1] [:goto :L/LOOP]                 ;; i++, goto LOOP
   [:mark :L/RET]
   [:iload 4] [:i2l] [:lreturn]                ;; return h as long
   [:mark :L/NULL]
   [:ldc2 0] [:lreturn]])                      ;; return 0L

By my benchmarks, this is about 25% faster than the optimized Clojure equivalent, on par with Java. The reason likely being the additional l2i and i2l cast instructions at every step of the Clojure version. An int is required at each iteration to correctly update the hash value and Clojure's loop can only maintain longs.