Introducing HipHip (Array): Fast and flexible numerical computation in Clojure

This post and the work on HipHip were done in part by Leon Barrett, who spent a ‘sprintbatical’ visiting fellow Clojure shop Prismatic, and by Emil Flakk. This post is cross-posted to Prismatic’s blog.

The Climate Corporation and Prismatic share the unique challenge of developing new algorithms and making them work at a large scale. This combination of research and engineering takes many cycles of algorithm and model design, including rapid implementation to test on lots of real data. In contrast to most engineering prototyping, such numeric modeling demands high performance out of the gate, since correctness cannot be judged on just a little bit of data. In this post, we introduce our open-source array processing library HipHip, which combines Clojure’s expressiveness with the fastest math Java has to offer.

Computational Functional Programming

As outlined in several past posts, Climate and Prismatic love Clojure (and functional programming generally) because it allows us to quickly build performant and reusable tools. However, many of the data abstractions that make Clojure code so reusable don’t yield fast numerical computation. For instance, consider the dot product operation, which is the core performance bottleneck in most machine learning algorithms. Using Clojure’s built-in sequence operations, it would be natural to write the dense dot product as:

;; 1ms for 10k doubles: 20 MFlops
(defn dot-product [^doubles ws ^doubles xs]
  (reduce + (map * ws xs)))

The above code succinctly expresses the idea that we form the sequence of element-wise products between ws and xs, and then we sum those entries. But its performance is a disaster, especially if you’re considering real-world machine learning use where each prediction could require thousands of dot products. The core of the slowness arises from the fact that intermediate operations produce sequences of ‘boxed’ Java Double objects. All arithmetic operations on these boxed objects are significantly slower than on their primitive counterparts. This implementation also creates an unnecessary intermediate sequence (the result of the map), rather than just summing the numbers directly.

Clojure’s built-in array macros vs. plain-old Java

Clojure comes with built-in array processing macros (amap and areduce) which extend Clojure sequence operations to arrays and allow for primitive arithmetic. Here’s what dot-product looks like using those macros:

;; 8.5 us for 10k doubles: 2.3 GFlops
;; (11 us with *unchecked-math* false)
(defn dot-product [^doubles ws ^doubles xs]
  (areduce xs i ret 0.0
    (+ ret (* (aget xs i)
              (aget ws i)))))

The performance here is acceptable, but the code itself is pretty convoluted. Arguably, this isn’t even much of an improvement over the Java version:

// 8.5 us for 10k doubles: 2.3 GFlops
public double dotProduct(double[] ws, double[] xs) {
  double result = 0.0;
  for (int i = 0; i < ws.length; ++i) {
    result += ws[i] * xs[i];
  }
  return result;
}
Why not just use Java for math bottlenecks?

Since the dot product is a crucial bottleneck for machine learning, why not just suck it up and use the Java version? Clojure makes Java-interop easy so that you can take the core bottleneck of your system and make it fast.

The problem is that for computational systems, there typically isn’t a single bottleneck. Instead, any function that performs array operations (whether double, long, float, etc.) over each item of data can cause an unacceptable slowdown, even during the research prototyping stage. Since these sorts of array operations are the bulk of our computational code, our research engineers would mostly be stuck writing Java. We need the expressiveness of Clojure and the speed of Java primitives.

Introducing the HipHip array processing library

We’re happy to announce HipHip, a Clojure array-processing library. Here’s what a dot product looks like in HipHip:

;; 8.5 us for 10k doubles: 2.3 GFlops
(require '[hiphip.double :as dbl])
(defn dot-product [ws xs]
  (dbl/asum [x xs w ws] (* x w)))

All the arithmetic operations are performed on primitive doubles, which allows the speed to match Java. We get a clean, functional representation of the operation, without any worrying about type hints.
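A quick usage sketch (assuming HipHip is on the classpath): note that the macros expect primitive arrays, so Clojure vectors must be converted with double-array first.

```clojure
(require '[hiphip.double :as dbl])

(defn dot-product [ws xs]
  (dbl/asum [x xs w ws] (* x w)))

;; HipHip macros operate on primitive double arrays, not vectors,
;; so wrap literal data with double-array before calling
(dot-product (double-array [1.0 2.0 3.0])
             (double-array [4.0 5.0 6.0]))
;; => 32.0
```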

HipHip provides namespaces for working on arrays of each of the four main primitive math types (double, float, long, and int) without type hints. Each namespace includes:

  • Versions of clojure.core array fns that do not require type hints:
(dbl/aset xs i (* 2.0 (dbl/aget xs (inc i))))
  • A family of macros for efficiently iterating over array(s) with a common binding syntax, including amake, doarr, afill!, and our own versions of amap and areduce.
(defn add-in-place! 
  "Modify xs, adding to each element the corresponding element of ys"
  [xs ys]
  (dbl/afill! [x xs y ys] (+ x y)))
(defn pointwise-product 
  "Produce a new double array of the element-wise product of xs and ys"
  [xs ys]
  (dbl/amap [x xs y ys] (* x y)))
  • Common ‘mathy’ operations like summing over an array and selecting a maximal element (or many):
(defn standard-deviation [xs]
  (let [mean (dbl/amean xs)
        mean-sq (/ (dbl/asum [x xs] (* x x))
                   (dbl/alength xs))]
    (Math/sqrt (- mean-sq (* mean mean)))))
(= 1 (dbl/amax-index (double-array [1.0 3.0 2.0])))
(defn max-normalize [xs]
  (let [m (dbl/amax xs)]
    (dbl/afill! [x xs] (/ x m))))
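The list above also names amake and doarr, which aren’t shown. As a rough sketch (binding forms assumed by analogy with the HipHip readme), amake builds a fresh array from an index expression, and doarr iterates over an array for side effects:

```clojure
(require '[hiphip.double :as dbl])

;; amake builds a new double array of length 5 from an index expression
(def squares (dbl/amake [i 5] (* 1.0 i i)))

;; doarr iterates for side effects; the [i x] form binds both the
;; index and the value (binding syntax assumed here)
(dbl/doarr [[i x] squares]
  (println i "->" x))
```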

You might find some of these operations useful even if your data isn’t already in arrays. For example, the following function can find the top 1000 elements of a vector xs by score-fn about 20 times faster than (take 1000 (reverse (sort-by score-fn xs))) on 100k elements.

(defn top-k 
  "Return the top k elements of xs according to score-fn"
  [k score-fn xs]
  (let [n (count xs)
        scores (dbl/amake [i n] (score-fn (nth xs i)))
        ^ints idxs (dbl/amax-indices scores k)]
    (seq (hiphip.array/amake Object [i k]
           (nth xs (aget idxs (- n i 1)))))))

HipHip also provides a generic namespace, hiphip.array, with variants of the iteration macros that work on all (hinted) array types, including mixing and matching a variety of array types in the same operation.
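For instance, one might copy a float array into a fresh double array. This is only a sketch, assuming the generic macros take the element type as their first argument, as in the hiphip.array/amake Object call in top-k above:

```clojure
(require 'hiphip.array)

;; Sketch: copy a float array into a new double array using the
;; generic amake, passing the element type explicitly (assumed API)
(defn floats->doubles [^floats fs]
  (hiphip.array/amake double [i (alength fs)]
    (double (aget fs i))))
```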

HipHip uses Criterium for benchmarking and runs benchmarks as tests, so we can be pretty confident that most of our double array operations run 0-50% slower than Java (we’re not quite there for the other array types yet; see the ‘Known Issues’ section of the readme for details). The readme also describes a critical option for fast array math if you’re running under Leiningen.

Check it out and let us know what you think in the Hacker News thread.

Posted in Engineering
2 comments on “Introducing HipHip (Array): Fast and flexible numerical computation in Clojure”
  1. filou says:

    Tried running dot-product and ended up with the following error:

    Exception in thread "main" java.lang.ClassCastException: clojure.lang.PersistentVector cannot be cast to [D
    at hiphip.core$dot_product.invoke(core.clj:6)
    at hiphip.core$_main.doInvoke(core.clj:14)
    at clojure.lang.RestFn.invoke(
    at clojure.lang.Var.invoke(

    • Yes, filou, these macros are designed to work on primitive arrays, made for example by float-array or hiphip.float/amake. For speed, you would want primitive arrays anyway, because they avoid the cost of boxing. They are also mutable, which can be important for writing really fast inner loops.

      So, this will not work:
      (hiphip.double/asum [1 2 3 4])

      but this will work:
      (hiphip.double/asum (double-array [1 2 3 4]))
