## Thursday, August 13, 2009

### About the subtitle

I picked this Zen proverb because in early 2008 I was looking for a language with the following requirements: strong metaprogrammability, strong concurrency support, strong functional bias and bonus points for running on the JVM. I had prepared myself to not find the rare bird when I met Clojure (and it was nearly love at first sight).

So in that perspective I am the pupil and Clojure (the language, Rich Hickey and the whole community) the teacher.

## Thursday, August 6, 2009

### Clojure as fast as Java!

(dotimes [i 10]

(let [n (int (Math/pow 2 (+ 10 i)))

_ (println "n=" n)

a (do

(print "clojure\t")

(time (doall (let [s (reduce conj empty-set (range 0 n 2))] (map #(get s %) (range n))))))

b (do

(print "java\t")

(time (doall (let [s (reduce conj #{} (range 0 n 2))] (map #(get s %) (range n))))))]

(println (= a b))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

n= 1024

clojure "Elapsed time: 1.006064 msecs"

java "Elapsed time: 0.736197 msecs"

true

n= 2048

clojure "Elapsed time: 1.813009 msecs"

java "Elapsed time: 1.328102 msecs"

true

n= 4096

clojure "Elapsed time: 3.590191 msecs"

java "Elapsed time: 2.608153 msecs"

true

n= 8192

clojure "Elapsed time: 7.046566 msecs"

java "Elapsed time: 5.807302 msecs"

true

n= 16384

clojure "Elapsed time: 16.015862 msecs"

java "Elapsed time: 10.284897 msecs"

true

n= 32768

clojure "Elapsed time: 29.803928 msecs"

java "Elapsed time: 23.850378 msecs"

true

n= 65536

clojure "Elapsed time: 68.79778 msecs"

java "Elapsed time: 63.947582 msecs"

true

n= 131072

clojure "Elapsed time: 132.082499 msecs"

java "Elapsed time: 113.433411 msecs"

true

n= 262144

clojure "Elapsed time: 292.149631 msecs"

java "Elapsed time: 265.39197 msecs"

true

n= 524288

clojure "Elapsed time: 595.265321 msecs"

java "Elapsed time: 698.711009 msecs"

true

To tell the truth, this benchmark is a bit unfair for Java: no dummy map and slightly different algorithms (half-full bitmap nodes become array-nodes, no leaf-node etc.).

Follow me on Twitter, here

### What *warn-on-reflection* doesn't tell you about arrays

user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (rem i 100) nil))))

Reflection warning, NO_SOURCE_PATH:840 - call to aset can't be resolved.

"Elapsed time: 136063.015272 msecs"

With aget/aset, the index must be hinted to be an int:

user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))

"Elapsed time: 1064.546402 msecs"

Wow, more than 100x faster (reflection is bad) but despite the compiler doesn't complain one can help it to choose a faster path:

user=> (time (let [a #^"[Ljava.lang.Object;" (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))

"Elapsed time: 247.446882 msecs"

On the whole we get a 500x speed-up with only two type hints.

## Thursday, July 30, 2009

### Everybody loves the Sieve of Eratosthenes

If I judge by traffic logs for this blog, many newcomers want to compute prime numbers in Clojure.

A recent thread on the mailing list prompted me to test an idea I had for a while: to use a map to implement the Sieve of Eratosthenes.

The first implementation was this one:

(defn primes [max]where the sieve is a map from the next non-prime numbers to their factors. It's so naive that even numbers are tested for primality but it doesn't perform that bad: on my box, it takes 3s to compute all primes below 1,000,000 (it's roughly as fast as clojure.contrib.lazy-seqs/primes when the seq isn't yet memoized).

(let [enqueue (fn [sieve n factor]

(let [m (+ n factor)]

(assoc sieve m

(conj (sieve m) factor))))

next-sieve (fn [sieve candidate]

(if-let [factors (sieve candidate)]

(reduce #(enqueue %1 candidate %2)

(dissoc sieve candidate)

factors)

(enqueue sieve candidate candidate)))]

(apply concat (vals (reduce next-sieve {} (range 2 max))))))

I wasn't happy with the way I handled the case where a non-prime was already *checked off* (ie was a key of the map): I was conjing onto the list of prime factors for this number. If instead I tried to *check off* n+p (where n is the already known non-prime and p the current prime) or n+2p or n+3p... until I found a yet unknown non-prime I wouldn't need to maintain a list of factors, nor to conj or reduce over it. And I was hoping that less allocations would yield better perfs.

Here is the second iteration:

(defn primes2 [max]and it computes all the primes below 1,000,000 in 1.8s instead of 3s but it still tests even numbers for primality.

(let [enqueue (fn [sieve n factor]

(let [m (+ n factor)]

(if (sieve m)

(recur sieve m factor)

(assoc sieve m factor))))

next-sieve (fn [sieve candidate]

(if-let [factor (sieve candidate)]

(-> sieve

(dissoc candidate)

(enqueue candidate factor))

(enqueue sieve candidate candidate)))]

(vals (reduce next-sieve {} (range 2 max)))))

primes3 is primes2 modified to only test odd numbers:

(defn primes3 [max]and it computes the same list of primes in 1.5s.

(let [enqueue (fn [sieve n factor]

(let [m (+ n (+ factor factor))]

(if (sieve m)

(recur sieve m factor)

(assoc sieve m factor))))

next-sieve (fn [sieve candidate]

(if-let [factor (sieve candidate)]

(-> sieve

(dissoc candidate)

(enqueue candidate factor))

(enqueue sieve candidate candidate)))]

(cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))

Out of curiosity, I wrote a lazy version of primes3:

(defn lazy-primes3 []and, surprisingly (better locality?), it computes the primes below 1,000,000 in 1s.

(letfn [(enqueue [sieve n step]

(let [m (+ n step)]

(if (sieve m)

(recur sieve m step)

(assoc sieve m step))))

(next-sieve [sieve candidate]

(if-let [step (sieve candidate)]

(-> sieve

(dissoc candidate)

(enqueue candidate step))

(enqueue sieve candidate (+ candidate candidate))))

(next-primes [sieve candidate]

(if (sieve candidate)

(recur (next-sieve sieve candidate) (+ candidate 2))

(cons candidate

(lazy-seq (next-primes (next-sieve sieve candidate)

(+ candidate 2))))))]

(cons 2 (lazy-seq (next-primes {} 3)))))

## Monday, June 8, 2009

### Linear Interpolation and sorted-map

`subseq`

and `rsubseq`

) can help when working with a partition of disjoint intervals, eg when you need to interpolate.(defn interpolator

"Takes a coll of 2D points (vectors) and returns

their linear interpolation function."

[points]

(let [m (into (sorted-map) points)]

(fn [x]

(let [[[x1 y1]] (rsubseq m <= x)

[[x2 y2]] (subseq m > x)]

(if x2

(+ y1 (* (- x x1) (/ (- y2 y1) (- x2 x1))))

y1)))))

;; => (map (interpolator [[0 0] [1 1] [3 2] [4 3]]) (range 0 9/2 1/2))

;; (0 1/2 1 5/4 3/2 7/4 2 5/2 3)