Clojure and me has moved.

Thursday, August 13, 2009

About the subtitle

This post has moved, go to its new location
The subtitle for this blog may be misleading. Actually it reads "When the pupil is ready to learn, a teacher will appear."
I picked this Zen proverb because in early 2008 I was looking for a language with the following requirements: strong metaprogrammability, strong concurrency support, strong functional bias and bonus points for running on the JVM. I had prepared myself to not find the rare bird when I met Clojure (and it was nearly love at first sight).
So in that perspective I am the pupil and Clojure (the language, Rich Hickey and the whole community) the teacher.

Thursday, August 6, 2009

Clojure as fast as Java!

This post has moved, go to its new location
I wrote a prototypal implementation of IPersistentSet in Clojure written with new new and, surprisingly, on its first benchmark it's already on par with Java-based implementations.
(dotimes [i 10]
(let [n (int (Math/pow 2 (+ 10 i)))
_ (println "n=" n)
a (do
(print "clojure\t")
(time (doall (let [s (reduce conj empty-set (range 0 n 2))] (map #(get s %) (range n))))))
b (do
(print "java\t")
(time (doall (let [s (reduce conj #{} (range 0 n 2))] (map #(get s %) (range n))))))]
(println (= a b))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

n= 1024
clojure "Elapsed time: 1.006064 msecs"
java "Elapsed time: 0.736197 msecs"
true
n= 2048
clojure "Elapsed time: 1.813009 msecs"
java "Elapsed time: 1.328102 msecs"
true
n= 4096
clojure "Elapsed time: 3.590191 msecs"
java "Elapsed time: 2.608153 msecs"
true
n= 8192
clojure "Elapsed time: 7.046566 msecs"
java "Elapsed time: 5.807302 msecs"
true
n= 16384
clojure "Elapsed time: 16.015862 msecs"
java "Elapsed time: 10.284897 msecs"
true
n= 32768
clojure "Elapsed time: 29.803928 msecs"
java "Elapsed time: 23.850378 msecs"
true
n= 65536
clojure "Elapsed time: 68.79778 msecs"
java "Elapsed time: 63.947582 msecs"
true
n= 131072
clojure "Elapsed time: 132.082499 msecs"
java "Elapsed time: 113.433411 msecs"
true
n= 262144
clojure "Elapsed time: 292.149631 msecs"
java "Elapsed time: 265.39197 msecs"
true
n= 524288
clojure "Elapsed time: 595.265321 msecs"
java "Elapsed time: 698.711009 msecs"
true

To tell the truth, this benchmark is a bit unfair for Java: no dummy map and slightly different algorithms (half-full bitmap nodes become array-nodes, no leaf-node etc.).
Follow me on Twitter, here

What *warn-on-reflection* doesn't tell you about arrays

This post has moved, go to its new location
user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (rem i 100) nil))))
Reflection warning, NO_SOURCE_PATH:840 - call to aset can't be resolved.
"Elapsed time: 136063.015272 msecs"

With aget/aset, the index must be hinted to be an int:
user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))
"Elapsed time: 1064.546402 msecs"

Wow, more than 100x faster (reflection is bad) but despite the compiler doesn't complain one can help it to choose a faster path:
user=> (time (let [a #^"[Ljava.lang.Object;" (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))
"Elapsed time: 247.446882 msecs"

On the whole we get a 500x speed-up with only two type hints.

Thursday, July 30, 2009

Everybody loves the Sieve of Eratosthenes

This post has moved, go to its new location

If I judge by traffic logs for this blog, many newcomers want to compute prime numbers in Clojure.

A recent thread on the mailing list prompted me to test an idea I had for a while: to use a map to implement the Sieve of Eratosthenes.

The first implementation was this one:

(defn primes [max]
(let [enqueue (fn [sieve n factor]
(let [m (+ n factor)]
(assoc sieve m
(conj (sieve m) factor))))
next-sieve (fn [sieve candidate]
(if-let [factors (sieve candidate)]
(reduce #(enqueue %1 candidate %2)
(dissoc sieve candidate)
factors)
(enqueue sieve candidate candidate)))]
(apply concat (vals (reduce next-sieve {} (range 2 max))))))
where the sieve is a map from the next non-prime numbers to their factors. It's so naive that even numbers are tested for primality but it doesn't perform that bad: on my box, it takes 3s to compute all primes below 1,000,000 (it's roughly as fast as clojure.contrib.lazy-seqs/primes when the seq isn't yet memoized).

I wasn't happy with the way I handled the case where a non-prime was already checked off (ie was a key of the map): I was conjing onto the list of prime factors for this number. If instead I tried to check off n+p (where n is the already known non-prime and p the current prime) or n+2p or n+3p... until I found a yet unknown non-prime I wouldn't need to maintain a list of factors, nor to conj or reduce over it. And I was hoping that less allocations would yield better perfs.

Here is the second iteration:

(defn primes2 [max]
(let [enqueue (fn [sieve n factor]
(let [m (+ n factor)]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(vals (reduce next-sieve {} (range 2 max)))))
and it computes all the primes below 1,000,000 in 1.8s instead of 3s but it still tests even numbers for primality.

primes3 is primes2 modified to only test odd numbers:

(defn primes3 [max]
(let [enqueue (fn [sieve n factor]
(let [m (+ n (+ factor factor))]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))
and it computes the same list of primes in 1.5s.

Out of curiosity, I wrote a lazy version of primes3:

(defn lazy-primes3 []
(letfn [(enqueue [sieve n step]
(let [m (+ n step)]
(if (sieve m)
(recur sieve m step)
(assoc sieve m step))))
(next-sieve [sieve candidate]
(if-let [step (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate step))
(enqueue sieve candidate (+ candidate candidate))))
(next-primes [sieve candidate]
(if (sieve candidate)
(recur (next-sieve sieve candidate) (+ candidate 2))
(cons candidate
(lazy-seq (next-primes (next-sieve sieve candidate)
(+ candidate 2))))))]
(cons 2 (lazy-seq (next-primes {} 3)))))
and, surprisingly (better locality?), it computes the primes below 1,000,000 in 1s.

Monday, June 8, 2009

Linear Interpolation and sorted-map

This post has moved, go to its new location
Sorted collections (with subseq and rsubseq) can help when working with a partition of disjoint intervals, eg when you need to interpolate.
(defn interpolator 
"Takes a coll of 2D points (vectors) and returns
their linear interpolation function."
[points]
(let [m (into (sorted-map) points)]
(fn [x]
(let [[[x1 y1]] (rsubseq m <= x)
[[x2 y2]] (subseq m > x)]
(if x2
(+ y1 (* (- x x1) (/ (- y2 y1) (- x2 x1))))
y1)))))

;; => (map (interpolator [[0 0] [1 1] [3 2] [4 3]]) (range 0 9/2 1/2))
;; (0 1/2 1 5/4 3/2 7/4 2 5/2 3)

Moustache: syntax file and tests

This post has moved, go to its new location
Tests and a hairy syntax file.

Friday, June 5, 2009

Enlive selectors: documented!

This post has moved, go to its new location
Selectors syntax