Thursday, August 13, 2009
About the subtitle
I picked this Zen proverb because in early 2008 I was looking for a language with the following requirements: strong metaprogrammability, strong concurrency support, strong functional bias and bonus points for running on the JVM. I had prepared myself to not find the rare bird when I met Clojure (and it was nearly love at first sight).
So in that perspective I am the pupil and Clojure (the language, Rich Hickey and the whole community) the teacher.
Thursday, August 6, 2009
Clojure as fast as Java!
(dotimes [i 10]
(let [n (int (Math/pow 2 (+ 10 i)))
_ (println "n=" n)
a (do
(print "clojure\t")
(time (doall (let [s (reduce conj empty-set (range 0 n 2))] (map #(get s %) (range n))))))
b (do
(print "java\t")
(time (doall (let [s (reduce conj #{} (range 0 n 2))] (map #(get s %) (range n))))))]
(println (= a b))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
n= 1024
clojure "Elapsed time: 1.006064 msecs"
java "Elapsed time: 0.736197 msecs"
true
n= 2048
clojure "Elapsed time: 1.813009 msecs"
java "Elapsed time: 1.328102 msecs"
true
n= 4096
clojure "Elapsed time: 3.590191 msecs"
java "Elapsed time: 2.608153 msecs"
true
n= 8192
clojure "Elapsed time: 7.046566 msecs"
java "Elapsed time: 5.807302 msecs"
true
n= 16384
clojure "Elapsed time: 16.015862 msecs"
java "Elapsed time: 10.284897 msecs"
true
n= 32768
clojure "Elapsed time: 29.803928 msecs"
java "Elapsed time: 23.850378 msecs"
true
n= 65536
clojure "Elapsed time: 68.79778 msecs"
java "Elapsed time: 63.947582 msecs"
true
n= 131072
clojure "Elapsed time: 132.082499 msecs"
java "Elapsed time: 113.433411 msecs"
true
n= 262144
clojure "Elapsed time: 292.149631 msecs"
java "Elapsed time: 265.39197 msecs"
true
n= 524288
clojure "Elapsed time: 595.265321 msecs"
java "Elapsed time: 698.711009 msecs"
true
To tell the truth, this benchmark is a bit unfair for Java: no dummy map and slightly different algorithms (half-full bitmap nodes become array-nodes, no leaf-node etc.).
Follow me on Twitter, here
What *warn-on-reflection* doesn't tell you about arrays
user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (rem i 100) nil))))
Reflection warning, NO_SOURCE_PATH:840 - call to aset can't be resolved.
"Elapsed time: 136063.015272 msecs"
With aget/aset, the index must be hinted to be an int:
user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))
"Elapsed time: 1064.546402 msecs"
Wow, more than 100x faster (reflection is bad) but despite the compiler doesn't complain one can help it to choose a faster path:
user=> (time (let [a #^"[Ljava.lang.Object;" (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))
"Elapsed time: 247.446882 msecs"
On the whole we get a 500x speed-up with only two type hints.
Thursday, July 30, 2009
Everybody loves the Sieve of Eratosthenes
If I judge by traffic logs for this blog, many newcomers want to compute prime numbers in Clojure.
A recent thread on the mailing list prompted me to test an idea I had for a while: to use a map to implement the Sieve of Eratosthenes.
The first implementation was this one:
(defn primes [max]where the sieve is a map from the next non-prime numbers to their factors. It's so naive that even numbers are tested for primality but it doesn't perform that bad: on my box, it takes 3s to compute all primes below 1,000,000 (it's roughly as fast as clojure.contrib.lazy-seqs/primes when the seq isn't yet memoized).
(let [enqueue (fn [sieve n factor]
(let [m (+ n factor)]
(assoc sieve m
(conj (sieve m) factor))))
next-sieve (fn [sieve candidate]
(if-let [factors (sieve candidate)]
(reduce #(enqueue %1 candidate %2)
(dissoc sieve candidate)
factors)
(enqueue sieve candidate candidate)))]
(apply concat (vals (reduce next-sieve {} (range 2 max))))))
I wasn't happy with the way I handled the case where a non-prime was already checked off (ie was a key of the map): I was conjing onto the list of prime factors for this number. If instead I tried to check off n+p (where n is the already known non-prime and p the current prime) or n+2p or n+3p... until I found a yet unknown non-prime I wouldn't need to maintain a list of factors, nor to conj or reduce over it. And I was hoping that less allocations would yield better perfs.
Here is the second iteration:
(defn primes2 [max]and it computes all the primes below 1,000,000 in 1.8s instead of 3s but it still tests even numbers for primality.
(let [enqueue (fn [sieve n factor]
(let [m (+ n factor)]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(vals (reduce next-sieve {} (range 2 max)))))
primes3 is primes2 modified to only test odd numbers:
(defn primes3 [max]and it computes the same list of primes in 1.5s.
(let [enqueue (fn [sieve n factor]
(let [m (+ n (+ factor factor))]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))
Out of curiosity, I wrote a lazy version of primes3:
(defn lazy-primes3 []and, surprisingly (better locality?), it computes the primes below 1,000,000 in 1s.
(letfn [(enqueue [sieve n step]
(let [m (+ n step)]
(if (sieve m)
(recur sieve m step)
(assoc sieve m step))))
(next-sieve [sieve candidate]
(if-let [step (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate step))
(enqueue sieve candidate (+ candidate candidate))))
(next-primes [sieve candidate]
(if (sieve candidate)
(recur (next-sieve sieve candidate) (+ candidate 2))
(cons candidate
(lazy-seq (next-primes (next-sieve sieve candidate)
(+ candidate 2))))))]
(cons 2 (lazy-seq (next-primes {} 3)))))
Monday, June 8, 2009
Linear Interpolation and sorted-map
subseq
and rsubseq
) can help when working with a partition of disjoint intervals, eg when you need to interpolate.(defn interpolator
"Takes a coll of 2D points (vectors) and returns
their linear interpolation function."
[points]
(let [m (into (sorted-map) points)]
(fn [x]
(let [[[x1 y1]] (rsubseq m <= x)
[[x2 y2]] (subseq m > x)]
(if x2
(+ y1 (* (- x x1) (/ (- y2 y1) (- x2 x1))))
y1)))))
;; => (map (interpolator [[0 0] [1 1] [3 2] [4 3]]) (range 0 9/2 1/2))
;; (0 1/2 1 5/4 3/2 7/4 2 5/2 3)