If I judge by traffic logs for this blog, many newcomers want to compute prime numbers in Clojure.

A recent thread on the mailing list prompted me to test an idea I had for a while: to use a map to implement the Sieve of Eratosthenes.

The first implementation was this one:

(defn primes [max]where the sieve is a map from the next non-prime numbers to their factors. It's so naive that even numbers are tested for primality but it doesn't perform that bad: on my box, it takes 3s to compute all primes below 1,000,000 (it's roughly as fast as clojure.contrib.lazy-seqs/primes when the seq isn't yet memoized).

(let [enqueue (fn [sieve n factor]

(let [m (+ n factor)]

(assoc sieve m

(conj (sieve m) factor))))

next-sieve (fn [sieve candidate]

(if-let [factors (sieve candidate)]

(reduce #(enqueue %1 candidate %2)

(dissoc sieve candidate)

factors)

(enqueue sieve candidate candidate)))]

(apply concat (vals (reduce next-sieve {} (range 2 max))))))

I wasn't happy with the way I handled the case where a non-prime was already *checked off* (ie was a key of the map): I was conjing onto the list of prime factors for this number. If instead I tried to *check off* n+p (where n is the already known non-prime and p the current prime) or n+2p or n+3p... until I found a yet unknown non-prime I wouldn't need to maintain a list of factors, nor to conj or reduce over it. And I was hoping that less allocations would yield better perfs.

Here is the second iteration:

(defn primes2 [max]and it computes all the primes below 1,000,000 in 1.8s instead of 3s but it still tests even numbers for primality.

(let [enqueue (fn [sieve n factor]

(let [m (+ n factor)]

(if (sieve m)

(recur sieve m factor)

(assoc sieve m factor))))

next-sieve (fn [sieve candidate]

(if-let [factor (sieve candidate)]

(-> sieve

(dissoc candidate)

(enqueue candidate factor))

(enqueue sieve candidate candidate)))]

(vals (reduce next-sieve {} (range 2 max)))))

primes3 is primes2 modified to only test odd numbers:

(defn primes3 [max]and it computes the same list of primes in 1.5s.

(let [enqueue (fn [sieve n factor]

(let [m (+ n (+ factor factor))]

(if (sieve m)

(recur sieve m factor)

(assoc sieve m factor))))

next-sieve (fn [sieve candidate]

(if-let [factor (sieve candidate)]

(-> sieve

(dissoc candidate)

(enqueue candidate factor))

(enqueue sieve candidate candidate)))]

(cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))

Out of curiosity, I wrote a lazy version of primes3:

(defn lazy-primes3 []and, surprisingly (better locality?), it computes the primes below 1,000,000 in 1s.

(letfn [(enqueue [sieve n step]

(let [m (+ n step)]

(if (sieve m)

(recur sieve m step)

(assoc sieve m step))))

(next-sieve [sieve candidate]

(if-let [step (sieve candidate)]

(-> sieve

(dissoc candidate)

(enqueue candidate step))

(enqueue sieve candidate (+ candidate candidate))))

(next-primes [sieve candidate]

(if (sieve candidate)

(recur (next-sieve sieve candidate) (+ candidate 2))

(cons candidate

(lazy-seq (next-primes (next-sieve sieve candidate)

(+ candidate 2))))))]

(cons 2 (lazy-seq (next-primes {} 3)))))