Thursday, August 13, 2009
About the subtitle
I picked this Zen proverb because in early 2008 I was looking for a language with the following requirements: strong metaprogrammability, strong concurrency support, strong functional bias and bonus points for running on the JVM. I had prepared myself to not find the rare bird when I met Clojure (and it was nearly love at first sight).
So in that perspective I am the pupil and Clojure (the language, Rich Hickey and the whole community) the teacher.
Thursday, August 6, 2009
Clojure as fast as Java!
(dotimes [i 10]
(let [n (int (Math/pow 2 (+ 10 i)))
_ (println "n=" n)
a (do
(print "clojure\t")
(time (doall (let [s (reduce conj empty-set (range 0 n 2))] (map #(get s %) (range n))))))
b (do
(print "java\t")
(time (doall (let [s (reduce conj #{} (range 0 n 2))] (map #(get s %) (range n))))))]
(println (= a b))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
n= 1024
clojure "Elapsed time: 1.006064 msecs"
java "Elapsed time: 0.736197 msecs"
true
n= 2048
clojure "Elapsed time: 1.813009 msecs"
java "Elapsed time: 1.328102 msecs"
true
n= 4096
clojure "Elapsed time: 3.590191 msecs"
java "Elapsed time: 2.608153 msecs"
true
n= 8192
clojure "Elapsed time: 7.046566 msecs"
java "Elapsed time: 5.807302 msecs"
true
n= 16384
clojure "Elapsed time: 16.015862 msecs"
java "Elapsed time: 10.284897 msecs"
true
n= 32768
clojure "Elapsed time: 29.803928 msecs"
java "Elapsed time: 23.850378 msecs"
true
n= 65536
clojure "Elapsed time: 68.79778 msecs"
java "Elapsed time: 63.947582 msecs"
true
n= 131072
clojure "Elapsed time: 132.082499 msecs"
java "Elapsed time: 113.433411 msecs"
true
n= 262144
clojure "Elapsed time: 292.149631 msecs"
java "Elapsed time: 265.39197 msecs"
true
n= 524288
clojure "Elapsed time: 595.265321 msecs"
java "Elapsed time: 698.711009 msecs"
true
To tell the truth, this benchmark is a bit unfair for Java: no dummy map and slightly different algorithms (half-full bitmap nodes become array-nodes, no leaf-node etc.).
Follow me on Twitter, here
What *warn-on-reflection* doesn't tell you about arrays
user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (rem i 100) nil))))
Reflection warning, NO_SOURCE_PATH:840 - call to aset can't be resolved.
"Elapsed time: 136063.015272 msecs"
With aget/aset, the index must be hinted to be an int:
user=> (time (let [a (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))
"Elapsed time: 1064.546402 msecs"
Wow, more than 100x faster (reflection is bad) but despite the compiler doesn't complain one can help it to choose a faster path:
user=> (time (let [a #^"[Ljava.lang.Object;" (make-array Object 100)] (dotimes [i 10000000] (aset a (int (rem i 100)) nil))))
"Elapsed time: 247.446882 msecs"
On the whole we get a 500x speed-up with only two type hints.
Thursday, July 30, 2009
Everybody loves the Sieve of Eratosthenes
If I judge by traffic logs for this blog, many newcomers want to compute prime numbers in Clojure.
A recent thread on the mailing list prompted me to test an idea I had for a while: to use a map to implement the Sieve of Eratosthenes.
The first implementation was this one:
(defn primes [max]where the sieve is a map from the next non-prime numbers to their factors. It's so naive that even numbers are tested for primality but it doesn't perform that bad: on my box, it takes 3s to compute all primes below 1,000,000 (it's roughly as fast as clojure.contrib.lazy-seqs/primes when the seq isn't yet memoized).
(let [enqueue (fn [sieve n factor]
(let [m (+ n factor)]
(assoc sieve m
(conj (sieve m) factor))))
next-sieve (fn [sieve candidate]
(if-let [factors (sieve candidate)]
(reduce #(enqueue %1 candidate %2)
(dissoc sieve candidate)
factors)
(enqueue sieve candidate candidate)))]
(apply concat (vals (reduce next-sieve {} (range 2 max))))))
I wasn't happy with the way I handled the case where a non-prime was already checked off (ie was a key of the map): I was conjing onto the list of prime factors for this number. If instead I tried to check off n+p (where n is the already known non-prime and p the current prime) or n+2p or n+3p... until I found a yet unknown non-prime I wouldn't need to maintain a list of factors, nor to conj or reduce over it. And I was hoping that less allocations would yield better perfs.
Here is the second iteration:
(defn primes2 [max]and it computes all the primes below 1,000,000 in 1.8s instead of 3s but it still tests even numbers for primality.
(let [enqueue (fn [sieve n factor]
(let [m (+ n factor)]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(vals (reduce next-sieve {} (range 2 max)))))
primes3 is primes2 modified to only test odd numbers:
(defn primes3 [max]and it computes the same list of primes in 1.5s.
(let [enqueue (fn [sieve n factor]
(let [m (+ n (+ factor factor))]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))
Out of curiosity, I wrote a lazy version of primes3:
(defn lazy-primes3 []and, surprisingly (better locality?), it computes the primes below 1,000,000 in 1s.
(letfn [(enqueue [sieve n step]
(let [m (+ n step)]
(if (sieve m)
(recur sieve m step)
(assoc sieve m step))))
(next-sieve [sieve candidate]
(if-let [step (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate step))
(enqueue sieve candidate (+ candidate candidate))))
(next-primes [sieve candidate]
(if (sieve candidate)
(recur (next-sieve sieve candidate) (+ candidate 2))
(cons candidate
(lazy-seq (next-primes (next-sieve sieve candidate)
(+ candidate 2))))))]
(cons 2 (lazy-seq (next-primes {} 3)))))
Monday, June 8, 2009
Linear Interpolation and sorted-map
subseq
and rsubseq
) can help when working with a partition of disjoint intervals, eg when you need to interpolate.(defn interpolator
"Takes a coll of 2D points (vectors) and returns
their linear interpolation function."
[points]
(let [m (into (sorted-map) points)]
(fn [x]
(let [[[x1 y1]] (rsubseq m <= x)
[[x2 y2]] (subseq m > x)]
(if x2
(+ y1 (* (- x x1) (/ (- y2 y1) (- x2 x1))))
y1)))))
;; => (map (interpolator [[0 0] [1 1] [3 2] [4 3]]) (range 0 9/2 1/2))
;; (0 1/2 1 5/4 3/2 7/4 2 5/2 3)
Moustache: syntax file and tests
Friday, June 5, 2009
Sunday, May 17, 2009
The Need for More Lack of Understanding
It's not the same kind of non-understanding which I have in mind. I'm talking about a macro/DSL being able to not understand what is passed to it.
I think it's an important feature for the user to be able to get out of your DSL and back in regular Clojure.
In both Moustache and Enlive, I avoid to give a special meaning to lists (in Clojure you also have vectors, maps and sets), hence lists always denote user-code and are embedded as-is in the macro expansion. (If I really had to use lists for a DSL, I would check a list doesn't start with a special value (eg do
or unquote
(~) — thanks to MB's comment) before processing it).
That's why in Enlive you can easily add your own selectors step [:p (my-selector-step arg1 arg2)]
as long as (my-selector-step arg1 arg2)
evaluates to a correct value (here a state machine).
That's also how Moustache supports wrapping another Ring handler or custom route validation.
Monday, May 4, 2009
Counting without numbers
(loop [state init, n (count some-seq)]when it struck me that seqs are numerals too!
(if (pos? n)
(recur value (dec n))
(ends here)))
(loop [state init, n some-seq]or:
(if (seq n)
(recur value (rest n))
(ends here)))
(loop [state init, n (seq some-seq)]
(if n
(recur value (next n))
(ends here)))
Friday, May 1, 2009
Functionally growing a tree (2): insertion points and zippers
It eases growing trees from an empty root.
; show-ip represents the currently edited structure with * denoting the insertion-point
(defn show-ip [ip] (-> ip (insert-left '*) first z/root))
(def e (-> [] z/vector-zip (insertion-point :append)))
(-> e show-ip) ; [*]
(-> e (insert-left 1) show-ip) ; [1 *]
(-> e (insert-left 1) (insert-right 2) show-ip) ; [1 * 2]
(-> e (insert-left 1) (insert-right 2) left show-ip) ; [* 1 2]
(-> e (insert-left [1 2]) show-ip) ; [[1 2] *]
(-> e (insert-left [1 2]) left show-ip) ; [* [1 2]]
(-> e (insert-left [1 2]) left right show-ip) ; [[1 2] *]
(-> e (insert-left [1 2]) left next show-ip) ; [[* 1 2]]
(-> e (insert-left [1 2]) left next next show-ip) ; [[1 * 2]]
(-> e (insert-left [1 2]) left next next next show-ip) ; [[1 2 *]]
(-> e (insert-left [1 2]) left next next next next show-ip) ; [[1 2] *]
(-> e (insert-left [1 2]) left next next next next prev show-ip) ; [[1 2 *]]
Wednesday, April 29, 2009
.toUpperCase
.toUpperCase
or .toLowerCase
but specify Locale/ENGLISH
.
Tuesday, April 28, 2009
Counting occurences (solution to the exercise)
- What does
(merge-with + {:a 12} {:b 4} {:a 3 :b 7})
return? {:b 11, :a 15}
when there are several values for a key, these values are merged (two at a time) using the specified function — here they are summed.- Can you count occurrences of each value in a collection using
merge-with
? (apply merge-with + (map (fn [x] {x 1}) coll))
or, usingfor:
(apply merge-with + (for [x coll] {x 1}))
.
Monday, April 27, 2009
Screenscraping with Enlive
(select (html-resource (java.net.URL. "http://clojure-log.n01se.net/")) [:#main [:a (attr? :href)]])
returns a seq of link nodes.
Counting occurences
(reduce #(assoc %1 %2 (inc (%1 %2 0))) {} coll)
. Since it can take some time to get accustomed to the functional way of thought, here is how one can work such an expression out:- How to count all occurrences of 42?
(count (filter #{42} coll))
- How to express count using
reduce
? (defn my-count [coll] (reduce (fn [n _] (inc n)) 0 coll))
- How to count all occurrences of 42 using
reduce
? (reduce (fn [n _] (inc n)) 0 (filter #{42} coll))
- Can you get rid of the
filter
? (reduce (fn [n x] (if (= 42 x) (inc n) n)) 0 coll)
- I'd like the result to be
{42 occurences-count}
. (reduce (fn [m x] (if (= 42 x) (assoc m 42 (inc (m 42))) m)) {42 0} coll)
- 42 is hardcoded in four places, it's bad!
(reduce (fn [m x] (if (= 42 x) (assoc m x (inc (m x))) m)) {42 0} coll)
- Can't you replace
{42 0}
with{}
? - No
(inc (m x))
would fail because(m x)
would returnnil
. - How does one provide a default value when the key is not found ?
(a-map a-key default-value)
- Can't you replace
{42 0}
with{}
? (reduce (fn [m x] (if (= 42 x) (assoc m x (inc (m x 0))) m)) {} coll)
- I changed my mind I'd like you to count occurrences of each value.
- Easy!
(reduce (fn [m x] (assoc m x (inc (m x 0)))) {} coll)
or, terser,(reduce #(assoc %1 %2 (inc (%1 %2 0))) {} coll)
Exercise:
- What does
(merge-with + {:a 12} {:b 4} {:a 3 :b 7})
return? - Can you count occurrences of each value in a collection using
merge-with
?
Wednesday, April 22, 2009
Vicious bug
- What's wrong with this code
(do (defmacro foobar [x] (doto x println)) (foobar (+ 1 1)))
? - Nothing.
- Run it again.
- The first time,
foobar
is treated as a function (its argument has been evaluated) becausefoobar
is unknown before the whole top-level expression is compiled.
The second time, the var namedfoobar
preexists and is a macro,(foobar (+ 1 1))
is expanded.
with-test
to test macros. It's the kind of bug that goes unnoticed while developing incrementally and evaluating in the same REPL, it was waiting for me to start a fresh REPL.
Sanitizing HTML with Enlive
net.cgrand.enlive-html=> (sniptest "<div id=user-data>"You also need to remove most attributes but it's just a demo of something that was impossible with the old Enlive.
[:#user-data] (html-content "code injection<script>alert('boo')</script>")
[:#user-data (but #{:p :br :a :strong :em})] nil)
"<html><body><div id=\"user-data\">code injection</div></body></html>"
By the way, the old Enlive is no more. Long live the new Enlive!
Saturday, April 18, 2009
Make it work, make it right, make it fast
- Make it work
- sort of done: the old Enlive,
- Make it right
- work in progress, the new Enlive,
- Make it fast
- once the "right" branch is merged into master.
See the README file to know what's new and here for an example.
Friday, April 17, 2009
Mapping every second item
interleave
, take-nth
and map
or a combination of mapcat
and partition
when I thought of this:(map #(%1 %2) (cycle [f identity]) coll)I really love clojure's map parallel processing. (I should ask if
every?
and some
could be allowed to take several colls.)
Wednesday, March 25, 2009
mapsplice
My first laborious attempt involved extracting subseqs of untouched items, mapping the function on the remaining items then joining everything back. Bleh! (Really, I don't like indices but this time I haven't found a way to work around.)
I was sad when, suddenly, I found this shortest solution:
(defn mapsplice [f coll indices]
(let [need-splice (map (set indices) (iterate inc 0))]
(mapcat #(if %2 (f %1) [%1]) coll need-splice)))
;; (mapsplice #(repeat % %) (range 10) [7 3])
;; (0 1 2 3 3 3 4 5 6 7 7 7 7 7 7 7 8 9)
Tuesday, March 3, 2009
Simple variations on state machines in Clojure
(reduce transition-fn initial-state input)
returns the final state.If you are interested in intermediate states, you can use clojure.contrib.seq-utils/reductions instead of reduce.
If you want an asynchronous state machine, you can use an agent:
(def agt (agent initial-state))If you add a watch to the agent, you can react to state transitions.
;then each time you get an input-value:
(send agt transition-fn input-value)
Monday, February 23, 2009
rest vs drop
(rest a-seq)
and (drop 1 a-seq)
aren't equivalent anymore:user=> (def s (map #(do (println %) %) (range 10)))
#'user/s
user=> (def d (drop 1 s))
#'user/d
user=> (def r (rest s))
0
#'user/r
As one can see
rest
needs to realize the first element of s while drop
doesn't. The corollary is that (drop n s)
holds on the whole seq (including the nth first elements) while rest
computes the first element and then discards it.
Saturday, January 31, 2009
Shadowing global names
#'var-name
to access a shadowed global.Don't do that! You'd better use
namespace-where-defined-or-alias/var-name
.There are several reasons not to use
#'var-name:
- it doesn't work if the value of your var is not invokable,
- it evaluates to something different: it returns the var an not its value. It happens that vars proxy function calls to their values but a var can be rebound:
(def a (map str (range 10)))
(def b (map #'str (range 10)))
(take 5 a)
("0" "1" "2" "3" "4")
(take 5 b)
("0" "1" "2" "3" "4")
(binding [str identity] (doall a))
("0" "1" "2" "3" "4" "5" "6" "7" "8" "9")
(binding [str identity] (doall b))
("0" "1" "2" "3" "4" 5 6 7 8 9)
@#'var-name
is better than #'var-name
but, really, just use the namespaced symbol.
Tuesday, January 27, 2009
Living on the bleeding edge
(add-classpath "http://clojure-contrib.svn.sourceforge.net/viewvc/clojure-contrib/trunk/src/") ; sourceforge
(add-classpath "http://github.com/Lau-of-DK/clojureql/raw/master/src/"); github
(require '[clojure.contrib.duck-streams :as duck])
(require '[dk.bestinclass.clojureql :as ql])
Disclaimer:
add-classpath
is not meant to be used anywhere but at the repl,- you'd better trust what you download,
- if you are unlucky you can get an inconsistent snapshot,
- use at your own risk!
Tuesday, January 20, 2009
Bindings and send
what is the best way to bind vars to a value when sending to an agent?
I don't know if it's the best way but it's mine:
(defmacro capture-and-send
"Capture the current value of the specified vars and rebind
them on the agent thread before executing the action."
[vars agent action & args]
(let [locals (map #(gensym (name %)) vars)]
`(let [~@(interleave locals vars)
action# (fn [& args#]
(binding [~@(interleave vars locals)]
(apply ~action args#)))]
(send ~agent action# ~@args))))
;; usage:
(capture-and-send [*out*] a f b c)
I post it here because erohtar needed it, I needed it once so others may need it.
Monday, January 19, 2009
Enlive: yet another HTML templating library
Its main design goal is to decouple html and presentation code, that's why Enlive templates are plain old html files (it will ease roundtripping with designers).
In code, you have to declare where (using CSS-like selectors) and how (using clojure code) to alter the html template.
The resulting templating functions are compiled (the html tree isn't transformed at runtime) and yields a seq of strings.
Here is an example:
(deftemplate microblog-template "net/cgrand/enlive_html/example.html" [title posts](NB: manually edited to add linebreaks.)
[:title] title
[:h1] title
[:div.no-msg] (when-not (seq posts) ~(html/show))
[:div.post] (for [{:keys [title body]} posts]
~(at
[:h2] title
[:p] body)))
;; at the repl:
net.cgrand.enlive-html.examples=> (apply str (microblog-template "Hello user!"
[{:title "post #1"
:body "hello with dangerous chars: <>&"}
{:title "post #2"
:body "dolor ipsum"}]))
"<html><head><title>Hello user!</title></head>
<body><h1>Hello user!</h1>
<div class=\"post\"><h2>post #1</h2>
<p>hello with dangerous chars: <>&</p></div>
<div class=\"post\"><h2>post #2</h2>
<p>dolor ipsum</p></div></body></html>"
(Disclaimer: original idea by Ozzilee)
Sunday, January 11, 2009
Functionally growing a tree
Update: follow-up
Thursday, January 8, 2009
try-or, or-try, try-else or else-try?
(defmacro try-or
"Evaluates exprs one at a time, from left to right. If a form returns a
value, this value is returned. If a form throws an exception, the next
form is evaluated.
If the last form throws an exception, the exception isn't caught."
([] nil)
([form] form)
([form & forms]
`(try
~form
(catch Exception e#
(try-or ~@forms)))))
Monday, January 5, 2009
Recursive seqs
(defmacro rec-cat
"Similar to lazy-cat but binds the resulting sequence using the supplied
binding-form, allowing recursive expressions. The first collection
expression must not be recursive and must return a non-nil seq."
[binding-form expr & rec-exprs]
`(let [rec-rest# (atom nil)
result# (lazy-cat ~expr (force @rec-rest#))
~binding-form result#]
(swap! rec-rest# (constantly (delay (lazy-cat ~@rec-exprs))))
result#))
(defmacro rec-cons
"Similar to lazy-cons but binds the resulting sequence using the supplied
binding-form, allowing recursive expressions. The first expression must
not be recursive and must return a non-nil seq."
[binding-form expr & rec-exprs]
`(let [rec-rest# (atom nil)
result# (lazy-cons ~expr (force @rec-rest#))
~binding-form result#]
(swap! rec-rest# (constantly (delay (lazy-cat ~@rec-exprs))))
result#))
Examples:
Natural numbers (just like(iterate inc 0)
):(rec-cons naturals 0 (map inc naturals))fibonnaci sequence:
(rec-cat fibs [0 1] (map + fibs (rest fibs)))Chouser's cute reduction:
(defn reduction
"Returns a lazy seq of the intermediate values of the reduction (as
per reduce) of coll by f, starting with init."
([f coll]
(if (seq coll)
(rec-cons reductions (first coll) (map f reductions (rest coll)))
(cons (f) nil)))
([f init coll]
(rec-cons reductions init (map f reductions coll))))