Clojure and me has moved.

Monday, April 27, 2009

Screenscraping with Enlive

This post has moved; go to its new location.
(select (html-resource (java.net.URL. "http://clojure-log.n01se.net/")) [:#main [:a (attr? :href)]]) returns a seq of link nodes.
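
As a rough sketch of how that seq of nodes might be consumed, the snippet below pulls the href values out of the selected links. It assumes Enlive is on the classpath and aliased as html; the inline sample-page string and the link-hrefs helper are hypothetical names made up for illustration, and a string reader stands in for the live URL so the example runs offline.

```clojure
(ns scrape.sketch
  (:require [net.cgrand.enlive-html :as html]))

;; Hypothetical inline page, used instead of a live URL so the sketch runs offline.
(def sample-page
  (str "<div id=\"main\">"
       "<a href=\"/a\">A</a>"
       "<a name=\"x\">no href</a>"
       "<div><a href=\"/b\">B</a></div>"
       "</div>"
       "<a href=\"/c\">outside</a>"))

(defn link-hrefs
  "Returns the href of every <a> that has an href attribute and sits inside #main."
  [html-string]
  (let [nodes (html/html-resource (java.io.StringReader. html-string))]
    ;; Each selected node is a map with :tag, :attrs and :content keys.
    (map #(get-in % [:attrs :href])
         (html/select nodes [:#main [:a (html/attr? :href)]]))))

(link-hrefs sample-page)
```

Swapping the StringReader for (java.net.URL. "http://clojure-log.n01se.net/") should give the same shape of result against the real page.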

4 comments:

swannodette said...

Christophe,

What's the best way to get the flattened list of matching nodes? Using a zipper on the result of select?

Thanks!
David

Christophe Grand said...

David,

select already returns a list of nodes, so I'm unsure about what you want to flatten. Can you be more precise?

swannodette said...

Oops, sorry for the slow reply. I notice, for example, that when I extract all divs from http://nytimes.com, I only get two divs. That is because all the other divs are nested in those two top-level ones. I was just asking for guidance about the best way to traverse just the divs I'm interested in. Hopefully I'm making sense here.

Christophe Grand said...

I fixed this bug this morning (CEST).