This is a very simple script, and I think it could be simplified a lot further, especially with some kind of API. But I like finding new ways of using Hpricot to scrape HTML. Anyway, I was recently asked to build a list of WordPress plugins to share with my co-workers. I could have just sent them the link to the tag, but I thought a list of URLs would be helpful too. Here is the code:

```ruby
require 'rubygems'
require 'hpricot'
require 'open-uri'

# Base URL of the tag listing (left blank here).
@base = ""

# Print the href of every tagged link on the given results page.
def get_hrefs(number)
  doc = Hpricot(open(@base + "?page=#{number}", :proxy => ''))
  doc.search("//a[@class='taggedlink  ']").each do |t|
    p t.attributes['href']
  end
end

# Walk the first five pages of results.
(1..5).each do |entry|
  get_hrefs(entry)
end
```
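If the goal is a list of URLs to pass around rather than output printed to the console, the extraction step can collect the hrefs into an array. Below is a minimal sketch of that idea. To keep it runnable without extra gems or network access, it swaps Hpricot for REXML and parses a hard-coded, well-formed snippet. The sample markup, the example.com URLs, and the `extract_hrefs` name are all made up for illustration and are not from the original script.

```ruby
require 'rexml/document'

# Hypothetical sample markup standing in for one page of the tag listing.
SAMPLE_HTML = <<~HTML
  <div>
    <a class="taggedlink  " href="http://example.com/plugin-one">Plugin One</a>
    <a class="other" href="http://example.com/ignored">Ignored</a>
    <a class="taggedlink  " href="http://example.com/plugin-two">Plugin Two</a>
  </div>
HTML

# Return the href of every <a> whose class list includes "taggedlink".
# Splitting the class attribute on whitespace avoids depending on the
# exact trailing spaces in the attribute value.
def extract_hrefs(markup)
  doc = REXML::Document.new(markup)
  links = REXML::XPath.match(doc, '//a')
  tagged = links.select do |link|
    link.attributes['class'].to_s.split.include?('taggedlink')
  end
  tagged.map { |link| link.attributes['href'] }
end

urls = extract_hrefs(SAMPLE_HTML)
puts urls.join("\n")
```

REXML only accepts well-formed XML, so for real-world pages a forgiving HTML parser such as Hpricot is still the better fit. The point here is only the collect-into-an-array shape, which makes it easy to join the URLs into an email or write them to a file.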
