Original title: How do I scrape a site, with multiple pages, and create one single html page with Ruby?
  • Date: 2011-11-05 17:41:16
  • Tags:
  • ruby
  • hpricot

So what I would like to do is scrape this site: http://boxerbiography.blogspot.com/ and create one HTML page that I can either print or send to my Kindle.




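Do I really have to look at the source of one of the entries (the prologue, for example), e.g. view source: http://boxerbiography.blogspot.com/2006/12/10-progamer-lim-yohwan-e-sports-icon.html, and manually code the extraction of the text between certain tags (h3, p, etc.)?

If I take that approach, I would have to study the source of every chapter/article and do the same for each. Doesn't that defeat the purpose of writing a script?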





I'd recommend using Nokogiri instead of Hpricot. It's more robust, uses fewer resources, has fewer bugs, is easier to use, and is faster.

I did a lot of scraping for work at one time, and had to switch to Nokogiri because Hpricot would inexplicably crash on some pages.

Check out this Railscast:

http://railscasts.com/episodes/190-cr-scraping-with-nokogiri



http://www.engineyard.com/blog/ 2010_started-with-nokogiri/



