How do I extract all URLs from a plain text file in Ruby?
I tried a few libraries, but they failed in some cases. What is the best way to do this?
Which cases are failing?
From the regexpert library:
regexp = /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix
Then run scan on your text.
EDIT: It seems the regexp also matches the empty string. To correct that, remove the leading (^$)| alternative.
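A minimal sketch of that correction, assuming the goal is to keep only the lines of a text that consist entirely of a URL. Because the pattern is anchored with ^ and $, it will not find URLs embedded in the middle of a sentence. The constant `URL_LINE` and the helper `url_lines` are hypothetical names used for illustration only.

```ruby
# Same pattern as above with the leading (^$)| alternative removed,
# so empty lines no longer count as matches.
URL_LINE = /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/ix

# Hypothetical helper: keeps only the lines that consist of a single URL.
def url_lines(text)
  text.each_line.map(&:strip).select { |line| line.match?(URL_LINE) }
end

sample = "http://example.com/path\n\nnot a url\nhttps://foo.example.org\n"
p url_lines(sample)
```

For URLs surrounded by other text, an unanchored pattern or URI.extract is likely a better fit.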
If you want to use what Ruby already provides for you:
require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
# => ["http://foo.example.org/bla", "mailto:test@example.com"]
Documentation: http://railsapi.com/doc/ruby-v1.8/classes/URI.html#M004495
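Since the question is about a text file and about URLs specifically, here is a small sketch that restricts URI.extract to the http and https schemes and applies it to a file's contents. The helper names `extract_http_urls` and `extract_http_urls_from_file` are hypothetical. Note that URI.extract has been marked obsolete since Ruby 2.2 and prints a warning when Ruby runs in verbose mode, so treat this as a convenience rather than a long-term API.

```ruby
require "uri"

# Hypothetical helper: returns the unique http/https URLs found in a string.
def extract_http_urls(text)
  # The second argument restricts matches to the listed schemes,
  # so mailto: and similar links are skipped.
  URI.extract(text, %w[http https]).uniq
end

# Hypothetical helper: applies the same extraction to a file on disk.
def extract_http_urls_from_file(path)
  extract_http_urls(File.read(path))
end

sample = "text here http://foo.example.org/bla and here mailto:test@example.com and here also."
p extract_http_urls(sample)
```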
Using the twitter-text gem:
require "twitter-text"
class UrlParser
include Twitter::Extractor
end
urls = UrlParser.new.extract_urls("http://stackoverflow.com")
puts urls.inspect
You can use a regular expression with String#scan:
string.scan(/(https?:\/\/([-\w\.]+)+(:\d+)?(\/([\w\/_\.]*(\?\S+)?)?)?)/)
You can start from this regex and adjust it to your needs.
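One thing to watch for: when a pattern contains capture groups, String#scan returns an array of group arrays rather than the whole matched strings. The sketch below is a variant of the pattern above rewritten with non-capturing groups so that scan returns plain URL strings. The sample text and the name `url_pattern` are assumptions for illustration.

```ruby
text = "See http://example.com/a?x=1 and https://example.org:8080/b/c.html for details."

# Non-capturing groups (?:...) make scan return the full match as a string.
url_pattern = /https?:\/\/[-\w.]+(?::\d+)?(?:\/[\w\/_.]*(?:\?\S+)?)?/

urls = text.scan(url_pattern)
p urls
```

With the original capturing pattern, taking the first element of each result (for example with `map(&:first)`) gives the same full matches.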
If your input looks like this:
"http://i.imgur.com/c31IkbM.gifv;http://i.imgur.com/c31IkbM.gifvhttp://i.imgur.com/c31IkbM.gifv"
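i.e. the URLs are not necessarily surrounded by whitespace, may be separated by any delimiter, or may have no delimiter at all, you can use the following approach: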
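def process_images(raw_input)
  return [] if raw_input.nil?
  # Split on the scheme prefix; each piece after the first starts right after "http".
  urls = raw_input.split('http')
  # Drop whatever came before the first "http".
  urls.shift
  # Restore the prefix and cut each URL at the first whitespace, comma, or semicolon.
  urls.map { |url| "http#{url}".strip.split(/[\s,;]/)[0] }
end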
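As an alternative to splitting, the same idea can be sketched with a single scan and a lookahead. This is a swapped-in technique, not the method shown above: each match runs lazily until the next "http://" or "https://", a whitespace, comma, or semicolon delimiter, or the end of the string. The helper name `split_joined_urls` is hypothetical.

```ruby
# Hypothetical helper: extracts URLs that may be glued together or
# separated by whitespace, commas, or semicolons.
def split_joined_urls(raw_input)
  return [] if raw_input.nil?
  # Lazy match up to (but not including) the next scheme, a delimiter, or end of input.
  raw_input.scan(/https?:\/\/.+?(?=https?:\/\/|[\s,;]|\z)/)
end

raw = "http://i.imgur.com/c31IkbM.gifv;http://i.imgur.com/c31IkbM.gifvhttp://i.imgur.com/c31IkbM.gifv"
p split_joined_urls(raw)
```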
Hope this helps!
require 'uri'
foo = URI.parse("http://foobar/00u0u_gKHnmtWe0Jk_600x450.jpg")
# foo inspects as something like #<URI::HTTP:0x007f91c76ebad0 URL:http://foobar/00u0u_gKHnmtWe0Jk_600x450.jpg>
foo.to_s
# => "http://foobar/00u0u_gKHnmtWe0Jk_600x450.jpg"
Edit: explanation
For those who end up with a URI object, either from a JSON response or from a scraping tool such as Nokogiri or Mechanize, this solution worked for me.