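I'm trying to use the following regex in my Ruby application to match HTTP links, but it produces invalid output: it appends a period, and sometimes a period and a word, to the end of the link, and the resulting links are invalid when tested on the web.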
URL_PATTERN = Regexp.new %r{http://[\w/.%-]+}i
<input>.to_s.scan( URL_PATTERN ).uniq
Is there something wrong with the above code for scanning links?
Application code:
require 'bundler/setup'
require 'twitter'
RECORD_LIMIT = 100
URL_PATTERN = Regexp.new %r{http://[\w/.%-]+}i
def usage
  warn "Usage: ruby #{File.basename $0} <hashtag>"
  exit 64
end
# Ensure that the hashtag has a hash symbol. This makes the leading #
# optional, which avoids the need to quote or escape it on the command line.
def format_hashtag(hashtag)
  (hashtag.scan(/^#/).empty?) ? "##{hashtag}" : hashtag
end
# Return a sorted list of unique URLs found in the list of tweets.
def uniq_urls(tweets)
  tweets.map(&:text).grep( %r{http://}i ).to_s.scan( URL_PATTERN ).uniq
end
def search(hashtag)
  Twitter.search(hashtag, rpp: RECORD_LIMIT, result_type: 'recent')
end
if __FILE__ == $0
  usage unless ARGV.size >= 1
  hashtag = format_hashtag(ARGV[0])
  tweets = search(hashtag)
  puts uniq_urls(tweets)
end
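
For reference, here is a minimal reproduction of the symptom that runs without the Twitter gem. The sample strings and the example.com URLs are made up for illustration (they are not real tweets), and the pattern is assumed to be the same one used above.

```ruby
# Same pattern as in the application code.
URL_PATTERN = %r{http://[\w/.%-]+}i

# Hypothetical sample texts standing in for tweet bodies (not real data).
samples = [
  "Read this http://example.com/abc.",
  "Look http://example.com/xyz.Then more text",
  "No link in this one"
]

# grep keeps only the strings containing a link; Array#to_s then turns the
# array into a single inspect-style string, which scan searches for matches.
urls = samples.grep(%r{http://}i).to_s.scan(URL_PATTERN).uniq

puts urls
```

With these inputs, the first match ends in a period and the second ends in a period followed by a word. The character class contains `.` and `\w`, so the match continues through any sentence punctuation and text that directly follows the link without a space.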