I am using the httpclient gem (http://github.com/nahi/httpclient) to download large files. Ideally I would like to be able to download files up to 2 GB in size. Since I obviously don't want to hold the whole file contents in memory while downloading, I am using HTTPClient's streaming form of get_content and writing each chunk the block receives to a file.
require 'httpclient'

# 'url' is the remote file's URL; 'file' is a File already opened in binary write mode
HTTPClient.new.get_content(url) do |chunk|
  puts "Downloaded chunk of size #{chunk.size}"
  file.write(chunk)
end
A 10 MB file can easily take 30 seconds to download.
The puts in the block prints the chunk sizes like this:
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
The problem seems apparent if you look at those chunk sizes of 4108 and 12276 bytes: the chunks are really small, and I can't figure out how to make the chunk size bigger.
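One experiment to rule out per-chunk overhead (not something I have verified, just a sketch) is to buffer the small chunks and only write to the file in larger blocks:

buffer = ''
HTTPClient.new.get_content(url) do |chunk|
  buffer << chunk
  if buffer.size >= 1_048_576   # flush roughly every 1 MB
    file.write(buffer)
    buffer = ''
  end
end
file.write(buffer) unless buffer.empty?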
I have used Patron, which is based on libcurl, and it is pretty fast at downloading, but I am not too keen on introducing a dependency on libcurl for now.
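For reference, a minimal Patron equivalent looks roughly like this (a sketch; Patron::Session#get_file streams the response straight to a file on disk):

require 'patron'

sess = Patron::Session.new
sess.get_file(url, 'large_file.bin')   # destination path is just a placeholder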
Is there any way I can make HTTPClient download faster?
Update 10/27/2011
I tried both suggestions from @NaHi, and here is what I found.
When I set the transparent_gzip_decompression option I get the following exception:
Zlib::StreamError: stream error: invalid window bits
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient/session.rb:652:in `get_body'
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:1062:in `do_get_block'
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:866:in `do_request'
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:953:in `protect_keep_alive_disconnected'
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:865:in `do_request'
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:938:in `follow_redirect'
from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:577:in `get_content'
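For reference, a minimal sketch of the call that triggers this (assuming the usual HTTPClient accessor for the option):

client = HTTPClient.new
client.transparent_gzip_decompression = true
client.get_content(url) do |chunk|
  file.write(chunk)
end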
When I set the Accept-Encoding header to gzip,deflate I did see an improvement in performance: a 7296502 byte file that was earlier taking 24 seconds now takes 16 seconds, so that helps. In comparison, however, Patron downloads the same file in 1.5 seconds, so I'm still far from achieving the same kind of performance with HTTPClient.
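For reference, a minimal sketch of how the header can be passed (as far as I can tell, get_content takes an extra-header hash as its third argument in httpclient 2.2.x):

client = HTTPClient.new
client.get_content(url, nil, 'Accept-Encoding' => 'gzip,deflate') do |chunk|
  file.write(chunk)
end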