open-uri returning ASCII-8BIT from webpage encoded in iso-8859

I am using open-uri to read a webpage that claims to be encoded in iso-8859-1. When I read the contents of the page, open-uri returns a string encoded in ASCII-8BIT.

open("http://www.nigella.com/recipes/view/DEVILS-FOOD-CAKE-5310") {|f| p f.content_type, f.charset, f.read.encoding }
 => ["text/html", "iso-8859-1", #<Encoding:ASCII-8BIT>] 

I am guessing this happens because the webpage contains the byte (or character) \x92, which is not a valid iso-8859-1 character: http://en.wikipedia.org/wiki/ISO/IEC_8859-1

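I need to store webpages as utf-8 encoded files. Any ideas on how to deal with webpages whose declared encoding is incorrect? I could catch the exception and try to guess the correct encoding, but that seems cumbersome and error-prone.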

Answers
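One possible approach, offered as a sketch rather than a confirmed fix for this page: browsers treat the label iso-8859-1 as windows-1252, where byte 0x92 is a right single quotation mark. Assuming the page really is windows-1252, the raw bytes can be relabelled and then transcoded to UTF-8. The helper name `to_utf8` is hypothetical and not part of open-uri, and the sample bytes are made up to mimic the problem.

```ruby
# Hypothetical helper (not part of open-uri): relabel raw bytes with a usable
# source encoding, then transcode to UTF-8.
def to_utf8(raw, declared_charset)
  charset = declared_charset.to_s.downcase
  # Assumption: a page labelled iso-8859-1 is really windows-1252,
  # which is how browsers interpret that label.
  charset = "windows-1252" if charset.empty? || charset == "iso-8859-1"

  source = begin
    Encoding.find(charset)
  rescue ArgumentError
    # Unknown charset name: fall back to windows-1252 (also an assumption).
    Encoding::Windows_1252
  end

  # dup so the caller's string is not modified; bytes that have no mapping
  # become U+FFFD instead of raising an exception.
  raw.dup.force_encoding(source)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Made-up sample standing in for the bytes returned by f.read.
bytes = "Devil\x92s Food Cake".b
utf8 = to_utf8(bytes, "iso-8859-1")
puts utf8
puts utf8.encoding
```

With open-uri this would be called as `to_utf8(f.read, f.charset)`. Replacing unmappable bytes avoids the rescue-and-guess loop, at the cost of silently losing characters if the windows-1252 assumption is wrong for a given page.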



