UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data

How does Unicode work in Python 2? I just don't get it.

Here I download data from a server and parse it as JSON.

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/hubs/poll.py", line 92, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/greenthread.py", line 202, in main
    result = function(*args, **kwargs)
  File "android_suggest.py", line 60, in fetch
    suggestions = suggest(chars)
  File "android_suggest.py", line 28, in suggest
    return [i['s'] for i in json.loads(opener.open('https://market.android.com/suggest/SuggRequest?json=1&query='+s+'&hl=de&gl=DE').read())]
  File "/usr/lib/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.6/json/decoder.py", line 336, in raw_decode
    obj, end = self._scanner.iterscan(s, **kw).next()
  File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib/python2.6/json/decoder.py", line 217, in JSONArray
    value, end = iterscan(s, idx=end, context=context).next()
  File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib/python2.6/json/decoder.py", line 183, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib/python2.6/json/decoder.py", line 155, in JSONString
    return scanstring(match.string, match.end(), encoding, strict)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data

Thanks a lot.

EDIT: The following string causes the error: '[{"t":"q","s":"abh\xf6ren"}]'. \xf6 should be decoded to ö (abhören).

Best answer

The string you are trying to parse as JSON is not encoded in UTF-8. Most likely it is encoded in ISO-8859-1. Try the following:

json.loads(unicode(opener.open(...).read(), "ISO-8859-1"))

This will handle any umlauts that might appear in the JSON message.
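As a minimal sketch of why this works, the snippet below takes the exact bytes quoted in the question's EDIT, shows that they are rejected as UTF-8, and then decodes them as ISO-8859-1 before parsing. It uses `bytes.decode()` rather than `unicode(...)` so that it runs on Python 2.6+ as well as Python 3; the variable names are illustrative only.

```python
import json

# Bytes as quoted in the question: 0xF6 is "ö" in ISO-8859-1,
# but it is not a valid UTF-8 sequence in this position.
raw = b'[{"t":"q","s":"abh\xf6ren"}]'

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False  # this is the failure json.loads() ran into

# Decode with the encoding the server most likely used, then parse.
text = raw.decode("iso-8859-1")
data = json.loads(text)
word = data[0]["s"]  # the suggestion as a unicode string
```

The key point is that the bytes are decoded to a unicode string explicitly, so `json.loads()` never has to guess the encoding.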

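You should read Joel Spolsky's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". I hope it will clarify some of the issues you are having with Unicode.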

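Answers

My solution is a bit funny. I never thought it would be as easy as saving the file with the UTF-8 codec. I'm using Notepad++ (v5.6.8). I didn't notice that I had originally saved the file with the ANSI codec. I'm using a separate file to hold all the localized dictionaries. I found my solution under the "Encoding" tab of Notepad++: I selected "Encode in UTF-8 without BOM" and saved the file. It works brilliantly.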

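The error you are seeing means that the data you receive from the remote end is not valid JSON. JSON (according to the specification) is normally UTF-8, but it can also be UTF-16 or UTF-32 (in either big- or little-endian). The exact error you are seeing means that some part of the data was not valid UTF-8 (and was not UTF-16 or UTF-32 either, as those would produce different errors).

Perhaps you should examine the actual response you receive from the remote end instead of blindly passing the data to json.loads(). Right now, you are reading all the data from the response into a string and assuming it is JSON. Instead, check the content type of the response. Make sure the web page is actually claiming to give you JSON and not, for example, an error message that is not JSON.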

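(Also, after checking the response, use json.load() and pass it the file-like object returned by opener.open(), instead of reading all the data into a string and passing that to json.loads().)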
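The two suggestions in this answer could be combined roughly as follows. This is only a sketch: `load_json_response` is a hypothetical helper, an `io.BytesIO` object stands in for the file-like object returned by `opener.open()`, and the Content-Type value is made up for the example (the real server may report a different media type or no charset at all, in which case the strict check or the default would need adjusting).

```python
import codecs
import io
import json
from email.message import Message


def load_json_response(fileobj, content_type, default_charset="utf-8"):
    """Parse JSON from a file-like object after checking its Content-Type.

    Hypothetical helper: content_type is the raw header value, e.g. taken
    from the response headers of the object returned by opener.open().
    """
    header = Message()
    header["Content-Type"] = content_type
    media_type = header.get_content_type()
    if media_type != "application/json":
        raise ValueError("expected JSON, got %r" % media_type)
    # Use the charset the server declared, falling back to UTF-8.
    charset = header.get_content_charset() or default_charset
    # Wrap the byte stream so json.load() reads decoded text from it.
    reader = codecs.getreader(charset)(fileobj)
    return json.load(reader)


# Stand-in for a real response body and header (assumed values).
fake_response = io.BytesIO(b'[{"t":"q","s":"abh\xf6ren"}]')
suggestions = load_json_response(
    fake_response, "application/json; charset=ISO-8859-1"
)
```

Rejecting unexpected media types early means an HTML error page produces a clear message instead of a confusing decode error deep inside the JSON parser.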

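The solution of changing the encoding to Latin1 / ISO-8859-1 solved an issue I observed when calling html2text.py on the output of tex4ht. I use it for automated word counts on LaTeX documents: tex4ht converts them to HTML, and html2text.py then strips them down to plain text for further counting with wc -w. Now, if, for example, a German umlaut comes in through a bibliography database entry, that process fails, because html2text.py complains with, for example:

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 32243-32245: invalid data

Such errors are particularly hard to track down afterwards, and you essentially want to be able to have umlauts in your references section. A simple change inside html2text.py from

data = data.decode(encoding)

to

data = data.decode("ISO-8859-1")

solves the issue. If you call the script with the HTML file as the first parameter, you can also pass the encoding as the second parameter and spare yourself the modification.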
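Hard-coding ISO-8859-1 will misread input that really is UTF-8, so a slightly more forgiving variant is to try UTF-8 first and only then fall back. The sketch below is not what html2text.py does; `decode_with_fallback` is a hypothetical helper. Note that ISO-8859-1 maps every possible byte to a character, so as the last entry it never fails (it may still produce the wrong characters if the data is in some third encoding).

```python
def decode_with_fallback(data, encodings=("utf-8", "iso-8859-1")):
    """Return data decoded with the first encoding that accepts it."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of %r could decode the data" % (encodings,))


from_utf8 = decode_with_fallback(b"abh\xc3\xb6ren")   # valid UTF-8 input
from_latin1 = decode_with_fallback(b"abh\xf6ren")     # ISO-8859-1 input
```

Both calls yield the same unicode string, "abhören", even though the input bytes differ.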

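In case someone has the same problem: I am using vim with YouCompleteMe, and ycmd failed to start with this error message. What I did was export LC_CTYPE="en_US.UTF-8", and the problem was gone.

Paste this into your command line: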

export LC_CTYPE="en_US.UTF-8" 

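In your android_suggest.py, break up that monstrous one-liner return statement into one_step_at_a_time pieces. Log repr(string_passed_to_json.loads) somewhere so that it can be checked after an exception happens. Eyeball the results. If the problem is not evident, edit your question to show the repr.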
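One way the one-liner might be split up is sketched below. The function name `parse_suggestions`, the logger name, and the default encoding are assumptions for illustration; the network fetch is left out, so the function takes the raw bytes that `opener.open(...).read()` would return.

```python
import json
import logging

log = logging.getLogger("android_suggest")  # hypothetical logger name


def parse_suggestions(raw, encoding="iso-8859-1"):
    """Turn the raw response bytes into a list of suggestion strings."""
    # Step 1: record exactly what the server sent, so it can be
    # inspected after an exception ("%r" logs the repr of the bytes).
    log.debug("raw response: %r", raw)
    # Step 2: decode the bytes explicitly (the encoding is an assumption).
    text = raw.decode(encoding)
    # Step 3: parse the decoded text as JSON.
    items = json.loads(text)
    # Step 4: pull out the field the original one-liner returned.
    return [item["s"] for item in items]


result = parse_suggestions(b'[{"t":"q","s":"abh\xf6ren"}]')
```

With each step on its own line, a traceback points at the step that failed, and the logged repr shows the offending bytes (here `\xf6`).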

Temporary workaround: unicode(urllib2.urlopen(url).read(), 'utf8'); this should work if what is returned is UTF-8.

urlopen().read() returns bytes, and you have to decode them to a unicode string. Also, see http://bugs.python.org/issue4733.




