feedparser fails during script run, but can't reproduce in interactive Python console

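When I run my script, it fails with this error:

    'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)

I don't know why, but when I simply execute the feedparser.parse(url) statement with the same URL in the interactive console, no error is raised. This has had me stumped for a long time.

The code is simple: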

      try:
           d = feedparser.parse(url)
      except Exception, e:
           logging.error('Error while retrieving feed.')
           logging.error(e)
           logging.error(formatExceptionInfo(None))
           logging.error(formatExceptionInfo1())

Here is the traceback:

    d = feedparser.parse(url)
  File "C:\Python26\lib\site-packages\feedparser.py", line 2623, in parse
    feedparser.feed(data)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "C:\Python26\lib\sgmllib.py", line 104, in feed
    self.goahead(0)
  File "C:\Python26\lib\sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "C:\Python26\lib\sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "C:\Python26\lib\sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "C:\Python26\lib\site-packages\feedparser.py", line 476, in unknown_endtag
    method()
  File "C:\Python26\lib\site-packages\feedparser.py", line 1318, in _end_content
    value = self.popContent('content')
  File "C:\Python26\lib\site-packages\feedparser.py", line 700, in popContent
    value = self.pop(tag)
  File "C:\Python26\lib\site-packages\feedparser.py", line 641, in pop
    output = _resolveRelativeURIs(output, self.baseuri, self.encoding)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1594, in _resolveRelativeURIs
    p.feed(htmlSource)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "C:\Python26\lib\sgmllib.py", line 104, in feed
    self.goahead(0)
  File "C:\Python26\lib\sgmllib.py", line 138, in goahead
    k = self.parse_starttag(i)
  File "C:\Python26\lib\sgmllib.py", line 296, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "C:\Python26\lib\sgmllib.py", line 338, in finish_starttag
    self.unknown_starttag(tag, attrs)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1588, in unknown_starttag
    attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs]
  File "C:\Python26\lib\site-packages\feedparser.py", line 1584, in resolveURI
    return _urljoin(self.baseuri, uri)
  File "C:\Python26\lib\site-packages\feedparser.py", line 286, in _urljoin
    return urlparse.urljoin(base, uri)
  File "C:\Python26\lib\urlparse.py", line 215, in urljoin
    params, query, fragment))
  File "C:\Python26\lib\urlparse.py", line 184, in urlunparse
    return urlunsplit((scheme, netloc, url, query, fragment))
  File "C:\Python26\lib\urlparse.py", line 192, in urlunsplit
    url = scheme + ':' + url
  File "C:\Python26\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
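UPDATE:

I can reproduce this when the URL is passed as a unicode string. When it is passed as a plain byte string, it works. For the record, you need a feed that contains some high-range Unicode characters. I'm not sure why this is.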
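A possible explanation, judging from the `url = scheme + ':' + url` frame in `urlunsplit`: in Python 2, concatenating a unicode string with a byte string makes the interpreter decode the byte string implicitly with the default codec (ASCII unless it has been changed), and that decode fails on any byte above 0x7f. The sketch below performs the same decode explicitly so it also runs on Python 3; the URI value and the helper name are made up for illustration and are not taken from the failing feed.

```python
# Hypothetical relative URI as raw bytes, as it might appear in a feed's markup.
# The UTF-8 encoding of U+2019 (right single quote) starts with byte 0xe2.
raw_uri = u"/posts/it\u2019s-here".encode("utf-8")


def ascii_decode_error(data):
    """Return the error message of an ASCII decode, or None if it succeeds."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = ascii_decode_error(raw_uri)
print(message)
```

If this is what happens, a unicode base URL combined with non-ASCII bytes in the feed would be enough to trigger the error, while a byte-string base URL would not.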

Best answer

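It looks like the URL that is giving you problems contains text in some encoding (for example Latin-1, where 0xe2 is "lowercase a with circumflex", also known as &acirc;) without a proper content type header (there should be a charset= parameter in Content-Type:).

If that is the case, feedparser cannot guess the encoding, falls back to the default (ascii), and fails.

This part of feedparser's documentation explains the issue in more detail.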
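To check whether a response actually declares a charset, the header value can be parsed with the standard library. This is a minimal sketch for Python 3; the helper name and the sample header values are assumptions for illustration, not values from the feed in question.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# Hypothetical header values: one declares a charset, the other does not.
declared = charset_from_content_type("application/atom+xml; charset=ISO-8859-1")
missing = charset_from_content_type("text/xml")
print(declared, missing)
```

A `None` result corresponds to the situation described above, where the parser has to guess the encoding.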

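Unfortunately, there is no "magic bullet" for this general problem. You could catch the exception and, in the handler, read the URL's contents separately (using urllib2) and try decoding them with various possible codecs. Once you finally get a usable unicode object that way, pass it to feedparser.parse (whose first argument can be a URL, a file stream, or a unicode string with the data).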
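The decoding step of that suggestion could look roughly like the sketch below. It only covers trying codecs in order on bytes that have already been downloaded; the function name, the candidate list, and the sample bytes are assumptions, and the right candidates depend on where the feed comes from.

```python
def decode_with_fallbacks(data, encodings=("utf-8", "cp1252", "latin-1")):
    """Try each codec in order and return (text, codec_name) for the first success.

    latin-1 maps every byte to a code point, so it never fails and
    should stay last as a catch-all.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical feed fragment: a lone 0xe2 byte is not valid UTF-8,
# but it decodes to a-circumflex in cp1252 and Latin-1.
raw = b"<title>Ch\xe2teau</title>"
text, used = decode_with_fallbacks(raw)
print(used)
print(text)
```

The resulting string is what would then be handed to feedparser.parse in place of the URL. A successful decode does not prove the guess was right, so the order of the candidates matters.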

Other answers

Regarding the OP's comment: "Try any URL, like u'myfeed.blah/xml'. It should reproduce."

>>> from pprint import pprint as pp
>>> import feedparser

>>> d = feedparser.parse(u'myfeed.blah/xml')
>>> pp(d)
{'bozo': 1,
 'bozo_exception': SAXParseException('not well-formed (invalid token)',),
 'encoding': 'utf-8',
 'entries': [],
 'feed': {},
 'namespaces': {},
 'version': ''}

>>> d = feedparser.parse(u'http://myfeed.blah/xml')
>>> pp(d)
{'bozo': 1,
 'bozo_exception': URLError(gaierror(11001, 'getaddrinfo failed'),),
 'encoding': 'utf-8',
 'entries': [],
 'feed': {},
 'version': None}

>>> d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
>>> d['bozo']
0
>>> d['feed']['title']
u'Sample Feed'

>>> d = feedparser.parse(u"http://feedparser.org/docs/examples/atom10.xml")
>>> d['bozo']
0
>>> d['feed']['title']
u'Sample Feed'
>>>

Please stop wasting time; provide a URL that actually causes the problem.




