English 中文(简体)
Python: How to get StringIO.writelines to accept unicode string?
原标题:

I m getting a

UnicodeEncodeError:  ascii  codec can t encode character u xa3  in position 34: ordinal not in range(128)

on a string stored in a.desc below as it contains the £ character. It s stored in the underlying Google App Engine datastore as a unicode string so that s fine. The cStringIO.StringIO.writelines function is trying seemingly trying to encode it in ascii format:

result.writelines([ blahblah ,a.desc, blahblahblah ])

How do I instruct it to treat the encoding as unicode if that s the correct phrasing?

app engine runs on python 2.5

最佳回答

StringIO documentation:

Unlike the memory files implemented by the StringIO module, those provided by [cStringIO] are not able to accept Unicode strings that cannot be encoded as plain ASCII strings.

If possible, use StringIO instead of cStringIO.

问题回答

You can wrap the StringIO object in a codecs.StreamReaderWriter object to automatically encode and decode unicode.

Like this:

import cStringIO, codecs
buffer = cStringIO.StringIO()
codecinfo = codecs.lookup("utf8")
wrapper = codecs.StreamReaderWriter(buffer, 
        codecinfo.streamreader, codecinfo.streamwriter)

wrapper.writelines([u"list of", u"unicode strings"])

buffer will be filled with utf-8 encoded bytes.

If I understand your case correctly, you will only need to write, so you could also do:

import cStringIO, codecs
buffer = cStringIO.StringIO()
wrapper = codecs.getwriter("utf8")(buffer)

You can also encode your string as utf-8 manually before adding it to the StringIO

for val in rows:
    if isinstance(val, unicode):
        val = val.encode( utf-8 )
result.writelines(rows)

Python 2.6 introduced the io module and you should consider using io.StringIO(), "An in-memory stream for unicode text."

In older python versions this is not optimized (pure Python), in later versions this has been optimized to (fast) C code.





相关问题
Simple JAVA: Password Verifier problem

I have a simple problem that says: A password for xyz corporation is supposed to be 6 characters long and made up of a combination of letters and digits. Write a program fragment to read in a string ...

Case insensitive comparison of strings in shell script

The == operator is used to compare two strings in shell script. However, I want to compare two strings ignoring case, how can it be done? Is there any standard command for this?

Trying to split by two delimiters and it doesn t work - C

I wrote below code to readin line by line from stdin ex. city=Boston;city=New York;city=Chicago and then split each line by ; delimiter and print each record. Then in yet another loop I try to ...

String initialization with pair of iterators

I m trying to initialize string with iterators and something like this works: ifstream fin("tmp.txt"); istream_iterator<char> in_i(fin), eos; //here eos is 1 over the end string s(in_i, ...

break a string in parts

I have a string "pc1|pc2|pc3|" I want to get each word on different line like: pc1 pc2 pc3 I need to do this in C#... any suggestions??

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签