你的问题对我来说有点不清楚,无论如何,以下准则应该有助于解决你的问题。
如果您在 Python 源代码中定义这些字符串, 那么您应该
- know in which character encoding your editor saves the source code file (e.g. utf-8)
- declare that encoding in the first line of your source file, via e.g.
# -*- coding: utf-8 -*-
- define those strings as unicode objects:
strings = [u "andia, Tailândia & amp; Cingapura", u " lines through the days 1 (阿拉伯语) {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\>
<\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
(注:在 Python 3 中, 字串默认为 Unicode 对象, 也就是说, 您不需要 < code> > u code > 。 在 Python 2 中, unicode 字符属于 < code> unicode 类型, 在 Python 中, 3 Unicode 字符串属于 < code> string 类型。 )
当您想要将这些字符串保存到文件时, 您应该明确定义字符编码 :
with open( filename , w ) as f:
s =
.join(strings)
f.write(s.encode( utf-8 ))
当您想要从该文件中再次读取这些字符串时, 您必须再次明确定义字符编码, 以便正确解码文件内容 :
with open( filename ) as f:
strings = [l.decode( utf-8 ) for line in f]