Get a list of all the encodings Python can encode to

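I am writing a script that will try encoding bytes into many different encodings in Python 2.6. Is there some way to get a list of available encodings that I can iterate over?

The reason I am trying to do this is that a user has some text that is not encoded correctly. There are funny characters. I know the Unicode character that is causing the problem. I want to be able to give them an answer like "Your text editor is interpreting that string as X encoding, not Y encoding". I thought I would try to encode that character using one encoding, then decode it again using another encoding, and see if we get the same character sequence.

i.e. something like this: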

import itertools

for encoding1, encoding2 in itertools.permutations(encodinglist(), 2):
  try:
    unicode_string = my_unicode_character.encode(encoding1).decode(encoding2)
  except Exception:
    pass
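Best answer

Unfortunately `encodings.aliases.aliases.keys()` is NOT an appropriate answer.

`aliases` (as one would/should expect) contains several cases where different keys are mapped to the same value, e.g. `1252` and `windows_1252` are both mapped to `cp1252`. You could save time if, instead of `aliases.keys()`, you use `set(aliases.values())`.

BUT THERE'S A WORSE PROBLEM: `aliases` doesn't contain codecs that have no aliases (such as cp856, cp874, cp875, cp737, and koi8_u).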

>>> from encodings.aliases import aliases
>>> def find(q):
...     return [(k,v) for k, v in aliases.items() if q in k or q in v]
...
>>> find('1252') # multiple aliases
[('1252', 'cp1252'), ('windows_1252', 'cp1252')]
>>> find('856') # no codepage 856 in aliases
[]
>>> find('koi8') # no koi8_u in aliases
[('cskoi8r', 'koi8_r')]
>>> 'x'.decode('cp856') # but cp856 is a valid codec
u'x'
>>> 'x'.decode('koi8_u') # but koi8_u is a valid codec
u'x'
>>>
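
For readers on Python 3, where `str` has no `.decode()`, the same two points can be checked with a minimal sketch like the one below. It relies only on the standard `encodings.aliases` module; the names probed are the ones mentioned in this answer, and whether each shows up may well differ between Python versions, so no particular output is claimed.

```python
from encodings.aliases import aliases

# Collapsing the alias table to its values removes the duplicate targets.
canonical = sorted(set(aliases.values()))
print("%d aliases map to %d distinct codec names" % (len(aliases), len(canonical)))

# Names taken from the answer above; results may vary by Python version.
for name in ("cp1252", "cp856", "koi8_u"):
    status = "listed" if name in canonical else "not listed"
    print("%s: %s among the alias targets" % (name, status))
```

Any codec reported as "not listed" would be missed by a list built from `aliases` alone.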

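It is also worth noting that, however you obtain a full list of codecs, it may be a good idea to ignore the codecs that are not about encoding/decoding character sets but do some other transformation, e.g. `zlib`, `quopri`, and `base64`.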
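
One possible way to weed those out, offered as a sketch rather than the official method: on recent Python 3 versions `str.encode()` raises `LookupError` for codecs that are not text encodings (and for unknown names), so an attempt to encode the empty string can act as a filter. The helper name `is_text_encoding` is made up for this example, and the check does not behave this way on Python 2.

```python
def is_text_encoding(name):
    """Best-effort check that `name` is a str-to-bytes codec (Python 3 only)."""
    try:
        "".encode(name)
    except LookupError:
        # Unknown codec, or a bytes-to-bytes / str-to-str transform.
        return False
    except UnicodeError:
        # A registered text codec that rejects its input still counts.
        pass
    return True


for candidate in ("utf_8", "cp1252", "base64_codec", "zlib_codec", "quopri_codec"):
    print(candidate, is_text_encoding(candidate))
```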

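Which brings us to the question of WHY you want to "try encoding bytes into many different encodings". If we know that, we may be able to steer you in the right direction.

For a start, that is ambiguous. One DEcodes bytes into unicode, and one ENcodes unicode into bytes. Which do you want to do?

What are you really trying to achieve? Are you trying to determine which codec to use to decode some incoming bytes, and plan to attempt this with all possible codecs? [note: latin1 will decode anything] Are you trying to determine the language of some unicode text by attempting to encode it with all possible codecs? [note: utf8 will encode anything]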
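
The latin1 note matters in practice: a decode that succeeds proves very little. The following Python 3 sketch (the helper name `decodable_with` and the sample byte strings are chosen just for this illustration) collects the codecs that decode some bytes without raising.

```python
def decodable_with(data, encodings):
    """Return the encodings that decode `data` without raising."""
    accepted = []
    for name in encodings:
        try:
            data.decode(name)
        except (UnicodeError, LookupError):
            continue
        accepted.append(name)
    return accepted


every_byte = bytes(range(256))   # all 256 possible byte values
snowman = b"\xe2\x98\x83"        # UTF-8 encoding of U+2603

print(decodable_with(every_byte, ["ascii", "utf_8", "latin_1"]))
print(decodable_with(snowman, ["ascii", "utf_8", "latin_1"]))
```

`latin_1` accepts both inputs, including the one that is really UTF-8, so "it decoded without an error" cannot by itself identify the encoding.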

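Other answers

Other answers here seem to indicate that constructing this list programmatically is difficult and fraught with traps. However, doing so is probably unnecessary, since the documentation contains a complete list of the standard encodings Python supports, and has done so since Python 2.3.

You can find these lists (for each stable version of the language released so far) in the "Standard Encodings" section of each version's documentation; the URLs appear in the script at the end of this answer.

Below are the lists for each documented version of Python. Note that if you want backwards compatibility rather than just supporting one particular version of Python, you can simply copy the list from the latest version and check at run time whether each encoding exists before trying to use it.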
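
One way to reuse the newest list on an older interpreter is to filter it at run time. This is a minimal sketch, assuming that `codecs.lookup()` raising `LookupError` is an adequate test for "this codec is not available here"; the helper name `supported` and the short sample list are illustrative only.

```python
import codecs


def supported(names):
    """Keep only the codec names that the running interpreter can look up."""
    available = []
    for name in names:
        try:
            codecs.lookup(name)
        except LookupError:
            continue
        available.append(name)
    return available


# A few entries from the lists below, plus one deliberately bogus name.
wanted = ["ascii", "cp273", "koi8_t", "kz1048", "utf_8", "no_such_codec_xyz"]
print(supported(wanted))
```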

Python 2.3 (59 encodings)

[ ascii ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ]

Python 2.4 (85 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ]

Python 2.5 (86 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 2.6 (90 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 2.7 (93 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp720 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp858 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_11 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.0 (89 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.1 (90 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.2 (92 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp720 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp858 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.3 (93 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp720 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp858 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  cp65001 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.4 (96 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp273 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp720 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp858 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1125 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  cp65001 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_11 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_u ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.5 (98 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp273 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp720 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp858 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1125 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  cp65001 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_11 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_t ,
  koi8_u ,
  kz1048 ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.6 (98 encodings)

Same as previous version.

Python 3.7 (98 encodings)

Same as previous version.

Python 3.8 (97 encodings)

[ ascii ,
  big5 ,
  big5hkscs ,
  cp037 ,
  cp273 ,
  cp424 ,
  cp437 ,
  cp500 ,
  cp720 ,
  cp737 ,
  cp775 ,
  cp850 ,
  cp852 ,
  cp855 ,
  cp856 ,
  cp857 ,
  cp858 ,
  cp860 ,
  cp861 ,
  cp862 ,
  cp863 ,
  cp864 ,
  cp865 ,
  cp866 ,
  cp869 ,
  cp874 ,
  cp875 ,
  cp932 ,
  cp949 ,
  cp950 ,
  cp1006 ,
  cp1026 ,
  cp1125 ,
  cp1140 ,
  cp1250 ,
  cp1251 ,
  cp1252 ,
  cp1253 ,
  cp1254 ,
  cp1255 ,
  cp1256 ,
  cp1257 ,
  cp1258 ,
  euc_jp ,
  euc_jis_2004 ,
  euc_jisx0213 ,
  euc_kr ,
  gb2312 ,
  gbk ,
  gb18030 ,
  hz ,
  iso2022_jp ,
  iso2022_jp_1 ,
  iso2022_jp_2 ,
  iso2022_jp_2004 ,
  iso2022_jp_3 ,
  iso2022_jp_ext ,
  iso2022_kr ,
  latin_1 ,
  iso8859_2 ,
  iso8859_3 ,
  iso8859_4 ,
  iso8859_5 ,
  iso8859_6 ,
  iso8859_7 ,
  iso8859_8 ,
  iso8859_9 ,
  iso8859_10 ,
  iso8859_11 ,
  iso8859_13 ,
  iso8859_14 ,
  iso8859_15 ,
  iso8859_16 ,
  johab ,
  koi8_r ,
  koi8_t ,
  koi8_u ,
  kz1048 ,
  mac_cyrillic ,
  mac_greek ,
  mac_iceland ,
  mac_latin2 ,
  mac_roman ,
  mac_turkish ,
  ptcp154 ,
  shift_jis ,
  shift_jis_2004 ,
  shift_jisx0213 ,
  utf_32 ,
  utf_32_be ,
  utf_32_le ,
  utf_16 ,
  utf_16_be ,
  utf_16_le ,
  utf_7 ,
  utf_8 ,
  utf_8_sig ]

Python 3.9 (97 encodings)

Same as previous version.

Python 3.10 (97 encodings)

Same as previous version.

Python 3.11 (97 encodings)

Same as previous version.


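In case they are relevant to anyone's use case, note that the docs also list some Python-specific encodings, many of which seem to be primarily for use by Python's internals or are otherwise odd in some way, like the `undefined` encoding, which always raises an exception if you try to use it. You probably want to ignore these completely if, like the question-asker here, you are trying to figure out what encoding was used for some text you have come across in the real world. The list is as follows: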

["idna",
 "mbcs",
 "oem",
 "palmos",
 "punycode",
 "raw_unicode_escape",
 "rot_13",
 "undefined",
 "unicode_escape",
 "unicode_internal",
 "base64_codec",
 "bz2_codec",
 "hex_codec",
 "quopri_codec",
 "uu_codec",
 "zlib_codec"]
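
The behaviour of `undefined` is easy to confirm. A small Python 3 sketch, with a helper name (`raises_unicode_error`) invented for this example:

```python
def raises_unicode_error(action):
    """Return True if calling `action` raises UnicodeError."""
    try:
        action()
    except UnicodeError:
        return True
    return False


# Both directions fail, whatever the input is.
print(raises_unicode_error(lambda: "x".encode("undefined")))
print(raises_unicode_error(lambda: b"x".decode("undefined")))
```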

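Some older Python versions had a `string_escape` special encoding, which I have not included in the above list because it has been removed from the language.

Finally, in case you want to update the tables above for a newer version of Python, this script generates them: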

import re
import requests
import lxml.html
import pprint

previous = None
for version, url in [
    ('2.3', 'https://docs.python.org/2.3/lib/node130.html'),
    ('2.4', 'https://docs.python.org/2.4/lib/standard-encodings.html'),
    ('2.5', 'https://docs.python.org/2.5/lib/standard-encodings.html'),
    ('2.6', 'https://docs.python.org/2.6/library/codecs.html#standard-encodings'),
    ('2.7', 'https://docs.python.org/2.7/library/codecs.html#standard-encodings'),
    ('3.0', 'https://docs.python.org/3.0/library/codecs.html#standard-encodings'),
    ('3.1', 'https://docs.python.org/3.1/library/codecs.html#standard-encodings'),
    ('3.2', 'https://docs.python.org/3.2/library/codecs.html#standard-encodings'),
    ('3.3', 'https://docs.python.org/3.3/library/codecs.html#standard-encodings'),
    ('3.4', 'https://docs.python.org/3.4/library/codecs.html#standard-encodings'),
    ('3.5', 'https://docs.python.org/3.5/library/codecs.html#standard-encodings'),
    ('3.6', 'https://docs.python.org/3.6/library/codecs.html#standard-encodings'),
    ('3.7', 'https://docs.python.org/3.7/library/codecs.html#standard-encodings'),
    ('3.8', 'https://docs.python.org/3.8/library/codecs.html#standard-encodings'),
    ('3.9', 'https://docs.python.org/3.9/library/codecs.html#standard-encodings'),
    ('3.10', 'https://docs.python.org/3.10/library/codecs.html#standard-encodings'),
    ('3.11', 'https://docs.python.org/3.11/library/codecs.html#standard-encodings'),
]:
    html = requests.get(url).text
    # Work-around for weird HTML markup in recent versions of Python documentation:
    html = re.sub('<[/]?p>', '', html)
    doc = lxml.html.fromstring(html)
    standard_encodings_table = doc.xpath(
        '//table[preceding::h2[.//text()[contains(., "Standard Encodings")]]][//th/text()="Codec"]'
    )[0]
    codecs = standard_encodings_table.xpath('.//td[1]/text()')
    print("## Python %s (%i encodings)\n" % (version, len(codecs)))
    if codecs == previous:
        print('_Same as previous version._\n')
    else:
        print('```python\n' + pprint.pformat(codecs) + '\n```\n')
    previous = codecs

Maybe you should try using the Universal Encoding Detector (chardet) library instead of implementing this yourself.

>>> import chardet
>>> s = b'\xe2\x98\x83'  # ☃
>>> chardet.detect(s)
{'confidence': 0.505, 'encoding': 'utf-8'}

You could use a technique that lists all modules in the `encodings` package.

import pkgutil
import encodings

false_positives = set(["aliases"])

found = set(name for imp, name, ispkg in pkgutil.iter_modules(encodings.__path__) if not ispkg)
found.difference_update(false_positives)
print(found)

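I doubt there is such a method or function in the codecs module, but if you look at `encodings/__init__.py`, the search function looks through the encodings module folder, so you could take the same approach.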

>>> import os, encodings
>>> os.listdir(os.path.dirname(encodings.__file__))
['cp500.pyc', 'utf_16_le.py', 'gb18030.py', 'mbcs.pyc', 'undefined.pyc',
 'idna.pyc', 'punycode.pyc', 'cp850.py', 'big5hkscs.pyc', 'mac_arabic.py',
 '__init__.pyc', 'string_escape.py', 'hz.py', 'cp037.py', 'cp737.py',
 'iso8859_5.pyc', 'iso8859_13.pyc', 'cp861.pyc', 'cp862.py', 'iso8859_9.pyc',
 'cp949.py', 'base64_codec.pyc', 'koi8_r.py', 'iso8859_2.py', 'ptcp154.pyc',
 'uu_codec.pyc', 'mac_croatian.pyc', 'charmap.pyc', 'iso8859_15.pyc', 'euc_jp.py',
 'cp1250.py', 'iso8859_10.pyc', 'koi8_r.pyc', 'unicode_escape.pyc', 'cp863.pyc',
 'iso8859_4.pyc', 'cp852.py', 'unicode_internal.py', 'big5hkscs.py', 'cp1257.pyc',
 'cp1254.py', 'shift_jisx0213.py', 'shift_jis.pyc', 'cp869.pyc', 'hp_roman8.py',
 'iso8859_4.py', 'cp775.py', 'cp1251.py', 'mac_cyrillic.pyc', 'mac_greek.pyc',
 'mac_roman.pyc', 'iso8859_11.pyc', 'iso8859_6.py', 'utf_8_sig.py', 'iso8859_3.py',
 'iso2022_jp_1.py', 'ascii.py', 'cp1026.pyc', 'cp1250.pyc', 'cp950.py',
 'raw_unicode_escape.py', 'euc_jis_2004.pyc', 'cp775.pyc', 'euc_kr.py', 'mac_greek.py',
 'big5.pyc', 'shift_jis_2004.pyc', 'gbk.pyc', 'cp1254.pyc', 'cp1255.pyc',
 'cp855.pyc', 'string_escape.pyc', 'cp949.pyc', 'cp1258.pyc', 'iso8859_3.pyc',
 'mac_iceland.pyc', 'cp1251.pyc', 'cp860.py', 'cp856.py', 'cp874.py',
 'iso2022_kr.py', 'cp856.pyc', 'rot_13.py', 'palmos.py', 'iso2022_jp_2.pyc',
 'mac_farsi.py', 'koi8_u.pyc', 'cp1256.py', 'iso8859_10.py', 'tis_620.py',
 'iso8859_14.pyc', 'cp1253.py', 'cp1258.py', 'cp437.py', 'cp862.pyc',
 'mac_turkish.py', 'undefined.py', 'euc_kr.pyc', 'gb18030.pyc', 'aliases.pyc',
 'iso8859_9.py', 'uu_codec.py', 'gbk.py', 'quopri_codec.pyc', 'iso8859_7.py',
 'mac_iceland.py', 'iso8859_2.pyc', 'euc_jis_2004.py', 'iso2022_jp_3.pyc', 'cp874.pyc',
 '__init__.py', 'mac_roman.py', 'iso8859_16.py', 'cp866.py', 'unicode_internal.pyc',
 'mac_turkish.pyc', 'johab.pyc', 'cp037.pyc', 'punycode.py', 'cp1253.pyc',
 'euc_jisx0213.pyc', 'iso2022_jp_2004.pyc', 'iso2022_kr.pyc', 'zlib_codec.pyc', 'cp932.py',
 'cp1255.py', 'iso2022_jp_1.pyc', 'cp857.pyc', 'cp424.pyc', 'iso2022_jp_2.py',
 'iso2022_jp.pyc', 'mbcs.py', 'utf_8.py', 'palmos.pyc', 'cp1252.pyc',
 'aliases.py', 'quopri_codec.py', 'latin_1.pyc', 'iso2022_jp.py', 'zlib_codec.py',
 'cp1026.py', 'cp860.pyc', 'cp1252.py', 'hex_codec.pyc', 'iso8859_1.pyc',
 'cp850.pyc', 'cp861.py', 'iso8859_15.py', 'cp865.pyc', 'hp_roman8.pyc',
 'iso8859_7.pyc', 'mac_latin2.py', 'iso8859_11.py', 'mac_centeuro.pyc', 'iso8859_6.pyc',
 'ascii.pyc', 'mac_centeuro.py', 'iso2022_jp_3.py', 'bz2_codec.py', 'mac_arabic.pyc',
 'euc_jisx0213.py', 'tis_620.pyc', 'shift_jis_2004.py', 'utf_8.pyc', 'cp855.py',
 'mac_romanian.pyc', 'iso8859_8.py', 'cp869.py', 'ptcp154.py', 'utf_16_be.py',
 'iso2022_jp_ext.pyc', 'bz2_codec.pyc', 'base64_codec.py', 'latin_1.py', 'charmap.py',
 'hz.pyc', 'cp950.pyc', 'cp875.pyc', 'cp1006.pyc', 'utf_16.py',
 'shift_jisx0213.pyc', 'cp424.py', 'cp932.pyc', 'iso8859_5.py', 'mac_romanian.py',
 'utf_8_sig.pyc', 'iso8859_1.py', 'cp875.py', 'cp437.pyc', 'cp865.py',
 'utf_7.py', 'utf_16_be.pyc', 'rot_13.pyc', 'euc_jp.pyc', 'raw_unicode_escape.pyc',
 'iso8859_8.pyc', 'utf_16.pyc', 'iso8859_14.py', 'iso8859_16.pyc', 'cp852.pyc',
 'cp737.pyc', 'mac_croatian.py', 'mac_latin2.pyc', 'iso2022_jp_ext.py', 'cp1140.py',
 'mac_cyrillic.py', 'cp1257.py', 'cp500.py', 'cp1140.pyc', 'shift_jis.py',
 'unicode_escape.py', 'cp864.py', 'cp864.pyc', 'cp857.py', 'hex_codec.py',
 'mac_farsi.pyc', 'idna.py', 'johab.py', 'utf_7.pyc', 'cp863.py',
 'iso8859_13.py', 'koi8_u.py', 'gb2312.pyc', 'cp1256.pyc', 'cp866.pyc',
 'iso2022_jp_2004.py', 'utf_16_le.pyc', 'gb2312.py', 'cp1006.py', 'big5.py']

However, since anyone can register a codec, this will not be an exhaustive list.
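
To see why a directory listing cannot be exhaustive, here is a hedged Python 3 sketch that registers a codec which lives in no file at all. The codec name `demo_utf8_clone` and the search function `_demo_search` are hypothetical and simply reuse the UTF-8 implementation; `codecs.register()` and `codecs.CodecInfo` are the standard hooks.

```python
import codecs


def _demo_search(name):
    """Codec search function: answer only for the made-up name."""
    if name != "demo_utf8_clone":
        return None
    utf8 = codecs.lookup("utf_8")
    return codecs.CodecInfo(
        encode=utf8.encode,
        decode=utf8.decode,
        name="demo_utf8_clone",
    )


codecs.register(_demo_search)

# The codec now works although the encodings package has no such module.
print("abc".encode("demo_utf8_clone"))
print(codecs.lookup("demo_utf8_clone").name)
```

Nothing in the `encodings` directory changes, so any approach based on listing files or modules would not see this codec.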

From the Python 3.7.6 source, under /Tools/unicode/listcodecs.py:

""" List all available codec modules.

(c) Copyright 2005, Marc-Andre Lemburg (mal@lemburg.com).

    Licensed to PSF under a Contributor Agreement.

"""

import os, codecs, encodings

_debug = 0

def listcodecs(dir):
    names = []
    for filename in os.listdir(dir):
        if filename[-3:] != '.py':
            continue
        name = filename[:-3]
        # Check whether we've found a true codec
        try:
            codecs.lookup(name)
        except LookupError:
            # Codec not found
            continue
        except Exception as reason:
            # Probably an error from importing the codec; still it's
            # a valid code name
            if _debug:
                print('* problem importing codec %r: %s' %
                      (name, reason))
        names.append(name)
    return names


if __name__ == '__main__':
    names = listcodecs(encodings.__path__[0])
    names.sort()
    print('all_codecs = [')
    for name in names:
        print('    %r,' % name)
    print(']')

Then:

# `response` is whatever object carries the detected encoding; `names` comes from listcodecs()
if (str(response.encoding) == "undefined" or
        str(response.encoding) not in names):
    do_something()  # e.g. fall back to utf_8 and carry on

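The Python source code has a script at `Tools/unicode/listcodecs.py` that lists all codecs.

Among the listed codecs, however, some are not unicode-to-bytes converters, such as `base64_codec`, `quopri_codec` and `bz2_codec`, as John Machin pointed out.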

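import os
import encodings

def encodinglist():
    # A set avoids duplicates when both foo.py and foo.pyc are present
    found = set()
    for filename in os.listdir(os.path.dirname(encodings.__file__)):
        name = os.path.splitext(filename)[0]
        try:
            # Fails for modules that are not usable as a text codec
            "".encode(name)
        except Exception:
            continue
        # Report names with hyphens instead of underscores
        found.add(name.replace("_", "-"))
    return sorted(found)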

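Here is a programmatic way to list all the encodings defined in the stdlib `encodings` package. Note that this will not list user-defined encodings. It combines some of the tricks from the other answers, but actually produces a working list using each codec's canonical name.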

import encodings
import pkgutil
import pprint


all_encodings = set()

for _, modname, _ in pkgutil.iter_modules(
        encodings.__path__, encodings.__name__ + '.',
):
    try:
        mod = __import__(modname, fromlist=[str('__trash')])
    except (ImportError, LookupError):
        # A few encodings are platform specific: mbcs, cp65001
        # print('skip {}'.format(modname))
        # Skip this module so a stale `mod` is never reused below
        continue

    try:
        all_encodings.add(mod.getregentry().name)
    except AttributeError as e:
        # the `aliases` module doesn't actually provide a codec
        # print('skip {}'.format(modname))
        if 'regentry' not in str(e):
            raise

pprint.pprint(sorted(all_encodings))
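
With a working list of names in hand, the experiment described in the question could be sketched roughly as follows (Python 3). The function name `mojibake_pairs`, the sample characters and the short encoding list are assumptions made for this illustration; in practice the list would come from one of the approaches in this thread.

```python
import itertools


def mojibake_pairs(original, garbled, encodings):
    """Return (actual, assumed) pairs for which encoding `original` with
    `actual` and decoding the bytes with `assumed` yields `garbled`."""
    pairs = []
    for actual, assumed in itertools.permutations(encodings, 2):
        try:
            if original.encode(actual).decode(assumed) == garbled:
                pairs.append((actual, assumed))
        except (UnicodeError, LookupError):
            # This character or byte sequence is not valid in one of the codecs.
            continue
    return pairs


# U+00E9 (e with acute accent) showing up as the two characters U+00C3 U+00A9.
print(mojibake_pairs("\u00e9", "\u00c3\u00a9",
                     ["utf_8", "cp1252", "latin_1", "cp437"]))
```

More than one pair can match, so the result narrows the possibilities rather than naming a single culprit.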

Perhaps you can do this:

from encodings.aliases import aliases
print(aliases.keys())



