How do I convert between Windows wchar_t and GCC/Linux (not necessarily programmatically)?
  • Time: 2008-10-25 08:55:22
Best answer

One of the most widely used libraries for character conversion is ICU (http://icu-project.org/). It is used, for example, by some Boost (http://www.boost.org/) libraries.

Answers

Would converting UTF-16 (Visual C++'s format) to UTF-8, and then possibly from UTF-8 to UCS-4 (GCC's format), be an acceptable answer?

If so, then on Windows you could use the WideCharToMultiByte function (with CP_UTF8 for the CodePage parameter) for the first part of the conversion. Then you could either paste the resulting UTF-8 strings directly into your program, or convert them further. Here is a message showing how one person did it; you can also write your own code or do it manually (the official spec, with a section on exactly how to convert UTF-8 to UCS-4, can be found here). There may be an easier way; I'm not overly familiar with the conversion facilities on Linux yet.
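For the second step (UTF-8 to UCS-4), writing your own code could look roughly like the sketch below. It is only an illustration, not the code from the message mentioned above: the function name utf8_to_ucs4 is made up for this example, and it assumes a C++11 compiler so that std::u32string is available. Malformed input is reported by throwing std::invalid_argument, which is just one possible design choice.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode a UTF-8 byte string into UCS-4 (UTF-32) code points.
// Hypothetical helper for illustration; throws on malformed input.
std::u32string utf8_to_ucs4(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, surrogates, and out-of-range values.
        static const char32_t min_for_len[] = {0x0, 0x80, 0x800, 0x10000};
        if (cp < min_for_len[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        out.push_back(cp);
        i += extra + 1;
    }
    assert(i == in.size());  // every byte was consumed exactly once
    return out;
}
```

With GCC on Linux, where wchar_t is 32 bits wide, each resulting code point can be copied into a wchar_t one to one.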

You only need to worry about code units between 0xD800 and 0xDFFF inclusive (the surrogate range). Every other character maps to exactly the same value going from UTF-16 to UCS-4 once it is zero-extended to 32 bits.

Ignacio is right: if you don't use some rare Chinese characters (or some extinct scripts), then the mapping is one to one. (The official "lingo" is "if you don't have characters outside the BMP".)

This is the algorithm, just in case: http://unicode.org/faq/utf_bom.html#utf16-3 But again, it is most likely unnecessary for your real case.
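As a rough sketch of that algorithm (not an official implementation), the conversion from UTF-16 to UTF-32 might look like this in C++11. The function name utf16_to_utf32 is hypothetical, and throwing on an unpaired surrogate is an assumption of this example; you might prefer to substitute U+FFFD instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert UTF-16 code units to UTF-32 (UCS-4) code points.
// Hypothetical helper for illustration; throws on unpaired surrogates.
std::u32string utf16_to_utf32(const std::u16string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t unit = in[i];
        if (unit < 0xD800 || unit > 0xDFFF) {
            // Not a surrogate: the value is simply zero-extended.
            out.push_back(unit);
        } else if (unit <= 0xDBFF && i + 1 < in.size() &&
                   in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // High surrogate followed by a low surrogate: combine the pair.
            char32_t low = in[++i];
            char32_t cp = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
            assert(cp >= 0x10000 && cp <= 0x10FFFF);  // always outside the BMP
            out.push_back(cp);
        } else {
            throw std::invalid_argument("unpaired UTF-16 surrogate");
        }
    }
    return out;
}
```

Only the middle branch does any real work; for text that stays inside the BMP the loop just widens each 16-bit unit to 32 bits.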

You can also use the free resources from Unicode (ftp://ftp.unicode.org/Public/PROGRAMS/CVTUTF).




