Unicode <-> Multibyte conversion (native vs. managed)

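I'm trying to convert a Unicode string from .NET to native C++ so that I can write it to a text file. The process will then be reversed: the text of the file is read back and converted into a managed Unicode string.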

I use the following code:

String^ FromNativeToDotNet(std::string value)
{
  // Convert an ASCII string to a Unicode String
  std::wstring wstrTo;
  wchar_t *wszTo = new wchar_t[value.length() + 1];
  wszTo[value.size()] = L'\0';
  MultiByteToWideChar(CP_UTF8, 0, value.c_str(), -1, wszTo, (int)value.length());
  wstrTo = wszTo;
  delete[] wszTo;

  return gcnew String(wstrTo.c_str());
}


std::string FromDotNetToNative(String^ value)
{ 
  // Pass on changes to native part
  pin_ptr<const wchar_t> wcValue = SafePtrToStringChars(value);
  std::wstring wsValue( wcValue );

  // Convert a Unicode string to an ASCII string
  std::string strTo;
  char *szTo = new char[wsValue.length() + 1];
  szTo[wsValue.size()] = '\0';
  WideCharToMultiByte(CP_UTF8, 0, wsValue.c_str(), -1, szTo, (int)wsValue.length(), NULL, NULL);
  strTo = szTo;
  delete[] szTo;

  return strTo;
}

What happens is that, e.g., a Japanese character gets converted to two ASCII chars (漢 -> "w). I assume that's correct? But the other way does not work: when I call FromNativeToDotNet with "w, I only get "w back as a managed Unicode string... How can I get the Japanese character correctly restored?

Best answer

Try this instead:

String^ FromNativeToDotNet(std::string value)
{
  // Convert a UTF-8 string to a UTF-16 String
  int len = MultiByteToWideChar(CP_UTF8, 0, value.c_str(), value.length(), NULL, 0);
  if (len > 0)
  {
    std::vector<wchar_t> wszTo(len);
    MultiByteToWideChar(CP_UTF8, 0, value.c_str(), value.length(), &wszTo[0], len);
    return gcnew String(&wszTo[0], 0, len);
  }

  return gcnew String((wchar_t*)NULL);
}

std::string FromDotNetToNative(String^ value)
{ 
  // Pass on changes to native part
  pin_ptr<const wchar_t> wcValue = SafePtrToStringChars(value);

  // Convert a UTF-16 string to a UTF-8 string
  int len = WideCharToMultiByte(CP_UTF8, 0, wcValue, value->Length, NULL, 0, NULL, NULL);
  if (len > 0)
  {
    std::vector<char> szTo(len);
    WideCharToMultiByte(CP_UTF8, 0, wcValue, value->Length, &szTo[0], len, NULL, NULL);
    return std::string(&szTo[0], len);
  }

  return std::string();
}
Answers

It is better to use UTF8Encoding:

static String^ FromNativeToDotNet(std::string value)
{
    array<Byte>^ bytes = gcnew array<Byte>(value.length());
    System::Runtime::InteropServices::Marshal::Copy(IntPtr((void*)value.c_str()), bytes, 0, value.length());
    return (gcnew System::Text::UTF8Encoding)->GetString(bytes);
}


static std::string FromDotNetToNative(String^ value)
{ 
    if (value->Length == 0) return std::string("");
    array<Byte>^ bytes = (gcnew System::Text::UTF8Encoding)->GetBytes(value);
    pin_ptr<Byte> chars = &bytes[0];
    return std::string((char*)chars, bytes->Length);
}

a Japanese character gets converted to two ASCII chars (漢 -> "w). I assume that's correct?

No. That character, U+6F22, should be converted to three bytes: 0xE6 0xBC 0xA2
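The three bytes can be checked by hand with a small encoder. The following is a minimal sketch in portable C++; `encode_utf8` and `check_han_character` are hypothetical helper names introduced only for illustration, and the encoder assumes a valid code point (no surrogate or range checks).

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8.
// Assumption: cp is a valid scalar value; no validation is done here.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// U+6F22 falls in the three-byte range and encodes to E6 BC A2.
void check_han_character() {
    const std::string bytes = encode_utf8(0x6F22);
    assert(bytes.size() == 3);
    assert(bytes == "\xE6\xBC\xA2");
}
```

A two-character result such as "w therefore cannot be the UTF-8 form of this character.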

In UTF-16 (little endian), U+6F22 is stored in memory as 0x22 0x6F, which would look like "o in ASCII (rather than "w), so it looks like something is wrong with your conversion from String^ to std::string.
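To see why the raw UTF-16 bytes would read as "o, here is a small sketch that splits one UTF-16 code unit into its little-endian bytes. `utf16le_bytes` is a hypothetical helper written for this illustration, not part of any library.

```cpp
#include <cassert>
#include <string>

// Return the two bytes of a UTF-16 code unit in little-endian order
// (low byte first, then high byte).
std::string utf16le_bytes(char16_t unit) {
    std::string out;
    out += static_cast<char>(unit & 0xFF);         // low byte
    out += static_cast<char>((unit >> 8) & 0xFF);  // high byte
    return out;
}

// 0x6F22 is laid out as 0x22 0x6F, i.e. the ASCII characters '"' and 'o'.
void check_little_endian_layout() {
    assert(utf16le_bytes(0x6F22) == "\"o");
}
```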

I'm not familiar enough with String^ to know the right way to convert from String^ to std::wstring, but I'm pretty sure that's where your problem is.


I don't think the following has anything to do with your problem, but it is obviously wrong:

std::string strTo;
char *szTo = new char[wsValue.length() + 1];

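You already know that a single wide character can produce multiple narrow characters, so the number of wide characters is obviously not necessarily equal to or greater than the number of corresponding narrow characters.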

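You need to use WideCharToMultiByte to calculate the buffer size, and then call it again with a buffer of that size. Or you can just allocate a buffer that holds three times as many chars as there are wide chars.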
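The sizing argument can be illustrated without the Windows API. The sketch below counts the UTF-8 bytes a UTF-16 string needs, which is the kind of number the first WideCharToMultiByte call returns when it is given a zero-length output buffer. `utf8_length` is a hypothetical helper, and it assumes well-formed UTF-16 (an unpaired surrogate is simply counted as three bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count the UTF-8 bytes required to encode a UTF-16 string.
// Assumption: the input is well-formed UTF-16.
std::size_t utf8_length(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u < 0x80) {
            n += 1;                                   // ASCII
        } else if (u < 0x800) {
            n += 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < s.size()) {
            n += 4;                                   // surrogate pair: 2 units, 4 bytes
            ++i;                                      // skip the low surrogate
        } else {
            n += 3;                                   // rest of the BMP, e.g. U+6F22
        }
    }
    return n;
}

// One wide character can need three bytes, so length() + 1 is too small,
// while three bytes per wide character is always enough.
void check_buffer_bounds() {
    const std::u16string han = u"\u6F22";
    assert(utf8_length(han) == 3);
    assert(utf8_length(han) > han.size() + 1);
    assert(utf8_length(han) <= 3 * han.size());
}
```

A surrogate pair takes two wide characters and four bytes, which is why the factor of three per wide character is a safe upper bound.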




