Converting UTF-8 to UTF-32, pre-calculating the number of chars in each

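I have a working algorithm that converts a UTF-8 string to a UTF-32 string, but I have to allocate all the space for my UTF-32 string ahead of time. Is there any way to know in advance how many UTF-32 chars a given UTF-8 string will take up?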

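For example, a UTF-8 string consisting of one two-byte character followed by "0" is 3 chars, and once converted to UTF-32 it is 2 unsigned 32-bit values. Is there any way to know how many UTF-32 chars I will need before doing the conversion? Or will I have to rewrite the algorithm?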

Best answer

There are two basic options:

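  1. You can make two passes over the UTF-8 string: the first counts the number of UTF-32 chars you will need to generate, and the second writes them into a buffer.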

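  2. Allocate the maximum number of 32-bit chars you could possibly need, i.e. the length of the UTF-8 string in bytes. This wastes memory, but it means you can do the utf8 -> utf32 transformation in a single pass.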
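The second option could be sketched roughly as follows. This is only an illustration, assuming the input is well-formed UTF-8 held in a `std::string`; the function name `utf8_to_utf32_onepass` is hypothetical, and no validation (overlong forms, surrogates, truncated sequences) is attempted.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-32 in a single pass.
// Assumes well-formed input; malformed sequences are not detected.
std::u32string utf8_to_utf32_onepass(const std::string& src) {
    // Worst case: every byte is an ASCII character, so the output can never
    // hold more code points than the input has bytes.
    std::u32string out(src.size(), U'\0');
    std::size_t n = 0;  // number of code points written so far

    for (std::size_t i = 0; i < src.size(); ) {
        unsigned char lead = static_cast<unsigned char>(src[i]);
        char32_t cp;
        std::size_t extra;  // continuation bytes expected after the lead byte
        if (lead < 0x80)      { cp = lead;        extra = 0; }  // 0xxxxxxx
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }  // 110xxxxx
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }  // 1110xxxx
        else                  { cp = lead & 0x07; extra = 3; }  // 11110xxx
        ++i;
        // Each continuation byte contributes its low 6 bits.
        for (; extra > 0 && i < src.size(); --extra, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(src[i]) & 0x3F);
        }
        out[n++] = cp;
    }
    out.resize(n);  // drop the unused tail of the worst-case allocation
    return out;
}
```

With the three bytes `C2 A5 30` as input (a two-byte character followed by "0"), the buffer starts with room for 3 code points and is trimmed to 2 at the end.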

You could also use a hybrid: for example, if the string is shorter than some threshold, use the second approach; otherwise use the first.
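One way the hybrid decision might look is sketched below. The threshold value of 256 bytes and the names `kSmallStringThreshold`, `count_code_points` and `utf32_capacity_for` are assumptions made for illustration; a sensible cutoff would depend on the workload.

```cpp
#include <cstddef>
#include <string>

// Counts code points in (assumed valid) UTF-8 by skipping continuation
// bytes, i.e. bytes of the form 10xxxxxx.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if ((b & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical cutoff, in bytes, below which over-allocating is accepted.
constexpr std::size_t kSmallStringThreshold = 256;

// Returns how many char32_t elements to allocate before converting `src`.
std::size_t utf32_capacity_for(const std::string& src) {
    if (src.size() < kSmallStringThreshold) {
        // Short string: take the worst case (one code point per byte)
        // and skip the counting pass.
        return src.size();
    }
    // Long string: pay for an extra pass to get the exact count.
    return count_code_points(src);
}
```

For short inputs the result is an upper bound rather than the exact length, so the caller still has to record how many code points were actually written.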

For the first approach, the first pass would look something like this:

size_t len=0;  // warning: untested code.
for(const char *p=src; *p; ++p) {
    // characters that begin with binary 10xxxxxx... are continuations; all other
    // characters should begin a new utf32 char (assuming valid utf8 input)
    if ((*p & 0xc0) != 0x80) ++len;
}
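Putting the counting pass together with a decoding pass might look like the sketch below. It assumes a NUL-terminated, well-formed UTF-8 string; the names `utf32_length` and `utf8_to_utf32_twopass` are hypothetical, and invalid input is not detected.

```cpp
#include <cstddef>
#include <string>

// First pass: the same counting rule as the loop above, with the byte
// viewed as unsigned so the mask does not depend on the signedness of char.
std::size_t utf32_length(const char* src) {
    std::size_t len = 0;
    for (const char* p = src; *p; ++p) {
        if ((static_cast<unsigned char>(*p) & 0xC0) != 0x80) ++len;
    }
    return len;
}

// Second pass: decode into a buffer reserved for the counted length.
std::u32string utf8_to_utf32_twopass(const char* src) {
    std::u32string out;
    out.reserve(utf32_length(src));  // room for exactly the counted code points

    const unsigned char* p = reinterpret_cast<const unsigned char*>(src);
    while (*p) {
        char32_t cp;
        int extra;  // continuation bytes expected after the lead byte
        if (*p < 0x80)      { cp = *p;        extra = 0; }  // 0xxxxxxx
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }  // 110xxxxx
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }  // 1110xxxx
        else                { cp = *p & 0x07; extra = 3; }  // 11110xxx
        ++p;
        // Stop early at the terminator so a truncated sequence cannot
        // make the loop read past the end of the string.
        for (; extra > 0 && *p; --extra, ++p) {
            cp = (cp << 6) | (*p & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

Because the count is taken first, the `push_back` calls should not need to grow the buffer for valid input.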