Question

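In C11, a new string literal was added with the prefix u8. It yields an array of char whose text is encoded as UTF-8. How is this even possible? Isn't a normal char signed? Doesn't the sign bit leave it one bit less to carry information? My logic says that a string of UTF-8 text would need to be an array of unsigned char.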

Answer 1

Isn't a normal char signed?

It is implementation-defined whether char is signed or unsigned.

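Further, the sign bit isn't "lost": it can still be used to represent information. Also, char is not necessarily 8 bits wide (it may be larger on some platforms).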
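As a minimal sketch of both points (the helper names plain_char_is_signed and byte_value are made up for illustration), the signedness of char can be queried through CHAR_MIN, and the full byte stored in a char can be read back through an unsigned char pointer regardless of that signedness:

```c
#include <assert.h>
#include <limits.h>

/* The standard guarantees at least 8 bits per char; it may be more. */
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");

/* Hypothetical helper: 1 if plain char is signed here, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: reads the object representation of c, so the
 * top bit is returned as data no matter how char is signed. */
unsigned byte_value(char c)
{
    return *(const unsigned char *)&c;
}
```

For instance, byte_value applied to the first byte of "\xC3" gives 0xC3 even on a platform where that char compares as negative.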

Answer 2

There is a potential problem here:

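If an implementation with CHAR_BIT == 8 uses sign-magnitude representation for char (so char is signed), then when UTF-8 requires the bit pattern 10000000, that is a negative 0. So if the implementation further does not support negative 0, a given UTF-8 string might contain an invalid (trap) value of char, which is problematic. Even if it does support negative zero, the fact that bit pattern 10000000 compares equal as a char to bit pattern 00000000 (the nul terminator) is liable to cause problems when using UTF-8 data in a char[].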



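I think this means that for sign-magnitude C11 implementations, char needs to be unsigned. Normally it is up to the implementation whether char is signed or unsigned, but of course if making char signed results in failing to implement UTF-8 literals correctly, then the implementer just has to pick unsigned. As an aside, this has been the case for non-2's complement implementations of C++ all along anyway, since C++ allows char as well as unsigned char to be used to access object representations. C only allows unsigned char.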

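In 2's complement and 1s' complement, the bit patterns required for UTF-8 data are valid values of signed char, so the implementation is free to make char either signed or unsigned and still be able to represent UTF-8 strings in char[]. That is because all 256 bit patterns are valid 2's complement values, and UTF-8 happens not to use the byte 11111111 (1s' complement negative zero).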
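A small sketch of the nul-terminator concern, assuming 8-bit bytes as in this answer (count_high_bit_bytes is a made-up helper, not a library function). It walks a string through unsigned char, so a byte such as 10000000 is counted as data and only 00000000 ends the scan:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption taken from the answer: bytes are exactly 8 bits wide. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Hypothetical helper: counts the bytes with the top bit set in a
 * NUL-terminated UTF-8 string. Reading through unsigned char means
 * 0x80 can never be mistaken for the terminator. */
size_t count_high_bit_bytes(const void *utf8)
{
    const unsigned char *p = utf8;
    size_t n = 0;

    for (; *p != 0; p++) {
        if (*p & 0x80)
            n++;
    }
    return n;
}
```

Calling count_high_bit_bytes(u8"\u00e9") counts both bytes of the two-byte encoding C3 A9, and the string "\xC2\x80" (U+0080) is scanned past its 0x80 byte rather than being cut short there.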

Answer 3

No, a sign bit is a bit nonetheless! And the UTF-8 specification itself doesn't say that the chars must be unsigned.

PS: What kind of name is kookwekker?

Answer 4

The signedness of char does not matter; UTF-8 can be handled with only shift and mask operations (which may be cumbersome for signed types, but not impossible). But UTF-8 needs at least 8 bits, so "assert (CHAR_BIT >= 8);".

To illustrate my point: the following fragment contains no arithmetic operations on the character's value, only shift and mask.

/* Decodes one UTF-8 sequence from str (at most len bytes) into *target.
** Returns the number of bytes consumed, 0 if len is 0, -1 for an invalid
** lead byte, or -N if N more bytes are needed than are available. */
static int eat_utf8(unsigned char *str, unsigned len, unsigned *target)
{
    unsigned val = 0;
    unsigned todo;

    if (!len) return 0;

    /* The lead byte tells how many continuation bytes follow. */
    val = str[0];
    if ((val & 0x80) == 0x00) { if (target) *target = val; return 1; }
    else if ((val & 0xe0) == 0xc0) { val &= 0x1f; todo = 1; }
    else if ((val & 0xf0) == 0xe0) { val &= 0x0f; todo = 2; }
    else if ((val & 0xf8) == 0xf0) { val &= 0x07; todo = 3; }
    else if ((val & 0xfc) == 0xf8) { val &= 0x03; todo = 4; }
    else if ((val & 0xfe) == 0xfc) { val &= 0x01; todo = 5; }
    else {  /* Default (Not in the spec) */
        if (target) *target = val;
        return -1;
    }

    len--; str++;
    /* Negate as int: negating the unsigned value would wrap around. */
    if (todo > len) { return -(int)todo; }

    for (len = todo; todo--; ) {
        /* For validity checking we should also
        ** test if ((*str & 0xc0) == 0x80) here */
        val <<= 6;
        val |= *str++ & 0x3f;
    }

    if (target) *target = val;
    return 1 + len;
}
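A related sketch for data that arrives as plain char rather than unsigned char (utf8_seq_len is a made-up name, not part of the code above). It assumes a 2's complement machine, where the conversion to unsigned char yields the stored byte, and it accepts only the 1 to 4 byte forms that RFC 3629 allows, unlike eat_utf8, which also accepts the older 5 and 6 byte forms:

```c
#include <assert.h>
#include <limits.h>

static_assert(CHAR_BIT >= 8, "UTF-8 needs at least 8 bits per char");

/* Hypothetical helper: length in bytes of the sequence introduced by
 * lead, or 0 if lead is a continuation byte or not a valid lead byte.
 * Only shift-and-mask style tests are used once the value is unsigned. */
int utf8_seq_len(char lead)
{
    /* Assumption: 2's complement, so this recovers the raw byte. */
    unsigned b = (unsigned char)lead;

    if ((b & 0x80) == 0x00) return 1;
    if ((b & 0xe0) == 0xc0) return 2;
    if ((b & 0xf0) == 0xe0) return 3;
    if ((b & 0xf8) == 0xf0) return 4;
    return 0;
}
```

The single conversion at the top is the only place where signedness comes into play; everything after it works on an unsigned value.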
