Question

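In a paragraph question in a Google Form, the following regular expression is used to block emojis, the em dash (character `x97`) and the euro sign (character `x80`): `^[\x0A-\xFF]*$`.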

In the Chrome browser on a mobile device, this regular expression also blocks the following inputs:

Double quotes (character x22)
Single quotes (character x27)
Left single quotation mark (character x91)
Right single quotation mark (character x92)
Left double quotation mark (character x93)
Right double quotation mark (character x94)

This happens even though `^[\x0A-\xFF]*$` covers characters 10 (x0A) through 255 (xFF).

How can I update the regular expression `^[\x0A-\xFF]*$` to allow the 6 items above?

I've tried different formulas in the regular expression, such as `^([^\p{Emoji}]|\[^p{Emoji}])*$`, but this did not help; it made the situation worse.

Answer 1

TL;DR

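You are mixing up the Latin-1 and Unicode numeric representations of characters, which is why your regular expression does not return the expected results. I corrected this and removed some non-printable characters from the class, which gives this regular expression for use in Google Forms: `[\x0A\x0D\x20-\x7E\xA0-\xFF\x{2018}\x{2019}\x{201C}\x{201D}]*`.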

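Your problem on mobile devices is probably caused by the behavior of the virtual keyboard, which inputs unexpected quotation marks that your regular expression does not target (please read below).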

Detailed answer

In what follows, I use `255` for decimal and `xFF` for hexadecimal.

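The problem is that you designate characters by their numeric position in the Windows Latin-1 (CP1252) character set, whereas the Google RE2 regular expression library used by Google Forms designates characters by their Unicode code points (probably like most, if not all, modern regular expression engines).
Most of the first 256 positions (x00 to xFF) hold the same characters in both sets, which makes the confusion easy to fall into. The exception is x80 to x9F, where CP1252 places printable characters such as the euro sign and the curly quotation marks, while Unicode places control characters. The RE2 expression `^[\x0A-\xFF]*$` therefore matches Unicode code points x0A to xFF only: the control characters x0A to x1F, the printable ASCII characters x20 to x7E, the control characters x7F to x9F, and the Latin-1 characters xA0 to xFF.

N.B.: the control characters are non-printable.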

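However, to build an RE2 regular expression that matches characters at positions above `xFF`, you must use their Unicode values ("code points").

Let us compare the numeric representations of the characters considered in your question: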

Character	Description	Position in Windows Latin-1 character set	Position in the Unicode character set	Must match the regular expression
`"`	quotation mark (or double quote)	`34` or `x22`	`34` or `x22`	yes
`'`	apostrophe (or single quote)	`39` or `x27`	`39` or `x27`	yes
`‘`	left single quotation mark	`145` or `x91`	`8216` or `x2018`	yes
`’`	right single quotation mark	`146` or `x92`	`8217` or `x2019`	yes
`“`	left double quotation mark	`147` or `x93`	`8220` or `x201C`	yes
`”`	right double quotation mark	`148` or `x94`	`8221` or `x201D`	yes
`—`	Em dash	`151` or `x97`	`8212` or `x2014`	no
`€`	Euro sign	`128` or `x80`	`8364` or `x20AC`	no
`😀`	grinning face	not included	`128512` or `x1F600`	no
other emojis	other emojis	not included	`...` or `x...`	no
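
The two numbering schemes in the table can be checked by decoding single CP1252 bytes. The following is only a minimal sketch in Python, which is an illustrative choice and not something Google Forms uses; the helper name `cp1252_to_code_point` is hypothetical.

```python
def cp1252_to_code_point(byte_value: int) -> int:
    """Return the Unicode code point of the character stored at
    position `byte_value` in the Windows CP1252 character set."""
    return ord(bytes([byte_value]).decode("cp1252"))


# Positions taken from the table above (quotes, em dash, euro sign).
for position in (0x22, 0x27, 0x91, 0x92, 0x93, 0x94, 0x97, 0x80):
    code_point = cp1252_to_code_point(position)
    print(f"CP1252 x{position:02X} -> Unicode x{code_point:04X}")
```

Positions x22 and x27 map to themselves, while x91 to x94, x97 and x80 map to code points well above xFF.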

All of the above shows that your regular expression `^[\x0A-\xFF]*$` matches the low-position characters, but not the left/right quotation marks, which sit at high positions (well above xFF) in Unicode. So you need to extend the character class with the code points of these specific marks, like this: `^[\x0A-\xFF\x{2018}\x{2019}\x{201C}\x{201D}]*$`.
RE2 requires curly brackets for hexadecimal numbers of three digits or more.
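
The effect of extending the character class can be tried outside Google Forms. The sketch below uses Python's `re` module, which is not RE2: it has no `\x{...}` escape, so the four quotation marks are written as `\u2018` and so on, and `fullmatch` stands in for the `^` and `$` anchors. The names `ORIGINAL`, `EXTENDED` and `accepts` are illustrative.

```python
import re

# Original class: Unicode code points x0A to xFF only.
ORIGINAL = re.compile(r"[\x0A-\xFF]*")
# Extended class: the same range plus the four curly quotation marks.
# Python's `re` writes \x{2018} as \u2018 (assumption: equivalent here).
EXTENDED = re.compile(r"[\x0A-\xFF\u2018\u2019\u201C\u201D]*")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True when the whole text consists of allowed characters."""
    return pattern.fullmatch(text) is not None


curly = "\u2018a\u2019 \u201Cb\u201D"
print(accepts(ORIGINAL, curly))          # curly quotes rejected
print(accepts(EXTENDED, curly))          # curly quotes accepted
print(accepts(EXTENDED, "\u20AC"))       # euro sign still rejected
print(accepts(EXTENDED, "\U0001F600"))   # emoji still rejected
```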

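By the way, it does not seem necessary to me to include all the control characters between positions `x0A` and `x1F` (only `x0A` and `x0D` seem relevant to me). Positions `x7F` to `x9F` are also used for control (non-printable) characters, which will not be input in your case. A more relevant, but longer, expression is therefore: `[\x0A\x0D\x20-\x7E\xA0-\xFF\x{2018}\x{2019}\x{201C}\x{201D}]*`. See it there: http://regex101.com/r/noslEo/1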
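
As a rough check of the stricter character class, here is a sketch in Python. Python's `re` is not RE2, so `\x{2018}` is written `\u2018`, and `fullmatch` is used so that the whole input must consist of allowed characters. The names `STRICT` and `is_valid` are illustrative.

```python
import re

# Line feed, carriage return, printable ASCII, Latin-1 xA0 to xFF,
# plus the four curly quotation marks (written \uXXXX for Python).
STRICT = re.compile(r"[\x0A\x0D\x20-\x7E\xA0-\xFF\u2018\u2019\u201C\u201D]*")


def is_valid(text: str) -> bool:
    """Return True when every character of `text` is in the strict class."""
    return STRICT.fullmatch(text) is not None


print(is_valid("line one\r\nline two"))         # line breaks are kept
print(is_valid("caf\u00E9 \u201Cquoted\u201D"))  # Latin-1 letter and curly quotes
print(is_valid("\x1B"))                         # control character x1B is dropped
print(is_valid("\x85"))                         # control character x85 is dropped
```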

By the way, these expressions exclude the euro sign, the em dash and emojis, as desired.
The mismatch with characters x22 and x27 on mobile devices may result from the virtual keyboard not inputting exactly the character targeted by the regular expression (Unicode has many quotation marks, and some look very similar depending on the font; you could include more quotation marks in your character class).
Also, be aware that the Google RE2 library does not support the `\p{Emoji}` character class.
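
To find out which quotation mark a virtual keyboard really produced, one option is to paste the rejected text into a small script that lists each character's code point and Unicode name. This is only a sketch in Python; the function name `describe_characters` is hypothetical.

```python
import unicodedata
from typing import List


def describe_characters(text: str) -> List[str]:
    """Return one 'xHHHH NAME' entry per character of `text`."""
    return [
        f"x{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"
        for ch in text
    ]


# A straight quote, a curly quote, and a look-alike apostrophe.
for entry in describe_characters("\"\u2019\u02BC"):
    print(entry)
```

Any code point reported here that is missing from the character class explains a rejected input, and it can be added to the class in the `\x{...}` form.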
