English 中文(简体)
• 如何逃脱Unicode在格罗莫夫y s/pattern/ syntax
原标题:How to escape Unicode escapes in Groovy s /pattern/ syntax

下面的格罗维派指挥官说明了我的问题。

首先,这项工作(,如预期(0061/>t61/>)。

>>> print "a".matches(/u0061/)

true

Now let s say that we want to match , using the Unicode escape u000A. The following, using "pattern" as a string, behaves as expected:

>>> print "
".matches("u000A");

Interpreter exception: com.google.lotrepls.shared.InterpreterException:
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed,
Script1.groovy: 1: expecting anything but   
  ; got it anyway
@ line 1, column 21. 1 error

这是因为,至少在 Java,统法协会的越狱程序早处理(JLS3.3),因此:

print "
".matches("u000A")

实际情况与:

print "
".matches("
")

固定装置是为了逃脱统法协会的密码,让列车的发动机进行如下操作:

>>> print "
".matches("\u000A")

true

现在这里的问题是部分:我们如何能够与格罗夫y <代码>/pattern/<合作。 syntax, 而不是使用脚石?

一些失败的尝试:

>>> print "
".matches(/u000A/)

Interpreter exception: com.google.lotrepls.shared.InterpreterException:
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed,
Script1.groovy: 1: expecting EOF, found  (  @ line 1, column 19.
1 error

>>> print "
".matches(/\u000A/)

false

>>> print "\u000A".matches(/\u000A/);

true
最佳回答

~"[u0000-u0008u000Bu000Cu000E-u001Fu007F-u009F]

期望按需要开展工作。 据我所看到的,需要双向反弹,这样我就不知道编辑们为什么不满意。

问题回答

Firstly, it seems Groovy changed in this regard in the meantime, at least on https://groovyconsole.appspot.com/ and a local Groovy shell, " ".matches(/u000A/) works perfectly fine, evaluating to true.

In case you have a similar situation again, just encode the backslash with a unicode escape like in " ".matches(/u005Cu000A/) as then the unicode escape to character conversion makes it a backslash again and then the sequence for the regex parser is kept.

Another option would be to separate the backslash from the u for example by using " ".matches(/${ \ }u000A/) or " ".matches( \ + /u000A/)





相关问题
Why are there duplicate characters in Unicode?

I can see some duplicate characters in Unicode. For example, the character C can be represented by the code points U+0043 and U+0421. Why is this so?

how to extract characters from a Korean string in VBA

Need to extract the initial character from a Korean word in MS-Excel and MS-Access. When I use Left("한글",1) it will return the first syllable i.e 한, what I need is the initial character i.e ㅎ . Is ...

File open error by using codec utf-8 in python

I execute following code on windows xp and python 2.6.4 But it show IOError. How to open file whose name has utf-8 codec. >>> open( unicode( 한글.txt , euc-kr ).encode( utf-8 ) ) Traceback ...

UnicodeEncodeError on MySQL insert in Python

I used lxml to parse some web page as below: >>> doc = lxml.html.fromstring(htmldata) >>> element in doc.cssselect(sometag)[0] >>> text = element.text_content() >>>...

Fast way to filter illegal xml unicode chars in python?

The XML specification lists a bunch of Unicode characters that are either illegal or "discouraged". Given a string, how can I remove all illegal characters from it? I came up with the following ...

热门标签