English 中文(简体)
R: 定期表达在该地方无效
原标题:R: Regular expression is invalid in this locale

I have the following regular expression in an R script:

grepl("xe9", "MY TEXT", fixed = FALSE, ignore.case = TRUE, perl = FALSE)

我有以下错误:

Error in grepl("xe9", "MY TEXT", fixed = FALSE, ignore.case = TRUE,  : 
regular expression is invalid in this locale

“xe9”是“é”的单一编码,从档案中读作。

How can I fix this? Additionally, is there a useful resource on locales w.r.t regular expression in R?

最佳回答

I don t know why @joran didn t just post this as an answer:

grepl("\xe9", c("MY TEXT", "é"), fixed = FALSE, ignore.case = TRUE, perl = FALSE)
#[1] FALSE  TRUE
问题回答

暂无回答




相关问题
Why are there duplicate characters in Unicode?

I can see some duplicate characters in Unicode. For example, the character C can be represented by the code points U+0043 and U+0421. Why is this so?

how to extract characters from a Korean string in VBA

Need to extract the initial character from a Korean word in MS-Excel and MS-Access. When I use Left("한글",1) it will return the first syllable i.e 한, what I need is the initial character i.e ㅎ . Is ...

File open error by using codec utf-8 in python

I execute following code on windows xp and python 2.6.4 But it show IOError. How to open file whose name has utf-8 codec. >>> open( unicode( 한글.txt , euc-kr ).encode( utf-8 ) ) Traceback ...

UnicodeEncodeError on MySQL insert in Python

I used lxml to parse some web page as below: >>> doc = lxml.html.fromstring(htmldata) >>> element in doc.cssselect(sometag)[0] >>> text = element.text_content() >>>...

Fast way to filter illegal xml unicode chars in python?

The XML specification lists a bunch of Unicode characters that are either illegal or "discouraged". Given a string, how can I remove all illegal characters from it? I came up with the following ...

热门标签