conditional strsplit
  • Date: 2011-11-10 18:51:45
  • Tags:
  • regex
  • r

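I have a data frame in which one column contains a set of names. I would like to extract part of each name in that column, and have done so as follows: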

DF$newname <- sapply(strsplit(as.character(DF$oldname), "_"), "[", 5)

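So, for example, the fifth piece of the split contains the part of the name I am interested in. The problem is that the data set contains different formats of `oldname`. In the first format the name looks like this, where the xxx are the other fields: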

xxx_xxx_xxx_xxx_name_xx  (name is in fifth position)

The second looks like this:

xxx_xxx_xxx_xxx_xxx_name_xx  (name is in sixth position)

I thought I could use an `ifelse` call inside a function, but I am at a loss with the following code:

namesplit = function(df){ 
  x <- strsplit(as.character(df$oldname), "_"), "[", 5)
  y <- strsplit(as.character(df$oldname), "_"), "[", 6)
  ifelse(is.character(x),x,y) }
DF$newname <- sapply(DF,namesplit)

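I know this code doesn't work, and that `[` can't be used like that, but I am not sure of the best way. I think I could do this inside a for loop, but I would prefer a way that lets me extract the names using an apply-style function.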

thanks.

Best answer

You can easily do this using gsub

names <- c("xxx_xxx_xxx_xxx_xxx_name1_xx", "xxx_xxx_xxx_xxx_name2_xx")
gsub("^.*_([[:alnum:]]+)_.*$", "\\1", names)


[1] "name1" "name2"
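
The pattern works for both formats because the leading `^.*_` is greedy: it consumes as many fields as it can while still leaving one alphanumeric field, an underscore, and the final field to match. The sketch below applies the same idea to a data frame column; the data frame `DF` and its contents are made up for illustration, and `sub` is used in place of `gsub` since the anchored pattern can match only once per string anyway.

```r
# Hypothetical data frame mirroring the two formats from the question
DF <- data.frame(
  oldname = c("xxx_xxx_xxx_xxx_name1_xx", "xxx_xxx_xxx_xxx_xxx_name2_xx"),
  stringsAsFactors = FALSE
)

# Greedy "^.*_" swallows every field except the last two;
# the capture group keeps the second-to-last field.
DF$newname <- sub("^.*_([[:alnum:]]+)_.*$", "\\1", DF$oldname)

print(DF$newname)
```

Note that a string the pattern does not match is returned unchanged, so it may be worth checking for leftovers if the column could contain other formats.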
Other answers

If the name is always the second-to-last part, then:

x <- c("xxx_xxx_xxx_xxx_name_xx", "xxx_xxx_xxx_xxx_xxx_name_xx")


namesplit = function(x){
  # split each string on "_", then take the second-to-last piece
  x <- strsplit(as.character(x), "_")
  sapply(x, function(x) x[length(x)-1])
}

HTH
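
The `ifelse` idea from the question can also be made to work if the choice of index is based on the number of fields rather than on `is.character`, which returns a single TRUE or FALSE for the whole vector instead of one value per element. A minimal sketch, assuming only the two formats from the question occur (six or seven fields); the variable names are illustrative:

```r
x <- c("xxx_xxx_xxx_xxx_name1_xx", "xxx_xxx_xxx_xxx_xxx_name2_xx")

# Split once, then pick the index per element from the field count
parts <- strsplit(x, "_", fixed = TRUE)
n     <- lengths(parts)          # 6 for the first format, 7 for the second
idx   <- ifelse(n == 6, 5, 6)    # assumption: no other formats are present

# mapply pairs each split vector with its own index
newname <- mapply(function(p, i) p[i], parts, idx, USE.NAMES = FALSE)
print(newname)
```

With exactly these two formats this gives the same result as taking the second-to-last piece; the second-to-last approach is simpler and does not depend on the exact field counts.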




