matrix %in% matrix
  • Time: 2011-10-30 07:02:57
  • Tags: r

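Suppose I have two matrices, each with two columns and a different number of rows. I want to check which rows (pairs) of one matrix appear in the other. If these were one-dimensional, I would normally just use `a %in% x` to get my result. `match` seems to work only on vectors.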

> a
      [,1] [,2]
[1,]    1    2
[2,]    4    9
[3,]    1    6
[4,]    7    7
> x
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

The result should be `c(FALSE,TRUE,TRUE,FALSE)`.
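
To see why plain `%in%` does not answer the question, the short sketch below (an illustration built on the matrices from the question, not part of the original post) shows that `%in%` treats both matrices as flat vectors and tests each element separately, so the pairing within a row is lost.

```r
a <- matrix(c(1, 2, 4, 9, 1, 6, 7, 7), ncol = 2, byrow = TRUE)
x <- matrix(c(1, 6, 2, 7, 3, 8, 4, 9, 5, 10), ncol = 2, byrow = TRUE)

# %in% drops the dimensions and checks each of the 8 elements of `a`
# against all 10 elements of `x`.
elementwise <- a %in% x
length(elementwise)  # 8: one value per element rather than one per row
all(elementwise)     # TRUE: every single number in `a` occurs somewhere in `x`
```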

Best answer

Another approach is:

> paste(a[,1], a[,2], sep="$$") %in% paste(x[,1], x[,2], sep="$$")
[1] FALSE  TRUE  TRUE FALSE

A more general version is:

> apply(a, 1, paste, collapse="$$") %in% apply(x, 1, paste, collapse="$$")
[1] FALSE  TRUE  TRUE FALSE
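
The same idea can be wrapped in a small helper. This is a minimal sketch rather than code from the answer: the name `rows_in` and its default separator are assumptions made for illustration. It builds one key per row with a single vectorised `paste` call instead of `apply`.

```r
# Hypothetical helper: TRUE for each row of `m` that also occurs as a row of `table`.
rows_in <- function(m, table, sep = "$$") {
  stopifnot(ncol(m) == ncol(table))
  # Paste the columns together element-wise, giving one string key per row.
  key <- function(mat) do.call(paste, c(as.data.frame(mat), sep = sep))
  key(m) %in% key(table)
}

a <- matrix(c(1, 2, 4, 9, 1, 6, 7, 7), ncol = 2, byrow = TRUE)
x <- matrix(c(1, 6, 2, 7, 3, 8, 4, 9, 5, 10), ncol = 2, byrow = TRUE)
rows_in(a, x)
# [1] FALSE  TRUE  TRUE FALSE
```

Key-based matching relies on the separator never occurring in the data. For a character matrix, the rows c("a$$b", "c") and c("a", "b$$c") would produce the same key, so choose a separator that cannot appear in the values.
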
Other answers

Recreate your data:

a <- matrix(c(1, 2, 4, 9, 1, 6, 7, 7), ncol=2, byrow=TRUE)
x <- matrix(c(1, 6, 2, 7, 3, 8, 4, 9, 5, 10), ncol=2, byrow=TRUE)

Define a function `%inm%` that tests whether a vector matches any row of a matrix:

`%inm%` <- function(x, matrix){
  test <- apply(matrix, 1, `==`, x)
  any(apply(test, 2, all))
}

Apply this to your data:

apply(a, 1, `%inm%`, x)
[1] FALSE  TRUE  TRUE FALSE

Compare a single row:

a[1, ] %inm% x
[1] FALSE

a[2, ] %inm% x
[1] TRUE
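
To make the two `apply` calls inside `%inm%` easier to follow, here is a sketch that runs the same steps by hand for the row `c(4, 9)`. The variable names `candidate`, `test` and `row_matches` are chosen only for this illustration.

```r
x <- matrix(c(1, 6, 2, 7, 3, 8, 4, 9, 5, 10), ncol = 2, byrow = TRUE)
candidate <- c(4, 9)

# Compare `candidate` with every row of x. Each column of `test` holds the
# element-wise comparison for one row of x, so `test` is a 2 x 5 logical matrix.
test <- apply(x, 1, `==`, candidate)
dim(test)

# A row of x matches only if all of its elements are equal to `candidate`.
row_matches <- apply(test, 2, all)
which(row_matches)  # 4: the fourth row of x equals c(4, 9)
any(row_matches)    # TRUE, which is the value `%inm%` returns
```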

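The solution above is perfectly correct. But if you have large matrices, you may want to try something else, based on recursion. If you work column by column, you can cut the computation time by excluding everything that does not match at the first position: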

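fastercheck <- function(x,matrix){
  nc <- ncol(matrix)
  # r: one row of x; i: current column; id: rows of matrix still matching
  rec.check <- function(r,i,id){
    # among the surviving rows, keep those that also match in column i
    id[id] <- matrix[id,i] %in% r[i]
    # move on to the next column only while candidates remain
    if(i<nc && any(id)) rec.check(r,i+1,id) else any(id)
  }
  apply(x,1,rec.check,1,rep(TRUE,nrow(matrix)))
}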
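
The column-by-column filtering can also be written as a plain loop, which some readers may find easier to trace than the recursion. The following is a sketch under stated assumptions, not the answer's code: the name `row_in_by_column` is hypothetical, and the matrices are assumed to contain no `NA` values.

```r
# Hypothetical loop-based variant of the column-wise filtering idea.
row_in_by_column <- function(rows, target) {
  stopifnot(ncol(rows) == ncol(target))
  vapply(seq_len(nrow(rows)), function(k) {
    candidates <- seq_len(nrow(target))  # rows of `target` that still match
    for (j in seq_len(ncol(target))) {
      # keep only the candidates that also agree in column j
      candidates <- candidates[target[candidates, j] == rows[k, j]]
      if (length(candidates) == 0L) return(FALSE)  # stop early, nothing left
    }
    TRUE
  }, logical(1))
}

a <- matrix(c(1, 2, 4, 9, 1, 6, 7, 7), ncol = 2, byrow = TRUE)
x <- matrix(c(1, 6, 2, 7, 3, 8, 4, 9, 5, 10), ncol = 2, byrow = TRUE)
row_in_by_column(a, x)
# [1] FALSE  TRUE  TRUE FALSE
```

The candidate set usually shrinks sharply after the first column, so later columns are compared against only a handful of rows.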

Comparison:

> set.seed(100)
> x <- matrix(runif(1e6),ncol=10)
> a <- matrix(runif(300),ncol=10)
> a[c(3,7,9,15),] <- x[c(1000,48213,867,20459),]
> system.time(res1 <- apply(a, 1, `%inm%`, x))
   user  system elapsed 
  31.16    0.14   31.50 
> system.time(res2 <- fastercheck(a,x))
   user  system elapsed 
   0.37    0.00    0.38 
> identical(res1, res2)
[1] TRUE
> which(res2)
[1]  3  7  9 15

EDIT:

I checked the accepted answer just for fun. It performs better than the double apply (as you get rid of the inner loop), but recursion still rules! ;-)

> system.time(apply(a, 1, paste, collapse="$$") %in% 
 + apply(x, 1, paste, collapse="$$"))
   user  system elapsed 
   6.40    0.01    6.41 

Here is another approach that uses the `digest` package to create a checksum for each row, generated with a hashing algorithm (md5 by default).

library(digest)  # provides digest()
a <- matrix(c(1, 2, 4, 9, 1, 6, 7, 7), ncol=2, byrow=TRUE)
x <- matrix(c(1, 6, 2, 7, 3, 8, 4, 9, 5, 10), ncol=2, byrow=TRUE)
apply(a, 1, digest) %in% apply(x, 1, digest)

[1] FALSE  TRUE  TRUE FALSE

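Late to the game: I had previously written an algorithm using the "paste with delimiter" method and then found this page. I guessed that one of the snippets here would be the fastest, but: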

andrie<-function(mfoo,nfoo) apply(mfoo, 1, `%inm%`, nfoo)
# using Andrie's %inm% operator exactly as above
carl<-function(mfoo,nfoo) {
 allrows<-unlist(sapply(1:nrow(mfoo),function(j) paste(mfoo[j,],collapse="_")))
 allfoo <- unlist(sapply(1:nrow(nfoo),function(j) paste(nfoo[j,],collapse="_")))
 thewalls<-setdiff(allrows,allfoo)
 dowalls<-mfoo[allrows%in%thewalls,]
}

 ramnath <- function (a,x) apply(a, 1, digest) %in% apply(x, 1, digest)

 mfoo<-matrix( sample(1:100,400,rep=TRUE),nr=100)
 nfoo<-mfoo[sample(1:100,60),]

 library(microbenchmark)
 microbenchmark(andrie(mfoo,nfoo),carl(mfoo,nfoo),ramnath(mfoo,nfoo),times=5)

Unit: milliseconds
                expr       min        lq    median        uq        max neval
  andrie(mfoo, nfoo) 25.564196 26.527632 27.964448 29.687344 102.802004     5
    carl(mfoo, nfoo)  1.020310  1.079323  1.096855  1.193926   1.246523     5
 ramnath(mfoo, nfoo)  8.176164  8.429318  8.539644  9.258480   9.458608     5

So apparently constructing character strings and doing a single set operation is fastest! (PS I checked and all 3 algorithms give the same result)
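
For readers who want to repeat the "same result" check, here is a minimal base-R sketch. It makes a few assumptions: the seed is arbitrary, the names `by_compare` and `by_paste` are invented for this illustration, and both methods are compared as logical vectors (the `carl` function as posted returns the rows of `mfoo` that are not found in `nfoo` rather than a logical vector).

```r
set.seed(1)  # arbitrary seed, only for reproducibility
mfoo <- matrix(sample(1:100, 400, replace = TRUE), nrow = 100)
nfoo <- mfoo[sample(1:100, 60), ]

# Row-wise comparison, as in the %inm% answer.
`%inm%` <- function(x, matrix) {
  test <- apply(matrix, 1, `==`, x)
  any(apply(test, 2, all))
}
by_compare <- apply(mfoo, 1, `%inm%`, nfoo)

# String keys plus a single set operation.
by_paste <- apply(mfoo, 1, paste, collapse = "_") %in%
  apply(nfoo, 1, paste, collapse = "_")

identical(by_compare, by_paste)  # TRUE for integer data like this
sum(by_paste) >= 60              # TRUE: nfoo holds 60 rows taken from mfoo
```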




