Count factors occurring in group in R
  • Time: 2011-10-25 19:05:54
  • Tags:
  • r
  • dataframe

This is my data:

> head(Kandula_for_n)
                date      dist  date_only
1 2005-05-08 12:00:00  138.5861 2005-05-08
2 2005-05-08 16:00:00 1166.9265 2005-05-08
3 2005-05-08 20:00:00 1270.7149 2005-05-08
6 2005-05-09 08:00:00  233.1971 2005-05-09
7 2005-05-09 12:00:00 1899.9530 2005-05-09
8 2005-05-09 16:00:00  726.8363 2005-05-09

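I now want to add a count (n) of the data entries (dist) per date. For 2005-05-08 the count would be 3, because there are 3 data entries, at 12, 16 and 20 hours. I used the following code, which actually gives me the numbers I want: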

ndist <-tapply(1:NROW(Kandula_for_n), Kandula_for_n$date_only, function(x) length(unique(x)))

After ndist <- as.data.frame(ndist), I get:

> head(ndist)
           ndist
2005-05-08     3
2005-05-09     4
2005-05-10     6
2005-05-11     4
2005-05-12     6
2005-05-13     6

The problem is that the count sits together with date_only in a single column called ndist. I need them in two separate columns, one with the count and one with date_only. How can this be done? I guess it's rather simple, but I just don't get it. I would appreciate any thoughts on this.

Thank you for your efforts.

Best answer

Those are just the row names. You can do:

ndist$date = row.names(ndist)

EDIT: Or ndist = data.frame(date = names(ndist), ndist), depending on whether it has already been converted to a data frame.
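
As a minimal sketch of the idea above, the snippet below runs the whole path on a small made-up data frame. The object names toy and counts and all values are assumptions for illustration, not the asker's data. It counts rows per date with tapply() and then moves the names into a column of their own.

```r
# Toy data shaped like the question's data frame (values are made up)
toy <- data.frame(
  dist = c(138.6, 1166.9, 1270.7, 233.2, 1900.0),
  date_only = c("2005-05-08", "2005-05-08", "2005-05-08",
                "2005-05-09", "2005-05-09")
)

# Count rows per date; the dates become the names of the result
ndist <- tapply(seq_len(nrow(toy)), toy$date_only, length)

# Two separate columns: the date (taken from the names) and the count
counts <- data.frame(date_only = names(ndist),
                     ndist = as.vector(ndist),
                     row.names = NULL)
print(counts)
```

Using as.vector() drops the names from the counts, so the dates appear only in the date_only column.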

Other answers

In general, because I find tapply() hard to wrap my head around, I like to use plyr for these kinds of things:

## make up some data
## you get better/faster/more answers if you do this bit for us :)
dates <- seq(Sys.Date(), Sys.Date() + 5, by = 1)
Kandula_for_n <- data.frame(date_only = sample( dates + 5, 10, replace=TRUE ) , dist=rnorm(10) )

require(plyr)
ddply(Kandula_for_n, "date_only", function(x) data.frame(x, ndist=nrow(x)) )

This gives you:

    date_only       dist ndist
1  2011-10-30  0.2434168     5
2  2011-10-30 -0.9361780     5
3  2011-10-30  1.4593197     5
4  2011-10-30 -0.1851402     5
5  2011-10-30  0.6652419     5
6  2011-10-31  0.8876420     1
7  2011-11-03  0.5087175     2
8  2011-11-03 -1.0065152     2
9  2011-11-04  0.4236352     2
10 2011-11-04  0.4535686     2

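The ddply line:

ddply(Kandula_for_n, "date_only", function(x) data.frame(x, ndist=nrow(x)))

takes the input data frame, groups it by the date_only field and, for each unique value, applies the anonymous function to a data frame made up of only the records that share that value of date_only. My anonymous function just takes that data frame x and adds a column ndist holding its number of rows.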
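
For readers without plyr, the same per-group count can be attached to every row in base R. This is only a sketch that swaps in ave() for ddply(); it is not the method used in the answer above, and the toy data frame and its values are made up.

```r
# Toy data (hypothetical values), one row per observation
toy <- data.frame(
  dist = c(138.6, 1166.9, 1270.7, 233.2, 1900.0),
  date_only = c("2005-05-08", "2005-05-08", "2005-05-08",
                "2005-05-09", "2005-05-09")
)

# ave() splits the row indices by date_only, applies length() to each group,
# and writes the group size back to every row of that group
toy$ndist <- ave(seq_len(nrow(toy)), toy$date_only, FUN = length)
print(toy)
```

Like the ddply() call, this keeps one row per observation and repeats the group size on each row.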

How about something a bit more simple:

as.data.frame(table(unique(Kandula_for_n)$date_only))
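
A short sketch of the table() approach on made-up data is given here; toy and counts are hypothetical names. By default as.data.frame() on a table names the columns Var1 and Freq, so the sketch renames them to match the question. Note that the one-liner above also calls unique() first, which drops fully duplicated rows before counting; the sketch leaves that step out.

```r
# Toy data (hypothetical values)
toy <- data.frame(
  dist = c(138.6, 1166.9, 1270.7, 233.2, 1900.0),
  date_only = c("2005-05-08", "2005-05-08", "2005-05-08",
                "2005-05-09", "2005-05-09")
)

# table() counts occurrences of each date; as.data.frame() turns the table
# into one column of dates and one column of counts
counts <- as.data.frame(table(toy$date_only),
                        responseName = "ndist",
                        stringsAsFactors = FALSE)
names(counts)[1] <- "date_only"
print(counts)
```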



