Table generation R
  • Time: 2012-05-04 20:28:04
  • Tags: r

I have a data set like this:

val <- c("Y", "N")
test <- data.frame(age  = rnorm(n = 100, mean = 50, sd = 10),
                   var1 = sample(val, 100, TRUE),
                   var2 = sample(val, 100, TRUE),
                   var3 = sample(val, 100, TRUE),
                   sex  = sample(c("F", "M"), 100, TRUE))

I'd like to create a summary reporting the mean age for each category using Hmisc.

library(Hmisc)
summary.formula(age~sex+var1+var2+var3,data=test)

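However, var1 to var3 actually belong to the same categorical variable, whose levels are var1, var2 and var3 rather than Y/N. In addition, these levels are not mutually exclusive. So, is it possible to create a variable var4 with these levels, which are not mutually exclusive, and type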

summary.formula(age~sex+var4,data=test)

and have an output like:

+-------+----+---+----+
|       |    |N  |age |
+-------+----+---+----+
|sex    |F   | 44|48.0|
|       |M   | 56|50.8|
+-------+----+---+----+
|var4   |var1| xx|yy  |
|       |var2| xx|yy  |
|       |var3| xx|yy  |
+-------+----+---+----+
|Overall|    |100|49.6|
+-------+----+---+----+

Any help would be highly appreciated.

Answers

How about using paste0? (or paste(..., sep = ""))

> test$var4 <- paste0(test$var1, test$var2, test$var3)
> summary.formula(age~sex+var4, data=test)
age    N=100

+-------+---+---+--------+
|       |   |  N|     age|
+-------+---+---+--------+
|    sex|  F| 50|50.25440|
|       |  M| 50|51.32134|
+-------+---+---+--------+
|   var4|NNN| 13|46.64417|
|       |NNY| 17|51.34456|
|       |NYN| 15|52.92185|
|       |NYY| 17|47.35685|
|       |YNN|  9|50.91647|
|       |YNY|  7|48.04489|
|       |YYN| 10|53.23713|
|       |YYY| 12|56.14394|
+-------+---+---+--------+
|Overall|   |100|50.78787|
+-------+---+---+--------+
> 
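To make the behaviour of paste0 concrete, here is a minimal sketch on three toy vectors. The vectors v1, v2 and v3 are hypothetical stand-ins for test$var1, test$var2 and test$var3, not values from the question's data.

```r
# Toy vectors standing in for test$var1, test$var2, test$var3 (hypothetical values).
v1 <- c("Y", "N", "Y")
v2 <- c("N", "N", "Y")
v3 <- c("Y", "N", "N")

# paste0() concatenates element-wise with no separator ...
combo <- paste0(v1, v2, v3)

# ... which is the same as paste() with an empty separator.
same <- identical(combo, paste(v1, v2, v3, sep = ""))

print(combo)  # "YNY" "NNN" "YYN"
print(same)   # TRUE
```

Note that the resulting levels are combinations of the three flags, so each person falls into exactly one level. That is why the N column in the table above still sums to 100.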

I think the problem is that you are trying to combine statistics from two different data sets:

  1. Data per person:

    summary.formula(age~sex, test)
    
    # age    N=100
    # 
    # +-------+-+---+--------+
    # |       | |N  |age     |
    # +-------+-+---+--------+
    # |sex    |F| 35|49.99930|
    # |       |M| 65|48.96266|
    # +-------+-+---+--------+
    # |Overall| |100|49.32548|
    # +-------+-+---+--------+
    
  2. Data per var:

Here you need one row per var. This is one way to create the data, but I am sure there must be a more elegant way:

    var1 <- subset(test, var1 == "Y", c("age", "sex"))
    var2 <- subset(test, var2 == "Y", c("age", "sex"))
    var3 <- subset(test, var3 == "Y", c("age", "sex"))
    var1$var <- "var1"
    var2$var <- "var2"
    var3$var <- "var3"
    vars <- rbind(var1, var2, var3)

Then the summary statistics:

    summary.formula(age~var, data=vars)
    # age    N=147
    # 
    # +-------+----+---+--------+
    # |       |    |N  |age     |
    # +-------+----+---+--------+
    # |var    |var1| 47|48.91983|
    # |       |var2| 43|46.31811|
    # |       |var3| 57|49.35292|
    # +-------+----+---+--------+
    # |Overall|    |147|48.32672|
    # +-------+----+---+--------+

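As you can see, the Overall sections of the two summaries do not match, because they come from two different data sets. (So they cannot be combined in the way you are asking.)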
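The stacking step can probably be written more compactly. The following is only a sketch in base R, not the answerer's method: the seed, the helper name flags and the result name per_var are assumptions introduced for illustration, and the per-var counts and means are computed with table() and tapply() instead of Hmisc.

```r
# Reproducible version of the question's data (the seed is an assumption).
set.seed(1)
val  <- c("Y", "N")
test <- data.frame(
  age  = rnorm(n = 100, mean = 50, sd = 10),
  var1 = sample(val, 100, TRUE),
  var2 = sample(val, 100, TRUE),
  var3 = sample(val, 100, TRUE),
  sex  = sample(c("F", "M"), 100, TRUE)
)

# Names of the Y/N flag columns (hypothetical helper vector).
flags <- c("var1", "var2", "var3")

# One row per (person, flag) pair where the flag is "Y".
vars <- do.call(rbind, lapply(flags, function(v) {
  rows <- test[test[[v]] == "Y", c("age", "sex")]
  rows$var <- rep(v, nrow(rows))
  rows
}))

# N and mean age per flag, without Hmisc.
grp <- factor(vars$var, levels = flags)
per_var <- data.frame(
  var = flags,
  N   = as.vector(table(grp)),
  age = as.vector(tapply(vars$age, grp, mean))
)
print(per_var)
```

Because a person with several flags set to "Y" contributes several rows to vars, the counts in per_var can add up to more than 100, which is the same mismatch with Overall described above.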




