Descriptive tables - how to create a table containing both numeric and categorical variables
  • Time: 2012-01-13 22:09:11
  • Tags:
  • r

I can't find a really intuitive way of doing the most basic thing: creating a summary table with my base variables. The best method I've found so far is using tapply:

set.seed(200)
my_stats <- function(x){
    if (is.factor(x)){
        a <- table(x, useNA="no")
        b <- round(a*100/sum(a),2)

        # If binary
        if (length(a) == 2){
            ret <- paste(a[1], " (", b[1], " %)", sep="")
        }
        return(ret)
    }else{
        ret <- mean(x, na.rm=T)
        if (ret < 1){
            ret <- round(ret, 2)
        }else{
            ret <- round(ret)
        }
        return(ret)
    }
}

library(rms)
groups <- factor(sample(c("Group A","Group B"), size=51, replace=T))
a <- 3:53 
b <- rnorm(51)
c <- factor(sample(c("male","female"), size=51, replace=T))

res <- rbind(a=tapply(a, groups, my_stats),
      b=tapply(b, groups, my_stats),
      c=tapply(c, groups, my_stats))
latex(latexTranslate(res))

The res contains:

> res
  Group A     Group B       
a "28"        "28"          
b "-0.08"     "-0.21"       
c "14 (56 %)" "14 (53.85 %)"

Now this works, but it seems very complex and not the most elegant solution. I've tried to search for how to create descriptive tables, but the results all focus on table(), prop.table(), and summary() for a single variable or for variables of the same kind.

My question: is there a package/function that allows for easy creation of nice-looking descriptive tables? If so, please show how to get the above result.

Thanks!

Best answer

If you rewrite your function so that it always returns a string (it sometimes returns a string, sometimes a number, sometimes NULL), you can call ddply on the data.frame, without having to specify all the columns.

f <- function(u) {
  res <- "?" 
  if(is.factor(u) || is.character(u)) {
    u <- table(u, useNA = "no")
    if (length(u) == 0 || sum(u) == 0) { res <- "NA" }
    else { res <- sprintf( "%0.0f%%", 100 * u[1] / sum(u) ) }
  } else {
    u <- mean(u, na.rm=TRUE)
    if(is.na(u)) { res <- "NA" }
    else { res <- sprintf( ifelse( abs(u) < 1, "%0.2f", "%0.0f" ), u ) }
  }
  return( res )
}
# Same function, for data.frames
g <- function(d) do.call( data.frame, lapply(d, f) )

library(plyr)
ddply(data.frame(a,b,c), .(groups), g)
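
The same idea can be sketched in base R, without plyr. This is only an illustration under stated assumptions: summarise_column is a hypothetical helper (not part of any package) that, like f above, always returns a string, and the data are regenerated here so that the snippet is self-contained. It produces a variable-by-group character matrix shaped like res in the question.

```r
set.seed(200)
groups <- factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE))
dat <- data.frame(
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE))
)

# Hypothetical helper: one summary string per column, whatever its type.
summarise_column <- function(x) {
  if (is.factor(x) || is.character(x)) {
    counts <- table(x, useNA = "no")
    if (sum(counts) == 0) return("NA")
    # Count and percentage of the first level
    sprintf("%d (%.0f%%)", counts[[1]], 100 * counts[[1]] / sum(counts))
  } else {
    m <- mean(x, na.rm = TRUE)
    if (is.na(m)) return("NA")
    # Two decimals for small means, none otherwise
    sprintf(if (abs(m) < 1) "%.2f" else "%.0f", m)
  }
}

# Split the rows by group, then summarise every column within each group.
res <- sapply(split(dat, groups), function(d) sapply(d, summarise_column))
print(res)
```

Because every cell is already a string, the matrix can be passed directly to latex() or xtable() without further type handling.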

Since you want a LaTeX table, you may also want to try the following; it does not group the data, but it adds a histogram for the numeric variables.

library(Hmisc)
d <- data.frame(a, b, c)  # assumption: d holds the variables from the question
latex(describe(d), file="")

Other answers

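What you are asking is, again, fairly open-ended, since you and I may well disagree about what constitutes a "nice-looking LaTeX" table.

For example, I might prefer to arrange things by row rather than by column: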

require(plyr)
require(xtable)
dat <- data.frame(a,b,c,groups)
xtable(ddply(dat,.(groups),summarise,a = my_stats(a),
                                     b = my_stats(b),
                                     c = my_stats(c)))


\begin{table}[ht]
\begin{center}
\begin{tabular}{rlrrl}
  \hline
 & groups & a & b & c \\ 
  \hline
1 & Group A & 28.00 & 0.14 & 13 (52 \%) \\ 
  2 & Group B & 28.00 & -0.00 & 13 (50 \%) \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

Of course, much of this is customizable; see ?xtable and also ?print.xtable.
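
As a hedged illustration of that customization, the sketch below drops the row-number column and adds a caption. The data frame summary_df is a hand-typed stand-in for the ddply result (values copied from the output above), and the caption and label strings are made up for this example.

```r
library(xtable)

# Stand-in for the grouped summary; values copied from the output above.
summary_df <- data.frame(
  groups = c("Group A", "Group B"),
  a = c("28.00", "28.00"),
  b = c("0.14", "-0.00"),
  c = c("13 (52 %)", "13 (50 %)"),
  stringsAsFactors = FALSE
)

# Caption and label are illustrative only.
tab <- xtable(summary_df,
              caption = "Summary by group",
              label = "tab:summary")

# print.results = FALSE returns the LaTeX code as a string instead of printing it.
out <- print(tab,
             include.rownames = FALSE,   # drop the 1, 2, ... row labels
             caption.placement = "top",  # caption above the tabular
             print.results = FALSE)
cat(out)
```

By default print.xtable escapes special characters such as the percent sign, so the strings do not need manual escaping.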

Have a look at the tables package for another way of making this simple.

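If you want to create a summary table containing both categorical and continuous variables, you should look into the tableone package.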

Here is an example of what it can do: https://cran.r-project.org/web/packages/tableone/tableone.pdf

I hope this helps.
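
A minimal sketch of how that might look for the data in the question, assuming the tableone package is installed. The data are regenerated so the snippet is self-contained; the choice of vars and strata is an assumption about what the asker wants, not something taken from the linked manual.

```r
library(tableone)

set.seed(200)
dat <- data.frame(
  groups = factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE)),
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE))
)

# Continuous variables are summarised as mean (SD) and factors as n (%),
# with one column per level of the stratifying variable.
tab <- CreateTableOne(vars = c("a", "b", "c"), strata = "groups", data = dat)
print(tab)
```

The printed table also includes p-values for group comparisons by default; passing test = FALSE to CreateTableOne leaves them out.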

  • Mike



