I can't find a really intuitive way of doing the most basic thing: creating a summary table with my base variables. The best method I've found so far uses tapply:

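my_stats <- function(x){
    ret <- NULL  # stays NULL for factors with more than two levels
    if (is.factor(x)){
        a <- table(x, useNA="no")
        b <- round(a*100/sum(a),2)

        # If binary: count and percentage of the first level
        if (length(a) == 2){
            ret <- paste(a[1], " (", b[1], " %)", sep="")
        }
    } else {
        # Numeric: mean, with two decimals for values below 1
        ret <- mean(x, na.rm=TRUE)
        if (ret < 1){
            ret <- round(ret, 2)
        } else {
            ret <- round(ret)
        }
    }
    return(ret)
}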

groups <- factor(sample(c("Group A","Group B"), size=51, replace=TRUE))
a <- 3:53
b <- rnorm(51)
c <- factor(sample(c("male","female"), size=51, replace=TRUE))

res <- rbind(a=tapply(a, groups, my_stats),
      b=tapply(b, groups, my_stats),
      c=tapply(c, groups, my_stats))

The res contains:

> res
  Group A     Group B       
a "28"        "28"          
b "-0.08"     "-0.21"       
c "14 (56 %)" "14 (53.85 %)"

Now this works, but it seems very complex and not the most elegant solution. I've tried searching for how to create descriptive tables, but the results all focus on table(), prop.table(), and summary() for just a single variable or for variables of the same kind.

If you rewrite your function so that it always returns a string (it sometimes returns a string, sometimes a number, sometimes NULL), you can call ddply on the data.frame, without having to specify all the columns.

library(plyr)  # provides ddply() and .()

f <- function(u) {
  res <- "?"
  if(is.factor(u) || is.character(u)) {
    # Categorical: percentage of the first level
    u <- table(u, useNA = "no")
    if (length(u) == 0 || sum(u) == 0) { res <- "NA" }
    else { res <- sprintf( "%0.0f%%", 100 * u[1] / sum(u) ) }
  } else {
    # Numeric: mean, with two decimals when its absolute value is below 1
    u <- mean(u, na.rm=TRUE)
    if(is.na(u)) { res <- "NA" }
    else { res <- sprintf( ifelse( abs(u) < 1, "%0.2f", "%0.0f" ), u ) }
  }
  return( res )
}

# Same function, for data.frames
g <- function(d) do.call( data.frame, lapply(d, f) )

ddply(data.frame(a,b,c), .(groups), g)
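If you would rather not depend on plyr, the same "always return a string" idea can be sketched in base R with split() and vapply(). This is only an illustration, not part of the answer above: summarise_col and the seeded dat are hypothetical names and data made up for the example, and the factor branch prints a count plus a percentage rather than the percentage alone.

```r
# Hypothetical helper (not from the original answer): always returns one string.
summarise_col <- function(u) {
  if (is.factor(u) || is.character(u)) {
    tab <- table(u, useNA = "no")
    if (length(tab) == 0 || sum(tab) == 0) return("NA")
    n <- as.integer(tab[[1]])                      # count of the first level
    return(sprintf("%d (%.0f%%)", n, 100 * n / sum(tab)))
  }
  m <- mean(u, na.rm = TRUE)
  if (is.na(m)) return("NA")
  sprintf(if (abs(m) < 1) "%0.2f" else "%0.0f", m)
}

# Assumed example data, mirroring the variables in the question.
set.seed(1)
dat <- data.frame(
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE)),
  groups = factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE))
)

vars <- c("a", "b", "c")

# Split the variables by group, then summarise every column of every piece.
# vapply() enforces that each summary really is a single string.
res <- sapply(split(dat[vars], dat$groups),
              function(d) vapply(d, summarise_col, character(1)))
print(res)
```

The result is a character matrix with one row per variable and one column per group, which is the same layout as the rbind/tapply table in the question.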


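library(Hmisc)  # describe() and latex() come from the Hmisc package

# Assumption: d is a data frame holding the variables to describe,
# e.g. d <- data.frame(a, b, c, groups)
latex(describe(d), file="")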



library(plyr)    # ddply(), summarise()
library(xtable)  # xtable()

dat <- data.frame(a,b,c,groups)
xtable(ddply(dat,.(groups),summarise,a = my_stats(a),
                                     b = my_stats(b),
                                     c = my_stats(c)))

 & groups & a & b & c \\
1 & Group A & 28.00 & 0.14 & 13 (52 \%) \\
2 & Group B & 28.00 & -0.00 & 13 (50 \%) \\




Here is an example of what it can do: https://cran.r-project.org/web/packages/tableone/tableone.pdf
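As a rough sketch of how the tableone package linked above might be applied to the data in the question: CreateTableOne() takes the variables to summarise, a stratifying variable, and a data frame. The seeded dat below is made-up example data, and the call is guarded so the snippet still runs when tableone is not installed.

```r
# Assumed example data, mirroring the variables in the question.
set.seed(1)
dat <- data.frame(
  a = 3:53,
  b = rnorm(51),
  c = factor(sample(c("male", "female"), size = 51, replace = TRUE)),
  groups = factor(sample(c("Group A", "Group B"), size = 51, replace = TRUE))
)

if (requireNamespace("tableone", quietly = TRUE)) {
  # Continuous variables are summarised as mean (SD), factors as n (%),
  # with one column per level of the strata variable.
  tab <- tableone::CreateTableOne(vars = c("a", "b", "c"),
                                  strata = "groups",
                                  data = dat)
  print(tab)
} else {
  message("tableone is not installed; skipping the example")
}
```

The printed table mixes numeric and categorical variables in one summary, which is what the hand-written my_stats function was trying to achieve.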


