How to reshape this dataframe with the reshape package [duplicate]
  • Time: 2012-01-13 15:57:25
  • Tags:
  • r
  • reshape

I have this wide data frame:

id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v 
 1     2     4     5    10    20    15   200   150   170   2.5
 2     3     7     6    25    35    40   300   350   400   4.2

I need to build a data frame like this:

id   xsource   xvalue   yvalue   zvalue       v 
 1        x1        2       10      200     2.5
 1        x2        4       20      150     2.5
 1        x3        5       15      170     2.5
 2        x1        3       25      300     4.2
 2        x2        7       35      350     4.2
 2        x3        6       40      400     4.2

I'm quite sure I have to do it with the reshape package, but I'm not able to get what I want.

Could you help me?


Best answer

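Here is a `reshape()` solution.

The key point is that the `varying=` argument can take a list of vectors of column names in the wide format, where each vector corresponds to a single variable in the long format. In this case, columns "x1", "x2", "x3" in the original data frame are sent to one column in the long data frame, columns "y1", "y2", "y3" go into a second column, and so on.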

# Read in the original data, x, from Andrie's answer

res <- reshape(x, direction = "long", idvar = "id",
               varying = list(c("x1","x2", "x3"), 
                              c("y1", "y2", "y3"), 
                              c("z1", "z2", "z3")),
               v.names = c("xvalue", "yvalue", "zvalue"), 
               timevar = "xsource", times = c("x1", "x2", "x3"))
#      id   v xsource xvalue yvalue zvalue
# 1.x1  1 2.5      x1      2     10    200
# 2.x1  2 4.2      x1      3     25    300
# 1.x2  1 2.5      x2      4     20    150
# 2.x2  2 4.2      x2      7     35    350
# 1.x3  1 2.5      x3      5     15    170
# 2.x3  2 4.2      x3      6     40    400

Finally, two purely cosmetic steps are needed to get the result to look exactly as shown in your question:

res <- res[order(res$id, res$xsource), c(1,3,4,5,6,2)]
row.names(res) <- NULL
res
#   id xsource xvalue yvalue zvalue   v
# 1  1      x1      2     10    200 2.5
# 2  1      x2      4     20    150 2.5
# 3  1      x3      5     15    170 2.5
# 4  2      x1      3     25    300 4.2
# 5  2      x2      7     35    350 4.2
# 6  2      x3      6     40    400 4.2
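
The `varying=` list described above does not have to be typed out by hand. The following is a minimal sketch, not part of the original answer, that builds the same list from the column prefixes; the names `prefixes`, `waves`, `varying_cols`, and `res2` are made up for illustration, and it assumes every prefix has the same suffixes 1 to 3.

```r
# Recreate the example data from the question.
x <- read.table(text = "
id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v
1     2     4     5    10    20    15   200   150   170   2.5
2     3     7     6    25    35    40   300   350   400   4.2
", header = TRUE)

prefixes <- c("x", "y", "z")  # one entry per long-format value column (assumed)
waves    <- 1:3               # suffixes shared by every prefix (assumed)

# Builds list(c("x1","x2","x3"), c("y1","y2","y3"), c("z1","z2","z3")).
varying_cols <- lapply(prefixes, function(p) paste0(p, waves))

# Same reshape() call as above, with the generated list.
res2 <- reshape(x, direction = "long", idvar = "id",
                varying = varying_cols,
                v.names = paste0(prefixes, "value"),
                timevar = "xsource", times = paste0("x", waves))
res2
```

This only pays off when there are many column groups; for three groups the explicit list is arguably easier to read.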

Other answers

Someone may prove me wrong, but I don't think this problem is easy to solve with either the `reshape` package or the base `reshape()` function.

However, it's easy enough using `lapply` and `do.call`:

Replicate the data:

x <- read.table(text="
id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v 
1     2     4     5    10    20    15   200   150   170   2.5
2     3     7     6    25    35    40   300   350   400   4.2
", header=TRUE)

Do the analysis:

chunks <- lapply(1:nrow(x), 
    function(i)cbind(x[i, 1], 1:3, matrix(x[i, 2:10], ncol=3), x[i, 11]))
res <- do.call(rbind, chunks)
colnames(res) <- c("id", "source", "x", "y", "z", "v")
res

     id source x y  z   v  
[1,] 1  1      2 10 200 2.5
[2,] 1  2      4 20 150 2.5
[3,] 1  3      5 15 170 2.5
[4,] 2  1      3 25 300 4.2
[5,] 2  2      7 35 350 4.2
[6,] 2  3      6 40 400 4.2
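
The left-aligned cells in the printout above suggest that `res` is a list-matrix, most likely because `matrix()` receives a one-row data frame rather than an atomic vector. If a regular data frame with numeric columns is preferred, a variant along the following lines may help. This is a sketch under that assumption, not the original author's code; `res_df` is an illustrative name.

```r
# Recreate the example data.
x <- read.table(text = "
id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v
1     2     4     5    10    20    15   200   150   170   2.5
2     3     7     6    25    35    40   300   350   400   4.2
", header = TRUE)

chunks <- lapply(seq_len(nrow(x)), function(i) {
  data.frame(
    id     = x$id[i],
    source = 1:3,
    # unlist() turns the one-row data frame into a plain vector, so
    # matrix() fills column-wise: x1..x3, then y1..y3, then z1..z3.
    matrix(unlist(x[i, 2:10]), ncol = 3,
           dimnames = list(NULL, c("x", "y", "z"))),
    v      = x$v[i]
  )
})

# Stack the per-row chunks into one data frame.
res_df <- do.call(rbind, chunks)
res_df
```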

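Try the reshapeGUI package. It builds on the plyr and reshape2 packages and gives you an easy-to-use interface that lets you preview your reshape before you execute it. It also gives you the code for the reshape, so you can paste it into your script for reproducibility and learn to use the melt and cast commands in reshape2. It is a nice aid for complex data manipulations.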

Here are two newer options that may be of interest to people looking at this question:

Option 1: tidyverse

library(tidyverse)
x %>% 
  gather(var, val, -id, -v) %>% 
  extract(var, into = c("header", "source"), regex = "([a-z])([0-9])") %>% 
  spread(header, val)
#   id   v source x  y   z
# 1  1 2.5      1 2 10 200
# 2  1 2.5      2 4 20 150
# 3  1 2.5      3 5 15 170
# 4  2 4.2      1 3 25 300
# 5  2 4.2      2 7 35 350
# 6  2 4.2      3 6 40 400
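
In tidyr 1.0.0 and later, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The following sketch, which is an addition and not part of the original answer, should do the same reshape in one step; it assumes tidyr >= 1.0.0 is installed, and `long` is an illustrative name. The special `".value"` entry in `names_to` tells `pivot_longer()` to use the first regex group (the letter) as the output column name, while the second group (the digit) goes into `source`.

```r
library(tidyr)

# Recreate the example data.
x <- read.table(text = "
id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v
1     2     4     5    10    20    15   200   150   170   2.5
2     3     7     6    25    35    40   300   350   400   4.2
", header = TRUE)

long <- pivot_longer(
  x,
  cols          = -c(id, v),              # pivot everything except id and v
  names_to      = c(".value", "source"),  # letter -> column name, digit -> source
  names_pattern = "([a-z])([0-9])"
)
long
```

Note that `source` comes back as a character column here.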

Option 2: data.table

library(data.table)
setDT(x)
melt(x, measure.vars = patterns("x", "y", "z"), 
     value.name = c("x", "y", "z"), 
     variable.name = "source")
#    id   v source x  y   z
# 1:  1 2.5      1 2 10 200
# 2:  2 4.2      1 3 25 300
# 3:  1 2.5      2 4 20 150
# 4:  2 4.2      2 7 35 350
# 5:  1 2.5      3 5 15 170
# 6:  2 4.2      3 6 40 400
