Creating a standardized data table entry from multiple different formats
  • Time: 2011-10-20 15:46:06
  • Tags: r, dataframe

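I have a data frame with several fields. One of the fields holds the sample name, and because the data came from various inputs, my samples are named in a variety of formats. Here are some examples: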

 "12" "250" "1248" "1_100111" "16_100111" "125_081811" "1249_100111" 

The above examples represent the majority of the samples. I would like to change all of the samples to a 4-digit format so they can be easily sorted. The final result for the above examples would be:

 "0012" "0250" "1248" "0001" "0016" "0125" "1249" 

Thus, in some cases zeros must be added and in other cases, the date marker must be cut off. It is very important that the changes are made within the context of a data frame and returned in the same format.

Best answer

One way to do it:

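x <- c("12", "250", "1248", "1_100111", "16_100111", "125_081811", "1249_100111")
# Keep the leading digits, drop an optional "_<date>" suffix, then zero-pad to 4 digits.
# Backslashes must be doubled inside R string literals.
sprintf(as.numeric(gsub("(\\d*)_*\\d*$", "\\1", x)), fmt="%04d")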

[1] "0012" "0250" "1248" "0001" "0016" "0125" "1249"
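Other answers

# Note: the "0" flag with %s zero-pads on some platforms and is ignored on others,
# so the output below is platform-dependent.
sprintf("%04s",
  sub("_.+", "", c("12", "250", "1248", "1_100111", "16_100111", 
                   "125_081811", "1249_100111" ) ) )
[1] "0012" "0250" "1248" "0001" "0016" "0125" "1249"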
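
Both answers work on a plain character vector, while the question asks for the change to happen inside a data frame. A minimal sketch of that step follows, assuming a hypothetical data frame `df` with a sample column named `Sample` (both names are illustrative, as is the extra `value` column; none of them come from the question). It converts to integer before padding, so it does not rely on the platform-dependent `%04s` behaviour.

```r
# Hypothetical data frame; the names `df`, `Sample` and `value` are assumptions.
df <- data.frame(
  Sample = c("12", "250", "1248", "1_100111", "16_100111",
             "125_081811", "1249_100111"),
  value  = seq_len(7),
  stringsAsFactors = FALSE
)

# Drop everything from the first underscore on (the date marker),
# then zero-pad the remaining number to 4 digits.
# as.character() also covers the case where the column is a factor.
ids <- sub("_.*$", "", as.character(df$Sample))
df$Sample <- sprintf("%04d", as.integer(ids))

df$Sample
# [1] "0012" "0250" "1248" "0001" "0016" "0125" "1249"

# The padded strings now sort in numeric order.
df <- df[order(df$Sample), ]
```

The column is overwritten in place, so the result is still a data frame with the same columns. Sample names that contain anything other than digits before the underscore would become `NA` here and would need separate handling.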



