Separate Dataframe in R
  • Time: 2023-08-21 02:28:53
  • Tags: r

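I am trying to separate a column into two columns in R. The problem I am having is that, depending on the type of observation, the year does not separate from the preceding column.

Some of the names in the data frame are first names only, while others are first and last names. I am trying to separate the `Name` (first name, or first and last name) and the `Year` out of the current `Name` column.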

Fake data: employees and the year they started employment.

Create the data frame:

library(tidyverse)  # provides tibble(), separate(), and %>%

dat <- tibble(Name = c("Percy Vere (2020)", "Ginger Plant (2017)", "Perry (2019)",
                    "Pat Thettick (2020)", "Samuel (2022)", "Fay Daway (2008)",
                    "Greg (2022)", "Simon Sais (2011)"))
# A tibble: 8 x 1
  Name               
  <fct>              
1 Percy Vere (2020)  
2 Ginger Plant (2017)
3 Perry (2019)       
4 Pat Thettick (2020)
5 Samuel (2022)      
6 Fay Daway (2008)   
7 Greg (2022)        
8 Simon Sais (2011) 

Separate the column into two columns, `Name` and `Year`:

dat %>% 
  select_all() %>% 
  separate(col = Name, into = c("Name", "Year")) %>%    # sep = "," and ";" does not create a fix 
  tibble()

# A tibble: 8 x 2
  Name   Year    
  <chr>  <chr>   
1 Percy  Vere    
2 Ginger Plant   
3 Perry  2019    
4 Pat    Thettick
5 Samuel 2022    
6 Fay    Daway   
7 Greg   2022    
8 Simon  Sais    
Warning message:
Expected 2 pieces. Additional pieces discarded in 8 rows [1, 2, 3, 4, 5, 6, 7, 8]. 

Answers


# separate_wider_regex() requires tidyr >= 1.3.0; backslashes must be doubled in R strings
dat |> 
  separate_wider_regex(Name, patterns = c(Name = ".*(?= \\()", " \\(", Year = "\\d{4}", "\\)")) |> 
  mutate(Year = as.integer(Year))

Output:

# A tibble: 8 × 2
  Name          Year
  <chr>        <int>
1 Percy Vere    2020
2 Ginger Plant  2017
3 Perry         2019
4 Pat Thettick  2020
5 Samuel        2022
6 Fay Daway     2008
7 Greg          2022
8 Simon Sais    2011
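
For context on why the original attempt splits in the wrong place: when no `sep` is given, `separate()` uses `sep = "[^[:alnum:]]+"`, so every space and parenthesis counts as a delimiter. The sketch below reproduces that splitting with base R's `strsplit()`. This is only an approximation of what `separate()` does internally, and `default_sep`, `x`, and `pieces` are illustrative names.

```r
# The default separator of tidyr::separate(): any run of non-alphanumeric characters
default_sep <- "[^[:alnum:]]+"

x <- c("Percy Vere (2020)", "Perry (2019)")

# Split each string on spaces and parentheses alike
pieces <- strsplit(x, default_sep)

pieces[[1]]  # "Percy" "Vere" "2020" -> the first two pieces fill Name and Year
pieces[[2]]  # "Perry" "2019"        -> works only because there is no last name
```

With `into = c("Name", "Year")`, only the first two pieces are kept, which is why rows with a last name end up with the last name in `Year`.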

Or, if you want the names split up further:

# When there is no last name, `Last Name` matches an empty string, which is then turned into NA
dat |> 
  separate_wider_regex(Name, patterns = c(`First Name` = ".*?(?= )", "\\s*", `Last Name` = ".*(?= \\()", " \\(", Year = "\\d{4}", "\\)")) |> 
  mutate(Year = as.integer(Year), 
         `Last Name` = if_else(nchar(`Last Name`) == 0, NA_character_, `Last Name`))
# A tibble: 8 × 3
  `First Name` `Last Name`  Year
  <chr>        <chr>       <int>
1 Percy        Vere         2020
2 Ginger       Plant        2017
3 Perry        NA           2019
4 Pat          Thettick     2020
5 Samuel       NA           2022
6 Fay          Daway        2008
7 Greg         NA           2022
8 Simon        Sais         2011
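
If tidyr 1.3.0 or later is not available, the same three-way split can be approximated in base R. The following is a minimal sketch rather than the approach used in the answer above. It assumes every entry ends in a four-digit year in parentheses, and `x`, `pattern`, `parts`, and `out` are illustrative names.

```r
x <- c("Percy Vere (2020)", "Perry (2019)")

# Group 1: first name (no spaces); group 2: rest of the name (may be empty);
# group 3: four-digit year inside the parentheses
pattern <- "^([^ ]+) ?(.*) \\(([0-9]{4})\\)$"

# regexec() returns the full match plus one entry per capture group
matches <- regmatches(x, regexec(pattern, x, perl = TRUE))
parts <- do.call(rbind, matches)  # one row per input, columns: full match, g1, g2, g3

out <- data.frame(
  first_name = parts[, 2],
  last_name  = ifelse(parts[, 3] == "", NA_character_, parts[, 3]),  # empty -> NA
  year       = as.integer(parts[, 4]),
  stringsAsFactors = FALSE
)
out
```

An entry that does not match the pattern would yield an empty match and break the `rbind()` step, so real data may need a check that every element of `matches` has length 4.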



