How do I create a new column that subtracts the current year from the NEXT year of reelection?
  • Date: 2023-10-18 23:09:31
  • Tags: r, dplyr

I have the following data frame:

df <- structure(list(name = c("jones", "williams", "jones",
    "williams", "williams", "jones", "williams", "williams", "jones",
    "williams", "williams", "jones", "williams", "williams", "jones",
    "williams", "williams", "jones", "williams", "williams", "jones",
    "williams", "jones", "jones", "jones", "jones",
    "jones", "jones", "jones", "jones"), state = c("NY",
    "NC", "NY", "NC", "TX", "NY", "NC", "TX", "NY", "TX", "NC", "NY",
    "TX", "NC", "NY", "TX", "NC", "NY", "TX", "NC", "NY", "NC", "NY",
    "NY", "NY", "NY", "NY", "NY", "NY", "NY"), year = structure(c(1995,
    1995, 1996, 1996, 1996, 1997, 1997, 1997, 1998, 1998, 1998, 1999,
    1999, 1999, 2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2003,
    2004, 2005, 2006, 2007, 2008, 2009, 2010), format.stata = "%8.0g"),
    year_of_election = c(NA, 1992, NA, 1992, 1996, NA, 1992,
    1996, NA, 1998, 1998, 1999, 1998, 1998, 1999, 1998, 1998,
    1999, 1998, 1998, 1999, 1998, 1999, 1999, 1999, 1999, 1999,
    1999, 2009, 2009)), class = c("grouped_df", "tbl_df", "tbl",
    "data.frame"), row.names = c(NA, -30L), groups = structure(list(
    name = c("williams", "williams", "jones"), state = c("NC",
    "TX", "NY"), .rows = structure(list(c(2L, 4L, 7L, 11L, 14L,
    17L, 20L, 22L), c(5L, 8L, 10L, 13L, 16L, 19L), c(1L, 3L,
    6L, 9L, 12L, 15L, 18L, 21L, 23L, 24L, 25L, 26L, 27L, 28L,
    29L, 30L)), ptype = integer(0), class = c("vctrs_list_of",
    "vctrs_vctr", "list"))), row.names = c(NA, -3L), .drop = TRUE, class = c("tbl_df",
    "tbl", "data.frame")))

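I want to create a new column called "years_until_reelection" that subtracts the current year from the next year of election. For example, for jones-NY-1995, "years_until_reelection" should be 4, because 1999 is the next election year. For jones-NY-1999, I want it to be zero. For 2000, the next election year is 2009, so the value should be 9. For jones-NY-2009, I want it to be zero again, and so on. If there is no "next" election year, I want the value to be NA.

This is the code I am currently using, but its results are way off: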

df <- df %>%
  arrange(name, state, year) %>%
  group_by(name, state) %>%
  mutate(
    years_until_reelection = lead(year_of_election) - year
  ) %>%
  ungroup()
Answers

Perhaps this self-join:

library(dplyr)
df %>%
  left_join(select(df, name, state, next_election = year_of_election), 
            join_by(name, state, year <= next_election), multiple = "first") %>%
  mutate(years_until_reelection = next_election - year) %>%
  ungroup() %>%
  print(n=99)
# # A tibble: 30 × 6
#    name     state  year year_of_election next_election years_until_reelection
#    <chr>    <chr> <dbl>            <dbl>         <dbl>                  <dbl>
#  1 jones    NY     1995               NA          1999                      4
#  2 williams NC     1995             1992          1998                      3
#  3 jones    NY     1996               NA          1999                      3
#  4 williams NC     1996             1992          1998                      2
#  5 williams TX     1996             1996          1996                      0
#  6 jones    NY     1997               NA          1999                      2
#  7 williams NC     1997             1992          1998                      1
#  8 williams TX     1997             1996          1998                      1
#  9 jones    NY     1998               NA          1999                      1
# 10 williams TX     1998             1998          1998                      0
# 11 williams NC     1998             1998          1998                      0
# 12 jones    NY     1999             1999          1999                      0
# 13 williams TX     1999             1998            NA                     NA
# 14 williams NC     1999             1998            NA                     NA
# 15 jones    NY     2000             1999          2009                      9
# 16 williams TX     2000             1998            NA                     NA
# 17 williams NC     2000             1998            NA                     NA
# 18 jones    NY     2001             1999          2009                      8
# 19 williams TX     2001             1998            NA                     NA
# 20 williams NC     2001             1998            NA                     NA
# 21 jones    NY     2002             1999          2009                      7
# 22 williams NC     2002             1998            NA                     NA
# 23 jones    NY     2003             1999          2009                      6
# 24 jones    NY     2004             1999          2009                      5
# 25 jones    NY     2005             1999          2009                      4
# 26 jones    NY     2006             1999          2009                      3
# 27 jones    NY     2007             1999          2009                      2
# 28 jones    NY     2008             1999          2009                      1
# 29 jones    NY     2009             2009          2009                      0
# 30 jones    NY     2010             2009            NA                     NA
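
The self-join above picks the right row because `multiple = "first"` takes the first match in row order, and the election years happen to be non-decreasing within each group. As a hedged variant (not part of the original answer), the sketch below uses the `closest()` helper of `join_by()` so the nearest upcoming election is chosen regardless of row order. The `toy` data frame and the `elections` lookup table are hypothetical names made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: one legislator with election years 1999 and 2002
toy <- tibble(
  name = "jones",
  state = "NY",
  year = c(1997, 1998, 1999, 2000, 2001, 2002, 2003),
  year_of_election = c(NA, NA, 1999, 1999, 1999, 2002, 2002)
)

# Lookup table: one row per known election year for each name/state pair
elections <- toy %>%
  filter(!is.na(year_of_election)) %>%
  select(name, state, next_election = year_of_election) %>%
  distinct()

# closest() keeps only the nearest election at or after each year;
# rows with no upcoming election get NA from the left join
result <- toy %>%
  left_join(elections, join_by(name, state, closest(year <= next_election))) %>%
  mutate(years_until_reelection = next_election - year)

result$years_until_reelection
# 2 1 0 2 1 0 NA
```

Deduplicating the lookup table matters here: `closest()` returns every row tied for the nearest value, so repeated election years would otherwise duplicate rows in the result.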

The problem is that your "year of election" column records the previous election, not the upcoming one.
The code below sets the election year column to NA wherever year differs from the election year, then fills those NAs upward within each group, so that every row carries the next election year.

library(dplyr)
library(tidyr)

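answer <- df %>%
  # sort first: fill() works in row order within each group
  arrange(name, state, year) %>%
  group_by(name, state) %>%
  # keep the election year only on rows that are themselves election years
  mutate(year_of_election = ifelse(year == year_of_election, year_of_election, NA)) %>%
  # pull each election year back onto the rows that precede it
  fill(year_of_election, .direction = "up") %>%
  mutate(years_until_reelection = year_of_election - year)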

answer
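
To make the mask-and-fill idea easier to follow, here is a minimal sketch on a small made-up data frame. The `toy` data and the legislator "smith" are hypothetical and not taken from the question. It uses the same `group_by()`, `fill()`, and subtraction steps, but writes the result to a separate `next_election` column so the original `year_of_election` stays visible.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with two groups, already sorted by year within group
toy <- tibble(
  name  = c("jones", "jones", "jones", "jones", "smith", "smith", "smith"),
  state = c("NY", "NY", "NY", "NY", "TX", "TX", "TX"),
  year  = c(1997, 1998, 1999, 2000, 1997, 1998, 1999),
  year_of_election = c(NA, NA, 1999, 1999, 1994, 1998, 1998)
)

result <- toy %>%
  group_by(name, state) %>%
  # Step 1: keep the election year only on rows that are election years
  mutate(next_election = if_else(year == year_of_election,
                                 year_of_election, NA_real_)) %>%
  # Step 2: fill upward so earlier rows see the next election in their group
  fill(next_election, .direction = "up") %>%
  # Step 3: distance to that election; NA when none is left
  mutate(years_until_reelection = next_election - year) %>%
  ungroup()

result$years_until_reelection
# 2 1 0 NA 1 0 NA
```

The grouping is what keeps the jones-NY row for 2000 at NA: without `group_by()`, `fill()` would carry smith's 1998 election up into the jones rows.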