Filtering data in dplyr by multiple columns

Here is my dummy data:

    year <- c(2012, 2012, 2013, 2019, 2020, 2021, 2022)
    individual <- c(1, 1, 1, 2, 2, 3, 3)
    group <- c("A", "B", "A", "A", "B", "A", "B")
    mass <- c(84.3, 82.5, 80.6, 79.2, 82.8, 79.1, 82.6)
    dataset <- data.frame(year, individual, group, mass)

I have 10 years of body mass data for individual birds across two "groups". As you'll see from the dummy dataset, I have instances where mass data exists for one group but not the other (individual 1, year 2013). I want to only include data where I have an individual's mass for BOTH group A AND group B. I just can't figure out how to filter data this way--ideally with dplyr's filter() command.

I apologize if this partly duplicates another post; I wasn't able to find an answer that helps me with this exact filtering task.

    library(dplyr)
    df2 <- dataset %>%
      group_by(individual, year) %>% filter(n()>1)

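^ I used this code to group the data by individual and year, so it only includes individuals that occur more than once (an individual can occur only once per group per year). However, from here I can't figure out how to exclude cases where I have a third mass record for an individual in a subsequent year for either Group A or Group B, but not both. I need a dataset that only includes individuals where I have BOTH Group A and Group B mass values in a given year.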

Best answer

Use:

dataset %>%
  filter(all(c("A", "B") %in% group), .by = c(year, individual))
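The `.by` argument to `filter()` is only available in dplyr 1.1.0 and later. For older versions, the same per-group test can be sketched with `group_by()` and `ungroup()`. This is a minimal sketch that rebuilds the dummy data from the question; the name `both_groups` is made up for illustration.

```r
library(dplyr)

# Dummy data copied from the question
dataset <- data.frame(
  year       = c(2012, 2012, 2013, 2019, 2020, 2021, 2022),
  individual = c(1, 1, 1, 2, 2, 3, 3),
  group      = c("A", "B", "A", "A", "B", "A", "B"),
  mass       = c(84.3, 82.5, 80.6, 79.2, 82.8, 79.1, 82.6)
)

# For each year/individual combination, keep its rows only if
# both "A" and "B" appear in the group column of that combination
both_groups <- dataset %>%
  group_by(year, individual) %>%
  filter(all(c("A", "B") %in% group)) %>%
  ungroup()

both_groups
```

With the dummy data this keeps only the two 2012 rows for individual 1, since that is the only year/individual combination with a mass value in both groups.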

Other answers

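I want to only include data where I have an individual's mass for BOTH group A AND group B. I just can't figure out how to filter data this way--ideally with dplyr's filter() command.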

I think a possible solution is to pivot_wider your data.

dataset |>
  tidyr::pivot_wider(names_from = group, values_from = mass, names_prefix = "mass") 
#> # A tibble: 6 × 4
#>    year individual massA massB
#>   <dbl>      <dbl> <dbl> <dbl>
#> 1  2012          1  84.3  82.5
#> 2  2013          1  80.6  NA  
#> 3  2019          2  79.2  NA  
#> 4  2020          2  NA    82.8
#> 5  2021          3  79.1  NA  
#> 6  2022          3  NA    82.6

Now you can apply na.omit() or dplyr::filter(!is.na(massA) & !is.na(massB)) to filter your data...

I need a dataset that only includes individuals where I have BOTH Group A and Group B mass values in a given year.

dataset |>
  tidyr::pivot_wider(names_from = group, 
                     values_from = mass, 
                     names_prefix = "mass") |>
  na.omit()
#> # A tibble: 1 × 4
#>    year individual massA massB
#>   <dbl>      <dbl> <dbl> <dbl>
#> 1  2012          1  84.3  82.5

Created on 2023-12-20 with reprex v2.0.2.
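The `dplyr::filter()` variant mentioned above is not shown in the answer, so here is a minimal sketch of it, again rebuilding the dummy data from the question. The final `pivot_longer()` step is an optional addition, not part of the original answer, for cases where the long one-row-per-group layout is needed again; the names `wide`, `complete_pairs` and `long_again` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Dummy data copied from the question
dataset <- data.frame(
  year       = c(2012, 2012, 2013, 2019, 2020, 2021, 2022),
  individual = c(1, 1, 1, 2, 2, 3, 3),
  group      = c("A", "B", "A", "A", "B", "A", "B"),
  mass       = c(84.3, 82.5, 80.6, 79.2, 82.8, 79.1, 82.6)
)

# One row per year/individual, with one mass column per group
wide <- dataset |>
  pivot_wider(names_from = group, values_from = mass, names_prefix = "mass")

# Keep only the rows where both groups have a mass value
complete_pairs <- wide |>
  filter(!is.na(massA) & !is.na(massB))

# Optional: go back to the original long layout (one row per group)
long_again <- complete_pairs |>
  pivot_longer(cols = c(massA, massB),
               names_to = "group",
               names_prefix = "mass",
               values_to = "mass")

long_again
```

Unlike `na.omit()`, which drops a row if any column is missing, the explicit `filter()` only looks at the two mass columns, which may matter if the real dataset has other columns containing `NA`.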