dplyr mutate with conditional values AND OR to create a group category

I have a dataset with a variable that holds the individuals observed, and it can take many values. I have observations for a given Day on different individuals (Individual_ID).

The individuals look like this:

Individual_ID("Adele", "Fitz", "Abba") ... these belong to Group=A
Individual_ID("Noir", "Rouge", "Bleue") ... these belong to Group=B

In some instances, individuals from different groups get mixed, so we have something like Individual_ID("Adele", "Rouge", "Bleue"), which represents a mixed group.

I would like to create a variable called GroupingID that can be either GroupA, GroupB, or MixedGroup. I do not require that all individuals of a group are present, only that the individuals observed either all belong to one group (a "neat" grouping) or do not.

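To count as a mixed group, any combination involving at least two individuals from different groups is enough.

Could someone explain how I can use conditions combined with AND/OR to create this group variable?

Here is what my data looks like: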

Date      IndividualsObserved    
1/1/2016   Abba,Adele
2/1/2016   Adele,Fitz
3/1/2016   Fitz,Rouge,Noir
4/1/2016   Fitz,Adele,Abba
5/1/2016   Rouge,Noir,Bleue
6/1/2016   Rouge,Abba,Fitz

(The different individuals are separated by commas within each cell of the IndividualsObserved column.)

So I would like a grouping category that can tell whether the grouping is neat (only one group identity) or mixed (composed of individuals from different groups). It would look something like this (GroupingID):

Date      IndividualsObserved   GroupingID
1/1/2016   Abba,Adele           GroupA
2/1/2016   Adele,Fitz           GroupA
3/1/2016   Fitz,Rouge,Noir      MixedGrouping
4/1/2016   Fitz,Adele,Abba      GroupA
5/1/2016   Rouge,Noir,Bleue     GroupB
6/1/2016   Rouge,Abba,Fitz      MixedGrouping
7/1/2016   Noir,Bleue,Abba      MixedGrouping

I tried this, but it does not work:

  mutate(GroupingID = case_when(IndividualsObserved %in% c("Adele","Abba", "Fitz") ~ "GroupA",
                                IndividualsObserved %in% c("Noir","Bleue", "Rouge") ~ "GroupB",
                                TRUE ~ ToCheck)) 

I would appreciate your thoughts on how to approach this using combined conditions.


Answer

Steps:

  1. Create a named list for the groups
  2. Split each Individuals row into a list, giving us a list column
  3. For each row in the new list column, check whether any of the names are in group A and whether any are in group B. If both, the row is mixed; if only A, it is GroupA; if only B, it is GroupB; if neither, it is None.
library(tidyverse)

groups <- list("A" = c("Adele", "Fitz", "Abba"),
               "B" = c("Rouge", "Noir", "Bleue"))

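# Requires R >= 4.1 for the native pipe |> and the \(x) lambda shorthand
df |>
  mutate(IndividualsObserved = str_split(IndividualsObserved, ","),  # list column of names
         Group = map_chr(IndividualsObserved, \(x) {
            a <- any(x %in% groups$A)  # at least one member of group A present?
            b <- any(x %in% groups$B)  # at least one member of group B present?
            case_when(a & b ~ "MixedGrouping",
                      a ~ "GroupA",
                      b ~ "GroupB",
                      TRUE ~ "None")}))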

Output:

      Date IndividualsObserved         Group
1 1/1/2016         Abba, Adele        GroupA
2 2/1/2016         Adele, Fitz        GroupA
3 3/1/2016   Fitz, Rouge, Noir MixedGrouping
4 4/1/2016   Fitz, Adele, Abba        GroupA
5 5/1/2016  Rouge, Noir, Bleue        GroupB
6 6/1/2016   Rouge, Abba, Fitz MixedGrouping
7 7/1/2016   Noir, Bleue, Abba MixedGrouping

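You could do this in many other ways, for example by keeping a data frame of the groups and their corresponding individuals and separating each individual in df into its own row. I think the approach above is the most straightforward.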
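For comparison, the lookup-table alternative might look roughly like the sketch below. It is not part of the original answer: the `lookup` data frame and its column names `Individual` and `Group` are hypothetical names chosen for illustration, and it assumes `tidyr::separate_rows()` to get one row per observed individual. Names that are not in the lookup are ignored, and a row with no known names gets "None", which mirrors the logic above.

```r
library(dplyr)
library(tidyr)

# Same example data as in the Data section of this answer
df <- read.table(text =
"Date      IndividualsObserved
1/1/2016   Abba,Adele
2/1/2016   Adele,Fitz
3/1/2016   Fitz,Rouge,Noir
4/1/2016   Fitz,Adele,Abba
5/1/2016   Rouge,Noir,Bleue
6/1/2016   Rouge,Abba,Fitz
7/1/2016   Noir,Bleue,Abba", header = TRUE)

# Hypothetical lookup table: one row per individual with its group
lookup <- data.frame(
  Individual = c("Adele", "Fitz", "Abba", "Rouge", "Noir", "Bleue"),
  Group      = rep(c("GroupA", "GroupB"), each = 3)
)

result <- df |>
  mutate(Individual = IndividualsObserved) |>   # keep the original column intact
  separate_rows(Individual, sep = ",\\s*") |>   # one row per observed individual
  left_join(lookup, by = "Individual") |>       # attach each individual's group
  group_by(Date, IndividualsObserved) |>
  summarise(
    GroupingID = case_when(
      all(is.na(Group))                   ~ "None",           # no known individual
      n_distinct(Group, na.rm = TRUE) > 1 ~ "MixedGrouping",  # more than one group seen
      TRUE                                ~ Group[!is.na(Group)][1]  # the single group
    ),
    .groups = "drop"
  )

print(result)
```

The join-based version is longer, but adding a third group only means adding rows to `lookup` rather than another branch in `case_when()`.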

Data:

df <- read.table(text= 
"Date      IndividualsObserved
1/1/2016   Abba,Adele
2/1/2016   Adele,Fitz
3/1/2016   Fitz,Rouge,Noir
4/1/2016   Fitz,Adele,Abba
5/1/2016   Rouge,Noir,Bleue
6/1/2016   Rouge,Abba,Fitz
7/1/2016   Noir,Bleue,Abba", header = TRUE)



