Question

I have a data frame like this:

df<- data.frame(
  "Col1" = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3",
              "P3"),
  "Col2" = c("L", "L&R", "R", "V", "V&N", "N", "M", "I", "I&M",
             "I&M&G"),
  "Value" = c("20", "5", "75", "30", "7", "63", "10", "80", "2","8"))
df

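I want to redistribute the values in the third column based on the second column. For example, if Col2 contains L&R, I want to divide its Value by 2 (here the value is 5) and add the result to the L and R rows of the same group, P1. So L&R gives 5/2 = 2.5. Adding this 2.5 to L in group P1 gives 22.5, and adding it to R in group P1 gives 77.5. However, if Col2 contains two & characters, I want to divide the value by three instead. The final output should look like this: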

df.output<- data.frame(
  "Col1" = c("P1",  "P1", "P2",  "P2", "P3", "P3","P3"),
  "Col2" = c("L",  "R", "V",  "N", "M", "I","G" ),
  "Value" = c("22.5",  "77.5", "33.5",  "66.5", "13.66",  "83.66","2.66"))
df.output
  Col1 Col2 Value
1   P1    L  22.5
2   P1    R  77.5
3   P2    V  33.5
4   P2    N  66.5
5   P3    M 13.66
6   P3    I 83.66
7   P3    G  2.66

I have written the code below, which works when Col2 contains a single &, but I could not get it to work when there are two & characters.

library(tidyverse)

df %>%
  filter(!str_count(Col2, "&") > 1) %>%
  mutate(Value = ifelse(grepl("&", Col2), as.numeric(Value) / 2, as.numeric(Value))) %>%
  separate_rows(Col2, sep = "&") %>%
  group_by(Col1, Col2) %>%
  summarise(Value = sum(Value)) %>%
  ungroup()

Any help is appreciated.

Answer 1

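You can divide Value by the number of & characters in Col2 plus 1, which is the number of labels that row is split into. Note that separate_longer_delim() requires tidyr 1.3.0 or later, and the .by argument of summarize() requires dplyr 1.1.0 or later.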

library(tidyverse)

df %>% 
  mutate(Value = as.numeric(Value)/(str_count(Col2, "&") + 1)) %>% 
  separate_longer_delim(Col2, delim = "&") %>% 
  summarize(Value = sum(Value), .by = c(Col1, Col2))

  Col1 Col2     Value
1   P1    L 22.500000
2   P1    R 77.500000
3   P2    V 33.500000
4   P2    N 66.500000
5   P3    M 13.666667
6   P3    I 83.666667
7   P3    G  2.666667
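
For readers who prefer not to depend on tidyverse, the same idea can be sketched in base R. This is only an illustrative alternative, not part of the original answer; the names parts, n_parts, long and result are made up for this example. A label containing k ampersands splits into k + 1 parts, and each part receives an equal share of the row's value.

```r
# Same sample data as in the question
df <- data.frame(
  Col1 = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3", "P3"),
  Col2 = c("L", "L&R", "R", "V", "V&N", "N", "M", "I", "I&M", "I&M&G"),
  Value = c("20", "5", "75", "30", "7", "63", "10", "80", "2", "8")
)

# Split each label on "&"; a label with k ampersands yields k + 1 parts
parts <- strsplit(as.character(df$Col2), "&", fixed = TRUE)
n_parts <- lengths(parts)

# One row per part, each carrying an equal share of the original value
long <- data.frame(
  Col1 = rep(as.character(df$Col1), times = n_parts),
  Col2 = unlist(parts),
  Value = rep(as.numeric(as.character(df$Value)) / n_parts, times = n_parts)
)

# Sum the shares within each (Col1, Col2) pair
result <- aggregate(Value ~ Col1 + Col2, data = long, FUN = sum)
result
```

The row order returned by aggregate() may differ from the tidyverse output above, so sort the result if a particular order matters. Because every value is only split and re-summed, the total of Value is the same before and after the transformation.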
