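R: If consecutive columns are identical, how to keep only one and give it a new column name
I have a set of customer data that records which stores each customer shopped at, what they bought at each store, and on which day they shopped.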
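# NOTE: Store and Item are shorter than Names and Day in the source (they appear
# truncated), so data.frame() will error until both are filled in to length 8.
Shop_list <- data.frame(
  Names = c('Adam','Eve','Lucy','Ricky','Gomez','Morticia','Adam','Lucy'),
  Day = c(1,1,2,3,4,5,6,6),
  Store = c('None','None','Lowes','Home Depot','None'),
  Item = c('None','Wood','Soil','Nails','Pots','Seeds'),
  stringsAsFactors = FALSE
)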
library(dplyr)
library(tidyr)      # pivot_wider() comes from tidyr
library(flextable)
Shop_fcn <- function(data){
  data %>%
    group_by(Day) %>%
    mutate(N_nam = n_distinct(Names)) %>%            # distinct shoppers per day
    group_by(Names,Day,N_nam,Store,Item) %>%
    summarize(n_item = n()) %>%
    # N_nam and Store are kept in the grouping (assumed) because later steps use them
    group_by(Day,N_nam,Store,Item) %>%
    summarize(n_nam = n(),n_item = sum(n_item)) %>%
    # Day and N_nam inside Day_n are assumed; the source shows only the literal pieces
    mutate(pct = round(n_nam/N_nam*100,digits = 1),
           txt = paste0(n_nam," (",pct,"%)"),
           Day_n = paste0("Day ",Day," (N=",N_nam,")")) %>%
    ungroup() %>%
    select(Day_n,Store,Item,txt) %>%
    pivot_wider(values_from = txt,names_from = Day_n) %>%
    mutate_at(vars(starts_with("Day")),~if_else(is.na(.),"",.)) %>%
    arrange(Store,Item) %>%
    group_by(store2 = Store) %>%
    # blank out repeated store labels; the "" argument is assumed (missing in the source)
    mutate(Store = if_else(row_number() != 1,"",Store)) %>%
    ungroup() %>%
    select(-store2)
}
Shop_day <- Shop_list %>%
  bind_rows(Shop_list) %>%
  Shop_fcn()
flextable(Shop_day)
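I get the following output.
[output table not reproduced]
The columns for Day 2 and Day 3 are identical, and so are the columns for Days 4, 5 and 6. I am trying to give columns that carry the same information a single shared header, such as "Day 2 - 3 (N=4)" and "Day 4 - 6 (N=3)".
So far I have tried removing the duplicated columns: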
Shop_nodup <- Shop_day[!duplicated(as.list(Shop_day))]
flextable(Shop_nodup)
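which gives me
[output table not reproduced]
The duplicated columns are gone, but I cannot work out how to make the column header state the range of days that each remaining column covers ("Day 2 - 3 (N=4)" and "Day 4 - 6 (N=3)").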
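To make the limitation of the deduplication step concrete, here is a minimal sketch on a made-up wide table (the `toy` data frame, its column names and its values are assumptions for illustration, not the question's real output). It suggests that `duplicated(as.list(...))` compares column values only, so the header of a dropped column is discarded without a trace.

```r
# Toy wide table; names and values are invented for illustration only.
toy <- data.frame(
  Item          = c("Nails", "Wood"),
  `Day 1 (N=2)` = c("1 (50%)", ""),
  `Day 2 (N=4)` = c("", "2 (50%)"),
  `Day 3 (N=4)` = c("", "2 (50%)"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# as.list() yields one vector per column; duplicated() flags each column whose
# values repeat an earlier column. Column names play no part in the comparison.
dup <- duplicated(as.list(toy))
print(dup)

# Dropping the flagged columns keeps "Day 2 (N=4)" but loses "Day 3 (N=4)".
toy_nodup <- toy[!dup]
print(names(toy_nodup))
```

Note that this approach flags any repeated column, not only adjacent ones, which is another reason the headers need to be rebuilt separately.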
Solution
If we need to change the headers, modify the function as follows
library(stringr)
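Shop_fcn <- function(data){
  data %>%
    group_by(Day) %>%
    mutate(N_nam = n_distinct(Names)) %>%
    group_by(Names,Day,N_nam,Store,Item) %>%
    summarize(n_item = n()) %>%
    # N_nam and Store are kept in the grouping (assumed) because later steps use them
    group_by(Day,N_nam,Store,Item) %>%
    summarize(n_nam = n(),n_item = sum(n_item)) %>%
    # Day and N_nam inside Day_n are assumed; the source shows only the literal pieces
    mutate(pct = round(n_nam/N_nam*100,digits = 1),
           txt = paste0(n_nam," (",pct,"%)"),
           Day_n = paste0("Day ",Day," (N=",N_nam,")")) %>%
    ungroup() %>%
    select(Day_n,Store,Item,txt) %>%
    # days that share the same cell text collapse into one "Day a - b (N=..)" header;
    # Item is added to the grouping (assumed) so that arrange(Store,Item) still works
    group_by(Store,Item,txt) %>%
    summarise(Day_n = if(n() > 1)
      sprintf('Day %s %s',
              paste(range(readr::parse_number(unique(Day_n))),collapse=' - '),
              str_remove(first(Day_n),'^[^(]+'))
      else Day_n) %>%
    pivot_wider(values_from = txt,names_from = Day_n) %>%
    mutate_at(vars(starts_with("Day")),~if_else(is.na(.),"",.)) %>%
    arrange(Store,Item) %>%
    group_by(store2 = Store) %>%
    # blank out repeated store labels; the "" argument is assumed (missing in the source)
    mutate(Store = if_else(row_number() != 1,"",Store)) %>%
    ungroup() %>%
    # Item is kept here (assumed); the day columns are sorted in numeric order
    select(Store,Item,str_sort(names(.)[-(1:2)],numeric = TRUE),-store2)
}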
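The `sprintf()` expression inside `summarise()` is dense, so the following sketch unpacks it on two made-up header strings (the `day_n` vector is an assumption for illustration). It relies only on `readr::parse_number()`, which extracts the first number from a string, and `stringr::str_remove()`, both of which the answer already uses.

```r
library(stringr)

# Two headers that are assumed to belong to columns with identical content.
day_n <- c("Day 2 (N=4)", "Day 3 (N=4)")

# parse_number() picks up the first number in each string: the day.
days <- readr::parse_number(day_n)

# Removing everything before the first "(" leaves the "(N=...)" suffix.
suffix <- str_remove(day_n[1], "^[^(]+")

# range() gives the first and last day, pasted together as "2 - 3".
header <- sprintf("Day %s %s", paste(range(days), collapse = " - "), suffix)
print(header)
```

The suffix is taken from the first header only, so this presumes that the merged days share the same N.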
-Testing
Shop_day <- Shop_list %>%
  bind_rows(Shop_list) %>%
  Shop_fcn()
flextable(Shop_day)
-Output
[output table not reproduced]
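As an alternative angle on the same idea, here is a minimal base R sketch that works on an already-pivoted table: it merges runs of adjacent, identical day columns and relabels each run with its day range. The helper `collapse_day_runs()`, its `id_cols` argument and the `wide` example table are all hypothetical names introduced for illustration, not part of the answer above. It assumes the day columns are already in day order and are named like "Day 2 (N=4)".

```r
# Hypothetical helper: merge runs of adjacent identical "Day" columns.
collapse_day_runs <- function(wide, id_cols) {
  day_cols <- setdiff(names(wide), id_cols)

  # TRUE when a column has exactly the same values as the column to its left.
  same_as_prev <- vapply(seq_along(day_cols), function(i) {
    i > 1 && identical(wide[[day_cols[i]]], wide[[day_cols[i - 1]]])
  }, logical(1))
  run_id <- cumsum(!same_as_prev)   # one id per run of identical columns

  # One header per run; the "(N=...)" suffix comes from the run's first column.
  new_names <- vapply(split(day_cols, run_id), function(nm) {
    days <- as.numeric(sub("^Day (\\d+).*$", "\\1", nm))
    suffix <- sub("^[^(]+", "", nm[1])
    if (length(nm) > 1) {
      sprintf("Day %s %s", paste(range(days), collapse = " - "), suffix)
    } else {
      nm
    }
  }, character(1))

  keep <- day_cols[!same_as_prev]   # first column of every run
  out <- wide[c(id_cols, keep)]
  names(out) <- c(id_cols, unname(new_names))
  out
}

# Made-up example table (values are illustrative only).
wide <- data.frame(
  Item          = c("Nails", "Wood"),
  `Day 1 (N=2)` = c("1 (50%)", ""),
  `Day 2 (N=4)` = c("", "2 (50%)"),
  `Day 3 (N=4)` = c("", "2 (50%)"),
  `Day 4 (N=3)` = c("1 (33.3%)", ""),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
res <- collapse_day_runs(wide, id_cols = "Item")
print(names(res))
```

Unlike `duplicated(as.list(...))`, this only merges columns that sit next to each other, which matches the "consecutive columns" wording of the question. It does not check that N is the same across a run, so that would need an extra test if it matters.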