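How to fix mutate and filter errors in an R function
I have a function that takes a data frame and two other variables (horse and race_date) as input. horse and race_date are used to filter the data frame passed to the function, and a summarise function is then applied to compute the desired output. When I test the function on its own, outside a pipe, everything works, but when I call it from mutate inside a pipe I get the following error message: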
Error: Problem with `mutate()` input `split_Lt`.
x Problem with `filter()` input `..1`.
x Input `..1` must be of size 1, not size 18.
i Input `..1` is `Horse == horse & NewSplit == "LT Races" & race_date < date`.
i The error occurred in group 2: split = "A BIT OF BOTH_var106_Track: CD".
i Input `split_Lt` is `getsplit_LT(splits,horse,race_date)`.
i The error occurred in group 2: split = "A BIT OF BOTH_var106_Track: CD".
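The function is as follows:
getsplit_LT <- function(df, horse, date) {
  # Keep the "LT Races" rows for this horse dated before the given race date
  kpi <- df %>%
    filter(Horse == horse & NewSplit == "LT Races" & race_date < date) %>%
    group_by(split) %>%
    summarise_if(is.numeric, sum) %>%
    mutate(TopAvgB = (E + 3.439) / (R + 3.439 + 25.69)) %>%
    select(TopAvgB)
  # Return 0 when no rows match, otherwise the TopAvgB column
  x <- if (is.data.frame(kpi) && nrow(kpi) == 0) 0 else kpi[[1]]
  return(x)
}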
This is the code I am trying to run:
df <- df %>%
  mutate(split_Lt = getsplit_LT(splits, horse, race_date))
Here is the dput data:
structure(list(horse = c("A BIT OF BOTH","A BIT OF BOTH","A BIT OF BOTH"),race_date = structure(c(17802,17906,17941,17969,18006,18062,18091,18183,18226,18244,18286,18454,18502,18546,18581,18601,18664),class = "Date")),row.names = c(NA,-18L),groups = structure(list(horse = "A BIT OF BOTH",.rows = structure(list(
1:18),ptype = integer(0),class = c("vctrs_list_of","vctrs_vctr","list"))),row.names = 1L,class = c("tbl_df","tbl","data.frame"
),.drop = TRUE),class = c("grouped_df","tbl_df","data.frame"
))
structure(list(split = c("A BIT OF BOTH_var102B_LifeTime: Life","A BIT OF BOTH_var102B_LifeTime: Life","A BIT OF BOTH_var106_Track: CD","A BIT OF BOTH_var106_Track: CT","A BIT OF BOTH_var106_Track: DE","A BIT OF BOTH_var106_Track: FG","A BIT OF BOTH_var106_Track: GP","A BIT OF BOTH_var106_Track: KE","A BIT OF BOTH_var106_Track: MT","A BIT OF BOTH_var106_Track: OT","A BIT OF BOTH_var106_Track: PX","A BIT OF BOTH_var107_Surface: Dirt","A BIT OF BOTH_var107_Surface: Synth","A BIT OF BOTH_var107_Surface: Turf","A BIT OF BOTH_var107_Surface: Turf"
Solution
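The error message points at the likely cause: inside `mutate()`, `horse` and `race_date` refer to whole columns (18 values for this group), so `getsplit_LT()` hands `filter()` a condition of length 18 instead of one value per row of the group being filtered. The sketch below shows the same behaviour on toy data. `races`, `lookup`, and `total_before()` are hypothetical stand-ins, not the asker's objects.

```r
library(dplyr)

# Hypothetical toy data standing in for df and splits
races  <- tibble(horse = c("A", "A", "B"), cutoff = c(2, 3, 2))
lookup <- tibble(Horse = c("A", "A", "B"), day = c(1, 2, 1), E = c(10, 20, 5))

# A helper written for ONE horse and ONE cutoff, in the spirit of getsplit_LT
total_before <- function(tbl, horse, cutoff) {
  hits <- filter(tbl, Horse == horse, day < cutoff)
  sum(hits$E)
}

# Inside mutate(), `horse` is the full column, so its length is the row count
seen <- races %>% mutate(n_args = length(horse))

# Called with scalar arguments, the helper behaves as intended
one_row <- total_before(lookup, "A", 3)
```

Here `seen$n_args` is 3 on every row, which is why a function that expects a single horse and a single date has to be called once per row rather than once per column.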
One approach is to use the purrr::pmap function, which applies a function row-wise over a data.frame.
library(tidyverse)
pmap(df, ~ getsplit_LT(splits, horse = .x, date = .y))
[[1]]
[1] 0.2156712
[[2]]
[1] 0
[[3]]
[1] 0.1070373
[[4]]
[1] 0.1339914
[[5]]
[1] 0.1593659
...
Or, to attach the result to the original data frame:
bind_cols(df, kpi = pmap_dbl(df, ~ getsplit_LT(splits, horse = .x, date = .y)))
# A tibble: 18 x 3
horse race_date kpi
<chr> <date> <dbl>
1 A BIT OF BOTH 2020-09-28 0.216
2 A BIT OF BOTH 2020-01-10 0
3 A BIT OF BOTH 2020-02-14 0.107
4 A BIT OF BOTH 2020-03-14 0.134
5 A BIT OF BOTH 2020-04-20 0.159
6 A BIT OF BOTH 2020-06-15 0.183
7 A BIT OF BOTH 2020-07-14 0.227
...
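If staying inside `mutate()` is preferred, a similar row-by-row call can be written with `purrr::map2_dbl()`. This is only a sketch on toy data, assuming the helper returns exactly one number per call. `races`, `lookup`, and `total_before()` are hypothetical names, not objects from the question.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data standing in for df and splits
races  <- tibble(horse = c("A", "A", "B"), cutoff = c(2, 3, 2))
lookup <- tibble(Horse = c("A", "A", "B"), day = c(1, 2, 1), E = c(10, 20, 5))

# Scalar helper: sums E for one horse over rows with day before the cutoff
total_before <- function(tbl, horse, cutoff) {
  hits <- filter(tbl, Horse == horse, day < cutoff)
  sum(hits$E)
}

# map2_dbl() walks the two columns in parallel and calls the helper once per row
result <- races %>%
  mutate(total = map2_dbl(horse, cutoff, ~ total_before(lookup, .x, .y)))
```

With the real data the same pattern would presumably read `mutate(split_Lt = map2_dbl(horse, race_date, ~ getsplit_LT(splits, .x, .y)))`, which fails if `getsplit_LT()` ever returns more than one value for a row.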
Data:
splits <- read_csv("https://raw.githubusercontent.com/Handicappr/Rstudio_test_project/main/splits.csv")
df <- read_csv("https://raw.githubusercontent.com/Handicappr/Rstudio_test_project/main/df.csv")
splits %>% mutate(race_date = as.Date(race_date,"%m/%d/%y")) -> splits
df %>% mutate(race_date = as.Date(race_date,"%m/%d/%y")) -> df