How to fix mutate and filter errors in an R function

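I have a function that takes a data frame and two other variables (horse and race_date) as input. horse and race_date are used to filter the data frame passed to the function, and a summary function is then applied to compute the required output. When I test the function on its own, outside a pipe, everything works. When I call it from mutate inside a pipe, I get the following error message: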

Error: Problem with `mutate()` input `split_Lt`.
x Problem with `filter()` input `..1`.
x Input `..1` must be of size 1, not size 18.
i Input `..1` is `Horse == horse & NewSplit == "LT Races" & race_date < date`.
i The error occurred in group 2: split = "A BIT OF BOTH_var106_Track: CD".
i Input `split_Lt` is `getsplit_LT(splits,horse,race_date)`.
i The error occurred in group 2: split = "A BIT OF BOTH_var106_Track: CD".

The function is as follows:

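# df: the splits table; horse and date: one horse name and one race date
getsplit_LT <- function(df, horse, date) {

  kpi <- df %>%
    # keep this horse's "LT Races" rows from races before the given date
    filter(Horse == horse & NewSplit == "LT Races" & race_date < date) %>%
    group_by(split) %>%
    summarise_if(is.numeric, sum) %>%
    mutate(TopAvgB = (E + 3.439) / (R + 3.439 + 25.69)) %>%
    select(TopAvgB)

  # return 0 when no rows match, otherwise the TopAvgB column
  x <- if (is.data.frame(kpi) && nrow(kpi) == 0) 0 else kpi[[1]]

  return(x)
}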

This is the code I am trying to run:

df <- df %>%
  mutate(split_Lt = getsplit_LT(splits, horse, race_date))

Here is the dput output of the data:

structure(list(horse = c("A BIT OF BOTH","A BIT OF BOTH","A BIT OF BOTH"),race_date = structure(c(17802,17906,17941,17969,18006,18062,18091,18183,18226,18244,18286,18454,18502,18546,18581,18601,18664),class = "Date")),row.names = c(NA,-18L),groups = structure(list(horse = "A BIT OF BOTH",.rows = structure(list(
    1:18),ptype = integer(0),class = c("vctrs_list_of","vctrs_vctr","list"))),row.names = 1L,class = c("tbl_df","tbl","data.frame"
),.drop = TRUE),class = c("grouped_df","tbl_df","data.frame"
))

structure(list(split = c("A BIT OF BOTH_var102B_LifeTime: Life","A BIT OF BOTH_var102B_LifeTime: Life","A BIT OF BOTH_var106_Track: CD","A BIT OF BOTH_var106_Track: CT","A BIT OF BOTH_var106_Track: DE","A BIT OF BOTH_var106_Track: FG","A BIT OF BOTH_var106_Track: GP","A BIT OF BOTH_var106_Track: KE","A BIT OF BOTH_var106_Track: MT","A BIT OF BOTH_var106_Track: OT","A BIT OF BOTH_var106_Track: PX","A BIT OF BOTH_var107_Surface: Dirt","A BIT OF BOTH_var107_Surface: Synth","A BIT OF BOTH_var107_Surface: Turf","A BIT OF BOTH_var107_Surface: Turf"

Solution

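Inside mutate(), horse and race_date reach the function as whole columns (18 values here) rather than one value at a time, so the condition inside filter() has length 18 and no longer matches the rows being filtered. One approach is to use the purrr::pmap function, which applies a function to a data frame row by row, so each call receives a single horse and a single date.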

library(tidyverse)
pmap(df,~ getsplit_LT(splits,horse = .x,date = .y))
[[1]]
[1] 0.2156712

[[2]]
[1] 0

[[3]]
[1] 0.1070373

[[4]]
[1] 0.1339914

[[5]]
[1] 0.1593659
...
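
As a rough illustration of what "row by row" means here, the sketch below runs pmap over a small made-up data frame. The `toy` table, its `extra` column and the anonymous function are hypothetical and only stand in for `df` and `getsplit_LT`. When a plain function is supplied instead of a `~` formula, pmap passes each column as an argument named after that column, so the function sees one value per call.

```r
library(purrr)
library(tibble)

# Hypothetical toy data; the column names mirror `df` from the question.
toy <- tibble(
  horse     = c("A", "B"),
  race_date = c("2020-01-15", "2020-03-01"),
  extra     = c(10, 20)  # a column the function does not use
)

# pmap_chr() calls the function once per row. Columns are matched to
# arguments by name, and `...` absorbs any columns that are not needed.
labels <- pmap_chr(toy, function(horse, race_date, ...) {
  paste(horse, race_date)
})

labels
# "A 2020-01-15" "B 2020-03-01"
```

Matching by name means the call does not depend on column order, whereas the `.x` / `.y` shorthand in a formula refers to the first and second columns.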

To attach the results to the original data frame as a new column:

bind_cols(df, kpi = pmap_dbl(df, ~ getsplit_LT(splits, horse = .x, date = .y)))
# A tibble: 18 x 3
   horse         race_date    kpi
   <chr>         <date>     <dbl>
 1 A BIT OF BOTH 2020-09-28 0.216
 2 A BIT OF BOTH 2020-01-10 0    
 3 A BIT OF BOTH 2020-02-14 0.107
 4 A BIT OF BOTH 2020-03-14 0.134
 5 A BIT OF BOTH 2020-04-20 0.159
 6 A BIT OF BOTH 2020-06-15 0.183
 7 A BIT OF BOTH 2020-07-14 0.227
...
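
If staying inside a mutate pipeline is preferred, the same row-by-row idea can probably be expressed with purrr::map2_dbl inside mutate. The sketch below uses made-up stand-ins: `splits_toy`, `df_toy` and the simplified helper `sum_earlier` are hypothetical and are not the real data or `getsplit_LT`. It only shows the calling pattern, assuming the helper returns a single number per call.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for `splits` and `df`; the real tables have more columns.
splits_toy <- tibble(
  Horse     = c("A", "A", "B"),
  race_date = as.Date(c("2020-01-01", "2020-02-01", "2020-01-15")),
  E         = c(1, 2, 3)
)
df_toy <- tibble(
  horse     = c("A", "A", "B"),
  race_date = as.Date(c("2020-01-15", "2020-03-01", "2020-01-01"))
)

# Simplified scalar helper (an assumption, not the original function):
# sums E over one horse's races that took place before `date`.
sum_earlier <- function(data, horse, date) {
  hits <- filter(data, Horse == horse, race_date < date)
  sum(hits$E)
}

# map2_dbl() walks horse and race_date in parallel, so the helper is
# called once per row with a single horse and a single date.
result <- df_toy %>%
  mutate(kpi = map2_dbl(horse, race_date, ~ sum_earlier(splits_toy, .x, .y)))

result$kpi
# 1 3 0
```

With the real data, the equivalent call would presumably be `mutate(split_Lt = map2_dbl(horse, race_date, ~ getsplit_LT(splits, .x, .y)))`, provided `getsplit_LT` returns exactly one value for every row.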

Data:

splits <- read_csv("https://raw.githubusercontent.com/Handicappr/Rstudio_test_project/main/splits.csv")
df <- read_csv("https://raw.githubusercontent.com/Handicappr/Rstudio_test_project/main/df.csv")
splits %>% mutate(race_date = as.Date(race_date,"%m/%d/%y")) -> splits
df %>% mutate(race_date = as.Date(race_date,"%m/%d/%y")) -> df