
Returning multiple different data frames with foreach

How can I get a foreach loop to return multiple different data frames?

I have a foreach loop that runs several functions, each of which produces a distinct data frame with partly different column names and a different number of columns. I would like to adjust the foreach loop so that it returns two data frames, each containing the rbind() of all iterations of one function (i.e. one data frame for function1 and one for function2).

So far, my best attempt only produces the data frame made up of the rbind()-ed iterations of the first function. Please see the code below.

Expected output: one data frame with the results produced by function1(), stacked into 10 rows (1 row per iteration), and a second data frame built the same way from the results of function2().

Note: in my actual work I am running permutations, so both functions must be used inside the same loop; they cannot be split across two separate loops.

if (!require("pacman")) install.packages("pacman")
pacman::p_load(foreach, doParallel, gtools, plyr)  # plyr provides ldply() used below


#make a dataframe
participant <- 1:10
group <- rep(c(1, 2), each = 5)  # two groups of 5 (c(1,1,2) does not recycle to length 10)
latency <- c(1.2,1.4,1.5,1.3,1.7,2.2,2.3,0.7,0.9,1.1)
data <- data.frame(participant, group, latency)

#does some stuff
function1 <- function(){
  ttest <- t.test(latency ~ group,data = data,paired = FALSE)
  df <- as.data.frame(cbind(ttest$statistic,ttest$p.value))
  return(df)
}
#does some different stuff
function2 <- function(){
  ttest <- t.test(latency ~ group, data = data, paired = FALSE)
  # wider than function1: also keeps the two confidence-interval bounds
  df <- as.data.frame(cbind(ttest$statistic, ttest$p.value,
                            ttest$conf.int[1], ttest$conf.int[2]))
  return(df)
}

#both produce differently sized dataframes
df <- function1()
df2 <- function2()
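For reference, both helpers extract standard components of the "htest" object that t.test() returns. A quick self-contained sketch, using the same example data as above:

```r
# Sketch: the components of the "htest" object returned by t.test(),
# which function1() and function2() pull out.
latency <- c(1.2, 1.4, 1.5, 1.3, 1.7, 2.2, 2.3, 0.7, 0.9, 1.1)
group <- rep(c(1, 2), each = 5)

tt <- t.test(latency ~ group, paired = FALSE)
tt$statistic  # named numeric: the t statistic
tt$p.value    # the two-sided p value
tt$conf.int   # numeric vector of length 2: the confidence interval
```

Because conf.int has length 2, function2() ends up with two more columns than function1(), which is why the two result data frames cannot simply be rbind()-ed into one.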


#set up  repetitions
permutation <- 1:10

#set up the dopar version
#number of cores
n_cores <- parallel::detectCores() - 1

# This part you always need. You can drop the outfile = "mylog.txt" argument later,
# but it is useful for debugging: all output from the %dopar% part goes into that
# file, and you can open it in RStudio to watch it in real time.
cl <- makeCluster(rep("localhost", n_cores), outfile = "mylog.txt")


# this function will be exported to the workers and will do nothing but evaluate library("packagename") for each package.
export_packages <- function(package_list){
  for(pac in package_list){
    eval(bquote(library(.(pac))))
    
  }
}
# this is a list/collection of all packages
package_list <- (.packages())
# Now we export the entire global environment, including the list of loaded
# packages, plus the function that loads them on the workers
clusterExport(cl, as.list(names(as.list(.GlobalEnv))))
# instead of writing library("packagename") directly as above, we tell each worker
# to evaluate export_packages() with the package_list that was exported to it
clusterEvalQ(cl, c(export_packages(package_list)))
# note: registerDoParallel() is given the cluster object here, instead of being
# registered earlier with just a core count
registerDoParallel(cl)
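If you do not need the log file or the manual package export, the backend setup above can be reduced to a much shorter sketch (assuming the doParallel defaults are acceptable for your workload):

```r
# Minimal alternative: let doParallel create and manage the cluster itself.
library(doParallel)
n_cores <- max(1L, parallel::detectCores() - 1L)
registerDoParallel(cores = n_cores)
```

The explicit makeCluster()/clusterExport() route in the original code is still the better choice when you want the log file or fine-grained control over what reaches the workers.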


#Note: this code only saves the first data frame created, not the second.
#I want to save both data frames to the list, and then create 2 final data frames,
#each containing only the output of one function.
system.time({
  dataframe <- foreach(i = permutation) %dopar% {
    
        #run multiverse
    
    
    df <- function1()
    #count iterations
    count <- i

    #save as dataset
    df_prelim <- as.data.frame(cbind(df, count))
    return(df_prelim)   # NOTE: this return() ends the iteration, so the
                        # function2() part below is never reached

    #second output dataset
    df2 <- function2()
    df2_prelim <- as.data.frame(cbind(df2, count))
    return(df2_prelim)
    
  }
  #turn the list into a dataframe
  #(as written, this yields only one dataframe, from function1)
  df_final <- ldply(dataframe, data.frame)
})


stopCluster(cl)
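One way to get both data frames out of a single loop: make each iteration's value a named list holding both results, since foreach returns whatever the last expression of the body evaluates to, and then row-bind each component after the loop. A sketch, reusing the data, function1(), function2(), permutation, and registered cluster from above (the clusterExport() call already ships the functions to the workers):

```r
# Sketch of a fix: each iteration returns a named list with BOTH data
# frames; foreach then yields a list of such lists.
results <- foreach(i = permutation) %dopar% {
  res1 <- cbind(function1(), count = i)
  res2 <- cbind(function2(), count = i)
  list(first = res1, second = res2)  # last expression = iteration's value
}

# One final data frame per function, 10 rows each (1 row per iteration).
df_final  <- do.call(rbind, lapply(results, `[[`, "first"))
df2_final <- do.call(rbind, lapply(results, `[[`, "second"))
```

Because foreach's default .combine simply collects results into a list, no special combine function is needed; the splitting into two data frames happens after the loop.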
