Is this the correct way to "parallelize" code in R?
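I am using the R programming language. I came across this link, which shows how to "parallelize" your code: https://www.r-bloggers.com/2017/10/running-r-code-in-parallel/
As far as I understand, "parallelizing" means strategically allocating your computer's resources (for example, spreading the work across several CPU cores) so that your code runs faster.
For example, I can run the code below on my computer, but it takes a while to run: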
# load libraries
library(mopsocd)
library(dplyr)
# create some data for this example
a1 = rnorm(1000, 100, 10)
b1 = rnorm(1000, 10)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1, b1, c1)
#define function:
funct_set <- function (x) {
  # bin data according to the candidate thresholds in x
  train_data <- train_data %>%
    mutate(cat = ifelse(a1 <= x[1] & b1 <= x[3], "a",
                 ifelse(a1 <= x[2] & b1 <= x[4], "b", "c")))
  train_data$cat = as.factor(train_data$cat)

  # new splits (c1 is kept in every split because "quant" is computed from it)
  a_table = train_data %>%
    filter(cat == "a") %>%
    select(a1, c1, cat)
  b_table = train_data %>%
    filter(cat == "b") %>%
    select(a1, c1, cat)
  c_table = train_data %>%
    filter(cat == "c") %>%
    select(a1, c1, cat)

  # calculate "quant" for each bin: 1 if c1 exceeds the bin's threshold, else 0
  table_a = data.frame(a_table %>% group_by(cat) %>%
    mutate(quant = ifelse(c1 > x[5], 1, 0)))
  table_b = data.frame(b_table %>% group_by(cat) %>%
    mutate(quant = ifelse(c1 > x[6], 1, 0)))
  table_c = data.frame(c_table %>% group_by(cat) %>%
    mutate(quant = ifelse(c1 > x[7], 1, 0)))

  f1 = mean(table_a$quant)
  f2 = mean(table_b$quant)
  f3 = mean(table_c$quant)

  # group all tables
  final_table = rbind(table_a, table_b, table_c)
  # calculate the total mean : this is what needs to be optimized
  f4 = mean(final_table$quant)

  return(c(f1, f2, f3, f4))
}
# constraint function: each element is TRUE when the constraint holds
gn <- function(x) {
  g1 <- x[2] - x[1] > 0
  g2 <- x[4] - x[3] > 0
  g3 <- x[7] - x[6] > 0
  g4 <- x[6] - x[5] > 0
  return(c(g1, g2, g3, g4))
}
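## Set Arguments
varcount <- 7
fncount <- 4
# NOTE: varcount is 7, but lbound lists 5 values and ubound lists 4 as posted here.
# Each bound vector presumably needs one entry per variable, so the missing
# values have to be filled in before this will run.
lbound <- c(80,90,80,200,300)
ubound <- c(90,110,300,500)
optmin <- 0
# desired part to speed up
ex1 <- mopsocd(funct_set, gn, varcnt = varcount, fncnt = fncount,
               lowerbound = lbound, upperbound = ubound, opt = optmin)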
Suppose I want to "speed up" the last part of the code above:
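# part to speed up (the same call as at the end of the code above)
ex1 <- mopsocd(funct_set, gn, varcnt = varcount, fncnt = fncount,
               lowerbound = lbound, upperbound = ubound, opt = optmin)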
Following the instructions on the website, you first need to check how many cores your computer has:
library(parallel)
detectCores()
# [1] 8
cl <- makeCluster(8)
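The number of workers does not have to be hard-coded. A minimal sketch, assuming it is acceptable to leave one core free for the rest of the system; `n_cores`, `n_workers` and `cl_demo` are illustrative names that do not appear in the post:

```r
library(parallel)

# detectCores() may return NA on some platforms, so fall back to a single core
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Assumption: keep one core free, but always start at least one worker
n_workers <- max(1L, n_cores - 1L)

cl_demo <- makeCluster(n_workers)
length(cl_demo)   # number of worker processes that were started

# Release the workers once they are no longer needed
stopCluster(cl_demo)
```

Starting one worker per reported core also works, but it can leave the machine unresponsive while a long job runs.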
From here, you can now "parallelize" the code:
#parallelize code
results <- parSapply(cl,train_data,mopsocd(funct_set,opt=optmin))
# close cluster object
stopCluster(cl)
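For comparison, here is a minimal sketch of how `parSapply` is usually called, using only the base `parallel` package. `slow_square` and `shift_by` are hypothetical stand-ins for an expensive, independent task and are not part of the code above. The third argument is a function object that the workers apply to each element of the second argument. In the call above, by contrast, `mopsocd(...)` is written as an already-evaluated call, so it is run on the main R process before any work can be handed to the cluster.

```r
library(parallel)

cl <- makeCluster(2)

# Hypothetical stand-in for an expensive task that can run independently per input
shift_by <- 10
slow_square <- function(i) {
  Sys.sleep(0.1)          # pretend this is heavy work
  i^2 + shift_by
}

# Workers start as fresh R sessions, so copy over the objects the function uses.
# Packages would be loaded on the workers with clusterEvalQ(cl, library(...)).
clusterExport(cl, "shift_by", envir = environment())

# FUN is the function itself, not the result of calling it
results <- parSapply(cl, 1:8, slow_square)

stopCluster(cl)
print(results)
# [1] 11 14 19 26 35 46 59 74
```

This pattern only helps when the work splits into pieces that do not depend on each other, such as repeated runs with different inputs. It does not by itself make a single long-running function call faster.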
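Question: The line that creates the "results" object is still running on my computer. Can someone tell me whether I have "parallelized" my code correctly?
Thanks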