How to vectorize a nested for loop
I want to vectorize this nested "for" loop. There are two reasons for wanting this:
- It should run faster.
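- Although the code works on the example data, it does not quite work when I run it on my real data (it works fine when I expect a negative or positive result, but it does not when I expect a result of zero). I want to vectorize it first and see whether that helps. The reason I think it might help is that the problem occurs in the second loop, while the code is looping through X.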
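I have spent two days on Google and have read the questions here about vectorizing loops, but I still cannot work it out myself. In theory, I can see that having a list of n indexed versions of data frame X for the second loop to refer to (rather than creating data frame X on every iteration) might solve my problem, but I have not even managed that, let alone the vectorization.
In short, the function takes an Excel file of input data and uses another Excel file (effectively a map/lookup table) to specify which cells of a third "calculator" Excel workbook the input data should go into. That is, the map/lookup table specifies where in the "calculator" workbook each input value is placed. In this example, the "calculator" workbook is exampleworkbook.xlsx, which is created in the first code block.
Thanks.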
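The "list of X" idea described above can be sketched in base R without any Excel dependency. This is only an illustration under assumptions: the small `inputdata` and `lookup` tables below are hypothetical stand-ins (the `Variable` column and the destination values are invented for the example), and the approach shown, reshaping to long format and merging once, is one possible way to avoid rebuilding X on every iteration rather than a confirmed fix.

```r
# Hypothetical example data, loosely modelled on the variables in the question.
inputdata <- data.frame(id   = 1:3,
                        var1 = c(5, 4, 2),
                        var2 = c(25, 11, 9))

# Hypothetical lookup table: one row per variable with its destination cell.
lookup <- data.frame(Variable  = c("var1", "var2"),
                     DestSheet = c(1, 1),
                     DestCol   = c(1, 2),
                     DestRow   = c(1, 2))

vars <- lookup$Variable

# Reshape the input to long format: one row per (id, variable) pair.
long <- data.frame(id       = rep(inputdata$id, times = length(vars)),
                   Variable = rep(vars, each = nrow(inputdata)),
                   Value    = unlist(inputdata[vars], use.names = FALSE))

# Attach the destination of every value with a single merge.
long <- merge(long, lookup, by = "Variable")

# One small data frame per id, which a loop (or lapply) can index into.
per_id <- split(long, long$id)

print(per_id[["1"]])
```

Each element of `per_id` plays the role of X for one input row, so the merge happens once instead of once per iteration. Note that `unlist()` would coerce `Value` to character if the selected columns mixed strings and numbers, so real data of that kind may need separate handling.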
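You will need to set a directory so that the example "calculator" Excel workbook (with the necessary formulae) can be saved there and then loaded back in: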
## Set directory
workingdir <-"yourfilepath"
setwd(workingdir)
## Load packages
library(readxl) # for reading in Excel sheets
library("XLConnect") # needs Java v6 or higher
if (!require('openxlsx')) install.packages('openxlsx')
library(openxlsx) # to create Excel workbooks,no dependency on Java
## Create a blank workbook
wb <- createWorkbook()
## Add two sheets to the workbook
addWorksheet(wb,"Sheet 1")
addWorksheet(wb,"Sheet 2")
## Name column 1 in sheet 2
writeData(wb,"Sheet 2","colsums",startCol = 1,startRow = 1)
## Specify formulae to be used
v <- c("SUM('Sheet 1'!$A$1:$A$10)","SUM('Sheet 1'!$B$1:$B$10)","SUM('Sheet 1'!$C$1:$C$10)","SUM('Sheet 1'!$D$1:$D$10)")
## Write formulae into column 1 of sheet 2
writeFormula(wb,sheet = 2,x = v,startRow = 2)
## Save workbook to working directory
saveWorkbook(wb,"exampleworkbook.xlsx")
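Now here is the code I want to vectorize:
(Note: the dummy input data here are all numeric to keep things simple, but the real data contain both strings and numbers.)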
## Load the workbook
exampleworkbook <- XLConnect::loadWorkbook("exampleworkbook.xlsx")
## Keep the formatting of the original document
setStyleAction(exampleworkbook,XLC$"STYLE_ACTION.NONE")
## Create example data
inputdata <- data.frame(id = 1:3,var1 = c(5,4,2),var2 = c(25,11,9),var3 = c(8,5,11),var4 = c(1,2,3))
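## Assumed values: one destination per variable (NA for id), so that each
## vector has five entries matching the five row names set below
lookup <- data.frame(DestSheet = c(NA,1,1,1,1),DestCol = c(NA,1,2,3,4),DestRow = c(NA,1,2,3,4) )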
row.names(lookup) = c("id","var1","var2","var3","var4")
## Write the function
getresult <- function(DF){
  output = data.frame(fix.empty.names = FALSE) # create an output df to hold the results
  for (a in seq_len(nrow(DF))) { # loop through rows of 'inputdata'
    X = DF[a,] # take row a as a one-row data frame
    X <- t(X) # transpose X (to later allow it to be merged with 'lookup' df)
    ID = X["id",]
    print.default(a) # so can see which iteration is occurring
    ## Add row numbers to enable sorting back into original order after merge
    ## (because IDs are strings in the real thing and it's easier to
    ## trouble-shoot if the variables in X are in the original order):
    X <- cbind(X,seq.int(nrow(X)) )
    X <- merge(X,lookup,by = "row.names") # gives destinations of each variable
    X <- X[!is.na(X$DestCol),] # removes unnecessary data i.e. ID variable
    X <- setNames(X,c("Variable","Value","OrigRow","DestSheet","DestCol","DestRow"))
    X <- X[order(X$OrigRow),]
    #X <- X[X$Variable != "var5",] # need to be able to remove variables if desired
    for(i in seq_len(nrow(X))){
      ## For loop extracts the value for each variable then
      ## writes it to the specified destination cell in the Excel worksheet
      value = X$Value[i]
      sheet = X$DestSheet[i]
      row = X$DestRow[i]
      col = X$DestCol[i]
      writeWorksheet(exampleworkbook,data = value,sheet = sheet,startRow = row,startCol = col,header = FALSE)
    }
    ## Read results from sheet 2,startRow = 1,endRow = 6,endCol = 1
    ## (startCol = 1 is assumed, since the results are in column 1);
    ## returns a data.frame
    results = readWorksheet(exampleworkbook,sheet = 2,startRow = 1,startCol = 1,endRow = 6,endCol = 1)
    results[is.na(results)] = 0
    results <- setNames(results,"Value")
    ## create a df containing results and their IDs
    results <- data.frame(c(ID,results$Value[1],results$Value[2],results$Value[3],results$Value[4]),fix.empty.names = FALSE)
    output <- rbind(output,t(results))
  }
  ## rename columns
  output <- setNames(output,c("ID","sumColA","sumColB","sumColC","sumColD"))
  return(output)
}
getresult(inputdata)
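Because the note before the function says the real data mix strings and numbers, it may be worth checking what `t()` does to such a row. The sketch below uses made-up rows (`mixed_row` and `numeric_row` are hypothetical names and values) and relies only on base R behavior: `t()` converts a data frame to a matrix, and a matrix holds a single type, so numeric values sitting next to a string become character. Whether this is related to the zero-result problem is not established here; it is only something to rule out.

```r
# Hypothetical row mixing a string ID with numeric values.
mixed_row <- data.frame(id = "A01", var1 = 5, var2 = 25,
                        stringsAsFactors = FALSE)

# Transposing goes through as.matrix(), which needs one common type.
X <- t(mixed_row)

print(typeof(X))     # storage type of the transposed row
print(X["var1", ])   # the value that would later be written to Excel

# An all-numeric row keeps its numeric type after transposing.
numeric_row <- data.frame(id = 1, var1 = 5, var2 = 25)
print(typeof(t(numeric_row)))
```

If the values turn out to be character in the real run, converting them back (for example with `as.numeric()` on the numeric variables) before writing to the workbook would be one thing to try.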