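How to fix a loop over simple linear regressions
My data set has many predictor variables, and I want to run a simple linear regression on each one, so I wrote a loop. My code is as follows: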
m = ncol(finalmev)
predictorlist = colnames(finalmev)[2:m]
for (i in predictorlist){
model <- summary(lm(paste("ODR ~",i[[1]]),data=finalmev))
}
However, after running the loop I get the following error:
> for (i in predictorlist){
+ model <- summary(lm(paste("ODR ~",i[[1]]),data=finalmev))
+ }
Error in str2lang(x) : <text>:1:25: unexpected numeric constant
1: ODR ~ Overnight.Deposit 1
Solution
The current code overwrites model on every iteration. You probably want to create a list to store the results.
predictorlist = colnames(finalmev)[-1]
model_list <- vector('list',length(predictorlist))
for (i in seq_along(predictorlist)) {
model_list[[i]] <- summary(lm(paste("ODR ~",predictorlist[i]),data=finalmev))
}
Or use lapply:
result <- lapply(predictorlist,function(x) summary(lm(paste("ODR ~",x),data=finalmev)))
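Once the summaries are in a list, naming the elements after the predictors makes them easier to look up. The following is only a minimal sketch: the toy data frame and the column names x1 and x2 are made up for illustration and stand in for the real finalmev.

```r
# Toy data standing in for finalmev (values and column names are assumptions)
set.seed(1)
finalmev <- data.frame(ODR = 1:6, x1 = rnorm(6), x2 = rnorm(6))

predictorlist <- colnames(finalmev)[-1]

# One summary per predictor, stored in a list
result <- lapply(predictorlist, function(x) {
  summary(lm(paste("ODR ~", x), data = finalmev))
})

# Name each element after its predictor so it can be retrieved by name
names(result) <- predictorlist

# Pull the slope estimate out of every summary
slopes <- sapply(result, function(s) coef(s)[2, "Estimate"])
slopes
```

After this, result[["x1"]] returns the summary for that predictor, and slopes is a named numeric vector with one entry per predictor.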
It looks like you have a column with a space in its name. You therefore need to quote the name with backticks, as shown below:
# create a data set
set.seed(1)
finalmev <- data.frame(ODR = 1:4,`Overnight.Deposit 1` = rnorm(4),`Overnight.Deposit 2` = rnorm(4),check.names = FALSE)
# reproduce the error
predictorlist <- colnames(finalmev)[2:NCOL(finalmev)]
for (i in predictorlist){
model <- summary(lm(paste("ODR ~",i[[1]]),data=finalmev))
}
#R> Error in str2lang(x) : <text>:1:25: unexpected numeric constant
#R> 1: ODR ~ Overnight.Deposit 1
#R> ^
# fix the error using quotes
for (i in predictorlist)
model <- summary(lm(sprintf("ODR ~ `%s`",i),data=finalmev))
# actually save all the output as pointed out by Ronak Shah
res <- lapply(
tail(colnames(finalmev),-1),function(x) eval(bquote(summary(lm(.(sprintf("ODR ~ `%s`",x)),data=finalmev)))))
# show the result
res
#R> [[1]]
#R>
#R> Call:
#R> lm(formula = "ODR ~ `Overnight.Deposit 1`",data = finalmev)
#R>
#R> Residuals:
#R> 1 2 3 4
#R> -0.9534 -0.5809 1.2087 0.3256
#R>
#R> Coefficients:
#R> Estimate Std. Error t value Pr(>|t|)
#R> (Intercept) 2.4386 0.5950 4.098 0.0547 .
#R> `Overnight.Deposit 1` 0.7746 0.6213 1.247 0.3387
#R> ---
#R> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#R>
#R> Residual standard error: 1.186 on 2 degrees of freedom
#R> Multiple R-squared: 0.4374,Adjusted R-squared: 0.156
#R> F-statistic: 1.555 on 1 and 2 DF,p-value: 0.3387
#R>
#R>
#R> [[2]]
#R>
#R> Call:
#R> lm(formula = "ODR ~ `Overnight.Deposit 2`",data = finalmev)
#R>
#R> Residuals:
#R> 1 2 3 4
#R> -1.6293 0.3902 0.2308 1.0083
#R>
#R> Coefficients:
#R> Estimate Std. Error t value Pr(>|t|)
#R> (Intercept) 2.3372 0.7282 3.209 0.0849 .
#R> `Overnight.Deposit 2` 0.8865 1.1645 0.761 0.5260
#R> ---
#R> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#R>
#R> Residual standard error: 1.392 on 2 degrees of freedom
#R> Multiple R-squared: 0.2247,Adjusted R-squared: -0.163
#R> F-statistic: 0.5795 on 1 and 2 DF,p-value: 0.526
#R>
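I use eval(bquote(...)) to get nicer output: the Call line then shows the actual formula. Note that you can change colnames(finalmev)[2:ncol(finalmev)] to tail(colnames(finalmev),-1). As mentioned above, Ronak Shah points out that your for loop only keeps the output of the last iteration.
Two other options are: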
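To see what eval(bquote(...)) changes, compare the call stored in a model fitted directly from a variable with one where the formula string is spliced in first. This is a small sketch on made-up data; the names d, frm, plain and spliced are only for illustration.

```r
# Toy data with a non-syntactic column name (values are made up)
set.seed(1)
d <- data.frame(y = 1:4, `a b` = rnorm(4), check.names = FALSE)

frm <- "y ~ `a b`"

# Fitted directly: the stored call only records the variable name frm
plain <- lm(frm, data = d)

# bquote() substitutes the value of frm into the call before eval() runs it,
# so the stored call records the formula string itself
spliced <- eval(bquote(lm(.(frm), data = d)))

print(plain$call)
print(spliced$call)

# The fitted coefficients are the same either way
all.equal(coef(plain), coef(spliced))
```

Only the recorded call differs, which is what makes the printed summaries easier to tell apart when many models are fitted in a loop.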
# move out sprintf
res1 <- lapply(sprintf("ODR ~ `%s`",tail(colnames(finalmev),-1)),function(frm) eval(bquote(summary(lm(.(frm),data = finalmev)))))
# in R 4.1.0 or greater
res2 <- tail(colnames(finalmev),-1) |>
sprintf(fmt = "ODR ~ `%s`") |>
lapply(\(frm) eval(bquote(summary(lm(.(frm),data = finalmev)))))
# we get the same
all.equal(res,res2)
#R> [1] TRUE
all.equal(res1,res2)
#R> [1] TRUE