Lasso regression issues: lambda and the confusion matrix

I am trying to run a lasso regression on employee turnover, following code I found at this link: https://www.kaggle.com/acasalan/predict-bank-turnover-lasso-regression

Two things in my results look strange to me:

  1. lambda.min and lambda.1se are equal;
  2. the confusion matrix shows no positives.

The code is as follows:

# Packages used below (Dati3 is my data frame, not shown here)
library(magrittr) # %>% pipe
library(caret)    # createDataPartition()
library(glmnet)   # glmnet(), cv.glmnet()

# Split the data into training and test sets
set.seed(123) # TODO: look up what this value means
training.samples <- Dati3$Dimissioni %>%
  createDataPartition(p = 0.7, list = FALSE) # 70% to build the model, 30% to evaluate it
train.data <- Dati3[training.samples, ]
test.data <- Dati3[-training.samples, ]

# Predictor matrix without the intercept column
x <- model.matrix(Dimissioni ~ ., train.data)[, -1]
# Outcome variable, used as stored (no conversion to numeric is done here)
y <- train.data$Dimissioni

# glmnet() [glmnet package] computes penalized logistic regression;
# alpha = 1 selects the lasso penalty
glmnet(x, y, family = "binomial", alpha = 1, lambda = NULL)

# Find the best lambda using cross-validation
set.seed(123)
cv.lasso <- cv.glmnet(x, y, alpha = 1, family = "binomial")
# The left dashed vertical line marks the log of the lambda that minimizes
# the cross-validated prediction error (approximately -5 here)
plot(cv.lasso)

cv.lasso$lambda.min # lambda with the minimum cross-validated error
cv.lasso$lambda.1se # largest lambda within one standard error of that minimum (simplest such model)
# Both return the same value: 0.008018156

# Using lambda.min as the best lambda gives the following regression coefficients
coef(cv.lasso, cv.lasso$lambda.min)

# Final model with lambda.min (identical with lambda.1se, since they are equal)
lasso.model2 <- glmnet(x, y, alpha = 1, family = "binomial",
                       lambda = cv.lasso$lambda.min)

# Make predictions on the test data
x.test <- model.matrix(Dimissioni ~ ., test.data)[, -1]
# No type argument is given, so predict() uses glmnet's default, type = "link"
probabilities2 <- lasso.model2 %>% predict(newx = x.test)
predicted.classes2 <- ifelse(probabilities2 > 0.5, "pos", "neg")

# Model accuracy
observed.classes2 <- test.data$Dimissioni
mean(predicted.classes2 == observed.classes2)

# Confusion matrix (rows: predicted, columns: observed)
second <- table(predicted.classes2, observed.classes2)
second

# Precision: share of predicted turnover cases that are actual turnover cases
round(second[2, 2] / (second[2, 2] + second[2, 1]), 4)
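
On the second point, one thing that may be worth checking is the scale of the predictions. In glmnet, predict() on a binomial fit defaults to type = "link", which returns log-odds rather than probabilities, while type = "response" returns probabilities. The sketch below uses base R only, with made-up log-odds and hypothetical observed labels (not the data from this question), to show how a 0.5 cutoff behaves on each scale.

```r
# Made-up log-odds, standing in for link-scale predictions (illustrative only).
log_odds <- c(-2.0, -0.3, 0.2, 0.8, 3.0)

# plogis() is the logistic function: it converts log-odds to probabilities.
prob <- plogis(log_odds)

# A 0.5 cutoff on log-odds is stricter than a 0.5 cutoff on probabilities,
# because probability > 0.5 is equivalent to log-odds > 0.
on_link     <- ifelse(log_odds > 0.5, "pos", "neg")
on_response <- ifelse(prob > 0.5, "pos", "neg")

print(data.frame(log_odds, prob = round(prob, 3), on_link, on_response))

# Hypothetical observed labels, only to show the table layout.
observed <- c("neg", "neg", "pos", "pos", "pos")

# Fixing the factor levels keeps both rows and both columns in the table
# even when one class is never predicted.
lv <- c("neg", "pos")
conf_link <- table(predicted = factor(on_link, levels = lv),
                   observed  = factor(observed, levels = lv))
print(conf_link)
```

If this is what is happening, calling predict() with type = "response" before applying the cutoff would put the threshold on the probability scale. Whether that alone explains the missing positives depends on the data, which is not shown here.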

These are the confusion matrix results:

[Image: confusion matrix output]

Thank you for your help.
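
On the first point, lambda.min and lambda.1se are not guaranteed to differ. cv.glmnet reports lambda.1se as the largest lambda whose mean cross-validated error is within one standard error of the minimum, so the two coincide whenever every larger lambda already exceeds that threshold. The following is a minimal base R sketch of that rule. The helper name one_se_rule and all numbers are hypothetical, chosen only for illustration, and it is not glmnet's own code.

```r
# Hypothetical helper: applies the one-standard-error rule to a CV summary.
# lambda: candidate penalties; cvm: mean CV error; cvsd: its standard error.
one_se_rule <- function(lambda, cvm, cvsd) {
  i_min     <- which.min(cvm)
  threshold <- cvm[i_min] + cvsd[i_min]
  list(
    lambda_min = lambda[i_min],
    # Largest lambda whose mean error stays within the threshold.
    lambda_1se = max(lambda[cvm <= threshold])
  )
}

lambda <- c(0.100, 0.050, 0.020, 0.008, 0.004)
cvsd   <- rep(0.010, 5)

# Case 1 (made-up numbers): the error jumps as soon as lambda grows past
# the minimum, so no larger lambda qualifies.
steep <- one_se_rule(lambda, c(0.700, 0.690, 0.680, 0.600, 0.605), cvsd)

# Case 2 (made-up numbers): the error curve is flat near the minimum,
# so a larger lambda qualifies.
flat <- one_se_rule(lambda, c(0.700, 0.690, 0.605, 0.600, 0.605), cvsd)

print(steep) # lambda_min and lambda_1se are equal
print(flat)  # lambda_1se is larger than lambda_min
```

Comparing cv.lasso$cvm and cv.lasso$cvsd along cv.lasso$lambda in the same way would show which case applies to a fitted object.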
