How to handle hyperparameter tuning and nested cross-validation for an SVM model
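I am trying to find the "best" SVM with a polynomial kernel; as I understand it, the task is to find the best set of hyperparameters. I am using nested cross-validation to minimize bias. What I don't get is this: if the outer cross-validation (say, 10-fold) gives me a different best set of hyperparameters for each split, how do I choose the best overall set? My end goal is to report a model with a single set of hyperparameters that maximizes accuracy.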
from numpy import mean, std
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# X and y are assumed to be NumPy arrays holding the features and labels
cv_outer = StratifiedKFold(n_splits=3, shuffle=True, random_state=41)
outer_results = list()
for train_index, test_index in cv_outer.split(X, y):
    # split data
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    y_train = y_train.ravel()
    # configure the cross-validation procedure
    # (shuffle=True is required when random_state is set)
    cv_inner = StratifiedKFold(n_splits=2, shuffle=True, random_state=41)
    # define the model
    model = SVC(kernel='poly')
    # define search space
    space = dict()
    space['C'] = [0.1, 1, 10, 100]
    space['degree'] = [2, 4]
    # define search
    search = GridSearchCV(model, space, scoring='accuracy', cv=cv_inner, refit=True)
    # execute search
    result = search.fit(X_train, y_train)
    # get the best performing model fit on the whole training set
    best_model = result.best_estimator_
    # evaluate model on the hold out dataset
    yhat = best_model.predict(X_test)
    # evaluate the model
    acc = accuracy_score(y_test, yhat)
    # store the result
    outer_results.append(acc)
    # report progress
    print('>acc=%.3f, est=%.3f, cfg=%s' % (acc, result.best_score_, result.best_params_))
# summarize the estimated performance of the model
print('Accuracy: %.3f (%.3f)' % (mean(outer_results), std(outer_results)))
# note: this prints only the model selected in the last outer fold
print(best_model)
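One common way to read nested cross-validation is that the outer loop scores the whole "grid search plus refit" procedure, not any particular hyperparameter set. Under that reading, a single reportable configuration can be obtained by running the same inner search once more on all of the data. The following is a minimal sketch of that idea, not a definitive answer: the synthetic data from `make_classification` is a stand-in for the real `X` and `y`, and the names `final_search`, `final_params` and `final_model` are introduced here for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; replace with the real X and y).
X, y = make_classification(n_samples=200, n_features=10, random_state=41)

# Same search space and fold settings as in the question.
space = {'C': [0.1, 1, 10, 100], 'degree': [2, 4]}
cv_inner = StratifiedKFold(n_splits=2, shuffle=True, random_state=41)
cv_outer = StratifiedKFold(n_splits=3, shuffle=True, random_state=41)

# Step 1: nested CV. Each outer fold reruns the full grid search on its
# training part, so the scores describe the tuning procedure as a whole.
search = GridSearchCV(SVC(kernel='poly'), space, scoring='accuracy',
                      cv=cv_inner, refit=True)
outer_scores = cross_val_score(search, X, y, scoring='accuracy', cv=cv_outer)
print('Nested CV accuracy: %.3f (%.3f)' % (outer_scores.mean(), outer_scores.std()))

# Step 2: run the same search once on all of the data to obtain a single
# hyperparameter set and a model refit on the full dataset.
final_search = GridSearchCV(SVC(kernel='poly'), space, scoring='accuracy',
                            cv=cv_inner, refit=True)
final_search.fit(X, y)
final_params = final_search.best_params_
final_model = final_search.best_estimator_
print('Final configuration:', final_params)
```

In this sketch the accuracy to report would be the nested estimate from step 1, while `final_params` from step 2 is the single configuration; the per-fold configurations printed by the loop in the question then mainly show how stable the selection is.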