How to resolve an error in LightGBM

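In short, my initial df has a column containing probabilities from an external prediction model, and I want to compare them with the predictions generated by my LightGBM model. First, I applied a train/test split to my data, which includes my old_predictions column: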

from sklearn.model_selection import train_test_split

X = df[['A','B','C','old_predictions']]
Y = df['outcome']
seed = 47
X_train,X_test,y_train,y_test = train_test_split(X,Y,test_size=0.2,random_state=seed)

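However, I do not want old_predictions to be included as a feature in my LightGBM model, so I created a separate df from the X_test data (to which I will later append the LightGBM predicted probabilities) and dropped the column from X_test and X_train: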

pred_df = X_test
X_test.drop(['old_predictions'],axis = 1,inplace = True)
X_train.drop(['old_predictions'],axis = 1,inplace = True)
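
One detail worth checking in the step above: assigning `pred_df = X_test` binds a second name to the same DataFrame rather than copying it, so an in-place drop on X_test also removes the column from pred_df. A minimal sketch of the difference, using a small made-up frame in place of the real X_test (the values and the name `alias_df` are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for X_test; the values are made up for illustration.
X_test = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
    "C": [7, 8, 9],
    "old_predictions": [0.2, 0.7, 0.4],
})

alias_df = X_test        # same object as X_test, not a copy
pred_df = X_test.copy()  # independent copy that keeps old_predictions

# The in-place drop mutates the object that X_test and alias_df both refer to.
X_test.drop(["old_predictions"], axis=1, inplace=True)

print(list(alias_df.columns))  # old_predictions is gone here as well
print(list(pred_df.columns))   # old_predictions is still present
```

Taking the copy before dropping keeps old_predictions available for the later comparison with the LightGBM probabilities.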

However, when I try to train my model, I get the following error:

LightGBMError: The number of features in data (4) is not the same as it was in training data (3).
You can set ``predict_disable_shape_check=true`` to discard this error,but please be aware what you are doing.
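
The message says that the data passed at prediction time has 4 columns while the model was fitted on 3. A rough sketch of one way to guard against that, assuming the list of training columns is known: restrict the prediction frame to exactly those columns before calling predict. The helper `select_training_features` is hypothetical, and the list `trained_on` is written by hand here; with the scikit-learn wrapper it could plausibly be read from `model.feature_name_` after fitting.

```python
import pandas as pd

def select_training_features(X, feature_names):
    """Return X restricted to feature_names, in that order (hypothetical helper)."""
    missing = [name for name in feature_names if name not in X.columns]
    if missing:
        raise KeyError(f"columns missing from prediction data: {missing}")
    return X[list(feature_names)]

# Assumed training columns; in practice these would come from the fitted model.
trained_on = ["A", "B", "C"]

# Made-up prediction data that still carries the extra column.
X_test = pd.DataFrame({
    "A": [1, 2],
    "B": [3, 4],
    "C": [5, 6],
    "old_predictions": [0.3, 0.8],
})

X_for_predict = select_training_features(X_test, trained_on)
print(X_for_predict.shape)  # (2, 3)
```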

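My two questions are:

Given the reasoning I described for dropping this variable, do you agree that it is really fine to ignore this error?
Where do I add predict_disable_shape_check=true to ignore the error? I have tried the approaches below, but none of them worked and the same error appeared again. I tried reading the documentation but could not find anything clear.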

model = lgb.LGBMClassifier(**parameters1,predict_disable_shape_check=True)

y_pred=model.predict(X_test,predict_disable_shape_check=True)

predictions = model.predict_proba(X_test,predict_disable_shape_check=True)[:,1]
predictions_train = model.predict_proba(X_train)[:,1]

I also added it directly to the parameter list, but that did not work either.