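How to fix LightGBM on numerical + categorical + text features >> TypeError: Unknown type of parameter:boosting_type, got:dict
I am trying to train a LightGBM model on a dataset made up of numerical, categorical, and text data. During the training phase, however, I get the error shown below. This is my code: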
params = {
    'num_class': 5, 'max_depth': 8, 'num_leaves': 200, 'learning_rate': 0.05, 'n_estimators': 500
}
clf = LGBMClassifier(params)
data_processor = ColumnTransformer([
    ('numerical_processing', numerical_processor, numerical_features),
    ('categorical_processing', categorical_processor, categorical_features),
    ('text_processing_0', text_processor_1, text_features[0]),
    ('text_processing_1', text_processor_1, text_features[1])
])
pipeline = Pipeline([
    ('data_processing', data_processor), ('lgbm', clf)
])
pipeline.fit(X_train, y_train)
The error is:
TypeError: Unknown type of parameter:boosting_type, got:dict
I basically have two text features, both some form of name, on which I mainly perform stemming.
Any pointers would be much appreciated.
Solution
You set up the classifier incorrectly, and that is what raises the error. You can reproduce it without the pipeline at all:
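import numpy as np
from lightgbm import LGBMClassifier

params = {
    'num_class': 5, 'max_depth': 8, 'num_leaves': 200, 'learning_rate': 0.05, 'n_estimators': 500
}
clf = LGBMClassifier(params)  # the dict is passed as a single positional argument
clf.fit(np.random.uniform(0, 1, (50, 2)), np.random.randint(0, 5, 50))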
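This gives you the same error:
TypeError: Unknown type of parameter:boosting_type,got:dict
The first positional parameter of LGBMClassifier is boosting_type, so LGBMClassifier(params) assigns the whole dict to boosting_type. Unpack the dict into keyword arguments instead: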
clf = LGBMClassifier(**params)
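To see why the positional call fails while the unpacked call works, here is a minimal sketch using a toy class. ToyClassifier is hypothetical and is not part of LightGBM; it only assumes that the constructor starts with boosting_type and accepts extra keyword arguments, as LGBMClassifier does.

```python
class ToyClassifier:
    """Hypothetical stand-in; only the order of the leading parameters mirrors LGBMClassifier."""

    def __init__(self, boosting_type="gbdt", num_leaves=31, max_depth=-1,
                 learning_rate=0.1, n_estimators=100, **kwargs):
        self.boosting_type = boosting_type
        self.num_leaves = num_leaves
        self.max_depth = max_depth
        self.learning_rate = learning_rate
        self.n_estimators = n_estimators
        self.extra = kwargs  # parameters without a named slot, e.g. num_class


params = {"num_class": 5, "max_depth": 8, "num_leaves": 200,
          "learning_rate": 0.05, "n_estimators": 500}

wrong = ToyClassifier(params)    # the whole dict lands in the first positional slot
right = ToyClassifier(**params)  # each key is matched to a parameter by name

print(type(wrong.boosting_type).__name__)  # dict
print(right.boosting_type, right.num_leaves, right.extra)
```

In the positional call every other parameter silently keeps its default, which is why the mistake only surfaces later, when the value of boosting_type is inspected during fit.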
With that change, a complete example runs:
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

numerical_processor = StandardScaler()
categorical_processor = OneHotEncoder()
numerical_features = ['A']
categorical_features = ['B']
data_processor = ColumnTransformer([
    ('numerical_processing', numerical_processor, numerical_features),
    ('categorical_processing', categorical_processor, categorical_features),
])
# 100 rows: one numerical column and one categorical column
X_train = pd.DataFrame({'A': np.random.uniform(0, 1, 100), 'B': np.random.choice(['j', 'k'], 100)})
# 100 labels drawn from the 5 classes declared in params['num_class']
y_train = np.random.randint(0, 5, 100)
pipeline = Pipeline([('data_processing', data_processor), ('lgbm', clf)])
pipeline.fit(X_train, y_train)
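The example above leaves out the two text columns from the question. The sketch below shows one way they might be wired into the ColumnTransformer. It assumes TfidfVectorizer as the text processor (the question does not show its own text_processor_1, which does stemming), and the column names name_0 and name_1 and all of the data are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up data: one numerical, one categorical and two text columns (100 rows)
X = pd.DataFrame({
    "A": np.random.uniform(0, 1, 100),
    "B": ["j", "k"] * 50,
    "name_0": ["acme corp", "globex inc", "initech llc", "acme inc"] * 25,
    "name_1": ["john smith", "jane doe"] * 50,
})
text_features = ["name_0", "name_1"]

data_processor = ColumnTransformer([
    ("numerical_processing", StandardScaler(), ["A"]),
    ("categorical_processing", OneHotEncoder(handle_unknown="ignore"), ["B"]),
    # A text vectorizer expects a 1-D sequence of strings, so each text column
    # is selected with a single name (a string), not a list of names.
    ("text_processing_0", TfidfVectorizer(), text_features[0]),
    ("text_processing_1", TfidfVectorizer(), text_features[1]),
])

features = data_processor.fit_transform(X)
print(features.shape)
```

The resulting data_processor can then go into the Pipeline ahead of LGBMClassifier(**params), in the same way as in the example above.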