Why do I get a "Last step of Pipeline" error when using make_pipeline in scikit-learn?
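I am trying to use make_pipeline from scikit-learn to clean my data (impute missing values, then clean up outliers, then apply encoding functions to the categorical variables) and finally add a random forest regressor via RandomForestRegressor. The input is a DataFrame. Eventually I would like to pass the whole thing through GridSearchCV to search for the best hyperparameters for the regressor.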
To do this, I built some custom classes that inherit from TransformerMixin, as advised here. This is what I have so far:
from sklearn.pipeline import make_pipeline
from sklearn.base import TransformerMixin
from sklearn.ensemble import RandomForestRegressor
import pandas as pd

# median_imputer, get_quantiles, replace_outliers and encode_data are my own helper functions (not shown).

class Cleaning(TransformerMixin):
    def __init__(self, column_labels):
        self.column_labels = column_labels

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        """Given a dataframe X with predictors, clean it."""
        X_imputed, medians_X = median_imputer(X)  # impute all missing numeric data with median
        quantiles_X = get_quantiles(X_imputed, self.column_labels)
        X_nooutliers, _ = replace_outliers(X_imputed, self.column_labels, medians_X, quantiles_X)
        return X_nooutliers

class Encoding(TransformerMixin):
    def __init__(self, encoder_list):
        self.encoder_list = encoder_list

    def fit(self, X):
        """Takes in dataframe X and applies encoding transformation to them"""
        return encode_data(self.encoder_list, X)

import category_encoders as ce

pipeline_cleaning = Cleaning(column_labels=train_labels)
OneHot_binary = ce.OneHotEncoder(cols=['new_store'])
OneHot = ce.OneHotEncoder(cols=['transport_availability'])
Count = ce.CountEncoder(cols=['county'])
pipeline_encoding = Encoding([OneHot_binary, OneHot, Count])
baseline = RandomForestRegressor(n_estimators=500, random_state=12)
make_pipeline([pipeline_cleaning, pipeline_encoding, baseline])
The error says: Last step of Pipeline should implement fit or be the string 'passthrough'. I don't understand why.
Edit: there was a small typo in the last line, which is now corrected. The third element of the list passed to make_pipeline is the regressor.
Solution
Change the line:
make_pipeline([pipeline_cleaning,pipeline_encoding,baseline])
to (no list):
make_pipeline(pipeline_cleaning,pipeline_encoding,baseline)
Pipeline(steps=[('cleaning',<__main__.Cleaning object at 0x7f617260c1d0>),('encoding',<__main__.Encoding object at 0x7f617260c278>),('randomforestregressor',RandomForestRegressor(n_estimators=500,random_state=12))])
and you're fine.
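The reason, as far as I can tell, is that make_pipeline is declared as make_pipeline(*steps, ...), so every positional argument is treated as one step. Passing a single list therefore produces a pipeline whose only (and last) step is the list itself, and a list has no fit method. The sketch below illustrates this with a toy stand-in written in plain Python; toy_make_pipeline and Step are hypothetical names made up for this illustration and are not scikit-learn's implementation.

```python
# Toy stand-in for a variadic pipeline builder; this is NOT scikit-learn's code.
class Step:
    """Hypothetical estimator-like object that exposes fit()."""

    def __init__(self, name):
        self.name = name

    def fit(self, X, y=None):
        return self


def toy_make_pipeline(*steps):
    """Collect positional arguments as steps and check that the last one has fit()."""
    last = steps[-1]
    if not hasattr(last, "fit"):
        raise TypeError(
            f"Last step should implement fit, got {type(last).__name__}"
        )
    return list(steps)


cleaning, encoding, regressor = Step("cleaning"), Step("encoding"), Step("regressor")

# Correct: three positional arguments become three steps.
steps_ok = toy_make_pipeline(cleaning, encoding, regressor)

# Incorrect: one list argument becomes a single step, and a list has no fit().
try:
    toy_make_pipeline([cleaning, encoding, regressor])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Unpacking an existing list with * restores the intended call.
steps_unpacked = toy_make_pipeline(*[cleaning, encoding, regressor])
```

If the steps already live in a list, unpacking it with * at the call site should have the same effect as writing the arguments out one by one.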