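Error when unpickling a custom sklearn Pipeline model in Python

I built a custom pipeline in Python using sklearn's Pipeline, and it appears to fit and predict without problems. However, when I save the model as a pickle file and try to load that file in a different notebook, it raises an error.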

import pickle
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder
from sklearn import metrics
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import eli5
from eli5.sklearn import PermutationImportance
from sklearn.pipeline import Pipeline,make_pipeline,FeatureUnion
from sklearn.preprocessing import FunctionTransformer
from sklearn.compose import ColumnTransformer

path = 'C:/Users/Desktop/'
df = pd.read_excel(path + "df.xlsx", sheet_name='df')

############################################ ##################################

# import the BaseEstimator

from sklearn.base import BaseEstimator

# define the class OutletTypeEncoder
# custom transformer must have methods fit and transform
class OutletTypeEncoder(BaseEstimator):

    def __init__(self):
        pass
        #self.name = name

    def fit(self, df, y=None):
        # nothing is learned here; the transformer is stateless
        return self

    def transform(self, df):
        # work on a copy so the caller's DataFrame is not modified
        df = df.copy()

        # replace NaN (the same three columns on both sides of the assignment)
        id_cols = ['pdf_tbl_pn_identifier', 'pdf_tbl_qty_identifier', 'pdf_header_present']
        df[id_cols] = df[id_cols].fillna(value=-999)
        df[['pdf_tbl_cnt']] = df[['pdf_tbl_cnt']].fillna(value=0)

        # flag counts equal to '1' as 1 and everything else as 0
        df['pdf_tbl_cnt'] = np.where((df['pdf_tbl_cnt'] == '1'), 1, 0)
        # np.where needs both a "true" and a "false" value; 1/0 is assumed here
        df['part_cnt'] = np.where((df['part_cnt'] == '1'), 1, 0)

        # create numeric and categorical columns
        # (pdf_tbl_qty_identifier is included so the cast below does not fail)
        obj_df = df[['pdf_tbl_pn_identifier', 'pdf_tbl_qty_identifier', 'pdf_header_present',
                     'pdf_body_pn_identifier', 'pdf_body_qty_identifier',
                     'pdf_model_rel_returned', 'pdf_model_ent_returned']]
        num_df = df[['pdf_tbl_cnt', 'pdf_model_avg_relationship_score',
                     'pdf_model_avg_entity_score', 'part_cnt']]

        # label-encode the categorical columns, then rejoin with the numeric ones
        # note: the encoders are refit on every call to transform
        obj_df = obj_df.apply(LabelEncoder().fit_transform)
        df = pd.concat([obj_df, num_df], axis=1)
        #df.reset_index(inplace=True,drop=True)

        df.pdf_tbl_pn_identifier = df.pdf_tbl_pn_identifier.astype(str)
        df.pdf_tbl_qty_identifier = df.pdf_tbl_qty_identifier.astype(str)
        df.pdf_body_pn_identifier = df.pdf_body_pn_identifier.astype(str)
        df.pdf_body_qty_identifier = df.pdf_body_qty_identifier.astype(str)
        df.pdf_model_rel_returned = df.pdf_model_rel_returned.astype(str)
        df.pdf_model_ent_returned = df.pdf_model_ent_returned.astype(str)
        df.pdf_header_present = df.pdf_header_present.astype(str)
        #df.matching = df.matching.astype(str)
        #df['pdf_tbl_cnt'] = df['pdf_tbl_cnt'].apply(np.int64)
        df.pdf_tbl_cnt = df.pdf_tbl_cnt.apply(np.int64)

        return df

############################################ #################################

feature_cols = df.drop(['matching'],axis=1)
X = feature_cols # Features
y = df.matching # Target variable

# split into train test sets
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.25,random_state=0)

# Create Pipeline
logreg = LogisticRegression()
model_pipeline = Pipeline(steps=[
    ('preprocess', OutletTypeEncoder()),
    ('logreg', logreg),
])

# fit the pipeline with the training data
model_pipeline.fit(X_train,y_train)

# Predict
y_pred=model_pipeline.predict(X_test)

Now I save the model as a pickle file and want to use that file in another notebook, but I get an error: AttributeError: Can't get attribute 'OutletTypeEncoder' on <module '__main__'>

# Save the model to a file

Pkl_Filename = "C:\\Users\\SafayetKarim\\Desktop\\confidence_score\\results_updated\\pdf\\logisic_Model_pipeline.pkl"  

with open(Pkl_Filename,'wb') as file:  
    pickle.dump(model_pipeline,file)
    
    
    
# Load the Model back from file
with open('C:\\Users\\SafayetKarim\\Desktop\\confidence_score\\results_updated\\pdf\\logisic_Model_pipeline.pkl','rb') as file:  
    logisic_Model_pipeline = pickle.load(file)

logisic_Model_pipeline

Please help me solve this problem.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-28376e81d621> in <module>
      1 # Load the Model back from file
      2 with open('C:\\Users\\SafayetKarim\\Desktop\\confidence_score\\results_updated\\pdf\\OutletTypeEncoder.pkl','rb') as file:
----> 3     OutletTypeEncoder = pickle.load(file)
      4 
      5 OutletTypeEncoder

AttributeError: Can't get attribute 'OutletTypeEncoder' on <module '__main__'>
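
For context on the traceback: pickle does not store the source of a custom class. It records only the module name and class name, and looks them up again at load time. A pipeline saved from a notebook therefore points at `__main__.OutletTypeEncoder`, which does not exist in a fresh notebook. The sketch below reproduces that behaviour using only the standard library. The module name `fake_notebook` and the trimmed-down `OutletTypeEncoder` are stand-ins chosen for illustration, not the code from the question.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the first notebook's __main__ module.
notebook = types.ModuleType("fake_notebook")
sys.modules["fake_notebook"] = notebook

# Define the class inside that module's namespace, as a notebook cell would.
exec(
    "class OutletTypeEncoder:\n"
    "    def transform(self, rows):\n"
    "        return rows\n",
    notebook.__dict__,
)

# Pickle an instance; the payload records the module and class names only.
payload = pickle.dumps(notebook.OutletTypeEncoder())

# Simulate a fresh session in which the class was never defined.
saved_class = notebook.__dict__.pop("OutletTypeEncoder")
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Once the class is available under the same module and name, loading works.
notebook.OutletTypeEncoder = saved_class
restored = pickle.loads(payload)
print(restored.transform([1, 2]))
```

Applied to the question, this suggests two options. The quick one is to define (or import) `OutletTypeEncoder` in the second notebook before calling `pickle.load`. The more durable one is probably to move the class into a small `.py` file that both notebooks import (for example a hypothetical `outlet_encoder.py`), then refit and re-save the pipeline so the pickle refers to that module instead of `__main__`.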
