How do I pass values to a model trained on a dataset encoded with pd.get_dummies? I get ValueError: Unable to convert array
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

dataset = pd.read_csv('test_scores.csv')

# Use every column except the identifier and the target as features.
columns = dataset.columns.values.tolist()
columns.remove('student_id')
columns.remove('posttest')
X = dataset[columns]
y = dataset['posttest']

# One-hot encode the categorical columns.
x = pd.get_dummies(X)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.15)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
y_pred = np.round(y_pred)

# Save the trained model.
with open("model.pkl", mode="wb") as pickle_out:
    pickle.dump(regressor, pickle_out)
Streamlit form:
school = st.selectbox("School",(school_list))
school_setting = st.selectbox("School Setting",(setting_list))
school_type = st.selectbox("School Type",(school_type_list))
classroom = st.selectbox("Classroom",(classroom_list))
teaching_method = st.selectbox("Teaching Method",(teaching_method_list))
n_student = st.number_input("Number of students in the classroom")
gender = st.selectbox("Gender",(gender_list))
lunch = st.selectbox("Do you qualify for Lunch?",(lunch_list))
pretest = st.text_input("Pretest score")
To use the model trained above, I tried the following code to get a prediction:
score = model.predict( [[school,school_setting,school_type,classroom,teaching_method,n_student,student_id,gender,lunch,pretest]])
I get the following error:
ValueError: Unable to convert array of bytes/strings into decimal numbers with dtype='numeric'
Is this because the model was trained on dummy variables and will only accept 0 or 1 as input? How can I fix this?
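One possible way to approach this, as a rough sketch rather than a definitive fix: the model was fitted on the columns produced by pd.get_dummies, so predict needs a row with exactly those columns, and raw strings such as a school name cannot be converted to numbers. Note also that student_id was removed before training, so it should not be passed at prediction time. The sketch below uses a few made-up rows in place of test_scores.csv, keeps the training column list, and aligns a single new row to it with reindex. The helper encode_input, the name feature_columns, and the sample values are hypothetical and only for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up stand-in rows for test_scores.csv (values are illustrative only).
train = pd.DataFrame({
    "school_setting": ["Urban", "Rural", "Suburban", "Urban"],
    "gender": ["Male", "Female", "Female", "Male"],
    "n_student": [20.0, 18.0, 25.0, 22.0],
    "pretest": [62.0, 55.0, 70.0, 48.0],
    "posttest": [72.0, 66.0, 80.0, 59.0],
})

# One-hot encode the features and remember the resulting column list.
X_train = pd.get_dummies(train.drop(columns="posttest"), dtype=float)
feature_columns = X_train.columns.tolist()  # assumption: saved next to the model

model = LinearRegression().fit(X_train, train["posttest"])


def encode_input(raw: dict) -> pd.DataFrame:
    """Hypothetical helper: encode one form submission like the training data."""
    row = pd.get_dummies(pd.DataFrame([raw]), dtype=float)
    # Dummy columns for categories absent from this row are filled with 0;
    # columns the model never saw during training are dropped.
    return row.reindex(columns=feature_columns, fill_value=0.0)


new_row = encode_input({
    "school_setting": "Rural",
    "gender": "Male",
    "n_student": 21.0,
    "pretest": float("58"),  # st.text_input returns a string, so convert it
})
score = model.predict(new_row)
```

In the Streamlit app, the column list could be pickled together with the model (for example as a tuple) and loaded the same way, so the form values are encoded against the same columns the regressor saw during training.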
I am using this dataset from Kaggle.
A sample of the data I am working with: