
How to fix "TypeError: Singleton array ... cannot be considered a valid collection" when splitting a dataset with train_test_split in Python

This is the format of the dataset (image not included).

Here is my code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

#Importing the dataset
dataset1 = pd.read_csv('DATASETS/movielens movie recommender/ml-25m/ratings.csv')

#Splitting into dependent and independent variables
X1 = dataset1.iloc[:,[0,3]].values
y1 = dataset1.iloc[:,1:3].values

#Encoding
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder',OneHotEncoder(),[1])],remainder='passthrough')
ct2 = ColumnTransformer(transformers=[('encoder',OneHotEncoder(),[0])],remainder='passthrough')
y1 = np.array(ct.fit_transform(y1))
X1 = np.array(ct2.fit_transform(X1))


#Splitting into training set and test set
from sklearn.model_selection import train_test_split
X1_train,X1_test,y1_train,y1_test = train_test_split(X1,y1,test_size = 0.2,random_state = 1)

I get the following error:

TypeError: Singleton array array(<25000095x162542 sparse matrix of type '<class 'numpy.float64'>'
    with 50000190 stored elements in Compressed Sparse Row format>,dtype=object) cannot be considered a valid collection.

Can someone tell me what this means and how I can fix it?

Solution

Instead of this:

y1 = np.array(ct.fit_transform(y1))

X1 = np.array(ct2.fit_transform(X1))

you can use:

y1 = ct.fit_transform(y1).toarray()

X1 = ct2.fit_transform(X1).toarray()

It worked for me!
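As for what the error means: the transformers return a SciPy sparse matrix, and wrapping that in np.array() does not convert it. It produces a 0-dimensional object array holding the matrix, which is the "singleton array" that train_test_split rejects. Note also that .toarray() builds a dense array, and for the 25000095 x 162542 matrix in the error message that would need tens of terabytes of memory as float64. Since train_test_split accepts sparse matrices directly, it may be simpler to drop the np.array() wrapper and not convert at all. Below is a minimal sketch of that idea; the data is a small made-up stand-in (X_sparse and y are hypothetical names, not the MovieLens data).

```python
import numpy as np
from scipy import sparse
from sklearn.model_selection import train_test_split

# Small synthetic stand-in for the one-hot encoded data (made-up values).
X_sparse = sparse.csr_matrix(np.eye(10))
y = np.arange(10)

# Wrapping a sparse matrix in np.array gives a 0-dimensional object array,
# i.e. the "singleton array" named in the error message.
wrapped = np.array(X_sparse)
print(wrapped.ndim, wrapped.dtype)

# train_test_split can index sparse matrices itself, so pass the matrix as-is.
X_train, X_test, y_train, y_test = train_test_split(
    X_sparse, y, test_size=0.2, random_state=1
)
print(X_train.shape, X_test.shape, sparse.issparse(X_train))
```

Applied to the code in the question, this would mean writing y1 = ct.fit_transform(y1) and X1 = ct2.fit_transform(X1) without np.array() or .toarray(), assuming whatever model comes next can work with sparse input.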
