
Applying scipy zscore to the iris dataset

I am trying to apply zscore to the iris dataset. Because the last column of the iris dataset contains str values, I get the following error:

TypeError: unsupported operand type(s) for /: 'str' and 'int'

Here is a minimal working example (MWE):

import warnings

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas
from pandas.plotting import scatter_matrix
from scipy.stats import zscore
from sklearn import model_selection
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score,classification_report,confusion_matrix)
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

matplotlib.use('TkAgg')
warnings.filterwarnings("ignore")

names = ['sepal-length','sepal-width','petal-length','petal-width','class']
df = pandas.read_csv("iris.data",names=names)
df.plot(kind='line',subplots=True,layout=(2,2),sharex=True,sharey=False)
plt.show()
df = df.sample(frac=1).reset_index(drop=True)
# Select the upper triangle of the correlation matrix (numeric columns only)
corr_matrix = df.select_dtypes(include='number').corr().abs()
upper = corr_matrix.where(
    np.triu(np.ones(corr_matrix.shape),k=1).astype(bool))
# Find feature columns with correlation greater than 0.9
to_drop = [column for column in upper.columns if any(upper[column] > 0.9)]
dataset = df.drop(columns=to_drop)
# This line raises the TypeError above: 'class' still holds strings
dataset = dataset.apply(zscore)

How can I compute the z-score for a dataset like this?
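
One possible approach, sketched under assumptions: the error most likely arises because zscore tries to average the string values in the class column, so a common workaround is to standardize only the numeric columns and carry the label column over unchanged. The inline rows below are an illustrative stand-in for iris.data, and the names `numeric_cols` and `scaled` are hypothetical, not from the original post.

```python
import numpy as np
import pandas as pd
from scipy.stats import zscore

# Illustrative stand-in for iris.data (a few made-up rows, not the real file).
df = pd.DataFrame({
    "sepal-length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepal-width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "petal-length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "petal-width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
              "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
})

# Pick only the numeric feature columns; 'class' (strings) is left out.
numeric_cols = df.select_dtypes(include="number").columns

# Standardize each numeric column; the label column is copied over as-is.
scaled = df.copy()
scaled[numeric_cols] = df[numeric_cols].apply(zscore)

print(scaled.head())
```

In the code from the question, the same selection could be applied to `dataset` after the highly correlated columns are dropped. Note that zscore uses the population standard deviation (ddof=0) by default.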
