How to visualize a 2D SVM in Python
I have an assignment, given below. I have finished the first five tasks, but I am stuck on the last one: plotting the result. Please explain how to do this. Thanks in advance.
*(I only started learning SVM and ML a few days ago, please take that into account.)
**(I believe the plotting steps should be the same for every kernel type, so showing it for even one of them would be great. I will try to adapt your code to the others.)
Steps to follow:
- Take random samples (#100) from this map and bring them into Python for SVC. The dataset contains Easting, Northing and Rock information.
- Try to run SVC with linear, polynomial, radial basis function and tangent kernels.
- For each kernel, find the best hyperparameters based on the accuracy score; for example, with the radial basis function, determine which "C" and "gamma" give the best accuracy.
- Once you have a fitted model, compute the accuracy score (obtained from the test dataset), then feed the whole dataset (reference.csv) into the fitted model and predict the output for all 90,000 sample points.
- Show the resulting map, along with the accuracy score obtained from each fitted model.
The dataset is as follows:
90,000 points in the same style.
The code is as follows:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
### Importing Info
df = pd.read_csv("C:/Users/Admin/Desktop/RA/step 1/reference.csv",header=0)
df_model = df.sample(n = 100)
df_model.shape
## X-y split
X = df_model.loc[:,df_model.columns!="Rock"]
y = df_model["Rock"]
y_initial = df["Rock"]
### for whole dataset
X_wd = df.loc[:,df_model.columns!="Rock"]
y_wd = df["Rock"]
## Test-train split
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2,random_state = 0)
## Standardizing the Data
from sklearn.preprocessing import StandardScaler
sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
## Linear
### Grid Search
from sklearn.model_selection import GridSearchCV
from sklearn import svm
from sklearn.metrics import accuracy_score,confusion_matrix
params_linear = {'C' : (0.001,0.005,0.01,0.05,0.1,0.5,1,5,10,50,100,500,1000)}
clf_svm_l = svm.SVC(kernel = 'linear')
svm_grid_linear = GridSearchCV(clf_svm_l,params_linear,n_jobs=-1,cv = 3,verbose = 1,scoring = 'accuracy')
svm_grid_linear.fit(X_train_std,y_train)
svm_grid_linear.best_params_
linsvm_clf = svm_grid_linear.best_estimator_
accuracy_score(y_test,linsvm_clf.predict(X_test_std))
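As a side note, the grid-search step can be checked in isolation before moving on. A minimal, self-contained sketch of the same pattern on toy data (the generated dataset here is a stand-in for the sampled rock data, not the real reference.csv):

```python
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

# Toy 2-feature data standing in for the 100 sampled points
X, y = datasets.make_classification(n_samples=100, n_features=2,
                                    n_informative=2, n_redundant=0,
                                    random_state=0)

# Cross-validated search over C, mirroring the code above
grid = GridSearchCV(svm.SVC(kernel='linear'),
                    {'C': (0.01, 0.1, 1, 10)}, cv=3, scoring='accuracy')
grid.fit(X, y)

print(grid.best_params_)   # the best C found by cross-validation
print(grid.best_score_)    # mean CV accuracy of the best model
```

`best_estimator_` is already refit on the full training data, so you can predict with it directly instead of retraining a fresh `SVC` by hand.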
### training svm
clf_svm_l = svm.SVC(kernel = 'linear',C = 0.1)
clf_svm_l.fit(X_train_std,y_train)
### predicting model
y_train_pred_linear = clf_svm_l.predict(X_train_std)
y_test_pred_linear = clf_svm_l.predict(X_test_std)
y_test_pred_linear
clf_svm_l.n_support_
### whole dataset
y_pred_linear_wd = clf_svm_l.predict(sc.transform(X_wd))  # apply the same scaling used in training
### map
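For the `### map` step, one way to show the whole-dataset predictions as a map is to scatter the predicted class at each sample location. A sketch with placeholder coordinates and predictions (the 300x300 grid and the threshold rule are illustrative stand-ins, not the real reference.csv data):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative stand-ins for the 90,000 whole-dataset points
easting = np.tile(np.arange(300), 300)       # assumed regular 300x300 grid
northing = np.repeat(np.arange(300), 300)
y_pred = (easting + northing > 300).astype(int)  # placeholder predictions

# Color each sample location by its predicted class
plt.scatter(easting, northing, c=y_pred, s=1, cmap='coolwarm')
plt.xlabel('Easting')
plt.ylabel('Northing')
plt.title('Predicted rock map (linear kernel)')
plt.show()
```

With the real data you would pass the Easting/Northing columns of `df` and `y_pred_linear_wd` instead.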
## poly
### grid search for poly
params_poly = {'C' : (0.001,1000),'degree' : (1,2,3,4,6)}
clf_svm_poly = svm.SVC(kernel = 'poly')
svm_grid_poly = GridSearchCV(clf_svm_poly,params_poly,n_jobs = -1,scoring = 'accuracy')
svm_grid_poly.fit(X_train_std,y_train)
svm_grid_poly.best_params_
polysvm_clf = svm_grid_poly.best_estimator_
accuracy_score(y_test,polysvm_clf.predict(X_test_std))
### training svm
clf_svm_poly = svm.SVC(kernel = 'poly',C = 50,degree = 2)
clf_svm_poly.fit(X_train_std,y_train)
### predicting model
y_train_pred_poly = clf_svm_poly.predict(X_train_std)
y_test_pred_poly = clf_svm_poly.predict(X_test_std)
clf_svm_poly.n_support_
### whole dataset
y_pred_poly_wd = clf_svm_poly.predict(sc.transform(X_wd))
### map
## RBF
### grid search rbf
params_rbf = {'C' : (0.001,1000),'gamma' : (0.001,1)}
clf_svm_r = svm.SVC(kernel = 'rbf')
svm_grid_r = GridSearchCV(clf_svm_r,params_rbf,cv = 10,scoring = 'accuracy')
svm_grid_r.fit(X_train_std,y_train)
svm_grid_r.best_params_
rsvm_clf = svm_grid_r.best_estimator_
accuracy_score(y_test,rsvm_clf.predict(X_test_std))
### training svm
clf_svm_r = svm.SVC(kernel = 'rbf',C = 500,gamma = 0.5)
clf_svm_r.fit(X_train_std,y_train)
### predicting model
y_train_pred_r = clf_svm_r.predict(X_train_std)
y_test_pred_r = clf_svm_r.predict(X_test_std)
### whole dataset
y_pred_r_wd = clf_svm_r.predict(sc.transform(X_wd))
### map
## Tangent
### grid search
params_tangent = {'C' : (0.001,50),'gamma' : (0.001,1)}
clf_svm_tangent = svm.SVC(kernel = 'sigmoid')
svm_grid_tangent = GridSearchCV(clf_svm_tangent,params_tangent,scoring = 'accuracy')
svm_grid_tangent.fit(X_train_std,y_train)
svm_grid_tangent.best_params_
tangentsvm_clf = svm_grid_tangent.best_estimator_
accuracy_score(y_test,tangentsvm_clf.predict(X_test_std))
### training svm
clf_svm_tangent = svm.SVC(kernel = 'sigmoid',C = 1,gamma = 0.1)
clf_svm_tangent.fit(X_train_std,y_train)
### predicting model
y_train_pred_tangent = clf_svm_tangent.predict(X_train_std)
y_test_pred_tangent = clf_svm_tangent.predict(X_test_std)
### whole dataset
y_pred_tangent_wd = clf_svm_tangent.predict(sc.transform(X_wd))
### map
Solution
Judging from the sample data, you seem to be working with regularly gridded data, where the row/column coordinates increase monotonically. Here is one way to reshape the dataset into a 2D array (by reshaping the value array into rows) and plot it accordingly:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# create sample data
data = {
    'Easting':  [0, 0, 0, 1, 1, 1, 2, 2, 2],   # values are illustrative
    'Northing': [0, 1, 2, 0, 1, 2, 0, 1, 2],
    'Rocks':    [0, 0, 1, 0, 1, 1, 1, 1, 0],
}
df = pd.DataFrame(data)
# reshape data into 2d matrix (assuming easting / northing steps from 0 to max value)
max_easting = np.max(df['Easting'])
img_data = np.reshape(data['Rocks'],(max_easting + 1,-1))
# plot as image
plt.imshow(img_data)
plt.show()
If you are working with irregularly spaced data, i.e. not every Easting/Northing combination has a value, you could consider plotting irregularly spaced data instead.
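For the irregular case, a minimal sketch using `plt.tricontourf`, which triangulates scattered points so no regular grid is required (the random coordinates and the class rule here are made up for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

# Irregularly spaced sample points (coordinates are made up for illustration)
rng = np.random.default_rng(0)
easting = rng.uniform(0, 10, 200)
northing = rng.uniform(0, 10, 200)
rocks = (easting > northing).astype(int)

# tricontourf interpolates over a triangulation of the scattered points
plt.tricontourf(easting, northing, rocks)
plt.scatter(easting, northing, c=rocks, s=5, edgecolor='k', linewidth=0.2)
plt.xlabel('Easting')
plt.ylabel('Northing')
plt.show()
```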
For anyone who runs into the same problem as me, here is the answer for plotting the linear visualization. Adapting this code to the other kernels is straightforward.
# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set,y_set = X_train_std,y_train
X1,X2 = np.meshgrid(np.arange(start = X_set[:,0].min() - 1,stop = X_set[:,0].max() + 1,step = 0.01),np.arange(start = X_set[:,1].min() - 1,stop = X_set[:,1].max() + 1,step = 0.01))
plt.contourf(X1,X2,clf_svm_l.predict(np.array([X1.ravel(),X2.ravel()]).T).reshape(X1.shape),alpha = 0.75,cmap = ListedColormap(('darkblue','yellow')))
plt.xlim(X1.min(),X1.max())
plt.ylim(X2.min(),X2.max())
for i,j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j,0],X_set[y_set == j,1],c = ListedColormap(('blue','gold'))(i),label = j)
plt.title('SVM (Training set)')
plt.xlabel('Easting')
plt.ylabel('Northing')
plt.legend()
plt.show()