
KMeans clustering with time series analysis

How to fix an error when using KMeans clustering for time series analysis

Here is my code

distance = getdistanceByPoint(hr_tm2sif_df,kmeans[1])

This is the resulting error

   Error:
   TypeError                                 Traceback (most recent call last)
   <ipython-input-77-79f84ace211e> in <module>()
      2 outliers_fraction = 0.15
      3 # get the distance between each point and its nearest centroid. The biggest distances are considered as anomaly
   ----> 4 distance = getdistanceByPoint(hr_tm2sif_df,kmeans[1])
      5 # number of observations that equate to the 13% of the entire data set
      6 number_of_outliers = int(outliers_fraction*len(distance))

   TypeError: 'KMeans' object is not subscriptable
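
The message can be reproduced in isolation: a single fitted KMeans estimator is one object, not a list of models, so indexing it fails. The following is a minimal sketch, assuming scikit-learn is installed; the array `X` is made-up random data and is not the asker's `hr_tm2sif_df`.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in data: 50 points with 2 features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))

# One fitted estimator (not a list of estimators)
model = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

try:
    model[1]  # indexing a single estimator, as in the failing call
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Indexing would only make sense if `kmeans` were a list of fitted models, for example one model per candidate number of clusters.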

Solution

Here are the details. In this code kmeans is a single fitted KMeans estimator, not a list of models, so kmeans[1] raises the TypeError. Pass the estimator itself to getDistanceByPoint. Two further fixes are applied below: the centroid is looked up with model.labels_[i] (labels are already 0-based indices into cluster_centers_, so subtracting 1 picks the wrong centroid), and rows are read by position with data.iloc[i] so the loop also works when the index is not 0..n-1.

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Fit KMeans and count how many points fall into each cluster
kmeans = KMeans(n_clusters=10, random_state=42)
kmeans.fit(hr_tm2sif_df.values)
labels = kmeans.predict(hr_tm2sif_df.values)
unique_elements, counts_elements = np.unique(labels, return_counts=True)
clusters = np.asarray((unique_elements, counts_elements))

def getDistanceByPoint(data, model):
    """Return, as a pandas Series, the distance between each row and the
    centroid of the cluster it was assigned to."""
    distance = []
    for i in range(len(data)):
        Xa = np.array(data.iloc[i])                    # i-th row, by position
        Xb = model.cluster_centers_[model.labels_[i]]  # centroid of its cluster
        distance.append(np.linalg.norm(Xa - Xb))
    return pd.Series(distance, index=data.index)

# Assume that 15% of the entire data set are anomalies
outliers_fraction = 0.15
# Get the distance between each point and its nearest centroid.
# The largest distances are considered anomalies.
distance = getDistanceByPoint(hr_tm2sif_df, kmeans)
# Number of observations that make up 15% of the entire data set
number_of_outliers = int(outliers_fraction * len(distance))
# Take the minimum of the largest 15% of the distances as the threshold
threshold = distance.nlargest(number_of_outliers).min()
# anomaly1 contains the result of the above method (0: normal, 1: anomaly)
hr_tm2sif_df['anomaly1'] = (distance >= threshold).astype(int)
.... 

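
As an alternative to looping over rows, the distance to the nearest centroid can be computed in one step with `KMeans.transform`, which returns the distance from every sample to every cluster center. The following is a sketch on synthetic data, assuming scikit-learn, NumPy and pandas are installed; the DataFrame `df`, its column names `f1` and `f2`, and the choice of 3 clusters are illustrative assumptions and do not come from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical data: 210 points with 2 features, drawn from two blobs
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
blob_b = rng.normal(loc=15.0, scale=1.0, size=(10, 2))
df = pd.DataFrame(np.vstack([blob_a, blob_b]), columns=["f1", "f2"])
features = df[["f1", "f2"]].values

# Pass the fitted estimator itself around; it is not indexable
model = KMeans(n_clusters=3, n_init=10, random_state=42).fit(features)

# transform() yields an (n_samples, n_clusters) matrix of distances to
# each centroid; the row-wise minimum is the distance to the nearest one
distance = pd.Series(model.transform(features).min(axis=1), index=df.index)

# Same thresholding as in the post: flag the largest 15% of distances
outliers_fraction = 0.15
number_of_outliers = int(outliers_fraction * len(distance))
threshold = distance.nlargest(number_of_outliers).min()
df["anomaly1"] = (distance >= threshold).astype(int)

print(df["anomaly1"].value_counts())
```

Because the Series reuses `df.index`, this variant does not depend on the index being 0..n-1, which matters for time-indexed data.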
