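How to fix ValueError "Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required" when scaling a DataFrame (X_train) with MinMaxScaler()
I am trying to preprocess my data for a binary classifier. Each step works fine on its own, but when I put the steps together in a function I get: ValueError: Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required.
X_train data after OneHotEncoding: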
0 0.66 759.5 318.5 220.50 3.5 2 0.40 3 1.0 0.0 0.0 0.0
1 0.76 661.5 416.5 122.50 7.0 3 0.10 1 0.0 1.0 0.0 0.0
2 0.66 759.5 318.5 220.50 3.5 3 0.10 1 0.0 1.0 0.0 0.0
3 0.74 686.0 245.0 220.50 3.5 5 0.10 4 0.0 0.0 0.0 1.0
4 0.64 784.0 343.0 220.50 3.5 2 0.40 4 1.0 0.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ...
609 0.98 514.5 294.0 110.25 7.0 4 0.40 2 0.0 0.0 1.0 0.0
610 0.90 563.5 318.5 122.50 7.0 3 0.10 1 0.0 1.0 0.0 0.0
611 0.82 612.5 318.5 147.00 7.0 4 0.25 2 0.0 0.0 1.0 0.0
612 0.71 710.5 269.5 220.50 3.5 5 0.10 1 0.0 0.0 0.0 1.0
613 0.64 784.0 343.0 220.50 3.5 2 0.25 5 1.0 0.0 0.0 0.0
The function I am trying to use:
def feature_engineering(data):
    data = data[(np.nan_to_num(np.abs(stats.zscore(data, nan_policy='omit')), 0) < 3).all(axis=1)]
    data = data[(np.nan_to_num(np.abs(stats.zscore(data, 0))) > 3).all(axis=1)]
    csc = OneHotEncoder(handle_unknown='ignore').fit_transform(data[['orientation']])
    ob = pd.DataFrame(data=csc.toarray(), columns=["o1", "o2", "o3", "o4"])
    ob = csc.reshape((ob, 614*12))
    data = pd.concat([data, ob], axis=1)
    scaler = MinMaxScaler(feature_range=(0, 1))
    X_train = scaler.fit_transform(data)

data = pd.DataFrame(feature_engineering(X_train))
X_eval = pd.DataFrame(feature_engineering(X_eval))
Error traceback:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-51-d0f1271d1165> in <module>
     11
     12 #
---> 13 data = pd.DataFrame(feature_engineering(X_train))
     14 X_eval = pd.DataFrame(feature_engineering(X_eval))

<ipython-input-51-d0f1271d1165> in feature_engineering(data)
      3     data = data[(np.nan_to_num(np.abs(stats.zscore(data, nan_policy='omit')), 0) < 3).all(axis=1)]
      4     data = data[(np.nan_to_num(np.abs(stats.zscore(data, 0))) > 3).all(axis=1)]
----> 5     csc = OneHotEncoder(handle_unknown='ignore').fit_transform(data[['orientation']])
      6     ob = pd.DataFrame(data=csc.toarray(), columns=["o1", "o2", "o3", "o4"])
      7     ob = csc.reshape((ob, 614*12))

~\anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py in fit_transform(self, X, y)
    408         """
    409         self._validate_keywords()
--> 410         return super().fit_transform(X, y)
    411
    412     def transform(self, X):

~\anaconda3\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
    688         if y is None:
    689             # fit method of arity 1 (unsupervised transformation)
--> 690             return self.fit(X, **fit_params).transform(X)
    691         else:
    692             # fit method of arity 2 (supervised transformation)

~\anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py in fit(self, X, y)
    383         """
    384         self._validate_keywords()
--> 385         self._fit(X, handle_unknown=self.handle_unknown)
    386         self.drop_idx_ = self._compute_drop_idx()
    387         return self

~\anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py in _fit(self, X, handle_unknown)
     72
     73     def _fit(self, X, handle_unknown='error'):
---> 74         X_list, n_samples, n_features = self._check_X(X)
     75
     76         if self.categories != 'auto':

~\anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py in _check_X(self, X)
     58         for i in range(n_features):
     59             Xi = self._get_feature(X, feature_idx=i)
---> 60             Xi = check_array(Xi, ensure_2d=False, dtype=None,
     61                              force_all_finite=needs_validation)
     62             X_columns.append(Xi)

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    648         n_samples = _num_samples(array)
    649         if n_samples < ensure_min_samples:
--> 650             raise ValueError("Found array with %d sample(s) (shape=%s) while a"
    651                              " minimum of %d is required%s."
    652                              % (n_samples, array.shape,

ValueError: Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required.
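Reading the traceback, the error is raised inside the OneHotEncoder(...).fit_transform call, before MinMaxScaler is ever reached, so data[['orientation']] already has zero rows at that point. One plausible cause, judging only from the function as posted, is that the two z-score filters cancel each other out: the first keeps rows where every |z| is below 3, and the second then keeps only rows where every |z| is above 3. The sketch below reproduces that on made-up data; the column names and values are hypothetical stand-ins, not the asker's dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in data; column names are illustrative only.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "compactness": rng.normal(0.75, 0.1, size=200),
    "surface_area": rng.normal(670.0, 90.0, size=200),
    "orientation": rng.integers(2, 6, size=200),
})

# First filter: keep rows where every column has |z| < 3.
z = np.abs(stats.zscore(df.to_numpy(dtype=float), nan_policy="omit"))
kept = df[(np.nan_to_num(z) < 3).all(axis=1)]

# Second filter: keep rows where every column has |z| > 3.
# No row can satisfy this here, so nothing survives.
z2 = np.abs(stats.zscore(kept.to_numpy(dtype=float), nan_policy="omit"))
emptied = kept[(np.nan_to_num(z2) > 3).all(axis=1)]

print(kept.shape, emptied.shape)

# Fitting the encoder on the emptied frame triggers the ValueError.
error_message = None
try:
    OneHotEncoder(handle_unknown="ignore").fit_transform(emptied[["orientation"]])
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Printing data.shape after each filtering line inside feature_engineering is a quick way to check whether this is what happens with the real data.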
X_train['orientation']:
0 2
1 3
2 3
3 5
4 2
..
609 4
610 3
611 4
612 5
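A possible way to restructure the preprocessing is sketched below, under several assumptions: only the "|z| < 3" filter is intended, the csc.reshape(...) line is not needed, and the encoder and scaler should be fitted on the training data and reused for the evaluation data. The helper names remove_outliers and encode_and_scale, and the synthetic columns, are hypothetical. The sketch also resets the index after filtering, because pd.concat aligns on the index and a filtered frame with gaps in its index would otherwise be padded with NaN rows when joined to the freshly indexed one-hot frame.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder


def remove_outliers(data, threshold=3.0):
    """Keep rows in which every column has |z-score| below the threshold."""
    z = np.abs(stats.zscore(data.to_numpy(dtype=float), nan_policy="omit"))
    mask = (np.nan_to_num(z) < threshold).all(axis=1)
    # Reset the index so a later pd.concat lines up row for row.
    return data[mask].reset_index(drop=True)


def encode_and_scale(data, encoder, scaler, fit):
    """One-hot encode 'orientation', then min-max scale every column."""
    cats = data[["orientation"]]
    onehot = encoder.fit_transform(cats) if fit else encoder.transform(cats)
    names = [f"o{i + 1}" for i in range(onehot.shape[1])]
    ob = pd.DataFrame(onehot.toarray(), columns=names, index=data.index)
    full = pd.concat([data, ob], axis=1)
    scaled = scaler.fit_transform(full) if fit else scaler.transform(full)
    return pd.DataFrame(scaled, columns=full.columns)


# Hypothetical stand-in data; replace with the real X_train / X_eval.
rng = np.random.default_rng(0)


def make_frame(n):
    return pd.DataFrame({
        "compactness": rng.uniform(0.6, 1.0, size=n),
        "surface_area": rng.uniform(500.0, 800.0, size=n),
        "orientation": rng.integers(2, 6, size=n),
    })


X_train_raw, X_eval_raw = make_frame(300), make_frame(100)

encoder = OneHotEncoder(handle_unknown="ignore")
scaler = MinMaxScaler(feature_range=(0, 1))

# Fit on the (outlier-filtered) training data, then reuse for evaluation.
X_train = encode_and_scale(remove_outliers(X_train_raw), encoder, scaler, fit=True)
X_eval = encode_and_scale(X_eval_raw, encoder, scaler, fit=False)
print(X_train.shape, X_eval.shape)
```

Whether outliers should also be dropped from the evaluation set is a separate design choice; this sketch leaves X_eval unfiltered so that every evaluation row gets a prediction.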