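Why does scaling the columns of a DataFrame return many NaN values when the columns have no nulls?
I am working with this dataset (I have already cleaned it, and it has no missing values).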
Area No. of bedrooms Resale latitude longitude price Alaknanda Badarpur Bharat Vihar Bindapur Burari Chattarpur Chittaranjan Park Delhi Delhi Meerut Expressway Dwarka Mor Dwarka More Govindpuri Greater Kailash Hari Nagar Jamia Nagar Jasola Kalkaji Kamla Nagar Mahavir Enclave Mansa Ram Park Mayur Vihar Mayur Vihar II Model Town Mundka Munirka New Ashok Nagar Noida Road Okhla Om Nagar Om Vihar Palam Paschim Vihar Pitampura Preet Vihar Punjabi Bagh Rohini Sector 9 Rohini sector 24 Roop Nagar Sainik Farms Saket Sarita Vihar Sector 10 Dwarka Sector 11 Dwarka Sector 12 Dwarka Sector 13 Dwarka Sector 13 Rohini Sector 17 Dwarka Sector 18A Dwarka Sector 19 Dwarka Sector 2 Dwarka Sector 22 Dwarka Sector 22 Rohini Sector 23 Dwarka Sector 23 Rohini Sector 24 Rohini Sector 3 Dwarka Sector 4 Dwarka Sector 5 Dwarka Sector 6 Dwarka Sector 7 Dwarka Sector 9 Dwarka Sector-18 Dwarka Shahdara Shanti Park Dwarka Shastri Nagar Uttam Nagar Vasant Kunj Vikas Puri West End West Punjabi Bagh nawada
0 1200 2 1 28.584311 77.057693 105.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1000 3 0 28.619074 77.056686 60.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
2 1350 2 1 28.528574 77.288331 150.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 435 2 0 28.619074 77.056686 25.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4 900 3 0 28.619310 77.033279 58.0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4993 540 2 1 28.603176 77.063060 25.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4994 540 2 1 28.603176 77.063060 30.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4995 415 1 1 28.544790 77.051083 26.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4996 415 1 1 28.544790 77.051083 55.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4997 900 3 1 28.619074 77.056686 42.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4157 rows × 77 columns
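A random forest regressor performed poorly on this data, so I decided to scale the features (Area, No. of bedrooms, Resale, latitude, longitude) and the target variable (price).
But after performing the scaling: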
import pandas as pd
from sklearn.preprocessing import StandardScaler

def scaleColumns(df, cols_to_scale):
    for col in cols_to_scale:
        scaler = StandardScaler()
        df[col] = pd.DataFrame(scaler.fit_transform(df[col].values.reshape((-1, 1))))
    return df

scaled_df = scaleColumns(df, ['Area', 'No. of bedrooms', 'latitude', 'longitude', 'price'])
scaled_df
I get:
Area No. of bedrooms Resale latitude longitude price Alaknanda Badarpur Bharat Vihar Bindapur Burari Chattarpur Chittaranjan Park Delhi Delhi Meerut Expressway Dwarka Mor Dwarka More Govindpuri Greater Kailash Hari Nagar Jamia Nagar Jasola Kalkaji Kamla Nagar Mahavir Enclave Mansa Ram Park Mayur Vihar Mayur Vihar II Model Town Mundka Munirka New Ashok Nagar Noida Road Okhla Om Nagar Om Vihar Palam Paschim Vihar Pitampura Preet Vihar Punjabi Bagh Rohini Sector 9 Rohini sector 24 Roop Nagar Sainik Farms Saket Sarita Vihar Sector 10 Dwarka Sector 11 Dwarka Sector 12 Dwarka Sector 13 Dwarka Sector 13 Rohini Sector 17 Dwarka Sector 18A Dwarka Sector 19 Dwarka Sector 2 Dwarka Sector 22 Dwarka Sector 22 Rohini Sector 23 Dwarka Sector 23 Rohini Sector 24 Rohini Sector 3 Dwarka Sector 4 Dwarka Sector 5 Dwarka Sector 6 Dwarka Sector 7 Dwarka Sector 9 Dwarka Sector-18 Dwarka Shahdara Shanti Park Dwarka Shastri Nagar Uttam Nagar Vasant Kunj Vikas Puri West End West Punjabi Bagh nawada
0 -0.156044 -0.846368 1 0.146719 0.197107 -0.154917 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 -0.361197 0.327590 0 0.154070 0.197058 -0.245661 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
2 -0.002180 -0.846368 1 0.134931 0.208280 -0.064172 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 -0.940754 -0.846368 0 0.154070 0.197058 -0.316239 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4 -0.463774 0.327590 0 0.154120 0.195924 -0.249694 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4993 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4994 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4995 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4996 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4997 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4157 rows × 77 columns
Many of the values are now NaN. How can I fix this?
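One likely cause, judging from the output above: the frame has 4157 rows but its index labels run up to 4997, so rows were dropped during cleaning and the index has gaps. pd.DataFrame(...) wrapped around the scaler output gets a fresh 0..4156 index, and pandas aligns on index labels when assigning, so every row whose label is above 4156 receives NaN. The sketch below reproduces this on a small made-up frame (the values and the gapped index [0, 1, 5, 9] are assumptions for illustration) and shows one possible fix: assign the NumPy array returned by fit_transform directly. The helper name scale_columns is hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frame with gaps in the index, as left behind when rows are dropped during cleaning.
df = pd.DataFrame(
    {"Area": [1200.0, 1000.0, 1350.0, 435.0], "price": [105.0, 60.0, 150.0, 25.0]},
    index=[0, 1, 5, 9],
)

# Reproduce the problem: the wrapped result has index 0..3,
# so labels 5 and 9 find no matching row and become NaN.
broken = df.copy()
broken["Area"] = pd.DataFrame(
    StandardScaler().fit_transform(broken["Area"].values.reshape(-1, 1))
)
nan_count = int(broken["Area"].isna().sum())


def scale_columns(frame, cols_to_scale):
    """Return a copy with the given columns standardized, keeping the original index."""
    out = frame.copy()
    # fit_transform returns a plain NumPy array; assigning it is positional,
    # so no index alignment takes place.
    out[cols_to_scale] = StandardScaler().fit_transform(out[cols_to_scale])
    return out


scaled = scale_columns(df, ["Area", "price"])
print(nan_count)
print(scaled)
```

Calling df = df.reset_index(drop=True) before scaling should also avoid the NaN values, at the cost of discarding the original row labels.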