How to select or drop columns from a pandas DataFrame in Python
categorical_features = \
['FireplaceQu','BsmtQual','BsmtCond','GarageQual','GarageCond','ExterQual','ExterCond','HeatingQC','PoolQC','KitchenQual','BsmtFinType1','BsmtFinType2','Functional','Fence','BsmtExposure','GarageFinish','LandSlope','LotShape','PavedDrive','Street','Alley','CentralAir','MSSubClass','OverallQual','OverallCond','YrSold','MoSold']
I need to remove these columns from my dataset, and tried the following:
all_data = all_data.loc[:,categorical_features]
Unfortunately, this step only selects those columns. How can I reverse it so that they are excluded instead?
Solution
I suggest working out the columns you want to keep instead, which makes this easier:
categorical_features = \
['FireplaceQu','BsmtQual','BsmtCond','GarageQual','GarageCond','ExterQual','ExterCond','HeatingQC','PoolQC','KitchenQual','BsmtFinType1','BsmtFinType2','Functional','Fence','BsmtExposure','GarageFinish','LandSlope','LotShape','PavedDrive','Street','Alley','CentralAir','MSSubClass','OverallQual','OverallCond','YrSold','MoSold']
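# Keep every column that is not in categorical_features, preserving the original order
cols = [c for c in all_data.columns if c not in categorical_features]
all_data = all_data.loc[:, cols]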
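The keep-list idea can be sketched on a small frame as follows. This is a minimal illustration, not the original dataset: the frame `df`, the list `to_exclude`, and the column labels are hypothetical. A list comprehension is used because it preserves column order, whereas a plain set gives no ordering guarantee and recent pandas versions reject a set as an indexer.

```python
import pandas as pd

# Hypothetical toy frame standing in for all_data
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]})
to_exclude = ["B", "D"]  # stands in for categorical_features

# Build the list of columns to keep, in their original order
keep = [c for c in df.columns if c not in to_exclude]
result = df.loc[:, keep]

print(list(result.columns))  # ['A', 'C']
```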
Alternatively, you can use the DataFrame.drop method to exclude these columns directly:
all_data = all_data.drop(categorical_features,axis = 1)
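One caveat: by default, drop raises a KeyError if any label in the list is not present in the frame. The sketch below, on a hypothetical toy frame (`df`, `to_drop`, and the label "NotAColumn" are illustrative assumptions), uses the `columns=` keyword, which is equivalent to `axis=1`, together with `errors="ignore"` to skip missing labels.

```python
import pandas as pd

# Hypothetical toy frame; "NotAColumn" is deliberately absent
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
to_drop = ["B", "NotAColumn"]

# columns= is equivalent to axis=1; errors="ignore" skips labels that do not exist
trimmed = df.drop(columns=to_drop, errors="ignore")

print(list(trimmed.columns))  # ['A', 'C']
```

Note that drop returns a new DataFrame and leaves `df` unchanged, which is why the result is assigned back in the line above.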
Here is a small example you can run as a test:
import pandas as pd
import numpy as np
dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index = dates,columns = list('ABCD'))
print(df)
features = ['B','D']
df = df.drop(features,axis = 1)
print(df)
Output (the values are random, so yours will differ):
                   A         B         C         D
2013-01-01  1.365473 -0.445448  0.244377  0.416889
2013-01-02 -0.307532  0.095569  1.356229 -0.306618
2013-01-03  0.971216  1.100189  0.932189  0.808151
2013-01-04 -0.030160 -0.796742 -0.383336 -0.409233
2013-01-05  0.006601  0.093678 -1.013768  1.439921
2013-01-06  0.560771 -0.452491  1.050500 -1.545958
                   A         C
2013-01-01  1.365473  0.244377
2013-01-02 -0.307532  1.356229
2013-01-03  0.971216  0.932189
2013-01-04 -0.030160 -0.383336
2013-01-05  0.006601 -1.013768
2013-01-06  0.560771  1.050500