Converting a DataFrame column with comma thousands separators to space separators in Pandas

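I have a formatting problem in Pandas. One column of my DataFrame holds numbers written with a comma thousands separator, such as (200,000), and I want to convert them to use a space instead, as in (200 000).

The simple way is to use the replace function, but I also want to convert the column to integer type. That fails because of the spaces between the digit groups.

In the end, I just want a ranking of the values sorted in descending order, like this: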

ID	Villas	Price_nospace
3	Peace	35000000
3	Peace	35000000
2	Rosa	27000000
1	Beach	25000000
0	Palm	22000000

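As you can see, the prices are hard to read without a separator, so I want to make them more readable. But once the values contain a space separator, I cannot convert them to int, and without converting to int I cannot sort them correctly with sort_values. So I am stuck.

Thanks for your help.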

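Solution

The sample input has been modified slightly so that sorting (descending) visibly changes the order in the output.

The following solution sorts the DataFrame by Price_nospace (descending) and replaces the comma with a space. Note that Price_nospace will have dtype object in the output.

Sample input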

    Id  Villas  Price_nospace
0   3   Peace   220,000
1   3   Peace   350,000
2   2   Rosa    270,000
3   1   Beach   250,000
4   0   Palm    230,000

Code

# Temporary integer column used only for sorting
df['Price_new'] = df['Price_nospace'].str.replace(',', '', regex=False).astype(int)
df = df.sort_values(by='Price_new', ascending=False)
# Swap the comma separator for a space in the displayed column
df['Price_nospace'] = df['Price_nospace'].str.replace(',', ' ', regex=False)
df = df.drop(columns='Price_new').reset_index(drop=True)  # reset_index is optional
df

Output

    Id  Villas  Price_nospace
0   3   Peace   350 000
1   2   Rosa    270 000
2   1   Beach   250 000
3   0   Palm    230 000
4   3   Peace   220 000

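Explanation

A new column, Price_new, is introduced to hold the Price_nospace values converted to int, and the DataFrame is sorted on it.
Once df is sorted, the comma in Price_nospace is replaced with a space and the temporary column Price_new is dropped.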
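
The same result can probably be reached without a temporary column. The sketch below assumes pandas 1.1 or later, where sort_values accepts a key argument; the helper name to_int is made up for this illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': [3, 3, 2, 1, 0],
    'Villas': ['Peace', 'Peace', 'Rosa', 'Beach', 'Palm'],
    'Price_nospace': ['220,000', '350,000', '270,000', '250,000', '230,000'],
})


def to_int(prices):
    # Hypothetical helper: strip the comma separators and parse as integers.
    return prices.str.replace(',', '', regex=False).astype(int)


# key= is applied to the column only for the comparison;
# the stored strings are left untouched.
result = (
    df.sort_values(by='Price_nospace', key=to_int, ascending=False)
      .reset_index(drop=True)
)

# Swap the separator for display once the order is fixed.
result['Price_nospace'] = result['Price_nospace'].str.replace(',', ' ', regex=False)
print(result)
```

As in the solution above, Price_nospace still ends up with dtype object, so this only suits cases where the column is needed for display after sorting.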

Another option is to change how the data is displayed without affecting the underlying type.

After converting the str prices to float, set pd.options.display.float_format:

import pandas as pd


def my_float_format(x):
    '''
    Number formatting with custom thousands separator
    '''
    return f'{x:,.0f}'.replace(',', ' ')


# set display float_format
pd.options.display.float_format = my_float_format

df = pd.DataFrame({
    'Id': [3, 3, 2, 1, 0],
    'Villas': ['Peace', 'Peace', 'Rosa', 'Beach', 'Palm'],
    'Price_nospace': ['35,000,000', '35,000,000', '27,000,000', '25,000,000', '22,000,000'],
})

# Convert str prices to float
df['Price_nospace'] = (
    df['Price_nospace'].str.replace(',', '', regex=False).astype(float)
)

Output:

print(df)

   Id Villas  Price_nospace
0   3  Peace     35 000 000
1   3  Peace     35 000 000
2   2   Rosa     27 000 000
3   1  Beach     25 000 000
4   0   Palm     22 000 000

print(df.dtypes)

Id                 int64
Villas            object
Price_nospace    float64
dtype: object

Because the dtype is float64, any numeric operation works as expected.
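
As a rough illustration of that point, the descending ranking asked for in the question can be computed directly on the numeric column. The Rank column name and the choice of a dense ranking are assumptions made for this sketch, not something the original answer specifies.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': [3, 3, 2, 1, 0],
    'Villas': ['Peace', 'Peace', 'Rosa', 'Beach', 'Palm'],
    'Price_nospace': ['35,000,000', '35,000,000', '27,000,000', '25,000,000', '22,000,000'],
})
df['Price_nospace'] = df['Price_nospace'].str.replace(',', '', regex=False).astype(float)

# Dense rank (assumed choice): tied prices share a rank and no rank numbers are skipped.
df['Rank'] = df['Price_nospace'].rank(method='dense', ascending=False).astype(int)

# Numeric sorting works because the column holds floats, not strings.
ranked = df.sort_values(by='Price_nospace', ascending=False)
print(ranked)
```

If the float_format display option from above is set, the printed prices still show the space separator while the ranking is computed on the real numbers.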

The same my_float_format function can also be used when exporting:

df.to_csv(float_format=my_float_format)

,Id,Villas,Price_nospace
0,3,Peace,35 000 000
1,3,Peace,35 000 000
2,2,Rosa,27 000 000
3,1,Beach,25 000 000
4,0,Palm,22 000 000
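
One caveat worth noting: the float_format display option only applies to floating-point columns. If the prices are kept as integers instead, a per-column formatter passed to DataFrame.to_string is one possible alternative. The following is a minimal sketch under that assumption; the Price column and the space_thousands helper are hypothetical names introduced here.

```python
import pandas as pd


def space_thousands(x):
    # Hypothetical helper: format with comma grouping, then swap commas for spaces.
    return f'{int(x):,}'.replace(',', ' ')


df = pd.DataFrame({
    'Villas': ['Peace', 'Rosa'],
    'Price': [35000000, 27000000],  # hypothetical integer price column
})

# formatters= maps a column name to a one-argument function used only for rendering.
text = df.to_string(formatters={'Price': space_thousands})
print(text)

# The underlying column is still an integer dtype.
print(df['Price'].dtype)
```

The formatter affects only the rendered text, so sorting and arithmetic on Price continue to use the integer values.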
