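I feel this should be obvious, but I'm a bit stuck.
I have a DataFrame (df) with a 3-level MultiIndex on the rows. One of the MultiIndex levels is ccy, the currency in which that row's data is denominated. Each row has 3 columns of data.
I want to convert all the data to a reference currency (say, USD). To do that, I have a Series (forex) containing the exchange rates for the relevant currencies.
So the goal is simple: multiply all the data in each row of df by the forex value that corresponds to the ccy entry in that row's index.
The setup looks like this: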
import pandas as pd
import numpy as np
import itertools
np.random.seed(0)
tuples = list(itertools.product(
    list('abd'),
    ['one', 'two', 'three'],
    ['USD', 'EUR', 'GBP']
))
np.random.shuffle(tuples)
idx = pd.MultiIndex.from_tuples(tuples[:-10], names=['letter', 'number', 'ccy'])
df = pd.DataFrame(np.random.randn(len(idx), 3), index=idx,
                  columns=['val_1', 'val_2', 'val_3'])
forex = pd.Series({'USD': 1.0,
                   'EUR': 1.3,
                   'GBP': 1.7})
I can get what I need by running:
df.apply(lambda col: col.mul(forex, level='ccy'), axis=0)
But it seems odd to me that I need pd.DataFrame.apply for something this simple. I expected the following syntax (or something similar) to work:
df.mul(forex, level='ccy', axis=0)
But this gives me:
ValueError: cannot reindex from a duplicate axis
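Obviously, the apply approach is no disaster. Still, it seems strange that I can't work out the syntax to do this directly across all columns with mul. Is there a more direct way to handle this? If not, is there an intuitive reason why the mul syntax shouldn't be enhanced to work this way?
Answer:
This now works in master / 0.14. See the issue: https://github.com/pydata/pandas/pull/6682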
In [11]: df.mul(forex,level='ccy',axis=0)
Out[11]:
                      val_1     val_2     val_3
letter number ccy
a      one    GBP -2.172854  2.443530 -0.132098
d      three  USD  1.089630  0.096543  1.418667
b      two    GBP  1.986064  1.610216  1.845328
       three  GBP  4.049782 -0.690240  0.452957
a      two    GBP -2.304713 -0.193974 -1.435192
b      one    GBP  1.199589 -0.677936 -1.406234
d      two    GBP -0.706766 -0.891671  1.382272
b      two    EUR -0.298026  2.810233 -1.244011
d      one    EUR  0.087504  0.268448 -0.593946
              GBP -1.801959  1.045427  2.430423
b      three  EUR -0.275538 -0.104438  0.527017
a      one    EUR  0.154189  1.630738  1.844833
b      one    EUR -0.967013 -3.272668 -1.959225
d      three  GBP  1.953429 -2.029083  1.939772
              EUR  1.962279  1.388108 -0.892566
a      three  GBP  0.025285 -0.638632 -0.064980
              USD  0.367974 -0.044724 -0.302375
[17 rows x 3 columns]
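To make the equivalence concrete, here is a minimal sketch that checks the direct mul call against the apply workaround from the question. The four-row frame named small is a hypothetical stand-in for df (same index layout, made-up values), and the sketch assumes a pandas version at or after 0.14, where mul accepts level together with axis=0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's df: same three index levels,
# made-up values, and every currency in forex appears at least once.
index = pd.MultiIndex.from_tuples(
    [("a", "one", "GBP"), ("a", "one", "EUR"),
     ("b", "two", "USD"), ("d", "three", "GBP")],
    names=["letter", "number", "ccy"],
)
small = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    index=index,
    columns=["val_1", "val_2", "val_3"],
)
forex = pd.Series({"USD": 1.0, "EUR": 1.3, "GBP": 1.7})

# Direct form: broadcast forex over the rows, matching on the 'ccy' level.
direct = small.mul(forex, level="ccy", axis=0)

# Column-by-column workaround from the question, kept for comparison.
via_apply = small.apply(lambda col: col.mul(forex, level="ccy"), axis=0)

# Compare after sorting so the check does not depend on row order.
same = np.allclose(direct.sort_index().to_numpy(),
                   via_apply.sort_index().to_numpy())
print(same)
```

If both forms agree on your own data as well, the apply wrapper can simply be dropped.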
Here is another approach (it also requires master / 0.14):
In [127]: df = df.sortlevel()
In [128]: df
Out[128]:
                      val_1     val_2     val_3
letter number ccy
a      one    EUR  0.118607  1.254414  1.419102
              GBP -1.278149  1.437371 -0.077705
       three  GBP  0.014873 -0.375666 -0.038224
              USD  0.367974 -0.044724 -0.302375
       two    GBP -1.355714 -0.114103 -0.844231
b      one    EUR -0.743856 -2.517437 -1.507096
              GBP  0.705641 -0.398786 -0.827197
       three  EUR -0.211952 -0.080337  0.405398
              GBP  2.382224 -0.406024  0.266445
       two    EUR -0.229251  2.161717 -0.956931
              GBP  1.168273  0.947186  1.085487
d      one    EUR  0.067311  0.206499 -0.456881
              GBP -1.059976  0.614957  1.429661
       three  EUR  1.509445  1.067775 -0.686589
              GBP  1.149076 -1.193578  1.141042
              USD  1.089630  0.096543  1.418667
       two    GBP -0.415745 -0.524512  0.813101
[17 rows x 3 columns]
idx = pd.IndexSlice
In [129]: pd.concat([ df.loc[idx[:,:,x],:]*v for x,v in forex.iteritems() ])
Out[129]:
                      val_1     val_2     val_3
letter number ccy
a      one    EUR  0.154189  1.630738  1.844833
b      one    EUR -0.967013 -3.272668 -1.959225
       three  EUR -0.275538 -0.104438  0.527017
       two    EUR -0.298026  2.810233 -1.244011
d      one    EUR  0.087504  0.268448 -0.593946
       three  EUR  1.962279  1.388108 -0.892566
a      one    GBP -2.172854  2.443530 -0.132098
       three  GBP  0.025285 -0.638632 -0.064980
       two    GBP -2.304713 -0.193974 -1.435192
b      one    GBP  1.199589 -0.677936 -1.406234
       three  GBP  4.049782 -0.690240  0.452957
       two    GBP  1.986064  1.610216  1.845328
d      one    GBP -1.801959  1.045427  2.430423
       three  GBP  1.953429 -2.029083  1.939772
       two    GBP -0.706766 -0.891671  1.382272
a      three  USD  0.367974 -0.044724 -0.302375
d      three  USD  1.089630  0.096543  1.418667
[17 rows x 3 columns]
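The session above uses sortlevel() and iteritems(), which are no longer available in recent pandas releases. A rough equivalent for a current install might look like the following sketch. The frame small is a hypothetical stand-in for df, and the slicer is bound to ix rather than idx purely so that it does not shadow the MultiIndex variable from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df; every currency in forex appears in the index.
index = pd.MultiIndex.from_tuples(
    [("a", "one", "GBP"), ("a", "one", "EUR"),
     ("b", "two", "USD"), ("d", "three", "GBP")],
    names=["letter", "number", "ccy"],
)
small = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    index=index,
    columns=["val_1", "val_2", "val_3"],
)
forex = pd.Series({"USD": 1.0, "EUR": 1.3, "GBP": 1.7})

# sort_index() stands in for the older sortlevel(); slicing a MultiIndex
# with IndexSlice needs a sorted index.
small = small.sort_index()

ix = pd.IndexSlice

# Select the rows of each currency, scale them by that currency's rate,
# then stitch the pieces back together. items() stands in for iteritems().
pieces = [small.loc[ix[:, :, ccy], :] * rate for ccy, rate in forex.items()]
converted = pd.concat(pieces)
print(converted)
```

Note that the rows come back grouped by currency, in the order the currencies appear in forex, so a final sort_index() may be wanted if the original ordering matters.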
Here is yet another way, via a merge:
In [36]: f = forex.to_frame('value')
In [37]: f.index.name = 'ccy'
In [38]: pd.merge(df.reset_index(),f.reset_index(),on='ccy')
Out[38]:
   letter number  ccy     val_1     val_2     val_3  value
 0      a    one  GBP -1.278149  1.437371 -0.077705    1.7
 1      b    two  GBP  1.168273  0.947186  1.085487    1.7
 2      b  three  GBP  2.382224 -0.406024  0.266445    1.7
 3      a    two  GBP -1.355714 -0.114103 -0.844231    1.7
 4      b    one  GBP  0.705641 -0.398786 -0.827197    1.7
 5      d    two  GBP -0.415745 -0.524512  0.813101    1.7
 6      d    one  GBP -1.059976  0.614957  1.429661    1.7
 7      d  three  GBP  1.149076 -1.193578  1.141042    1.7
 8      a  three  GBP  0.014873 -0.375666 -0.038224    1.7
 9      d  three  USD  1.089630  0.096543  1.418667    1.0
10      a  three  USD  0.367974 -0.044724 -0.302375    1.0
11      b    two  EUR -0.229251  2.161717 -0.956931    1.3
12      d    one  EUR  0.067311  0.206499 -0.456881    1.3
13      b  three  EUR -0.211952 -0.080337  0.405398    1.3
14      a    one  EUR  0.118607  1.254414  1.419102    1.3
15      b    one  EUR -0.743856 -2.517437 -1.507096    1.3
16      d  three  EUR  1.509445  1.067775 -0.686589    1.3
[17 rows x 7 columns]
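The merge above only attaches each row's rate as an extra value column; the multiplication itself is still left to do. One way to finish the job might look like this sketch, which again uses a hypothetical four-row frame named small in place of df. The names value_cols and converted are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df with the same three index levels.
index = pd.MultiIndex.from_tuples(
    [("a", "one", "GBP"), ("a", "one", "EUR"),
     ("b", "two", "USD"), ("d", "three", "GBP")],
    names=["letter", "number", "ccy"],
)
small = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    index=index,
    columns=["val_1", "val_2", "val_3"],
)
forex = pd.Series({"USD": 1.0, "EUR": 1.3, "GBP": 1.7})

# Same preparation as in the session above: rates as a one-column frame
# whose index is named after the level we want to join on.
f = forex.to_frame("value")
f.index.name = "ccy"
merged = pd.merge(small.reset_index(), f.reset_index(), on="ccy")

# Scale the data columns row by row with the merged-in rate, then drop
# the helper column and rebuild the original MultiIndex.
value_cols = ["val_1", "val_2", "val_3"]
merged[value_cols] = merged[value_cols].mul(merged["value"], axis=0)
converted = merged.drop(columns="value").set_index(["letter", "number", "ccy"])
print(converted)
```

Compared with the level-based mul, this route does more bookkeeping, but it can be handy when the rates live in a table with several columns rather than in a plain Series.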