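I want to store some multidimensional data in a pandas DataFrame or Panel so that I can retrieve, for example:
> all times for Runner A in Race A
> all times (and names) for Race A in a given year, say 2015
The sample data looks like this; note that not every runner has data for every year or every race.
Can anyone suggest a good way to do this with pandas, or by any other means?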
Name | Gender | Age
Runner A | Male | 35
Race A
Year | Time
2015 | 2:35:09
2014 | 2:47:34
2013 | 2:50:12
Race B
Year | Time
2013 | 1:32:07
Runner B | Male | 29
Race A
Year | Time
2015 | 3:05:56
Runner C | Female | 32
Race B
Year | Time
1998 | 1:29:43
Solution:
I think you can use a MultiIndex and then select the data with slicers:
import pandas as pd
df = pd.DataFrame({'Time': {('Runner A', 'Male', 35, 'Race A', 2014): '2:47:34', ('Runner C', 'Female', 32, 'Race B', 1998): '1:29:43', ('Runner B', 'Male', 29, 'Race A', 2015): '3:05:56', ('Runner A', 'Male', 35, 'Race A', 2013): '2:50:12', ('Runner A', 'Male', 35, 'Race B', 2013): '1:32:07', ('Runner A', 'Male', 35, 'Race A', 2015): '2:35:09'}})
print (df)
                                   Time
Runner A Male   35 Race A 2013  2:50:12
                          2014  2:47:34
                          2015  2:35:09
                   Race B 2013  1:32:07
Runner B Male   29 Race A 2015  3:05:56
Runner C Female 32 Race B 1998  1:29:43
# the index has to be fully lexsorted before slicers can be used
df.sort_index(inplace=True)
print (df)
                                   Time
Runner A Male   35 Race A 2013  2:50:12
                          2014  2:47:34
                          2015  2:35:09
                   Race B 2013  1:32:07
Runner B Male   29 Race A 2015  3:05:56
Runner C Female 32 Race B 1998  1:29:43
idx = pd.IndexSlice
print (df.loc[idx['Runner A',:,:,'Race A',:],:])
                                 Time
Runner A Male 35 Race A 2013  2:50:12
                        2014  2:47:34
                        2015  2:35:09
print (df.loc[idx[:,:,:,'Race A',2015],:])
                                 Time
Runner A Male 35 Race A 2015  2:35:09
Runner B Male 29 Race A 2015  3:05:56
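The slicers above address the index levels by position, which can get hard to read with five levels. As a possible variation, here is a minimal sketch that gives the levels names and selects with `DataFrame.xs` instead. The level names (`Name`, `Gender`, `Age`, `Race`, `Year`) and the conversion of `Time` to timedeltas are my own assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Same sample data as above, written as flat records.
records = [
    ("Runner A", "Male", 35, "Race A", 2015, "2:35:09"),
    ("Runner A", "Male", 35, "Race A", 2014, "2:47:34"),
    ("Runner A", "Male", 35, "Race A", 2013, "2:50:12"),
    ("Runner A", "Male", 35, "Race B", 2013, "1:32:07"),
    ("Runner B", "Male", 29, "Race A", 2015, "3:05:56"),
    ("Runner C", "Female", 32, "Race B", 1998, "1:29:43"),
]

# Hypothetical level names, chosen here only for readability.
levels = ["Name", "Gender", "Age", "Race", "Year"]

df = (
    pd.DataFrame(records, columns=levels + ["Time"])
    .set_index(levels)
    .sort_index()
)

# Assumption: storing times as Timedelta makes them comparable and aggregatable.
df["Time"] = pd.to_timedelta(df["Time"])

# All times for Runner A in Race A (drop_level=False keeps the full index).
runner_a_race_a = df.xs(
    ("Runner A", "Race A"), level=["Name", "Race"], drop_level=False
)

# All times (and names) for Race A in 2015.
race_a_2015 = df.xs(("Race A", 2015), level=["Race", "Year"], drop_level=False)

print(runner_a_race_a)
print(race_a_2015)
```

With named levels, each selection states which dimension it filters on, so the query does not depend on remembering the order of the levels.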