pandas: resampling rows based on another DataFrame's index
I have a DataFrame that looks like this:
zone Datetime Demand
48 2020-08-02 00:00:00 14292.550740
48 2020-08-02 01:00:00 14243.490740
48 2020-08-02 02:00:00 9130.840744
48 2020-08-02 03:00:00 10483.510740
48 2020-08-02 04:00:00 10014.970740
I want to resample (sum) the Demand values based on the index of another DataFrame, which looks like this:
2020-08-02 03:00:00
2020-08-02 06:00:00
2020-08-02 07:00:00
2020-08-02 10:00:00
What is the best way to handle this?
Solution
I believe you need merge_asof:
print (df2)
a
2020-08-02 03:00:00 1
2020-08-02 06:00:00 2
2020-08-02 07:00:00 3
2020-08-02 10:00:00 4
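# Both keys must be datetimes, and merge_asof needs them sorted
df1['Datetime'] = pd.to_datetime(df1['Datetime'])
df2.index = pd.to_datetime(df2.index)
# Match each Datetime to the first date2 at or after it
df = pd.merge_asof(df1, df2.rename_axis('date2').reset_index(),
                   left_on='Datetime', right_on='date2',
                   direction='forward')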
print (df)
zone Datetime Demand date2 a
0 48 2020-08-02 00:00:00 14292.550740 2020-08-02 03:00:00 1
1 48 2020-08-02 01:00:00 14243.490740 2020-08-02 03:00:00 1
2 48 2020-08-02 02:00:00 9130.840744 2020-08-02 03:00:00 1
3 48 2020-08-02 03:00:00 10483.510740 2020-08-02 03:00:00 1
4 48 2020-08-02 04:00:00 10014.970740 2020-08-02 06:00:00 2
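To make the matching rule concrete: with direction='forward', each row of the left frame is paired with the first right-hand key that is greater than or equal to its own key. Below is a minimal, self-contained sketch that rebuilds the sample frames from the question; the names right and merged are illustrative only and do not come from the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question
df1 = pd.DataFrame({
    "zone": [48] * 5,
    "Datetime": pd.to_datetime([
        "2020-08-02 00:00:00", "2020-08-02 01:00:00", "2020-08-02 02:00:00",
        "2020-08-02 03:00:00", "2020-08-02 04:00:00",
    ]),
    "Demand": [14292.550740, 14243.490740, 9130.840744,
               10483.510740, 10014.970740],
})
df2 = pd.DataFrame(
    {"a": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2020-08-02 03:00:00", "2020-08-02 06:00:00",
        "2020-08-02 07:00:00", "2020-08-02 10:00:00",
    ]),
)

# Turn df2's index into an ordinary column so it can serve as the right key
right = df2.rename_axis("date2").reset_index()

# Forward search: pick the first date2 that is >= Datetime
merged = pd.merge_asof(df1, right, left_on="Datetime", right_on="date2",
                       direction="forward")
print(merged[["Datetime", "date2"]])
```

The 03:00 row lands in the 03:00 bucket because exact matches are allowed by default, while the 04:00 row moves on to 06:00.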
Then aggregate with sum, for example if both columns are needed:
df = df.groupby(['zone','date2'],as_index=False)['Demand'].sum()
print (df)
zone date2 Demand
0 48 2020-08-02 03:00:00 48150.392964
1 48 2020-08-02 06:00:00 10014.970740
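Note that target timestamps with no source rows (07:00 and 10:00 in this sample) do not appear in the grouped result. If they should show up with a total of zero, one possible approach is to reindex the sums against df2's index. This is a sketch that assumes a single zone; the names totals and full are hypothetical, and data with several zones would need the reindex applied per zone.

```python
import pandas as pd

# Sample data rebuilt from the question
df1 = pd.DataFrame({
    "zone": [48] * 5,
    "Datetime": pd.to_datetime([
        "2020-08-02 00:00:00", "2020-08-02 01:00:00", "2020-08-02 02:00:00",
        "2020-08-02 03:00:00", "2020-08-02 04:00:00",
    ]),
    "Demand": [14292.550740, 14243.490740, 9130.840744,
               10483.510740, 10014.970740],
})
df2 = pd.DataFrame(
    {"a": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2020-08-02 03:00:00", "2020-08-02 06:00:00",
        "2020-08-02 07:00:00", "2020-08-02 10:00:00",
    ]),
)

merged = pd.merge_asof(df1, df2.rename_axis("date2").reset_index(),
                       left_on="Datetime", right_on="date2",
                       direction="forward")

# Sum Demand per target timestamp (single zone assumed)
totals = merged.groupby("date2")["Demand"].sum()

# Bring back the target timestamps that received no rows, filled with 0
full = totals.reindex(df2.index, fill_value=0.0)
print(full)
```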