
Time series data: bin data by day, then plot by day of week

I have a very simple pandas DataFrame in the following format:

date        P1      P2      day
2015-01-01  190     1132    Thursday
2015-01-01  225     1765    Thursday
2015-01-01  3427    29421   Thursday
2015-01-01  945     7679    Thursday
2015-01-01  1228    9537    Thursday
2015-01-01  870     6903    Thursday
2015-01-02  785     4768    Friday
2015-01-02  1137    7065    Friday
2015-01-02  175     875     Friday

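Here P1 and P2 are two different parameters of interest. I would like to create a bar chart for each of P1 and P2 that looks like this. As the data shows, I have several values per day. I want to average the values for each date, and then plot them against the day of the week (so that the mean for Monday of week 1 is combined with the mean for Monday of week 2, and so on).

I am new to Python, and my current approach is ugly and involves multiple loops. Right now I have two dedicated sections of code: one computes the averages, and the other goes through each day of the week and works out the values to plot. Is there a cleaner way to do this?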

Answer:

It seems you are looking for:

df[['day', 'P1']].groupby('day').mean().plot(kind='bar', legend=None)

df[['day', 'P2']].groupby('day').mean().plot(kind='bar', legend=None)
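As a quick sanity check, the same grouping can be applied to the nine sample rows from the question. This is only a sketch: it rebuilds the sample by hand and derives the `day` column from `date` with `dt.day_name()`, which assumes the dates parse cleanly with `pd.to_datetime`. It also averages both columns in one call rather than one column at a time.

```python
import pandas as pd

# The nine sample rows shown in the question, typed in by hand.
df = pd.DataFrame({
    'date': ['2015-01-01'] * 6 + ['2015-01-02'] * 3,
    'P1': [190, 225, 3427, 945, 1228, 870, 785, 1137, 175],
    'P2': [1132, 1765, 29421, 7679, 9537, 6903, 4768, 7065, 875],
})

# Derive the weekday name from the date instead of storing it by hand.
df['day'] = pd.to_datetime(df['date']).dt.day_name()

# Mean of every sample that falls on each weekday, for both parameters.
means = df.groupby('day')[['P1', 'P2']].mean()
print(means)
```

Calling `means['P1'].plot(kind='bar')` would then draw the chart for P1.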

Full example:

import numpy as np
import pandas as pd

days = ['Mon', 'Tue', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun']
day = np.random.choice(days, size=1000)
p1, p2 = np.random.randint(low=0, high=2500, size=(2, 1000))
df = pd.DataFrame({'P1': p1, 'P2': p2, 'day': day})

# Helps for ordering of day-of-week in plot
df['day'] = pd.Categorical(df.day, categories=days)

# %matplotlib inline

df[['day', 'P1']].groupby('day').mean().plot(kind='bar', legend=None)
df[['day', 'P2']].groupby('day').mean().plot(kind='bar', legend=None)

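Note that calling pd.Categorical on the existing DataFrame column gives you a custom sort key, as shown here. Without it, the bars would appear in alphabetical order rather than in weekday order.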
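To illustrate the effect of the categorical conversion, here is a small sketch with made-up values (the five rows are hypothetical, not from the question). It compares the group order before and after the conversion. `observed=True` is passed so that only weekdays present in the data appear in the result.

```python
import pandas as pd

days = ['Mon', 'Tue', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun']

# Hypothetical toy data, deliberately out of weekday order.
df = pd.DataFrame({'day': ['Wed', 'Mon', 'Sun', 'Mon', 'Fri'],
                   'P1': [3, 1, 7, 5, 4]})

# With plain strings, groupby sorts the group keys alphabetically.
alphabetical = df.groupby('day')['P1'].mean().index.tolist()

# With a Categorical, groupby follows the order given in `categories`.
df['day'] = pd.Categorical(df['day'], categories=days)
by_weekday = df.groupby('day', observed=True)['P1'].mean().index.tolist()

print(alphabetical)
print(by_weekday)
```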

Result (for P1): bar chart of mean P1 by day of week (image not included).

Update

You asked in the comments,

Does groupby find the average of a given parameter (say P1), over all
instances of the group? For instance, if I have 8 Mondays, is the
resulting value the average of all datapoints that occurred on Monday?
An added hurdle here is that I have unreliable sampling for the data.
If I had a Monday with 10 samples and a Monday with 1, simply
averaging all 11 values would drown out the Monday with a small sample
size. Thus, I would like to average all values for a given date before
considering the day of week.

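Yes, the groupby above takes the mean over all instances in each group, so every Monday sample is weighted equally. Here is a way to get the "two-stage" average instead: first average within each date, then average those daily means by day of week.

# For P1; use 'P2' in place of 'P1' to get the P2 averages.
df.groupby(['date', 'day'])['P1'].mean()\
    .reset_index().groupby('day')['P1'].mean().plot(kind='bar', legend=None)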
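The difference between the two averages is easiest to see on the scenario from the comment. The sketch below uses hypothetical numbers: one Monday with 10 samples and another Monday with a single sample. The pooled mean is dominated by the well-sampled Monday, while the two-stage mean weights both Mondays equally.

```python
import pandas as pd

# Hypothetical data: 10 samples on one Monday, 1 sample on the next.
df = pd.DataFrame({
    'date': ['2015-01-05'] * 10 + ['2015-01-12'],
    'day': ['Monday'] * 11,
    'P1': [100] * 10 + [12],
})

# Pooled: every sample counts equally, (10 * 100 + 12) / 11.
pooled = df.groupby('day')['P1'].mean()

# Two-stage: average within each date first, then across dates, (100 + 12) / 2.
daily = df.groupby(['date', 'day'])['P1'].mean()
two_stage = daily.groupby(level='day').mean()

print(pooled['Monday'])     # 92.0
print(two_stage['Monday'])  # 56.0
```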
