How do I group by and aggregate an operation over multiple columns?

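I am trying to compute the mean of each score column for the rows of a data frame, grouped by two columns, but I get the following error: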

TypeError: 'numpy.float64' object is not callable

Data frame:

       date               origin  positive_score  neutral_score  negativity_score  compound_score
 2020-09-19            the verge           0.130          0.846             0.024          0.9833
 2020-09-19            the verge           0.130          0.846             0.024          0.9833
 2020-09-19                 fool           0.075          0.869             0.056          0.8560
 2020-09-19        seeking_alpha           0.067          0.918             0.015          0.9983
 2020-09-19        seeking_alpha           0.171          0.791             0.038          0.7506
 2020-09-19        seeking_alpha           0.095          0.814             0.091          0.9187
 2020-09-19        seeking_alpha           0.113          0.801             0.086          0.9890
 2020-09-19        seeking_alpha           0.094          0.869             0.038          0.9997
 2020-09-19  wall street journal           0.000          1.000             0.000          0.0000
 2020-09-19        seeking_alpha           0.179          0.779             0.042          0.9997
 2020-09-19        seeking_alpha           0.178          0.704             0.117          0.7360

My code:

    def mean_indicators(cls, df: pd.DataFrame):
        df_with_mean = df.groupby([DATE, ORIGIN], as_index=False).agg({
            POSITIVE_score: df[POSITIVE_score].mean(),
            NEGATIVE_score: df[NEGATIVE_score].mean(),
            NEUTRAL_score: df[NEUTRAL_score].mean(),
            COMPOUND_score: df[COMPOUND_score].mean(),
        })
        return df_with_mean

Solution

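The error occurs because each `df[...].mean()` in the dict is evaluated immediately and returns a `numpy.float64`, which `agg` then tries to call as a function. Pass the name of the aggregation instead. I think this should do what you want: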

    def mean_indicators(cls, df: pd.DataFrame):
        df_with_mean = df.groupby([DATE, ORIGIN], as_index=False).agg({
            POSITIVE_SCORE: "mean",
            NEGATIVE_SCORE: "mean",
            NEUTRAL_SCORE: "mean",
            COMPOUND_SCORE: "mean",
        })
        return df_with_mean
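
To make the difference between passing a computed value and passing an aggregation name concrete, here is a minimal sketch. It assumes a small made-up frame with hypothetical columns `origin` and `score`, not the data from the question, and the exact error text may vary between pandas versions.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "origin": ["a", "a", "b"],
    "score": [1.0, 3.0, 5.0],
})

# Buggy pattern: the mean is computed right away, so agg receives a number.
precomputed = df["score"].mean()  # a numpy.float64, not a function
error_message = None
try:
    df.groupby("origin", as_index=False).agg({"score": precomputed})
except TypeError as exc:
    error_message = str(exc)

# Fixed pattern: pass the aggregation name and let pandas apply it per group.
fixed = df.groupby("origin", as_index=False).agg({"score": "mean"})

print(error_message)
print(fixed)
```

The string `"mean"` is resolved by pandas for each group, whereas the precomputed value is a single number for the whole column and cannot be applied to anything.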

You can also use the named aggregation syntax.
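
As a rough sketch of what named aggregation could look like here, the example below uses a few rows copied from the question's data frame and the lowercase column names shown there. The output names ending in `_mean` are illustrative choices, not something the original answer specifies.

```python
import pandas as pd

# A few rows copied from the question's data frame.
df = pd.DataFrame({
    "date": ["2020-09-19"] * 4,
    "origin": ["the verge", "the verge", "fool", "wall street journal"],
    "positive_score": [0.130, 0.130, 0.075, 0.000],
    "neutral_score": [0.846, 0.846, 0.869, 1.000],
    "negativity_score": [0.024, 0.024, 0.056, 0.000],
    "compound_score": [0.9833, 0.9833, 0.8560, 0.0000],
})

# Named aggregation: output_column=(input_column, aggregation).
# The *_mean output names are hypothetical and can be anything you like.
df_with_mean = df.groupby(["date", "origin"], as_index=False).agg(
    positive_mean=("positive_score", "mean"),
    neutral_mean=("neutral_score", "mean"),
    negativity_mean=("negativity_score", "mean"),
    compound_mean=("compound_score", "mean"),
)

print(df_with_mean)
```

Compared with the dict form, this lets you rename the result columns in the same call, which helps if you later add a second aggregation on the same input column.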

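Alternatively, since every score column gets the same aggregation, a plain groupby followed by mean is enough:
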
# just groupby and mean
df_mean = df.groupby(['date','origin'],as_index=False).mean()

# display(df_mean)
       date               origin  positive_score  neutral_score  negativity_score  compound_score
 2020-09-19                 fool        0.075000       0.869000             0.056        0.856000
 2020-09-19        seeking_alpha        0.128143       0.810857             0.061        0.913143
 2020-09-19            the verge        0.130000       0.846000             0.024        0.983300
 2020-09-19  wall street journal        0.000000       1.000000             0.000        0.000000
