How do I use groupby on multiple columns in Python?

I have a DataFrame df with the values below. How can I use groupby to build a summary from them?

name    numb    exam       marks
tom     2546    math       25
tom     2546    science    25
tom     2546    env        25
mark    2547    math       15
mark    2547    env        10
sam     2548    env        18

This is the result I tried to produce:

name   numb       total_exams_attended       total_maths_exam_attended  total_marks_scored_in_maths  total_marks_scored
 
tom    2546           3                               1                       25                          75
mark   2547           2                               1                       15                          25
sam    2548           1                               0                                                   18

But I am stuck on the total_marks_scored_in_maths column. How can I group and aggregate on only one specific column value (here, math)?

Solution

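Consider pivot_table, followed by some manipulation of the column names, which is needed because the result has hierarchical columns named after the aggregates: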

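# Count and sum marks per name and exam; margins=True adds the 'total' row and columns.
pivot_df = df.pivot_table(index='name', columns='exam', values='marks',
                          aggfunc=['count', 'sum'], margins=True, margins_name='total')

# Flatten the two-level columns into '<exam>_<aggregate>' names, e.g. 'math_marks_scored'.
pivot_df.columns = [i + '_' + j.replace('count', 'exams_attended').replace('sum', 'marks_scored')
                    for i, j in zip(pivot_df.columns.get_level_values(1), pivot_df.columns.get_level_values(0))]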

Output

pivot_df
#        env_exams_attended  math_exams_attended  science_exams_attended  total_exams_attended  env_marks_scored  math_marks_scored  science_marks_scored  total_marks_scored
# name
# mark                  1.0                  1.0                     0.0                     2              10.0               15.0                   0.0                  25
# sam                   1.0                  0.0                     0.0                     1              18.0                0.0                   0.0                  18
# tom                   1.0                  1.0                     1.0                     3              25.0               25.0                  25.0                  75
# total                 3.0                  2.0                     1.0                     6              53.0               40.0                  25.0                 118

If you need to keep only the math and total columns, filter with .loc:

math_pvt_df = pivot_df.loc[df['name'].unique(),["math_exams_attended","total_exams_attended","math_marks_scored","total_marks_scored"]]

math_pvt_df
#       math_exams_attended  total_exams_attended  math_marks_scored  total_marks_scored
# name
# mark                  1.0                     2               15.0                  25
# sam                   0.0                     1                0.0                  18
# tom                   1.0                     3               25.0                  75
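If you would rather stay with groupby, as the question title asks, one possible sketch is to flag the math rows first and then use named aggregation on both key columns. This is an alternative to the pivot_table approach, not part of the original answer. The helper columns is_math and math_marks are names made up for illustration, and named aggregation assumes pandas 0.25 or later.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "name": ["tom", "tom", "tom", "mark", "mark", "sam"],
    "numb": [2546, 2546, 2546, 2547, 2547, 2548],
    "exam": ["math", "science", "env", "math", "env", "env"],
    "marks": [25, 25, 25, 15, 10, 18],
})

# Boolean mask marking the rows that belong to the math exam.
is_math = df["exam"].eq("math")

summary = (
    # Illustrative helper columns: the flag itself, and marks zeroed out for non-math rows.
    df.assign(is_math=is_math, math_marks=df["marks"].where(is_math, 0))
      # Group on both key columns; sort=False keeps the order of first appearance.
      .groupby(["name", "numb"], sort=False)
      .agg(
          total_exams_attended=("exam", "count"),
          total_maths_exam_attended=("is_math", "sum"),
          total_marks_scored_in_maths=("math_marks", "sum"),
          total_marks_scored=("marks", "sum"),
      )
      .reset_index()
)

print(summary)
```

Unlike the desired output in the question, this yields 0 rather than a blank for sam in total_marks_scored_in_maths, because sam has no math rows and the sum of the zeroed marks is 0.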
