I need some help with a pandas DataFrame.
Here is the DataFrame:
    group  col1    col2  name
    1      dog     40    canidae
    1      dog     40    canidae
    1      dog     40    canidae
    1      dog     40    canidae
    1      dog     40
    1      dog     40    canidae
    1      dog     40    canidae
    2      frog    85    dendrobatidae
    2      frog    89    leptodactylidae
    2      frog    89    leptodactylidae
    2      frog    82    leptodactylidae
    2      frog    89
    2      frog    81
    2      frog    89    dendrobatidae
    3      horse   87    equidae1
    3      donkey  76    equidae2
    3      zebra   67    equidae3
    4      bird    54    psittacidae
    4      bird    56
    4      bird    34
    5      bear    67
    5      bear    54
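For anyone who wants to reproduce the problem, the table above can be rebuilt roughly as follows. This is only a sketch: the variable names `rows` and `df` are assumptions, and missing names are represented as `None`, which pandas treats as missing.

```python
import pandas as pd

# One tuple per row of the table above: (group, col1, col2, name).
# None marks a row whose name is missing.
rows = [
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, None),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (2, "frog", 85, "dendrobatidae"),
    (2, "frog", 89, "leptodactylidae"),
    (2, "frog", 89, "leptodactylidae"),
    (2, "frog", 82, "leptodactylidae"),
    (2, "frog", 89, None),
    (2, "frog", 81, None),
    (2, "frog", 89, "dendrobatidae"),
    (3, "horse", 87, "equidae1"),
    (3, "donkey", 76, "equidae2"),
    (3, "zebra", 67, "equidae3"),
    (4, "bird", 54, "psittacidae"),
    (4, "bird", 56, None),
    (4, "bird", 34, None),
    (5, "bear", 67, None),
    (5, "bear", 54, None),
]

df = pd.DataFrame(rows, columns=["group", "col1", "col2", "name"])
print(df)
```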
And this is the result I would like to get:

    group  col1    col2  name             consensus_name
    1      dog     40    canidae          canidae
    1      dog     40    canidae          canidae
    1      dog     40                     canidae
    1      dog     40    canidae          canidae
    1      dog     40    canidae          canidae
    2      frog    85    dendrobatidae    leptodactylidae
    2      frog    89    leptodactylidae  leptodactylidae
    2      frog    89    leptodactylidae  leptodactylidae
    2      frog    82    leptodactylidae  leptodactylidae
    2      frog    89                     leptodactylidae
    2      frog    81                     leptodactylidae
    2      frog    89    dendrobatidae    leptodactylidae
    3      horse   87    equidae1         equidae3
    3      donkey  76    equidae2         equidae3
    3      zebra   67    equidae3         equidae3
    4      bird    54    psittacidae      psittacidae
    4      bird    56                     psittacidae
    4      bird    34                     psittacidae
    5      bear    67                     NA
    5      bear    54                     NA
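The goal is a new column that holds, for each group, the most representative name of that group.
- For group 1, there are 4 rows named 'canidae' and one row with nothing, so I write 'canidae' in the consensus_name column for every row.
- For group 2, there are 2 rows named 'dendrobatidae', 2 rows with nothing, and 3 rows named 'leptodactylidae', so I write 'leptodactylidae' in consensus_name for every row.
- For group 3, the 3 rows all have different names, so there is no consensus. In that case I take the name from the row with the lowest col2 value, so I write 'equidae3' in the consensus_name column.
- For group 4, only one row has a name, so that name is the consensus for the group and I write 'psittacidae' in the consensus_name column.
- For group 5, there is no information at all, so I just write NA in the consensus_name column.
Does anyone have an idea how to do this with pandas? Thanks for your help :)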
Output with anky's answer:

        group  col1    col2  name             consensus_name
    0   1      dog     40    canidae          canidae
    1   1      dog     40    canidae          canidae
    2   1      dog     40    canidae          canidae
    3   1      dog     40    canidae          canidae
    4   1      dog     40    NaN              canidae
    5   1      dog     40    canidae          canidae
    6   1      dog     40    canidae          canidae
    7   2      frog    85    dendrobatidae    dendrobatidae
    8   2      frog    89    leptodactylidae  leptodactylidae
    9   2      frog    89    leptodactylidae  leptodactylidae
    10  2      frog    82    leptodactylidae  leptodactylidae
    11  2      frog    89    NaN              leptodactylidae
    12  2      frog    81    NaN              leptodactylidae
    13  2      frog    89    dendrobatidae    dendrobatidae
    14  3      horse   87    equidae1         equidae1
    15  3      donkey  76    equidae2         equidae2
    16  3      zebra   67    equidae3         equidae3
    17  4      bird    54    psittacidae      psittacidae
    18  4      bird    56    NaN              psittacidae
    19  4      bird    34    NaN              psittacidae
    20  5      bear    67    NaN              NaN
    21  5      bear    54    NaN              NaN
Solution
Group by 'group', select the name column, and call transform (pandas' SeriesGroupBy.transform) with 'max':
    # First replace missing names with an empty string, so that 'max' compares only strings.
    # Assigning the result back avoids the chained inplace fillna, which newer pandas versions warn about.
    df['name'] = df['name'].fillna('')
    # Broadcast the largest (lexicographically last) name of each group to all of its rows.
    df['consensus_name'] = df.groupby('group')['name'].transform('max')
    print(df)

        group  col1    col2  name             consensus_name
    0   1      dog     40    canidae          canidae
    1   1      dog     40    canidae          canidae
    2   1      dog     40    canidae          canidae
    3   1      dog     40    canidae          canidae
    4   1      dog     40                     canidae
    5   1      dog     40    canidae          canidae
    6   1      dog     40    canidae          canidae
    7   2      frog    85    dendrobatidae    leptodactylidae
    8   2      frog    89    leptodactylidae  leptodactylidae
    9   2      frog    89    leptodactylidae  leptodactylidae
    10  2      frog    82    leptodactylidae  leptodactylidae
    11  2      frog    89                     leptodactylidae
    12  2      frog    81                     leptodactylidae
    13  2      frog    89    dendrobatidae    leptodactylidae
    14  3      horse   87    equidae1         equidae3
    15  3      donkey  76    equidae2         equidae3
    16  3      zebra   67    equidae3         equidae3
    17  4      bird    54    psittacidae      psittacidae
    18  4      bird    56                     psittacidae
    19  4      bird    34                     psittacidae
    20  5      bear    67
    21  5      bear    54
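One caveat worth illustrating: on strings, 'max' returns the name that sorts last alphabetically, not the most frequent one. It matches the expected output above only because 'leptodactylidae' and 'equidae3' happen to sort last in their groups. A small counter-example on made-up data (the frame `demo` and its values are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical data: 'ant' is the most frequent name, but 'zebra' sorts last.
demo = pd.DataFrame({
    "group": [1, 1, 1],
    "name": ["ant", "ant", "zebra"],
})

# 'max' picks the lexicographically largest string in each group.
demo["by_max"] = demo.groupby("group")["name"].transform("max")

# The mode picks the most frequent value; iloc[0] takes the first mode.
demo["by_mode"] = demo.groupby("group")["name"].transform(lambda s: s.mode().iloc[0])

print(demo)
```

Here by_max is 'zebra' on every row while by_mode is 'ant', so the two approaches only agree by coincidence.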
Edit, after it was pointed out that the above does not work in general:
    import pandas as pd

    # Judging by the NaN values in the output, this starts from the original df (names not yet replaced by '').
    # Forward-fill missing names within each group (note: this overwrites the name column).
    df['name'] = df.groupby('group')['name'].ffill()
    # Compute the mode(s) of name per group; level_1 numbers the modes when there are several.
    df_group = df.groupby('group')['name'].apply(lambda x: pd.Series.mode(x, dropna=False)).reset_index()
    # Keep only the last mode of each group.
    df_group = df_group[df_group.level_1 == df_group.groupby('group').level_1.transform('max')]
    df_group = df_group.rename(columns={'name': 'consensus_name'})
    # Attach the per-group consensus name to every row.
    df_final = pd.merge(df, df_group, on='group')
    print(df_final)

        group  col1    col2  name             level_1  consensus_name
    0   1      dog     40    canidae          0        canidae
    1   1      dog     40    canidae          0        canidae
    2   1      dog     40    canidae          0        canidae
    3   1      dog     40    canidae          0        canidae
    4   1      dog     40    canidae          0        canidae
    5   1      dog     40    canidae          0        canidae
    6   1      dog     40    canidae          0        canidae
    7   2      frog    85    dendrobatidae    0        leptodactylidae
    8   2      frog    89    leptodactylidae  0        leptodactylidae
    9   2      frog    89    leptodactylidae  0        leptodactylidae
    10  2      frog    82    leptodactylidae  0        leptodactylidae
    11  2      frog    89    leptodactylidae  0        leptodactylidae
    12  2      frog    81    leptodactylidae  0        leptodactylidae
    13  2      frog    89    dendrobatidae    0        leptodactylidae
    14  3      horse   87    equidae1         2        equidae3
    15  3      donkey  76    equidae2         2        equidae3
    16  3      zebra   67    equidae3         2        equidae3
    17  4      bird    54    psittacidae      0        psittacidae
    18  4      bird    56    psittacidae      0        psittacidae
    19  4      bird    34    psittacidae      0        psittacidae
    20  5      bear    67    NaN              0        NaN
    21  5      bear    54    NaN              0        NaN
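The edited answer reproduces the expected output for this data, but the forward fill changes the name counts and, on a tie, it keeps the last mode in sort order rather than the name with the lowest col2. As an alternative, here is a minimal sketch that follows the rules from the question literally: the most frequent non-missing name wins; on a tie, take the name on the row with the lowest col2; with no names at all, use NaN. The helper name `consensus_name_for` and the dict-based mapping are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# The question's data as (group, col1, col2, name); None marks a missing name.
rows = [
    (1, "dog", 40, "canidae"), (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"), (1, "dog", 40, "canidae"),
    (1, "dog", 40, None), (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (2, "frog", 85, "dendrobatidae"), (2, "frog", 89, "leptodactylidae"),
    (2, "frog", 89, "leptodactylidae"), (2, "frog", 82, "leptodactylidae"),
    (2, "frog", 89, None), (2, "frog", 81, None),
    (2, "frog", 89, "dendrobatidae"),
    (3, "horse", 87, "equidae1"), (3, "donkey", 76, "equidae2"),
    (3, "zebra", 67, "equidae3"),
    (4, "bird", 54, "psittacidae"), (4, "bird", 56, None),
    (4, "bird", 34, None),
    (5, "bear", 67, None), (5, "bear", 54, None),
]
df = pd.DataFrame(rows, columns=["group", "col1", "col2", "name"])


def consensus_name_for(g):
    """Return the consensus name for one group's rows (hypothetical helper)."""
    named = g.dropna(subset=["name"])
    if named.empty:
        return np.nan  # no name in the group at all (group 5)
    counts = named["name"].value_counts()
    top = counts[counts == counts.max()].index
    if len(top) == 1:
        return top[0]  # exactly one most frequent name
    # Tie: among the tied names, take the one on the row with the lowest col2.
    tied = named[named["name"].isin(top)]
    return tied.loc[tied["col2"].idxmin(), "name"]


# Compute one value per group, then map it back onto every row.
per_group = {key: consensus_name_for(g) for key, g in df.groupby("group")}
df["consensus_name"] = df["group"].map(per_group)
print(df)
```

Unlike the forward-fill approach, this leaves the original name column untouched.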