
Pandas DataFrame groupby year: adding the most recent year's value to the last month of the previous year

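I want to group this sample DataFrame by year. If, for a given year, the sum of 'Severe cases U5' is 0 and the sum of 'Num_U5_Received_Severe_Treatment' is greater than 0, I want to add that year's 'Num_U5_Received_Severe_Treatment' total to the last month of the previous year. Here is my data: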

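import numpy as np
import pandas as pd

# The third row is garbled in the source; month 1 for 2021 is an assumed value.
data = [[2020,11,1,1],[2020,12,2,2],[2021,1,0,1]]
df = pd.DataFrame(data,columns = ['year','month','Severe cases U5','Num_U5_Received_Severe_Treatment'])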

I started with what you see below (I was not sure how to index a groupby object), but I keep getting this error: TypeError: tuple indices must be integers or slices, not str

for group in df.groupby('year'):
    if group['Severe cases U5'].sum() == 0 and group['Num_U5_Received_Severe_Treatment'].sum() > 0:
         pre_year = group['year'] - 1
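
For context on that error: iterating over a groupby object yields (key, sub-DataFrame) tuples, so group['Severe cases U5'] ends up indexing a tuple with a string. A minimal sketch on made-up toy data (the frame `toy` and its column `x` are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical toy data, only to show what groupby iteration yields.
toy = pd.DataFrame({'year': [2020, 2020, 2021], 'x': [1, 2, 3]})

# Each item produced by iterating a groupby is a (key, sub-DataFrame) tuple.
first = next(iter(toy.groupby('year')))
print(type(first))

# Unpacking the tuple gives direct access to the sub-DataFrame.
sums = {}
for year, group in toy.groupby('year'):
    sums[int(year)] = int(group['x'].sum())
print(sums)
```

Unpacking into two names in the for statement is what makes column access on the group work.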

Update: I figured out the group iteration. Now I need to figure out the row iteration; that part of the code does not work. See below:

for year,group in df.groupby('year'):
    if (group['Severe cases U5'].sum() == 0 and group['Num_U5_Received_Severe_Treatment'].sum() > 0):
        year_current = np.unique(group['year'])[0] -1
        excess_death = group['Num_U5_Received_Severe_Treatment'].sum()
        group['Num_U5_Received_Severe_Treatment'] = 0
        print(excess_death)
        print(year_current)
    for month_row,row in group.iterrows():
        if (row['year'] == year and row['month'] == 12): #not working but what i am trying to do
            row['Num_U5_Received_Severe_Treatment'] =+ excess_death
            print(row['Num_U5_Received_Severe_Treatment'])
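
Two details in the attempt above may explain why the row update has no visible effect. First, `x =+ y` parses as `x = +y`, so it assigns rather than adds. Second, the row handed out by iterrows() can be a copy, so writing to it is not guaranteed to change the DataFrame. A minimal sketch, assuming a hypothetical toy frame with columns `month` and `n` and a made-up `excess` value, that writes back through .loc instead:

```python
import pandas as pd

# Hypothetical toy data; 'n' stands in for the treatment column.
toy = pd.DataFrame({'month': [11, 12], 'n': [1, 2]})
excess = 5  # made-up amount to add to the December row

for idx, row in toy.iterrows():
    if row['month'] == 12:
        # Write to the DataFrame itself via its index label,
        # and use += (add), not =+ (assign a positive value).
        toy.loc[idx, 'n'] += excess

print(toy['n'].tolist())
```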

Solution

I finally figured it out! It is a bulky solution, but it works.

import pandas as pd

# The third row is garbled in the source; month 1 for 2021 is an assumed value.
data = [[2020,11,1,1],[2020,12,2,2],[2021,1,0,1]]
df = pd.DataFrame(data,columns = ['year','month','Severe cases U5','Num_U5_Received_Severe_Treatment'])

for year,group in df.groupby('year'):
    if (group['Severe cases U5'].sum() == 0 and group['Num_U5_Received_Severe_Treatment'].sum() > 0):
        year_current = year - 1  # the year that receives the excess
        excess_death = group['Num_U5_Received_Severe_Treatment'].sum()
        # Zero out the treated counts in the year that had no severe cases.
        for month_row,row in group.iterrows():
            if (row['Num_U5_Received_Severe_Treatment'] > 0):
                df.loc[month_row,'Num_U5_Received_Severe_Treatment'] = 0
        # Add the excess to December of the previous year. Doing this inside the
        # condition avoids a NameError when no year matches.
        for month_row,row in df.iterrows():
            if (row['year'] == year_current and row['month'] == 12):
                death_previous = row['Num_U5_Received_Severe_Treatment'] + excess_death
                df.loc[month_row,'Num_U5_Received_Severe_Treatment'] = death_previous
print(df)
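
As a possible alternative to the row-by-row loops, the same idea can be sketched with per-year totals and boolean masks. This is not the original answer's method, only one way it might be written. The month of the 2021 row is an assumed value (it is garbled in the source), and skipping a year whose previous December is missing is a choice made here, not something the question specifies.

```python
import pandas as pd

# Month 1 for the 2021 row is an assumption; that row is garbled in the source.
data = [[2020, 11, 1, 1], [2020, 12, 2, 2], [2021, 1, 0, 1]]
df = pd.DataFrame(data, columns=['year', 'month', 'Severe cases U5',
                                 'Num_U5_Received_Severe_Treatment'])
col = 'Num_U5_Received_Severe_Treatment'

# Per-year totals, then the years with no severe cases but some treatments.
totals = df.groupby('year')[['Severe cases U5', col]].sum()
flagged = totals[(totals['Severe cases U5'] == 0) & (totals[col] > 0)]

for year, excess in flagged[col].items():
    target = (df['year'] == year - 1) & (df['month'] == 12)
    if target.any():  # only move the value if the previous December exists
        df.loc[df['year'] == year, col] = 0
        df.loc[target, col] += excess

print(df[col].tolist())
```

Because the totals are computed once up front, a value moved into an earlier year does not affect which years are flagged.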
