
Groupby + new column + take the value from the previous row based on a condition


I have this setup:

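import pandas as pd

# Column values are taken from the table printed below; vt, c1, c2 and c3 are strings.
df = pd.DataFrame({
    'user': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4],
    'date': ['1995-09-01', '1995-09-02', '1995-10-03', '1995-10-04',
             '1995-10-05', '1995-11-07', '1995-11-08', '1995-11-09',
             '1995-11-10', '1995-11-15', '1995-12-18', '1995-12-19',
             '1995-12-20', '1995-12-23', '1995-12-26', '1995-12-27'],
    'dc': ['1995-09-02', '1995-09-02', '1995-10-02', '1995-10-05',
           '1995-10-05', '1995-11-05', '1995-11-05', '1995-11-10',
           '1995-11-10', '1995-11-10', '1995-12-10', '1995-12-23',
           '1995-12-23', '1995-12-23', '1995-12-23', '1995-12-23'],
    'tp': ['s', 'c', 'f', 's', 'c', 'c', 'f', 's', 'c', 's', 'f', 's', 's', 'c', 's', 'f'],
    'vt': ['0', '1', '0', '0', '1', '0', '0', '0', '1', '0', '0', '0', '0', '1', '0', '0'],
    'c1': ['1', '5', '0', '2', '3', '9', '3', '2', '0', '5', '5', '6', '4', '0', '6', '0'],
    'c2': ['3', '4', '0', '2', '5', '3', '8', '4', '0', '6', '2', '7', '0', '0', '8', '0'],
    'c3': ['5', '5', '2', '5', '6', '4', '2', '4', '4', '6', '3', '4', '3', '8', '2', '7'],
})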
df

which gives:

user    date        dc     tp   vt  c1   c2  c3
 1  1995-09-01  1995-09-02  s   0    1   3   5
 1  1995-09-02  1995-09-02  c   1    5   4   5
 1  1995-10-03  1995-10-02  f   0    0   0   2
 2  1995-10-04  1995-10-05  s   0    2   2   5
 2  1995-10-05  1995-10-05  c   1    3   5   6
 2  1995-11-07  1995-11-05  c   0    9   3   4
 2  1995-11-08  1995-11-05  f   0    3   8   2
 3  1995-11-09  1995-11-10  s   0    2   4   4
 3  1995-11-10  1995-11-10  c   1    0   0   4
 3  1995-11-15  1995-11-10  s   0    5   6   6
 3  1995-12-18  1995-12-10  f   0    5   2   3
 4  1995-12-19  1995-12-23  s   0    6   7   4
 4  1995-12-20  1995-12-23  s   0    4   0   3
 4  1995-12-23  1995-12-23  c   1    0   0   8
 4  1995-12-26  1995-12-23  s   0    6   8   2
 4  1995-12-27  1995-12-23  f   0    0   0   7

I want to create a new column df['dc2'] that, grouped by user, defaults to df['dc']. However, if the row of that user with 'tp' = 'c' and 'vt' = 1 also has 'c1' = 0 and 'c2' = 0, then df['dc2'] should take the date of the entry just before it (for the same user) instead.

For example, for user 3, the entry with 'tp' = 'c' and 'vt' = 1 has 'c1' = 0 and 'c2' = 0, so df['dc2'] for user 3 should be '1995-11-09' instead of '1995-11-10'.

Likewise, for user 4, the entry with 'tp' = 'c' and 'vt' = 1 has 'c1' = 0 and 'c2' = 0, so df['dc2'] for user 4 should be '1995-12-20' instead of '1995-12-23'.
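The rule for users 3 and 4 can be checked directly with a small sketch. It assumes only the rows for those two users, copied from the table above; the names `sample`, `hit`, `prev_date`, and `replacement` are illustrative and not part of the original setup.

```python
import pandas as pd

# Rows for users 3 and 4, copied from the table above (values kept as strings).
sample = pd.DataFrame({
    'user': [3, 3, 3, 3, 4, 4, 4, 4, 4],
    'date': ['1995-11-09', '1995-11-10', '1995-11-15', '1995-12-18',
             '1995-12-19', '1995-12-20', '1995-12-23', '1995-12-26', '1995-12-27'],
    'tp':   ['s', 'c', 's', 'f', 's', 's', 'c', 's', 'f'],
    'vt':   ['0', '1', '0', '0', '0', '0', '1', '0', '0'],
    'c1':   ['2', '0', '5', '5', '6', '4', '0', '6', '0'],
    'c2':   ['4', '0', '6', '2', '7', '0', '0', '8', '0'],
})

# True on rows where tp == 'c', vt == '1', c1 == '0' and c2 == '0'.
hit = (sample['tp'].eq('c') & sample['vt'].eq('1')
       & sample['c1'].eq('0') & sample['c2'].eq('0'))

# Date of the preceding row within the same user.
prev_date = sample.groupby('user')['date'].shift(1)

# One line per triggering row: the user and the date that dc2 should take.
replacement = sample.loc[hit, ['user']].assign(dc2=prev_date[hit])
print(replacement)
```

For this sample the triggering rows are the 'c' rows of users 3 and 4, and the dates picked up from the rows before them are the two values quoted in the examples above.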

This is the expected result:

user    date       dc           dc2     tp   vt c1  c2  c3
1   1995-09-01  1995-09-02  1995-09-02   s   0   1   3   5
1   1995-09-02  1995-09-02  1995-09-02   c   1   5   4   5
1   1995-10-03  1995-10-02  1995-10-02   f   0   0   0   2
2   1995-10-04  1995-10-05  1995-10-05   s   0   2   2   5
2   1995-10-05  1995-10-05  1995-10-05   c   1   3   5   6
2   1995-11-07  1995-11-05  1995-11-05   c   0   9   3   4
2   1995-11-08  1995-11-05  1995-11-05   f   0   3   8   2
3   1995-11-09  1995-11-10  1995-11-09   s   0   2   4   4
3   1995-11-10  1995-11-10  1995-11-09   c   1   0   0   4
3   1995-11-15  1995-11-10  1995-11-09   s   0   5   6   6
3   1995-12-18  1995-12-10  1995-12-09   f   0   5   2   3
4   1995-12-19  1995-12-23  1995-12-20   s   0   6   7   4
4   1995-12-20  1995-12-23  1995-12-20   s   0   4   0   3
4   1995-12-23  1995-12-23  1995-12-20   c   1   0   0   8
4   1995-12-26  1995-12-23  1995-12-20   s   0   6   8   2
4   1995-12-27  1995-12-23  1995-12-20   f   0   0   0   7

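Solution

Let's create a boolean mask for the condition tp = 'c', vt = '1', c1 = '0', c2 = '0', then group by the user column and apply a custom transform function f that selects the date from the row preceding each match: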

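# True on rows that satisfy the full condition (column values are strings)
m = (df['tp'].eq('c') & df['vt'].eq('1')
     & df['c1'].eq('0') & df['c2'].eq('0'))

# Keep the date only on the row just before a match, then spread it over the group
f = lambda s: s.mask(~m.shift(-1, fill_value=False)).ffill().bfill()
# transform keeps the original index; users without a match fall back to dc
df['dc2'] = df.groupby('user')['date'].transform(f).fillna(df['dc'])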

    user        date          dc tp vt c1 c2 c3         dc2
0      1  1995-09-01  1995-09-02  s  0  1  3  5  1995-09-02
1      1  1995-09-02  1995-09-02  c  1  5  4  5  1995-09-02
2      1  1995-10-03  1995-10-02  f  0  0  0  2  1995-10-02
3      2  1995-10-04  1995-10-05  s  0  2  2  5  1995-10-05
4      2  1995-10-05  1995-10-05  c  1  3  5  6  1995-10-05
5      2  1995-11-07  1995-11-05  c  0  9  3  4  1995-11-05
6      2  1995-11-08  1995-11-05  f  0  3  8  2  1995-11-05
7      3  1995-11-09  1995-11-10  s  0  2  4  4  1995-11-09
8      3  1995-11-10  1995-11-10  c  1  0  0  4  1995-11-09
9      3  1995-11-15  1995-11-10  s  0  5  6  6  1995-11-09
10     3  1995-12-18  1995-12-10  f  0  5  2  3  1995-11-09
11     4  1995-12-19  1995-12-23  s  0  6  7  4  1995-12-20
12     4  1995-12-20  1995-12-23  s  0  4  0  3  1995-12-20
13     4  1995-12-23  1995-12-23  c  1  0  0  8  1995-12-20
14     4  1995-12-26  1995-12-23  s  0  6  8  2  1995-12-20
15     4  1995-12-27  1995-12-23  f  0  0  0  7  1995-12-20
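The same idea can be spelled out step by step. The sketch below is a variant rather than the answer's exact code: it shifts the mask within each user (so a match on the first row of one user cannot flag the last row of the previous user) and names each intermediate result. The names `prev_row`, `candidate`, and `spread` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4],
    'date': ['1995-09-01', '1995-09-02', '1995-10-03', '1995-10-04',
             '1995-10-05', '1995-11-07', '1995-11-08', '1995-11-09',
             '1995-11-10', '1995-11-15', '1995-12-18', '1995-12-19',
             '1995-12-20', '1995-12-23', '1995-12-26', '1995-12-27'],
    'dc': ['1995-09-02', '1995-09-02', '1995-10-02', '1995-10-05',
           '1995-10-05', '1995-11-05', '1995-11-05', '1995-11-10',
           '1995-11-10', '1995-11-10', '1995-12-10', '1995-12-23',
           '1995-12-23', '1995-12-23', '1995-12-23', '1995-12-23'],
    'tp': list('scfsccfscsfsscsf'),
    'vt': list('0100100010000100'),
    'c1': list('1502393205564060'),
    'c2': list('3402538406270080'),
})

# Rows that satisfy the full condition.
m = (df['tp'].eq('c') & df['vt'].eq('1')
     & df['c1'].eq('0') & df['c2'].eq('0'))

# Move each flag up one row within its own user: it now marks the row
# immediately before a match and never crosses a user boundary.
prev_row = m.groupby(df['user']).shift(-1, fill_value=False).astype(bool)

# Keep the date only on the flagged rows; everything else becomes NaN.
candidate = df['date'].where(prev_row)

# Spread the kept date over the whole user group (forward, then backward).
spread = candidate.groupby(df['user']).transform(lambda s: s.ffill().bfill())

# Users with no match have only NaN at this point, so they fall back to dc.
df['dc2'] = spread.fillna(df['dc'])
print(df[['user', 'date', 'dc', 'dc2']])
```

On this data the result matches the table above. Note that the replacement date is applied to every row of a matching user, including rows whose dc differs (such as the last row of user 3).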
