python – 从pandas df中删除行

我想删除pandas df中的所有行.具体来说,当Col A中X下方的行为空时.因此,如果Col A中X下面的行为空,我想删除所有这些行,直到值X下面有一个字符串

import pandas as pd

d = ({
    'A' : ['X','','','X','Foo','','X','Fou','','X','Bar'],           
    'B' : ['Val',1,3,'Val',1,3,'Val',1,3,'Val',1],
    'C' : ['Val',2,4,'Val',2,4,'Val',2,4,'Val',2],
    })

df = pd.DataFrame(data=d)

输出：

      A    B    C
0     X  Val  Val
1          1    2
2          3    4
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2

我试过了：

df = df[~(df['A'] == 'X').shift().fillna(False)]

但是这会删除X后跟的所有内容.如果X下面的下一行为空,我只希望删除它.

意：

     A    B    C
0    X  Val  Val
1  Foo    1    2
2         3    4
3    X  Val  Val
4  Fou    1    2
5         4    4
6    X  Val  Val
7  Bar    1    2

解决方法:

使用：

m1 = df['A'] == 'X'
g =  m1.cumsum()
m = (df['A'] == '') | m1

df = df[~m.groupby(g).transform('all')]
print (df)
      A    B    C
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2

细节：

m1 = df['A'] == 'X'
g =  m1.cumsum()
m = (df['A'] == '') | m1

print (pd.concat([df,
                  df['A'] == 'X',
                  m1.cumsum(),
                  (df['A'] == ''), 
                  m,
                  m.groupby(g).transform('all'),
                  ~m.groupby(g).transform('all')], axis=1,
       keys=['orig','==X','g','==space','m', 'all', 'inverted all']))

   orig              ==X  g ==space      m    all inverted all
      A    B    C      A  A       A      A      A            A
0     X  Val  Val   True  1   False   True   True        False
1          1    2  False  1    True   True   True        False
2          3    4  False  1    True   True   True        False
3     X  Val  Val   True  2   False   True  False         True
4   Foo    1    2  False  2   False  False  False         True
5          3    4  False  2    True   True  False         True
6     X  Val  Val   True  3   False   True  False         True
7   Fou    1    2  False  3   False  False  False         True
8          3    4  False  3    True   True  False         True
9     X  Val  Val   True  4   False   True  False         True
10  Bar    1    2  False  4   False  False  False         True

说明：

>比较X并创建组的累计总和从X到g开始
>链2布尔掩码 – 将X和空白空间与m进行比较
> groupby with transform and DataFrameGroupBy.all for return Trues for groups only only True
>最后反转并按boolean indexing过滤

python – 从pandas df中删除行

相关推荐