
Keeping only one copy of a shared column after cbind of two data frames

These are my two data frames:

> dput(head(C1_com))
structure(list(Term = c("GO:0030198","GO:0043062","GO:0001944","GO:0072358","GO:0001568","GO:0048514"),LogP = c(-17.4296193682,-16.3090192653,-17.0759726333,-17.0759726333,-15.9170353092,-14.7864136301)),row.names = c(NA,-6L),class = c("tbl_df","tbl","data.frame"))
> dput(head(C2_com))
structure(list(Term = c("GO:0030198","GO:0043062","GO:0030335","GO:0040017","GO:0051272","GO:2000147"),LogP = c(-11.3445846204,-10.5074739613,-10.1220888832,-9.9838733854,-9.5214690772,-9.3731567195)),row.names = c(NA,-6L),class = c("tbl_df","tbl","data.frame"))

After the cbind I want to keep only one copy of the shared column, but this is what I get:

 head(C1_C2)
        Term      LogP       Term       LogP
1 GO:0030198 -17.42962 GO:0030198 -11.344585
2 GO:0043062 -16.30902 GO:0043062 -10.507474
3 GO:0001944 -17.07597 GO:0030335 -10.122089
4 GO:0072358 -17.07597 GO:0040017  -9.983873
5 GO:0001568 -15.91704 GO:0051272  -9.521469
6 GO:0048514 -14.78641 GO:2000147  -9.373157

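I would like to keep only one column of the shared terms. I can do this by deleting one of the Term columns after the cbind so that only the first "Term" column remains, but that is a tedious process. Is there something that can be used together with cbind and keeps only one "Term" column?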
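For reference, the deletion can be folded into the cbind call itself by dropping Term from the second data frame before binding. The following is a minimal sketch, assuming plain data frames built from the first three rows shown above (the originals are tibbles). It also checks whether the two Term columns agree, because cbind pairs rows by position, not by Term.

```r
# First three rows of each data frame, copied from the output above
C1_com <- data.frame(
  Term = c("GO:0030198", "GO:0043062", "GO:0001944"),
  LogP = c(-17.4296193682, -16.3090192653, -17.0759726333)
)
C2_com <- data.frame(
  Term = c("GO:0030198", "GO:0043062", "GO:0030335"),
  LogP = c(-11.3445846204, -10.5074739613, -10.1220888832)
)

# Drop Term from the second data frame before binding,
# so only the first copy of Term survives
C1_C2 <- cbind(C1_com, C2_com[names(C2_com) != "Term"])
names(C1_C2)

# cbind aligns rows by position only; check that the Term columns agree
same_term <- C1_com$Term == C2_com$Term
same_term
```

In this sample the third row fails the check, so the LogP values in that row would belong to two different terms even though only one Term column is displayed.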

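Update

Both of my starting data frames have the same column names. Is there a way to label the columns when doing the cbind, so that I can tell that the first two come from C1_com and the third and fourth come from C2_com?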
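One possible way to label the columns is to prefix each data frame's column names with a source tag before binding. This is only a sketch: `label_cols` is a hypothetical helper introduced here for illustration, and the data are plain data frames holding the first two rows shown above.

```r
# Toy stand-ins for the two data frames (first two rows shown above)
C1_com <- data.frame(Term = c("GO:0030198", "GO:0043062"),
                     LogP = c(-17.4296193682, -16.3090192653))
C2_com <- data.frame(Term = c("GO:0030198", "GO:0043062"),
                     LogP = c(-11.3445846204, -10.5074739613))

# Hypothetical helper: prefix every column name with a source label
label_cols <- function(df, label) {
  setNames(df, paste(label, names(df), sep = "."))
}

# Bind the relabelled data frames; each column now records its origin
labelled <- cbind(label_cols(C1_com, "C1"), label_cols(C2_com, "C2"))
names(labelled)
```

The resulting names are C1.Term, C1.LogP, C2.Term and C2.LogP, so the origin of every column stays visible after the bind.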

Here is my final output:

dput(head(C1_C2))
structure(list(
Term = c("GO:0042330","GO:0006935","GO:0098609","GO:0001655","GO:0072001","GO:0001822"),
LogP = c(-15.5665740868,-15.3333915705,-15.1730394873,-14.2710870407,-13.0316539848,-11.7720012424),
Term = c("GO:0006935","GO:0042330","GO:0098609","GO:0030155","GO:0045785","GO:0048589"),
LogP = c(-9.1846695955,-9.0333614068,-8.2012718158,-6.9630841551,-3.1110110087,-5.6023202524),
Term = c("GO:0098609","GO:0030155","GO:0045785","GO:0002009","GO:0048729","GO:0060562"),
LogP = c(-8.400270409,-5.1046710312,-2.2877603428,-5.0328708902,-4.8403582471,-3.367532764),
Term = c("GO:0048589","GO:0042330","GO:0006935","GO:0048729","GO:0001655","GO:0002009"),
LogP = c(-12.0251459649,-7.4342736812,-7.2221883529,-11.3806941521,-10.2926537215,-9.6593776685),
Term = c("GO:0006935","GO:0042330","GO:0048729","GO:0002009","GO:0060562","GO:0072073"),
LogP = c(-7.1913732375,-7.1140368886,-7.668196714,-4.6060571139,-3.1414409878,-2.5797852608),
Term = c("GO:0006935","GO:0042330","GO:0098609","GO:0030155","GO:0045785","GO:0048589"),
LogP = c(-10.6304171879,-10.5285058082,-8.2142677691,-7.8757600983,-6.1772502878,-7.4503144922)
),row.names = c(NA,6L),class = "data.frame")

I only want to keep the first Term column

head(C1_C2)
        Term      LogP       Term      LogP       Term      LogP       Term       LogP       Term      LogP       Term
1 GO:0042330 -15.56657 GO:0006935 -9.184670 GO:0098609 -8.400270 GO:0048589 -12.025146 GO:0006935 -7.191373 GO:0006935
2 GO:0006935 -15.33339 GO:0042330 -9.033361 GO:0030155 -5.104671 GO:0042330  -7.434274 GO:0042330 -7.114037 GO:0042330
3 GO:0098609 -15.17304 GO:0098609 -8.201272 GO:0045785 -2.287760 GO:0006935  -7.222188 GO:0048729 -7.668197 GO:0098609
4 GO:0001655 -14.27109 GO:0030155 -6.963084 GO:0002009 -5.032871 GO:0048729 -11.380694 GO:0002009 -4.606057 GO:0030155
5 GO:0072001 -13.03165 GO:0045785 -3.111011 GO:0048729 -4.840358 GO:0001655 -10.292654 GO:0060562 -3.141441 GO:0045785
6 GO:0001822 -11.77200 GO:0048589 -5.602320 GO:0060562 -3.367533 GO:0002009  -9.659378 GO:0072073 -2.579785 GO:0048589
        LogP
1 -10.630417
2 -10.528506
3  -8.214268
4  -7.875760
5  -6.177250
6  -7.450314

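and delete the remaining Term columns, since they all hold the same terms but with different p-values, each the result of a different comparison. My goal is to see how the enrichment of each term changes across comparisons, which in this case is reported as p-values.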
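That goal, one Term column with one LogP column per comparison, can be sketched in base R by merging on Term instead of binding by position. The data below are the first three rows of the first three Term/LogP pairs in the output above; the names `cmp1` to `cmp3` are placeholders, since the actual comparison names are not given.

```r
# First three rows of the first three Term/LogP pairs shown above;
# cmp1..cmp3 are placeholder names for the comparisons
comparisons <- list(
  cmp1 = data.frame(Term = c("GO:0042330", "GO:0006935", "GO:0098609"),
                    LogP = c(-15.5665740868, -15.3333915705, -15.1730394873)),
  cmp2 = data.frame(Term = c("GO:0006935", "GO:0042330", "GO:0098609"),
                    LogP = c(-9.1846695955, -9.0333614068, -8.2012718158)),
  cmp3 = data.frame(Term = c("GO:0098609", "GO:0030155", "GO:0045785"),
                    LogP = c(-8.400270409, -5.1046710312, -2.2877603428))
)

# Rename LogP to LogP_<comparison> so each column records its origin
renamed <- Map(function(df, nm) {
  names(df)[names(df) == "LogP"] <- paste0("LogP_", nm)
  df
}, comparisons, names(comparisons))

# Merge pairwise on Term, keeping every term seen in any comparison;
# terms missing from a comparison get NA in that comparison's column
wide <- Reduce(function(x, y) merge(x, y, by = "Term", all = TRUE), renamed)
wide
```

Each row of `wide` then describes a single term, so its LogP values can be compared across comparisons directly.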

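Solution

If you use left_join (from dplyr), you keep only one copy of the Term column: new_df <- left_join(C1_com, C2_com, by = "Term"). Is that what you want? Of course, if the Term columns are not actually identical, you will get some odd results: any term in C1_com that has no match in C2_com ends up with NA in the LogP column coming from C2_com.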
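A minimal sketch of that join, assuming dplyr is installed and using plain data frames with the first three rows from the question. The `suffix` argument is an addition here, not part of the answer; it labels which data frame each LogP column came from.

```r
library(dplyr)

# First three rows of each data frame from the question
C1_com <- data.frame(Term = c("GO:0030198", "GO:0043062", "GO:0001944"),
                     LogP = c(-17.4296193682, -16.3090192653, -17.0759726333))
C2_com <- data.frame(Term = c("GO:0030198", "GO:0043062", "GO:0030335"),
                     LogP = c(-11.3445846204, -10.5074739613, -10.1220888832))

# Rows are matched on Term rather than on position;
# the suffixes mark the source of each LogP column
new_df <- left_join(C1_com, C2_com, by = "Term", suffix = c("_C1", "_C2"))
new_df
```

Here GO:0001944 has no partner in C2_com, so its LogP_C2 entry is NA, which is the kind of odd result mentioned above.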
