How to create a binary vector based on the Nth percentile?
How can I create a binary vector (named group) that codes patients in the top 25% of gene expression as 1 and all other patients as 0?
> dput(head(dat,20))
c(14.0217647549219,4.38634192539018,11.230612647966,13.5882888840484,10.2699597878478,8.09562488203986,14.1224780341231,10.4488388145038,12.2745444001468,9.09203349810451,14.3513862469323,11.5782968747535,13.6411144041398,9.79892114560863,11.1019611651618,12.5146158084875,12.643970834391,1.09720624597437,5.83979838350692,11.1604484254692
)
Solution
We can use quantile to create the group: values at or above the 75th percentile are coded 1 and everything else 0.
dat$group <- with(dat,+(V1 >= quantile(V1,0.75)))
dat$group
[1] 0 1 0 0 0 0 1 0
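The one-liner packs three steps into a single expression, so it may help to see them separately. The sketch below uses the same 8-row data frame given at the end of this answer; the intermediate names (cutoff, is_top, group_plus, group_int) are only illustrative and not part of the original answer. It assumes the default quantile type (type 7).

```r
# Same 8-row data frame as in the data section of this answer
dat <- data.frame(V1 = c(4.124, 5.215, 1.368, 0.325, 0.368, 3.653, 36.12, 0.124))

cutoff <- quantile(dat$V1, 0.75)   # 75th percentile, default type 7
is_top <- dat$V1 >= cutoff         # logical vector: TRUE for the top 25%

group_plus <- +is_top              # unary plus coerces TRUE/FALSE to 1/0
group_int  <- as.integer(is_top)   # more explicit way to do the same coercion

dat$group <- group_int
```

Both forms should give the same 0/1 vector; as.integer() is just easier to read for anyone unfamiliar with the unary plus idiom.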
Update
Based on the comments, the OP's 'dat' is a vector, not a data.frame
dat <- c(14.0217647549219,4.38634192539018,11.230612647966,13.5882888840484,10.2699597878478,8.09562488203986,14.1224780341231,10.4488388145038,12.2745444001468,9.09203349810451,14.3513862469323,11.5782968747535,13.6411144041398,9.79892114560863,11.1019611651618,12.5146158084875,12.643970834391,1.09720624597437,5.83979838350692,11.1604484254692 )
So we can apply quantile directly on the object
group <- +(dat >= quantile(dat,0.75))
out <- data.frame(V1 = dat,group)
-output
out
          V1 group
1  14.021765     1
2   4.386342     0
3  11.230613     0
4  13.588289     1
5  10.269960     0
6   8.095625     0
7  14.122478     1
8  10.448839     0
9  12.274544     0
10  9.092033     0
11 14.351386     1
12 11.578297     0
13 13.641114     1
14  9.798921     0
15 11.101961     0
16 12.514616     0
17 12.643971     0
18  1.097206     0
19  5.839798     0
20 11.160448     0
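Since the question asks about the Nth percentile in general, the same idea can be wrapped in a small function. This is only a sketch: flag_top_fraction and its frac argument are hypothetical names, not from the original answer, and the short vector x is illustrative rather than the OP's data. It assumes missing values should be ignored when computing the cutoff and left as NA in the result.

```r
# Hypothetical helper: code the top `frac` of values as 1, the rest as 0
flag_top_fraction <- function(x, frac = 0.25) {
  stopifnot(is.numeric(x), frac > 0, frac < 1)
  # The cutoff is the (1 - frac) quantile, e.g. 0.75 for the top 25%
  cutoff <- quantile(x, probs = 1 - frac, na.rm = TRUE)
  as.integer(x >= cutoff)   # an NA in x stays NA in the result
}

# Illustrative values only
x <- c(14.02, 4.39, 11.23, 13.59, 10.27, 8.10, 14.12, 10.45)
group <- flag_top_fraction(x, frac = 0.25)
table(group)
```

With ties at the cutoff, or a length that is not a multiple of 4, the share of 1s may not be exactly 25%, because every value equal to or above the cutoff is flagged.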
data
dat <- structure(list(V1 = c(4.124,5.215,1.368,0.325,0.368,3.653,36.12,0.124)),class = "data.frame",row.names = c("1","2","3","4","5","6","7","8"))