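How to summarise multiple columns with a weighted t-test
I have the following data and want to compute weighted p-values. I looked at "dplyr summarise multiple columns using t.test", but my version needs to use weights. I can do it for a single column with Code 2, but there are more than 30 columns. How can I compute the weighted p-values efficiently?
Code 1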
# A tibble: 877 x 5
cat population farms farmland weight
<chr> <dbl> <dbl> <dbl> <dbl>
1 Treated 9.89 8.00 12.3 1
2 Control 10.3 7.81 12.1 0.714
3 Control 10.2 8.04 12.4 0.156
4 Control 10.3 7.97 12.1 0.340
5 Control 10.9 8.87 12.7 2.85
6 Control 10.4 8.35 12.5 0.934
7 Control 10.5 8.58 12.9 0.193
8 Control 10.6 8.57 12.6 0.276
9 Control 10.2 8.54 12.5 0.344
10 Control 10.5 8.76 12.6 0.625
# … with 867 more rows
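Code 2
library(weights)  # provides wtd.t.test()
# Weighted t-test of population, Treated vs. Control; element 3 of coefficients is the p-value
wtd.t.test(
  x = df$population[df$cat == "Treated"],
  y = df$population[df$cat == "Control"],
  weight = df$weight[df$cat == "Treated"],
  weighty = df$weight[df$cat == "Control"]
)$coefficients[3]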
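For anyone unsure what `$coefficients[3]` picks out, here is a minimal sketch, assuming the `weights` package is installed. The vectors are made-up toy data, not the data from the question, and `samedata = FALSE` is my own addition to say explicitly that the two groups are separate samples.

```r
library(weights)

set.seed(1)
# Hypothetical toy data, only to show the shape of the result
x  <- rnorm(20, mean = 10)     # values for the "Treated" group
y  <- rnorm(30, mean = 10.3)   # values for the "Control" group
wx <- rep(1, 20)               # weights for x
wy <- runif(30, 0.1, 3)        # weights for y

res <- wtd.t.test(x = x, y = y, weight = wx, weighty = wy, samedata = FALSE)

# coefficients is a named numeric vector; its third element is the p-value
names(res$coefficients)
p_value <- res$coefficients[["p.value"]]
```

Extracting by name rather than by position is optional, but it keeps the code readable when the call is buried inside `summarise()` or `sapply()`.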
Solution
We can use summarise together with across:
library(dplyr)
df %>%
  summarise(across(
    population:farmland,
    # .x is the current column; cat and weight are looked up in df
    ~ weights::wtd.t.test(
      x = .x[cat == "Treated"],
      y = .x[cat == "Control"],
      weight = weight[cat == "Treated"],
      weighty = weight[cat == "Control"]
    )$coefficients[3]
  ))
Or use lapply/sapply:
# df[2:4] selects the population, farms and farmland columns
sapply(df[2:4], function(v)
  weights::wtd.t.test(
    x = v[df$cat == "Treated"],
    y = v[df$cat == "Control"],
    weight = df$weight[df$cat == "Treated"],
    weighty = df$weight[df$cat == "Control"]
  )$coefficients[3])
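With 30+ columns, hard-coding positions like `2:4` gets fragile, so one option is to wrap the test in a small helper and pick the columns by name. This is only a sketch under a few assumptions: `wtd_p` and `value_cols` are names I made up, the data frame below is random data with the same layout as the question's tibble, and `samedata = FALSE` is my addition to treat the two groups as separate samples.

```r
library(weights)

# Hypothetical helper: weighted p-value for one column, split by a grouping vector
wtd_p <- function(v, group, w, treated = "Treated", control = "Control") {
  is_t <- group == treated
  is_c <- group == control
  res <- wtd.t.test(
    x = v[is_t], y = v[is_c],
    weight = w[is_t], weighty = w[is_c],
    samedata = FALSE
  )
  unname(res$coefficients[3])  # third coefficient is the p-value
}

# Made-up data with the same column layout as the question
set.seed(42)
n <- 200
df <- data.frame(
  cat        = sample(c("Treated", "Control"), n, replace = TRUE),
  population = rnorm(n, 10, 0.5),
  farms      = rnorm(n, 8, 0.5),
  farmland   = rnorm(n, 12, 0.5),
  weight     = runif(n, 0.1, 3)
)

# Every column except the grouping and weight columns
value_cols <- setdiff(names(df), c("cat", "weight"))

# One p-value per column, returned as a named numeric vector
p_values <- vapply(df[value_cols], wtd_p, numeric(1),
                   group = df$cat, w = df$weight)
p_values
```

`vapply` is used instead of `sapply` here so that the call fails loudly if any column does not return exactly one number.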