How to permute a column within levels, run a test on 2 columns, and save the p-values
> dput(df)
structure(list(id = c(1, 2, 3, 4, 1, 2, 3, 4), level = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("g01", "g02"), class = "factor"), m_col = c(1, 2, 3, 4, 11, 22, 33, 44), u_col = c(11, 12, 13, 14, 21, 22, 23, 24), group = c(0, 0, 1, 1, 0, 0, 1, 1)), row.names = c(NA, -8L), class = "data.frame")
It looks like this:
id level m_col u_col group
1 1 g01 1 11 0
2 2 g01 2 12 0
3 3 g01 3 13 1
4 4 g01 4 14 1
5 1 g02 11 21 0
6 2 g02 22 22 0
7 3 g02 33 23 1
8 4 g02 44 24 1
I want to run a weighted binomial test within each "level" (essentially, I need to compare u_col and m_col for each id). Using tidyverse and broom, I can do the following:
res <- df %>%
  group_by(level) %>%
  do(tidy(glm(cbind(.$m_col, .$u_col) ~ .$group, family = "binomial"))) %>%
  filter(term == ".$group")
This gives me one p-value per level:
> res
# A tibble: 2 x 6
# Groups: level [2]
level term estimate std.error statistic p.value
<fct> <chr> <dbl> <dbl> <dbl> <dbl>
1 g01 .$group 0.687 0.746 0.921 0.357
2 g02 .$group 0.758 0.296 2.56 0.0105
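Then I can ask how many p-values are below 0.05:
length(which(res$p.value < 0.05))
I now want to permute the data, repeat the binomial tests, and again ask how many p-values are below 0.05.
However, the permutation needs to reshuffle the "group" column within each "level". I am struggling to find a way to do this. For example, one permutation would look like this: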
id level m_col u_col group
1 1 g01 1 11 1
2 2 g01 2 12 0
3 3 g01 3 13 1
4 4 g01 4 14 0
5 1 g02 11 21 1
6 2 g02 22 22 0
7 3 g02 33 23 1
8 4 g02 44 24 0
A second one would be:
id level m_col u_col group
1 1 g01 1 11 0
2 2 g01 2 12 1
3 3 g01 3 13 1
4 4 g01 4 14 0
5 1 g02 11 21 0
6 2 g02 22 22 1
7 3 g02 33 23 1
8 4 g02 44 24 0
and so on.
Having the test depend on 2 columns limits my shuffling options, and it has me stumped. Any help would be appreciated.
Solution
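You can write a function:
library(dplyr)
library(broom)
# Fit the binomial glm on one level's data and return 1 if the
# p-value for group is below 0.05, otherwise 0.
apply_fun <- function(data) {
  fit <- glm(cbind(m_col, u_col) ~ group, data, family = "binomial")
  sum(subset(tidy(fit), term == 'group')$p.value < 0.05)
}
and then use replicate to repeat it: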
result <- replicate(100, df %>%
  group_by(level) %>%
  mutate(group = sample(group)) %>%
  summarise(value = apply_fun(cur_data())), simplify = FALSE)
result
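If a plain vector with one count per permutation is easier to work with than a list of tibbles, the same idea can be sketched in base R. This is only one possible variant, not the code from the answer above: it swaps dplyr/broom for split(), ave() and coef(summary(...)), and the helper names count_sig and shuffle_within are made up for illustration. The data frame is rebuilt from the table in the question.

```r
# Example data, rebuilt from the table shown in the question.
df <- data.frame(
  id    = rep(1:4, times = 2),
  level = factor(rep(c("g01", "g02"), each = 4)),
  m_col = c(1, 2, 3, 4, 11, 22, 33, 44),
  u_col = c(11, 12, 13, 14, 21, 22, 23, 24),
  group = rep(c(0, 0, 1, 1), times = 2)
)

# Hypothetical helper: fit the binomial glm separately within each level
# and count how many levels have a p-value for `group` below alpha.
count_sig <- function(data, alpha = 0.05) {
  p <- vapply(split(data, data$level), function(d) {
    fit <- glm(cbind(m_col, u_col) ~ group, data = d, family = binomial)
    coef(summary(fit))["group", "Pr(>|z|)"]
  }, numeric(1))
  sum(p < alpha)
}

# Hypothetical helper: shuffle `group` within each level only;
# m_col and u_col stay attached to their original rows.
shuffle_within <- function(data) {
  data$group <- ave(data$group, data$level, FUN = sample)
  data
}

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
observed    <- count_sig(df)                                  # unshuffled data
perm_counts <- replicate(100, count_sig(shuffle_within(df)))  # integer vector

observed
table(perm_counts)
```

Here observed is the count for the unshuffled data and should agree with the question's output (one level below 0.05), while perm_counts holds the same count for each of the 100 shuffles. Something like mean(perm_counts >= observed) could then be read as a rough empirical p-value, bearing in mind that with 4 rows per level there are only 6 distinct ways to assign the two groups.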