A function to run multiple t-tests to find main effects


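# NOTE: as posted, the vectors below have unequal lengths; data.frame() needs every column to be the same length.
df <- data.frame(rating1 = c(1, 5, 2, 4, 5), rating2 = c(2, 1, 2), rating3 = c(0, 0),
                 race = c("black", "asian", "white", "black", "white"),
                 gender = c("male", "female", "male", "female"))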

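I want to run a t-test comparing a group mean (for example, the mean for Asian respondents on rating1) with the overall mean of each rating (for example, rating1). Below is my code for Asian respondents on rating1.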

asian_df <- df %>% 
   filter(race == "asian")
t.test(asian_df$rating1,mean(df$rating1)) 

Then I run it for Black respondents on rating2:

black_df <- df %>% 
   filter(race == "black")
t.test(black_df$rating2,mean(df$rating2))

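How can I write a function that automates the t-test for every group? So far I have to change the variable names by hand to run it for each race, each gender, and each rating (rating1 through rating3). Thanks!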

Solution

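Running multiple t-tests increases the risk of a Type I error, so you will need to adjust for multiple comparisons for the results to be valid/meaningful. You can run the t-tests by looping over the variables, e.g.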

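library(tidyverse)
# NOTE: the vectors below appear truncated in this copy (the output that follows implies 12 rows);
# every column must have the same length for data.frame() to run.
df <- data.frame(rating1 = c(5, 8, 7, 9, 6, 5, 5), rating2 = c(2, 4, 3, 1, 1), rating3 = c(0, 2),
                 race = c("asian", "asian", "black", "white", "black"),
                 gender = c("male", "female", "male", "male"))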

# Compare each race's rating1 scores with the overall rating1 mean
for (rac in unique(df$race)) {
  tmp_df <- df %>% 
    filter(race == rac)
  print(rac)
  print(t.test(tmp_df$rating1, rep(mean(df$rating1), length(tmp_df$rating1))))
}
[1] "asian"

    Welch Two Sample t-test

data:  tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = 0.19518, df = 3, p-value = 0.8577
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.550864  2.884198
sample estimates:
mean of x mean of y 
 7.250000  7.083333 

[1] "black"

    Welch Two Sample t-test

data:  tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = -1.5149, df = 4, p-value = 0.2044
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.5022651  0.7355985
sample estimates:
mean of x mean of y 
 6.200000  7.083333 

[1] "white"

    Welch Two Sample t-test

data:  tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = 3.75, df = 2, p-value = 0.06433
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1842176  2.6842176
sample estimates:
mean of x mean of y 
 8.333333  7.083333 


# Same comparison for each gender
for (gend in unique(df$gender)) {
  tmp_df <- df %>% 
    filter(gender == gend)
  print(gend)
  print(t.test(tmp_df$rating1, rep(mean(df$rating1), length(tmp_df$rating1))))
}
[1] "male"

    Welch Two Sample t-test

data:  tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = -2.0979, df = 5, p-value = 0.09
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.4107761  0.2441094
sample estimates:
mean of x mean of y 
 6.000000  7.083333 

[1] "female"

    Welch Two Sample t-test

data:  tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = 3.5251, df = 5, p-value = 0.01683
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.2933469 1.8733198
sample estimates:
mean of x mean of y 
 8.166667  7.083333 
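
To avoid editing the loop by hand for every grouping variable and rating, the same comparison can be wrapped in a function. The following is only a sketch under stated assumptions: the helper name `group_vs_overall`, its arguments, and the `toy` data frame are made up for illustration and are not part of the original answer. It also swaps the `rep(mean(...), n)` trick for a one-sample `t.test(x, mu = ...)`, which gives the same t statistic, degrees of freedom, and p-value because the constant vector contributes no variance.

```r
# Hypothetical toy data (made-up values, NOT the data from the question)
toy <- data.frame(
  rating1 = c(5, 8, 7, 9, 6, 5, 7, 8, 6, 9, 4, 7),
  rating2 = c(2, 4, 3, 1, 1, 2, 3, 4, 2, 1, 3, 2),
  race    = rep(c("asian", "black", "white"), times = 4),
  gender  = rep(c("male", "female"), each = 6)
)

# Hypothetical helper: for every grouping column, every level of that
# column, and every rating column, test the group's scores against the
# overall mean of that rating.
group_vs_overall <- function(data, group_cols, rating_cols) {
  results <- list()
  for (g in group_cols) {
    for (r in rating_cols) {
      overall_mean <- mean(data[[r]], na.rm = TRUE)
      for (lev in unique(as.character(data[[g]]))) {
        x <- data[[r]][data[[g]] == lev]
        # One-sample t-test of this group against the overall mean
        tt <- t.test(x, mu = overall_mean)
        results[[length(results) + 1]] <- data.frame(
          grouping     = g,
          level        = lev,
          rating       = r,
          group_mean   = unname(tt$estimate),
          overall_mean = overall_mean,
          t            = unname(tt$statistic),
          df           = unname(tt$parameter),
          p_value      = tt$p.value   # unadjusted p-value
        )
      }
    }
  }
  do.call(rbind, results)
}

res <- group_vs_overall(toy, c("race", "gender"), c("rating1", "rating2"))
print(res)
```

Collecting the results in one data frame makes it easy to see how many tests were run in total, which matters for the multiple-comparison adjustment.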

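Because multiple tests are being run (5 t-tests in this example), your chance of at least one false positive rises to 1 - (1 - 0.05)^5 = 22.62%. One way to correct for this is the Bonferroni correction, which basically takes the desired p-value threshold (in this case, p < 0.05) and divides it by the number of tests performed.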
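
The arithmetic can be checked directly in R. This is a small sketch, not part of the original answer: the p-values are copied from the five t-test outputs above, the variable names are made up for illustration, and `p.adjust()` from base R applies the Bonferroni adjustment by multiplying each p-value by the number of tests (capped at 1), which is equivalent to comparing the raw p-values against 0.05 / 5.

```r
# p-values copied from the five t-test outputs shown above
p_raw <- c(asian = 0.8577, black = 0.2044, white = 0.06433,
           male = 0.09, female = 0.01683)
alpha <- 0.05
m <- length(p_raw)  # number of tests

# Chance of at least one false positive across m independent tests
fwer <- 1 - (1 - alpha)^m
print(fwer)

# Bonferroni-adjusted p-values: each raw p-value times m, capped at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)

# Which tests remain significant at alpha after adjustment
significant <- p_bonf < alpha
print(significant)
```

With these values the smallest p-value (female, 0.01683) becomes 0.08415 after adjustment, so none of the five tests stays below 0.05.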

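Another approach is to use ANOVA to compare the means across all conditions, then use Tukey's HSD to determine which groups differ. Tukey's HSD is a post-hoc test that already adjusts for the multiple pairwise comparisons, so you do not need a separate correction and the results are valid. Adapting this approach to your problem is probably the better solution, e.g.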

# The response is the sum of the three ratings; race and gender enter as additive main effects
anova_one_way <- aov(rating1 + rating2 + rating3 ~ race + gender, data = df)

summary(anova_one_way)

            Df Sum Sq Mean Sq F value  Pr(>F)   
race         2 266.70  133.35   14.01 0.00243 **
gender       1 140.08  140.08   14.72 0.00497 **
Residuals    8  76.13    9.52           
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


TukeyHSD(anova_one_way)

Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = rating1 + rating2 + rating3 ~ race + gender, data = df)

$race
                 diff        lwr       upr     p adj
black-asian -7.050000 -12.963253 -1.136747 0.0224905
white-asian  4.416667  -2.315868 11.149201 0.2076254
white-black 11.466667   5.029132 17.904201 0.0023910

$gender
                 diff       lwr       upr     p adj
male-female -3.416667 -7.523829 0.6904958 0.0913521
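
The model above sums the three ratings into a single response. If each rating should be examined on its own, one possible sketch is to fit the same model once per rating column. The data frame `toy` and the names `rating_cols` and `tukey_by_rating` are made up for illustration, and `race` and `gender` are stored as factors here because `TukeyHSD()` works on factor terms.

```r
# Hypothetical toy data (made-up values, NOT the data from the question)
toy <- data.frame(
  rating1 = c(5, 8, 7, 9, 6, 5, 7, 8, 6, 9, 4, 7),
  rating2 = c(2, 4, 3, 1, 1, 2, 3, 4, 2, 1, 3, 2),
  race    = factor(rep(c("asian", "black", "white"), times = 4)),
  gender  = factor(rep(c("male", "female"), each = 6))
)

rating_cols <- c("rating1", "rating2")

# Fit <rating> ~ race + gender for each rating, then run Tukey's HSD on it
tukey_by_rating <- lapply(rating_cols, function(col) {
  fit <- aov(reformulate(c("race", "gender"), response = col), data = toy)
  TukeyHSD(fit)
})
names(tukey_by_rating) <- rating_cols

# Pairwise race comparisons for rating1 (columns: diff, lwr, upr, p adj)
print(tukey_by_rating$rating1$race)
```

Tukey's adjustment applies within each fitted model, not across the separate models, so fitting one model per rating still multiplies the number of families being tested.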
