How to handle a multivariate correlation filter
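For example:
Suppose a dataset has three columns: two categorical features and one target variable. When I use a chi-square test to measure how each feature on its own relates to the target variable, I cannot find a strong relationship. So I would like to test the combination of the two features against the target variable instead, but I am confused about whether a chi-square test is valid in this case or whether some other method should be used.
For example: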
import pandas as pd
from scipy.stats import chi2, chi2_contingency

# Sample the rows once (50%, with replacement) so all three columns stay aligned;
# reset the index because sampling with replacement produces duplicate labels
sampled = df_offer_details.sample(frac=0.5, replace=True, random_state=1).reset_index(drop=True)
# Rows: binned hike percentage; columns: relocation status x acceptance status
ct_reloc_status = pd.crosstab(
    sampled['percentage_hike_offered_bin'],
    [sampled['Candidate relocation status'], sampled['Acceptance status']],
)
ct_reloc_status
# we carry out a contingency test to check whether there is a correlation with the target variable
# and relocation status
H0 = "There is no relationship between Relocation status and Acceptance status"
Ha = "There is a relationship between Relocation status and Acceptance status"
stat, p, dof, expected = chi2_contingency(ct_reloc_status)
print('p-value: ', p)
prob = 0.95
# Critical value of the chi-square distribution at the 95% level for this table's degrees of freedom
critical = chi2.ppf(prob, dof)
print('probability=%.3f,critical=%.3f,stat=%.3f' % (prob, critical, stat))
# stat >= critical is equivalent to p <= 1 - prob, i.e. p <= 0.05
if abs(stat) >= critical:
    print(f'''Since p-value {p} < 0.05 we reject the null hypothesis: {H0}.Thus alternate hypothesis: {Ha} holds good ''')
else:
    print(f'Fail to reject null hypothesis {H0}')
Result:
p-value: 0.019814129159194147
probability=0.950,critical=28.869,stat=32.380
Since p-value 0.019814129159194147 < 0.05 we reject the null hypothesis: There is no relationship between Relocation status and Acceptance status.Thus alternate hypothesis: There is a relationship between Relocation status and Acceptance status holds good
But I am not sure whether this is the correct approach.
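For comparison, here is a minimal sketch of the combined-feature check described in the question, where each observed combination of the two features becomes one row of the contingency table and the target classes form the columns. It runs on synthetic data only: the helper `joint_chi2`, the column names `a`, `b`, `y`, and the data-generating rule are all assumptions made for illustration, not part of the original dataset.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def joint_chi2(df, features, target):
    """Chi-square test of independence between the joint levels of
    `features` (hypothetical helper) and the `target` column."""
    # Rows: one per observed combination of feature levels; columns: target classes
    table = pd.crosstab([df[f] for f in features], df[target])
    stat, p, dof, expected = chi2_contingency(table)
    return table, stat, p, dof, expected

# Synthetic data (assumption): y follows the XOR of a and b, flipped 10% of the time,
# so neither feature alone is informative but the pair is
rng = np.random.default_rng(0)
n = 2000
a = rng.integers(0, 2, n)
b = rng.integers(0, 2, n)
noise = rng.random(n) < 0.1
y = (a ^ b) ^ noise
demo = pd.DataFrame({"a": a, "b": b, "y": y.astype(int)})

_, _, p_a, _, _ = joint_chi2(demo, ["a"], "y")
table, stat, p_joint, dof, expected = joint_chi2(demo, ["a", "b"], "y")
print(f"p (a alone) = {p_a:.3g}, p (a and b jointly) = {p_joint:.3g}, dof = {dof}")
print("smallest expected count:", expected.min())
```

Two caveats apply to this sketch. Combining features multiplies the number of cells, so it is worth checking the `expected` array, since the chi-square approximation is usually considered unreliable when expected counts fall below about 5. Also, a significant joint test only says that the pair of features is associated with the target; it does not by itself show whether that comes from an interaction or from one feature alone.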