Getting a p-value of 0 on a normal distribution

How to fix getting a p-value of 0 on a normal distribution

Here is my code:

import pandas as pd
import matplotlib.pyplot as plt

wine = pd.read_csv('red wine quality.csv')
wine = wine.dropna()
wine_density = wine[['density']] # Isolate the 'density' column (single-column DataFrame)
wine_density = wine_density.squeeze() # Reformat as a pd.Series
wine_density = pd.to_numeric(wine_density) # Convert from str to numeric
from scipy import stats
stat,p1 = stats.shapiro(wine_density)
print('Statistics = {0:.3f},p = {1:.3f}'.format(stat,p1))
alpha = 0.05
if p1 > alpha:
    print('Fail to reject H0,sample is normal.')
else:
    print('Reject H0,sample is NOT normal.)')

When I run this code, I get:

runcell(9,'C:/Users/Adam/Desktop/DSCI 200/Week 8/Week8_Final_Project.py')
Statistics = 0.991,p = 0.000
Reject H0,sample is NOT normal.)

However, when I run graphical normality checks separately (for example, a Q-Q plot or a histogram), I get:

from statsmodels.graphics.gofplots import qqplot
qqplot(wine_density,line = 's')
plt.show()

[Figure: normality Q-Q plot]

plt.hist(wine_density,bins=50)
plt.show()

[Figure: normality histogram]

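To me, both plots look like a normal distribution, but even if they are not, getting a p-value of exactly zero seems wrong. What am I misunderstanding and/or doing wrong?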
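
One possible reading of the output, offered as a sketch rather than a definitive diagnosis: the format spec `{1:.3f}` rounds any p-value below 0.0005 to `0.000`, so the printed value is probably a very small positive number, not an exact zero. Printing it in scientific notation shows the actual magnitude. The data below is a synthetic, mildly skewed stand-in (an assumption; it is not the `density` column), and the sample size of 1600 is likewise only illustrative.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the 'density' column (an assumption, not the real data):
# a gamma sample with mild right skew, roughly bell-shaped to the eye.
rng = np.random.default_rng(0)
sample = rng.gamma(shape=20.0, scale=1.0, size=1600)

stat, p = stats.shapiro(sample)

# Fixed-point formatting with 3 decimals hides very small p-values.
print('Statistics = {0:.3f}, p = {1:.3f}'.format(stat, p))

# Scientific notation shows that p is tiny but not zero.
print('Statistics = {0:.3f}, p = {1:.3e}'.format(stat, p))
```

Shapiro-Wilk also becomes very sensitive as the sample grows, so with on the order of a thousand or more observations it can reject normality for departures too small to notice in a histogram or Q-Q plot. A small p-value in that setting says a departure is detectable, not that it is large.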