
Causal impact analysis in Python - p-value appears to be incorrect


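I am running a causal impact analysis in Python, which helps measure the effect of an intervention on a treatment group relative to a control group (A/B test). To get started in Python, I followed https://github.com/jamalsenouci/causalimpact/blob/master/GettingStarted.ipynb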

Suppose my data is in the following format:

[Image: sample of the input data]

Treat Period_1 as the treatment and Period_2 as the control.

The following code runs without errors:

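import pandas as pd
from causalimpact import CausalImpact

# Pre- and post-intervention windows, each given as [start, end] timestamps
pre_period = [pd.to_datetime(date) for date in [start_date, cut_date_1]]
post_period = [pd.to_datetime(date) for date in [cut_date_2, end_date]]

# nseasons=7 adds a seasonal component with 7 periods (e.g. weekly seasonality for daily data)
impact = CausalImpact(df_AA.loc[start_date:end_date_AA], pre_period, post_period, model_args={"nseasons": 7})
impact.run()
impact.plot()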

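I get the two charts below. Because the confidence interval of the predicted values overlaps the actual values, the movement does not appear to be statistically significant.

[Image: the two charts produced by impact.plot()]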
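The visual overlap check described above can also be made numeric. The following is only a minimal sketch: the helper name `share_inside_interval` and the three lists are hypothetical placeholders, not values or functions from the causalimpact library. In practice the actual values and the lower and upper prediction bounds would come from whatever per-day output the model exposes.

```python
def share_inside_interval(actual, lower, upper):
    """Return the fraction of points whose actual value lies within [lower, upper]."""
    inside = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    return sum(inside) / len(inside)


# Hypothetical post-period values, for illustration only
actual = [15.2, 14.8, 15.1, 15.4]
lower = [14.5, 14.4, 14.6, 14.7]
upper = [15.6, 15.5, 15.7, 15.8]

share = share_inside_interval(actual, lower, upper)
print(f"share of actual values inside the prediction interval: {share:.0%}")
```

A share close to 100% agrees with the visual impression that the actual series stays inside the predicted band.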

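However, I want a definitive answer on whether the movement is statistically significant, and on what the p-value between treatment and control is. For this I used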

print(impact.summary())
print(impact.summary("report"))

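I get the results below. They report a p-value of 0.0 and a statistically significant positive movement, which does not seem right. I also tried different data where the difference between actual and predicted is very large and the CI of the prediction does not overlap the actual values, and I still get a p-value of 0. It seems the p-value is being calculated incorrectly. Are there any pointers for computing the p-value for this causal impact library myself, or is there a way to fix the library?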

                              Average     Cumulative
Actual                             15            247
Predicted                          15            246
95% CI                       [15,15]     [244,249]
                                                    
Absolute Effect                     0              1
95% CI                         [0,0]        [3,-1]
                                                    
Relative Effect                  0.4%           0.4%
95% CI                  [1.5%,-0.6%]  [1.5%,-0.6%]
                                                    
P-value                          0.0%               
Prob. of Causal Effect         100.0%               
None
 During the post-intervention period,the response variable had an average value of approx. 15.  By contrast,in  the
absence of an intervention,we would have expected an average response of 15. The 90% interval of this counterfactual
prediction is [15,15]. Subtracting this prediction from the observed response yields an estimate of the causal effect
the intervention had on the response variable. This effect is 0 with a 90% interval of [0,0]. For a discussion of the
significance of this effect,see below.


 Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully
interpreted),the response variable had an overall value of 247.  By contrast,had  the intervention not taken place,we
would have expected a sum of 247. The 90% interval of this prediction is [244,249]


 The above results are given in terms of absolute numbers. In relative terms,the response variable showed  an increase
of  0.4%. The 90% interval of this percentage is [1.5%,-0.6%]


 This means that the positive effect observed during the intervention period is statistically significant and unlikely
to be due to random fluctuations. It should be noted,however,that the question of whether this increase also bears
substantive significance can only be answered by comparing the absolute effect 0 to the original goal of the underlying
intervention.
None
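As a rough sanity check on the reported p-value, the cumulative figures from the summary table (actual 247, predicted 246, 95% CI [244, 249]) can be turned into an approximate one-sided tail probability. This is only a sketch under a strong assumption: that the predicted cumulative value is roughly normally distributed, so its standard deviation can be backed out of the CI width. The function name `approx_one_sided_p` is hypothetical and this is not how the library computes its p-value. The inputs are the rounded numbers printed in the table, so the result is indicative at best.

```python
from statistics import NormalDist


def approx_one_sided_p(actual, predicted, ci_lower, ci_upper, ci_level=0.95):
    """Approximate one-sided p-value, ASSUMING a normal predictive distribution."""
    std_normal = NormalDist()
    # Critical value for a two-sided interval at the given level (about 1.96 for 95%)
    z_crit = std_normal.inv_cdf(0.5 + ci_level / 2)
    # Back out the standard deviation from the width of the interval
    sd = (ci_upper - ci_lower) / (2 * z_crit)
    z = (actual - predicted) / sd
    # Tail area beyond the observed value, in the direction of the observed effect
    return 1 - std_normal.cdf(abs(z))


# Cumulative values taken from the summary table (rounded as printed)
p = approx_one_sided_p(actual=247, predicted=246, ci_lower=244, ci_upper=249)
print(f"approximate one-sided p-value: {p:.2f}")
```

Under this approximation the tail probability is well above 0.05, which matches the overlapping intervals in the plot and supports the suspicion that the reported p-value of 0.0% is not correct.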
