
How should I interpret divergence and low effective sample size warnings?


I am running a tutorial to teach some newcomers how to do regression with PyMC3. Using TED Talk data as an example, I am trying to find out how the number of comments, the number of transcript languages, and the length of the talk video predict how popular a TED talk is. I ran the following PyMC3 code:

    import pymc3 as pm

    with pm.Model() as model:
        # Priors for the intercept, the slopes, and the observation noise
        intercept = pm.Normal('Intercept', mu=5, sigma=3)
        beta_duration = pm.Normal('duration', mu=0.05, sigma=0.3)
        beta_languages = pm.Normal('languages', mu=0, sigma=0.1)
        beta_comments = pm.Normal('comments', mu=0, sigma=0.1)
        epsilon = pm.HalfCauchy('epsilon', 5)
        # Linear model for the expected view count, with a Normal likelihood
        mu = (intercept + beta_duration * ted_talk['duration']
              + beta_languages * ted_talk['languages']
              + beta_comments * ted_talk['comments'])
        likelihood = pm.Normal('likelihood', mu=mu, sigma=epsilon, observed=ted_talk['views'])
        trace = pm.sample(4000, tune=2000, chains=3)

Output:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (3 chains in 1 job)
NUTS: [epsilon,comments,languages,duration,Intercept]

Sampling 3 chains for 2_000 tune and 4_000 draw iterations (6_000 + 12_000 draws total) took 91 seconds.
There were 973 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.5480812333460533, but should be close to 0.8. Try to increase the number of tuning steps.
There were 973 divergences after tuning. Increase `target_accept` or reparameterize.
There were 973 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.
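
In case it helps, the divergent transitions reported above can also be counted directly from the NUTS sampler statistics, roughly like this (a sketch only):

    # Count the divergent transitions recorded in the post-tuning draws
    divergent = trace.get_sampler_stats('diverging')
    print('Number of divergent transitions:', divergent.sum())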

Question 1: What are the possible reasons the MCMC simulation still returns divergences even after tuning? The sampler suggests both increasing target_accept and increasing the number of tuning steps; which do you think is more helpful? And if the answer depends heavily on the particular model, I would like to understand why.
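
My understanding is that acting on the first suggestion just means re-running pm.sample with a higher target_accept (and possibly a longer tuning phase), along these lines; the value 0.95 below is my own guess, not taken from the warning:

    # Sketch: resample with a higher acceptance target and more tuning steps
    with model:
        trace = pm.sample(4000, tune=4000, chains=3, target_accept=0.95)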

Question 2: If the effective sample size is too small, what is the underlying problem? Since I have not seen any "threshold" for judging whether the number of effective samples is too large or too small (including in mcmc_diagnostic), how many effective samples do you think is reasonable for a Bayesian regression model?
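
For reference, I assume the per-parameter effective sample sizes can be read off the posterior summary, for example:

    # pm.summary wraps ArviZ and reports effective sample size (ess_*) and r_hat per parameter
    print(pm.summary(trace))
    # Trace plots can also help spot poorly mixing parameters
    pm.traceplot(trace)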

Thank you very much for your time! Any help would be greatly appreciated!
