Paper Title

Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors

Paper Authors

Yuling Yao, Aki Vehtari, Andrew Gelman

Paper Abstract

When working with multimodal Bayesian posterior distributions, Markov chain Monte Carlo (MCMC) algorithms have difficulty moving between modes, and default variational or mode-based approximate inferences will understate posterior uncertainty. Moreover, even if the most important modes can be found, it is difficult to evaluate their relative weights in the posterior. Here we propose an approach using parallel runs of MCMC, variational, or mode-based inference to hit as many modes or separated regions as possible, and then combine these runs using Bayesian stacking, a scalable method for constructing a weighted average of distributions. The result from stacking efficiently samples from the multimodal posterior distribution, minimizes cross-validation prediction error, and represents the posterior uncertainty better than variational inference, but it is not necessarily equivalent, even asymptotically, to fully Bayesian inference. We present theoretical consistency with an example where the stacked inference approximates the true data-generating process from a misspecified model and a non-mixing sampler, from which the predictive performance is better than full Bayesian inference; hence the multimodality can be considered a blessing rather than a curse under model misspecification. We demonstrate practical implementation in several model families: latent Dirichlet allocation, Gaussian process regression, hierarchical regression, horseshoe variable selection, and neural networks.
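To make the combination step concrete, here is a minimal sketch (not the authors' implementation) of the core stacking optimization: given pointwise leave-one-out log predictive densities from each parallel run, find simplex weights that maximize the combined log score. The function name `stacking_weights` and the use of `scipy.optimize.minimize` with SLSQP are illustrative assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import minimize

def stacking_weights(lpd):
    """Bayesian stacking weights (illustrative sketch).

    lpd: (n, K) array of pointwise log predictive densities,
         one column per chain/run (e.g. from LOO cross-validation).
    Returns simplex weights w maximizing
        sum_i log( sum_k w_k * exp(lpd[i, k]) ).
    """
    n, K = lpd.shape
    # Subtract the row-wise max for numerical stability; this shifts the
    # objective by a constant per row, so the argmax is unchanged.
    p = np.exp(lpd - lpd.max(axis=1, keepdims=True))

    def neg_log_score(w):
        return -np.sum(np.log(p @ w + 1e-300))

    res = minimize(
        neg_log_score,
        x0=np.full(K, 1.0 / K),          # start at uniform weights
        bounds=[(0.0, 1.0)] * K,          # each weight in [0, 1]
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        method="SLSQP",
    )
    return res.x
```

The stacked predictive distribution is then the mixture of the per-run posterior predictive distributions with these weights; runs stuck in poorly predicting modes receive weight near zero.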
