Paper Title

Robust leave-one-out cross-validation for high-dimensional Bayesian models

Paper Authors

Luca Silva, Giacomo Zanella

Paper Abstract

Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy. However, computing LOO-CV criteria can be computationally expensive due to the need to fit the model multiple times. In the Bayesian context, importance sampling provides a possible solution, but classical approaches can easily produce estimators whose asymptotic variance is infinite, making them potentially unreliable. Here we propose and analyze a novel mixture estimator to compute Bayesian LOO-CV criteria. Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite asymptotic variance of the resulting estimators. Both theoretical and numerical results are provided to illustrate the improved robustness and efficiency. The computational benefits are particularly significant in high-dimensional problems, making it possible to perform Bayesian LOO-CV for a broader range of models and for datasets with highly influential observations. The proposed methodology is easily implementable in standard probabilistic programming software and has a computational cost roughly equivalent to fitting the original model once.
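
To make the setting concrete, below is a minimal sketch of the classical importance-sampling LOO estimator that the abstract refers to (not the paper's proposed mixture estimator), assuming a toy conjugate normal-normal model; the data, prior scale, and noise level are illustrative assumptions made only for this example.

# Sketch of the classical importance-sampling LOO estimator (not the paper's
# mixture estimator). Toy normal-normal model with known noise; all settings
# below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y_i ~ Normal(theta, sigma^2), conjugate prior theta ~ Normal(0, tau^2)
sigma, tau = 1.0, 10.0
y = rng.normal(loc=2.0, scale=sigma, size=20)
n = y.size

# Exact full-data posterior for this conjugate model: theta | y ~ Normal(mu_n, s_n^2)
s_n2 = 1.0 / (n / sigma**2 + 1.0 / tau**2)
mu_n = s_n2 * y.sum() / sigma**2

# Posterior draws (in practice these would come from MCMC on the full dataset)
S = 5000
theta = rng.normal(mu_n, np.sqrt(s_n2), size=S)

def log_lik(y_i, theta):
    # Pointwise log-likelihood log p(y_i | theta)
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (y_i - theta) ** 2 / sigma**2

def log_mean_exp(a):
    # Numerically stable log of the mean of exp(a)
    m = a.max()
    return m + np.log(np.exp(a - m).mean())

# Classical IS-LOO: log p(y_i | y_{-i}) ~= -log( (1/S) * sum_s 1 / p(y_i | theta_s) ),
# i.e. importance weights 1 / p(y_i | theta_s) with the full posterior as proposal.
loo_log_pred = np.array([-log_mean_exp(-log_lik(y_i, theta)) for y_i in y])

elpd_loo = loo_log_pred.sum()
print(f"IS-LOO estimate of elpd_loo: {elpd_loo:.2f}")

In this classical scheme the importance weights 1/p(y_i | theta_s) are unbounded when an observation is highly influential, which is the infinite-variance failure mode described in the abstract; the proposed mixture estimator is designed to avoid it while keeping a comparable computational cost.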
