Paper Title

Convergence of Sparse Variational Inference in Gaussian Processes Regression

Authors

David R. Burt, Carl Edward Rasmussen, Mark van der Wilk

Abstract

Gaussian processes are distributions over functions that are versatile and mathematically convenient priors in Bayesian modelling. However, their use is often impeded for data with large numbers of observations, $N$, due to the cubic (in $N$) cost of matrix operations used in exact inference. Many solutions have been proposed that rely on $M \ll N$ inducing variables to form an approximation at a cost of $\mathcal{O}(NM^2)$. While the computational cost appears linear in $N$, the true complexity depends on how $M$ must scale with $N$ to ensure a certain quality of the approximation. In this work, we investigate upper and lower bounds on how $M$ needs to grow with $N$ to ensure high quality approximations. We show that we can make the KL-divergence between the approximate model and the exact posterior arbitrarily small for a Gaussian-noise regression model with $M\ll N$. Specifically, for the popular squared exponential kernel and $D$-dimensional Gaussian distributed covariates, $M=\mathcal{O}((\log N)^D)$ suffice and a method with an overall computational cost of $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ can be used to perform inference.
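
As a rough illustration of the kind of method the abstract refers to, the sketch below evaluates a collapsed variational lower bound for sparse GP regression with $M$ inducing points (in the style of Titsias, 2009), using a squared exponential kernel and Gaussian-distributed inputs. This is not the authors' code; the kernel hyperparameters, noise variance, inducing-point locations, and toy data are illustrative assumptions. The point of the sketch is that every expensive operation involves at most an $M \times N$ matrix, which is where the $\mathcal{O}(NM^2)$ cost comes from.

```python
import numpy as np

def se_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared exponential kernel: k(a, b) = variance * exp(-||a - b||^2 / (2 * lengthscale^2))
    sqdist = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * sqdist / lengthscale**2)

def sgpr_elbo(X, y, Z, noise_var=0.1):
    # Collapsed variational lower bound for sparse GP regression (Titsias-style sketch).
    # X: (N, D) inputs, y: (N,) targets, Z: (M, D) inducing inputs.
    # Every matrix product below involves at most an M x N matrix, so the
    # dominant cost is O(N M^2) rather than the O(N^3) of exact inference.
    N, M = X.shape[0], Z.shape[0]
    Kmm = se_kernel(Z, Z) + 1e-8 * np.eye(M)           # M x M
    Kmn = se_kernel(Z, X)                              # M x N
    L = np.linalg.cholesky(Kmm)                        # O(M^3)
    A = np.linalg.solve(L, Kmn) / np.sqrt(noise_var)   # M x N, the O(N M^2) step
    B = A @ A.T + np.eye(M)                            # M x M
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y) / np.sqrt(noise_var)
    # log N(y | 0, Q_nn + noise_var * I) via the Woodbury identity
    bound = -0.5 * N * np.log(2.0 * np.pi * noise_var)
    bound -= np.sum(np.log(np.diag(LB)))
    bound -= 0.5 * (y @ y) / noise_var
    bound += 0.5 * (c @ c)
    # Trace correction -(1 / (2 noise_var)) * tr(K_nn - Q_nn); tr(K_nn) = N for unit kernel variance
    bound -= 0.5 * (N - noise_var * np.sum(A * A)) / noise_var
    return bound

# Toy usage: N = 1000 one-dimensional Gaussian-distributed inputs, M = 20 inducing points.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=1000)
Z = np.linspace(-2.0, 2.0, 20)[:, None]
print(sgpr_elbo(X, y, Z))
```

In this setting the paper's result says that, under assumptions like those above (squared exponential kernel, Gaussian covariates in $D$ dimensions), choosing $M = \mathcal{O}((\log N)^D)$ inducing points already suffices to drive the KL divergence to the exact posterior arbitrarily small.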
