Paper Title

Robust Gaussian Process Regression with a Bias Model

Authors

Chiwoo Park, David J. Borth, Nicholas S. Wilson, Chad N. Hunter, Fritz J. Friedersdorf

Abstract

This paper presents a new approach to robust Gaussian process (GP) regression. Most existing approaches replace the outlier-prone Gaussian likelihood with a non-Gaussian likelihood induced from a heavy-tailed distribution, such as the Laplace distribution or the Student-t distribution. However, a non-Gaussian likelihood requires computationally expensive approximate Bayesian computation for posterior inference. The proposed approach models an outlier as a noisy and biased observation of an unknown regression function; accordingly, the likelihood contains bias terms that explain the degree of deviation from the regression function. We detail how the biases can be estimated accurately, together with the other hyperparameters, by regularized maximum likelihood estimation. Conditioned on the bias estimates, robust GP regression reduces to a standard GP regression problem with analytical forms of the predictive mean and variance estimates. The proposed approach is therefore simple and computationally attractive. It also gives robust and accurate GP estimates in many of the tested scenarios. For numerical evaluation, we perform a comprehensive simulation study that compares the proposed approach with existing robust GP approaches under simulated scenarios with different outlier proportions and noise levels. The approach is applied to data from two measurement systems, where the predictors are based on robust environmental parameter measurements and the response variables come from more complex chemical sensing methods that contain a certain percentage of outliers. The computationally efficient GP regression and bias model improve the utility of the measurement systems and the value of the environmental data.
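The reduction described in the abstract can be illustrated with a minimal sketch. It covers only the last step: once bias estimates are available, a standard GP regression is run on the bias-corrected responses. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared-exponential kernel, the fixed hyperparameter values, the function names `rbf_kernel` and `gp_predict_given_bias`, and the use of a known bias vector `delta`. The paper estimates the biases by regularized maximum likelihood, and that step is not shown.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict_given_bias(x, y, delta, x_new, noise_var=0.01):
    """Standard GP predictive mean and variance, applied to responses y - delta.

    delta holds one bias value per observation (zero for inliers). In this
    sketch it is supplied by the caller; it is not estimated here.
    """
    K = rbf_kernel(x, x) + noise_var * np.eye(len(x))
    K_s = rbf_kernel(x_new, x)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} (y - delta), computed via two triangular-factor solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - delta))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = rbf_kernel(x_new, x_new).diagonal() - np.sum(v ** 2, axis=0)
    return mean, var

# Toy data: a sine curve with a single biased observation at x = 2.5.
x = np.linspace(0.0, 5.0, 11)
y = np.sin(x)
y[5] += 3.0

delta = np.zeros_like(x)
delta[5] = 3.0  # assumption: the bias is known exactly

x_new = np.array([2.5])
mean_corrected, var_corrected = gp_predict_given_bias(x, y, delta, x_new)
mean_naive, _ = gp_predict_given_bias(x, y, np.zeros_like(x), x_new)
print(mean_corrected[0], mean_naive[0], np.sin(2.5))
```

With the bias subtracted, the prediction at the outlier location stays close to the underlying function, whereas the uncorrected fit is pulled toward the outlier. The predictive variance does not depend on `delta`, which matches the abstract's statement that the problem reduces to standard GP regression once the biases are fixed.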
