Paper title
Variational Bayesian Filtering with Subspace Information for Extreme Spatio-Temporal Matrix Completion
Paper authors
Paper abstract
Missing data is a common problem in real-world sensor data collection. The performance of data imputation approaches degrades rapidly in the extreme scenarios of low and noisy sampling, which arise in many real-world problems in fields such as traffic sensing and environment monitoring. However, jointly exploiting the spatiotemporal and periodic structure, which is generally not captured by classical matrix completion approaches, can improve the imputation performance of sensor data in such real-world conditions. We present a Bayesian approach to spatiotemporal matrix completion wherein we estimate the underlying temporally varying subspace using a Variational Bayesian technique. We couple low-rank matrix completion with a state-space autoregressive framework and a penalty function on the slowly varying subspace to model the temporal and periodic evolution in the data. A major advantage of our method is that a critical parameter such as the rank of the model is tuned automatically using the automatic relevance determination (ARD) approach, unlike in most matrix/tensor completion techniques. We also propose a robust version of the above formulation, which improves imputation performance in the presence of outliers. We evaluate the proposed Variational Bayesian Filtering with Subspace Information (VBFSI) method by imputing matrices of real-world traffic and air pollution data. Simulation results demonstrate that the proposed method outperforms recent state-of-the-art methods and provides sufficiently accurate imputation across different sampling rates. In particular, we demonstrate that fusing the subspace evolution over days can improve the imputation performance even when only 15% of the data is sampled.
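To make the low-rank matrix completion building block mentioned in the abstract concrete, the following is a minimal sketch only. It is not the VBFSI method: it swaps in plain ridge-regularized alternating least squares for the variational Bayesian inference, fixes the rank by hand instead of tuning it with ARD, and omits the state-space autoregressive model and the subspace penalty. The names `als_complete` and `demo`, the matrix sizes, the 50% sampling rate, and the regularization weight are all assumptions chosen for illustration.

```python
import numpy as np


def als_complete(Y, mask, rank, lam=1e-2, n_iters=50, seed=0):
    """Fit Y ~ U @ V.T on observed entries (mask == True) and return the full estimate.

    Illustrative stand-in only: plain ridge-regularized alternating least
    squares with a hand-picked rank, not the paper's Bayesian procedure.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = Y.shape
    U = rng.standard_normal((n_rows, rank))
    V = rng.standard_normal((n_cols, rank))
    ridge = lam * np.eye(rank)
    for _ in range(n_iters):
        # Update each row factor from the observed entries in that row.
        for i in range(n_rows):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ Y[i, obs])
        # Update each column factor from the observed entries in that column.
        for j in range(n_cols):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ Y[obs, j])
    return U @ V.T


def demo():
    """Return the relative error on held-out entries of a synthetic rank-2 matrix."""
    rng = np.random.default_rng(1)
    truth = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 60))
    mask = rng.random(truth.shape) < 0.5  # assumed toy sampling rate of 50%
    observed = np.where(mask, truth, 0.0)  # unobserved entries are never read
    estimate = als_complete(observed, mask, rank=2)
    held_out = ~mask
    return np.linalg.norm((estimate - truth)[held_out]) / np.linalg.norm(truth[held_out])
```

Calling `demo()` returns the relative error on the unobserved entries of a noise-free synthetic matrix. Because the rank here has to be supplied by hand, the sketch also illustrates why automatic rank selection, as the abstract attributes to ARD, is attractive in practice.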