Paper title
Nonlinear Sufficient Dimension Reduction for Distribution-on-Distribution Regression
Paper authors
Paper abstract
We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step is to build universal kernels (cc-universal) on the metric spaces, which results in reproducing kernel Hilbert spaces for the predictor and response that are rich enough to characterize the conditional independence that determines sufficient dimension reduction. For univariate distributions, we construct the universal kernel using the Wasserstein distance, while for multivariate distributions, we resort to the sliced Wasserstein distance. The sliced Wasserstein distance ensures that the metric space possesses topological properties similar to those of the Wasserstein space while also offering significant computational benefits. Numerical results based on synthetic data show that our method outperforms possible competing methods. The method is also applied to several data sets, including fertility and mortality data and Calgary temperature data.
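As a rough illustration of the kernel construction described in the abstract, the sketch below computes a Wasserstein distance between univariate samples, a sliced Wasserstein distance between multivariate samples, and a kernel matrix built from either distance. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the kernel's functional form, so the Gaussian-type form exp(-gamma * d^2) is an assumption, and the function names, the quantile grid size `n_grid`, the number of projections `n_proj`, and the bandwidth `gamma` are all hypothetical choices made for illustration.

```python
import numpy as np


def wasserstein2_1d(x, y, n_grid=200):
    """Approximate 2-Wasserstein distance between two univariate samples.

    In one dimension the distance equals the L2 distance between quantile
    functions, approximated here on a midpoint grid of (0, 1).
    """
    p = (np.arange(n_grid) + 0.5) / n_grid
    qx = np.quantile(x, p)
    qy = np.quantile(y, p)
    return np.sqrt(np.mean((qx - qy) ** 2))


def sliced_wasserstein2(X, Y, n_proj=100, rng=None):
    """Monte Carlo sliced 2-Wasserstein distance between multivariate samples.

    X and Y have shape (n_points, d). Each random unit direction reduces the
    problem to a univariate one; squared distances are averaged over directions.
    """
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    dirs = rng.normal(size=(n_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    sq = [wasserstein2_1d(X @ u, Y @ u) ** 2 for u in dirs]
    return np.sqrt(np.mean(sq))


def gaussian_type_kernel(samples, dist, gamma=1.0):
    """Kernel matrix K[i, j] = exp(-gamma * dist(s_i, s_j)^2).

    The Gaussian-type form is an assumption made for this sketch; `samples`
    is a list of arrays, each array holding draws from one distribution.
    """
    n = len(samples)
    K = np.ones((n, n))  # diagonal stays 1 because dist(s, s) = 0
    for i in range(n):
        for j in range(i + 1, n):
            K[i, j] = K[j, i] = np.exp(-gamma * dist(samples[i], samples[j]) ** 2)
    return K
```

For univariate data, `gaussian_type_kernel(samples, wasserstein2_1d)` gives the kernel matrix over a list of samples. For multivariate data, passing `lambda a, b: sliced_wasserstein2(a, b, rng=0)` fixes the seed so that every pair is compared along the same set of directions.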