Paper Title

Lipschitz standardization for multivariate learning

Paper Authors

Adrián Javaloy, Isabel Valera

Paper Abstract

Probabilistic learning is increasingly being tackled as an optimization problem, with gradient-based approaches as the predominant methods. When modelling multivariate likelihoods, a usual but undesirable outcome is that the learned model fits only a subset of the observed variables, overlooking the rest. In this work, we study this problem through the lens of multitask learning (MTL), where similar effects have been broadly studied. While MTL solutions do not directly apply in the probabilistic setting (as they cannot handle the likelihood constraints), we show that similar ideas may be leveraged during data preprocessing. First, we show that data standardization often helps under common continuous likelihoods, but it is not enough in the general case, especially under mixed continuous and discrete likelihood models. To balance multivariate learning, we then propose a novel data preprocessing technique, Lipschitz standardization, which balances the local Lipschitz smoothness across variables. Our experiments on real-world datasets show that Lipschitz standardization leads to more accurate multivariate models than those learned using existing data preprocessing techniques. The models and datasets employed in the experiments can be found at https://github.com/adrianjav/lipschitz-standardization.
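The abstract only names the technique, so the following is a minimal illustrative sketch of the underlying idea, not the paper's actual algorithm (which also covers discrete likelihoods; see the repository linked above). Assuming a Gaussian likelihood per variable, the gradient of the negative log-likelihood with respect to the mean is (mu - x) / sigma^2, whose Lipschitz constant is 1 / sigma^2, so rescaling each variable can equalize this constant across variables. The function name and the target constant L_target below are hypothetical.

import numpy as np

def lipschitz_standardize_gaussian(X, L_target=1.0):
    """Rescale each column of X so that the gradient of its Gaussian NLL
    with respect to the mean has Lipschitz constant L_target.

    For a Gaussian likelihood, d(NLL)/d(mu) = (mu - x) / sigma**2, whose
    Lipschitz constant in mu is 1 / sigma**2. Scaling a column by s
    multiplies sigma by s, so s = 1 / (sigma * sqrt(L_target)) equalizes
    the constant across columns. We use the empirical std as a plug-in
    estimate of the likelihood's scale parameter.
    """
    X = np.asarray(X, dtype=float)
    sigma = X.std(axis=0)
    scales = 1.0 / (sigma * np.sqrt(L_target))  # one factor per variable
    return X * scales, scales

# Example: two continuous variables on very different scales.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0.0, 100.0, 1000),   # large-scale variable
                     rng.normal(0.0, 0.01, 1000)])   # small-scale variable
X_balanced, scales = lipschitz_standardize_gaussian(X)
print(X_balanced.std(axis=0))  # ~[1, 1]: equal local Lipschitz constants

Note that L_target = 1 recovers classic standardization up to the mean shift, matching the abstract's observation that standardization already helps under common continuous likelihoods; discrete variables cannot be rescaled this way, which is where the paper's general method goes beyond this sketch.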
