Paper Title

Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Paper Authors

Zi-Hao Qiu, Quanqi Hu, Yongjian Zhong, Lijun Zhang, Tianbao Yang

Paper Abstract

NDCG, namely Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-$K$ variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-$K$ NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve the effectiveness for deep learning, we further propose practical strategies using an initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms are proposed to optimize NDCG with a provable convergence guarantee. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
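
To make the kind of objective discussed above concrete, the following is a minimal, hypothetical PyTorch sketch of a smooth NDCG surrogate for a single query, in which the hard rank of each item is replaced by a sigmoid-based soft count of higher-scored items. The function name `soft_ndcg_loss`, the temperature `tau`, and the tensor layout are illustrative assumptions; this is not the authors' compositional or bilevel algorithm, nor the LibAUC implementation.

```python
# A minimal, hypothetical sketch of a smooth NDCG surrogate (not the authors'
# algorithm or the LibAUC implementation). The hard rank is replaced by a
# sigmoid-based soft count, which keeps the objective differentiable.
import torch


def soft_ndcg_loss(scores: torch.Tensor, relevances: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Negative smoothed NDCG for one query.

    scores:     (n,) predicted scores for the n candidate items of a query.
    relevances: (n,) graded relevance labels.
    tau:        temperature of the sigmoid rank approximation (assumed hyperparameter).
    """
    gains = torch.pow(2.0, relevances) - 1.0

    # Soft rank of item i: 1 + sum_{j != i} sigmoid((s_j - s_i) / tau).
    # Entry [i, j] of pairwise_diff is s_j - s_i; the self term sigmoid(0) = 0.5
    # is folded into the constant 0.5 below.
    pairwise_diff = scores.unsqueeze(0) - scores.unsqueeze(1)          # (n, n)
    soft_rank = 0.5 + torch.sigmoid(pairwise_diff / tau).sum(dim=1)    # (n,)

    dcg = (gains / torch.log2(1.0 + soft_rank)).sum()

    # The ideal DCG uses the true ordering of relevances; it does not depend on
    # the model and only normalizes the objective.
    sorted_gains, _ = torch.sort(gains, descending=True)
    positions = torch.arange(1, scores.numel() + 1, dtype=scores.dtype, device=scores.device)
    ideal_dcg = (sorted_gains / torch.log2(1.0 + positions)).sum()

    # Minimizing this loss maximizes the smoothed NDCG.
    return -dcg / ideal_dcg.clamp_min(1e-12)
```

Note that this toy version scores all items of a query at once; the point of the paper's compositional formulation is precisely to avoid that, estimating the rank-dependent inner quantities from sampled mini-batches so that the per-iteration cost scales with the batch size rather than the total number of items.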
