Paper Title

Handling Position Bias for Unbiased Learning to Rank in Hotels Search

Author

Li, Yinxiao

Abstract

Nowadays, search ranking and recommendation systems rely on large amounts of data to train machine learning models, such as Learning-to-Rank (LTR) models, to rank results for a given query. Implicit user feedback (e.g. click data) has become the dominant source of data collection due to its abundance and low cost, especially for major Internet companies. However, a drawback of this data collection approach is that the data can be highly biased, and one of the most significant biases is position bias, where users are biased towards clicking on higher-ranked results. In this work, we investigate the marginal importance of properly handling position bias in an online test environment in Tripadvisor Hotels search. We propose an empirically effective method of handling position bias that fully leverages the user action data. We take advantage of the fact that when a user clicks a result, they have almost certainly observed all the results above it, and the propensities of the results below the clicked result are estimated by a simple but effective position bias model. The online A/B test results show that this method leads to an improved search ranking model.
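
The propensity assignment described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the position bias model, so the power-law form `(lowest_click / k) ** eta`, the parameter `eta`, and the function name `observation_propensities` are all assumptions made for this example.

```python
from typing import List


def observation_propensities(
    num_results: int,
    clicked_positions: List[int],
    eta: float = 1.0,
) -> List[float]:
    """Estimate the chance that each 1-based position was observed.

    Positions at or above the lowest clicked result get propensity 1.0,
    following the abstract. Positions below it use a power-law decay,
    which is an assumed stand-in for the paper's position bias model.
    """
    if any(p < 1 or p > num_results for p in clicked_positions):
        raise ValueError("clicked positions must lie in 1..num_results")

    if not clicked_positions:
        # No click to anchor on: fall back to the plain assumed model.
        return [(1.0 / k) ** eta for k in range(1, num_results + 1)]

    lowest_click = max(clicked_positions)  # deepest position the user clicked
    propensities = []
    for k in range(1, num_results + 1):
        if k <= lowest_click:
            # Results above (or at) the click are treated as observed.
            propensities.append(1.0)
        else:
            # Assumed decay relative to the lowest clicked position.
            propensities.append((lowest_click / k) ** eta)
    return propensities


if __name__ == "__main__":
    # Five results shown, user clicked the second one.
    print(observation_propensities(5, [2]))
```

In an inverse-propensity-weighted training setup, each unclicked result could then be weighted by its estimated propensity, so that results the user probably never saw contribute less as negative examples. Whether the paper weights examples in exactly this way is not stated in the abstract.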
