Paper Title
A Survey on Fairness for Machine Learning on Graphs
Authors
Abstract
Nowadays, the analysis of complex phenomena modeled by graphs plays a crucial role in many real-world application domains where decisions can have a strong societal impact. However, numerous studies have recently revealed that machine learning models can lead to disparate treatment between individuals and to unfair outcomes. In that context, algorithmic contributions for graph mining are not spared by the problem of fairness and present some specific challenges related to the intrinsic nature of graphs: (1) graph data is non-IID, which may invalidate the assumptions underlying many existing studies in fair machine learning; (2) suitable metric definitions are needed to assess the different types of fairness with relational data; and (3) it is algorithmically difficult to find a good trade-off between model accuracy and fairness. This survey is the first one dedicated to fairness for relational data. It aims to present a comprehensive review of state-of-the-art techniques in fairness on graph mining and to identify the open challenges and future trends. In particular, we start by presenting several sensitive application domains and the associated graph mining tasks, with a focus on edge prediction and node classification in the remainder of the survey. We also recall the different metrics proposed to evaluate potential bias at different levels of the graph mining process. We then provide a comprehensive overview of recent contributions in the domain of fair machine learning for graphs, which we classify into pre-processing, in-processing and post-processing models. We also describe existing graph data, including synthetic and real-world benchmarks. Finally, we present in detail five promising directions to advance research on algorithmic fairness on graphs.