Paper title
A Survey on Programmatic Weak Supervision
Paper authors
Paper abstract
Labeling training data has become one of the major roadblocks to using machine learning. Among various weak supervision paradigms, programmatic weak supervision (PWS) has achieved remarkable success in easing the manual labeling bottleneck by programmatically synthesizing training labels from multiple potentially noisy supervision sources. This paper presents a comprehensive survey of recent advances in PWS. In particular, we give a brief introduction to the PWS learning paradigm and review representative approaches for each component within the PWS learning workflow. In addition, we discuss complementary learning paradigms for tackling limited labeled data scenarios and how these related approaches can be used in conjunction with PWS. Finally, we identify several critical challenges that remain under-explored in the area, in the hope of inspiring future research in the field.