Paper Title
Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning
Authors
Abstract
Weakly-supervised semantic segmentation is a challenging task because no pixel-wise label information is provided for training. Recent methods exploit classification networks to localize objects by selecting regions with strong responses. Such response maps provide only sparse information; however, strong pairwise relations exist between pixels in natural images, and these can be used to propagate the sparse map to a much denser one. In this paper, we propose an iterative algorithm to learn such pairwise relations. It consists of two branches: a unary segmentation network, which learns the label probabilities for each pixel, and a pairwise affinity network, which learns an affinity matrix and refines the probability map generated by the unary network. The results refined by the pairwise network are then used as supervision to train the unary network, and the procedure is repeated iteratively to obtain progressively better segmentation. To learn reliable pixel affinity without accurate annotation, we also propose to mine confident regions. We show that iteratively training this framework is equivalent to optimizing an energy function with convergence to a local minimum. Experimental results on the PASCAL VOC 2012 and COCO datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
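
The abstract's core idea, propagating a sparse response map to a denser one through pairwise affinities and then keeping only confident pixels, can be illustrated with a toy NumPy sketch. This is not the paper's implementation: the paper learns the affinity matrix with a network, whereas here the matrix is hand-written. The function names `propagate` and `mine_confident`, the row-normalized averaging step, and the 0.6 threshold are all assumptions made for illustration.

```python
import numpy as np


def propagate(prob, affinity, steps=1):
    """Refine per-pixel class probabilities with a pairwise affinity matrix.

    prob:     (N, C) array; row i holds the class probabilities of pixel i.
    affinity: (N, N) non-negative array; entry (i, j) scores how likely
              pixels i and j are to share a label (hand-written here,
              learned by a network in the paper).
    """
    # Row-normalize so each pixel takes a weighted average over related pixels.
    transition = affinity / affinity.sum(axis=1, keepdims=True)
    refined = prob
    for _ in range(steps):
        refined = transition @ refined
    return refined


def mine_confident(prob, threshold=0.6):
    """Return hard labels; -1 marks pixels whose top probability is below threshold."""
    labels = prob.argmax(axis=1)
    labels[prob.max(axis=1) < threshold] = -1
    return labels


# Four pixels, two classes: only pixels 0 and 3 have a strong response.
prob = np.array([[0.9, 0.1],
                 [0.5, 0.5],
                 [0.5, 0.5],
                 [0.1, 0.9]])

# Assumed affinities: pixels {0, 1} belong together, as do pixels {2, 3}.
affinity = np.array([[1.0, 1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 1.0],
                     [0.0, 0.0, 1.0, 1.0]])

refined = propagate(prob, affinity)

print(mine_confident(prob).tolist())     # [0, -1, -1, 1]  sparse labels
print(mine_confident(refined).tolist())  # [0, 0, 1, 1]    denser labels
```

In the iterative scheme the abstract describes, labels like the refined ones would serve as supervision for retraining the unary network, after which the affinities are relearned and the cycle repeats. The sketch covers only a single propagation step.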