Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation

Paper Title

Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation

Authors

Cong Cao, Tianwei Lin, Dongliang He, Fu Li, Huanjing Yue, Jingyu Yang, Errui Ding

Abstract

A common challenge for robust semantic segmentation is the high cost of data annotation. Existing semi-supervised solutions show great potential for addressing this problem. Their key idea is to construct consistency regularization through unsupervised data augmentation, so that unlabeled data can be used for model training. Perturbing the unlabeled data enables a consistency training loss, which benefits semi-supervised semantic segmentation. However, these perturbations destroy image context and introduce unnatural boundaries, which is harmful for semantic segmentation. In addition, the widely adopted semi-supervised learning framework, Mean-Teacher, suffers from a performance limitation because the student model eventually converges to the teacher model. In this paper, we first propose a context-friendly differentiable geometric warping for unsupervised data augmentation. Second, we propose a novel adversarial dual-student framework that improves on Mean-Teacher in two respects: (1) the dual student models are learned independently, except for a stabilization constraint, to encourage exploiting model diversity; (2) an adversarial training scheme is applied to both students, and the discriminators are used to identify reliable pseudo-labels of unlabeled data for self-training. Effectiveness is validated through extensive experiments on PASCAL VOC2012 and Cityscapes. Our solution significantly improves performance, and state-of-the-art results are achieved on both datasets. Remarkably, using only 12.5% annotated data on PASCAL VOC2012, our solution achieves an mIoU of 73.4%, comparable to full supervision. Our code and models are available at https://github.com/cao-cong/ADS-SemiSeg.
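
For readers unfamiliar with the Mean-Teacher baseline named in the abstract, the following is a minimal sketch of its usual weight update, in which the teacher is an exponential moving average (EMA) of the student. It is not taken from the authors' repository: the function names `ema_update` and `max_gap`, the toy weight values, and the smoothing factor `alpha=0.5` are assumptions chosen for illustration only. The sketch shows how tightly the two models are coupled under this update, which is the kind of coupling the abstract identifies as a limitation.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student."""
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of weights")
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def max_gap(a, b):
    """Largest absolute difference between two weight lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Toy weights, made up for illustration (not real model parameters).
student = [1.0, -2.0, 0.5]
teacher = [0.0, 0.0, 0.0]

# Hold the student fixed and apply the EMA update a few times.
# A small alpha is used here only so the effect is visible in five steps.
gaps = []
for _ in range(5):
    teacher = ema_update(teacher, student, alpha=0.5)
    gaps.append(max_gap(teacher, student))

print(gaps)  # the teacher-student gap shrinks by a factor of alpha per step
```

In this toy setting the gap between the two weight vectors decays geometrically, so the teacher carries little information that the student does not already have. The dual-student design described in the abstract instead trains two students independently, linked only by a stabilization constraint.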
