Teach me to segment with mixed supervision: Confident students become masters

Paper title

Teach me to segment with mixed supervision: Confident students become masters

Authors

Jose Dolz, Christian Desrosiers, Ismail Ben Ayed

Abstract

Deep segmentation neural networks require large training datasets with pixel-wise segmentations, which are expensive to obtain in practice. Mixed supervision could mitigate this difficulty, with a small fraction of the data containing complete pixel-wise annotations, while the rest is less supervised, e.g., only a handful of pixels are labeled. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch. In conjunction with a standard cross-entropy over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions at the bottom branch; and (ii) a Kullback-Leibler (KL) divergence, which transfers the knowledge from the predictions generated by the strongly supervised branch to the less-supervised branch, and guides the entropy (student-confidence) term to avoid trivial solutions. Very interestingly, we show that the synergy between the entropy and KL divergence yields substantial improvements in performance. Furthermore, we discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. Through a series of quantitative and qualitative experiments, we show the effectiveness of the proposed formulation in segmenting the left-ventricle endocardium in MRI images. We demonstrate that our method significantly outperforms other strategies for tackling semantic segmentation within a mixed-supervision framework. More interestingly, and in line with recent observations in classification, we show that the branch trained with reduced supervision largely outperforms the teacher.
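The three loss terms named in the abstract can be illustrated with a minimal pure-Python sketch for the student branch on one less-supervised image. This is not the authors' implementation: the function names, the weights `w_ent` and `w_kl`, the KL direction (teacher to student), and the per-pixel averaging are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class at one labeled pixel."""
    return -math.log(probs[label] + EPS)


def shannon_entropy(probs):
    """Entropy of one pixel's class distribution; low values mean confident."""
    return -sum(p * math.log(p + EPS) for p in probs)


def kl_divergence(p, q):
    """KL(p || q) between two class distributions at one pixel."""
    return sum(pi * math.log((pi + EPS) / (qi + EPS)) for pi, qi in zip(p, q))


def student_loss(student_probs, teacher_probs, labels, w_ent=0.1, w_kl=1.0):
    """Hypothetical combined loss: CE on labeled pixels + entropy + KL.

    student_probs, teacher_probs: per-pixel class probability lists.
    labels: per-pixel class index, or None where the pixel is unlabeled.
    w_ent, w_kl: illustrative weights, not values from the paper.
    """
    # Cross-entropy only over the few pixels that carry a label.
    labeled = [(s, y) for s, y in zip(student_probs, labels) if y is not None]
    ce = sum(cross_entropy(s, y) for s, y in labeled) / max(len(labeled), 1)

    n = len(student_probs)
    # Entropy term: pushes the student toward confident predictions.
    ent = sum(shannon_entropy(s) for s in student_probs) / n
    # KL term (assumed direction): pulls the student toward the teacher.
    kl = sum(kl_divergence(t, s)
             for t, s in zip(teacher_probs, student_probs)) / n
    return ce + w_ent * ent + w_kl * kl


# Three pixels, two classes; only the first pixel is labeled.
student = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
teacher = [[0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]
labels = [0, None, None]
loss = student_loss(student, teacher, labels)
```

Used alone, the entropy term is minimized by any confident prediction, including a degenerate one that assigns every pixel to the same class; in this sketch the KL term is what ties the confident prediction to the teacher's output, which mirrors the role the abstract attributes to it.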
