Paper title
Contrastive pretraining for semantic segmentation is robust to noisy positive pairs
Paper authors
Paper abstract
Domain-specific variants of contrastive learning can construct positive pairs from two distinct in-domain images, while traditional methods just augment the same image twice. For example, we can form a positive pair from two satellite images showing the same location at different times. Ideally, this teaches the model to ignore changes caused by seasons, weather conditions or image acquisition artifacts. However, unlike in traditional contrastive methods, this can result in undesired positive pairs, since we form them without human supervision. For example, a positive pair might consist of one image before a disaster and one after. This could teach the model to ignore the differences between intact and damaged buildings, which might be what we want to detect in the downstream task. Similar to false negative pairs, this could impede model performance. Crucially, in this setting only parts of the images differ in relevant ways, while other parts remain similar. Surprisingly, we find that downstream semantic segmentation is either robust to such badly matched pairs or even benefits from them. The experiments are conducted on the remote sensing dataset xBD, and a synthetic segmentation dataset for which we have full control over the pairing conditions. As a result, practitioners can use these domain-specific contrastive methods without having to filter their positive pairs beforehand, or might even be encouraged to purposefully include such pairs in their pretraining dataset.
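To make the pairing scheme concrete, here is a minimal sketch of how unfiltered temporal positive pairs could be formed and scored. The abstract does not name the contrastive objective, so the InfoNCE loss below is an assumption, chosen as a common option. The names `temporal_positive_pairs`, `info_nce`, and the `temperature` default are hypothetical and are not taken from the paper.

```python
import math
import random


def temporal_positive_pairs(images_by_location, rng):
    """Pick two distinct acquisitions per location as a positive pair.

    No filtering is applied, so a pair may straddle a disaster
    (one image before, one after), as described in the abstract.
    """
    pairs = []
    for acquisitions in images_by_location.values():
        if len(acquisitions) >= 2:
            first, second = rng.sample(acquisitions, 2)
            pairs.append((first, second))
    return pairs


def _normalize(vec):
    # Scale to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss (assumed objective, not confirmed by the source).

    Row i of `positives` is the positive for row i of `anchors`;
    every other row in the batch acts as a negative.
    """
    anchors = [_normalize(v) for v in anchors]
    positives = [_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [
            sum(x * y for x, y in zip(anchor, pos)) / temperature
            for pos in positives
        ]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)
```

In this sketch a badly matched pair is simply a row where the two embeddings should arguably differ (for example intact versus damaged buildings), yet the loss still pulls them together. Calling `temporal_positive_pairs({"site_a": ["t0", "t1", "t2"]}, random.Random(0))` returns one pair of distinct acquisitions from `site_a`, and locations with a single acquisition are skipped.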