Paper Title

Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection

Paper Authors

Liu, Jinyuan, Fan, Xin, Huang, Zhanbo, Wu, Guanyao, Liu, Risheng, Zhong, Wei, Luo, Zhongxuan

Paper Abstract

This study addresses the issue of fusing infrared and visible images that appear differently for object detection. Aiming at generating an image of high visual quality, previous approaches discover commons underlying the two modalities and fuse upon the common space either by iterative optimization or deep networks. These approaches neglect that modality differences implying the complementary information are extremely important for both fusion and subsequent detection task. This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network. The fusion network with one generator and dual discriminators seeks commons while learning from differences, which preserves structural information of targets from the infrared and textural details from the visible. Furthermore, we build a synchronized imaging system with calibrated infrared and optical sensors, and collect currently the most comprehensive benchmark covering a wide range of scenarios. Extensive experiments on several public datasets and our benchmark demonstrate that our method outputs not only visually appealing fusion but also higher detection mAP than the state-of-the-art approaches.
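The abstract describes a fusion network with one generator and two discriminators: one pushing the fused image to keep structural (intensity) information from the infrared input, the other pushing it to keep textural (gradient) detail from the visible input. The sketch below is a loose illustration of such a dual-term generator objective, not the paper's actual loss; the function name, the gradient proxy via `np.diff`, and the weights `alpha`/`beta` are all assumptions made for illustration.

```python
import numpy as np

def generator_loss(d_ir_score, d_vis_score, fused, ir, vis, alpha=0.5, beta=0.5):
    """Illustrative generator objective for a dual-discriminator fusion GAN.

    d_ir_score / d_vis_score: scores in (0, 1] from the two (hypothetical)
    discriminators judging the fused image against infrared and visible inputs.
    """
    # Adversarial terms: the generator wants both discriminators fooled.
    adv = -(np.log(d_ir_score + 1e-8) + np.log(d_vis_score + 1e-8))
    # Content term: preserve infrared intensity (structural target information).
    intensity = np.mean((fused - ir) ** 2)
    # Content term: preserve visible-image texture, proxied by horizontal gradients.
    grad_f = np.abs(np.diff(fused, axis=-1))
    grad_v = np.abs(np.diff(vis, axis=-1))
    texture = np.mean((grad_f - grad_v) ** 2)
    return adv + alpha * intensity + beta * texture
```

In an actual training loop the two discriminators would be updated alternately against the generator; this fragment only shows how "seeking commons while learning from differences" can decompose into separate infrared-structure and visible-texture terms.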
