Paper Title
Simultaneous Enhancement and Super-Resolution of Underwater Imagery for Improved Visual Perception
Paper Authors
Paper Abstract
In this paper, we introduce and tackle the simultaneous enhancement and super-resolution (SESR) problem for underwater robot vision and provide an efficient solution for near real-time applications. We present Deep SESR, a residual-in-residual network-based generative model that can learn to restore perceptual image qualities at 2x, 3x, or 4x higher spatial resolution. We supervise its training by formulating a multi-modal objective function that addresses the chrominance-specific underwater color degradation, lack of image sharpness, and loss in high-level feature representation. It is also supervised to learn salient foreground regions in the image, which in turn guides the network to learn global contrast enhancement. We design an end-to-end training pipeline to jointly learn the saliency prediction and SESR on a shared hierarchical feature space for fast inference. Moreover, we present UFO-120, the first dataset to facilitate large-scale SESR learning; it contains over 1500 training samples and a benchmark test set of 120 samples. By thorough experimental evaluation on the UFO-120 and other standard datasets, we demonstrate that Deep SESR outperforms the existing solutions for underwater image enhancement and super-resolution. We also validate its generalization performance on several test cases that include underwater images with diverse spectral and spatial degradation levels, and also terrestrial images with unseen natural objects. Lastly, we analyze its computational feasibility for single-board deployments and demonstrate its operational benefits for visually-guided underwater robots. The model and dataset information will be available at: https://github.com/xahidbuffon/Deep-SESR.
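The abstract describes a multi-modal objective that combines color-degradation, sharpness, and saliency terms. As a rough illustration of how such a weighted combination can be structured, here is a minimal NumPy sketch; the specific terms, gradient-based sharpness proxy, and weights (`w_color`, `w_sharp`, `w_sal`) are illustrative assumptions, not the paper's actual formulation (which uses learned, high-level feature representations and a trained saliency predictor).

```python
import numpy as np

def multimodal_loss_sketch(pred, target, saliency_pred, saliency_gt,
                           w_color=1.0, w_sharp=0.5, w_sal=0.25):
    """Toy weighted combination of loss terms, in the spirit of the
    multi-modal objective described above. All terms and weights here
    are hypothetical stand-ins for the paper's actual loss components."""
    # Color/content term: mean absolute error over all pixels and channels.
    l_color = np.mean(np.abs(pred - target))
    # Sharpness term: penalize mismatched horizontal/vertical image gradients.
    gx = np.mean(np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1)))
    gy = np.mean(np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0)))
    l_sharp = gx + gy
    # Saliency term: error of the predicted foreground saliency map.
    l_sal = np.mean(np.abs(saliency_pred - saliency_gt))
    return w_color * l_color + w_sharp * l_sharp + w_sal * l_sal

# Tiny example on random "images" (H x W x 3) and saliency maps (H x W).
rng = np.random.default_rng(0)
pred, target = rng.random((8, 8, 3)), rng.random((8, 8, 3))
sal_p, sal_g = rng.random((8, 8)), rng.random((8, 8))
print(multimodal_loss_sketch(pred, target, sal_p, sal_g))
```

Jointly minimizing several such terms over a shared feature space is what lets a single forward pass produce both the enhanced, super-resolved image and the saliency map at inference time.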