Generative Adversarial Networks for Video-to-Video Domain Adaptation

Paper Title

Generative Adversarial Networks for Video-to-Video Domain Adaptation

Authors

Jiawei Chen, Yuexiang Li, Kai Ma, Yefeng Zheng

Abstract

Endoscopic videos from multiple centres often differ in imaging conditions, e.g., color and illumination, so models trained on one domain usually fail to generalize well to another. Domain adaptation is one potential solution to this problem. However, few existing works have focused on the translation of video-based data. In this work, we propose a novel generative adversarial network (GAN), namely VideoGAN, to transfer video-based data across different domains. As the frames of a video may have similar content and imaging conditions, the proposed VideoGAN has an X-shape generator to preserve intra-video consistency during translation. Furthermore, a loss function, namely the color histogram loss, is proposed to tune the color distribution of each translated frame. Two colonoscopic datasets from different centres, i.e., CVC-Clinic and ETIS-Larib, are adopted to evaluate the domain adaptation performance of our VideoGAN. Experimental results demonstrate that the adapted colonoscopic videos generated by our VideoGAN can significantly boost the segmentation accuracy of colorectal polyps on multicentre datasets, i.e., an improvement of 5%. As our VideoGAN is a general network architecture, we also evaluate its performance with the CamVid driving video dataset on the cloudy-to-sunny translation task. Comprehensive experiments show that our VideoGAN can substantially narrow the domain gap.
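
The abstract names a color histogram loss but does not define it. As a rough illustration of the idea only, the sketch below compares the per-channel color histograms of two frames with a mean L1 distance. The function names, the bin count, the frame representation (a list of (r, g, b) tuples with values 0-255), and the choice of L1 distance are all assumptions made for this example, not the authors' formulation.

```python
def channel_histograms(pixels, bins=32):
    """Normalized per-channel histograms of a frame.

    `pixels` is a list of (r, g, b) tuples with integer values in 0..255.
    This representation is an assumption chosen to keep the sketch
    dependency-free.
    """
    if not pixels:
        raise ValueError("frame must contain at least one pixel")
    width = 256 // bins  # intensity values covered by each bin
    hists = [[0.0] * bins for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so that 255 lands in the last bin for any bin count.
            hists[channel][min(value // width, bins - 1)] += 1.0
    n = float(len(pixels))
    return [[count / n for count in hist] for hist in hists]


def histogram_distance(pixels_a, pixels_b, bins=32):
    """Mean absolute difference between the histograms of two frames.

    Hypothetical stand-in for a color histogram loss: 0.0 means the two
    frames have identical per-channel color distributions.
    """
    hists_a = channel_histograms(pixels_a, bins)
    hists_b = channel_histograms(pixels_b, bins)
    total = sum(
        abs(x - y)
        for hist_a, hist_b in zip(hists_a, hists_b)
        for x, y in zip(hist_a, hist_b)
    )
    return total / (3 * bins)
```

A hard-binned histogram like this one is not differentiable, so it can serve as an evaluation measure but not directly as a training loss; a GAN would presumably need a soft (e.g., kernel-weighted) binning instead.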
