Paper Title
MSMG-Net: Multi-scale Multi-grained Supervised Networks for Multi-task Image Manipulation Detection and Localization
Authors
Abstract
With the rapid advance of image editing techniques in recent years, image manipulation detection has attracted considerable attention because of the increasing security risks posed by tampered images. To address these challenges, a novel multi-scale multi-grained deep network (MSMG-Net) is proposed to automatically identify manipulated regions. In MSMG-Net, a parallel multi-scale feature extraction structure is used to extract multi-scale features. Multi-grained feature learning, built on shunted self-attention, is then used to perceive the object-level semantic relations among the multi-scale features. To fuse the multi-scale multi-grained features, a global and local feature fusion block is designed for manipulated region segmentation using a bottom-up approach, and a multi-level feature aggregation block is designed for edge artifact detection using a top-down approach. As a result, MSMG-Net can effectively perceive object-level semantics and encode edge artifacts. Experimental results on five benchmark datasets show that the proposed method outperforms state-of-the-art manipulation detection and localization methods. Extensive ablation experiments and feature visualizations demonstrate that multi-scale multi-grained learning produces effective visual representations of manipulated regions. In addition, MSMG-Net is more robust when the manipulated images are further altered by various post-processing methods.
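The abstract names shunted self-attention as the mechanism behind multi-grained feature learning but gives no details. The sketch below illustrates only the general idea: different attention heads see keys and values at different granularities. It is not the MSMG-Net implementation. The function names, the 1D token sequence, average pooling as the coarsening step, the absence of learned projections, and the pooling rates `(1, 2)` are all assumptions made for illustration.

```python
import math


def avg_pool_tokens(tokens, rate):
    """Merge every `rate` consecutive tokens into their mean (coarser granularity)."""
    pooled = []
    for start in range(0, len(tokens), rate):
        group = tokens[start:start + rate]
        dim = len(group[0])
        pooled.append([sum(t[i] for t in group) / len(group) for i in range(dim)])
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention for a single head."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out


def shunted_attention(tokens, rates=(1, 2)):
    """Hypothetical simplification: one head per pooling rate.

    Each head takes its own slice of the channels, attends from the
    full-resolution tokens to keys/values pooled at that head's rate,
    and the per-head outputs are concatenated along the channel axis.
    Assumes the channel count is divisible by len(rates).
    """
    head_dim = len(tokens[0]) // len(rates)
    outputs = [[] for _ in tokens]
    for h, rate in enumerate(rates):
        # Channel slice for this head (stands in for learned Q/K/V projections).
        head = [t[h * head_dim:(h + 1) * head_dim] for t in tokens]
        coarse = avg_pool_tokens(head, rate)
        for row, head_out in zip(outputs, attend(head, coarse, coarse)):
            row.extend(head_out)
    return outputs
```

Calling `shunted_attention(tokens, rates=(1, 2))` on a list of equal-length token vectors returns one output vector per input token with the same channel count. The head with rate 1 attends over every token, while the head with rate 2 attends over pairwise-averaged tokens, so fine and coarse context are mixed within one layer.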