Paper Title
How Image Generation Helps Visible-to-Infrared Person Re-Identification?
Paper Authors
Paper Abstract
Compared to visible-to-visible (V2V) person re-identification (ReID), the visible-to-infrared (V2I) person ReID task is more challenging due to the lack of sufficient training samples and the large cross-modality discrepancy. To this end, we propose Flow2Flow, a unified framework that jointly achieves training sample expansion and cross-modality image generation for V2I person ReID. Specifically, Flow2Flow learns bijective transformations from the visible image domain and from the infrared image domain to a shared isotropic Gaussian domain, using an invertible flow-based generator for each modality. With Flow2Flow, we can generate pseudo training samples by transforming latent Gaussian noise into visible or infrared images, and generate cross-modality images by transforming an existing-modality image into latent Gaussian noise and then into the missing-modality image. To ensure identity alignment and modality alignment of the generated images, we develop adversarial training strategies to train Flow2Flow. Specifically, we design an image encoder and a modality discriminator for each modality. The image encoder encourages the generated images to be similar to real images of the same identity via identity adversarial training, and the modality discriminator makes the generated images indistinguishable in modality from real images via modality adversarial training. Experimental results on SYSU-MM01 and RegDB demonstrate that both training sample expansion and cross-modality image generation can significantly improve V2I ReID accuracy.
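The two generation paths described in the abstract (noise to image, and image to noise to the other modality) can be illustrated with a toy sketch. The code below is not the Flow2Flow architecture: it assumes tiny affine coupling layers with fixed random weights, treats an "image" as a short list of floats, and omits the training stage entirely (no image encoder, no modality discriminator, no likelihood objective). The names `AffineCoupling`, `Flow`, `sample_pseudo_image`, and `translate` are hypothetical and chosen only for illustration.

```python
import math
import random


class AffineCoupling:
    """Toy invertible layer: the first half of the vector passes through
    unchanged and parameterizes a scale and shift of the second half."""

    def __init__(self, dim, rng):
        self.half = dim // 2
        rest = dim - self.half
        # Assumption: fixed random linear maps stand in for learned networks.
        self.w_s = [[rng.uniform(-0.5, 0.5) for _ in range(self.half)] for _ in range(rest)]
        self.w_t = [[rng.uniform(-0.5, 0.5) for _ in range(self.half)] for _ in range(rest)]

    def _params(self, a):
        # tanh bounds the log-scale so exp() stays numerically stable.
        log_s = [math.tanh(sum(w * v for w, v in zip(row, a))) for row in self.w_s]
        shift = [sum(w * v for w, v in zip(row, a)) for row in self.w_t]
        return log_s, shift

    def forward(self, x):
        a, b = x[:self.half], x[self.half:]
        log_s, shift = self._params(a)
        return a + [v * math.exp(s) + t for v, s, t in zip(b, log_s, shift)]

    def inverse(self, y):
        a, b = y[:self.half], y[self.half:]
        log_s, shift = self._params(a)
        return a + [(v - t) * math.exp(-s) for v, s, t in zip(b, log_s, shift)]


class Flow:
    """Stack of coupling layers; reversing the vector between layers lets
    every coordinate get transformed at some point."""

    def __init__(self, dim, n_layers, seed):
        rng = random.Random(seed)
        self.layers = [AffineCoupling(dim, rng) for _ in range(n_layers)]

    def to_latent(self, x):
        x = list(x)
        for layer in self.layers:
            x = layer.forward(x)[::-1]
        return x

    def to_image(self, z):
        z = list(z)
        for layer in reversed(self.layers):
            z = layer.inverse(z[::-1])
        return z


def sample_pseudo_image(flow, dim, rng):
    """Training sample expansion: Gaussian noise -> image of one modality."""
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return flow.to_image(z)


def translate(src_flow, dst_flow, image):
    """Cross-modality generation: image -> shared latent -> other modality."""
    return dst_flow.to_image(src_flow.to_latent(image))


DIM = 8
visible_flow = Flow(DIM, n_layers=4, seed=0)    # one flow per modality
infrared_flow = Flow(DIM, n_layers=4, seed=1)
rng = random.Random(42)

pseudo_infrared = sample_pseudo_image(infrared_flow, DIM, rng)
visible_image = [rng.uniform(0.0, 1.0) for _ in range(DIM)]
fake_infrared = translate(visible_flow, infrared_flow, visible_image)
# Because both flows are bijective, translating back recovers the input
# up to floating-point error.
recovered = translate(infrared_flow, visible_flow, fake_infrared)
```

In this sketch the shared latent space is what links the two modalities: each flow only ever maps between its own domain and the Gaussian domain, so cross-modality generation is just a composition of one forward pass and one inverse pass. Whether the translated image preserves identity depends on training, which this example does not model.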