Paper Title

EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic Differential Equations

Authors

Min Zhao, Fan Bao, Chongxuan Li, Jun Zhu

Abstract

Score-based diffusion models (SBDMs) have achieved SOTA FID results in unpaired image-to-image translation (I2I). However, we notice that existing methods totally ignore the training data in the source domain, leading to sub-optimal solutions for unpaired I2I. To this end, we propose energy-guided stochastic differential equations (EGSDE), which employ an energy function pretrained on both the source and target domains to guide the inference process of a pretrained SDE for realistic and faithful unpaired I2I. Building upon two feature extractors, we carefully design the energy function such that it encourages the transferred image to preserve the domain-independent features and discard the domain-specific ones. Further, we provide an alternative explanation of EGSDE as a product of experts, where each of the three experts (corresponding to the SDE and the two feature extractors) solely contributes to faithfulness or realism. Empirically, we compare EGSDE to a large family of baselines on three widely adopted unpaired I2I tasks under four metrics. EGSDE not only consistently outperforms existing SBDM-based methods in almost all settings but also achieves SOTA realism results without harming faithfulness. Furthermore, EGSDE allows for flexible trade-offs between realism and faithfulness, and we improve the realism results further (e.g., FID of 51.04 in Cat to Dog and FID of 50.43 in Wild to Dog on AFHQ) by tuning hyper-parameters. The code is available at https://github.com/ML-GSAI/EGSDE.
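To make the idea of guiding a pretrained SDE with an energy function concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a variance-preserving SDE with a constant noise rate `beta`, and it assumes that guidance enters by subtracting the energy gradient from the score, which is the common form of energy guidance. The score function (a Gaussian centred on a hypothetical target mean) and the quadratic energy (a stand-in for faithfulness to the source image `x0`) are illustrative placeholders for the learned score network and the feature-extractor-based energies described in the abstract. The name `guided_reverse_step` and the weight `lam` are likewise hypothetical.

```python
import numpy as np

def toy_score(y, target_mean):
    # Placeholder for a pretrained score network: score of N(target_mean, I).
    return -(y - target_mean)

def toy_energy_grad(y, x0):
    # Gradient of the assumed energy E(y, x0) = 0.5 * ||y - x0||^2.
    # Low energy means y stays close to the source sample x0.
    return y - x0

def guided_reverse_step(y, x0, score_fn, energy_grad_fn, lam, beta, h, z):
    """One Euler-Maruyama step of an energy-guided reverse-time SDE.

    Assumed forward SDE (VP type): dy = -0.5 * beta * y dt + sqrt(beta) dw.
    h > 0 is the step size backwards in time; z is the noise sample.
    """
    f = -0.5 * beta * y                      # forward drift f(y, t)
    g2 = beta                                # squared diffusion coefficient g(t)^2
    guided_score = score_fn(y) - lam * energy_grad_fn(y, x0)
    drift = f - g2 * guided_score            # reverse-time drift
    return y - h * drift + np.sqrt(g2 * h) * z

# Usage: compare an unguided step with a guided one, noise switched off.
y = np.array([2.0, 2.0])
x0 = np.array([0.0, 0.0])                    # source sample to stay faithful to
score = lambda v: toy_score(v, np.array([5.0, 5.0]))
z = np.zeros_like(y)
unguided = guided_reverse_step(y, x0, score, toy_energy_grad, 0.0, 1.0, 0.1, z)
guided = guided_reverse_step(y, x0, score, toy_energy_grad, 1.0, 1.0, 0.1, z)
```

In this toy setting the score pulls the sample toward the target mean, while the energy term pulls it back toward `x0`, so increasing `lam` trades realism for faithfulness. This is only meant to mirror the trade-off the abstract describes.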
