Paper Title
XRayGAN: Consistency-preserving Generation of X-ray Images from Radiology Reports
Paper Authors
Paper Abstract
To effectively train medical students to become qualified radiologists, a large number of X-ray images collected from patients with diverse medical conditions are needed. However, due to data privacy concerns, such images are typically difficult to obtain. To address this problem, we develop methods to generate view-consistent, high-fidelity, and high-resolution X-ray images from radiology reports to facilitate radiology training of medical students. This task presents several challenges. First, from a single report, images with different views (e.g., frontal, lateral) need to be generated. How can we ensure the consistency of these images (i.e., make sure they depict the same patient)? Second, X-ray images must have high resolution; otherwise, many details of diseases would be lost. How can we generate high-resolution images? Third, radiology reports are long and have complicated structure. How can we effectively understand their semantics to generate high-fidelity images that accurately reflect the contents of the reports? To address these three challenges, we propose XRayGAN, which is composed of three modules: (1) a view consistency network that maximizes the consistency between generated frontal-view and lateral-view images; (2) a multi-scale conditional GAN that progressively generates a cascade of images with increasing resolution; (3) a hierarchical attentional encoder that learns the latent semantics of a radiology report by capturing its hierarchical linguistic structure and the various levels of clinical importance of words and sentences. Experiments on two radiology datasets demonstrate the effectiveness of our methods. To the best of our knowledge, this is the first work to generate consistent and high-resolution X-ray images from radiology reports. The code is available at https://github.com/UCSD-AI4H/XRayGAN.
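The third module, the hierarchical attentional encoder, can be pictured with a small sketch. The abstract does not give the encoder's actual architecture, so the code below is only a toy illustration of two-level attention pooling under stated assumptions: word vectors are plain lists of floats, and the names `attention_pool`, `encode_report`, `word_query`, and `sentence_query` are hypothetical, with fixed query vectors standing in for learned parameters. It is not the implementation in the XRayGAN repository.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, query):
    """Return the attention-weighted average of `vectors` and the weights.

    Weights are the softmax of each vector's dot product with `query`
    (a hypothetical stand-in for a learned attention parameter).
    """
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def encode_report(report, word_query, sentence_query):
    """Encode a report given as a list of sentences, each a list of word vectors.

    Level 1: pool the words of each sentence into one sentence vector.
    Level 2: pool the sentence vectors into one report vector.
    """
    sentence_vectors = [attention_pool(words, word_query)[0] for words in report]
    return attention_pool(sentence_vectors, sentence_query)


# Toy report: two sentences with 2-dimensional word vectors (made-up values).
report = [
    [[1.0, 0.0], [0.0, 1.0]],  # sentence with two words
    [[2.0, 2.0]],              # sentence with one word
]
report_vector, sentence_weights = encode_report(
    report, word_query=[1.0, 0.0], sentence_query=[0.0, 1.0]
)
```

In this sketch the attention weights play the role of the "levels of clinical importance" mentioned in the abstract: words that score higher against the word-level query contribute more to their sentence vector, and sentences that score higher against the sentence-level query contribute more to the report vector that would condition the image generator.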