Self-supervised Multi-modal Training from Uncurated Image and Reports Enables Zero-shot Oversight Artificial Intelligence in Radiology

Paper title

Self-supervised Multi-modal Training from Uncurated Image and Reports Enables Zero-shot Oversight Artificial Intelligence in Radiology

Authors

Park, Sangjoon; Lee, Eun Sun; Shin, Kyung Sook; Lee, Jeong Eun; Ye, Jong Chul

Abstract

Oversight AI is an emerging concept in radiology in which the AI forms a symbiosis with radiologists by continuously supporting them in their decision-making. Recent advances in vision-language models shed light on the long-standing problems of oversight AI by understanding both visual and textual concepts and their semantic correspondences. However, vision-language models have had limited success in the medical domain, because current models and learning strategies for photographic images and captions call for a web-scale corpus of image-text pairs, which is often not feasible in the medical domain. To address this, we present a model dubbed Medical Cross-attention Vision-Language model (Medical X-VL), which leverages key components tailored for the medical domain. Medical X-VL is based on the following components: self-supervised uni-modal models in the medical domain with a fusion encoder to bridge them, momentum distillation, sentence-wise contrastive learning for medical reports, and sentence similarity-adjusted hard negative mining. We experimentally demonstrate that our model enables various zero-shot tasks for oversight AI, ranging from zero-shot classification to zero-shot error correction. Our model outperformed the current state-of-the-art models on two different medical image databases, suggesting a novel clinical use of our oversight AI model for monitoring human errors. Our method was especially successful in the data-limited setting, which is frequently encountered in clinics, suggesting potential widespread applicability in the medical domain.
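
To make the idea of zero-shot classification mentioned in the abstract more concrete, the following is a minimal, generic sketch and not the authors' implementation. It assumes an image and a set of text prompts have already been mapped to embedding vectors, and it scores each prompt by cosine similarity. The function names, the toy embeddings, the labels, and the temperature value are all hypothetical; the abstract describes a fusion encoder, so the scoring used by Medical X-VL itself may well differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_emb, prompt_embs, temperature=0.07):
    """Return a softmax distribution over labels from image-prompt similarity.

    image_emb   : embedding of the image (hypothetical, precomputed)
    prompt_embs : dict mapping label -> embedding of its text prompt
    temperature : assumed scaling factor, not taken from the paper
    """
    labels = list(prompt_embs)
    logits = [cosine(image_emb, prompt_embs[label]) / temperature for label in labels]
    # Subtract the maximum logit for numerical stability before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


# Toy embeddings standing in for the outputs of image and text encoders.
image_emb = [0.9, 0.1, 0.0]
prompt_embs = {
    "pneumonia": [1.0, 0.0, 0.0],
    "no finding": [0.0, 1.0, 0.0],
}

probs = zero_shot_classify(image_emb, prompt_embs)
prediction = max(probs, key=probs.get)
print(prediction)
```

No labeled training examples for the target classes are needed at inference time in this setup: adding a new class only requires a new text prompt, which is what makes the task zero-shot.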
