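Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model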

Paper title

Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model

Authors

Mateusz Buda, Ashirbani Saha, Ruth Walsh, Sujata Ghate, Nianyi Li, Albert Święcicki, Joseph Y. Lo, Maciej A. Mazurowski

Abstract

Breast cancer screening is one of the most common radiological tasks with over 39 million exams performed each year. While breast cancer screening has been one of the most studied medical imaging applications of artificial intelligence, the development and evaluation of the algorithms are hindered due to the lack of well-annotated large-scale publicly available datasets. This is particularly an issue for digital breast tomosynthesis (DBT) which is a relatively new breast cancer screening modality. We have curated and made publicly available a large-scale dataset of digital breast tomosynthesis images. It contains 22,032 reconstructed DBT volumes belonging to 5,610 studies from 5,060 patients. This included four groups: (1) 5,129 normal studies, (2) 280 studies where additional imaging was needed but no biopsy was performed, (3) 112 benign biopsied studies, and (4) 89 studies with cancer. Our dataset included masses and architectural distortions which were annotated by two experienced radiologists. Additionally, we developed a single-phase deep learning detection model and tested it using our dataset to serve as a baseline for future research. Our model reached a sensitivity of 65% at 2 false positives per breast. Our large, diverse, and highly-curated dataset will facilitate development and evaluation of AI algorithms for breast cancer screening through providing data for training as well as a common set of cases for model validation. The performance of the model developed in our study shows that the task remains challenging and will serve as a baseline for future model development.
