Deep learning-based algorithm for assessment of knee osteoarthritis severity in radiographs matches performance of radiologists

Paper title

Deep learning-based algorithm for assessment of knee osteoarthritis severity in radiographs matches performance of radiologists

Authors

Albert Swiecicki, Nianyi Li, Jonathan O'Donnell, Nicholas Said, Jichen Yang, Richard C. Mather, William A. Jiranek, Maciej A. Mazurowski

Abstract

A fully automated deep learning algorithm matched the performance of radiologists in assessing knee osteoarthritis severity in radiographs using the Kellgren-Lawrence grading system. The purpose of this work was to develop an automated deep learning-based algorithm that jointly uses Posterior-Anterior (PA) and Lateral (LAT) views of knee radiographs to assess knee osteoarthritis (OA) severity according to the Kellgren-Lawrence (KL) grading system. We used a dataset of 9739 exams from 2802 patients from the Multicenter Osteoarthritis Study (MOST). The dataset was divided into a training set of 2040 patients, a validation set of 259 patients, and a test set of 503 patients. A novel deep learning-based method was used to assess knee OA in two steps: (1) localization of knee joints in the images, and (2) classification according to the KL grading system. Our method used both PA and LAT views as the input to the model. The scores generated by the algorithm were compared to the grades provided in the MOST dataset for the entire test set, as well as to grades provided by 5 radiologists at our institution for a subset of the test set. The model obtained a multi-class accuracy of 71.90% on the entire test set when compared to the ratings provided in the MOST dataset. The quadratic weighted Kappa coefficient for this set was 0.9066. The average quadratic weighted Kappa between all pairs of radiologists from our institution who took part in the study was 0.748. The average quadratic weighted Kappa between the algorithm and the radiologists at our institution was 0.769. The proposed model demonstrated equivalency of KL classification to musculoskeletal (MSK) radiologists, but clearly superior reproducibility. Our model also agreed with the radiologists at our institution to the same extent as the radiologists agreed with each other. The algorithm could be used to provide reproducible assessment of knee osteoarthritis severity.
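The agreement figures quoted in the abstract are multi-class accuracy and quadratic weighted Kappa. The sketch below shows one common way to compute both for ordinal KL grades 0 to 4. It is an illustration only: the function names are hypothetical, the example grades are made up, and the abstract does not say how the authors implemented these metrics.

```python
from collections import Counter
from itertools import combinations


def accuracy(grades_a, grades_b):
    """Fraction of exams on which two graders give exactly the same grade."""
    matches = sum(1 for a, b in zip(grades_a, grades_b) if a == b)
    return matches / len(grades_a)


def quadratic_weighted_kappa(grades_a, grades_b, num_grades=5):
    """Cohen's kappa with quadratic weights for grades 0..num_grades-1.

    A disagreement of d grades is penalised in proportion to d**2, so
    confusing KL 0 with KL 4 costs far more than confusing KL 2 with KL 3.
    """
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("need two non-empty grade lists of equal length")
    n = len(grades_a)
    observed = Counter(zip(grades_a, grades_b))  # joint grade counts
    hist_a = Counter(grades_a)                   # marginal counts, grader A
    hist_b = Counter(grades_b)                   # marginal counts, grader B
    max_dist = (num_grades - 1) ** 2
    disagreement_observed = 0.0
    disagreement_expected = 0.0
    for i in range(num_grades):
        for j in range(num_grades):
            weight = (i - j) ** 2 / max_dist
            disagreement_observed += weight * observed[(i, j)] / n
            disagreement_expected += weight * hist_a[i] * hist_b[j] / (n * n)
    if disagreement_expected == 0:
        # Both graders used one and the same grade everywhere; kappa is
        # formally undefined here, and this sketch returns 1.0 by convention.
        return 1.0
    return 1.0 - disagreement_observed / disagreement_expected


def mean_pairwise_kappa(graders):
    """Average quadratic weighted kappa over all pairs of graders."""
    pairs = list(combinations(graders, 2))
    return sum(quadratic_weighted_kappa(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up grades for five exams, purely to exercise the functions.
    reference = [0, 1, 2, 3, 4]
    model = [0, 1, 2, 3, 3]
    print(accuracy(reference, model))
    print(quadratic_weighted_kappa(reference, model))
```

Because the quadratic weights forgive near misses, a grader can have moderate exact-match accuracy and still reach a high weighted Kappa, which is consistent with the abstract reporting 71.90% accuracy alongside a Kappa of 0.9066.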
