Paper Title

Unsupervised Cross-Domain Feature Extraction for Single Blood Cell Image Classification

Authors

Raheleh Salehi, Ario Sadafi, Armin Gruber, Peter Lienemann, Nassir Navab, Shadi Albarqouni, Carsten Marr

Abstract

Diagnosing hematological malignancies requires identification and classification of white blood cells in peripheral blood smears. Domain shifts caused by different lab procedures, staining, illumination, and microscope settings hamper the re-usability of recently developed machine learning methods on data collected from different sites. Here, we propose a cross-domain adapted autoencoder to extract features in an unsupervised manner on three different datasets of single white blood cells scanned from peripheral blood smears. The autoencoder is based on an R-CNN architecture allowing it to focus on the relevant white blood cell and eliminate artifacts in the image. To evaluate the quality of the extracted features we use a simple random forest to classify single cells. We show that thanks to the rich features extracted by the autoencoder trained on only one of the datasets, the random forest classifier performs satisfactorily on the unseen datasets, and outperforms published oracle networks in the cross-domain task. Our results suggest the possibility of employing this unsupervised approach in more complicated diagnosis and prognosis tasks without the need to add expensive expert labels to unseen data.
