Paper Title
Learning Inter-Modal Correspondence and Phenotypes from Multi-Modal Electronic Health Records
Authors
Abstract
Non-negative tensor factorization has been shown to be a practical solution for automatically discovering phenotypes from electronic health records (EHR) with minimal human supervision. Such methods generally require an input tensor in which the inter-modal interactions are already established; in practice, however, the correspondence between different modalities (e.g., between medications and diagnoses) is often missing. Heuristic methods can be applied to estimate the correspondence, but they inevitably introduce errors and lead to sub-optimal phenotype quality. This matters most for patients with complex health conditions (e.g., in critical care), whose records contain multiple diagnoses and medications at the same time. To alleviate this problem and discover phenotypes from EHR with unobserved inter-modal correspondence, we propose collective hidden interaction tensor factorization (cHITF), which infers the correspondence between multiple modalities jointly with phenotype discovery. We assume that the observed matrix for each modality is a marginalization of the unobserved inter-modal correspondence, which is reconstructed by maximizing the likelihood of the observed matrices. Extensive experiments on the real-world MIMIC-III dataset demonstrate that cHITF effectively infers clinically meaningful inter-modal correspondence, discovers phenotypes that are more clinically relevant and diverse, and achieves better predictive performance than a number of state-of-the-art computational phenotyping models.
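The marginalization assumption in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes two modalities (diagnoses and medications), a non-negative CP-style hidden interaction tensor, and a Poisson likelihood for the observed matrices; none of these details are stated in the abstract. All names (`hidden_tensor`, `marginal`, `poisson_loglik`, `U`, `V`, `W`) are hypothetical. The point is that each observed patient-by-modality matrix can be obtained by summing the hidden tensor over the other modality, and that this sum can be computed from the factor matrices without forming the tensor.

```python
import math
import random


def hidden_tensor(U, V, W):
    """X[p][d][m] = sum_r U[p][r] * V[d][r] * W[m][r] (assumed CP-style model)."""
    R = len(U[0])
    return [[[sum(u[r] * v[r] * w[r] for r in range(R)) for w in W]
             for v in V] for u in U]


def marginal(U, A, B):
    """Sum the hidden tensor over the mode of B without forming the tensor.

    Returns a patients-by-len(A) matrix with entries
    sum_r U[p][r] * A[i][r] * (sum_j B[j][r]).
    """
    R = len(U[0])
    col_sums = [sum(b[r] for b in B) for r in range(R)]
    return [[sum(u[r] * a[r] * col_sums[r] for r in range(R)) for a in A]
            for u in U]


def poisson_loglik(observed, expected, eps=1e-12):
    """Poisson log-likelihood (an assumption here), dropping the term
    that does not depend on the factors."""
    return sum(o * math.log(e + eps) - e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))


def rand_matrix(rows, cols):
    """Random non-negative matrix as a list of lists."""
    return [[random.random() for _ in range(cols)] for _ in range(rows)]


random.seed(0)
U = rand_matrix(5, 3)  # patients x phenotypes
V = rand_matrix(4, 3)  # diagnoses x phenotypes
W = rand_matrix(6, 3)  # medications x phenotypes

X = hidden_tensor(U, V, W)   # unobserved patient x diagnosis x medication tensor
diag = marginal(U, V, W)     # patients x diagnoses, medications summed out
med = marginal(U, W, V)      # patients x medications, diagnoses summed out
```

In a fitting procedure, the factor matrices would be adjusted so that `diag` and `med` maximize `poisson_loglik` against the observed count matrices; the optimization itself is omitted from this sketch.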