Paper Title
DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction
Authors
Abstract
Clinical trials are essential for drug development but often suffer from expensive, inaccurate, and insufficient patient recruitment. The core problem of patient-trial matching is to find qualified patients for a trial, where patient information is stored in electronic health records (EHR) while trial eligibility criteria (EC) are described in text documents available on the web. This raises two questions: how to represent longitudinal patient EHR, and how to extract complex logical rules from EC. Most existing works rely on manual rule-based extraction, which is time-consuming and inflexible for complex inference. To address these challenges, we propose DeepEnroll, a cross-modal inference learning model that jointly encodes enrollment criteria (text) and patient records (tabular data) into a shared latent space for matching inference. DeepEnroll applies a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to encode clinical trial information into sentence embeddings, and uses a hierarchical embedding model to represent longitudinal patient EHR. In addition, DeepEnroll is augmented by a numerical information embedding and entailment module to reason over numerical information in both EC and EHR. These encoders are trained jointly to optimize the patient-trial matching score. We evaluated DeepEnroll on the patient-trial matching task using real-world datasets. DeepEnroll outperformed the best baseline by up to 12.4% in average F1.
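The matching idea in the abstract can be illustrated with a toy sketch. This is not the DeepEnroll model: the vectors below are hand-written stand-ins for learned embeddings, the numeric check is a plain interval test rather than a learned entailment module, and every name (`cosine`, `numeric_entails`, `match`, the `"match"`/`"mismatch"`/`"unknown"` labels, the 0.5 threshold) is an assumption made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def numeric_entails(value, low=None, high=None):
    """Return True if a patient value lies inside the criterion's closed interval."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def match(patient_vec, patient_values, criterion, threshold=0.5):
    """Combine a shared-space similarity with a numeric check (hypothetical rule)."""
    bound = criterion.get("bound")
    if bound is not None:
        name, low, high = bound
        if name not in patient_values:
            return "unknown"  # the record does not contain the required measurement
        if not numeric_entails(patient_values[name], low, high):
            return "mismatch"  # the numeric constraint is violated
    similarity = cosine(patient_vec, criterion["vec"])
    return "match" if similarity >= threshold else "mismatch"


# Hand-made stand-ins for a criterion embedding and a patient embedding.
criterion = {"vec": [0.9, 0.1], "bound": ("age", 18, 65)}
patient_vec = [1.0, 0.0]

result_in_range = match(patient_vec, {"age": 40}, criterion)
result_out_of_range = match(patient_vec, {"age": 70}, criterion)
result_missing = match(patient_vec, {}, criterion)
```

In this sketch the numeric check runs before the similarity test, so a violated bound rules out a match regardless of how close the two vectors are. In a trained model both signals would come from jointly learned encoders rather than fixed rules.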