Paper title
Graph Neural Networks for Double-Strand DNA Breaks Prediction
Paper authors
Paper abstract
Double-strand DNA breaks (DSBs) are a form of DNA damage that can cause abnormal chromosomal rearrangements. Recent technologies for detecting them rely on high-throughput experiments, which are costly and technically challenging. We therefore design GraphDSB, a graph neural network based method that predicts DSBs from DNA sequence features and chromosome structure information. To improve the expressive power of the model, we introduce the Jumping Knowledge architecture and several effective structural encoding methods. Experiments on datasets from normal human epidermal keratinocytes (NHEK) and a chronic myeloid leukemia cell line (K562) verify the contribution of structural information to DSB prediction, and ablation studies further demonstrate the effectiveness of the designed components in the proposed GraphDSB framework. Finally, we use GNNExplainer to analyze the contribution of node features and topology to DSB prediction, and show that 5-mer DNA sequence features and two chromatin interaction modes contribute strongly.
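To make the notion of 5-mer DNA sequence features concrete, the sketch below computes a normalized k-mer frequency vector for one genomic bin. The abstract does not specify how GraphDSB derives its sequence features, so the encoding here is an assumption: normalized counts over all 4^k k-mers in lexicographic order, with windows containing ambiguous bases skipped. The function name kmer_frequencies is hypothetical.

```python
from itertools import product


def kmer_frequencies(sequence, k=5):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper: the output has 4**k entries, ordered
    lexicographically over the alphabet A, C, G, T.
    """
    alphabet = "ACGT"
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    total = 0
    seq = sequence.upper()

    # Slide a window of length k across the sequence.
    for start in range(len(seq) - k + 1):
        position = index.get(seq[start:start + k])
        if position is not None:  # skip windows with N or other ambiguous bases
            counts[position] += 1
            total += 1

    if total == 0:
        return [0.0] * len(kmers)
    return [count / total for count in counts]


# Example: one feature vector that could serve as the features of a graph node.
features = kmer_frequencies("ACGTACGTAC")
print(len(features))
```

In a graph setting such as the one the abstract describes, a vector like this could be attached to each node (genomic bin), with edges taken from chromatin interaction data. That pairing is an illustration, not the paper's confirmed pipeline.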