Paper Title

Correlated Mixed Membership Modeling of Somatic Mutations

Authors

Rahul Mehta, Muge Karaman

Abstract

Recent studies of cancer somatic mutation profiles seek to identify mutations for targeted therapy in personalized medicine. Analysis of these profiles, however, is not trivial: each profile is heterogeneous, and multiple confounding factors, such as cancer (sub)type, biological processes, total number of mutations, and non-linear mutation interactions, influence the cause-and-effect relationships between cancer genes. Moreover, cancer is biologically redundant, i.e., distinct mutations can alter similar biological processes, so it is important to identify all possible combinatorial sets of mutations for effective patient treatment. To model this phenomenon, we propose the correlated zero-inflated negative binomial process to infer the inherent structure of somatic mutation profiles through latent representations. This stochastic process takes into account different, yet correlated, co-occurring mutations using profile-specific negative binomial dispersion parameters that are mixed with a correlated beta-Bernoulli process and a probability parameter to model profile heterogeneity. These model parameters are inferred by iterative optimization via amortized and stochastic variational inference using the Pan Cancer dataset from The Cancer Genome Atlas (TCGA). By examining the latent space, we identify biologically relevant correlations between somatic mutations.
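For readers unfamiliar with the zero-inflated negative binomial (ZINB) distribution named in the abstract, the following is a minimal sketch of its textbook probability mass function. It is not the authors' correlated process: the abstract does not give the paper's parameterization, so the function names and the parameters `r` (dispersion), `p` (success probability), and `pi` (probability of a structural zero) are assumptions chosen for illustration.

```python
import math


def nb_log_pmf(k: int, r: float, p: float) -> float:
    """Log-probability of count k under a negative binomial NB(r, p).

    Assumed convention: k counts successes (each with probability p)
    observed before r failures, so the mean is r * p / (1 - p).
    """
    return (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log1p(-p)
        + k * math.log(p)
    )


def zinb_pmf(k: int, r: float, p: float, pi: float) -> float:
    """Probability of count k under a zero-inflated negative binomial.

    With probability pi the count is a structural zero; otherwise it is
    drawn from NB(r, p). Only k == 0 receives the extra mass pi.
    """
    nb = math.exp(nb_log_pmf(k, r, p))
    if k == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


# Hypothetical parameter values, chosen only to exercise the functions.
probabilities = [zinb_pmf(k, r=2.0, p=0.3, pi=0.4) for k in range(200)]
total_mass = sum(probabilities)
```

In a mutation-count setting, the zero-inflation term would account for genes that are never mutated in a profile, while the dispersion parameter lets the variance exceed the mean. How the paper ties these parameters to the correlated beta-Bernoulli process is not stated in the abstract and is not modeled here.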
