Adaptive Bayesian Variable Clustering via Structural Learning of Breast Cancer Data

Paper Title

Adaptive Bayesian Variable Clustering via Structural Learning of Breast Cancer Data

Authors

Ghosh, Riddhi Pratim; Maity, Arnab Kumar; Pourahmadi, Mohsen; Mallick, Bani K.

Abstract

Clustering of proteins is of interest in cancer cell biology. This article proposes a hierarchical Bayesian model for protein (variable) clustering that hinges on the correlation structure. Starting from a multivariate normal likelihood, we enforce the clustering through prior modeling, using an angle-based unconstrained reparameterization of correlations, and assume a truncated Poisson distribution as the prior on the number of clusters in order to penalize a large number of clusters. The posterior distributions of the parameters are not available in explicit form, so a reversible jump Markov chain Monte Carlo (RJMCMC) based technique is used to simulate the parameters from the posteriors. The end products of the proposed method are the estimated cluster configuration of the proteins (variables) and the number of clusters. The Bayesian method is flexible enough to cluster the proteins as well as to estimate the number of clusters. The performance of the proposed method is substantiated with extensive simulation studies and with a protein expression data set concerning hereditary disposition in breast cancer, in which the proteins come from different pathways.
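
The angle-based reparameterization mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard hyperspherical construction, in which each row of a lower-triangular factor B is written in terms of angles in (0, pi) so that R = B B^T is always a valid correlation matrix. The function name `angles_to_correlation` and the example angles are hypothetical, and the exact convention used in the paper may differ.

```python
import math


def angles_to_correlation(theta):
    """Map angles theta[i][j] (for j < i, each in (0, pi)) to a correlation matrix.

    Assumed convention (standard hyperspherical form, not taken from the paper):
    row i of the lower-triangular factor B is
        B[i][j] = cos(theta[i][j]) * prod_{k<j} sin(theta[i][k])   for j < i
        B[i][i] = prod_{k<i} sin(theta[i][k])
    so every row has unit norm and R = B B^T has a unit diagonal.
    """
    p = len(theta)
    B = [[0.0] * p for _ in range(p)]
    for i in range(p):
        sin_prod = 1.0  # running product of sines along row i
        for j in range(i):
            B[i][j] = math.cos(theta[i][j]) * sin_prod
            sin_prod *= math.sin(theta[i][j])
        B[i][i] = sin_prod
    # R = B B^T, computed entry by entry
    return [
        [sum(B[i][k] * B[j][k] for k in range(p)) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical example with three variables: angles of pi/2 between
# variable 0 and the other two make variable 0 uncorrelated with them,
# while variables 1 and 2 keep a correlation of cos(pi/3).
theta = [
    [0.0, 0.0, 0.0],
    [math.pi / 2, 0.0, 0.0],
    [math.pi / 2, math.pi / 3, 0.0],
]
R = angles_to_correlation(theta)
```

In this example an angle of pi/2 corresponds to a zero correlation, so a block pattern in the angles yields a block pattern in R. This suggests how a prior on unconstrained angles could encode a cluster structure, although the prior specification itself is not given in the abstract.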
