Paper title
Comparison and Bayesian Estimation of Feature Allocations
Paper authors
Paper abstract
Feature allocation models postulate a sampling distribution whose parameters are derived from shared features. Bayesian models place a prior distribution on the feature allocation, and Markov chain Monte Carlo is typically used for model fitting, which results in thousands of feature allocations sampled from the posterior distribution. Based on these samples, we propose a method to provide a point estimate of a latent feature allocation. First, we introduce the FARO loss, a function between feature allocations which satisfies quasi-metric properties and allows for comparing feature allocations with differing numbers of features. The loss involves finding the optimal feature ordering among all possible orderings, but computational feasibility is achieved by framing this task as a linear assignment problem. We also introduce the FANGS algorithm to obtain a Bayes estimate by minimizing the Monte Carlo estimate of the posterior expected FARO loss using the available samples. FANGS can produce an estimate other than those visited in the Markov chain. We provide an investigation of existing methods and our proposed methods. Our loss function and search algorithm are implemented in the fangs package in R.
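To illustrate why the search over feature orderings can be framed as a linear assignment problem, the following minimal Python sketch may help. It is not the FARO loss itself: as a simplifying assumption it uses a plain, unweighted Hamming distance between binary item-by-feature matrices, pads the narrower matrix with empty (all-zero) features, and finds the best column matching by brute force. The function names (pad_columns, column_cost, min_reordered_hamming) are hypothetical and are not taken from the fangs package.

```python
from itertools import permutations


def pad_columns(Z, k):
    """Append all-zero feature columns so every row of Z has exactly k entries."""
    return [list(row) + [0] * (k - len(row)) for row in Z]


def column_cost(A, B, i, j):
    """Count the items on which feature i of A and feature j of B disagree."""
    return sum(1 for row_a, row_b in zip(A, B) if row_a[i] != row_b[j])


def min_reordered_hamming(Z1, Z2):
    """Smallest Hamming distance between Z1 and any column reordering of Z2.

    Assumption: this unweighted distance is a simplified stand-in for the
    FARO loss described in the abstract, not its actual definition.
    """
    if len(Z1) != len(Z2):
        raise ValueError("both allocations must describe the same items")
    k = max(len(Z1[0]), len(Z2[0]))
    A, B = pad_columns(Z1, k), pad_columns(Z2, k)
    # The total distance decomposes into a sum of pairwise column costs,
    # which is exactly the objective of a linear assignment problem.
    cost = [[column_cost(A, B, i, j) for j in range(k)] for i in range(k)]
    # Brute force over all k! matchings; only sensible for very small k.
    return min(
        sum(cost[i][perm[i]] for i in range(k))
        for perm in permutations(range(k))
    )


Z_a = [[1, 0], [1, 1], [0, 1]]   # 3 items, 2 features
Z_b = [[0, 1], [1, 1], [1, 0]]   # same allocation with the features swapped
Z_c = [[1], [1], [0]]            # 3 items, 1 feature
print(min_reordered_hamming(Z_a, Z_b))
print(min_reordered_hamming(Z_a, Z_c))
```

Because the objective is a sum of entries of the cost matrix, one per matched pair of features, the brute-force minimum could be replaced by a polynomial-time assignment solver (for example scipy.optimize.linear_sum_assignment applied to the same cost matrix), which is presumably what makes the comparison feasible for larger numbers of features.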