Title
Are Neural Topic Models Broken?
Authors
Abstract
Recently, the relationship between automated and human evaluation of topic models has been called into question. Method developers have staked the efficacy of new topic model variants on automated measures, and their failure to approximate human preferences places these models on uncertain ground. Moreover, existing evaluation paradigms are often divorced from real-world use. Motivated by content analysis as a dominant real-world use case for topic modeling, we analyze two related aspects of topic models that affect their effectiveness and trustworthiness in practice for that purpose: the stability of their estimates and the extent to which the model's discovered categories align with human-determined categories in the data. We find that neural topic models fare worse in both respects compared to an established classical method. We take a step toward addressing both issues in tandem by demonstrating that a straightforward ensembling method can reliably outperform the members of the ensemble.