Paper title
Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates
Paper authors
Paper abstract
Many applications of computational social science aim to infer causal conclusions from non-experimental data. Such observational data often contain confounders, variables that influence both potential causes and potential effects. Unmeasured or latent confounders can bias causal estimates, and this has motivated interest in measuring potential confounders from observed text. For example, an individual's entire history of social media posts or the content of a news article could provide a rich measurement of multiple confounders. Yet methods and applications for this problem are scattered across different communities, and evaluation practices are inconsistent. This review is the first to gather and categorize these examples and provide a guide to data-processing and evaluation decisions. Despite increased attention on adjusting for confounding using text, there are still many open problems, which we highlight in this paper.