Title
Stable discovery of interpretable subgroups via calibration in causal studies
Authors
Abstract
Building on Yu and Kumbier's PCS framework, we introduce StaDISC (Stable Discovery of Interpretable Subgroups via Calibration), a novel methodology for randomized experiments that discovers interpretable subgroups with large heterogeneous treatment effects. StaDISC was developed during our re-analysis of the 1999-2000 VIGOR study, an 8076-patient randomized controlled trial (RCT) that compared the risk of adverse events from a then newly approved drug, Rofecoxib (Vioxx), with that from an older drug, Naproxen. On average, and in comparison to Naproxen, Vioxx was found to reduce the risk of gastrointestinal (GI) events but increase the risk of thrombotic cardiovascular (CVT) events. Applying StaDISC, we fit 18 popular conditional average treatment effect (CATE) estimators for both outcomes and use calibration to demonstrate their poor global performance. However, they are locally well-calibrated and stable, enabling the identification of patient groups with larger than (estimated) average treatment effects. In fact, StaDISC discovers three clinically interpretable subgroups each for the GI outcome (totaling 29.4% of the study size) and the CVT outcome (totaling 11.0%). Complementary analyses of the found subgroups using the 2001-2004 APPROVe study, a separate, independently conducted RCT with 2587 patients, provide further supporting evidence for the promise of StaDISC.
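To make the idea of calibrating a CATE estimator concrete, the following is a minimal sketch and not the StaDISC procedure itself. It assumes a randomized binary treatment, so that within any group of patients the difference in mean outcomes between arms estimates that group's average treatment effect. Patients are binned by predicted CATE, and the mean prediction in each bin is compared with the observed difference in means. The function name `calibration_table`, the quantile binning, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def calibration_table(cate_pred, treated, outcome, n_bins=4):
    """Return (mean predicted CATE, observed difference in means, bin size)
    for each quantile bin of the predicted CATE.

    Hypothetical helper: relies on randomized treatment assignment."""
    cate_pred = np.asarray(cate_pred, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    # Sort patients by predicted CATE and split into near-equal bins.
    order = np.argsort(cate_pred)
    rows = []
    for idx in np.array_split(order, n_bins):
        t = treated[idx]
        if t.all() or not t.any():
            continue  # a bin needs both arms to estimate an effect
        observed = outcome[idx][t].mean() - outcome[idx][~t].mean()
        rows.append((cate_pred[idx].mean(), observed, len(idx)))
    return rows

# Synthetic data (assumption): one covariate x drives the true effect.
rng = np.random.default_rng(0)
n = 8000
x = rng.uniform(size=n)
treated = rng.integers(0, 2, size=n).astype(bool)
true_cate = 0.2 * x                      # hypothetical effect on event risk
outcome = rng.binomial(1, 0.1 + true_cate * treated)

# Use the true CATE as a stand-in for an estimator's predictions.
rows = calibration_table(true_cate, treated, outcome)
for predicted, observed, size in rows:
    print(f"predicted={predicted:.3f}  observed={observed:.3f}  n={size}")
```

A well-calibrated estimator yields bins where predicted and observed effects roughly agree. An estimator can be poorly calibrated over the full range of predictions yet still agree with the data in some bins, which is the kind of local behavior the abstract describes.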