Paper Title
Exact Inference for Disease Prevalence Based on a Test with Unknown Specificity and Sensitivity
Paper Authors
Paper Abstract
To make informed public policy decisions in battling the ongoing COVID-19 pandemic, it is important to know the disease prevalence in a population. There are two intertwined difficulties in estimating this prevalence from test results on a group of subjects. First, the test is prone to measurement error, with unknown sensitivity and specificity. Second, the prevalence tends to be low at the initial stage of the pandemic, and because the specificity of the test is imperfect, we may not be able to determine whether a positive test result is a false positive. Statistical inference based on large-sample approximations or the conventional bootstrap may not be sufficiently reliable and may yield confidence intervals that do not cover the true prevalence at the nominal level. In this paper, we propose a set of 95% confidence intervals whose validity is guaranteed and does not depend on the sample size in the unweighted setting. For the weighted setting, the proposed inference is equivalent to a class of hybrid bootstrap methods, whose performance is also more robust to the sample size than that of methods based on asymptotic approximations. The methods are used to reanalyze data from a study investigating antibody prevalence in Santa Clara County, California, which was the motivating example for this research, as well as several other seroprevalence studies in which the authors had tried to correct their estimates for test performance. Extensive simulation studies have been conducted to examine the finite-sample performance of the proposed confidence intervals.
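To illustrate why imperfect specificity matters so much at low prevalence, the sketch below applies the standard Rogan-Gladen correction, which adjusts the observed positive rate for assumed sensitivity and specificity. This is not the confidence interval procedure proposed in the paper; it is only a point estimate, and the function name `corrected_prevalence` and all numbers in the usage note are hypothetical.

```python
def corrected_prevalence(positives, n_tested, sensitivity, specificity):
    """Rogan-Gladen point estimate of prevalence, truncated to [0, 1].

    Assumes sensitivity and specificity are known constants, which is
    exactly the assumption the abstract says does not hold in practice.
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    # Youden index: the test carries information only if this is > 0.
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    apparent = positives / n_tested  # raw fraction of positive tests
    # Subtract the expected false-positive rate, then rescale.
    raw = (apparent + specificity - 1.0) / youden
    return min(1.0, max(0.0, raw))
```

With a hypothetical 50 positives among 3000 subjects and sensitivity 0.8, lowering the assumed specificity from 0.995 to 0.98 moves the estimate from roughly 1.5% to 0, because the expected false-positive rate then exceeds the observed positive rate. That instability is the reason uncertainty in specificity has to be carried into the interval rather than ignored.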