Paper title

Are official confirmed cases and fatalities counts good enough to study the COVID-19 pandemic dynamics? A critical assessment through the case of Italy

Authors

Krzysztof Bartoszek, Emanuele Guidotti, Stefano Maria Iacus, Marcin Okrój

Abstract

As the COVID-19 outbreak develops, the two most frequently reported statistics seem to be the raw confirmed case counts and fatality counts. Focusing on Italy, one of the hardest-hit countries, we look at how these two values could be put in perspective to reflect the dynamics of the virus spread. In particular, we find that merely considering the confirmed case counts would be very misleading. The number of daily tests grows, while the daily fraction of confirmed cases to total tests has a change point. Depending on the region, this fraction generally increases with strong fluctuations until around 15th-22nd March and then decreases linearly. Combined with the increasing trend of daily performed tests, the raw confirmed case counts are not representative of the situation and are confounded with the sampling effort. We observe this when regressing on time the logged fraction of positive tests and, for comparison, the logged raw confirmed count. Hence, calibrating model parameters for this virus's dynamics should not be based only on confirmed case counts (without rescaling by the number of tests), but should also take fatality and hospitalization counts into consideration, as these variables are not prone to distortion by testing efforts. Furthermore, reporting statistics at the national level does not say much about the dynamics of the disease, which take place at the regional level. These findings are based on the official data of total death counts up to 15th April 2020 released by ISTAT and up to 10th May 2020 for the number of cases. In this work we do not fit models; rather, we investigate whether this task is possible at all. This work also introduces a new tool for collecting and harmonizing official statistics from different sources, in the form of a package for the R statistical environment, and presents the COVID-19 Data Hub.
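
The comparison described in the abstract, regressing on time the logged fraction of positive tests and the logged raw confirmed count, can be illustrated with a minimal sketch. This is not the authors' code or data. The series below are synthetic and hypothetical, built so that testing grows by 8% per day (in log terms) while the positive fraction declines by 3% per day, and the helper name `ols_slope` is an assumption introduced only for this example.

```python
import math


def ols_slope(xs, ys):
    """Slope of an ordinary least-squares line fitted to the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, illustrative series (NOT real Italian figures).
days = list(range(30))
daily_tests = [1000.0 * math.exp(0.08 * d) for d in days]        # testing effort grows
positive_fraction = [0.30 * math.exp(-0.03 * d) for d in days]   # share of positives falls
daily_confirmed = [t * f for t, f in zip(daily_tests, positive_fraction)]

# Regress the logged raw confirmed count on time.
raw_slope = ols_slope(days, [math.log(c) for c in daily_confirmed])

# Regress the logged fraction of positive tests on time.
fraction_slope = ols_slope(
    days, [math.log(c / t) for c, t in zip(daily_confirmed, daily_tests)]
)

print(f"slope of log(raw confirmed count): {raw_slope:+.3f}")
print(f"slope of log(positive fraction):   {fraction_slope:+.3f}")
```

By construction, the raw count has a positive log-slope (0.08 - 0.03 = 0.05 per day) even though the positive fraction has a negative one (-0.03 per day). This is a toy instance of how growing testing effort can make raw confirmed counts rise while the rescaled series falls.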
