Paper Title
Prediction Regions for Poisson and Over-Dispersed Poisson Regression Models with Applications to Forecasting Number of Deaths during the COVID-19 Pandemic
Paper Authors
Paper Abstract
Motivated by the current Coronavirus Disease (COVID-19) pandemic, which is due to the SARS-CoV-2 virus, and the important problem of forecasting daily deaths and cumulative deaths, this paper examines the construction of prediction regions or intervals under the Poisson regression model and under an over-dispersed Poisson regression model. For the Poisson regression model, several prediction regions are developed and their performance is compared through simulation studies. The methods are applied to the problem of forecasting daily and cumulative deaths in the United States (US) due to COVID-19. To examine their performance relative to what actually happened, daily deaths data up to May 15th were used to forecast cumulative deaths by June 1st. It was observed that there is over-dispersion in the observed data relative to the Poisson regression model. An over-dispersed Poisson regression model is therefore proposed. This new model builds on frailty ideas in Survival Analysis, and over-dispersion is quantified through an additional parameter. The Poisson regression model is a hidden model in this over-dispersed Poisson regression model and is obtained as a limiting case when the over-dispersion parameter increases to infinity. A prediction region for the cumulative number of US deaths due to COVID-19 by July 16th, given the data until July 2nd, is presented. Finally, the paper discusses limitations of the proposed procedures and mentions open research problems, as well as the dangers and pitfalls when forecasting on a long horizon, with a focus on this pandemic, where events, both foreseen and unforeseen, could have huge impacts on point predictions and prediction regions.
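To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's procedure. It assumes (i) a naive plug-in prediction interval that treats an estimated Poisson mean as known, which ignores estimation uncertainty, and (ii) a gamma-frailty form of over-dispersion with variance mu + mu^2 / alpha, a common parametrization that is consistent with the Poisson model being recovered as alpha grows to infinity. The abstract does not state the paper's exact parametrization, and all function names here are hypothetical.

```python
import math


def poisson_pmf(k, mu):
    """P(Y = k) for Y ~ Poisson(mu), computed in log space for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def poisson_quantile(p, mu):
    """Smallest integer k such that P(Y <= k) >= p."""
    k = 0
    cdf = poisson_pmf(0, mu)
    while cdf < p:
        k += 1
        cdf += poisson_pmf(k, mu)
    return k


def plug_in_interval(mu_hat, level=0.95):
    """Equal-tailed plug-in prediction interval for a future Poisson count.

    Assumption: mu_hat is treated as the true mean, so the interval does
    not account for the uncertainty in estimating the regression model.
    """
    tail = (1.0 - level) / 2.0
    return poisson_quantile(tail, mu_hat), poisson_quantile(1.0 - tail, mu_hat)


def overdispersed_variance(mu, alpha):
    """Variance of a count with mean mu under an assumed gamma frailty.

    Assumption: frailty Z ~ Gamma(shape=alpha, rate=alpha), Y | Z ~ Poisson(Z * mu),
    which gives Var(Y) = mu + mu**2 / alpha. As alpha -> infinity this
    tends to mu, the Poisson variance.
    """
    return mu + mu ** 2 / alpha


if __name__ == "__main__":
    print(plug_in_interval(10.0))
    print(overdispersed_variance(10.0, 5.0))
    print(overdispersed_variance(10.0, 1e12))
```

Under these assumptions, a small value of alpha inflates the variance well above the mean, which is why an interval built from the plain Poisson model would tend to be too narrow for over-dispersed data.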