Paper Title
Complementary Goodness of Fit Procedure for Crash Frequency Models
Paper Authors
Paper Abstract
This paper presents a new procedure for evaluating the goodness of fit of Generalized Linear Models (GLMs) estimated with Roadway Departure (RwD) crash frequency data for two-lane two-way (TLTW) state roads in the State of Hawaii. The procedure is analyzed using ten years of RwD crash data (including all severity levels) and roadway characteristics (e.g., traffic, geometry, and inventory databases) that can be aggregated at the section level. Three estimation methods are evaluated with the proposed procedure: Negative Binomial (NB), Zero-Inflated Negative Binomial (ZINB), and Generalized Linear Mixed Model-Negative Binomial (GLMM-NB). The procedure shows that all three methods can provide very good fits, both in terms of the distribution of crashes within narrow ranges of the predicted mean crash frequency and in terms of observed vs. predicted average crash frequencies for those data segments. The proposed procedure complements other statistics used for model selection, such as the Akaike Information Criterion, the Bayesian Information Criterion, and the log-likelihood. It is consistent with those statistics for models without random effects, but it diverges from them for GLMM-NB models. The procedure can aid model selection by providing a clear visualization of the fit of crash frequency models and by allowing the computation of a pseudo R2 similar to the one used in linear regression. Further work is recommended to assess its use for evaluating the trade-off between the number of random effects in GLMM-NB models and their goodness of fit, using more appropriate datasets that do not lead to convergence problems.
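As a rough illustration of the idea described in the abstract, the sketch below groups road sections into narrow ranges of predicted mean crash frequency, compares observed and predicted averages per group, and computes a pseudo R2 from those averages. The abstract does not specify how the ranges are built or how the pseudo R2 is defined, so the equal-count bins, the 1 - SS_res / SS_tot formula, and the function name `binned_fit` are all assumptions made here for illustration, not the authors' method.

```python
from statistics import fmean


def binned_fit(observed, predicted, n_bins=10):
    """Compare observed vs. predicted average crash frequency per bin.

    Hypothetical helper: sections are sorted by predicted mean and split
    into `n_bins` groups of (nearly) equal size. This binning rule is an
    assumption; fixed-width ranges of the predicted mean would also work.
    """
    pairs = sorted(zip(predicted, observed))  # sort sections by predicted mean
    n = len(pairs)
    if n_bins < 2 or n < n_bins:
        raise ValueError("need at least 2 bins and one section per bin")

    obs_means, pred_means = [], []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        pred_means.append(fmean([p for p, _ in chunk]))
        obs_means.append(fmean([o for _, o in chunk]))

    # Assumed pseudo R2: 1 - SS_res / SS_tot on the per-bin averages,
    # by analogy with the R2 of linear regression.
    grand_mean = fmean(obs_means)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs_means, pred_means))
    ss_tot = sum((o - grand_mean) ** 2 for o in obs_means)
    if ss_tot == 0:
        raise ValueError("observed bin averages are all equal; R2 undefined")
    return obs_means, pred_means, 1.0 - ss_res / ss_tot
```

Here `observed` would hold the crash count of each road section and `predicted` the fitted mean from an NB, ZINB, or GLMM-NB model. Plotting `obs_means` against `pred_means` gives the kind of observed vs. predicted visualization the abstract refers to.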