Confidence intervals for the Cox model test error from cross-validation

Paper title

Confidence intervals for the Cox model test error from cross-validation

Authors

Sun, Min Woo, Tibshirani, Robert

Abstract

Cross-validation (CV) is one of the most widely used techniques in statistical learning for estimating the test error of a model, but its behavior is not yet fully understood. It has been shown that standard confidence intervals for test error using estimates from CV may have coverage below nominal levels. This phenomenon occurs because each sample is used in both the training and testing procedures during CV and as a result, the CV estimates of the errors become correlated. Without accounting for this correlation, the estimate of the variance is smaller than it should be. One way to mitigate this issue is by estimating the mean squared error of the prediction error instead using nested CV. This approach has been shown to achieve superior coverage compared to intervals derived from standard CV. In this work, we generalize the nested CV idea to the Cox proportional hazards model and explore various choices of test error for this setting.
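
To make the "standard CV interval" in the abstract concrete, the sketch below computes a naive K-fold confidence interval for test error. It is an illustration under assumptions that are not taken from the paper: ordinary least squares with squared-error loss stands in for the Cox model, the function name `naive_cv_interval` and the simulated data are hypothetical, and the nested CV correction is not implemented. The standard error here treats the per-sample errors as independent, which is the step the abstract identifies as the source of under-coverage.

```python
import numpy as np

def naive_cv_interval(X, y, n_folds=5, z=1.96, seed=0):
    """Naive K-fold CV estimate of squared-error test error with a normal interval."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    errors = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit least squares on the training folds only.
        beta, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        # Record held-out squared errors for this fold.
        errors[test_idx] = (y[test_idx] - X[test_idx] @ beta) ** 2
    err_hat = errors.mean()
    # Assumption of the naive method: the n errors are independent.
    # They are in fact correlated, because every sample is used for
    # both training and testing across folds.
    se = errors.std(ddof=1) / np.sqrt(n)
    return err_hat, (err_hat - z * se, err_hat + z * se)

# Simulated data, purely for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.ones(5) + rng.normal(size=100)
err_hat, (lower, upper) = naive_cv_interval(X, y)
print(f"CV error {err_hat:.3f}, naive 95% interval ({lower:.3f}, {upper:.3f})")
```

Because the correlation between folds is ignored, an interval computed this way tends to be too narrow; the nested CV approach described in the abstract replaces this standard error with an estimate of the mean squared error of the CV estimate.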
