Paper Title

Inject Machine Learning into Significance Test for Misspecified Linear Models

Authors

Teng, Jiaye; Yuan, Yang

Abstract

Owing to its strong interpretability, linear regression is widely used in the social sciences, where significance tests report the significance level of a model or of its coefficients under traditional statistical inference. However, linear regression relies on the assumption that the ground truth function is linear, which does not necessarily hold in practice. As a result, even in simple non-linear cases, linear regression may fail to report the correct significance level. In this paper, we present a simple and effective assumption-free method for linear approximation in both linear and non-linear scenarios. First, we apply a machine learning method to fit the ground truth function on the training set and compute its linear approximation. We then obtain the estimator by adding adjustments based on the validation set. We prove concentration inequalities and asymptotic properties for our estimator, which lead to the corresponding significance test. Experimental results show that our estimator significantly outperforms linear regression when the ground truth function is non-linear, indicating that it may be a better tool for significance testing.
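The two-step procedure in the abstract can be illustrated with a minimal sketch. This is one plausible reading, not the authors' estimator: the k-nearest-neighbour regressor, the single-feature setting, the function names (`ols`, `knn_fit`, `adjusted_linear_estimate`), and the specific form of the adjustment (a least-squares fit of the validation residuals) are all assumptions made for illustration. The significance test itself is not covered.

```python
import random


def ols(xs, ys):
    """Least-squares (intercept, slope) for a single feature."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def knn_fit(train_x, train_y, k=15):
    """Hypothetical stand-in for 'a machine learning method': k-NN regression."""
    pairs = list(zip(train_x, train_y))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def adjusted_linear_estimate(train_x, train_y, val_x, val_y, k=15):
    # Step 1: fit the model on the training set and take the linear
    # approximation of its predictions over the training inputs.
    f = knn_fit(train_x, train_y, k)
    a_f, b_f = ols(train_x, [f(x) for x in train_x])
    # Step 2 (assumed form of the adjustment): project the validation
    # residuals y - f(x) onto a line and add that correction.
    resid = [y - f(x) for x, y in zip(val_x, val_y)]
    a_r, b_r = ols(val_x, resid)
    return a_f + a_r, b_f + b_r


# Synthetic non-linear ground truth: y = x^2 + noise, x uniform on [0, 1].
rng = random.Random(0)


def sample(n):
    xs = [rng.random() for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


train_x, train_y = sample(400)
val_x, val_y = sample(400)
intercept, slope = adjusted_linear_estimate(train_x, train_y, val_x, val_y)
print(intercept, slope)
```

For this synthetic example, the best linear approximation of x^2 under a uniform distribution on [0, 1] has intercept -1/6 and slope 1, so the printed estimates should fall close to those values. The linear approximation in step 1 is taken over the training inputs rather than the validation inputs; computing both steps on the same inputs would make the sum collapse algebraically to ordinary least squares on that set.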
