Paper Title
Regression with Label Permutation in Generalized Linear Model
Authors
Abstract
The assumption that the response and the predictors belong to the same statistical unit may be violated in practice. Unbiased estimation and recovery of the true label ordering from unlabeled data are challenging tasks that have attracted increasing attention in the recent literature. In this paper, we present a relatively complete analysis of the label permutation problem for the generalized linear model with multivariate responses. The theory is established under three scenarios: with knowledge of the true parameters, with partial knowledge of the underlying label permutation matrix, and without any such knowledge. Our results remove the stringent conditions required by the current literature and are further extended to the missing-observation setting, which has never been considered in the label permutation literature. On the computational side, we propose two methods, a "maximum likelihood estimation" algorithm and a "two-step estimation" algorithm, to accommodate different settings. When the proportion of permuted labels is moderate, both methods work effectively. Multiple numerical experiments are provided, and they corroborate our theoretical findings.
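To make the first scenario in the abstract (true parameters known) concrete, here is a minimal sketch under strong simplifying assumptions. It is not either of the paper's two algorithms. It assumes the simplest generalized linear model (Gaussian noise, identity link), a scalar predictor and response, and a sample small enough for exhaustive search. The names `simulate` and `recover_permutation` and the coefficient `beta` are hypothetical and introduced only for illustration.

```python
import itertools
import random


def simulate(n, beta, noise_sd, n_swapped, rng):
    """Draw (x, y) from y = beta * x + noise, then mismatch n_swapped labels.

    Returns x, the observed (shuffled) responses, and the hidden permutation
    perm, where y_obs[i] was generated from x[perm[i]].
    """
    x = [rng.uniform(-5.0, 5.0) for _ in range(n)]
    y_clean = [beta * xi + rng.gauss(0.0, noise_sd) for xi in x]
    perm = list(range(n))
    idx = rng.sample(range(n), n_swapped)
    shifted = idx[1:] + idx[:1]  # cyclic shift among the chosen positions
    for i, j in zip(idx, shifted):
        perm[i] = j
    y_obs = [y_clean[perm[i]] for i in range(n)]
    return x, y_obs, perm


def recover_permutation(x, y_obs, beta):
    """Brute-force maximum likelihood over all permutations (assumption: small n).

    With Gaussian noise and known beta, maximizing the likelihood is the same
    as minimizing the sum of squared residuals over candidate matchings.
    """
    n = len(x)

    def cost(p):
        return sum((y_obs[i] - beta * x[p[i]]) ** 2 for i in range(n))

    return list(min(itertools.permutations(range(n)), key=cost))


if __name__ == "__main__":
    rng = random.Random(0)
    x, y_obs, perm = simulate(n=6, beta=2.0, noise_sd=0.05, n_swapped=3, rng=rng)
    print("hidden permutation:   ", perm)
    print("recovered permutation:", recover_permutation(x, y_obs, beta=2.0))
```

Exhaustive search costs n! evaluations, so it only illustrates the objective. For realistic sample sizes the same squared-error objective can be posed as a linear assignment problem, and the settings with unknown parameters or missing observations need the methods analyzed in the paper.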