Paper Title

Structure Learning in Inverse Ising Problems Using $\ell_2$-Regularized Linear Estimator

Authors

Meng, Xiangming; Obuchi, Tomoyuki; Kabashima, Yoshiyuki

Abstract

The inference performance of the pseudolikelihood method is discussed in the framework of the inverse Ising problem when $\ell_2$-regularized (ridge) linear regression is adopted. This setup is introduced to theoretically investigate the situation where the data generation model differs from the inference model, namely the model mismatch situation. In the teacher-student scenario, under the assumption that the teacher couplings are sparse, the analysis is conducted using the replica and cavity methods, with a special focus on whether the presence or absence of teacher couplings is correctly inferred. The result indicates that, despite the model mismatch, one can perfectly identify the network structure using naive linear regression without regularization when the number of spins $N$ is smaller than the dataset size $M$, in the thermodynamic limit $N\to \infty$. Further, to access the underdetermined region $M < N$, we examine the effect of the $\ell_2$ regularization and find that biases appear in all the coupling estimates, preventing the perfect identification of the network structure. We find, however, that the biases decay exponentially fast with the distance from the center spin chosen in the pseudolikelihood method. Based on this finding, we propose a two-stage estimator: in the first stage, ridge regression is used and the estimates are pruned by a relatively small threshold; in the second stage, naive linear regression is conducted only on the remaining couplings, and the resulting estimates are again pruned by another, relatively large threshold. This estimator, with an appropriate regularization coefficient and thresholds, is shown to achieve the perfect identification of the network structure even in $0<M/N<1$. Results of extensive numerical experiments support these findings.
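
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `two_stage_support`, the parameter names, the cost normalization, and all numerical values are hypothetical. The toy data is a noiseless linear model with random ±1 regressors rather than samples from an Ising model, so it only exercises the control flow (ridge, small threshold, unregularized regression on the survivors, large threshold) in a regime with $M < N$.

```python
import numpy as np


def two_stage_support(X, y, lam, thr_small, thr_large):
    """Hypothetical sketch of a ridge-then-refit support estimator.

    X: (M, N) array, the other spins (regressors) in each sample.
    y: (M,) array, the center spin (response) in each sample.
    Returns a boolean support mask and the pruned coefficient vector.
    """
    M, N = X.shape

    # Stage 1: ridge regression, then prune with the smaller threshold.
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)
    keep = np.abs(w_ridge) > thr_small

    # Stage 2: unregularized least squares on the surviving columns only.
    coef = np.zeros(N)
    if keep.any():
        w_ols, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
        coef[keep] = w_ols

    # Prune again, this time with the larger threshold.
    support = np.abs(coef) > thr_large
    coef[~support] = 0.0
    return support, coef


# Toy data (assumption: NOT Ising samples, just a noiseless sparse linear model).
rng = np.random.default_rng(0)
M, N = 80, 100  # underdetermined regime: fewer samples than unknowns
X = rng.choice([-1.0, 1.0], size=(M, N))
w_true = np.zeros(N)
w_true[[3, 17, 42]] = [1.0, -1.0, 1.0]  # three hypothetical nonzero couplings
y = X @ w_true

support, coef = two_stage_support(X, y, lam=1.0, thr_small=0.3, thr_large=0.5)
print(np.flatnonzero(support))
```

The first threshold only has to keep every true coupling while cutting the candidate set well below $M$, which is why it can be small; the second regression is then overdetermined and free of the ridge bias, so a larger threshold can be applied to its estimates.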
