Paper Title
Spectral Statistics of Sample Block Correlation Matrices
Paper Authors
Paper Abstract
A fundamental concept in multivariate statistics, the sample correlation matrix, is often used to infer the correlation/dependence structure among random variables when the population mean and covariance are unknown. A natural block extension of it, the {\it sample block correlation matrix}, is proposed to take on the same role when random variables are generalized to random sub-vectors. In this paper, we establish a spectral theory of sample block correlation matrices and apply it to group independence tests and related problems in the high-dimensional setting. More specifically, we consider a random vector of dimension $p$, consisting of $k$ sub-vectors of dimensions $p_t$, where the $p_t$'s can vary from $1$ to order $p$. Our primary goal is to investigate the dependence among the $k$ sub-vectors. For this purpose, we construct a random matrix model, called the sample block correlation matrix, based on $n$ samples. The spectral statistics of the sample block correlation matrix include the classical Wilks' statistic and Schott's statistic as special cases. It turns out that the spectral statistics do not depend on the unknown population mean and covariance. Further, under the null hypothesis that the sub-vectors are independent, the limiting behavior of the spectral statistics can be described with the aid of free probability theory. Specifically, under three different settings of possibly $n$-dependent $k$ and $p_t$'s, we show that the empirical spectral distribution of the sample block correlation matrix converges to the free Poisson binomial distribution, the free Poisson distribution (Marchenko-Pastur law), and the free Gaussian distribution (semicircle law), respectively. We then derive CLTs for the linear spectral statistics of the block correlation matrix under a general setting.
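The construction described in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes the natural block analogue of the correlation matrix, $B = D S D$ with $D = \mathrm{diag}(S_{11}^{-1/2}, \dots, S_{kk}^{-1/2})$, where $S$ is the sample covariance matrix and $S_{tt}$ its $t$-th diagonal block. All function and variable names are hypothetical. It checks that the eigenvalues of $B$ are unchanged when each sub-vector is shifted and multiplied by an invertible matrix, and computes $\log\det B$, which equals $\log\det S - \sum_t \log\det S_{tt}$, the log-determinant ratio underlying Wilks' likelihood ratio statistic.

```python
import numpy as np


def inv_sqrt(a):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v / np.sqrt(w)) @ v.T  # V diag(w^{-1/2}) V^T


def sample_block_correlation(x, block_sizes):
    """Assumed definition: B = D S D with D = blockdiag(S_tt^{-1/2}).

    x           : (n, p) data matrix, one sample per row
    block_sizes : sub-vector dimensions p_t, summing to p
    """
    n, p = x.shape
    assert sum(block_sizes) == p
    s = np.cov(x, rowvar=False)  # centered sample covariance, shape (p, p)
    d = np.zeros((p, p))
    start = 0
    for pt in block_sizes:
        sl = slice(start, start + pt)
        d[sl, sl] = inv_sqrt(s[sl, sl])
        start += pt
    b = d @ s @ d
    return (b + b.T) / 2  # symmetrize against round-off


def random_block_diagonal(rng, block_sizes):
    """Block-diagonal matrix with invertible (positive-definite) blocks."""
    p = sum(block_sizes)
    a = np.zeros((p, p))
    start = 0
    for pt in block_sizes:
        m = rng.standard_normal((pt, pt))
        a[start:start + pt, start:start + pt] = m @ m.T + np.eye(pt)
        start += pt
    return a


rng = np.random.default_rng(0)
n, block_sizes = 400, [1, 2, 3, 4]
p = sum(block_sizes)

# Independent sub-vectors (the null hypothesis), standard Gaussian entries.
x = rng.standard_normal((n, p))
b = sample_block_correlation(x, block_sizes)
eigs = np.linalg.eigvalsh(b)

# Shift and linearly transform each sub-vector separately.
a = random_block_diagonal(rng, block_sizes)
mu = rng.standard_normal(p)
b_t = sample_block_correlation(x @ a.T + mu, block_sizes)
eigs_t = np.linalg.eigvalsh(b_t)

# A linear spectral statistic with f = log, i.e. log det B.
log_det_b = np.sum(np.log(eigs))

print("max eigenvalue difference after transformation:",
      np.max(np.abs(eigs - eigs_t)))
print("log det B:", log_det_b)
```

Under this assumed definition, the transformed matrix is an orthogonal conjugation of the original one, so the two spectra agree up to round-off. This is one way to see why such spectral statistics do not involve the unknown mean and block covariances.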