Paper title
The Cellwise Minimum Covariance Determinant Estimator
Paper authors
Paper abstract
The usual Minimum Covariance Determinant (MCD) estimator of a covariance matrix is robust against casewise outliers. These are cases (that is, rows of the data matrix) that behave differently from the majority of cases, raising suspicion that they might belong to a different population. On the other hand, cellwise outliers are individual cells in the data matrix. When a row contains one or more outlying cells, the other cells in the same row still contain useful information that we wish to preserve. We propose a cellwise robust version of the MCD method, called cellMCD. Its main building blocks are observed likelihood and a penalty term on the number of flagged cellwise outliers. It possesses good breakdown properties. We construct a fast algorithm for cellMCD based on concentration steps (C-steps) that always lower the objective. The method performs well in simulations with cellwise outliers, and has high finite-sample efficiency on clean data. It is illustrated on real data with visualizations of the results.
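To make the two building blocks named in the abstract more concrete, the following is a minimal sketch of a penalized observed-likelihood objective. The abstract does not state the formula, so the details here are assumptions: a Gaussian likelihood evaluated only on the unflagged cells of each row, plus a penalty that grows with the number of flagged cells. The function name `penalized_observed_nll`, the mask `W`, and the per-column penalty vector `q` are hypothetical names chosen for illustration, not the authors' notation or implementation.

```python
import numpy as np

def penalized_observed_nll(X, W, mu, Sigma, q):
    """Sketch: -2 * Gaussian observed log-likelihood + penalty on flagged cells.

    X     : (n, d) data matrix
    W     : (n, d) boolean mask, True = cell kept, False = cell flagged
    mu    : (d,) location estimate
    Sigma : (d, d) covariance estimate
    q     : (d,) per-column penalty per flagged cell (assumed tuning values)
    """
    n, d = X.shape
    total = 0.0
    for i in range(n):
        obs = np.flatnonzero(W[i])          # indices of unflagged cells in row i
        if obs.size == 0:
            continue                        # a fully flagged row adds no likelihood term
        S = Sigma[np.ix_(obs, obs)]         # covariance submatrix of the kept cells
        r = X[i, obs] - mu[obs]             # residual on the kept cells
        _, logdet = np.linalg.slogdet(S)
        total += logdet + obs.size * np.log(2 * np.pi) + r @ np.linalg.solve(S, r)
    # Penalty: each flagged cell in column j costs q[j].
    total += float(np.sum(q * np.sum(~W, axis=0)))
    return total

# Toy data: one clearly outlying cell in the second row.
X = np.array([[0.0, 0.0],
              [0.0, 10.0]])
mu = np.zeros(2)
Sigma = np.eye(2)
q = np.array([4.0, 4.0])

W_full = np.ones_like(X, dtype=bool)        # no cells flagged
W_flag = W_full.copy()
W_flag[1, 1] = False                        # flag only the outlying cell

full = penalized_observed_nll(X, W_full, mu, Sigma, q)
flagged = penalized_observed_nll(X, W_flag, mu, Sigma, q)
```

In this toy case, flagging the single outlying cell lowers the objective, while the other cell in the same row still contributes to the likelihood. That mirrors the abstract's point that the remaining cells of a contaminated row carry useful information. A C-step style algorithm would alternate between updating the mask and updating `mu` and `Sigma`; that part is not sketched here.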