Paper title
What to expect from dynamical modelling of cluster haloes II. Investigating dynamical state indicators with Random Forest
Paper authors
Paper abstract
We investigate the importance of various dynamical features in predicting the dynamical state (DS) of galaxy clusters, based on the Random Forest (RF) machine learning approach. We use a large sample of galaxy clusters from the Three Hundred Project of hydrodynamical zoomed-in simulations, and construct dynamical features from the raw data as well as from the corresponding mock maps in the optical, X-ray, and Sunyaev-Zel'dovich (SZ) channels. Instead of relying on the impurity-based feature importance of the RF algorithm, we directly use the out-of-bag (OOB) scores to evaluate the importance of individual features and of different feature combinations. Among all the features studied, we find the virial ratio, $η$, to be the most important single feature. The features calculated directly from the simulations and in three dimensions carry more information on the DS than those constructed from the mock maps. Features related to the centroid positions are more important than those based on X-ray or SZ maps. Despite the large number of investigated features, a combination of up to three features of different types can already saturate the score of the prediction. Lastly, we show that the most sensitive feature, $η$, is strongly correlated with the well-known half-mass bias in dynamical modelling. Without a selection in DS, cluster haloes have an asymmetric distribution in $η$, corresponding to an overall positive half-mass bias. Our work provides a quantitative reference for selecting the best features to discriminate the DS of galaxy clusters in both simulations and observations.
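The OOB-based evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn's `RandomForestClassifier`, treats the DS as a binary relaxed/unrelaxed label, and runs on synthetic data. The feature names (`eta`, `centroid_offset`, `xray_concentration`, `noise`) and the helper functions are hypothetical, chosen only to show how each feature subset can be scored by its OOB accuracy instead of by impurity-based importances.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def oob_score_for(X, y, columns, seed=0):
    """Fit an RF on the chosen columns and return its out-of-bag accuracy."""
    rf = RandomForestClassifier(
        n_estimators=100,
        oob_score=True,        # score each sample with trees that did not see it
        min_samples_leaf=5,
        random_state=seed,
    )
    rf.fit(X[:, columns], y)
    return rf.oob_score_


def rank_feature_combinations(X, y, names, max_size=3):
    """Return {feature-name tuple: OOB score} for all subsets up to max_size."""
    scores = {}
    for size in range(1, max_size + 1):
        for combo in combinations(range(len(names)), size):
            key = tuple(names[i] for i in combo)
            scores[key] = oob_score_for(X, y, list(combo))
    return scores


# Synthetic stand-in data (an assumption, not simulation output):
# a latent "dynamical state" variable observed through features of varying noise.
rng = np.random.default_rng(42)
n = 600
latent = rng.normal(size=n)
y = (latent > 0).astype(int)  # hypothetical binary DS label

names = ["eta", "centroid_offset", "xray_concentration", "noise"]
X = np.column_stack([
    latent + 0.3 * rng.normal(size=n),  # tightly tracks the latent state
    latent + 1.0 * rng.normal(size=n),  # moderately informative
    latent + 2.0 * rng.normal(size=n),  # weakly informative
    rng.normal(size=n),                 # carries no information
])

scores = rank_feature_combinations(X, y, names, max_size=3)
best_single = max((k for k in scores if len(k) == 1), key=scores.get)

# Print the subsets in order of decreasing OOB score.
for key, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{' + '.join(key):55s} {value:.3f}")
```

Scoring whole subsets this way makes saturation visible directly: once adding a further feature no longer raises the OOB score, the smaller combination already carries the available information.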