Paper Title

The Feasibility and Flexibility of Selecting Quasars by Variability Using Ensemble Machine Learning Algorithms

Authors

Yang, Da-Ming; Xie, Zhang-Liang; Wang, Jun-Xian

Abstract

In this work we train three decision-tree-based ensemble machine learning algorithms (Random Forest Classifier, Adaptive Boosting, and Gradient Boosting Decision Tree) to study quasar selection in the variable source catalog of SDSS Stripe 82. We build training and test samples (each with a 1:1 ratio of quasars to stars) from the spectroscopically confirmed sources in SDSS DR14 (8330 quasars and 3966 stars). We find that, trained with variability parameters alone, all three models select quasars with similar and remarkably high precision and completeness ($\sim$98.5% and 97.5%), even better than when trained with SDSS colors alone ($\sim$97.2% and 96.5%), consistent with previous studies. By applying the trained models to the variable sources without spectroscopic identifications, we estimate that the spectroscopically confirmed quasar sample in the Stripe 82 variable source catalog is $\sim$93% complete (95% for $m_i<19.0$). Using the Random Forest Classifier, we derive the relative importance of the observational features used for classification. We further show that even with only one or two years of time-domain observations, variability-based quasar selection could still be highly efficient.
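
The abstract quotes precision and completeness without defining them. The sketch below assumes the usual classification sense: precision is the fraction of sources classified as quasars that really are quasars, and completeness (recall) is the fraction of true quasars that the classifier recovers. The function name, the label strings, and the toy labels are hypothetical and are not taken from the paper.

```python
def precision_and_completeness(true_labels, predicted_labels, positive="quasar"):
    """Return (precision, completeness) for the positive class.

    Assumed definitions (not stated in the abstract):
      precision    = TP / (TP + FP)
      completeness = TP / (TP + FN)
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Return 0.0 when a ratio is undefined (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    completeness = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, completeness


# Toy labels, made up for illustration only (not data from the paper).
true_labels = ["quasar", "quasar", "quasar", "quasar",
               "star", "star", "star", "star"]
predicted_labels = ["quasar", "quasar", "quasar", "star",
                    "star", "star", "star", "quasar"]

precision, completeness = precision_and_completeness(true_labels, predicted_labels)
print(f"precision={precision:.2f}, completeness={completeness:.2f}")
```

In this toy case three of the four true quasars are recovered and one star is misclassified as a quasar, so both ratios are 3/4.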
