Paper Title

SVM-Lattice: A Recognition & Evaluation Frame for Double-peaked Profiles

Authors

Yang, Haifeng; Qu, Caixia; Cai, Jianghui; Zhang, Sulan; Zhao, Xujun

Abstract

In the big data era, special data with rare characteristics can be highly significant. However, it is very difficult to search for such samples automatically in massive, high-dimensional datasets and to evaluate them systematically. DoPS, our previous work [2], provided a method for searching for rare spectra with double-peaked profiles in the massive, high-dimensional data of the LAMOST survey. Identification of the results depends mainly on visual inspection by astronomers. In this paper, as a follow-up study, a new lattice structure named SVM-Lattice is designed based on SVM (Support Vector Machine) and FCL (Formal Concept Lattice) and applied specifically to the recognition and evaluation of rare spectra with double-peaked profiles. First, each node in the SVM-Lattice structure contains two components: the intent is defined by the support vectors trained on spectral samples with specific characteristics, and the corresponding extent consists of all the positive samples classified by those support vectors. A hyperplane can be extracted from every lattice node and used as a classifier to search for targets by category. The layers express a generalization and specialization relationship, and higher layers indicate higher confidence in the targets. Then, the supporting algorithms are provided and analysed: an SVM-Lattice building algorithm, a pruning algorithm based on association rules, and an evaluation algorithm. Finally, several datasets from the LAMOST survey are used as experimental data for the recognition and evaluation of spectra with double-peaked profiles. The results exhibit good consistency with traditional methods, more detailed and accurate evaluations of classification results, and higher search efficiency than other similar methods.
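The node structure described in the abstract (an intent given by a classifier, an extent given by the samples it labels positive) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `LatticeNode`, the helper `is_specialization`, the two-feature toy samples, and the hand-picked hyperplane coefficients are all hypothetical, and a fixed linear boundary stands in for the hyperplane of a trained SVM. The subset test on extents follows the usual ordering in formal concept lattices and is assumed here rather than taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Sample = Tuple[float, ...]


@dataclass
class LatticeNode:
    """Hypothetical node: intent = linear boundary, extent = positive samples."""
    name: str
    weights: Tuple[float, ...]  # stand-in for the hyperplane normal of a trained SVM
    bias: float
    extent: List[int] = field(default_factory=list)  # indices of positive samples

    def decision(self, x: Sample) -> float:
        # Signed score of x relative to the hyperplane w.x + b = 0.
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def fill_extent(self, samples: Sequence[Sample]) -> None:
        # The extent is every sample the node's classifier labels positive.
        self.extent = [i for i, x in enumerate(samples) if self.decision(x) > 0]


def is_specialization(child: LatticeNode, parent: LatticeNode) -> bool:
    # Assumed ordering: a node is more specific if its extent is a subset.
    return set(child.extent) <= set(parent.extent)


# Toy two-feature samples (made-up values, not LAMOST data).
samples = [(0.2, 0.1), (0.9, 0.8), (0.7, 0.2), (0.95, 0.9)]
general = LatticeNode("one_feature_strong", (1.0, 0.0), -0.5)
specific = LatticeNode("both_features_strong", (1.0, 1.0), -1.5)
for node in (general, specific):
    node.fill_extent(samples)

print(general.extent, specific.extent)
print(is_specialization(specific, general))
```

In this toy setup the stricter node accepts a subset of what the looser node accepts, which is the kind of generalization and specialization relationship the abstract attributes to the lattice layers.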