Paper Title
Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection
Paper Authors
Paper Abstract
In the era of big astronomical surveys, our ability to leverage artificial intelligence algorithms simultaneously for multiple datasets will open new avenues for scientific discovery. Unfortunately, simply training a deep neural network on images from one data domain often leads to very poor performance on any other dataset. Here we develop a Universal Domain Adaptation method, DeepAstroUDA, capable of performing semi-supervised domain alignment that can be applied to datasets with different types of class overlap. Extra classes can be present in either of the two datasets, and the method can even be used in the presence of unknown classes. For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets (from SDSS and DECaLS). We show that our method is capable of bridging the gap between two astronomical surveys, and that it also performs well for anomaly detection and clustering of unknown data in the unlabeled dataset. We apply our model to two examples of galaxy morphology classification tasks with anomaly detection: 1) classifying spiral and elliptical galaxies with detection of merging galaxies (three classes including one unknown anomaly class); 2) a more granular problem where the classes describe more detailed morphological properties of galaxies, with the detection of gravitational lenses (ten classes including one unknown anomaly class).