Paper title
In search of the weirdest galaxies in the Universe
Paper authors
Abstract
Weird galaxies are outliers whose unknown or very uncommon features set them apart from the normal sample. These galaxies are of great interest because they may provide new insights into current theories or help form new theories about processes in the Universe. Interesting outliers are often found by accident, but this will become increasingly difficult as future large surveys generate enormous amounts of data. This creates a need for machine learning techniques that can detect interesting weird objects. In this work, we inspect the galaxy spectra of the third data release of the Galaxy And Mass Assembly survey and look for weird outlying galaxies using two different outlier detection techniques. First, we apply a distance-based Unsupervised Random Forest to the galaxy spectra, using the flux values as input features. Spectra with a high outlier score are inspected and divided into categories such as blends, quasi-stellar objects, and BPT outliers. We also experiment with a reconstruction-based outlier detection method using a variational autoencoder and compare the results of the two methods. Finally, we apply dimensionality reduction techniques to the output of the methods to inspect the clustering of similar spectra. We find that both unsupervised methods extract important features from the data and can be used to find many different types of outliers.
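The abstract does not spell out how the distance-based Unsupervised Random Forest produces an outlier score, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes that synthetic data are made by shuffling each flux column independently, that a forest is trained to separate real from synthetic objects, that the similarity of two objects is the fraction of trees in which they share a leaf, and that the outlier score is the mean distance to all other objects. The function name `urf_outlier_scores`, its parameters, and the toy data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def urf_outlier_scores(X, n_trees=100, seed=0):
    """Return one outlier score per row of X (rows = spectra, columns = flux values).

    Assumed recipe (illustrative only):
      1. Build synthetic data by permuting every column independently,
         which keeps the marginal distributions but destroys correlations.
      2. Train a random forest to tell real rows from synthetic rows.
      3. Similarity of two real rows = fraction of trees where they share a leaf.
      4. Outlier score = mean of (1 - similarity) over all other rows.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    rng = np.random.default_rng(seed)

    # Step 1: synthetic sample with the same per-feature distributions.
    X_syn = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(n_features)]
    )

    # Step 2: real rows are labelled 1, synthetic rows 0.
    data = np.vstack([X, X_syn])
    labels = np.concatenate([np.ones(n_samples), np.zeros(n_samples)])
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(data, labels)

    # Step 3: leaf index of every real row in every tree, shape (n_samples, n_trees).
    leaves = forest.apply(X)
    similarity = np.zeros((n_samples, n_samples))
    for t in range(leaves.shape[1]):
        leaf_ids = leaves[:, t]
        similarity += leaf_ids[:, None] == leaf_ids[None, :]
    similarity /= leaves.shape[1]

    # Step 4: the diagonal of the distance matrix is 0, so dividing the row sum
    # by (n_samples - 1) gives the mean distance to all *other* rows.
    distance = 1.0 - similarity
    return distance.sum(axis=1) / (n_samples - 1)


# Toy usage: 60 fake "spectra" that are scaled copies of one smooth template.
rng = np.random.default_rng(1)
template = np.sin(np.linspace(0.0, 3.0, 10)) + 2.0
X_demo = rng.uniform(0.5, 1.5, size=(60, 1)) * template
X_demo += rng.normal(0.0, 0.01, size=X_demo.shape)
scores = urf_outlier_scores(X_demo, n_trees=50)
weirdest = np.argsort(scores)[::-1][:5]  # indices of the five highest scores
```

Objects with the highest scores would then be the candidates for visual inspection. Published variants differ in details such as how synthetic data are drawn and which trees are counted in the similarity, so this sketch should be read as one possible instance of the technique.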