Exploring Dimensionality Reduction of SDSS Spectral Abundances
Qianyu Fan, Joshua S. Speagle
arXiv - STAT - Applications, 2024-09-13
DOI: arxiv-2409.09227 (https://doi.org/arxiv-2409.09227)
Citation count: 0
Abstract
High-resolution stellar spectra offer valuable insights into atmospheric
parameters and chemical compositions. However, their inherent complexity and
high dimensionality make it challenging to fully utilize the information they
contain. In this study, we utilize data from the Apache Point Observatory
Galactic Evolution Experiment (APOGEE) within the Sloan Digital Sky Survey IV
(SDSS-IV) to explore latent representations of chemical abundances by applying
five dimensionality reduction techniques: PCA, t-SNE, UMAP, Autoencoder, and
VAE. Through this exploration, we evaluate the preservation of information and
compare the reconstructed outputs with the original 19 chemical abundances. Our
findings reveal a performance ranking of PCA < UMAP < t-SNE < VAE <
Autoencoder, based on the variance each method explains under optimized MSE.
The non-linear algorithms (Autoencoder and VAE) perform approximately 10%
better than the linear algorithm (PCA). This
difference can be referred to as the "non-linearity gap." Future work should
focus on incorporating measurement errors into extensions of the VAE, thereby
enhancing the reliability and interpretability of chemical abundance
exploration in astronomical spectra.
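
As a rough illustration of the reconstruction-based comparison described above, the following sketch computes the explained variance of a PCA round trip from its reconstruction MSE. It is a minimal example under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for the 19 abundances (not APOGEE measurements), the function name `pca_explained_variance` is hypothetical, and defining explained variance as 1 - MSE / total variance is an assumption about the metric.

```python
import numpy as np


def pca_explained_variance(X, n_components):
    """Fraction of total variance retained after projecting onto the
    top principal components and reconstructing (hypothetical helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]
    # Encode to the latent space, then decode back to abundance space.
    X_hat = (Xc @ W.T) @ W + mean
    mse = np.mean((X - X_hat) ** 2)
    total_var = np.mean(Xc ** 2)
    # Assumed metric: explained variance = 1 - MSE / total variance.
    return 1.0 - mse / total_var


# Synthetic stand-in data (assumption): 1000 "stars" with 19 "abundances"
# generated from 2 latent factors plus small Gaussian noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
mixing = rng.normal(size=(2, 19))
X = latent @ mixing + 0.1 * rng.normal(size=(1000, 19))

for k in (1, 2, 5, 19):
    print(k, pca_explained_variance(X, k))
```

Under this definition, the "non-linearity gap" would correspond to the difference between the same quantity computed for a non-linear model's reconstructions and the PCA value at the same latent dimension.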