Data mining of stellar spectra with emission lines based on Hadoop
Guozhou Ge, Jingchang Pan
2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), 2016-10-01
https://doi.org/10.1109/IMCEC.2016.7867469
The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is a meridian reflecting Schmidt telescope that produces tens of thousands of spectra per observation. LAMOST data release 2 (DR2), which comprises the spectra obtained from the pilot survey and the first two years of the regular survey, was released online in December 2014. This data set contains more than four million spectra of stars, galaxies, quasars, and other unknown objects. The LAMOST survey has thus provided astronomers with a massive set of spectra in which to search for rare special stars such as cataclysmic variable stars (CVs) and Herbig Ae/Be stars. The spectra of these special stars always contain emission lines, whose presence indicates that a star has undergone, or is undergoing, an unstable ejection process. Finding these objects therefore helps astronomers study stellar evolution. In this paper, we study a method for identifying emission line stars (ELS) and use Hadoop, a distributed parallel computing technology for large-scale data processing, to screen ELS spectra from the DR2 data set. In a parallel data mining experiment on a multi-node cluster, we obtained 51092 spectra with emission lines from the DR2 spectra. The Hadoop cluster greatly improved the efficiency of identifying emission lines in stellar spectra, and this work provides a useful reference for similar massive spectral data processing problems in the future.
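The abstract does not state the detection criterion or the job layout, so the following is only a minimal sketch of how such a screening step could be written as a map-only job in the Hadoop Streaming style. Everything specific here is an assumption made for illustration: the tab-separated record format (`spectrum_id<TAB>wavelength:flux,...`), the choice of H-alpha as the only line tested, the window widths, and the threshold rule are hypothetical and are not taken from the paper. Redshift and radial velocity are ignored.

```python
import sys
from statistics import median

# Approximate rest wavelength of H-alpha in angstroms (assumed test line).
H_ALPHA = 6563.0


def has_emission(wavelengths, fluxes, center=H_ALPHA,
                 half_width=10.0, cont_width=50.0, threshold=3.0):
    """Return True if the peak flux near `center` stands above the local
    continuum by more than `threshold` times the continuum scatter.

    Hypothetical criterion for illustration, not the paper's method.
    """
    points = list(zip(wavelengths, fluxes))
    # Fluxes inside the line window.
    line = [f for w, f in points if abs(w - center) <= half_width]
    # Fluxes in the side bands used to estimate the continuum.
    cont = [f for w, f in points
            if half_width < abs(w - center) <= cont_width]
    if not line or len(cont) < 3:
        return False
    level = median(cont)
    # Median absolute deviation as a robust scatter estimate.
    scatter = median([abs(f - level) for f in cont])
    # Guard against a perfectly flat continuum (zero scatter).
    return max(line) - level > threshold * max(scatter, 1e-12)


def parse_record(line):
    """Parse one assumed input record: 'id<TAB>w1:f1,w2:f2,...'."""
    spec_id, payload = line.rstrip("\n").split("\t", 1)
    pairs = [p.split(":") for p in payload.split(",")]
    wavelengths = [float(w) for w, _ in pairs]
    fluxes = [float(f) for _, f in pairs]
    return spec_id, wavelengths, fluxes


def map_lines(lines):
    """Mapper logic: yield the id of every spectrum flagged as emission."""
    for line in lines:
        if not line.strip():
            continue
        spec_id, wavelengths, fluxes = parse_record(line)
        if has_emission(wavelengths, fluxes):
            yield spec_id


if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin and collects stdout.
    for spec_id in map_lines(sys.stdin):
        print(spec_id)
```

Because each spectrum is tested independently, no reducer is needed in this sketch: the input can be split across nodes and the mapper outputs concatenated, which is the kind of workload where a cluster would be expected to scale well.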