A novel approach for variable star classification based on imbalanced learning

Publications of the Astronomical Society of Australia (IF 4.5; Region 3, Physics and Astrophysics; JCR Q1, Astronomy & Astrophysics) Pub Date: 2023-08-09 DOI: 10.1017/pasa.2023.35
Jingyi Zhang, Yanxia Zhang, Zihan Kang, Changhua Li, Yihan Tao, Yongheng Zhao, Xue-bing Wu
Citation count: 0

Abstract
The advent of time-domain sky surveys has generated a vast amount of light-variation data, enabling astronomers to investigate variable stars with large samples. However, this also poses new opportunities and challenges for time-domain research. In this paper, we focus on the classification of variable stars from the Catalina Surveys Data Release 2 and propose an imbalanced learning classifier based on the Self-paced Ensemble (SPE) method. Compared with the work of Hosenie et al. (2020), our approach significantly enhances the classification recall of Blazhko RR Lyrae stars from 12% to 85%, mixed-mode RR Lyrae variables from 29% to 64%, detached binaries from 68% to 97%, and long-period variables (LPV) from 87% to 99%. SPE performs well on most of the variable classes except RRab, RRc, and contact and semi-detached binaries. Moreover, the results suggest that SPE tends to target the minority classes of objects, while Random Forest is more effective in finding the majority classes. To balance the overall classification accuracy, we construct a Voting Classifier that combines the strengths of SPE and Random Forest. The results show that the Voting Classifier can achieve a balanced performance across all classes with minimal loss of accuracy. In summary, the SPE algorithm and Voting Classifier are superior to traditional machine learning methods and can be applied to the classification of periodic variable stars. This paper contributes to the current research on imbalanced learning in astronomy, and its approach can also be extended to the time-domain data of other, larger sky survey projects (LSST, etc.).
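The abstract reports results as per-class recall and describes a Voting Classifier that merges SPE with Random Forest. The sketch below is only an illustration of those two ideas in plain Python, not the authors' implementation. The labels, probabilities, helper names (`per_class_recall`, `soft_vote`) and the choice of soft (probability-averaging) voting are all assumptions, since the abstract does not state which voting rule is used.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted members / all true members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def soft_vote(proba_a, proba_b, weight_a=0.5):
    """Blend two {class: probability} dicts and return the top class.

    weight_a is a hypothetical knob: the share given to the first model.
    """
    labels = set(proba_a) | set(proba_b)
    mixed = {
        c: weight_a * proba_a.get(c, 0.0) + (1.0 - weight_a) * proba_b.get(c, 0.0)
        for c in labels
    }
    # Iterate in sorted order so that ties are broken deterministically.
    return max(sorted(mixed), key=mixed.get)


# Toy, made-up labels (not data from the paper): 18 majority, 2 minority objects.
y_true = ["RRab"] * 18 + ["Blazhko"] * 2
y_pred = ["RRab"] * 20  # a classifier that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}", recall)

# Hypothetical class probabilities for a single object from two models:
# one that leans toward the minority class, one that leans toward the majority.
minority_leaning = {"RRab": 0.35, "Blazhko": 0.65}
majority_leaning = {"RRab": 0.60, "Blazhko": 0.40}
voted = soft_vote(minority_leaning, majority_leaning)
print("voted class:", voted)
```

In this toy case the overall accuracy looks high while the minority-class recall is zero, which is why recall per class is the more informative figure for imbalanced samples. Lowering `weight_a` shifts the blended decision toward the second model.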
Source journal
Publications of the Astronomical Society of Australia (category: Earth Science and Astronomy, Astronomy and Astrophysics)
CiteScore: 5.90
Self-citation rate: 9.50%
Articles published: 41
Review time: >12 weeks
About the journal: Publications of the Astronomical Society of Australia (PASA) publishes new and significant research in astronomy and astrophysics. PASA covers a wide range of topics within astronomy, including multi-wavelength observations, theoretical modelling, computational astronomy and visualisation. PASA also maintains its heritage of publishing results on southern hemisphere astronomy and on astronomy with Australian facilities. PASA publishes research papers, review papers and special series on topical issues, making use of expert international reviewers and an experienced Editorial Board. As an electronic-only journal, PASA publishes paper by paper, ensuring a rapid publication rate. There are no page charges. PASA's Editorial Board approves a certain number of papers per year to be published Open Access without a publication fee.