D. Sierra-Porta, M. Tarazona-Alvarado, D.D. Herrera Acevedo
Predicting sunspot number from topological features in spectral images I: Machine learning approach
Astronomy and Computing, Volume 48, Article 100857. Published 2024-07-01.
DOI: 10.1016/j.ascom.2024.100857
https://www.sciencedirect.com/science/article/pii/S2213133724000726
Journal impact factor: 1.9. JCR quartile: Q2 (ASTRONOMY & ASTROPHYSICS).
Citations: 0
Abstract
This study presents a machine learning approach to predicting the number of sunspots using a dataset derived from solar images provided by the Solar and Heliospheric Observatory (SOHO). The dataset encompasses various spectral bands, capturing the complex dynamics of solar activity and facilitating interdisciplinary analyses with other solar phenomena. We employed five machine learning models to predict sunspot numbers: Random Forest Regressor, Gradient Boosting Regressor, Extra Trees Regressor, AdaBoost Regressor, and Hist Gradient Boosting Regressor. These models used four key heliospheric variables (Proton Density, Temperature, Bulk Flow Speed, and Interplanetary Magnetic Field (IMF)) alongside 14 newly introduced topological variables. These topological features were extracted from solar images taken with different filters, including HMIIGR, HMIMAG, EIT171, EIT195, EIT284, and EIT304. In total, 60 models were constructed, both incorporating and excluding the topological variables. Our analysis reveals that models incorporating the topological variables achieved significantly higher accuracy, with the R² score improving from approximately 0.30 to 0.93 on average. The Extra Trees Regressor (ET) emerged as the best-performing model, demonstrating superior predictive capabilities across all datasets. These results underscore the potential of combining machine learning models with additional topological features from spectral analysis, offering deeper insights into the complex dynamics of solar activity and enhancing the precision of sunspot number predictions. This approach provides a novel methodology for improving space weather forecasting and contributes to a more comprehensive understanding of solar-terrestrial interactions.
Journal: Astronomy and Computing
Categories: ASTRONOMY & ASTROPHYSICS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 4.10
Self-citation rate: 8.00%
Articles published: 67
Journal description:
Astronomy and Computing is a peer-reviewed journal that focuses on the broad area between astronomy, computer science and information technology. The journal aims to publish the work of scientists and (software) engineers in all aspects of astronomical computing, including the collection, analysis, reduction, visualisation, preservation and dissemination of data, and the development of astronomical software and simulations. The journal covers applications of academic computer science techniques to astronomy, as well as novel applications of information technologies within astronomy.