Advancing Solar Energetic Particle Event Prediction through Survival Analysis and Cloud Computing. I. Kaplan–Meier Estimation and Cox Proportional Hazards Modeling

India Jackson, Petrus Martens
Journal: The Astrophysical Journal Supplement Series
Publication date: 2024-06-01
DOI: 10.3847/1538-4365/ad3fba (https://doi.org/10.3847/1538-4365/ad3fba)

Abstract

Solar energetic particles (SEPs) pose significant challenges to technology, astronaut health, and space missions. This initial paper in our two-part series undertakes a comprehensive analysis of the time to detection for SEPs, applying advanced statistical techniques and cloud-computing resources to deepen our understanding of SEP event probabilities over time. We employ a range of models encompassing nonparametric, semiparametric, and parametric approaches, such as the Kaplan–Meier estimator and Cox Proportional Hazards models. These are complemented by various distribution models—including exponential, Weibull, lognormal, and log-logistic distributions—to effectively tackle the challenges associated with “censored data,” a common issue in survival analysis. Employing Amazon Web Services and Python’s “lifelines” and “scikit-survival” libraries, we efficiently preprocess and analyze large data sets. This methodical approach not only enhances our current analysis, but also sets a robust statistical foundation for the development of predictive models, which will be the focus of the subsequent paper. In identifying the key determinants that affect the timing of SEP detection, we establish the vital features that will inform the machine-learning (ML) techniques explored in the second paper. There, we will utilize advanced ML models—such as survival trees and random survival forests—to evolve SEP event prediction capabilities. This research is committed to advancing space weather, strengthening the safety of space-borne technology, and safeguarding astronaut health.
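
To make the handling of censored data concrete, the following is a minimal, dependency-free sketch of the Kaplan–Meier estimator together with the censoring-aware maximum-likelihood rate for an exponential model. It is not the authors' code: the paper reports using the "lifelines" and "scikit-survival" libraries, and the function names `kaplan_meier` and `exponential_rate_mle`, as well as the toy durations, are hypothetical and chosen only for illustration. An observation flagged 0 is treated as right-censored, meaning no SEP detection was recorded before the observation window ended.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: time to detection or to censoring for each event
    observed:  1 if the detection was observed, 0 if right-censored
    Returns a list of (time, S(time)) at each distinct observed event time.
    """
    # Number of observed events at each distinct time.
    events = Counter(t for t, e in zip(durations, observed) if e)
    # Number of subjects (events plus censored) leaving the risk set at each time.
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Product-limit step: S(t) = S(t-) * (1 - d / n_at_risk).
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Censored subjects at time t still count as at risk at t,
        # so the risk set is reduced only after the step above.
        at_risk -= leaving[t]
    return curve


def exponential_rate_mle(durations, observed):
    """MLE of the exponential hazard rate with right censoring:
    observed events divided by total time at risk."""
    return sum(observed) / sum(durations)


if __name__ == "__main__":
    # Hypothetical toy data (arbitrary time units), NOT taken from the paper.
    durations = [2, 3, 3, 5, 8, 8, 12, 15]
    observed = [1, 1, 0, 1, 1, 0, 1, 0]

    for t, s in kaplan_meier(durations, observed):
        print(f"t = {t:>2}  S(t) = {s:.3f}")
    print(f"exponential rate = {exponential_rate_mle(durations, observed):.4f}")
```

Censored observations never trigger a drop in the estimated survival curve, but they do enlarge the risk set up to their censoring time, which is why discarding them would bias the estimate. The same pattern carries over to the parametric case, where censored durations contribute to the total exposure time but not to the event count.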