Rare-Event Time Series Prediction: A Case Study of Solar Flare Forecasting

Azim Ahmadzadeh, Berkay Aydin, Dustin J. Kempton, Maxwell Hostetter, R. Angryk, M. Georgoulis, Sushant S. Mahajan
Published in: 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
Publication date: 2019-12-01
DOI: 10.1109/ICMLA.2019.00293 (https://doi.org/10.1109/ICMLA.2019.00293)
Citations: 9

Abstract

We present a case study of time series prediction models in extreme class-imbalance problems. To create the forecasting dataset used in this study, we extracted multiple properties from the Space Weather ANalytics for Solar Flares (SWAN-SF) benchmark dataset, which comprises magnetic features from over 4075 active regions over a period of 9 years. In the extracted dataset, the class-imbalance ratio is 1:60, where the minority class is formed by instances of strong solar flares (GOES M- and X-class). This ratio reaches 1:800 if we consider only the strongest class of flares (GOES X-class). This extreme imbalance, along with the temporal coherence of the sliced time series, presents an interesting set of challenges in forecasting scarce real-life phenomena. We have explored remedies for the class-imbalance issue, such as undersampling, oversampling, and misclassification weights. In the process, we elaborate on common mistakes and pitfalls caused by ignoring the side effects of these remedies, including how and why they weaken the robustness of the trained models while seemingly improving performance.
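
The three remedies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy labels (10 positives against 600 negatives, mimicking the 1:60 ratio), the function names, and the inverse-frequency weighting formula are all assumptions made for illustration, and the abstract does not specify how the paper applies each remedy.

```python
import random
from collections import Counter


def group_by_class(labels):
    """Map each class label to the list of sample indices carrying it."""
    groups = {}
    for i, y in enumerate(labels):
        groups.setdefault(y, []).append(i)
    return groups


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def undersample(labels, rng):
    """Keep a random subset of every class, sized to the smallest class."""
    groups = group_by_class(labels)
    target = min(len(idx) for idx in groups.values())
    kept = []
    for idx in groups.values():
        kept.extend(rng.sample(idx, target))
    return sorted(kept)


def oversample(labels, rng):
    """Replicate samples (with replacement) until every class matches the largest."""
    groups = group_by_class(labels)
    target = max(len(idx) for idx in groups.values())
    out = []
    for idx in groups.values():
        out.extend(idx)
        out.extend(rng.choices(idx, k=target - len(idx)))
    return sorted(out)


# Hypothetical toy labels at a 1:60 ratio (1 = strong flare, 0 = no strong flare).
labels = [1] * 10 + [0] * 600
rng = random.Random(0)  # fixed seed so the sketch is reproducible

weights = class_weights(labels)
under_idx = undersample(labels, rng)
over_idx = oversample(labels, rng)
```

With these toy labels the minority class receives a weight 60 times that of the majority class, undersampling keeps 20 instances, and oversampling yields 1200. All three remedies change what the model sees during training, so a reasonable precaution (an assumption here, not a claim taken from the abstract) is to apply them to the training split only and leave the evaluation data at its original class ratio.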