TXED: The Texas Earthquake Dataset for AI
Yangkang Chen, Alexandras Savvaidis, O.M. Saad, Guo-Chin Dino Huang, Daniel Siervo, Vincent O’Sullivan, Cooper McCabe, Bede Uku, Preston Fleck, Grace Burke, Natalie L. Alvarez, Jessica Domino, I. Grigoratos
Seismological Research Letters, published 2024-01-24. DOI: 10.1785/0220230327 (https://doi.org/10.1785/0220230327)
Abstract
Machine-learning (ML) seismology relies on large datasets with high-fidelity human-made labels to train generalized models. Among the seismological applications of ML, earthquake detection and P- and S-wave arrival picking are the most widely studied, with capabilities that can exceed those of human analysts. Here, we present a regional artificial intelligence (AI) earthquake dataset (TXED) compiled for the state of Texas. The TXED dataset is composed of earthquake signals with manually picked P- and S-wave arrival times and manually picked noise waveforms, corresponding to more than 20,000 earthquake events spanning from the start of the Texas Seismological Network (TexNet) on 1 January 2017 to date. These data supplement the existing worldwide open-access seismological AI datasets and represent the signal and noise characteristics of Texas. Direct applications of the TXED dataset include improving the performance of a global picking model in Texas through transfer learning on the new dataset. The dataset will also serve as a benchmark for fundamental AI research, such as designing seismology-oriented deep-learning architectures. We plan to continue expanding the TXED dataset as TexNet analysts make more observations.