Local earthquakes detection: A benchmark dataset of 3-component seismograms built on a global scale

Artificial Intelligence in Geosciences (IF 4.2) | Pub Date: 2020-12-01 | Epub Date: 2020-08-03 | DOI: 10.1016/j.aiig.2020.04.001

Fabrizio Magrini , Dario Jozinović , Fabio Cammarano , Alberto Michelini , Lapo Boschi


Citations: 18


Local earthquakes detection: A benchmark dataset of 3-component seismograms built on a global scale

Machine learning is becoming increasingly important in scientific and technological progress, due to its ability to create models that describe complex data and generalize well. The wealth of publicly-available seismic data nowadays requires automated, fast, and reliable tools to carry out a multitude of tasks, such as the detection of small, local earthquakes in areas characterized by sparsity of receivers. A similar application of machine learning, however, should be built on a large amount of labeled seismograms, which is neither immediate to obtain nor to compile. In this study we present a large dataset of seismograms recorded along the vertical, north, and east components of 1487 broad-band or very broad-band receivers distributed worldwide; this includes 629,095 3-component seismograms generated by 304,878 local earthquakes and labeled as EQ, and 615,847 ones labeled as noise (AN). Application of machine learning to this dataset shows that a simple Convolutional Neural Network of 67,939 parameters allows discriminating between earthquakes and noise single-station recordings, even if applied in regions not represented in the training set. Achieving an accuracy of 96.7, 95.3, and 93.2% on training, validation, and test set, respectively, we prove that the large variety of geological and tectonic settings covered by our data supports the generalization capabilities of the algorithm, and makes it applicable to real-time detection of local events. We make the database publicly available, intending to provide the seismological and broader scientific community with a benchmark for time-series to be used as a testing ground in signal processing.
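To put the reported accuracies in context, the counts quoted in the abstract can be turned into a majority-class baseline and an average number of recordings per event. The sketch below uses only figures stated above; the function names are illustrative, and it assumes the test set has roughly the same EQ/AN balance as the full dataset, which the abstract does not state.

```python
# Counts quoted in the abstract.
N_EQ = 629_095         # 3-component seismograms labeled EQ
N_AN = 615_847         # 3-component seismograms labeled AN (noise)
N_EVENTS = 304_878     # local earthquakes that generated the EQ seismograms
TEST_ACCURACY = 0.932  # reported test-set accuracy


def majority_baseline(n_eq: int, n_an: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(n_eq, n_an) / (n_eq + n_an)


def recordings_per_event(n_eq: int, n_events: int) -> float:
    """Average number of EQ seismograms per earthquake."""
    return n_eq / n_events


if __name__ == "__main__":
    baseline = majority_baseline(N_EQ, N_AN)
    print(f"total seismograms: {N_EQ + N_AN}")
    print(f"majority-class baseline: {baseline:.3f}")
    # Assumption: the test set shares the overall class balance.
    print(f"gain over baseline: {TEST_ACCURACY - baseline:.3f}")
    print(f"EQ seismograms per event: {recordings_per_event(N_EQ, N_EVENTS):.2f}")
```

Because the two classes are nearly balanced, always predicting EQ would be right only about half the time, so the reported accuracies reflect learned discrimination rather than class imbalance.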


Source journal

Artificial Intelligence in Geosciences

CiteScore

4.20

Self-citation rate

0.00%

Articles published