
Artificial Intelligence in Geosciences: Latest Articles

Capsule network-based approach for estimating grassland coverage using time series data from enhanced vegetation index
Pub Date : 2021-12-01 DOI: 10.1016/j.aiig.2021.08.001
Yaqi Sun, Hailong Liu, Zhengqiang Guo

The degradation and desertification of grasslands pose a daunting challenge to China's arid and semiarid areas, as the rise of animal husbandry increases the demand placed on them. Monitoring grasslands with big data has become a popular area of research in recent years. Because grassland degradation is slow and gradual, accurately identifying grassland cover is key to monitoring it. Vegetation coverage is currently monitored mainly by combining inversion-based methods with field surveys, which requires significant human effort and other resources and is thus unsuitable for large-scale use. We propose using time series of the enhanced vegetation index (EVI) with a capsule network-based method to identify grasslands. The method classifies grassland coverage into four levels (high, medium, low, and other) based on Landsat images from 2019. The classification accuracy at each level was higher than 90%, with an overall accuracy of 96.32% and a kappa coefficient of 0.9508. The proposed method outperformed the SVM, RF, and LSTM algorithms in classification accuracy.
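
The abstract reports an overall accuracy and a kappa coefficient. As a reminder of how these two figures relate to a confusion matrix, here is a minimal sketch in plain Python. The 4x4 counts are hypothetical and are not the paper's data; the function name is also illustrative.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    n = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 4-class counts (high, medium, low, other); not the paper's data.
example = [
    [48, 2, 0, 0],
    [1, 47, 2, 0],
    [0, 3, 46, 1],
    [0, 0, 1, 49],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the overall accuracy for the same matrix.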

Artificial Intelligence in Geosciences, Volume 2, Pages 26-34
Citations: 3
Flood susceptibility assessment using artificial neural networks in Indonesia
Pub Date : 2021-12-01 DOI: 10.1016/j.aiig.2022.03.002
Stela Priscillia , Calogero Schillaci , Aldo Lipani

Flood incidents can massively damage and disrupt a city's economic or governing core. However, flood risk can be mitigated through event planning and city-wide preparation to reduce damage. For governments, firms, and civilians to make such preparations, flood susceptibility predictions are required. To predict flood susceptibility, nine environment-related factors have been identified: elevation, slope, curvature, topographical wetness index (TWI), Euclidean distance from a river, land cover, stream power index (SPI), soil type, and precipitation. This work uses these factors alongside Sentinel-1 satellite imagery in a model intercomparison study to back-predict flood susceptibility in Jakarta for the historic January 2020 flood event across 260 key locations. For each location, the study uses current environmental conditions to predict flood status in the following month. Because flooded and non-flooded instances are imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) is used to balance the two classes in the training set. This work compares predictions from artificial neural networks (ANN), k-nearest neighbors (k-NN), and support vector machines (SVM) against a random baseline. The effect of SMOTE is also assessed by training each model on balanced and imbalanced datasets. The ANN is found to be superior to the other machine learning models.
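
The core idea behind SMOTE, mentioned above, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea in plain Python under simplifying assumptions: the two features and their values are hypothetical, and this is not the implementation used in the study.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating towards
    one of the k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, target)))
    return synthetic


# Hypothetical (elevation in m, slope in degrees) features for flooded sites.
flooded = [(3.0, 1.0), (4.0, 1.5), (2.5, 0.5), (5.0, 2.0)]
new_points = smote(flooded, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, oversampling of this kind is normally applied to the training set only, as the abstract describes.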

Artificial Intelligence in Geosciences, Volume 2, Pages 215-222
Citations: 8
Wavefield solutions from machine learned functions constrained by the Helmholtz equation
Pub Date : 2021-12-01 DOI: 10.1016/j.aiig.2021.08.002
Tariq Alkhalifah , Chao Song , Umair bin Waheed , Qi Hao

Solving the wave equation is one of the most (if not the most) fundamental problems we face as we try to illuminate the Earth using recorded seismic data. The Helmholtz equation provides wavefield solutions that are dimensionally reduced, per frequency, compared to the time domain, which is useful for many applications, such as full waveform inversion. However, our ability to attain such wavefield solutions often depends on the size of the model and the complexity of the wave equation. We therefore use a recently introduced neural network framework that predicts functional solutions by setting the underlying physical equation as the loss function used to optimize the neural network (NN) parameters. Given a location in the model space as input, the network learns to predict the wavefield value at that location; its partial derivatives, obtained by automatic differentiation, are used to fit, in our case, a form of the Helmholtz equation. We specifically seek the scattered wavefield, using a simple homogeneous background model that admits an analytical solution for the background wavefield. Providing the NN with a reasonable number of random points from the model space ultimately trains a fully connected deep NN to predict the scattered wavefield function. The size of the network depends mainly on the complexity of the desired wavefield, which increases with frequency and with model complexity. However, smaller networks can provide smoother wavefields that might be useful for inversion applications. Preliminary tests on a two-box-shaped scatterer model with a source in the middle, as well as the Marmousi model with a source at the surface, demonstrate the potential of the NN for this application. Additional tests on a 3D model demonstrate the potential versatility of the approach.
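
To make the idea of "the physical equation as a loss function" concrete, here is a deliberately simplified sketch in plain Python. It assumes a 1D homogeneous Helmholtz equation u'' + k^2 u = 0 with a constant wavenumber, fixed collocation points, and central differences in place of automatic differentiation. It is not the scattered-wavefield formulation or the network of the paper; all names and values are illustrative.

```python
import cmath


def helmholtz_loss(u, wavenumber, points, h=1e-3):
    """Mean squared residual of u'' + k^2 u = 0 at the given points.

    The second derivative is approximated by central differences here; the
    approach in the abstract differentiates a neural network automatically.
    """
    total = 0.0
    for x in points:
        d2u = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
        residual = d2u + wavenumber**2 * u(x)
        total += abs(residual) ** 2
    return total / len(points)


k = 2.0                                    # assumed constant wavenumber
points = [0.1 * i for i in range(1, 50)]   # collocation points (fixed here)


def exact(x):
    return cmath.exp(1j * k * x)           # satisfies the equation


def wrong(x):
    return cmath.exp(1j * 3.0 * x)         # wrong wavenumber


loss_exact = helmholtz_loss(exact, k, points)
loss_wrong = helmholtz_loss(wrong, k, points)
```

A candidate function that satisfies the equation yields a loss near zero, while one with the wrong wavenumber yields a large loss; training a network amounts to driving this residual down over the sampled points.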

Artificial Intelligence in Geosciences, Volume 2, Pages 11-19
Citations: 30
Sparse inversion-based seismic random noise attenuation via self-paced learning
Pub Date : 2021-12-01 DOI: 10.1016/j.aiig.2022.03.003
Yang Yang , Zhiguo Wang , Jinghuai Gao , Naihao Liu , Zhen Li

Seismic random noise reduction is an important task in seismic data processing for the Chinese Loess Plateau area, and it benefits geologic structure interpretation and subsequent reservoir prediction. Sparse inversion is a widely used tool for seismic random noise reduction and is often solved via sparse approximation with a regularization term. The ℓ₁ norm and the total variation (TV) term are two commonly used regularization terms. However, the ℓ₁ norm is only a relaxation of the ℓ₀ norm and cannot always provide a sparse result, and the TV regularization term may introduce unwanted staircase artifacts. To avoid these disadvantages, we propose a workflow for seismic random noise reduction that uses the self-paced learning (SPL) scheme and a sparse representation (the continuous wavelet transform, CWT) with a mixed-norm regularization comprising the ℓₚ norm and the TV regularization. In the implementation, SPL, which is inspired by human cognitive learning, is introduced to avoid bad minima of the non-convex cost function. SPL first selects seismic data with a high signal-to-noise ratio (SNR) and then gradually brings low-SNR seismic data into the workflow. Moreover, the generalized Beta wavelet (GBW) is adopted as the basic wavelet of the CWT to better match seismic wavelets and thereby obtain a more localized time-frequency (TF) representation. Notably, the GBW can easily constitute a tight frame, which saves computation time. Synthetic and field data examples demonstrate that the proposed workflow effectively suppresses seismic random noise and accurately preserves valid seismic reflections.
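
Two standard building blocks behind the ideas above can be sketched in a few lines of plain Python: soft thresholding (the proximal operator of the ℓ₁ norm, which produces sparsity) and hard self-paced weights (which admit "easy" samples first). This is only an illustration under simplifying assumptions; the paper's workflow uses a mixed ℓₚ and TV regularization in the CWT domain, which is not reproduced here, and the numbers are hypothetical.

```python
def soft_threshold(coefficients, threshold):
    """Proximal operator of the l1 norm: shrink each value towards zero."""
    shrunk = []
    for c in coefficients:
        magnitude = max(abs(c) - threshold, 0.0)
        shrunk.append(magnitude if c >= 0 else -magnitude)
    return shrunk


def self_paced_weights(losses, age):
    """Hard self-paced weights: keep samples whose loss is below `age`."""
    return [1 if loss < age else 0 for loss in losses]


coeffs = [3.0, -0.2, 0.5, -2.0]
sparse = soft_threshold(coeffs, 0.5)   # small coefficients become zero

# Hypothetical per-trace misfits; a low misfit stands in for a high SNR.
trace_losses = [0.1, 0.9, 0.4, 2.5]
early = self_paced_weights(trace_losses, age=0.5)   # only the easy traces
later = self_paced_weights(trace_losses, age=1.0)   # the pace parameter grows
```

Raising the pace parameter between iterations gradually admits the harder (noisier) traces, mirroring the high-SNR-first selection described in the abstract.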

Artificial Intelligence in Geosciences, Volume 2, Pages 223-233
Citations: 1
Local earthquakes detection: A benchmark dataset of 3-component seismograms built on a global scale
Pub Date : 2020-12-01 DOI: 10.1016/j.aiig.2020.04.001
Fabrizio Magrini , Dario Jozinović , Fabio Cammarano , Alberto Michelini , Lapo Boschi

Machine learning is becoming increasingly important in scientific and technological progress, due to its ability to create models that describe complex data and generalize well. The wealth of publicly available seismic data now calls for automated, fast, and reliable tools to carry out a multitude of tasks, such as detecting small, local earthquakes in areas with sparse receiver coverage. Such an application of machine learning, however, must be built on a large number of labeled seismograms, which are neither quick to obtain nor easy to compile. In this study we present a large dataset of seismograms recorded along the vertical, north, and east components of 1487 broad-band or very broad-band receivers distributed worldwide; it includes 629,095 3-component seismograms generated by 304,878 local earthquakes and labeled as EQ, and 615,847 labeled as noise (AN). Applying machine learning to this dataset shows that a simple convolutional neural network with 67,939 parameters can discriminate between earthquake and noise single-station recordings, even when applied in regions not represented in the training set. With accuracies of 96.7%, 95.3%, and 93.2% on the training, validation, and test sets, respectively, we show that the large variety of geological and tectonic settings covered by our data supports the generalization capabilities of the algorithm and makes it applicable to real-time detection of local events. We make the database publicly available, intending to provide the seismological and broader scientific community with a time-series benchmark to be used as a testing ground in signal processing.
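
As a quick sanity check on the figures quoted above, the class balance of the dataset can be worked out directly from the two counts in the abstract. The short Python calculation below also derives the accuracy a trivial "always predict the majority class" rule would reach on the full dataset; whether the training, validation, and test splits share this balance is an assumption, since the abstract does not say.

```python
earthquake_traces = 629_095   # seismograms labeled EQ in the abstract
noise_traces = 615_847        # seismograms labeled AN in the abstract

total = earthquake_traces + noise_traces
eq_fraction = earthquake_traces / total

# Accuracy of always predicting the larger class on the full dataset.
majority_baseline = max(earthquake_traces, noise_traces) / total
```

The two classes are nearly balanced (roughly 50.5% EQ), so the reported accuracies sit far above what a majority-class rule would achieve.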

Artificial Intelligence in Geosciences, Volume 1, Pages 1-10
Citations: 18
Exact Conditioning of Regression Random Forest for Spatial Prediction
Pub Date : 2020-12-01 DOI: 10.1016/j.aiig.2021.01.001
Francky Fouedjio

Regression random forest is becoming a widely used machine learning technique for spatial prediction and shows competitive prediction performance in various geoscience fields. Like other popular machine learning methods for spatial prediction, regression random forest does not exactly honor the response variable’s measured values at sampled locations. In contrast, competing methods such as regression-kriging fit the response variable’s observed values at sampled locations perfectly by construction. Exactly matching the measured values at sampled locations is desirable in many geoscience applications. This paper presents a new approach ensuring that regression random forest perfectly matches the response variable’s observed values at sampled locations. The main idea is to use principal component analysis to create an orthogonal representation of the ensemble of regression tree predictors produced by the traditional regression random forest. The exact conditioning problem is then reformulated as a Bayes-linear-Gauss problem on the principal component scores. This problem has an analytical solution, which makes it easy to perform Monte Carlo sampling of new principal component scores and then reconstruct regression tree predictors that perfectly match the response variable’s observed values at sampled locations. The average of the reconstructed regression tree predictors also matches the measured values at sampled locations exactly, by construction. The method’s effectiveness is illustrated using a synthetic dataset, where the ground truth is available everywhere within the study region, and a real dataset comprising geochemical concentration data from southwest England. It is compared with regression-kriging and the traditional regression random forest. The proposed method fits the response variable’s measured values at sampled locations perfectly while achieving good out-of-sample predictive performance compared with regression-kriging and traditional regression random forest.
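
The conditioning step can be illustrated with a heavily simplified sketch in plain Python. It assumes standard normal scores z, a single sampled location, and a prediction of the form mean + loadings · z; every draw is then shifted along the loadings so the prediction equals the measured value exactly. The loadings, mean, and measurement are hypothetical, and the paper's full method (many locations, principal components of the tree ensemble) is not reproduced here.

```python
import random


def condition_scores(loadings, prior_mean, observed, rng):
    """Sample z ~ N(0, I) conditioned on prior_mean + loadings . z == observed."""
    norm_sq = sum(a * a for a in loadings)
    free = [rng.gauss(0.0, 1.0) for _ in loadings]   # unconditional draw
    # Mismatch between the measurement and what the free draw would predict.
    mismatch = observed - prior_mean - sum(a * e for a, e in zip(loadings, free))
    # Shift the draw along `loadings` so the constraint holds exactly.
    return [e + a * mismatch / norm_sq for a, e in zip(loadings, free)]


rng = random.Random(0)
loadings = [0.8, -0.3, 0.1]   # hypothetical component values at the sampled site
forest_mean = 10.0            # hypothetical forest prediction at that site
measured = 11.5               # hypothetical measured response

predictions = []
for _ in range(5):
    z = condition_scores(loadings, forest_mean, measured, rng)
    predictions.append(forest_mean + sum(a * s for a, s in zip(loadings, z)))
```

Each conditioned draw differs in the directions orthogonal to the loadings, yet all of them reproduce the measured value at the sampled site, so their average does too.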

Artificial Intelligence in Geosciences, Volume 1, Pages 11-23
Citations: 12
Artificial Intelligence in Geosciences
Pub Date : 2020-12-01 DOI: 10.1016/j.aiig.2021.02.001
Hua Wang, Gabriele Morra
Artificial Intelligence in Geosciences, Volume 1, Pages 52-53
Citations: 0
Automatic fault instance segmentation based on mask propagation neural network
Pub Date : 2020-12-01 DOI: 10.1016/j.aiig.2020.12.001
Ruoshui Zhou, Yufei Cai, Jingjing Zong, Xingmiao Yao, Fucai Yu, Guangmin Hu

Fault interpretation plays a critical role in understanding crustal development and exploring subsurface reservoirs such as gas and oil. Recently, significant advances have been made in fault semantic segmentation using deep learning. However, few studies employ deep learning for fault instance segmentation. We introduce a mask propagation neural network for fault instance segmentation. Our study focuses on describing the differences and relationships between fault profiles and on keeping fault instance segmentations consistent across adjacent profiles. Our method builds on the reference-guided mask propagation network, which was first used in video object segmentation: by treating the seismic profiles as video frames and the seismic data volume as a video sequence along the inline direction, we can achieve fault instance segmentation with the mask propagation method. As a multi-level convolutional neural network, the mask propagation network receives a small number of user-defined tags as guidance and outputs the fault instance segmentation on 3D seismic data, which can facilitate the fault reconstruction workflow. Compared with the traditional deep learning method, the mask propagation neural network can perform fault instance segmentation while maintaining the accuracy of fault detection.
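
The propagation loop described above, in which each profile is segmented using the previous profile's mask plus a user-labeled reference, can be sketched in plain Python. The `segment` callable stands in for the trained network and is purely hypothetical, as is the toy rule used here in its place; the tiny "profiles" are made-up amplitude grids.

```python
def propagate_masks(frames, reference_mask, segment):
    """Carry a labeled mask from the first frame through the remaining frames.

    `segment(frame, previous_mask, reference_frame, reference_mask)` stands in
    for the trained network; it is a hypothetical callable, not a real API.
    """
    masks = [reference_mask]
    for frame in frames[1:]:
        masks.append(segment(frame, masks[-1], frames[0], reference_mask))
    return masks


def toy_segment(frame, previous_mask, reference_frame, reference_mask):
    """Toy stand-in: keep a label where the previous mask had it in the same
    or an adjacent column and the frame amplitude is high."""
    width = len(frame[0])
    new_mask = []
    for r, row in enumerate(frame):
        new_row = []
        for c, value in enumerate(row):
            label = 0
            for cc in (c - 1, c, c + 1):
                if 0 <= cc < width and previous_mask[r][cc] and value > 0.5:
                    label = previous_mask[r][cc]
            new_row.append(label)
        new_mask.append(new_row)
    return new_mask


# Three tiny "inline sections"; a fault (high amplitude) drifts one column right.
frames = [
    [[0.9, 0.1, 0.1], [0.9, 0.1, 0.1]],
    [[0.1, 0.9, 0.1], [0.9, 0.1, 0.1]],
    [[0.1, 0.9, 0.1], [0.1, 0.9, 0.1]],
]
user_mask = [[1, 0, 0], [1, 0, 0]]   # instance id 1 labeled on the first frame
masks = propagate_masks(frames, user_mask, toy_segment)
```

Because the instance id is carried from one profile to the next, the same fault keeps the same label through the volume, which is the consistency between adjacent profiles that the abstract emphasizes.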

Citations: 7
Seismic labeled data expansion using variational autoencoders
Pub Date : 2020-12-01 DOI: 10.1016/j.aiig.2020.12.002
Kunhong Li, Song Chen, Guangmin Hu Ph.D

Supervised machine learning algorithms have been widely used in seismic exploration processing, but the lack of labeled examples complicates their application. Therefore, we propose a seismic labeled data expansion method based on deep variational autoencoders (VAE), which are built from neural networks and contain two parts: an Encoder and a Decoder. A lack of training samples leads to overfitting of the network. We train the VAE on the whole seismic data set, which is a data-driven process and greatly reduces the risk of overfitting. The Encoder learns to map the seismic waveform Y to latent deep features z, and the Decoder learns to reconstruct the high-dimensional waveform Ŷ from the latent deep features z. We then feed the labeled seismic data into the Encoder to obtain their latent deep features, and fit a Gaussian mixture model to the deep feature distribution of each class of labeled data. We resample a large number of expansion deep features z* from the Gaussian mixture model and pass them through the Decoder to generate expansion seismic data. Experiments on synthetic and real data show that our method alleviates the shortage of labeled seismic data for supervised seismic facies analysis.
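
The resampling step in latent space can be illustrated with a small sketch. It makes two loud simplifications that are not in the paper: a single diagonal Gaussian per class replaces the Gaussian mixture model, and the VAE itself is omitted, so the latent features are assumed to come from an already trained Encoder and the sampled z* would still have to be passed through a trained Decoder. All function names are hypothetical.

```python
import random
import statistics


def fit_diag_gaussian(features):
    """Return per-dimension means and standard deviations of latent vectors."""
    dims = list(zip(*features))  # transpose: one tuple of values per dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds


def sample_latents(means, stds, n, rng):
    """Draw n latent vectors, sampling each dimension independently."""
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]


def expand_latents(latents_by_class, n_per_class, seed=0):
    """Fit one Gaussian per class label and resample new latent features.

    latents_by_class: dict mapping a class label to a list of latent vectors
        (assumed to be Encoder outputs for the labeled seismic data).
    """
    rng = random.Random(seed)
    expanded = {}
    for label, feats in latents_by_class.items():
        means, stds = fit_diag_gaussian(feats)
        expanded[label] = sample_latents(means, stds, n_per_class, rng)
    return expanded
```

Fitting one distribution per class is what lets every generated sample inherit a label: each z* is drawn from the model of a known class, so the decoded waveform can be added to the training set under that class.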

Citations: 6
ShakeDaDO: A data collection combining earthquake building damage and ShakeMap parameters for Italy
Pub Date : 2020-12-01 DOI: 10.1016/j.aiig.2021.01.002
Licia Faenza, Alberto Michelini, Helen Crowley, Barbara Borzi, Marta Faravelli

In this article, we present a new data collection that combines information about earthquake damage with seismic shaking. Starting from the Da.D.O. database, which provides information on the damage to individual buildings subjected to sequences of past earthquakes in Italy, we have generated ShakeMaps for all events with magnitude greater than 5.0 that contributed to these sequences. The sequences under examination are those of Irpinia 1980, Umbria Marche 1997, Pollino 1998, Molise 2002, L’Aquila 2009 and Emilia 2012. In this way, we were able to combine, for a total of 117,695 buildings, the engineering parameters included in Da.D.O., revised and reprocessed for this application, with the ground shaking data for six different variables (namely, intensity on the MCS scale, PGA, PGV, and SA at 0.3 s, 1.0 s and 3.0 s). The potential applications of this data collection are numerous, ranging from recalibrating fragility curves to training machine learning models to quantifying earthquake damage. This data collection will be made available within Da.D.O., a platform of the Italian Department of Civil Protection, developed by EUCENTRE.

Citations: 5