Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100118
Stalyn Paucar, Christian Mejia-Escobar, Victor Collaguazo
The identification and characterization of rock types is a core activity in geology and related fields, including mining, petroleum, environmental science, industry, and construction. Traditionally, this task is performed by human specialists who analyze and describe the type, composition, texture, shape, and other properties of rock samples, whether collected in-situ or prepared in a laboratory. However, the process is subjective, dependent on the specialist’s experience, and time-consuming. This study proposes an artificial intelligence-based approach that combines computer vision and natural language processing to generate both textual and verbal descriptions from images of rock thin sections. A dataset of images and corresponding textual descriptions is used to train a hybrid deep learning model. Features extracted from the images using EfficientNetB7 are processed by a Transformer network to generate textual descriptions, which are then converted into verbal responses using a speech synthesis service. The experimental results show an accuracy of 0.892 and a BLEU score of 0.71. This model offers potential utility for research, professional, and academic applications and has been deployed as a web application for public use.
Title: Automatic description of rock thin sections: A web application | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100118
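The BLEU score quoted in the abstract above measures n-gram overlap between a generated description and a reference description. The following is a minimal sketch of a simplified sentence-level BLEU (single reference, unigrams and bigrams only, uniform weights, no smoothing). It is not the authors' evaluation code, and the two example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Simplified single-reference BLEU with uniform weights and no smoothing."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped counts: a candidate n-gram is credited at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical thin-section descriptions, for illustration only.
reference = "fine grained sandstone with quartz and feldspar"
candidate = "fine grained sandstone with quartz and mica"
score = bleu(candidate, reference)
```

Production evaluations usually go up to 4-grams and apply smoothing, so scores from this sketch are not directly comparable with the reported value.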
Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100123
Omar M. Saad, Matteo Ravasi, Tariq Alkhalifah
Seismic data denoising is a critical step applied at various stages of the seismic processing workflow, because the ability to mitigate noise in seismic data affects the quality of all subsequent analyses. However, finding an optimal balance between preserving seismic signals and effectively reducing seismic noise presents a substantial challenge. In this study, we introduce a multi-stage deep learning model, trained in a self-supervised manner, designed specifically to suppress seismic noise while minimizing signal leakage. The model is patch-based: it extracts overlapping patches from the noisy data and converts them into 1D vectors for input. It consists of two sub-networks that share the same architecture but are configured differently. Inspired by the transformer architecture, each sub-network features an embedded block comprising two fully connected layers, which extract features from the input patches. After reshaping, a multi-head attention module enhances the model’s focus on significant features by assigning higher attention weights to them. The key difference between the two sub-networks lies in the number of neurons within their fully connected layers. The first sub-network serves as a strong denoiser with a small number of neurons, effectively attenuating seismic noise; in contrast, the second sub-network functions as a signal-add-back model, using a larger number of neurons to retrieve some of the signal that was not preserved in the output of the first sub-network. The proposed model produces two outputs, one per sub-network, and both sub-networks are optimized simultaneously using the noisy data as the label for both outputs. Evaluations on both synthetic and field data demonstrate the model’s effectiveness in suppressing seismic noise with minimal signal leakage, outperforming some benchmark methods.
Title: Self-supervised multi-stage deep learning network for seismic data denoising | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100123
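The patch-based input described in the abstract above (overlapping patches flattened into 1D vectors) can be sketched as follows. The patch size, the stride and the toy data are assumptions chosen for illustration; the abstract does not state the values the authors used.

```python
def extract_patches(data, patch, stride):
    """Slide a patch x patch window over a 2D list and flatten each window.

    A stride smaller than the patch size makes neighbouring patches overlap.
    """
    rows, cols = len(data), len(data[0])
    vectors = []
    for r in range(0, rows - patch + 1, stride):
        for c in range(0, cols - patch + 1, stride):
            # Row-major flattening turns the 2D patch into a 1D vector.
            vectors.append([data[r + i][c + j]
                            for i in range(patch) for j in range(patch)])
    return vectors


# Toy 4 x 4 "section" (rows = time samples, columns = traces); values are arbitrary.
section = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = extract_patches(section, patch=2, stride=1)
```

After denoising, overlapping patches are typically averaged back into the original grid, which is one reason overlap is preferred to disjoint tiling.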
Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100114
Title: Thank you reviewers! | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100114
Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100127
M.A. Dalhat, Sami A. Osman
This study employed convolutional neural networks (CNNs) to classify rock minerals from 3179 original RGB microstructural images of undisturbed broken surfaces. The image dataset covers 40 distinct rock mineral types. Three CNN architectures (Simple model, SqueezeNet, and Xception) were evaluated to compare their performance and feature extraction capabilities. Gradient-weighted Class Activation Mapping (Grad-CAM) was employed to visualize the features influencing model predictions, providing insights into how each model distinguishes between mineral classes. Key discriminative attributes included texture, grain size, pattern, and color variations. Texture and grain boundaries were identified as the most critical features, as these regions were the most strongly activated in the best model. Patterns such as banding and chromatic contrasts further enhanced classification accuracy. Performance analysis revealed that the Simple model had limited ability to isolate fine-grained details, producing broad and less specific activations (0.84 test accuracy). SqueezeNet demonstrated improved localization of discriminative features but occasionally missed finer textural details (0.95 test accuracy). The Xception model outperformed the others, achieving the highest classification accuracy (0.98 test accuracy) by exhibiting precise and tightly focused activations, capturing intricate textures and subtle chromatic variations. Its superior performance can be attributed to its deep architecture and efficient depth-wise separable convolutions, which enabled hierarchical and detailed feature extraction. The results underscore the importance of texture, pattern, and chromatic features in accurate mineral classification and highlight the suitability of deep, efficient architectures like Xception for such tasks. These findings demonstrate the potential of CNNs in geoscience research, offering a framework for automated mineral identification in industrial and scientific applications.
Title: Deep learning based identification of rock minerals from un-processed digital microscopic images of undisturbed broken-surfaces | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100127
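The efficiency of the depth-wise separable convolutions mentioned in the abstract above comes from splitting a standard convolution into a per-channel spatial filter followed by a 1 x 1 channel-mixing step. The sketch below compares weight counts for one layer, ignoring biases; the layer sizes are hypothetical and are not taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise separable convolution (depthwise + pointwise)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
reduction = standard / separable
```

For this hypothetical layer the separable form needs roughly one ninth of the weights, which is how a deep network can stay comparatively compact.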
Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100129
Kun Chen, Meng Li, Xiaolian Li, Guangzhi Cui, Jia Tian, JiaLe Li, RuoYao Mu, JunJie Zhu
Microseismic monitoring is essential for understanding subsurface dynamics and optimizing oil and gas production. However, traditional methods for the automatic detection of microseismic events rely heavily on characteristic functions and human intervention, often resulting in suboptimal performance when dealing with complex and noisy data. In this study, we propose a novel approach that leverages a deep learning framework to extract multiscale features from microseismic data using a TransUNet neural network. Our model integrates the advantages of the Transformer and UNet architectures to achieve high accuracy in multivariate image segmentation and to pick P-wave and S-wave first arrivals precisely and simultaneously. We validate our approach using both synthetic and field microseismic datasets recorded during gas storage monitoring and roof fracturing in a coal seam. The robustness of the proposed method was verified on synthetic data with various levels of Gaussian noise and of real background noise extracted from field data. Comparisons with UNet and SwinUNet in terms of model architecture and classification performance demonstrate that TransUNet achieves the best balance between architectural complexity and inference speed. With relatively low inference time and network complexity, it operates effectively in high-precision microseismic phase picking. This advancement holds significant promise for enhancing microseismic monitoring technology in hydraulic fracturing and reservoir monitoring applications.
Title: Enhancing microseismic event detection with TransUNet: A deep learning approach for simultaneous pickings of P-wave and S-wave first arrivals | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100129
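One common way to turn a segmentation output into first-arrival picks is to take the first sample at which the predicted phase probability exceeds a threshold. The sketch below illustrates that idea only; the abstract above does not describe the authors' post-processing, and the probabilities, the threshold and the sampling interval are all hypothetical.

```python
def pick_first_arrival(probabilities, threshold=0.5):
    """Return the index of the first sample whose probability exceeds the threshold, or None."""
    for index, p in enumerate(probabilities):
        if p > threshold:
            return index
    return None


# Hypothetical per-sample network outputs for a single trace.
p_wave = [0.01, 0.02, 0.10, 0.80, 0.95, 0.90, 0.40, 0.30]
s_wave = [0.00, 0.01, 0.02, 0.05, 0.10, 0.70, 0.92, 0.95]
sampling_interval = 0.002  # seconds per sample (assumed)

p_pick = pick_first_arrival(p_wave)
s_pick = pick_first_arrival(s_wave)
# The S-minus-P time follows directly once both picks exist.
s_minus_p = (s_pick - p_pick) * sampling_interval
```

Returning None when no sample crosses the threshold lets a caller treat the trace as containing no event instead of forcing a pick.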
Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100117
Elisangela L. Faria, Rayan Barbosa, Juliana M. Coelho, Thais F. Matos, Bernardo C.C. Santos, J.L. Gonzalez, Clécio R. Bom, Márcio P. de Albuquerque, P.J. Russano, Marcelo P. de Albuquerque
Convolutional neural networks have been widely used for analyzing image data in industry, especially in the oil and gas sector. Brazil has extensive hydrocarbon reserves off its coast and has also benefited from these neural network models. Image data from petrographic thin sections can provide essential information about reservoir quality, highlighting important features such as carbonate lithology. However, the automatic identification of lithology in reservoir rocks is still a significant challenge, mainly due to the heterogeneity of the lithologies of the Brazilian pre-salt. Within this context, this work presents an approach using one-class or specialist models to identify four classes of lithology present in reservoir rocks in the Brazilian pre-salt. The proposed methodology had to cope with a small number of images for training the neural networks, in addition to the complexity of the analyzed data. An automated machine learning tool called AutoKeras was used to define the hyperparameters of the implemented models. The results were satisfactory, with an accuracy greater than 70% on image samples from other wells not seen during model building, which increases the applicability of the implemented model. Finally, a comparison was made between the proposed methodology and multiple-class models, demonstrating the superiority of one-class models.
Title: Automatic classification of Carbonatic thin sections by computer vision techniques and one-vs-all models | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100117
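The specialist, or one-vs-all, setup named in the title above can be sketched as follows: each class gets its own binary target vector for training, and at prediction time the class whose specialist model reports the highest score wins. The class names and scores are placeholders, and choosing the maximum score is one common combination rule that the abstract does not confirm.

```python
def binary_targets(labels, positive):
    """Relabel a multi-class problem as 'positive class vs. everything else'."""
    return [1 if label == positive else 0 for label in labels]


def one_vs_all_predict(scores):
    """Pick the class whose specialist model reports the highest score."""
    return max(scores, key=scores.get)


# Placeholder class names; the abstract mentions four lithology classes without naming them.
labels = ["lithology_1", "lithology_3", "lithology_2", "lithology_1"]
targets_for_1 = binary_targets(labels, "lithology_1")

# Hypothetical outputs of four specialist models for a single image.
specialist_scores = {
    "lithology_1": 0.12,
    "lithology_2": 0.81,
    "lithology_3": 0.33,
    "lithology_4": 0.05,
}
predicted = one_vs_all_predict(specialist_scores)
```

Because each specialist is trained separately, a class with few images can be tuned on its own, which fits the small-dataset setting the abstract describes.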
Pub Date: 2025-06-01 | DOI: 10.1016/j.aiig.2025.100125
Khadija Meghraoui, Teeradaj Racharak, Kenza Ait El Kadi, Saloua Bensiali, Imane Sebari
Crop yield is a crucial metric in agriculture, essential for effective sector management and for improving the overall production process. It is heavily influenced by numerous environmental factors, particularly those related to soil and climate, and the complex interactions among them make prediction challenging. In this paper, we introduce a novel integrated neurosymbolic framework that combines knowledge-based approaches with sensor data for crop-yield prediction. The framework merges two sets of predictions: those obtained from vectors generated by modeling environmental factors with a newly developed ontology focused on key elements, which is evaluated quantitatively using representation learning techniques, and those derived from remote sensing imagery. We tested the proposed methodology on a public dataset centered on corn. The resulting model achieved promising results, with a root mean squared error (RMSE) of 1.72, outperforming the baseline models. The ontology-based approach achieved an RMSE of 1.73, while the remote sensing-based method yielded an RMSE of 1.77. This confirms the superior performance of the proposed approach over those using single modalities. This integrated neurosymbolic approach demonstrates that the fusion of statistical and symbolic artificial intelligence (AI) represents a significant advancement in agricultural applications. It is particularly effective for crop-yield prediction at the field scale, thus facilitating more informed decision-making in advanced agricultural practices. The results might be further improved by incorporating more detailed ontological knowledge and by testing the model with higher-resolution imagery.
Title: A new integrated neurosymbolic approach for crop-yield prediction using environmental data and satellite imagery at field scale | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100125
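The RMSE comparison in the abstract above can be reproduced in form with a few lines. The sketch below also shows one simple way of merging two prediction sources, a weighted average; this fusion rule is an assumption, since the abstract does not say how the two sets of predictions are combined, and all numbers are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fuse(pred_a, pred_b, weight=0.5):
    """Weighted average of two prediction lists (assumed late-fusion rule)."""
    return [weight * a + (1 - weight) * b for a, b in zip(pred_a, pred_b)]


# Made-up yields and predictions, chosen so the two sources err in opposite directions.
y_true = [10.0, 12.0, 14.0]
ontology_pred = [11.0, 13.0, 15.0]
imagery_pred = [9.0, 11.0, 13.0]

fused_pred = fuse(ontology_pred, imagery_pred)
errors = {
    "ontology": rmse(y_true, ontology_pred),
    "imagery": rmse(y_true, imagery_pred),
    "fused": rmse(y_true, fused_pred),
}
```

The toy data are deliberately idealised: fusion helps most when the errors of the two sources are weakly or negatively correlated.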
Pub Date: 2025-05-26 | DOI: 10.1016/j.aiig.2025.100124
André William Boroh, Alpha Baster Kenfack Fokem, Martin Luther Mfenjou, Firmin Dimitry Hamat, Fritz Mbounja Besseme
The objective of this study is to develop an advanced approach to variogram modelling by integrating genetic algorithms (GA) with machine learning-based linear regression, aiming to improve the accuracy and efficiency of geostatistical analysis, particularly in mineral exploration. The study combines GA and machine learning to optimise variogram parameters, including range, sill, and nugget, by minimising the root mean square error (RMSE) and maximising the coefficient of determination (R²). The experimental variograms were computed and modelled using theoretical models, followed by optimisation via evolutionary algorithms. The method was applied to gravity data from the Ngoura-Batouri-Kette mining district in Eastern Cameroon, covering 141 data points. Sequential Gaussian Simulations (SGS) were employed for predictive mapping to validate simulated results against true values. Key findings show variograms with ranges between 24.71 km and 49.77 km, optimised RMSE and R² values of 11.21 mGal² and 0.969, respectively, after 42 generations of GA optimisation. Predictive mapping using SGS demonstrated that simulated values closely matched true values, with the simulated mean at 21.75 mGal compared to the true mean of 25.16 mGal, and variances of 465.70 mGal² and 555.28 mGal², respectively. The results confirmed spatial variability and anisotropies in the N170-N210 directions, consistent with prior studies. This work presents a novel integration of GA and machine learning for variogram modelling, offering an automated, efficient approach to parameter estimation. The methodology significantly enhances predictive geostatistical models, contributing to the advancement of mineral exploration and improving the precision and speed of decision-making in the petroleum and mining industries.
Title: Variogram modelling optimisation using genetic algorithm and machine learning linear regression: application for Sequential Gaussian Simulations mapping | Journal: Artificial Intelligence in Geosciences, vol. 6, no. 1, Article 100124
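The optimisation target described in the abstract above (choose nugget, sill and range so that a theoretical variogram matches the experimental one, scored by RMSE and R²) can be sketched as below. The spherical model is used as one standard theoretical model; the abstract does not say which models the authors fitted, and the lags, the experimental values and the trial parameters are hypothetical.

```python
import math


def spherical_variogram(h, nugget, sill, rng):
    """Spherical model; `sill` is the total sill (nugget included), `rng` the range."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return sill
    ratio = h / rng
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


def fit_quality(observed, predicted):
    """Return (RMSE, R^2) of predicted against observed semivariances."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(ss_res / n), 1 - ss_res / ss_tot


# Hypothetical experimental variogram: lags in km, semivariances in mGal^2.
lags = [10.0, 20.0, 30.0, 40.0]
experimental = [300.0, 480.0, 560.0, 540.0]

# One candidate parameter set, as a genetic algorithm individual might encode it.
candidate = {"nugget": 50.0, "sill": 550.0, "rng": 30.0}
modelled = [spherical_variogram(h, **candidate) for h in lags]
rmse, r2 = fit_quality(experimental, modelled)
```

A genetic algorithm would evaluate `fit_quality` for every individual in a population and keep the parameter sets with the lowest RMSE (or highest R²) for the next generation.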
Pub Date : 2025-05-20 DOI: 10.1016/j.aiig.2025.100122
Gamze Maden Muftuoglu, Kaveh Dehghanian
Liquefaction is one of the prominent causes of damage to soil and structures. In this study, the relationship between liquefaction potential and soil parameters is determined by applying feature importance methods to Random Forest (RF), Logistic Regression (LR), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost) algorithms. The feature importance methods are permutation importance and SHapley Additive exPlanations (SHAP), along with each model's built-in feature importance method where one exists. These approaches draw on an extensive dataset of geotechnical parameters, historical liquefaction events, and soil properties; the feature set comprises 18 parameters gathered from 161 field cases. The algorithms are used to determine the best-performing feature set, and the study assesses how well they predict soil liquefaction potential compared with other approaches. Early findings show that the algorithms perform well, demonstrating their capacity to identify non-linear relationships and improve prediction accuracy. Among the feature set, σ′v (psf), MSF, CSRσ′v, FC%, Vs*,40ft (fps), and N1,60,CS have the greatest influence on the result. The study's contribution is that, in the absence of extensive data for liquefaction assessment, the proposed method estimates the liquefaction potential using five parameters with promising accuracy.
{"title":"Soil liquefaction assessment using machine learning","authors":"Gamze Maden Muftuoglu , Kaveh Dehghanian","doi":"10.1016/j.aiig.2025.100122","DOIUrl":"10.1016/j.aiig.2025.100122","url":null,"abstract":"<div><div>Liquefaction is one of the prominent factors leading to damage to soil and structures. In this study, the relationship between liquefaction potential and soil parameters is determined by applying feature importance methods to Random Forest (RF), Logistic Regression (LR), Multilayer Perceptron (MLP), Support Vector Machine (SVM) and eXtreme Gradient Boosting (XGBoost) algorithms. Feature importance methods consist of permutation and Shapley Additive exPlanations (SHAP) importances along with the used model's built-in feature importance method if it exists. These suggested approaches incorporate an extensive dataset of geotechnical parameters, historical liquefaction events, and soil properties. The feature set comprises 18 parameters that are gathered from 161 field cases. Algorithms are used to determine the optimum performance feature set. Compared to other approaches, the study assesses how well these algorithms predict soil liquefaction potential. Early findings show that the algorithms perform well, demonstrating their capacity to identify non-linear connections and improve prediction accuracy. Among the feature set, <em>σ</em><sup><em>,</em></sup><sub><em>v</em></sub> (<em>psf</em>), MSF, <em>CSR</em><sub><em>σ,</em></sub> <sub><em>v</em></sub>, FC%, V<sub>s∗,40f</sub> <sub>t</sub>(f ps) and <em>N</em><sub>1<em>,</em>60<em>,CS</em></sub> are the ones that have the highest deterministic power on the result. 
The study's contribution is that, in the absence of extensive data for liquefaction assessment, the proposed method estimates the liquefaction potential using five parameters with promising accuracy.</div></div>","PeriodicalId":100124,"journal":{"name":"Artificial Intelligence in Geosciences","volume":"6 1","pages":"Article 100122"},"PeriodicalIF":0.0,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144134957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
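Permutation importance, one of the feature importance methods named in this abstract, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the feature names are placeholders, and `predict` is a hypothetical rule-based stand-in for a trained classifier with arbitrary thresholds.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rnd = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rnd.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic rows with three placeholder features (not the paper's 18 parameters).
FEATURES = ["n1_60_cs", "fines_pct", "noise"]
data_rng = random.Random(42)
X = [
    [data_rng.uniform(0, 30), data_rng.uniform(0, 50), data_rng.uniform(0, 1)]
    for _ in range(300)
]


def predict(row):
    """Hypothetical stand-in classifier; the thresholds are arbitrary."""
    return 1 if row[0] < 15 and row[1] < 35 else 0


# Labels come from the same rule, so the baseline accuracy is 1.0 by construction.
y = [predict(row) for row in X]

importances = permutation_importance(predict, X, y)
for name, imp in zip(FEATURES, importances):
    print(f"{name}: {imp:.3f}")
```

The third feature is never read by the stand-in model, so shuffling it leaves every prediction unchanged and its importance is exactly zero. The two features the model does read show a positive accuracy drop.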
Pub Date : 2025-05-12 DOI: 10.1016/j.aiig.2025.100121
Raed H. Allawi, Watheq J. Al-Mudhafar, Mohammed A. Abbas, David A. Wood
Drilling optimization requires accurate drill bit rate-of-penetration (ROP) predictions. A higher ROP decreases drilling time and costs and increases rig productivity. This study employs random forest (RF), gradient boosting modeling (GBM), extreme gradient boosting (XGBoost), and adaptive boosting (AdaBoost) models to generate ROP predictions. The models use well data from a 3200-m segment across the stratigraphic column (Dibdibba to Zubair formations) of the large West Qurna oil field in southern Iraq, penetrating 19 formations and four oil reservoirs. The reservoir sections are between 40 and 440 m thick and consist of both carbonate and clastic lithologies. The ROP predictive models were developed using 14 operational parameters: TVD, weight on bit (WOB), torque, effective circulating density (ECD), drilling rotation per minute (RPM), flow rate, standpipe pressure (SPP), bit size, total RPM, D exponent, gamma ray (GR), density, neutron, caliper, and discrete lithology distribution. Training and validation of the ROP models involve data compiled from three development wells. Using random subsampling, the compiled dataset was split into 85% for training and 15% for validation and testing. The mismatch between measured and predicted ROP in the test subgroup was assessed using root mean square error (RMSE) and coefficient of correlation (R²). The RF, GBM, and XGBoost models provide ROP predictions versus depth with low errors. Models with cross-validation that integrate data from three wells deliver more accurate ROP predictions than models built on data from a single well. Analysis of the input variables' influences on ROP identifies the optimal value ranges for the 14 operating parameters that help to increase drilling speed and reduce cost.
{"title":"Leveraging boosting machine learning for drilling rate of penetration (ROP) prediction based on drilling and petrophysical parameters","authors":"Raed H. Allawi , Watheq J. Al-Mudhafar , Mohammed A. Abbas , David A. Wood","doi":"10.1016/j.aiig.2025.100121","DOIUrl":"10.1016/j.aiig.2025.100121","url":null,"abstract":"<div><div>Drilling optimization requires accurate drill bit rate-of-penetration (ROP) predictions. ROP decreases drilling time and costs and increases rig productivity. This study employs random forest (RF), gradient boosting modeling (GBM), extreme gradient boosting (XGBoost), and adaptive boosting (Adaboost) models to generate ROP predictions. The models use well data from a 3200-m segment across the stratigraphic column (Dibdibba to Zubair formations) of the large West Qurna oil field in Southern Iraq, penetrating 19 formations and four oil reservoirs. The reservoir sections are between 40 and 440 m thick and consist of both carbonate and clastic lithologies. The ROP predictive models were developed using 14 operational parameters: TVD, weight on bit (WOB), torque, effective circulating density (ECD), drilling rotation per minute (RPM), flow rate, standpipe pressure (SPP), bit size, total RPM, D exponent, gamma ray (GR), density, neutron, caliper, and discrete lithology distribution. Training and validation of the ROP models involves data compiled from three development wells. Applying Random subsampling, the compiled dataset was split into 85 % for training and 15 % for validation and testing. The test subgroup's measured and predicted ROP mismatch was assessed using root mean square error (RMSE) and coefficient of correlation (R<sup>2</sup>). The RF, GBM, and XGBoost models provide ROP predictions versus depth with low errors. Models with cross-validation that integrate data from three wells deliver more accurate ROP predictions than datasets from single well. 
The input variables' influences on ROP optimization identify the optimal value ranges for 14 operating parameters that help to increase drilling speed and reduce cost.</div></div>","PeriodicalId":100124,"journal":{"name":"Artificial Intelligence in Geosciences","volume":"6 1","pages":"Article 100121"},"PeriodicalIF":0.0,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143936979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
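The evaluation protocol in this abstract (an 85/15 random split scored with RMSE and R²) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' workflow: the (WOB, ROP) pairs are synthetic, a one-variable least-squares line stands in for the tree ensembles, and R² is computed as the coefficient of determination, which is an assumption about the metric the abstract intends. All function names are hypothetical.

```python
import math
import random


def random_split(rows, train_fraction=0.85, seed=0):
    """Random subsampling: shuffle a copy, then cut it at train_fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for RF, GBM, XGBoost)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


# Synthetic (weight on bit, ROP) pairs; the values carry no physical meaning.
data_rng = random.Random(1)
rows = []
for _ in range(100):
    wob = data_rng.uniform(5, 25)
    rows.append((wob, 2.0 + 0.5 * wob + data_rng.gauss(0, 1.0)))

train, test = random_split(rows)
a, b = fit_line([x for x, _ in train], [y for _, y in train])
y_true = [y for _, y in test]
y_pred = [a + b * x for x, _ in test]
print("train/test sizes:", len(train), len(test))
print("RMSE:", round(rmse(y_true, y_pred), 3), "R2:", round(r2(y_true, y_pred), 3))
```

Scoring on the held-out 15% rather than on the training rows is what makes the reported error an estimate of performance on unseen depth intervals.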