
Artificial Intelligence in Geosciences: Latest Articles

Explaining machine learning models trained to predict Copernicus DEM errors in different land cover environments
Pub Date : 2025-07-15 DOI: 10.1016/j.aiig.2025.100141
Michael Meadows, Karin Reinke, Simon Jones
Machine learning models are increasingly used to correct the vertical biases (mainly due to vegetation and buildings) in global Digital Elevation Models (DEMs), for downstream applications which need “bare earth” elevations. The predictive accuracy of these models has improved significantly as more flexible model architectures are developed and new explanatory datasets produced, leading to the recent release of three model-corrected DEMs (FABDEM, DiluviumDEM and FathomDEM). However, there has been relatively little focus so far on explaining or interrogating these models, especially important in this context given their downstream impact on many other applications (including natural hazard simulations). In this study we train five separate models (by land cover environment) to correct vertical biases in the Copernicus DEM and then explain them using SHapley Additive exPlanation (SHAP) values. Comparing the models, we find significant variation in terms of the specific input variables selected and their relative importance, suggesting that an ensemble of models (specialising by land cover) is likely preferable to a general model applied everywhere. Visualising the patterns learned by the models (using SHAP dependence plots) provides further insights, building confidence in some cases (where patterns are consistent with domain knowledge and past studies) and highlighting potentially problematic variables in others (such as proxy relationships which may not apply in new application sites). Our results have implications for future DEM error prediction studies, particularly in evaluating a very wide range of potential input variables (160 candidates) drawn from topographic, multispectral, Synthetic Aperture Radar, vegetation, climate and urbanisation datasets.
Artificial Intelligence in Geosciences, Volume 6, Issue 2, Article 100141
Citations: 0
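The SHAP values mentioned in the abstract above are Shapley values applied to model predictions. As a rough illustration only, the sketch below computes exact Shapley values for a single sample by enumerating feature coalitions. The toy model, its three features (canopy height, building height, slope) and the all-zero baseline are hypothetical and are not taken from the paper, which evaluates 160 candidate variables and would need an approximate method instead of brute force.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size (excluding feature i)
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_error_model(f):
    """Hypothetical DEM-error model (metres); not the authors' model."""
    canopy, building, slope = f
    return 0.6 * canopy + 0.9 * building + 0.1 * canopy * slope


x = [20.0, 0.0, 2.0]          # sample: forested, no buildings, gentle slope
baseline = [0.0, 0.0, 0.0]    # assumed reference input
phi = shapley_values(toy_error_model, x, baseline)

for name, contribution in zip(["canopy", "building", "slope"], phi):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that gives SHAP its name.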
Generating high-resolution climate data in the Andes using artificial intelligence: A lightweight alternative to the WRF model
Pub Date : 2025-07-13 DOI: 10.1016/j.aiig.2025.100143
Christian Carhuancho , Edwin Villanueva , Christian Yarleque , Romel Erick Principe , Marcia Castromonte
In weather forecasting, generating atmospheric variables for regions with complex topography, such as the Andean regions with peaks reaching 6500 m above sea level, poses significant challenges. Traditional regional climate models often struggle to accurately represent the atmospheric behavior in such areas. Furthermore, the capability to produce high spatio-temporal resolution data (less than 27 km and hourly) is limited to a few institutions globally due to the substantial computational resources required. This study presents the results of atmospheric data generated using a new type of artificial intelligence (AI) model, aimed at reducing the computational cost of generating downscaled climate data with regional climate models like the Weather Research and Forecasting (WRF) model over the Andes. The WRF model was selected for this comparison due to its frequent use in simulating atmospheric variables in the Andes.
Our results demonstrate a higher downscaling performance for the four target weather variables studied (temperature, relative humidity, zonal and meridional wind) over coastal, mountain, and jungle regions. Moreover, this AI model offers several advantages, including lower computational costs compared to dynamic models like WRF and continuous improvement potential with additional training data.
Artificial Intelligence in Geosciences, Volume 6, Issue 2, Article 100143
Citations: 0
Machine learning assisted estimation of total solids content of drilling fluids
Pub Date : 2025-07-05 DOI: 10.1016/j.aiig.2025.100138
B.T. Gunel , Y.D. Pak , A.Ö. Herekeli , S. Gül , B. Kulga , E. Artun
Characterization and optimization of physical and chemical properties of drilling fluids are critical for the efficiency and success of drilling operations. In particular, maintaining the optimal levels of solids content is essential for achieving the most effective fluid performance. Proper management of solids content also reduces the risk of tool failures. Traditional solids content analysis methods, such as retort analysis, require substantial human intervention and time, which can lead to inaccuracies, time-management issues, and increased operational risks. In contrast to human-intensive methods, machine learning may offer a viable alternative for solids content estimation due to its pattern-recognition capability. In this study, a large set of laboratory reports of drilling-fluid analyses from 130 oil wells around the world were compiled to construct a comprehensive data set. The relationships among various rheological parameters were analyzed using statistical methods and machine learning algorithms. Several machine learning algorithms of diverse classes, namely linear (linear regression, ridge regression, and ElasticNet regression), kernel-based (support vector machine) and ensemble tree-based (gradient boosting, XGBoost, and random forests) algorithms, were trained and tuned to estimate solids content from other readily available drilling fluid properties. Input variables were kept consistent across all models for interpretation and comparison purposes. In the final stage, different evaluation metrics were employed to evaluate and compare the performance of different classes of machine learning models. Among all algorithms tested, the random forests algorithm was found to be the best predictive model, with consistently high accuracy. Further optimization of the random forests model resulted in a mean absolute percentage error (MAPE) of 3.9% and 9.6% and R² of 0.99 and 0.93 for the training and testing sets, respectively.
Analysis of residuals, their histograms and Q-Q normality plots showed Gaussian distributions with residuals that are scattered around a mean of zero within error ranges of ±1% and ±4%, for training and testing, respectively. The selected model was further validated by applying the rheological measurements from mud samples taken from an offshore well from the Gulf of Mexico. The model was able to estimate total solids content in those four mud samples with an average absolute error of 1.08% of total solids content. The model was then used to develop a web-based graphical-user-interface (GUI) application, which can be practically used at the rig site by engineers to optimize drilling fluid programs. The proposed model can complement automation workflows that are designed to measure fundamental rheological properties in real time during drilling operations. While a standard retort test takes approximately 2 hours at the rig site, such real-time estimation can help the rig crew optimize drilling fluids in a timely manner, saving 2920 man-hours per year for a single rig.
Artificial Intelligence in Geosciences, Volume 6, Issue 2, Article 100138
Citations: 0
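The drilling-fluid abstract above reports model quality as mean absolute percentage error (MAPE) and R². For readers less familiar with these metrics, the following minimal sketch shows how both are computed from paired measured and estimated values. The four solids-content values are made up for illustration and are not data from the study.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is zero."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical solids content (% by volume): retort measurement vs. model estimate
measured = [10.0, 20.0, 25.0, 40.0]
estimated = [11.0, 19.0, 25.0, 38.0]

print(f"MAPE = {mape(measured, estimated):.2f}%")
print(f"R^2  = {r_squared(measured, estimated):.4f}")
```

A gap between training and testing scores, such as the 3.9% versus 9.6% MAPE quoted in the abstract, is the usual signal to watch when comparing candidate models with these metrics.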
Improved estimation of two-phase capillary pressure with nuclear magnetic resonance measurements via machine learning
Pub Date : 2025-07-05 DOI: 10.1016/j.aiig.2025.100144
Oriyomi Raheem , Misael M. Morales , Wen Pan , Carlos Torres-Verdín
Capillary pressure plays a crucial role in determining the spatial distribution of oil and gas, particularly in medium-to-low permeability reservoirs, where it is closely linked to the rock's pore structure and wettability. In these environments, pore structure is the primary factor influencing capillary pressure, with different pore types affecting fluid transport through varying degrees of hydrocarbon saturation. One of the main challenges in characterizing pore structure is how to use data from core plugs to establish a relationship with microscopic pore and throat properties, enabling more accurate predictions of capillary pressure. While special core analysis laboratory experiments are effective, they are time-consuming and expensive. In contrast, nuclear magnetic resonance (NMR) measurements, which provide information on pore body size distribution, are faster and can be leveraged to estimate capillary pressure using machine learning algorithms. Recently, artificial intelligence methods have also been applied to capillary pressure prediction (Qi et al., 2024).
Currently, no readily applicable predictive model exists for estimating an entire capillary pressure curve directly from standard petrophysical logs and core data. Although pore-scale imaging and network modeling techniques can compute capillary pressure from micro-CT rock images (Øren and Bakke, 2003; Valvatne and Blunt, 2004), these approaches are time-consuming, limited to small sample volumes, and not yet practical for routine reservoir evaluation. In this study, we introduce rock classification techniques and implement a data-driven machine learning (ML) method to estimate saturation-dependent capillary pressure from core petrophysical properties. The new model integrates cumulative NMR data and densely resampled core measurements as training data, with prediction errors quantified throughout the process. To approach the common condition of sparsely sampled training data, we transformed the prediction problem into an overdetermined one by applying composite fitting to both capillary pressure and pore throat size distribution, and Gaussian cumulative distribution fitting to the NMR T₂ measurements, generating evenly sampled data points. Using these preprocessed input features, we performed classification based on the natural logarithm of the permeability-to-porosity ratio (ln(k/ϕ)) to cluster distinct rock types. For each rock class, we applied regression techniques, such as random forest (RF), k-nearest neighbors (k-NN), extreme gradient boosting (XGB), and artificial neural networks (ANN), to estimate the logarithm of capillary pressure. The methods were tested on blind core samples, and performance comparisons among different estimation methods were made based on the relative standard error of prediction. Results indicate that NMR data are sensitive to rock pore structure and significantly improve the prediction of capillary pressure and pore throat size distribution. Extreme gradient boosting and random forest models performed best, with average estimation errors of 5% and 10% for capillary pressure and pore throat size distribution, respectively. In contrast, prediction errors increased to 25% when NMR T₂ data were excluded as input features. The use of conventional Gaussian model fitting and higher-resolution resampling ensured that the training data covered a wide range of variability. Including NMR T₂ data as input features enhanced the model's ability to capture multimodal peaks in unconventional rocks and made the prediction problem overdetermined. Predicting a vector function from vector input features effectively reduced prediction errors. This interpretation workflow can be used to construct representative classification models and to estimate capillary pressure across a wide range of saturation.
Artificial Intelligence in Geosciences, Volume 6, Issue 2, Article 100144
Citations: 0
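The capillary-pressure abstract above groups core samples into rock classes using ln(k/ϕ) before fitting a separate regressor per class. The sketch below illustrates only that grouping step, using fixed cutoffs. The cutoff values, the sample permeabilities and porosities, and the choice of threshold binning (the abstract speaks of clustering, without giving the method) are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def log_k_over_phi(k_md, phi_frac):
    """Natural log of the permeability-to-porosity ratio (k in mD, phi as a fraction)."""
    return math.log(k_md / phi_frac)


def rock_class(k_md, phi_frac, cutoffs):
    """Return the index of the bin that ln(k/phi) falls into.

    `cutoffs` must be sorted ascending; class 0 is the tightest rock.
    """
    return bisect_right(cutoffs, log_k_over_phi(k_md, phi_frac))


# Hypothetical core plugs: (permeability in mD, porosity as a fraction)
samples = [(0.05, 0.08), (1.2, 0.12), (35.0, 0.18), (400.0, 0.25)]
cutoffs = [0.0, 3.0, 6.0]  # assumed class boundaries on ln(k/phi)

classes = [rock_class(k, phi, cutoffs) for k, phi in samples]
for (k, phi), c in zip(samples, classes):
    print(f"k={k} mD, phi={phi}: ln(k/phi)={log_k_over_phi(k, phi):.2f}, class {c}")
```

Each class would then receive its own regression model for the logarithm of capillary pressure, as the abstract describes.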
Deep learning approaches for estimating maximum wall deflection in excavations with inconsistent clay stratigraphy
Pub Date : 2025-07-04 DOI: 10.1016/j.aiig.2025.100140
Vinh V. Le , HongGiang Nguyen , Nguyen Huu Ngu
This paper presents a deep learning architecture combined with exploratory data analysis to estimate maximum wall deflection in deep excavations. Six major geotechnical parameters were studied. Statistical methods, such as pair plots and Pearson correlation, highlighted excavation depth (correlation coefficient = 0.82) as the most significant factor. For prediction, five deep learning models (CNN, LSTM, BiLSTM, CNN-LSTM, and CNN-BiLSTM) were built. The CNN-BiLSTM model excelled in training performance (R² = 0.98, RMSE = 0.02), while BiLSTM reached superior testing results (R² = 0.85, RMSE = 0.06), suggesting greater generalization ability. Based on the feature importance analysis from model weights, excavation depth, stiffness ratio, and bracing spacing were ranked as the highest contributors. Residual plots showed no prediction bias, and Taylor diagrams showed high agreement between modelled and measured values (correlation coefficient 0.92). The integrated techniques proved reliable for predicting wall deformation. This approach facilitates more accurate and efficient geotechnical design and provides engineers with improved tools for risk evaluation and decision-making in deep excavation projects.
Artificial Intelligence in Geosciences, Volume 6, Issue 2, Article 100140
Citations: 0
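The wall-deflection abstract above uses Pearson correlation to screen input parameters before model building. The following sketch shows one way such a screening could look: compute the coefficient between each candidate feature and the target, then rank features by absolute correlation. The feature names and all numeric values are hypothetical and chosen only to make the example run; they are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical measurements for five excavation sections
max_deflection_mm = [12.0, 20.0, 27.0, 41.0, 48.0]
features = {
    "excavation_depth_m": [4.0, 8.0, 12.0, 16.0, 20.0],
    "bracing_spacing_m": [3.0, 2.5, 3.5, 3.0, 4.0],
}

# Rank candidate features by the strength of their linear relationship to the target
ranking = sorted(
    features,
    key=lambda name: abs(pearson_r(features[name], max_deflection_mm)),
    reverse=True,
)
for name in ranking:
    print(f"{name}: r = {pearson_r(features[name], max_deflection_mm):.2f}")
```

Pearson correlation only captures linear association, which is one reason a study like this would pair it with pair plots and model-based feature importance.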
Cellular automata models for simulation and prediction of urban land use change: Development and prospects
Pub Date : 2025-06-30 DOI: 10.1016/j.aiig.2025.100142
Baoling Gui, Anshuman Bhardwaj, Lydia Sam
Rapid urbanization and land-use changes are placing immense pressure on resources, infrastructure, and environmental sustainability. To address these, accurate urban simulation models are essential for sustainable development and governance. Among them, Cellular Automata (CA) models have become key tools for predicting urban expansion, optimizing land-use planning, and supporting data-driven decision-making. This review provides a comprehensive examination of the development of urban cellular automata (UCA) models, presenting a new framework to enhance individual UCA sub-modules within the context of emerging technologies, sustainable environments, and public governance. By addressing gaps in prior UCA modelling reviews—particularly in the integration and optimization of UCA sub-module technologies—this framework is designed to simplify UCA model understanding and development. We systematically review pioneering case studies, deconstruct current UCA operational processes, and explore modern technologies, such as big data and artificial intelligence, to optimize these sub-modules further. We discuss current limitations within UCA models and propose future pathways, emphasizing the necessity of comprehensive analyses for effective UCA simulations. Proposed solutions include strengthening our understanding of urban growth mechanisms, examining spatial positioning and temporal evolution dynamics, and enhancing urban geographic simulations with deep learning techniques to support sustainable transitions in public governance. These improvements offer data-driven decision support for environmental management, advancing policies that foster sustainable urban development.
{"title":"Cellular automata models for simulation and prediction of urban land use change: Development and prospects","authors":"Baoling Gui,&nbsp;Anshuman Bhardwaj,&nbsp;Lydia Sam","doi":"10.1016/j.aiig.2025.100142","DOIUrl":"10.1016/j.aiig.2025.100142","url":null,"abstract":"<div><div>Rapid urbanization and land-use changes are placing immense pressure on resources, infrastructure, and environmental sustainability. To address these, accurate urban simulation models are essential for sustainable development and governance. Among them, Cellular Automata (CA) models have become key tools for predicting urban expansion, optimizing land-use planning, and supporting data-driven decision-making. This review provides a comprehensive examination of the development of urban cellular automata (UCA) models, presenting a new framework to enhance individual UCA sub-modules within the context of emerging technologies, sustainable environments, and public governance. By addressing gaps in prior UCA modelling reviews—particularly in the integration and optimization of UCA sub-module technologies—this framework is designed to simplify UCA model understanding and development. We systematically review pioneering case studies, deconstruct current UCA operational processes, and explore modern technologies, such as big data and artificial intelligence, to optimize these sub-modules further. We discuss current limitations within UCA models and propose future pathways, emphasizing the necessity of comprehensive analyses for effective UCA simulations. Proposed solutions include strengthening our understanding of urban growth mechanisms, examining spatial positioning and temporal evolution dynamics, and enhancing urban geographic simulations with deep learning techniques to support sustainable transitions in public governance. 
These improvements offer data-driven decision support for environmental management, advancing policies that foster sustainable urban development.</div></div>","PeriodicalId":100124,"journal":{"name":"Artificial Intelligence in Geosciences","volume":"6 2","pages":"Article 100142"},"PeriodicalIF":0.0,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144518220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Comparison of processing speed of NRS-ANN hybrid and ANN models for oil production rate estimation of reservoir under waterflooding
Pub Date : 2025-06-24 DOI: 10.1016/j.aiig.2025.100139
Paul Theophily Nsulangi, Werneld Egno Ngongi, John Mbogo Kafuku, Guan Zhen Liang
This study compared the predictive performance and processing speed of an artificial neural network (ANN) model and a hybrid model combining numerical reservoir simulation with an ANN (NRS-ANN) in estimating the oil production rate of the ZH86 reservoir block under waterflood recovery. The ANN models took four historical variables as inputs: reservoir pressure, reservoir pore volume containing hydrocarbons, reservoir pore volume containing water, and reservoir water injection rate. The NRS-ANN hybrid models used 314 data sets of the same four variables extracted from the NRS model. The output of the models was the historical oil production rate (HOPR, in m³/day) recorded from the ZH86 reservoir block. Models were developed in MATLAB R2021a; 25 models were trained under three replicate conditions (2, 4 and 6 replicates), each for 1000 epochs. Across all 25 models, the ANN outperformed the NRS-ANN in both processing speed and predictive performance. The ANN models achieved an average R² of 0.8433 and an average MAE of 8.0964 m³/day, while the NRS-ANN hybrid models achieved 0.7828 and 8.2484 m³/day, respectively. The ANN models ran at 49, 32 and 24 epochs/sec after 2, 4 and 6 replicates, respectively, whereas the NRS-ANN hybrid models achieved lower average speeds of 45, 23 and 20 epochs/sec. The optimal ANN model also outperformed the optimal NRS-ANN hybrid model in both speed and accuracy: it reached 336.44 epochs/sec versus 52.16 epochs/sec, and on the validation dataset it achieved lower RMSE and MAE values (7.9291 m³/day and 5.3855 m³/day) than the hybrid model (13.6821 m³/day and 9.2047 m³/day). It also achieved consistently higher R² values of 0.9472, 0.9284 and 0.9316 on the training, test and validation datasets, compared with 0.8030, 0.8622 and 0.7776 for the optimal NRS-ANN hybrid model. The study concluded that ANN models are the more effective and reliable tool, as they balance processing speed and accuracy in estimating the oil production rate of the ZH86 reservoir block under the waterflooding recovery method.
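For readers unfamiliar with the reported metrics, the sketch below shows how MAE, RMSE and R² are conventionally computed from observed and predicted rates. It uses the standard textbook definitions; the function names and the sample values are made up for illustration and are not data from the study, which was carried out in MATLAB.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same units as the target (e.g. m3/day)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error; penalises large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical observed and predicted oil rates, for illustration only.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print(mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

MAE and RMSE carry the units of the target, which is why the abstract reports them in m³/day, while R² is dimensionless.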
Citations: 0
Interpretable machine learning models for evaluating strength of ternary geopolymers
Pub Date : 2025-06-23 DOI: 10.1016/j.aiig.2025.100128
Junfei Zhang, Huisheng Cheng, Ninghui Sun, Zehui Huo, Junlin Chen
Ternary geopolymers incorporating multiple solid wastes such as steel slag (SS), fly ash (FA), and granulated blast furnace slag (GBFS) are considered environmentally friendly and exhibit enhanced performance. However, the mechanisms governing strength development and the design of optimal mixtures are not fully understood due to the complexity of their components. This study presents the development of four machine learning models—Artificial Neural Network (ANN), Support Vector Regression (SVR), Extremely Randomized Tree (ERT), and Gradient Boosting Regression (GBR)—for predicting the unconfined compressive strength (UCS) of ternary geopolymers. The models were trained using a dataset comprising 120 mixtures derived from laboratory tests. Shapley Additive Explanations analysis was employed to interpret the machine learning models and elucidate the influence of different components on the properties of ternary geopolymers. The results indicate that ANN exhibits the highest predictive accuracy for UCS (R = 0.949). Furthermore, the UCS of ternary geopolymers is most sensitive to the content of GBFS. This study provides valuable insights for optimizing the mix proportions in ternary blended geopolymer mixtures.
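The Shapley attribution behind this kind of analysis can be illustrated with an exact computation on a tiny example. The sketch below enumerates every feature ordering and averages each feature's marginal contribution. The `toy_strength` function, its coefficients, and the all-zero baseline are hypothetical and are not taken from the study; practical SHAP tooling approximates these values rather than enumerating orderings, and treating an "absent" feature as its baseline value is itself a modelling assumption.

```python
from itertools import permutations
from typing import Callable, Dict


def shapley_values(
    model: Callable[[Dict[str, float]], float],
    x: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)       # start with every feature "absent"
        previous_output = model(current)
        for name in order:
            current[name] = x[name]    # switch this feature to its real value
            output = model(current)
            totals[name] += output - previous_output
            previous_output = output
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_strength(mix: Dict[str, float]) -> float:
    """Hypothetical strength model with an FA x GBFS interaction (illustration only)."""
    return (5.0 + 0.1 * mix["SS"] + 0.2 * mix["FA"] + 0.5 * mix["GBFS"]
            + 0.01 * mix["FA"] * mix["GBFS"])


if __name__ == "__main__":
    mixture = {"SS": 20.0, "FA": 30.0, "GBFS": 50.0}
    zero_mix = {"SS": 0.0, "FA": 0.0, "GBFS": 0.0}
    print(shapley_values(toy_strength, mixture, zero_mix))
```

The attributions always sum to the difference between the prediction for the mixture and the prediction for the baseline, and the interaction term is split equally between the two features involved. Enumeration costs grow factorially with the number of features, so this approach only suits very small examples.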
Citations: 0
On the application of machine learning algorithms in predicting the permeability of oil reservoirs
Pub Date : 2025-06-03 DOI: 10.1016/j.aiig.2025.100126
Andrey V. Soromotin, Dmitriy A. Martyushev, João Luiz Junho Pereira
Permeability is one of the main characteristics of an oil reservoir. It affects potential oil production, well-completion technologies, the choice of enhanced oil recovery methods, and more. Existing methods for determining and predicting reservoir permeability have serious shortcomings. This article aims to refine and adapt machine learning techniques, using historical data from hydrocarbon field development, to evaluate and predict parameters such as the skin factor and the permeability of the remote reservoir zone. It analyzes data from 4045 well tests in oil fields in Perm Krai (Russia) and evaluates the performance of different machine learning (ML) algorithms in predicting well permeability. Three real datasets are used to train more than 20 ML regressors, whose hyperparameters are tuned with Bayesian Optimization (BO). The resulting models show significantly better predictive performance than traditional methods, and the best model found is one that had not previously been applied to this problem. The permeability prediction model achieves an adjusted R² of 0.799. A promising direction is to combine machine learning with pressure recovery curves to estimate permeability in real time. The work is distinctive in predicting pressure recovery curves during well operation, without stopping wells, which provides primary data for interpretation. This can improve the accuracy of permeability forecasts and reduce the well downtime associated with traditional well-testing procedures. The proposed methods pave the way for more efficient and cost-effective reservoir development, ultimately supporting better decision-making and resource optimization in oil production.
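For reference, the adjusted coefficient of determination is conventionally defined as below. This is the standard textbook form, where $n$ is the number of samples and $p$ the number of predictors; the abstract does not state which $n$ and $p$ lie behind the reported 0.799, so the formula is given only as the usual definition.

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Unlike plain $R^2$, the adjusted value decreases when a predictor is added without a sufficient gain in fit, which makes it a fairer basis for comparing regressors with different numbers of inputs.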
Citations: 0
A staged deep learning approach to spatial refinement in 3D temporal atmospheric transport
Pub Date : 2025-06-01 DOI: 10.1016/j.aiig.2025.100120
M. Giselle Fernández-Godino, Wai Tong Chung, Akshay A. Gowardhan, Matthias Ihme, Qingkai Kong, Donald D. Lucas, Stephen C. Myers
High-resolution spatiotemporal simulations effectively capture the complexities of atmospheric plume dispersion in complex terrain. However, their high computational cost makes them impractical for applications requiring rapid responses or iterative processes, such as optimization, uncertainty quantification, or inverse modeling. To address this challenge, this work introduces the Dual-Stage Temporal Three-dimensional UNet Super-resolution (DST3D-UNet-SR) model, a highly efficient deep learning model for plume dispersion predictions. DST3D-UNet-SR is composed of two sequential modules: the temporal module (TM), which predicts the transient evolution of a plume in complex terrain from low-resolution temporal data, and the spatial refinement module (SRM), which subsequently enhances the spatial resolution of the TM predictions. We train DST3D-UNet-SR using a comprehensive dataset derived from high-resolution large eddy simulations (LES) of plume transport. We propose the DST3D-UNet-SR model to significantly accelerate LES of three-dimensional (3D) plume dispersion by three orders of magnitude. Additionally, the model demonstrates the ability to dynamically adapt to evolving conditions through the incorporation of new observational data, substantially improving prediction accuracy in high-concentration regions near the source.
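The two-stage data flow described above (advance the plume in time at low resolution, then refine it spatially) can be sketched with simple stand-ins. Everything in the block is a placeholder assumption: the actual modules are trained 3D networks, whereas here the temporal step is a one-cell downwind shift and the refinement is nearest-neighbour upsampling on a 2D grid, chosen only so that the composition is easy to follow.

```python
from typing import List

Field = List[List[float]]  # 2D concentration grid (the paper works in 3D)


def temporal_module(coarse: Field) -> Field:
    """Placeholder for the learned temporal module.

    Shifts the plume one cell to the right, mimicking a steady wind.
    """
    return [[0.0] + row[:-1] for row in coarse]


def spatial_refinement_module(coarse: Field, factor: int = 2) -> Field:
    """Placeholder for the learned refinement module: nearest-neighbour upsampling."""
    fine: Field = []
    for row in coarse:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


def predict_high_res(coarse: Field, steps: int, factor: int = 2) -> Field:
    """Run the temporal stage for `steps` steps, then refine the result once."""
    state = coarse
    for _ in range(steps):
        state = temporal_module(state)
    return spatial_refinement_module(state, factor)


if __name__ == "__main__":
    source = [[1.0, 0.0], [0.0, 0.0]]
    for line in predict_high_res(source, steps=1):
        print(line)
```

Keeping the time stepping at low resolution and refining only afterwards means the costly high-resolution grid is touched once per requested output rather than at every step.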
Citations: 0