The current utility of mud gas data is typically limited to geological and petrophysical correlation, formation evaluation, and fluid typing. A critical and comprehensive review of the literature revealed that mud gas data is abundantly acquired during drilling but not sufficiently utilized in real time. There is a need to leverage current advances in machine learning technology, and the race towards the digital transformation of the petroleum industry, to create new opportunities for more extensive use of mud gas data. Now that data is the new “oil” or “gold”, the rich and abundant mud gas data could be explored for real-time applications. Such new possibilities can add more value to the reservoir characterization workflow ahead of geophysical logging, geological core data analysis, and well testing. Achieving this will facilitate early decision-making, improve safety, reduce nonproductive time, and ultimately accelerate the digital transformation of the petroleum industry. We conclude by identifying possible future directions for maximizing the utility of mud gas data through real-time and more advanced applications.
Title: Contributions of machine learning to quantitative and real-time mud gas data analysis: A critical review
Authors: Fatai Anifowose, Mokhles Mezghani, Saleh Badawood, Javed Ismail
Applied Computing and Geosciences, vol. 16, Article 100095. Pub Date: 2022-12-01. DOI: 10.1016/j.acags.2022.100095
Pub Date: 2022-12-01 DOI: 10.1016/j.acags.2022.100094
Timothy C.C. Lui, Daniel D. Gregory, Marek Anderson, Well-Shen Lee, Sharon A. Cowling
In this study we compared various machine learning techniques that used soil geochemistry to aid in geologic mapping. We tested six different sampling methods (undersample, oversample, Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), SMOTE and Edited Nearest Neighbor (SMOTEENN), and SMOTE and Tomek links (SMOTETomek)). SMOTE performed best with ADASYN and SMOTETomek having slightly lower effectiveness. Nine machine learning algorithms (naïve Bayes, logistic regression, quadratic discriminant analysis, nearest neighbors, radial basis function support-vector machine, artificial neural network, random forest, AdaBoost classifier, and gradient boosting classifier) were compared and AdaBoost classifiers and gradient boosting classifiers were found to be most effective. Finally, we experimented with multiple classifier systems (MCS) testing different combinations of algorithms and various combinatorial functions. It was found that MCS can outperform individual models, and the best MCS combined nearest neighbors, radial basis function support-vector machine, artificial neural network, random forest, AdaBoost classifiers, and gradient boosting classifier, then applied a logistic regression to the probabilities output by the models. Ultimately, we created a tool that is able to adequately predict underlying geology in the study area using soil geochemistry.
Title: Applying machine learning methods to predict geology using soil sample geochemistry
Applied Computing and Geosciences, vol. 16, Article 100094.
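The oversampling step named in the abstract above (SMOTE) can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the core idea of creating a synthetic minority-class sample on the line segment between a minority point and one of its nearest minority neighbours. The function name `smote_like_samples`, the parameter defaults, and the toy data are all hypothetical; in practice a maintained library such as imbalanced-learn would normally be used instead.

```python
import random


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        u = rng.random()
        synthetic.append([a + u * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Made-up two-feature samples standing in for an under-represented rock unit.
minority_class = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_like_samples(minority_class, n_new=6)
```

Because every synthetic point is an interpolation between two real minority samples, the new points always stay inside the range already spanned by that class.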
Pub Date: 2022-12-01 DOI: 10.1016/j.acags.2022.100100
Solomon Asante-Okyere, Chuanbo Shen, Harrison Osei
Lithology identification is a fundamental activity in oil and gas exploration. The application of artificial intelligence (AI) is currently being adopted as a state-of-the-art means of automating lithology identification. One aspect of this AI approach is the application of population search algorithms to optimise hyperparameters for enhanced prediction performance. For the first time, Bayesian optimization is deployed to determine the optimal learning parameters for more accurate tree and tree ensemble lithology classifiers. The aim is to rely on the ability of Bayesian optimization to consider previous classification results to improve the output of decision and ensemble tree lithology models using well logs as inputs. The proposed Bayesian optimised decision tree (BODT) generated the best classification accuracy of 89.8%, compared to 86.9%, 83.3% and 81.2% for fine, medium and coarse trees. For the ensembled trees, the Bayesian optimised AdaBoost (BO-AdaBoost) classifier generated the highest prediction accuracy of 94.2%, while the Bayesian optimised Bagged (BO-Bagged) and Bayesian optimised RUSBoost (BO-RUSBoost) classifiers had lower accuracy rates of 94.0% and 77.1%, respectively. Additionally, the Bayesian optimised classifiers offered higher reliability when compared with particle swarm optimization-based artificial neural networks (PSO-ANN). Hence, incorporating Bayesian optimization as a hyperparameter search algorithm will improve lithofacies recognition, leading to a higher accuracy rate and thereby providing an improved alternative for intelligent lithology identification.
Title: Enhanced machine learning tree classifiers for lithology identification using Bayesian optimization
Applied Computing and Geosciences, vol. 16, Article 100100.
Microfossil fish teeth, known as ichthyoliths, provide a key constraint on the depositional age and environment of deep-sea sediments, especially pelagic clays where siliceous and calcareous microfossils are rarely observed. However, traditional methods for the observation of ichthyoliths require considerable time and manual labor, which can hinder their wider application. In this study, we constructed a system to automatically detect ichthyoliths in microscopic images by combining two open source deep learning models. First, the regions for ichthyoliths within the microscopic images are predicted by the instance segmentation model Mask R–CNN. All the detected regions are then re-classified using the image classification model EfficientNet-V2 to determine the classes more accurately. Compared with only using the Mask R–CNN model, the combined system offers significantly higher performance (89.0% precision, 78.6% recall, and an F1 score of 83.5%), demonstrating the utility of the system. Our system can also predict the lengths of the teeth that have been detected, with more than 90% of the predicted lengths being within ±20% of measured length. This system provides a novel, automated, and reliable approach for the detection and length measurement of ichthyoliths from microscope images that can be applied in a range of paleoceanographic and paleoecological contexts.
Title: Automated detection of microfossil fish teeth from slide images using combined deep learning models
Authors: Kazuhide Mimura, Shugo Minabe, Kentaro Nakamura, Kazutaka Yasukawa, Junichiro Ohta, Yasuhiro Kato
Applied Computing and Geosciences, vol. 16, Article 100092. Pub Date: 2022-12-01. DOI: 10.1016/j.acags.2022.100092
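As a quick check on the detection metrics quoted in the abstract above, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported 89.0% precision and 78.6% recall; the helper name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.890, 0.786)
print(f"F1 = {f1:.1%}")  # → F1 = 83.5%
```

The result agrees with the reported F1 score of 83.5%.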
Pub Date: 2022-09-01 DOI: 10.1016/j.acags.2022.100090
McLean Trott, Matthew Leybourne, Lindsay Hall, Daniel Layton-Matthews
Systematic manual and algorithmic classification workflows to characterize rock types are increasingly applied in the mineral exploration and mining industry, leveraging large systematically collected datasets. The aim of these workflows is robust and repeatable classification to aid more traditional visual logging practices. This study uses random forest algorithms to examine the impacts of integrating distinct datasets with complementary characteristics: chemistry to enable compositional distinctions, and photography to enable textural distinctions. We use a random forest classifier to examine the accuracy metrics of models producing rock type classifications using these two data types independently and integrated together. Prediction accuracy, measured using 10-fold cross validation, was 87% for geochemical-only inputs, 85% for photographic-only inputs, and 90% for mixed inputs from both datasets. A mining and exploration project in the Late Miocene to early Pliocene porphyry belt in Chile is the site of this case study, where datasets were systematically acquired using in-field methods on historical drill-cores. Results indicate that classification of lithology is improved by integration of photography-based and composition-based feature inputs. We infer that the benefits of integration would increase in proportion with increasing compositional similarity between rock types. This approach might also be applied to similar geological problems, such as alteration or metallurgical classifications, and with somewhat distinct datatypes, such as geochemical interval data and photographic metric extraction from coincident intervals in core photos.
Title: Random forest rock type classification with integration of geochemical and photographic data
Applied Computing and Geosciences, vol. 15, Article 100090.
Pub Date: 2022-09-01 DOI: 10.1016/j.acags.2022.100089
Sait I. Ozkaya, M.M. Al-Fahmi
Conductive fractures in petroleum reservoirs may be totally isolated or fully interconnected. There is an intermediate state between the two extremes. Partially fractured reservoirs include finite fracture networks (FFN), which are bundles of interconnected fractures embedded in a sea of isolated fractures. Devising measures for the sizes of FFNs is crucial in estimating critical engineering aspects such as productivity index, production decline rate and expected ultimate recovery of wells, especially in reservoirs with low matrix porosity and permeability. Here, we present results of a statistical evaluation of FFN size in relation to fracture connectivity, which is in essence the number of fracture intersections per fracture. The analysis is based on a large number of stochastic 2-dimensional (2D) Poisson models of sub-vertical layer-bound fractures. Fracture length in the models has a lognormal or truncated power distribution and fracture strike has a circular normal distribution. The models may have single or multiple fracture sets and various truncation modes with different probabilities.
The analysis shows that the number of fracture intersections per fracture can be accurately estimated by a fracture connectivity index, which is defined as the product of fracture scan-line density, average fracture length and the sine of the strike standard deviation. The statistically significant finding of this study is that the number of fractures within an FFN is an exponential function of the fracture connectivity index. All three fracture properties defining the index can be measured on borehole image logs. Hence it should be possible to estimate fracture connectivity and corresponding FFN size from borehole image data. The analysis pertains to 2D fracture connectivity, which is always the lower bound of the number of fracture intersections in 3 dimensions. Therefore the exponential relationships must also hold for actual 3-dimensional layer-bound fractures with variable dips.
Title: Estimating size of finite fracture networks in layered reservoirs
Applied Computing and Geosciences, vol. 15, Article 100089.
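The fracture connectivity index defined in the abstract above is a simple product, sketched below. The units are assumptions made for this illustration (scan-line density in fractures per metre, average length in metres, strike standard deviation in degrees), and the input values are invented. The abstract does not give the coefficients of the exponential relationship between the index and FFN size, so that step is deliberately left out.

```python
import math


def fracture_connectivity_index(scanline_density, mean_length, strike_std_deg):
    """Scan-line density x average fracture length x sin(strike standard deviation)."""
    return scanline_density * mean_length * math.sin(math.radians(strike_std_deg))


# Made-up example: 0.5 fractures/m, 10 m average length, 30 degree strike spread.
index = fracture_connectivity_index(0.5, 10.0, 30.0)
print(round(index, 3))  # → 2.5
```

A wider spread in strike (up to 90 degrees) raises the sine term, which reflects the intuition that fractures with more varied orientations are more likely to intersect.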
Well logging is essential in studies which require an understanding of the subsurface geology and depositional conditions. Unfortunately, the measurements are rarely complete and missing data intervals are common due to operational issues or malfunction of the logging equipment. Therefore the imputation of missing data from down-hole well logs is a common problem in subsurface workflows. Recently, many different approaches have been used for imputation but they are often manual or data set specific. Machine learning has reignited interest in this field with promises of a more generic and simpler approach. We explore whether the chaining of machine learning for multi-log imputation improves results by overcoming disparities in the patterns of missing data. Our research interest is primarily petroleum geophysics and therefore this study focuses on the elastic logs of compressional (DT) and shear (DTS) sonic along with the bulk density (RHOB). However, the method may be applied to all sufficiently large well log data sets in any industry.
Title: Multivariate imputation via chained equations for elastic well log imputation and prediction
Authors: Antony Hallam, Debajoy Mukherjee, Romain Chassagne
Applied Computing and Geosciences, vol. 14, Article 100083. Pub Date: 2022-06-01. DOI: 10.1016/j.acags.2022.100083
Pub Date: 2022-06-01 DOI: 10.1016/j.acags.2022.100085
Sarouyeh Khoshkholgh, Ivanka Orozova-Bekkevold, Klaus Mosegaard
When hydrocarbon reservoirs are used as a CO2 storage facility, accurate uncertainty analysis and risk assessment are essential. An integration of information from geological knowledge, geological modelling, well log data, and geophysical data provides the basis for this analysis. Modelling the time development of stress/strain changes in the overburden provides prior knowledge about fault and fracture probability in the reservoir, which in turn is used in seismic inversion to constrain models of faulting and fracturing. One main problem in solving large-scale seismic inverse problems is high computational cost and inefficiency. We use a newly introduced methodology, Informed-proposal Monte Carlo (IPMC), to deal with this problem, and to carry out a conceptual study based on real data from the Danish North Sea. The result outlines a methodology for evaluating the risk of sub-seismic faulting in the overburden that could compromise the CO2 storage of the reservoir.
Title: Evolution of the stress and strain field in the Tyra field during the Post-Chalk Deposition and seismic inversion of fault zone using informed-proposal Monte Carlo
Applied Computing and Geosciences, vol. 14, Article 100085.
Pub Date: 2022-06-01 DOI: 10.1016/j.acags.2022.100086
Ikechukwu Kalu, Christopher E. Ndehedehe, Onuwa Okwuashi, Aniekan E. Eyoh
This study evaluates the steepest descent algorithm as a tool for root mean square (RMS) error optimization in geodetic reference systems to improve the integrity of transformation. With an initial RMS error estimate of 0.01830 m, the negative gradient direction was applied through steepest descent optimization, leading to a final RMS error estimate of 0.00051 m. Using the exact line search mode with a one-point step size of 0.1, we achieved the minimum values in fewer than sixty iterations, despite the slow convergence rate of the steepest descent algorithm.
{"title":"Estimating the seven transformational parameters between two geodetic datums using the steepest descent algorithm of machine learning","authors":"Ikechukwu Kalu , Christopher E. Ndehedehe , Onuwa Okwuashi , Aniekan E. Eyoh","doi":"10.1016/j.acags.2022.100086","DOIUrl":"https://doi.org/10.1016/j.acags.2022.100086","url":null,"abstract":"<div><p>This study evaluates the steepest descent algorithm as a tool for root mean square (RMS) error optimization in geodetic reference systems to improve the integrity of transformation. With an initial RMS error estimate of 0.01830m, the negative gradient direction was applied through the steepest optimization leading to a final RMS error estimate of 0.00051m. Using the exact line search mode with a one-point step size of 0.1, we achieved the minimum values in less than sixty iterations, regardless of the slow convergence rate of the steepest descent algorithm.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"14 ","pages":"Article 100086"},"PeriodicalIF":3.4,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590197422000088/pdfft?md5=eb92d5772ded0edd7f0a090531d968d3&pid=1-s2.0-S2590197422000088-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"137273704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-06-01 DOI: 10.1016/j.acags.2022.100084
Christopher J.M. Lawley, Stefania Raimondo, Tianyi Chen, Lindsay Brin, Anton Zakharov, Daniel Kur, Jenny Hui, Glen Newton, Sari L. Burgoyne, Geneviève Marquis
Geoscientists use observations and descriptions of the rock record to study the origins and history of our planet, which has resulted in a vast volume of scientific literature. Recent progress in natural language processing (NLP) has the potential to parse through and extract knowledge from unstructured text, but there has, so far, been only limited work on the concepts and vocabularies that are specific to geoscience. Herein we harvest and process public geoscientific reports (i.e., Canadian federal and provincial geological survey publications databases) and a subset of open access and peer-reviewed publications to train new, geoscience-specific language models to address that knowledge gap. Language model performance is validated using a series of new geoscience-specific NLP tasks (i.e., analogies, clustering, relatedness, and nearest neighbour analysis) that were developed as part of the current study. The raw and processed national geological survey corpora, language models, and evaluation criteria are all made public for the first time. We demonstrate that non-contextual (i.e., Global Vectors for Word Representation, GloVe) and contextual (i.e., Bidirectional Encoder Representations from Transformers, BERT) language models updated using the geoscientific corpora outperform the generic versions of these models for each of the evaluation criteria. Principal component analysis further demonstrates that word embeddings trained on geoscientific text capture meaningful semantic relationships, including rock classifications, mineral properties and compositions, and the geochemical behaviour of elements. Semantic relationships that emerge from the vector space have the potential to unlock latent knowledge within unstructured text, and perhaps more importantly, also highlight the potential for other downstream geoscience-focused NLP tasks (e.g., keyword prediction, document similarity, recommender systems, rock and mineral classification).
{"title":"Geoscience language models and their intrinsic evaluation","authors":"Christopher J.M. Lawley , Stefania Raimondo , Tianyi Chen , Lindsay Brin , Anton Zakharov , Daniel Kur , Jenny Hui , Glen Newton , Sari L. Burgoyne , Geneviève Marquis","doi":"10.1016/j.acags.2022.100084","DOIUrl":"10.1016/j.acags.2022.100084","url":null,"abstract":"<div><p>Geoscientists use observations and descriptions of the rock record to study the origins and history of our planet, which has resulted in a vast volume of scientific literature. Recent progress in natural language processing (NLP) has the potential to parse through and extract knowledge from unstructured text, but there has, so far, been only limited work on the concepts and vocabularies that are specific to geoscience. Herein we harvest and process public geoscientific reports (i.e., Canadian federal and provincial geological survey publications databases) and a subset of open access and peer-reviewed publications to train new, geoscience-specific language models to address that knowledge gap. Language model performance is validated using a series of new geoscience-specific NLP tasks (i.e., analogies, clustering, relatedness, and nearest neighbour analysis) that were developed as part of the current study. The raw and processed national geological survey corpora, language models, and evaluation criteria are all made public for the first time. We demonstrate that non-contextual (i.e., Global Vectors for Word Representation, GloVe) and contextual (i.e., Bidirectional Encoder Representations from Transformers, BERT) language models updated using the geoscientific corpora outperform the generic versions of these models for each of the evaluation criteria. Principal component analysis further demonstrates that word embeddings trained on geoscientific text capture meaningful semantic relationships, including rock classifications, mineral properties and compositions, and the geochemical behaviour of elements. 
Semantic relationships that emerge from the vector space have the potential to unlock latent knowledge within unstructured text, and perhaps more importantly, also highlight the potential for other downstream geoscience-focused NLP tasks (e.g., keyword prediction, document similarity, recommender systems, rock and mineral classification).</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"14 ","pages":"Article 100084"},"PeriodicalIF":3.4,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590197422000064/pdfft?md5=fd1994f61f32c322ea2f0108221236be&pid=1-s2.0-S2590197422000064-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42112365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
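The analogy and nearest neighbour tasks named in the abstract above rest on vector arithmetic and cosine similarity in the embedding space. The following is a minimal sketch of that mechanism only. The four-dimensional vectors are invented for illustration and do not come from the GloVe or BERT models of the study, and the function names are hypothetical.

```python
import math

# Hypothetical toy embeddings; the axes loosely stand for
# [felsic, mafic, is_rock, is_property]. Real embeddings are learned
# from text and have hundreds of dimensions.
EMBEDDINGS = {
    "granite": [1.0, 0.0, 1.0, 0.0],
    "basalt":  [0.0, 1.0, 1.0, 0.0],
    "felsic":  [1.0, 0.0, 0.0, 1.0],
    "mafic":   [0.0, 1.0, 0.0, 1.0],
    "olivine": [0.0, 0.9, 0.1, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbours(query, exclude=()):
    """Rank vocabulary words by cosine similarity to a query vector."""
    scored = [
        (word, cosine(query, vec))
        for word, vec in EMBEDDINGS.items()
        if word not in exclude
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' using the offset b - a + c."""
    query = [
        vb - va + vc
        for va, vb, vc in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    ]
    # The three input words are excluded, as is common practice.
    return nearest_neighbours(query, exclude=(a, b, c))[0][0]

answer = analogy("granite", "felsic", "basalt")
print("granite : felsic :: basalt :", answer)
```

An intrinsic evaluation of this kind scores a model by how often the top-ranked word matches an expert-supplied answer, which is why domain-specific analogy sets matter for geoscience vocabularies.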