Syed M Amir, Mohammad Rasheed Khan, Ekarit Panacharoensawad, Serhii Kryvenko
Title: Integration of Petrophysical Log Data with Computational Intelligence for the Development of a Lithology Predictor
DOI: 10.2118/202047-ms (https://doi.org/10.2118/202047-ms)
Published in: Day 2 Tue, October 27, 2020 (publication date: 2020-10-26)
Citations: 5
Abstract
Incorrect manual interpretation of log data, whether of the formation type or of other important information, can be catastrophic for the operating company. We address the interpretation of formation type from log data with Machine Learning (ML), a branch of Artificial Intelligence, and have developed a program able to accurately predict the type of formation.
Following the conventional ML practice of splitting the data into training, validation, and test sets, we fitted six different ML algorithms to the training data and then assessed their prediction accuracy with cross-validation scores and cross-validation predictions, which test the performance of the classifiers (ML algorithms) on the validation set. The three best-performing classifiers were selected and further improved by searching for each classifier's best hyperparameters. These improved classifiers were then tested on unseen data to produce a comparative analysis.
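The workflow described above (hold out a test set, compare candidate classifiers by cross-validation, search hyperparameters, then score the winner on unseen data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the two-feature synthetic data, the formation labels, the nearest-centroid and k-nearest-neighbour classifiers, and every function name are hypothetical stand-ins, and only the Python standard library is used. The abstract does not say which six algorithms or which library were used.

```python
import random
from collections import Counter


def make_data(n_per_class=60, seed=0):
    """Synthetic stand-in for well-log features: two made-up readings per sample."""
    rng = random.Random(seed)
    centers = {"sandstone": (2.0, 2.0), "limestone": (6.0, 2.5), "dolomite": (4.0, 6.0)}
    rows = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            rows.append(((rng.gauss(cx, 0.8), rng.gauss(cy, 0.8)), label))
    rng.shuffle(rows)
    return rows


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, x, k):
    """Majority label among the k training samples closest to x."""
    nearest = sorted(train, key=lambda row: dist2(row[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, x):
    """Label of the class whose mean feature vector is closest to x."""
    groups = {}
    for feats, label in train:
        groups.setdefault(label, []).append(feats)
    centroids = {
        label: tuple(sum(col) / len(col) for col in zip(*pts))
        for label, pts in groups.items()
    }
    return min(centroids, key=lambda label: dist2(centroids[label], x))


def accuracy(predict, train, test):
    """Fraction of test samples whose predicted label matches the true label."""
    return sum(predict(train, x) == y for x, y in test) / len(test)


def cross_val_score(predict, rows, folds=5):
    """Mean validation accuracy over `folds` interleaved folds."""
    scores = []
    for i in range(folds):
        val = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        scores.append(accuracy(predict, train, val))
    return sum(scores) / folds


# Candidate classifiers; the k values act as a small hyperparameter grid.
candidates = {"centroid": centroid_predict}
for k in (1, 5, 15):
    candidates[f"knn(k={k})"] = lambda train, x, k=k: knn_predict(train, x, k)

rows = make_data()
split = int(0.8 * len(rows))
dev, test = rows[:split], rows[split:]  # the test part is never used for selection

cv_scores = {name: cross_val_score(fn, dev) for name, fn in candidates.items()}
best = max(cv_scores, key=cv_scores.get)
test_acc = accuracy(candidates[best], dev, test)

print("cross-validation scores:", cv_scores)
print("selected:", best, "| accuracy on unseen data:", test_acc)
```

Keeping the test rows out of both the model comparison and the hyperparameter search is what makes the final accuracy an estimate of performance on unseen data.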
Our prediction performance, measured by Receiver Operating Characteristic (ROC) scores and the area under the ROC curve (ROC-AUC), lies in the range of 95-99% for each type of formation in the log data, except for shaly sandstone and shale (50% and 84%, respectively). The reason appeared to be under-fitting: during training, the classifiers did not see enough instances of these formation types to learn which characteristics of the data identify a formation as shaly sandstone or shale. Skimming through the data supported this explanation. To resolve the problem, we suggest training the classifiers on a larger data set with more targets (types of formation). Furthermore, during the data cleaning (prior to classifier training) and data analysis phases, we discovered important relationships between well logs and determined the relative importance of each well log for different formations. This observation can be investigated further to help eliminate the need for multiple well logs when dealing with some formations (based on prior geological knowledge) and so reduce the cost of well logging operations. Using our program with a larger well log data set containing more instances of each formation type, we can train the classifiers to predict the formation type accurately, irrespective of differences in formation type.
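To make the per-formation ROC-AUC figures concrete, the sketch below computes a one-vs-rest ROC-AUC for each class using the pairwise-ranking definition: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented for illustration and are not the paper's data; the function names are hypothetical. A class whose scores carry no ranking information comes out at 0.5, which is the chance level.

```python
def roc_auc(scores, is_positive):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(labels, class_scores):
    """Per-class AUC, treating each class in turn as positive and the rest as negative."""
    return {
        cls: roc_auc(scores, [y == cls for y in labels])
        for cls, scores in class_scores.items()
    }


# Invented example: true labels and one classifier score per class per sample.
labels = ["sandstone", "sandstone", "shale", "limestone", "shale", "limestone"]
class_scores = {
    "sandstone": [0.9, 0.8, 0.1, 0.2, 0.3, 0.1],  # positives always ranked highest
    "limestone": [0.05, 0.1, 0.3, 0.7, 0.2, 0.6],
    "shale": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # uninformative constant scores
}

auc = one_vs_rest_auc(labels, class_scores)
for cls, value in auc.items():
    print(cls, value)
```

Because the metric depends only on how positives rank against negatives, a class with very few training instances can still be scored, but a classifier that has not learned what distinguishes it will sit near 0.5.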
Our program is dynamic in the sense that it accepts different targets: given the type of formation fluid instead of the type of formation, or both together, it can predict either or both. Increasing the number of data instances resulted in better training and thus more accurate predictions. Use of the program will make the formation-evaluation process easier, faster, automated, and more precise.