Sensitivity Analysis and Feature Selection for Drilling-oriented Models
Authors: Sofia Tariq, D. Sui
Journal: Journal of Energy Resources Technology-Transactions of the ASME (impact factor 2.6; JCR Q3, Energy & Fuels)
Publication type: Journal Article
Published: 2023-04-20
DOI: 10.1115/1.4062382 (https://doi.org/10.1115/1.4062382)
Citations: 0
Abstract
Data-driven models have grown in popularity over the past ten years because they can increase the effectiveness and durability of systems without requiring much human involvement. Despite these advantages, they remain limited in terms of model interpretation, data selection, and model evaluation. Sensitivity Analysis (SA) is a powerful tool for deciphering the behavior of data-driven models: it analyzes the correlations between model inputs and outputs and quantifies how strongly each input influences the outputs, which helps interpret these black-box models. Feature Selection (FS) is a pre-processing approach used in data-driven modeling to select the most important parameters as model inputs. In most existing work, FS is widely used to select inputs by analyzing correlations in the drilling data, while SA is seldom employed for data-driven model evaluation and interpretation in drilling applications. Data-driven Rate of Penetration (ROP) models have consistently outperformed many conventional ROP models, most likely because of their strong data analysis capabilities, their capacity to learn from data and recognize patterns, and their effective policies for making logical decisions automatically. In this work, a data-driven ROP model was developed from a benchmark field drilling dataset. Following the ROP modelling, SA methods were employed to identify the input variables that had the greatest influence on ROP estimates. FS techniques and SA were combined during data preprocessing to identify the most important features for modelling. The results demonstrate that using robust SA techniques to overcome the limitations of machine learning models allows for the best interpretation and understanding of the resulting data-driven models.
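To make the idea of quantifying each input's influence concrete, the following is a minimal sketch of one-at-a-time sensitivity analysis followed by a simple top-k input selection. It is not the method used in the paper, which the abstract does not specify. The function `toy_rop_model`, its coefficients, the input names (`wob`, `rpm`, `flow`), and the baseline values are all hypothetical and chosen only for illustration.

```python
def toy_rop_model(wob, rpm, flow):
    """Hypothetical stand-in for a trained ROP model (not the paper's model)."""
    return 0.8 * wob + 0.3 * rpm + 0.05 * flow


def oat_sensitivity(model, baseline, rel_step=0.10):
    """One-at-a-time sensitivity.

    Each input is raised by rel_step (as a fraction of its baseline value)
    while the others are held fixed; the effect is the change in model output.
    """
    base_out = model(**baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + rel_step)
        effects[name] = model(**perturbed) - base_out
    return effects


# Hypothetical operating point: weight on bit, rotary speed, flow rate.
baseline = {"wob": 20.0, "rpm": 120.0, "flow": 500.0}

effects = oat_sensitivity(toy_rop_model, baseline)

# Rank inputs by the magnitude of their effect on the predicted ROP.
ranking = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)

# A simple selection step: keep the two most influential inputs.
selected = ranking[:2]

print(ranking)
print(selected)
```

One-at-a-time perturbation only measures local effects around a single operating point and ignores interactions between inputs, so it is best read as a first look rather than a full sensitivity study.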
About the journal:
Specific areas of importance include, but are not limited to: Fundamentals of thermodynamics such as energy, entropy and exergy, laws of thermodynamics; Thermoeconomics; Alternative and renewable energy sources; Internal combustion engines; (Geo) thermal energy storage and conversion systems; Fundamental combustion of fuels; Energy resource recovery from biomass and solid wastes; Carbon capture; Land and offshore well drilling; Production and reservoir engineering; Economics of energy resource exploitation