Peng-xiang Diwu, Beichen Zhao, Hangxiangpan Wang, Chao Wen, Siwei Nie, Wenjing Wei, A-qiao Li, Jingjie Xu, Fengyuan Zhang
Machine learning classification algorithm screening for the main controlling factors of heavy oil CO2 huff and puff
Petroleum Research, Volume 9, Issue 4, Pages 541-552. Published 2024-04-13. DOI: 10.1016/j.ptlrs.2024.04.002
https://www.sciencedirect.com/science/article/pii/S2096249524000371
Abstract
CO2 huff and puff technology can enhance the recovery of heavy oil in the high-water-cut stage. However, the effectiveness of this method varies significantly under different geological and fluid conditions, which yields a high-dimensional, small-sample (HDSS) dataset. Conventional techniques for identifying the key factors that influence CO2 huff and puff performance, such as fuzzy mathematics, struggle with HDSS datasets, which often contain nonlinear relationships and abnormal data that cannot be removed. To pinpoint the primary control factors for heavy oil CO2 huff and puff, four machine learning classification algorithms were adopted: logistic regression, gradient boosting decision tree, extreme gradient boosting, and random forest. These algorithms were selected to match the characteristics of HDSS datasets, taking into account algorithmic principles and an analysis of key control factors. The results showed that logistic regression has difficulty with nonlinear data, whereas the extreme gradient boosting and gradient boosting decision tree algorithms are more sensitive to abnormal data. By contrast, the random forest algorithm proved insensitive to outliers and provided a reliable ranking of the factors that influence CO2 huff and puff performance. The top five control factors identified were the distance between parallel wells, cumulative gas injection volume, liquid production rate of parallel wells, huff and puff timing, and heterogeneous Lorentz coefficient. These findings not only support the precise implementation of heavy oil CO2 huff and puff but also offer guidance on selecting classification algorithms for typical HDSS data.
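The factor-ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses bagged one-split decision stumps with random feature subsets as a stripped-down stand-in for a random forest, and it runs on synthetic data rather than field data. The function names (`make_synthetic_wells`, `rank_factors`), the generic factor names, and the rule that the label depends only on `factor_0` are all assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_stump(X, y, features):
    """Find the single split with the largest impurity decrease.

    Returns (feature_index, threshold, gain); feature_index is None
    when no candidate split reduces the impurity.
    """
    n = len(y)
    parent = gini(y)
    best = (None, None, 0.0)
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[2]:
                best = (j, t, parent - child)
    return best


def rank_factors(X, y, names, n_trees=200, seed=0):
    """Rank factors by normalized impurity decrease over bagged stumps."""
    rng = random.Random(seed)
    n, d = len(X), len(names)
    k = max(1, int(d ** 0.5))  # size of the random feature subset per stump
    totals = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of wells
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        j, _, gain = best_stump(Xb, yb, rng.sample(range(d), k))
        if j is not None:
            totals[j] += gain
    scale = sum(totals) or 1.0
    ranked = [(names[j], totals[j] / scale) for j in range(d)]
    return sorted(ranked, key=lambda pair: -pair[1])


def make_synthetic_wells(n_wells=40, n_factors=6, seed=42):
    """Synthetic small-sample data; only factor_0 drives the label (assumption)."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_factors)] for _ in range(n_wells)]
    y = [1 if row[0] > 0.5 else 0 for row in X]
    X[0][3] = 1e6  # one extreme value in an uninformative factor
    names = [f"factor_{j}" for j in range(n_factors)]
    return X, y, names


if __name__ == "__main__":
    X, y, names = make_synthetic_wells()
    for name, score in rank_factors(X, y, names):
        print(f"{name}: {score:.3f}")
```

Because the splits depend only on the ordering of values within a factor, the extreme value injected into `factor_3` has little influence on the ranking, which is one intuition for the outlier insensitivity of tree ensembles reported in the abstract. A real study would use a full random forest on measured well data and validate the ranking, for example by cross-validation.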