Machine Learning-Assisted Transient Modeling of Asphaltene Particles Aggregation Size
Khalid Ibrahim Hasan, Karar H. Alfarttoosi, P. Kanjariya, Asha Rajiv, Aman Shankhyan, M. Manjula, Bhavik Jain, Satish Kumar Samal, Waam Mohammed Taher, Mariem Alwan, Mahmood Jasem Jawad, Hiba Mushtaq, Ahmad Abumalek
Doklady Physical Chemistry, vol. 519, nos. 1-2, pp. 175-191. Published 2025-03-21.
DOI: 10.1134/S0012501625600020
https://link.springer.com/article/10.1134/S0012501625600020
Impact factor: 1.1. JCR: Q4 (Chemistry, Physical).
Citation count: 0
Abstract
Precise estimation of the aggregate size of asphaltene particles in oil reservoirs, which cause formation damage and well blockage, is critical to smooth oil production and to the successful planning of remedial work. This research constructs data-driven soft-computing models, namely Extra Trees (ET), Multilayer Perceptron Artificial Neural Network (MLP-ANN), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Random Forest (RF), K-nearest Neighbors (KNN), Adaptive Boosting (AdaBoost), Ensemble Learning (EL), Decision Tree (DT), Linear Regression, Ridge Regression, and Lasso Regression, to predict asphaltene aggregation size as a function of time, asphaltene concentration of the model oil, heteroatom content of asphaltenes, hydrogen content of asphaltenes, and voltage, based on previously published experimental data. A widely recognized outlier identification methodology is applied to the collected dataset to evaluate its reliability prior to model development. Furthermore, the relevancy index is calculated for every input variable to determine its relative impact on aggregation size. K-fold cross-validation is used during model training to reduce overfitting. The results indicate that, in contrast to asphaltene hydrogen content, the other parameters (voltage, time, asphaltene concentration, and heteroatom content of asphaltenes) all directly influence aggregate size. Moreover, both graphical and statistical evaluations demonstrate that the CNN model outperforms all other models examined, as evidenced by the lowest mean squared error and the largest coefficient of determination.
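The evaluation workflow named in the abstract (a relevancy index per input, K-fold cross-validation, and comparison by mean squared error and coefficient of determination) can be illustrated with a minimal sketch. Everything specific here is an assumption: the abstract does not define the relevancy index, so the Pearson correlation coefficient is used as a stand-in; the number of folds is arbitrary; a one-input least-squares line replaces the twelve models; and the data are synthetic placeholders, not the experimental data used in the paper.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation; an ASSUMED stand-in for the 'relevancy index'."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and yield (train_idx, test_idx) for each of k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (toy model, hypothetical)."""
    mx, my = mean(xs), mean(ys)
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx


def cross_validate(xs, ys, k=5):
    """Return the fold-averaged (MSE, R^2) of the toy model."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        pred = [a * xs[i] + b for i in test]
        true = [ys[i] for i in test]
        scores.append((mse(true, pred), r2(true, pred)))
    return mean(s[0] for s in scores), mean(s[1] for s in scores)


# Synthetic placeholder data: "size" grows linearly with "time" plus noise.
rng = random.Random(42)
times = [float(t) for t in range(50)]
sizes = [2.0 * t + 5.0 + rng.gauss(0.0, 1.0) for t in times]

relevancy = pearson(times, sizes)
cv_mse, cv_r2 = cross_validate(times, sizes, k=5)
print(f"relevancy={relevancy:.3f}  CV MSE={cv_mse:.3f}  CV R2={cv_r2:.3f}")
```

Under this scheme, each candidate model would be scored on the held-out folds only, and the model with the lowest averaged MSE and highest averaged R² would be preferred, which is the criterion the abstract cites for the CNN.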
About the journal:
Doklady Physical Chemistry is a monthly journal containing English translations of current Russian research in physical chemistry from the Physical Chemistry sections of the Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences). The journal publishes the most significant new research in physical chemistry being done in Russia, thus ensuring its scientific priority. Doklady Physical Chemistry presents short preliminary accounts of the application of the state-of-the-art physical chemistry ideas and methods to the study of organic and inorganic compounds and macromolecules; polymeric, inorganic and composite materials as well as corresponding processes. The journal is intended for scientists in all fields of chemistry and in interdisciplinary sciences.