Regularization in machine learning models for MVT Pb-Zn prospectivity mapping: applying lasso and elastic-net algorithms
Mahsa Hajihosseinlou, Abbas Maghsoudi, Reza Ghezelbash
Earth Science Informatics, published 2024-08-05
DOI: https://doi.org/10.1007/s12145-024-01404-5
Impact factor: 2.7; JCR Q2 (Computer Science, Interdisciplinary Applications)
Abstract
This study applied the least absolute shrinkage and selection operator (Lasso) and Elastic-net algorithms to Mississippi Valley-type (MVT) Pb-Zn prospectivity modeling. During training, both regularization approaches add a penalty term to the loss function. Because this penalty constrains the magnitude of the feature coefficients, the model is pushed to prioritize the most informative features and to shrink the coefficients of less relevant ones. The geological, geochemical, tectonic, and alteration dataset came from the Varcheh district in western Iran. We trained and evaluated the models with stratified 5-fold cross-validation, which uses the data more fully and gives more reliable performance estimates by averaging metrics over the folds. Hyperparameters were tuned by random search, which quickly found near-optimal settings. Elastic-net showed higher prediction accuracy and greater robustness than Lasso. Because Elastic-net combines L1 and L2 regularization, it is more adaptable than Lasso, which uses L1 regularization only, and this allows it to handle correlated predictors successfully.
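The contrast between the two penalties on correlated predictors can be illustrated with a small sketch. This is not the authors' implementation: it assumes a squared-error loss rather than whatever loss the paper uses, a tiny synthetic dataset in place of the Varcheh data, and hypothetical names (`soft_threshold`, `elastic_net_fit`, `alpha`, `l1_ratio`). It fits both models by plain coordinate descent using only the Python standard library.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_fit(X, y, alpha, l1_ratio, n_sweeps=1000):
    """Coordinate descent for the (assumed) objective

        (1 / (2n)) * ||y - Xw||^2
        + alpha * l1_ratio * ||w||_1
        + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2

    with no intercept (columns are assumed centered).
    l1_ratio = 1.0 reduces the penalty to pure L1, i.e. Lasso.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    l1 = alpha * l1_ratio
    l2 = alpha * (1.0 - l1_ratio)
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                # Prediction from every feature except j.
                pred_others = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
                z += X[i][j] ** 2
            # Closed-form minimizer of the objective along coordinate j.
            w[j] = soft_threshold(rho / n, l1) / (z / n + l2)
    return w


# Synthetic data: features 0 and 1 are identical (perfectly correlated),
# feature 2 is orthogonal to them and unrelated to the target.
base = [-2.0, -1.0, 0.0, 1.0, 2.0]
other = [1.0, -2.0, 2.0, -2.0, 1.0]
X = [[b, b, o] for b, o in zip(base, other)]
y = [3.0 * b for b in base]

lasso_w = elastic_net_fit(X, y, alpha=0.1, l1_ratio=1.0)
enet_w = elastic_net_fit(X, y, alpha=0.1, l1_ratio=0.5)

print("Lasso coefficients:      ", [round(c, 4) for c in lasso_w])
print("Elastic-net coefficients:", [round(c, 4) for c in enet_w])
```

On this toy data, Lasso puts all the weight on one of the two identical features and zeroes the other. The added L2 term makes the Elastic-net objective strictly convex, so the weight is split evenly between the two copies. Both models zero out the irrelevant third feature.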
About the journal:
The Earth Science Informatics [ESIN] journal aims at rapid publication of high-quality, current, cutting-edge, and provocative scientific work in the area of Earth Science Informatics as it relates to Earth systems science and space science. This includes articles on the application of formal and computational methods, computational Earth science, spatial and temporal analyses, and all aspects of computer applications to the acquisition, storage, processing, interchange, and visualization of data and information about the materials, properties, processes, features, and phenomena that occur at all scales and locations in the Earth system’s five components (atmosphere, hydrosphere, geosphere, biosphere, cryosphere) and in space (see "About this journal" for more detail). The quarterly journal publishes research, methodology, and software articles, as well as editorials, comments, and book and software reviews. Review articles of relevant findings, topics, and methodologies are also considered.