An urgent call for robust statistical methods in reliable feature importance analysis across machine learning

Journal of Catalysis. IF 6.5; Zone 1 (Chemistry); JCR Q2 (Chemistry, Physical). Pub Date: 2025-06-01. Epub Date: 2025-03-24. DOI: 10.1016/j.jcat.2025.116098
Yoshiyasu Takefuji
Journal of Catalysis, volume 446, Article 116098. Publication type: Journal Article. Source: https://www.sciencedirect.com/science/article/pii/S0021951725001630

Abstract

Accurate analytical outcomes in machine learning are contingent on error-free calculations and a solid understanding of foundational principles. A notable challenge arises from the lack of ground truth values for validation, complicating the assessment of feature importance, especially when employing linear models with parametric assumptions. This paper critiques the use of Pearson correlation and feature importances derived from Gradient Boosting Regressor (GBR), emphasizing their limitations in analyzing nonlinear and nonparametric data. We propose robust statistical methods, such as Spearman’s correlation and Kendall’s tau, as alternatives for capturing complex relationships while providing essential directional information. Additionally, attention to Variance Inflation Factor (VIF) is crucial for mitigating feature inflation. By addressing these concerns, researchers can achieve more reliable analyses and deeper insight into variable relationships.
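
The contrast the abstract draws between Pearson correlation and rank-based measures can be illustrated with a small sketch. The code below is not taken from the paper; the data, function names, and the two-feature VIF shortcut are assumptions chosen for illustration. It uses a monotonic but strongly nonlinear relationship (an exponential), computes Pearson, Spearman, and Kendall's tau-b from their textbook definitions, and shows that the rank-based measures also carry a sign (direction). The VIF function covers only the special case of two predictors, where VIF = 1 / (1 - r^2).

```python
import math


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant and discordant pairs (O(n^2))."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    denom = math.sqrt((pairs + tied_x_only) * (pairs + tied_y_only))
    return (concordant - discordant) / denom


def vif_two_features(x1, x2):
    """VIF in the two-predictor special case: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical data: y grows (or decays) exponentially with x.
x = [0.5 * i for i in range(1, 21)]
y_inc = [math.exp(v) for v in x]    # monotonic increasing, nonlinear
y_dec = [math.exp(-v) for v in x]   # monotonic decreasing, nonlinear
# A second feature that is almost a copy of x (strong collinearity).
x_near_copy = [v + 0.1 * (-1) ** i for i, v in enumerate(x)]

print("Pearson  (x, exp(x)): ", pearson(x, y_inc))
print("Spearman (x, exp(x)): ", spearman(x, y_inc))
print("Kendall  (x, exp(x)): ", kendall_tau_b(x, y_inc))
print("Spearman (x, exp(-x)):", spearman(x, y_dec))
print("VIF (x, near copy):   ", vif_two_features(x, x_near_copy))
```

On this data Pearson falls noticeably below 1 even though y is a deterministic function of x, while Spearman and Kendall reach +1 for the increasing case and -1 for the decreasing one. In practice, `scipy.stats.pearsonr`, `scipy.stats.spearmanr`, and `scipy.stats.kendalltau` provide these statistics together with p-values; the pure-Python versions here only make the definitions explicit.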

Source journal: Journal of Catalysis (Engineering Technology, Engineering: Chemical)
CiteScore: 12.30
Self-citation rate: 5.50%
Articles published: 447
Review time: 31 days
About the journal: The Journal of Catalysis publishes scholarly articles on both heterogeneous and homogeneous catalysis, covering a wide range of chemical transformations. These include various types of catalysis, such as those mediated by photons, plasmons, and electrons. The focus of the studies is to understand the relationship between catalytic function and the underlying chemical properties of surfaces and metal complexes. The articles in the journal offer innovative concepts and explore the synthesis and kinetics of inorganic solids and homogeneous complexes. Furthermore, they discuss spectroscopic techniques for characterizing catalysts, investigate the interaction of probes and reacting species with catalysts, and employ theoretical methods. The research presented in the journal should have direct relevance to the field of catalytic processes, addressing either fundamental aspects or applications of catalysis.