Machine learning models with distinct Shapley value explanations decouple feature attribution and interpretation for chemical compound predictions

Impact factor: 7.9 | Zone 2 (Multidisciplinary Journals) | JCR Q1 (Chemistry, Multidisciplinary) | Cell Reports Physical Science | Pub Date: 2024-07-23 | DOI: 10.1016/j.xcrp.2024.102110

Abstract

Explaining black box predictions of machine learning (ML) models is a topical issue in artificial intelligence (AI) research. For the identification of features determining predictions, the Shapley value formalism originally developed in game theory is widely used in different fields. Typically, Shapley values quantifying feature contributions to predictions need to be approximated in machine learning. We introduce a framework for the calculation of exact Shapley values for four kernel functions used in support vector machine (SVM) models and analyze consistently accurate compound activity predictions based on exact Shapley values. Dramatic changes in feature contributions are detected depending on the kernel function, leading to mostly distinct explanations of predictions of the same test compounds. Very different feature contributions yield comparable predictions, which complicates numerical and graphical model explanation and decouples feature attribution and human interpretability.
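
The central observation, that equal predictions can rest on very different feature contributions, can be illustrated with a minimal sketch. The code below is not the authors' kernel-specific framework; it is a brute-force enumeration over all feature coalitions, which is only feasible for a handful of features. The function name `exact_shapley`, the two toy models, and the convention that features outside a coalition are set to 0 (absent) are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition (cost grows as 2^n)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition S
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy models over three binary features (1 = present, 0 = absent).
def linear_model(x):
    return 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[2]


def interaction_model(x):
    return float(x[0] * x[1])  # depends only on features 0 and 1 jointly


def make_value_fn(model, n_features):
    """Assumed masking rule: features outside the coalition are set to 0."""
    def value_fn(coalition):
        x = [1 if j in coalition else 0 for j in range(n_features)]
        return model(x)
    return value_fn


n = 3
phi_linear = exact_shapley(make_value_fn(linear_model, n), n)
phi_interaction = exact_shapley(make_value_fn(interaction_model, n), n)

# Both toy models output 1.0 for the instance with all features present,
# yet the contributions assigned to the individual features differ.
print("linear     :", [round(p, 3) for p in phi_linear])
print("interaction:", [round(p, 3) for p in phi_interaction])
```

In this toy setting the Shapley values of each model sum to its prediction (minus the output for the empty coalition, which is 0 here), so the two explanations account for the same output while distributing it differently across features.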

Source journal: Cell Reports Physical Science
Subject area: Energy-Energy (all)
CiteScore: 11.40
Self-citation rate: 2.20%
Articles published: 388
Review time: 62 days
About the journal: Cell Reports Physical Science, a premium open-access journal from Cell Press, features high-quality, cutting-edge research spanning the physical sciences. It serves as an open forum fostering collaboration among physical scientists while championing open science principles. Published works must represent significant advances in fundamental insight or technological applications within fields such as chemistry, physics, materials science, energy science, engineering, and related interdisciplinary studies. In addition to longer articles, the journal considers impactful short-form reports and short reviews covering recent literature in emerging fields. Continually adapting to the evolving open science landscape, the journal reviews its policies to align with community consensus and best practices.