Data Mining and Knowledge Discovery: Latest Articles

Can local explanation techniques explain linear additive models?
Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-09-19 | DOI: 10.1007/s10618-023-00971-3
Amir Hossein Akhavan Rahnama, Judith Bütepage, Pierre Geurts, Henrik Boström
Abstract: Local model-agnostic additive explanation techniques decompose the predicted output of a black-box model into additive feature importance scores. Questions have been raised about the accuracy of the produced local additive explanations. We investigate this by studying whether some of the most popular explanation techniques can accurately explain the decisions of linear additive models. We show that even though the explanations generated by these techniques are linear additives, they can fail to provide accurate explanations when explaining linear additive models. In the experiments, we measure the accuracy of additive explanations, as produced by, e.g., LIME and SHAP, along with the non-additive explanations of Local Permutation Importance (LPI), when explaining Linear Regression, Logistic Regression, and Gaussian naive Bayes models over 40 tabular datasets. We also investigate the degree to which different factors, such as the number of numerical, categorical, or correlated features, the predictive performance of the black-box model, the explanation sample size, the similarity metric, and the pre-processing technique used on the dataset, can directly affect the accuracy of local explanations.
Citations: 1
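The abstract of the entry above treats linear additive models as a case where a reference explanation is available. As a minimal sketch of why that is so, assuming a plain linear model f(x) = bias + sum_i w_i * x_i, the contribution of feature i relative to a baseline point is w_i * (x_i - baseline_i), and these contributions sum exactly to f(x) - f(baseline). The weights, bias, baseline, and function names below are hypothetical values chosen for illustration; this is not the evaluation procedure of the paper.

```python
import numpy as np


def linear_predict(weights, bias, x):
    """Prediction of a linear model: bias + sum_i weights[i] * x[i]."""
    return float(np.dot(weights, x) + bias)


def linear_contributions(weights, x, baseline):
    """Additive contribution of each feature relative to a baseline point.

    For a linear model the contribution of feature i is
    weights[i] * (x[i] - baseline[i]).
    """
    weights = np.asarray(weights, dtype=float)
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    return weights * (x - baseline)


# Hypothetical model and data point (illustration only).
weights = np.array([2.0, -1.0, 0.5])
bias = 0.25
baseline = np.array([1.0, 0.0, 2.0])  # e.g. a vector of feature means
x = np.array([3.0, 1.0, 0.0])

phi = linear_contributions(weights, x, baseline)
gap = linear_predict(weights, bias, x) - linear_predict(weights, bias, baseline)

print("contributions:", phi)
print("sum of contributions:", phi.sum())
print("f(x) - f(baseline):", gap)
```

The scores produced by an explanation technique for the same model and point could then be compared against `phi`; which comparison metric is appropriate is left open here.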
Improving position encoding of transformers for multivariate time series classification
Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-09-05 | DOI: 10.1007/s10618-023-00948-2
Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Mahsa Salehi
Abstract: Transformers have demonstrated outstanding performance in many applications of deep learning. When applied to time series data, transformers require effective position encoding to capture the ordering of the time series data. The efficacy of position encoding in time series analysis is not well-studied and remains controversial, e.g., whether it is better to inject absolute position encoding or relative position encoding, or a combination of them. In order to clarify this, we first review existing absolute and relative position encoding methods when applied in time series classification. We then propose a new absolute position encoding method dedicated to time series data called time Absolute Position Encoding (tAPE). Our new method incorporates the series length and input embedding dimension in absolute position encoding. Additionally, we propose a computationally Efficient implementation of Relative Position Encoding (eRPE) to improve generalisability for time series. We then propose a novel multivariate time series classification model combining tAPE/eRPE and convolution-based input encoding named ConvTran to improve the position and data embedding of time series data. The proposed absolute and relative position encoding methods are simple and efficient. They can be easily integrated into transformer blocks and used for downstream tasks such as forecasting, extrinsic regression, and anomaly detection. Extensive experiments on 32 multivariate time-series datasets show that our model is significantly more accurate than state-of-the-art convolution-based and transformer-based models. Code and models are open-sourced at https://github.com/Navidfoumani/ConvTran.
Citations: 1
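The abstract of the entry above says that tAPE incorporates the series length and the input embedding dimension into absolute position encoding, but it does not state the formula. The sketch below shows the standard sinusoidal encoding with an optional frequency scale, and one way such a dependence could look: multiplying every frequency by `d_model / length`. That particular scaling, and the function names, are assumptions made for illustration; they should be checked against the paper or the ConvTran repository before being relied on.

```python
import numpy as np


def sinusoidal_encoding(length, d_model, freq_scale=1.0):
    """Sinusoidal absolute position encoding of shape (length, d_model).

    Even columns hold sin(pos * omega_k), odd columns hold cos(pos * omega_k),
    with omega_k = freq_scale * 10000 ** (-2k / d_model).
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    positions = np.arange(length)[:, None]   # shape (length, 1)
    k = np.arange(d_model // 2)[None, :]     # shape (1, d_model / 2)
    omega = freq_scale * 10000.0 ** (-2.0 * k / d_model)
    angles = positions * omega               # shape (length, d_model / 2)
    encoding = np.empty((length, d_model))
    encoding[:, 0::2] = np.sin(angles)
    encoding[:, 1::2] = np.cos(angles)
    return encoding


def length_aware_encoding(length, d_model):
    """ASSUMPTION: scale every frequency by d_model / length.

    Illustrates how series length and embedding dimension could enter the
    encoding; not confirmed by the abstract.
    """
    return sinusoidal_encoding(length, d_model, freq_scale=d_model / length)


pe = length_aware_encoding(length=30, d_model=16)
print(pe.shape)
```

With this choice the encoding reduces to the standard sinusoidal one whenever the series length equals the embedding dimension, which makes the effect of the scaling easy to inspect.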
Z-Time: efficient and effective interpretable multivariate time series classification
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-09-05 | DOI: 10.1007/s10618-023-00969-x
Zed Lee, Tony Lindgren, P. Papapetrou
Citations: 1
Studying bias in visual features through the lens of optimal transport
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-09-02 | DOI: 10.1007/s10618-023-00972-2
Simone Fabbrizzi, Xuan Zhao, Emmanouil Krasanakis, Symeon Papadopoulos, Eirini Ntoutsi
Citations: 1
Network embedding based on high-degree penalty and adaptive negative sampling
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-09-02 | DOI: 10.1007/s10618-023-00973-1
Gang-Feng Ma, Xu-Hua Yang, Wei Ye, Xinli Xu, Lei Ye
Citations: 0
Improving neural network’s robustness on tabular data with D-layers
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-08-31 | DOI: 10.1007/s10618-023-00965-1
Haiyang Xia, Nayyar Zaidi, Yishuo Zhang, Gang Li
Citations: 0
Sky-signatures: detecting and characterizing recurrent behavior in sequential data
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-08-29 | DOI: 10.1007/s10618-023-00949-1
Clément Gautrais, Peggy Cellier, T. Guyet, R. Quiniou, A. Termier
Citations: 0
SALτ: efficiently stopping TAR by improving priors estimates
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-08-28 | DOI: 10.1007/s10618-023-00961-5
Alessio Molinari, Andrea Esuli
Citations: 0
A semi-supervised interactive algorithm for change point detection
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-08-28 | DOI: 10.1007/s10618-023-00974-0
Zhenxiang Cao, N. Seeuws, Marina De Vos, Alexander Bertrand
Citations: 0
A tale of two roles: exploring topic-specific susceptibility and influence in cascade prediction
IF 4.8 | Zone 3 (Computer Science) | Q2 | Computer Science, Artificial Intelligence | Pub Date: 2023-08-27 | DOI: 10.1007/s10618-023-00953-5
Ninghan Chen, Xihui Chen, Zhiqiang Zhong, Jun Pang
Citations: 0