Authors: Qingqi Han, Zhanpeng Zhao, Liang Hu, Wanfu Gao
Journal: Expert Systems with Applications, Volume 264, Article 125819 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2024-11-23
DOI: 10.1016/j.eswa.2024.125819
URL: https://www.sciencedirect.com/science/article/pii/S0957417424026861
Enhanced multi-label feature selection considering label-specific relevant information
In fields such as text classification and image recognition, multi-label data is frequently encountered. However, extracting information-rich and reliable features from high-dimensional multi-label datasets presents significant challenges in pattern recognition tasks. Traditional information-theoretic feature selection methods utilize a greedy algorithm strategy, selecting the feature that best meets the evaluation criteria in each iteration. However, the optimal result of each iteration does not necessarily yield a globally optimal solution. These methods primarily focus on the overall relevance of each feature with respect to all labels from a macro perspective, often overlooking the distribution of relevant information among features. This oversight can lead to the selection of features that are weakly correlated with the labels. Additionally, they neglect the impact of redundancy measures on feature scoring, resulting in the selection of some irrelevant features. To address these issues, we propose a novel multi-label feature selection method that evaluates the relevance between feature sets and label sets from both macro and micro perspectives. This method maximizes the relevance between features and the label set while ensuring the selection of features that are strongly correlated with each individual label. Classification experiments conducted on eight multi-label datasets demonstrate that the proposed method consistently outperforms seven comparative methods.
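The greedy, information-theoretic selection the abstract describes, and the gap between a macro view (total relevance to all labels) and a micro view (relevance to each individual label), can be illustrated with a small sketch. This is not the authors' criterion, which the abstract does not specify. The scoring rule (summed mutual information minus mean redundancy), the function names, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def greedy_select(features, labels, k):
    """Greedy forward selection of k feature names.

    Illustrative score (an assumption, not the paper's criterion):
        sum over labels of I(f; label) - mean over selected s of I(f; s)
    """
    # Macro relevance does not depend on the selected set, so compute it once.
    relevance = {
        f: sum(mutual_information(col, lab) for lab in labels.values())
        for f, col in features.items()
    }
    selected, remaining = [], set(features)

    def score(f):
        if not selected:
            return relevance[f]
        redundancy = sum(
            mutual_information(features[f], features[s]) for s in selected
        ) / len(selected)
        return relevance[f] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def label_coverage(features, labels, selected):
    """Micro view: strongest I(feature; label) among selected features, per label."""
    return {
        name: max(
            (mutual_information(features[f], lab) for f in selected),
            default=0.0,
        )
        for name, lab in labels.items()
    }


# Toy data (hypothetical): two labels, four binary features.
labels = {
    "y1": [0, 0, 0, 0, 1, 1, 1, 1],
    "y2": [0, 0, 1, 1, 0, 0, 1, 1],
}
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],      # determines y1
    "a_dup": [0, 0, 0, 0, 1, 1, 1, 1],  # exact copy of "a"
    "b": [0, 0, 1, 1, 0, 0, 1, 1],      # determines y2
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of both labels
}

chosen = greedy_select(features, labels, k=2)
print(chosen)                                               # ['a', 'b']
print(label_coverage(features, labels, chosen))             # every label covered
print(label_coverage(features, labels, ["a", "a_dup"]))     # y2 left uncovered
```

In this toy data the sets `["a", "b"]` and `["a", "a_dup"]` have the same summed relevance to the labels, yet the second leaves `y2` with no informative feature. That is the kind of per-label gap a purely macro score cannot see.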
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.