Disentangled feature graph for Hierarchical Text Classification

IF 8.1 | Region 1 (Management) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Information Processing & Management | Pub Date: 2025-05-01 | Epub Date: 2025-01-14 | DOI: 10.1016/j.ipm.2025.104065

Renyuan Liu, Xuejie Zhang, Jin Wang, Xiaobing Zhou

Information Processing & Management, Volume 62, Issue 3, Article 104065. https://www.sciencedirect.com/science/article/pii/S030645732500007X

Citations: 0

Abstract




Effectively utilizing the hierarchical relationships among labels is the core of Hierarchical Text Classification (HTC). Previous research on HTC has tended to strengthen the dependencies between labels. However, it overlooks labels that may conflict with other labels, because alleviating label conflicts also weakens label dependencies and reduces model performance. This paper therefore focuses on label conflicts and studies how to alleviate them without affecting the mutually supportive relationships between labels. To this end, we first use feature disentanglement to cut off all connections between labels. Connections among labels are then selectively established by constructing a hierarchical graph over the disentangled features. Finally, a Graph Neural Network (GNN) encodes the resulting Disentanglement Feature Graph (DFG), so that only connected labels support each other while unconnected labels do not interfere with each other. Experimental results on the WOS, RCV1-v2, and BGC datasets show the effectiveness of DFG. On the WOS dataset, the model incorporating DFG achieved a 1.07% improvement in Macro-F1, surpassing the best model by 0.27%. On the RCV1-v2 dataset, the model incorporating DFG achieved a 0.95% improvement in Micro-F1, surpassing the best model by 0.21%. On the BGC dataset, the model incorporating DFG achieved a 1.81% improvement in Micro-F1, surpassing the best model by 0.45%.
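
The idea that only connected labels support each other can be illustrated with a small sketch. This is not the authors' implementation: it assumes the disentanglement step has already produced one feature vector per label, and it replaces the GNN with a single mean-aggregation step over hierarchy edges. The function names, the toy hierarchy, and the feature values are all hypothetical.

```python
# Hypothetical sketch, not the paper's code: per-label features exchange
# information only along parent-child edges of the label hierarchy.

def build_adjacency(num_labels, parent_child_edges):
    """Return sorted neighbor lists (with self-loops) from (parent, child) pairs."""
    neighbors = [{i} for i in range(num_labels)]
    for parent, child in parent_child_edges:
        neighbors[parent].add(child)
        neighbors[child].add(parent)
    return [sorted(n) for n in neighbors]


def propagate(label_features, neighbors):
    """One message-passing step: each label averages itself and its neighbors.

    Mean aggregation is a stand-in for a learned GNN layer (an assumption).
    """
    dim = len(label_features[0])
    updated = []
    for nbrs in neighbors:
        mean = [sum(label_features[j][d] for j in nbrs) / len(nbrs)
                for d in range(dim)]
        updated.append(mean)
    return updated


# Toy hierarchy (illustrative only): label 0 is the parent of labels 1 and 2;
# label 3 is an unrelated root with no edges.
edges = [(0, 1), (0, 2)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]

neighbors = build_adjacency(4, edges)
updated = propagate(features, neighbors)
```

In this toy setup, labels 0, 1, and 2 mix their features through the hierarchy edges, while label 3 has no edges and keeps its feature vector unchanged, which mirrors the claim that unconnected labels do not interfere with each other.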


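Source journal

Information Processing & Management (Engineering and Technology: Computer Science, Information Systems)

CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days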

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.